At what level does boxing of an object occur in .NET?

If I have an object such as List<string> that I cast into an object, and then back again, will all the strings get cast as well or just the list that contains them?
I'm thinking that the compiler would only have to check if the object was of type List<string> before casting back into a List<string> but I grew up in C#, so I'm not entirely certain what goes on behind the code that I write.

When you cast a List<string> to an object, you're not really doing any casting at all. You're assigning one reference to some data, to a less-specific reference. The string objects that it contains aren't changed at all, either.
Also, to clarify, there is no boxing involved in this case. Boxing occurs when you create a reference to a value type like an int or some struct, by assigning it or passing it somehow to a variable of type object.
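A minimal sketch of that round trip, assuming a small console program (the class and method names are hypothetical). It suggests that the upcast and the downcast both leave you with the very same list and the very same strings:

```csharp
using System;
using System.Collections.Generic;

public static class UpcastDemo
{
    // Returns true when List<string> -> object -> List<string>
    // yields the same list instance (no copy, no boxing).
    public static bool RoundTripKeepsIdentity()
    {
        var list = new List<string> { "first", "second" };
        object asObject = list;               // reference conversion only
        var back = (List<string>)asObject;    // runtime type check, same reference
        return ReferenceEquals(list, back)
            && ReferenceEquals(list[0], back[0]); // the strings are untouched too
    }

    public static void Main()
    {
        Console.WriteLine(RoundTripKeepsIdentity()); // True
    }
}
```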

Boxing occurs when a struct / value type is stored in a location which is typed to object or an interface which that struct implements. In this scenario both List<string> and string are reference types so no boxing occurs.
struct S1 : IComparable {
...
}
S1 local = new S1(); // No box.
object obj = local; // Box S1 instance into object
IComparable comp = local; // Box S1 instance into IComparable
obj = "hello"; // String is a reference type, no boxing

string is a reference type - an instance of string will never get boxed.
If however you had a List<int>, the List is a reference type, so there will be no boxing here either. int is a value type and it could be boxed IF it were cast to an object (either implicitly or explicitly).
Boxing only affects value types - List<T> is a class, hence a reference type, changing the generic type T does not affect whether an instance is passed by value or by reference.
Generic collections help prevent boxing as it enables value types to be read / written without being boxed to an object.
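As a rough illustration (hypothetical names, not taken from the question), the sketch below contrasts a List<int>, which stores the ints directly, with a List<object>, where every int is boxed on the way in and unboxed on the way out. It also shows that each boxing conversion produces a separate heap object:

```csharp
using System;
using System.Collections.Generic;

public static class BoxingCopiesDemo
{
    // Each conversion of an int to object allocates its own box.
    public static bool TwoBoxesAreDistinct()
    {
        int i = 42;
        object a = i;   // box
        object b = i;   // another box
        return !ReferenceEquals(a, b) && a.Equals(b);
    }

    // List<int> stores the ints themselves: no boxing.
    public static int SumWithoutBoxing()
    {
        var numbers = new List<int> { 1, 2, 3 };
        int sum = 0;
        foreach (int n in numbers) sum += n;
        return sum;
    }

    // List<object> stores references, so each int is boxed when added.
    public static int SumWithBoxing()
    {
        var numbers = new List<object> { 1, 2, 3 };
        int sum = 0;
        foreach (object o in numbers) sum += (int)o; // unbox
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(TwoBoxesAreDistinct()); // True
        Console.WriteLine(SumWithoutBoxing());    // 6
        Console.WriteLine(SumWithBoxing());       // 6
    }
}
```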

For every value type, there is a corresponding object type with the same name which is derived from ValueType. Whenever it is necessary to create a storage location (field, variable, or parameter) of a given type, the system will allocate space to store either a heap reference (if the type does not derive from ValueType), all of the type's fields (if it's a 'struct'), or the bits holding the type's value. Boxing occurs when an attempt is made to store an instance of a value type into a heap reference. Unboxing occurs when an attempt is made to use a heap reference as though it were a value type. Note that in some contexts, unboxing will copy the contents of the boxed object to a new storage location, but in other contexts it will regard the boxed instance as a storage location. The semantics of this are not always clear, which is one of the reasons some people hate mutable structs. In practice, mutable structs are fine if one avoids usage scenarios where the semantics get murky, and even immutable structs can suffer from the same murky semantics.
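One way to see the two unboxing behaviours is the sketch below, which assumes a hypothetical mutable struct Counter implementing a hypothetical interface ICounter. Calling through the interface acts on the boxed instance in place, while an explicit cast back to the struct type copies it:

```csharp
using System;

public interface ICounter { void Increment(); }

public struct Counter : ICounter
{
    public int Value;
    public void Increment() { Value++; }
}

public static class UnboxSemanticsDemo
{
    // The interface call operates on the boxed instance itself.
    public static int MutateThroughInterface()
    {
        object boxed = new Counter();    // box
        ((ICounter)boxed).Increment();   // mutates the box in place
        return ((Counter)boxed).Value;
    }

    // The cast copies the boxed data into a new local.
    public static int MutateUnboxedCopy()
    {
        object boxed = new Counter();    // box
        Counter copy = (Counter)boxed;   // unbox: copy
        copy.Increment();                // changes only the copy
        return ((Counter)boxed).Value;
    }

    public static void Main()
    {
        Console.WriteLine(MutateThroughInterface()); // 1
        Console.WriteLine(MutateUnboxedCopy());      // 0
    }
}
```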

Boxing and unboxing are operations that apply to value types, not to reference types.
What you describe is not boxing; it's just a simple cast.

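Only the List<string> will be cast.
If you want to cast the List<string> items as well, do this (it needs using System.Linq;):
List<string> list = new List<string>{"first", "second"};
List<object> objectsList = list.Cast<object>().ToList(); // Cast<T>() returns IEnumerable<T>, so ToList() is required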
P.S. string is a reference type so it can't really get boxed and unboxed

Related

Why is Array Covariance not safe?

I was reading about covariance and contravariance on this blog, and the covariance of arrays got me confused.
Now, if I have this:
object[] obj= new string[5];
obj[0]=4;
why am I getting an error at run time? In theory, obj is a variable of type Object, and Object can store any type, since all types inherit from the Object class. Yet when I run the following code I do not get any run-time error. Can anyone explain why?
class baseclass
{
}
class client
{
static void Main()
{
object obj = new baseclass();
obj = 4;
Console.Read();
}
}
It is in fact confusing.
When you say object[] objs = new string[4]; then objs is actually an array of strings. Unsafe array covariance is unsafe because the type system is lying to you. That's why it is unsafe. You think that your array can hold a boxed integer, but it is really an array of strings and it cannot actually hold anything but strings.
Your question is "why is this not safe", and then you give an example of why it is not safe. It is not safe because it crashes at runtime when you do something that looks like it should be safe. It's a violation of the most basic rule of the type system: that a variable actually contains a value of the type of the variable.
For a variable of type object, that's not a lie. You can store any object in that variable, so it's safe. But a variable of type object[] is a lie. You can store things in that variable that are not object[].
This is in my opinion the worst feature of C# and the CLR. C# has this feature because the CLR has it. The CLR has it because Java has it, and the CLR designers wanted to be able to implement Java-like languages in the CLR. I do not know why Java has it; it's a terrible idea and they should not have done it.
Is that now clear?
An object of array type T[] has three notable abilities:
Any value read from the array may be stored in a container of type T.
Any value read from the array may be stored back into the same array.
Any value that fits in a container of type T may be stored into the array.
A non-null reference of type T[] will be capable of holding a reference to any object of type U[], where U derives from T. For any possible type U derived from T, any value read from a U[] may be stored into a container of type T, and may also be stored back into the same array. If a container of type T holds a reference to an object which is derived from T, but is not of type U nor any type derived from U, then a U[] would be incapable of holding that reference.
It would be awkward to allow code to read an item from one array and write it back to the same array, without also allowing it to ask for an item to be read from one array and written into another. Rather than trying to limit such operations via compile-time constraints, C# opts instead to say that if code tries to store a value held in a T into an array identified via T[], such an operation will succeed only if the T is null or identifies an object whose type is, or derives from, the element type of the actual array identified by the T[]. Otherwise the store fails at run time with an ArrayTypeMismatchException.
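The run-time check on array stores can be observed with a short sketch like this one (the helper name StoreThrows is hypothetical). It assigns a string[] to an object[] variable and then tries to store different values through it:

```csharp
using System;

public static class ArrayCovarianceDemo
{
    // Returns true if storing the value into a string[] viewed as object[]
    // is rejected at run time.
    public static bool StoreThrows(object value)
    {
        object[] objs = new string[1];   // compiles thanks to array covariance
        try
        {
            objs[0] = value;             // checked against the real element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(StoreThrows(4));       // True: a boxed int is not a string
        Console.WriteLine(StoreThrows("hello")); // False: a string fits
        Console.WriteLine(StoreThrows(null));    // False: null is always allowed
    }
}
```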

Is System.Array.Clone() guaranteed to clone value types?

int[] array1 = new[] { 1, 2, 3 };
int[] array2 = (int[])array1.Clone();
array2[0] = 9;
Debug.Assert(array1[0] != array2[0]);
This works fine. Clone() does a shallow copy, but the array elements are value types, so they get cloned too.
My question is whether this is explicit in the language spec, or whether this is just an artifact of the current implementation?
My doubt is due to System.Array supporting value types "invisibly" behind the scenes via run-time generics. Looking at the public methods, you would expect value types to be boxed.
It works because there's absolutely no way two arrays could share the same instance of a value type.
The spec doesn't specifically say how Array.Clone behaves with value types vs how it behaves with reference types. But the spec does say that instances of value types are copied, bit-by-bit, on assignment. So when array1[i] is copied to array2[i], you get a clone of the instance at index i. Always.
Keep in mind though, that if the value type has a field of a reference type, only the reference will be copied - not the instance of the reference type.
My query was whether potential boxing by Array would negate this, i.e. whether the boxed references would be copied rather than the underlying value type.
Even if array1[i] was boxed during the cloning, it would have to be unboxed so that you end up with an int[] and not an object[]. The value would be cloned on unboxing.
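To make the shallow-copy caveat concrete, here is a small sketch assuming a hypothetical struct Tagged that holds an int and a reference to a StringBuilder. After Clone(), the int fields are independent, but both arrays still point at the same StringBuilder:

```csharp
using System;
using System.Text;

public struct Tagged
{
    public int Number;
    public StringBuilder Label;   // reference-type field
}

public static class CloneDemo
{
    public static (bool numbersIndependent, bool labelsShared) Run()
    {
        var a = new[] { new Tagged { Number = 1, Label = new StringBuilder("x") } };
        var b = (Tagged[])a.Clone();  // copies each struct, field by field

        b[0].Number = 9;              // changes only b's copy of the struct
        b[0].Label.Append("y");       // changes the shared StringBuilder

        return (a[0].Number == 1, a[0].Label.ToString() == "xy");
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result.numbersIndependent); // True
        Console.WriteLine(result.labelsShared);       // True
    }
}
```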

C# heap space allocation when boxing and unboxing

Been using C# for a while and I've been thinking this:
public static void Main(string[] args){
...
Person p = new Person("Bob",23,"Male");
List<Object> al = new List<Object>();
al.Add(p);
Person p2 = (Person)al[0]; // renamed: p is already declared in this scope
}
This is a typical example of boxing and unboxing in a collection, but my question is: when boxing the variable, the CLR allocates extra space on the GC heap and treats p as an object, yet the Person class is "larger" than System.Object.
If so, some of the values that the Person class adds might be lost, and I would fail to get that data back after unboxing.
How does the CLR handle that?
Any solutions are welcome.
Person is a class, so used by reference. No boxing or unboxing.
As opposed to a value type that may require boxing.
If the Person type were really a structure (your code doesn't show its declaration), the boxed object on the heap would indeed need more space than an instance of the System.Object class. That is not a problem: when the data is moved to the heap, a new object large enough to hold all of the structure's fields is allocated, and the value you give to the Add method is only a reference to that boxed object, not the data itself. Only if System.Object were itself a structure would the data have to be truncated to fit its size. This is the reason why inherited structures aren't allowed.
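Both cases can be tried side by side. The sketch below assumes two hypothetical stand-ins for Person, a class and a struct, each with only a name and an age. The class instance keeps its identity inside the List<object>, and the boxed struct comes back with all of its fields intact:

```csharp
using System;
using System.Collections.Generic;

public class PersonClass
{
    public string Name; public int Age;
    public PersonClass(string name, int age) { Name = name; Age = age; }
}

public struct PersonStruct
{
    public string Name; public int Age;
    public PersonStruct(string name, int age) { Name = name; Age = age; }
}

public static class ListOfObjectDemo
{
    // A class instance is stored by reference: no boxing, same object.
    public static bool ClassKeepsIdentity()
    {
        var p = new PersonClass("Bob", 23);
        var items = new List<object> { p };
        return ReferenceEquals(p, items[0]);
    }

    // A struct is boxed into a heap object sized for all of its fields.
    public static string StructSurvivesBoxing()
    {
        var p = new PersonStruct("Bob", 23);
        var items = new List<object> { p };        // boxed copy
        var back = (PersonStruct)items[0];         // unboxed copy
        return back.Name + ":" + back.Age;
    }

    public static void Main()
    {
        Console.WriteLine(ClassKeepsIdentity());   // True
        Console.WriteLine(StructSurvivesBoxing()); // Bob:23
    }
}
```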

Unable to understand how boxing is done

I was studying boxing and unboxing.
I went through this example, I am unable to understand the answer.
Can anyone explain to me please.
I know what boxing and unboxing does now, by looking at a simple example, but this example, confuses a bit.
An example of boxing and then unboxing, a tricky example.
[struct|class] Point {
public int x, y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
}
Point p = new Point(1, 1);
object o = p; p.x = 2;
Console.WriteLine(((Point)o).x);
I read the answer as:
It depends! If Point is a struct then the output is 1 but if Point is a class then the output is 2! A boxing conversion makes a copy of the value being boxed explaining the difference in behavior.
Is ((Point)o).x here a boxing or an unboxing?
Didn't understand, can anyone explain to me please.
I understand why the answer is 1 for a struct, but if Point is a class, why is it 2?
I don't know why everyone is writing an essay, it's pretty simple to explain:
When you cast a struct into object, it is copied into a new object.
When you cast an object into a struct, it is copied into a new struct.
When you cast between classes, the object's contents are not copied; only the reference is copied.
Hope that helps.
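Those three rules can be checked with a runnable version of the question's example. This is only a sketch: since [struct|class] is not valid C#, it assumes two hypothetical types, PointStruct and PointClass, that differ only in that keyword:

```csharp
using System;

public struct PointStruct
{
    public int x, y;
    public PointStruct(int x, int y) { this.x = x; this.y = y; }
}

public class PointClass
{
    public int x, y;
    public PointClass(int x, int y) { this.x = x; this.y = y; }
}

public static class PointDemo
{
    public static int WithStruct()
    {
        PointStruct p = new PointStruct(1, 1);
        object o = p;                  // boxing: copies p into a new heap object
        p.x = 2;                       // changes only the local p
        return ((PointStruct)o).x;     // unboxing: reads the boxed copy
    }

    public static int WithClass()
    {
        PointClass p = new PointClass(1, 1);
        object o = p;                  // copies the reference only
        p.x = 2;                       // changes the one shared object
        return ((PointClass)o).x;      // plain downcast, same object
    }

    public static void Main()
    {
        Console.WriteLine(WithStruct()); // 1
        Console.WriteLine(WithClass());  // 2
    }
}
```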
Although C# tries to pretend that structure types derive from Object, that's only half true. According to the CLI spec, a structure type specification actually defines two kinds of thing: a type of heap object which derives from System.ValueType (and in turn System.Object), and a kind of storage location (local variable, static variable, class field, struct field, parameter, or array slot) which does not derive from anything, but is implicitly convertible to the heap object type.
Every heap object instance contains all the fields defined by the type or its parent classes (if any), along with a header which identifies its type and some other information about the instance. Every struct-type storage location contains either the bytes necessary to hold its value (if a primitive type), or else holds the concatenated values of all its fields; in neither case does it contain any sort of header that identifies its type. Instead, value types rely upon information in the generated code to know what they are.
If one stores a value type to a storage location of that value type, the compiler will overwrite all the bytes occupied by the destination with values taken from the original value type. If, however, one tries to store a value type to a reference-type storage location (like Object), the runtime will generate a new heap object with enough space to hold all the data from the value type, along with a header identifying its type, and store in the destination location a reference to that new object. If one tries to typecast a reference type to a value type, the runtime will first verify that the object is of the proper type and, if so, copy the data from the heap object to the destination.
There are a couple of tricky scenarios involving interfaces and generics. Interface types are reference types, but if a struct implements an interface, the implementing methods may act directly upon a boxed struct instance without having to unbox and rebox it. Further, interface types used as generic constraints do not require boxing. If one passes a variable of a value type like List<int>.Enumerator to a function EnumerateThings<TEnumerator>(ref TEnumerator it) where TEnumerator: IEnumerator<int>, that method will be able to accept a reference to that variable without boxing.
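The generic-constraint case can be sketched roughly as follows. The method Advance is a hypothetical stand-in for the EnumerateThings function mentioned above. It advances the caller's own List<int>.Enumerator through a ref parameter, whereas converting the enumerator to IEnumerator<int> boxes a separate copy:

```csharp
using System;
using System.Collections.Generic;

public static class ConstrainedCallDemo
{
    // The constraint lets MoveNext be called directly on the caller's
    // struct variable, without boxing it.
    public static bool Advance<TEnumerator>(ref TEnumerator it)
        where TEnumerator : IEnumerator<int>
    {
        return it.MoveNext();
    }

    public static int[] Run()
    {
        var list = new List<int> { 10, 20, 30 };
        List<int>.Enumerator e = list.GetEnumerator();

        Advance(ref e);                 // e itself moves to the first item
        IEnumerator<int> boxed = e;     // boxes a copy of e's current state
        boxed.MoveNext();               // advances only the boxed copy

        return new[] { e.Current, boxed.Current };
    }

    public static void Main()
    {
        int[] result = Run();
        Console.WriteLine(result[0]); // 10
        Console.WriteLine(result[1]); // 20
    }
}
```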
Since Point is a struct, and structs are value types, it is copied whenever it is assigned or passed around.
So if you change a copy you are changing only that copy, not the original.
However, if Point were a class, only the reference would be copied.
So if you change the object through one reference, the change is visible through every other reference to it, including the original.
As for your confusion
object o = p; is boxing
whereas
(Point)o is unboxing
To understand boxing you need to understand the difference between Value types and Reference types.
I think the simplest way to understand it is:
"Value types are allocated in-line. Reference types are always allocated on the Heap"
meaning, that if you add a value type (struct, int, float, bool) inside of a reference-type as a class variable (public or private), that value-type's data is embedded wherever that reference-type lives on the Heap.
If you create a value-type as a local variable inside of a function, and do NOT store it in a field of some object, that value-type is typically allocated on the Function-Stack (meaning its storage is reclaimed as soon as you leave that function, without the garbage collector being involved)
So, given that background knowledge, it should be pretty self-explanatory what happens when you "box" a value-type: you have to take that value type (wherever it was in-line allocated) and turn it into a reference-type (create a new object for it on the Heap).
First you need to know where objects are stored. Structs, enums and other value types are stored on the stack, in registers, or on the heap; classes and other reference types are stored on the heap. A good tutorial is here.
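Boxing is done when a value type is stored in a location typed as a reference type, such as object: a copy of the value is made in a new object on the heap. Unboxing is the other way around: the value is copied out of the heap object back into a value-type location, typically on the stack.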
1 Point p = new Point(1, 1);
2 object o = p;
3 p.x = 2;
4 Console.WriteLine(((Point)o).x);
In your code above, if Point is a struct, line 2 boxes a copy of p into a new heap object referenced by "o". On line 3, what you modify is the Point on the stack, not that copy. On the last line you unbox the object "o", so the value you get is the one that was copied to the heap, and the output is 1.
If Point is a class, line 1 allocates space for the Point on the heap. Line 2 creates a new variable "o" that references the same object. Remember that "p" and "o" refer to the same memory location, so a modification made through either variable, as on line 3, is visible through both, and the output is 2.

Behind the scenes while boxing and unboxing

Could anybody please explain what is actually happening behind the scenes when we perform boxing and unboxing? I know boxing is the conversion of a value type to a reference type and unboxing is the reverse, but behind the scenes, does the boxed variable actually get stored on the heap? And what is the basic use of boxing and unboxing?
Thanks!
There is no magic at all, just keep it simple as is...
Boxing is the act of converting a value-type instance to a reference type instance.
Unboxing reverses the operation by casting the object(reference type) back to the original value type.
So you have to understand difference between value-type and reference-type and also stack and heap
Value types - built-in types like int, char and double, plus enums and structs - hold their data directly; as local variables they are typically stored in a block of memory called the STACK
Reference types - class, delegate, object, string - have their instances stored in a block of memory called the HEAP
Now that you understand the distinction above, let's take a look at some real, simple code.
int i = 1;
object O = i; // Box the int
int j = (int)O; // Unbox the int
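One detail worth trying out is that unboxing expects the original value type. The sketch below (hypothetical names) boxes an int and suggests that unboxing it directly to long fails at run time, while unboxing to int first and then converting works:

```csharp
using System;

public static class UnboxTypeCheckDemo
{
    // Unboxing a boxed int as long is rejected by the runtime type check.
    public static bool UnboxToWrongTypeThrows()
    {
        object o = 1;                  // boxed int
        try
        {
            long l = (long)o;          // not an int-to-long conversion: an unbox
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Unbox to the exact type first, then convert the value.
    public static long UnboxThenConvert()
    {
        object o = 1;                  // boxed int
        return (long)(int)o;
    }

    public static void Main()
    {
        Console.WriteLine(UnboxToWrongTypeThrows()); // True
        Console.WriteLine(UnboxThenConvert());       // 1
    }
}
```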
According to MSDN:
Boxing and unboxing enable value types to be treated as objects. Boxing a value type packages it inside an instance of the Object reference type. This allows the value type to be stored on the garbage collected heap. Unboxing extracts the value type from the object.
[...]
In relation to simple assignments, boxing and unboxing are computationally expensive processes. When a value type is boxed, an entirely new object must be allocated and constructed. To a lesser degree, the cast required for unboxing is also expensive computationally.
I found that the following articles are very helpful and informative:
Why do we need boxing and unboxing in C#?
http://www.codeproject.com/KB/cs/boxing.aspx
http://msdn.microsoft.com/en-us/library/yz2be5wk.aspx
http://msdn.microsoft.com/en-us/library/yz2be5wk(v=vs.80).aspx
