Could anybody please explain what actually happens behind the scenes when we perform boxing and unboxing? I know that boxing converts a value type to a reference type and unboxing is the reverse, but when a variable is boxed, does it actually get stored on the heap? And what is the basic use of boxing and unboxing?
Thanks!
There is no magic at all, just keep it simple as is...
Boxing is the act of converting a value-type instance to a reference type instance.
Unboxing reverses the operation by casting the object(reference type) back to the original value type.
So you have to understand the difference between value types and reference types, and between the stack and the heap.
Value types - built-in types like int, char, double, and any struct - are typically stored in a block of memory called the STACK when they are local variables
Reference types - class, delegate, object, string - are stored in a block of memory called the HEAP
Now that you understand the distinction above, let's take a look at some real, simple code.
int i = 1;
object O = i; // Box the int
int j = (int)O; // Unbox the int
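To see that the box holds its own copy of the value, here is a minimal sketch that extends the snippet above. The class name BoxingCopyDemo and the method name UnboxAfterChange are made up for illustration.

```csharp
using System;

public static class BoxingCopyDemo
{
    // Returns the value read back from the box after the original variable has changed.
    public static int UnboxAfterChange()
    {
        int i = 1;
        object o = i;   // box: a new heap object receives a copy of i
        i = 2;          // changes only the local variable, not the box
        return (int)o;  // unbox: copies the boxed value back out
    }

    public static void Main()
    {
        Console.WriteLine(UnboxAfterChange()); // prints 1
    }
}
```

The box still holds 1 because the assignment to i happened after the copy was made.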
According to MSDN:
Boxing and unboxing enable value types to be treated as objects. Boxing a value type packages it inside an instance of the Object reference type. This allows the value type to be stored on the garbage collected heap. Unboxing extracts the value type from the object.
[...]
In relation to simple assignments, boxing and unboxing are computationally expensive processes. When a value type is boxed, an entirely new object must be allocated and constructed. To a lesser degree, the cast required for unboxing is also expensive computationally.
I found that the following articles are very helpful and informative:
Why do we need boxing and unboxing in C#?
http://www.codeproject.com/KB/cs/boxing.aspx
http://msdn.microsoft.com/en-us/library/yz2be5wk.aspx
http://msdn.microsoft.com/en-us/library/yz2be5wk(v=vs.80).aspx
Boxing and Unboxing are defined only for value types. Source:
Boxing is the process of converting a value type to the type object or to any interface type implemented by this value type. When the CLR boxes a value type, it wraps the value inside a System.Object and stores it on the managed heap. Unboxing extracts the value type from the object. Boxing is implicit; unboxing is explicit. The concept of boxing and unboxing underlies the C# unified view of the type system in which a value of any type can be treated as an object.
Boxing and unboxing are expensive operations. Source:
Boxing and unboxing are computationally expensive processes. When a value type is boxed, an entirely new object must be created. This can take up to 20 times longer than a simple reference assignment. When unboxing, the casting process can take four times as long as an assignment.
Now, if I am using string and string[], which are reference types, and I do the following:
string A;
return (string)(object)A;
// IMP: Here first casting is similar to boxing (though for a reference type), and second casting is similar to unboxing.
Similarly,
string[] A;
return (string[])(object)A;
// IMP: Here first casting is similar to boxing (though for a reference type), and second casting is similar to unboxing.
Boxing and unboxing value types is computationally expensive, but here we are using reference types. Is there a similar performance impact when using a boxing/unboxing-like technique on reference types?
It looks similar to the following questions, but none of them talk about the performance impact (if any):
Object type boxing with a reference type variable,
What is boxing and unboxing and what are the trade offs?
Will Boxing and Unboxing happen in Array?
You may be interested to know that the C# compiler completely removes the object cast [1]. What you end up with is (assuming the method assigns a value to A from a constant and then has the code you've shown):
.method private hidebysig static string Thing() cil managed
{
.maxstack 8
L_0000: ldstr "fred"
L_0005: castclass string
L_000a: ret
}
You may end up with a runtime check of the type of the reference here, but I wouldn't be surprised if the JIT were able to demonstrate statically that the reference on the stack due to ldstr is already a string, and so could remove any code that it might have considered generating for the castclass operation.
Reference casts are assertions (I know what type I'm dealing with better than the compiler). They're nothing like boxing and unboxing.
[1] As it will, in general, for any upcast between reference types.
There will be no significant performance impact as strings are already reference types.
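The difference between a reference cast and boxing can be observed through object identity. The following is a small sketch, not a benchmark; the class and method names are hypothetical.

```csharp
using System;

public static class ReferenceCastDemo
{
    // Casting a reference type up to object and back yields the very same object.
    public static bool RoundTripKeepsIdentity()
    {
        string[] a = { "fred" };
        object o = a;              // upcast: same reference, nothing is allocated
        string[] b = (string[])o;  // downcast: at most a runtime type check
        return object.ReferenceEquals(a, b);
    }

    // Boxing the same int twice allocates two distinct heap objects.
    public static bool BoxingTwiceGivesTwoObjects()
    {
        int i = 1;
        object first = i;
        object second = i;
        return !object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(RoundTripKeepsIdentity());     // prints True
        Console.WriteLine(BoxingTwiceGivesTwoObjects()); // prints True
    }
}
```

No allocation or copy happens in the first method, which is why the casts carry none of the boxing cost.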
Been using C# for a while and I've been thinking this:
public static void Main(string[] args){
...
Person p = new Person("Bob",23,"Male");
List<Object> al = new List<Object>();
al.Add(p);
Person q = (Person)al[0]; // renamed to q: p is already declared above
}
This is a typical example of boxing and unboxing in a collection, but my question is: when boxing the variable, the CLR allocates extra space on the GC heap and treats p as an object, yet the Person class is "larger" than System.Object.
So, according to that, some of the values that the Person class additionally owns may be lost, and it will fail to get some data after unboxing.
How does the CLR work that out?
Any solutions are welcome.
Person is a class, so used by reference. No boxing or unboxing.
As opposed to a value type that may require boxing.
If the Person type is really a structure (which you haven't declared explicitly), the space on the heap is surely larger than the space needed for an instance of System.Object. However, when the data is moved to the heap, the value you give to the Add method is not the object itself but only a reference to the boxed object. If System.Object were a structure, then yes, the data would have to be truncated to fit the size of that structure. This is the reason why structures cannot inherit from one another.
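To illustrate that nothing is truncated when Person is a class, here is a minimal sketch. The Person class below is a simplified, assumed version of the one in the question (two fields only), and ListOfObjectDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the Person class in the question (an assumption).
public class Person
{
    public string Name;
    public int Age;

    public Person(string name, int age)
    {
        Name = name;
        Age = age;
    }
}

public static class ListOfObjectDemo
{
    public static bool SameInstanceComesBack()
    {
        Person p = new Person("Bob", 23);
        List<object> al = new List<object>();
        al.Add(p);                 // stores a reference to p's object; no boxing
        Person q = (Person)al[0];  // downcast with a runtime type check
        return object.ReferenceEquals(p, q) && q.Name == "Bob" && q.Age == 23;
    }

    public static void Main()
    {
        Console.WriteLine(SameInstanceComesBack()); // prints True
    }
}
```

The list only ever holds a reference, so the size of Person compared to System.Object does not matter.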
I was studying boxing and unboxing.
I went through this example, I am unable to understand the answer.
Can anyone explain to me please.
I know what boxing and unboxing does now, by looking at a simple example, but this example, confuses a bit.
An example of boxing and then unboxing, a tricky example.
[struct|class] Point {
public int x, y;
public Point(int x, int y) {
this.x = x;
this.y = y;
}
}
Point p = new Point(1, 1);
object o = p; p.x = 2;
Console.WriteLine(((Point)o).x);
I read the answer as:
It depends! If Point is a struct then the output is 1 but if Point is a class then the output is 2! A boxing conversion makes a copy of the value being boxed explaining the difference in behavior.
Here, is ((Point)o).x boxing or unboxing?
I didn't understand; can anyone explain it to me please?
I know that the answer should be 1 for a struct, but if Point is a class, how is it 2?
I don't know why everyone is writing an essay, it's pretty simple to explain:
When you cast a struct into object, it is copied into a new object.
When you cast an object into a struct, it is copied into a new struct.
When you cast between classes, the object's contents are not copied; only the reference is copied.
Hope that helps.
Although C# tries to pretend that structure types derive from Object, that's only half true. According to the CLI spec, a structure type specification actually defines two kinds of thing: a type of heap object which derives from System.ValueType (and in turn System.Object), and a kind of storage location (local variable, static variable, class field, struct field, parameter, or array slot) which does not derive from anything, but is implicitly convertible to the heap object type.
Every heap object instance contains all the fields defined by the type or its parent classes (if any), along with a header which identifies its type and some other information about the instance. Every struct-type storage location contains either the bytes necessary to hold its value (if a primitive type), or else holds the concatenated values of all its fields; in neither case does it contain any sort of header that identifies its type. Instead, value types rely upon information in the generated code to know what they are.
If one stores a value type to a storage location of that value type, the compiler will overwrite all the bytes occupied by the destination with values taken from the original value type. If, however, one tries to store a value type to a reference-type storage location (like Object), the runtime will generate a new heap object with enough space to hold all the data from the value type, along with a header identifying its type, and store in the destination location a reference to that new object. If one tries to typecast a reference type to a value type, the runtime will first verify that the object is of the proper type and, if so, copy the data from the heap object to the destination.
There are a couple of tricky scenarios involving interfaces and generics. Interface types are reference types, but if a struct implements an interface, the implementing methods may act directly upon a boxed struct instance without having to unbox and rebox it. Further, interface types used as generic constraints do not require boxing. If one passes a variable of a value type like List<int>.Enumerator to a function EnumerateThings<TEnumerator>(ref TEnumerator it) where TEnumerator: IEnumerator<int>, that method will be able to accept a reference to that variable without boxing.
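The interface and generic-constraint cases can be sketched as follows. ICounter, Counter, and ConstraintDemo are hypothetical names introduced only for this illustration.

```csharp
using System;

public interface ICounter
{
    void Increment();
    int Count { get; }
}

public struct Counter : ICounter
{
    private int count;
    public int Count { get { return count; } }
    public void Increment() { count++; }
}

public static class ConstraintDemo
{
    // T is the struct type itself, so the call acts on the caller's variable without boxing.
    public static void Bump<T>(ref T counter) where T : ICounter
    {
        counter.Increment();
    }

    public static int ViaGenericConstraint()
    {
        Counter c = new Counter();
        Bump(ref c);
        return c.Count;
    }

    // Assigning to an interface variable boxes a copy; the original is untouched.
    public static int OriginalAfterInterfaceCall()
    {
        Counter c = new Counter();
        ICounter boxed = c;
        boxed.Increment();
        return c.Count;
    }

    // Interface calls act directly on the boxed instance, so changes accumulate in the box.
    public static int BoxAfterTwoInterfaceCalls()
    {
        ICounter boxed = new Counter();
        boxed.Increment();
        boxed.Increment();
        return boxed.Count;
    }

    public static void Main()
    {
        Console.WriteLine(ViaGenericConstraint());       // prints 1
        Console.WriteLine(OriginalAfterInterfaceCall()); // prints 0
        Console.WriteLine(BoxAfterTwoInterfaceCalls());  // prints 2
    }
}
```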
If Point is a struct, it is a value type, which means it is copied when it is assigned or passed around.
So if you change a copy, you are changing only that copy, not the original.
However, if Point is a class, only the reference is copied.
So if you change the object through one reference, every other reference to it sees the change as well.
As for your confusion
object o = p; is boxing
whereas
(Point)o is unboxing
To understand boxing you need to understand the difference between Value types and Reference types.
I think the simplest way to understand it is:
"Value types are allocated in-line. Reference types are always
allocated on the Heap"
meaning that if you add a value type (struct, int, float, bool) inside a reference type as a field (public or private), that value type's data is embedded wherever that reference type lives on the Heap.
If you create a value type inside a function as a local variable, rather than storing it in a field, that value type is typically allocated on the function's stack frame (meaning that once you leave that function its storage is released, without involving the garbage collector).
So, given that background knowledge, it should be pretty self-explanatory what happens when you "box" a value-type: you have to take that value type (wherever it was in-line allocated) and turn it into a reference-type (create a new object for it on the Heap).
First you need to know where objects are stored. Structs, enums and other value types are stored on the stack, in registers, or on the heap; classes and other reference types are stored on the heap. A good tutorial is here.
Boxing is done when a value type is stored in a location typed as a reference type, such as object: a copy is made from the stack to a new object on the heap. Unboxing is the other way around: a copy of the value is made from the heap to the stack.
1 Point p = new Point(1, 1);
2 object o = p;
3 p.x = 2;
4 Console.WriteLine(((Point)o).x);
In your code above, if Point is a struct, a copy will be made into the object "o". On your line 3, what you modified is the Point on the stack. On the last line you unboxed the object "o", but the value you get is the copied value from the heap.
If Point is a class, line 1 creates space for the Point on the heap. Line 2 creates a new variable "o" that references the same memory. Remember that "p" and "o" refer to the same memory location, so if you modify the object through either variable, as in line 3, you will see the modified value through both variables.
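Both outcomes can be checked side by side. This is a minimal sketch in which the question's Point is split into two hypothetical types, PointStruct and PointClass, so that both variants compile in one file.

```csharp
using System;

public struct PointStruct
{
    public int x, y;
    public PointStruct(int x, int y) { this.x = x; this.y = y; }
}

public class PointClass
{
    public int x, y;
    public PointClass(int x, int y) { this.x = x; this.y = y; }
}

public static class PointDemo
{
    public static int WithStruct()
    {
        PointStruct p = new PointStruct(1, 1);
        object o = p;               // boxing: o refers to a copy of p on the heap
        p.x = 2;                    // modifies only the local p
        return ((PointStruct)o).x;  // unboxing: reads the untouched copy
    }

    public static int WithClass()
    {
        PointClass p = new PointClass(1, 1);
        object o = p;               // no boxing: o and p refer to the same object
        p.x = 2;                    // visible through o as well
        return ((PointClass)o).x;
    }

    public static void Main()
    {
        Console.WriteLine(WithStruct()); // prints 1
        Console.WriteLine(WithClass());  // prints 2
    }
}
```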
I've been trying to understand this paragraph, but somehow I couldn't visualize it in my mind. Could someone please elaborate on it a little bit:
Unboxing is not the exact opposite of boxing. The unboxing operation
is much less costly than boxing. Unboxing is really just the operation
of obtaining a pointer to the raw value type (data fields) contained
within an object. In effect, the pointer refers to the unboxed portion
in the boxed instance. So, unlike boxing, unboxing doesn't involve the
copying of any bytes in memory. Having made this important
clarification, it is important to note that an unboxing operation is
typically followed by copying the fields.
Richter, Jeffrey (2010-02-05). CLR via C# (Kindle Locations
4167-4171). OReilly Media - A. Kindle Edition.
In order to box an int you need to create an object on the heap large enough to hold all of the data that the struct holds. Allocating a new object on the heap means work for the GC to find a spot, and work for the GC to clean it up/move it around during and after its lifetime. These operations, while not super expensive, aren't cheap either.
To unbox a value type all you're doing is de-reference the pointer, so to speak. You simply need to look at the reference (which is what the object you have is) to find the location of the actual values. Looking up a value in memory is very cheap, which is why that paragraph is saying 'unboxing' is cheap.
Update:
While an unboxed value type will usually be copied to some other location right after being unboxed, that isn't always the case. Consider the following example:
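public struct MyStruct
{
// A struct field initializer needs an explicit constructor, so the value is set there instead.
private readonly int value;
public MyStruct(int value)
{
this.value = value;
}
public void Foo()
{
Console.WriteLine(value);
}
}
static void Main()
{
object obj = new MyStruct(42);
((MyStruct)obj).Foo();
}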
The MyStruct is boxed into obj but when it's unboxed it's never copied anywhere, a method is simply invoked on it. Likewise you could pull a property/field out of the struct and copy just that part of it without needing to copy the whole thing. This might look a bit contrived, but it's still not entirely absurd. That said, as your quote implies, it's still likely to copy the struct after you unbox it.
If I have an object such as List<string> that I cast into an object, and then back again, will all the strings get cast as well or just the list that contains them?
I'm thinking that the compiler would only have to check if the object was of type List<string> before casting back into a List<string> but I grew up in C#, so I'm not entirely certain what goes on behind the code that I write.
When you cast a List<string> to an object, you're not really doing any casting at all. You're assigning one reference to some data, to a less-specific reference. The string objects that it contains aren't changed at all, either.
Also, to clarify, there is no boxing involved in this case. Boxing occurs when you create a reference to a value type like an int or some struct, by assigning it or passing it somehow to a variable of type object.
Boxing occurs when a struct / value type is stored in a location which is typed to object or an interface which that struct implements. In this scenario both List<string> and string are reference types so no boxing occurs.
struct S1 : IComparable {
...
}
S1 local = new S1(); // No box.
object obj = local; // Box S1 instance into object
IComparable comp = local; // Box S1 instance into IComparable
obj = "hello"; // String is a reference type, no boxing
string is a reference type - an instance of string will never get boxed.
If however you had a List<int>, the List is a reference type, so there will be no boxing here either. int is a value type and it could be boxed IF it were cast to an object (either implicitly or explicitly).
Boxing only affects value types - List<T> is a class, hence a reference type, changing the generic type T does not affect whether an instance is passed by value or by reference.
Generic collections help prevent boxing, as they enable value types to be read and written without being boxed to an object.
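As a rough illustration of that difference, the sketch below contrasts the non-generic ArrayList with List<int>. The class and method names are made up, and the identity check is just one way to make the separate boxes visible.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CollectionBoxingDemo
{
    // ArrayList stores object, so every Add of an int allocates a separate box.
    public static bool ArrayListBoxesEachAdd()
    {
        int i = 5;
        ArrayList al = new ArrayList();
        al.Add(i);
        al.Add(i);
        return !object.ReferenceEquals(al[0], al[1]);
    }

    // List<int> stores the ints themselves, so reading them needs no cast and no unboxing.
    public static int GenericListSum()
    {
        List<int> list = new List<int> { 1, 2, 3 };
        int sum = 0;
        foreach (int n in list)
        {
            sum += n;
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayListBoxesEachAdd()); // prints True
        Console.WriteLine(GenericListSum());        // prints 6
    }
}
```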
For every value type, there is a corresponding object type with the same name which is derived from ValueType. Whenever it is necessary to create a storage location (field, variable, or parameter) of a given type, the system will allocate space to store either a heap reference (if the type does not derive from ValueType), all of the type's fields (if it's a 'struct'), or the bits holding the type's value. Boxing occurs when an attempt is made to store an instance of a value type into a heap reference. Unboxing occurs when an attempt is made to use a heap reference as though it were a value type. Note that in some contexts, unboxing will copy the contents of the boxed object to a new storage location, but in other contexts it will regard the boxed instance as a storage location. The semantics of this are not always clear, which is one of the reasons some people hate mutable structs. In practice, mutable structs are fine if one avoids usage scenarios where the semantics get murky, and even immutable structs can suffer from the same murky semantics.
Boxing and unboxing are actions that apply to value types, not to reference types.
It's not that; it's just a simple cast.
Only the List<string> will be cast.
If you want to cast the List<string> items, do this:
// Requires: using System.Linq;
List<string> list = new List<string>{"first", "second"};
List<object> objectsList = list.Cast<object>().ToList(); // Cast<T> returns IEnumerable<T>, so ToList() is needed
P.S. string is a reference type so it can't really get boxed and unboxed