I don't understand it...
Why do they need a common base?
The question presupposes a falsehood. They don't need a common base type. This choice was not made out of necessity. It was made out of a desire to provide the best value for the customer.
When designing a type system, or anything else for that matter, sometimes you reach decision points -- you have to decide either X or not-X. Common base type or no common base type. When that happens you weigh up the costs and the benefits of X to determine the net value, and then you weigh up the costs and the benefits of not-X to determine the net value, and you go with the one that was higher value. The benefits of having a common base type outweigh the costs, and the net benefit thereby accrued is larger than the net benefit of having no common base type. So we choose to have a common base type.
That's a pretty vague answer. If you want a more specific answer, try asking a more specific question.
I think, mainly in order to have a type to refer to any object. An argument of type object could be a primitive type, a struct, a reference type, just anything. It is important to have such a type.
Then there are some common members that are implemented by every type, like Equals, ToString, GetHashCode. This could indeed be a common interface as well, but then you wouldn't inherit a default implementation.
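As a rough sketch of both points (one type that can refer to anything, plus inherited default members), consider the following. The types Ball and Point and the helper Describe are hypothetical names made up for this illustration.

```csharp
using System;

// Hypothetical types for illustration only; neither overrides anything.
class Ball { }

struct Point
{
    public int X, Y;
    public Point(int x, int y) { X = x; Y = y; }
}

static class CommonBaseDemo
{
    // A single parameter type accepts a primitive, a struct, or a class instance.
    public static string Describe(object value)
    {
        // GetType and ToString are inherited from System.Object.
        return value.GetType().Name + ": " + value.ToString();
    }

    // Point never declares Equals, yet it inherits a working default
    // (ValueType.Equals compares the fields).
    public static bool DefaultStructEquality()
    {
        return new Point(1, 2).Equals(new Point(1, 2));
    }

    static void Main()
    {
        Console.WriteLine(Describe(42));
        Console.WriteLine(Describe(new Point(1, 2)));
        Console.WriteLine(Describe(new Ball()));
        Console.WriteLine(DefaultStructEquality());
    }
}
```

With an interface instead of a base class, Ball and Point would each have had to supply these members themselves.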
Because it's in the spec:
8.9.9 Object type inheritance
With the
sole exception of System.Object, which
does not inherit from any other object
type, all object types shall either
explicitly or implicitly declare
support for (i.e., inherit from)
exactly one other object type. The
graph of the inherits-relation shall
form a singly rooted tree with
System.Object at the base; i.e., all
object types eventually inherit from
the type System.Object.
There are several benefits to this design approach.
Some of the reasons are:
To provide common members such as Equals, Finalize, GetHashCode, ToString....
To help with boxing....
Java and C# took different approaches to this. It's mainly to do with performance.
If you imagine that every object can be nullable, then it has to be a pointer to a value, which can be changed to a pointer to null. Now this means that for every object you need at least a pointer and a chunk of memory to store the value.
Java has the concept of a primitive value, which is NOT an object. It doesn't have a pointer and it isn't nullable. This breaks the OOP paradigm but performance wise makes sense.
C# (or, more correctly, the CLR + BCL) attempted a good compromise: reference types on one side and types derived from ValueType on the other. Anything that derives from ValueType is treated like a primitive by the CLR, avoiding an object reference. However, such a value can still be treated like an object via boxing, giving you the performance benefits of primitive types while still allowing everything to be treated like an object.
The real key difference between these things is the semantics of passing parameters to methods. If everything is an object, then you are passing references to the object, i.e., the object can be changed by a method it is passed to. Primitives and C# value types are passed by value, so they are effectively copied into the method call and the original value is unchanged.
It's the standard story of development. Try and get it right first, then see if you can optimise it later once you see the bottlenecks. Having pass-by-value semantics also allows you to prevent coding mistakes caused by mutability (e.g., passing a class vs. a struct in C#).
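The class-versus-struct difference in parameter passing can be sketched as follows. CounterClass, CounterStruct and the helper methods are hypothetical names used only for this example.

```csharp
using System;

// Hypothetical types: identical shape, one a class, one a struct.
class CounterClass { public int Value; }
struct CounterStruct { public int Value; }

static class PassingDemo
{
    // Receives a copy of the reference, so it mutates the caller's object.
    static void Increment(CounterClass c) { c.Value++; }

    // Receives a copy of the whole struct, so the caller's value is untouched.
    static void Increment(CounterStruct c) { c.Value++; }

    public static int ViaClass()
    {
        var c = new CounterClass();
        Increment(c);
        return c.Value;   // 1: the method changed the shared object
    }

    public static int ViaStruct()
    {
        var s = new CounterStruct();
        Increment(s);
        return s.Value;   // 0: only the copy was changed
    }

    static void Main()
    {
        Console.WriteLine(ViaClass());
        Console.WriteLine(ViaStruct());
    }
}
```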
ToString. For example.
Useful for Boxing and Unboxing
see reference
Every object in .NET shares common properties and methods. However, types are divided into two categories: value types and reference types. Value types (e.g., int) directly contain their data and, as local variables, are typically stored on the stack. Reference types (e.g., your custom class) store a reference to the actual data, which lives on the heap.
You can read more over at MSDN:
http://msdn.microsoft.com/en-us/library/system.object.aspx
As a side note to other "answers" a struct is a value type, that also inherits from object.
Maybe the answer is that it's assumed that we are programming in an object-oriented style/paradigm? A ball is an object. A sword is an object. An employee is an object. A purchase order is an object.
Some companies design their .NET (or Java or Ruby or PHP) applications to inherit from a common base class, so that they can all be treated the same way in their system. If I remember correctly from back in my old Java days... all EJBs share the same base class so that they can be managed and identified uniformly.
I have an exam question from a past paper that I'm trying to answer:
Discuss variables of type primitive, reference and static in the context of a programming language. Give suitable examples [8].
The answer I have so far is:
A primitive type is an object which the language has given a predefined value. These types include int, bool and float. Reference type objects refer to these primitive types in a particular sequence when instantiated. Examples of these are strings and arrays. The static keyword, when assigned to a variable, means that there is only one instance of this variable and the value assigned applies to all references of the variable.
I'm fairly new to programming so I don't know if this is exactly right, so if anyone could give me some tips on how to improve the mark I would get for this question I'd greatly appreciate it.
A primitive type is an object which the language has given a
predefined value
Why? Even references can have predefined values as noted. For primitive (built in) types you may want to say these are types that a language provides built in support for. What your instructor might be glad to hear about is if you say that most primitive types are also value types in C# and you might want to discuss value types semantics (e.g., value type variable directly contains value - whereas a reference variable just contains an address to some object in memory).
About reference types, again you may say that a reference variable doesn't contain the value or object directly, but rather just a reference to it. Here again you may want to discuss reference semantics. For example, if you have two reference variables pointing to the same object and you change the object through one reference, the change will be visible through the other reference too, because both references point to the same object. This is not the case with value types. If you assign the same value to two different value type variables and change one variable, the change will not be visible in the second variable, because each of them holds the value directly (i.e., each has its own copy of the value it was assigned).
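A minimal sketch of those two assignment behaviours, using an int array as the reference type and a plain int as the value type (the class and method names are made up for the example):

```csharp
using System;

static class AssignmentDemo
{
    public static int SharedReference()
    {
        int[] first = { 1, 2, 3 };
        int[] second = first;   // copies the reference, not the array
        second[0] = 99;         // changes the one array both variables refer to
        return first[0];        // 99
    }

    public static int IndependentValues()
    {
        int first = 1;
        int second = first;     // copies the value itself
        second = 99;            // changes only the copy
        return first;           // 1
    }

    static void Main()
    {
        Console.WriteLine(SharedReference());
        Console.WriteLine(IndependentValues());
    }
}
```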
Static types you have already described.
You are on the right track for sure, but you are missing some fundamental concepts about these. Also, the 3 are not mutually exclusive:
A primitive type is simply a syntax shortcut defined by the compiler for Framework Class Library or FCL types.
A reference type is a pointer that represents an instance of a class. The objects they point to are allocated on the heap and the value of the variable is the memory address of that object rather than the class itself.
Static is not a type at all, but really defines where and when fields, properties, methods, and classes can be used. A static variable lives on the class rather than on an instance. A static constructor is called the first time you access the class. A static method can be called from the class definition. That explains the persistence you see in static variables as you create and destroy instances.
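To make the "lives on the class" point concrete, here is a small sketch. Widget and its members are hypothetical names chosen for the illustration.

```csharp
using System;

// Hypothetical class for illustration.
class Widget
{
    public static int Created;   // one slot shared by the whole class
    public int Id;               // one slot per instance

    // Static constructor: runs once, before the class is first used.
    static Widget() { Created = 0; }

    public Widget()
    {
        Created++;               // every instance updates the same field
        Id = Created;
    }
}

static class StaticDemo
{
    // Returns how much the shared counter grew after creating three instances.
    public static int GrowthAfterThree()
    {
        int before = Widget.Created;
        new Widget();
        new Widget();
        new Widget();
        return Widget.Created - before;
    }

    static void Main()
    {
        Console.WriteLine(GrowthAfterThree());
    }
}
```

The instances themselves can be discarded immediately; the count survives because it belongs to Widget, not to any one object.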
The answer to that question, in my opinion -- has not a thing to do with OOP and everything to do with the compiler and microprocessor.
The simplest and most accurate definition of the term that subsumes all of the qualities of a primitive type -- as I understand it -- is:
A primitive type must fit into the register used for operations on it -- IOW, in an X86 system -- the Accumulator.
So, primitive types are limited to the size of the Accumulator and can be operated upon by native processor instructions. (Basic math and Boolean/bit-shifting operations). Yes, it fits into heap memory and on the stack, but those are still essentially 8-bit entities and the registers are not.
OOP languages do not use primitive types for their managed memory processes, they use structures that mimic primitive types. (Even in .NET, when you use the keyword int -- it uses System.Int32 to wrap that.)
As Int32 is a struct which means it is a System.ValueType (which inherits System.Object), when I pass an Integer to a function which expects Object, why should CLR box it?
Does the CLR assume that Object is always a reference type?
It is a bit confusing to think that a ValueType "is" an Object, but when you have to pass it "as" an object, you need to box it...
Am I the only one who is wondering about this?
It's not that a type derived from Object is always a reference type, but rather that a variable of type Object always contains a reference. Suppose you wanted to store the actual value in the Object; how then would you decide how big the Object value would need to be?
A variable of a compile-time-known value type has a known size for which space can be allocated, but an Object, being able to 'contain' any value type, cannot be sized in advance. One logical solution then is to have the Object variable contain a special type of reference to a boxed object, whereby the size of the 'box' is allocated dynamically depending on what type is being boxed.
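The following sketch shows what that implies in practice: the Object variable refers to a separately allocated box holding a copy of the value. The class and method names are made up for the example.

```csharp
using System;

static class BoxingDemo
{
    public static bool BoxIsIndependentCopy()
    {
        int n = 5;
        object boxed = n;        // boxing: the value is copied into a new heap object
        n = 6;                   // changing the variable does not touch the box
        return (int)boxed == 5;  // unboxing copies the value back out
    }

    public static bool EachBoxIsDistinct()
    {
        int n = 5;
        object a = n;            // first box
        object b = n;            // second, separate box
        // Equal contents, but two different objects.
        return a.Equals(b) && !ReferenceEquals(a, b);
    }

    static void Main()
    {
        Console.WriteLine(BoxIsIndependentCopy());
        Console.WriteLine(EachBoxIsDistinct());
    }
}
```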
Some slightly more technical notes:
Another solution to the above problem would be to treat the Object as a reference to an arbitrary location in memory, which would prevent having to create a boxed copy. This is how it's done in C, where you can create a pointer to a value on the stack, for instance, then pass that to another function for use. This can be quite dangerous though, as what happens, for instance, if the function decides to keep that pointer around and use it at some undefined later time. Since the call stack has changed, that pointer is now pointing to something entirely different than was originally intended and writing to it will almost certainly have disastrous side effects.
Part of the goal of .NET, as a managed runtime, is to provide a 'safe' environment where these particular kinds of failures can't happen. Part of that trade-off is disallowing persisted direct references to stack memory, necessitating boxing when you want to 'persist' the contents of a value type in a variable containing a reference. This used to be a performance problem with collections in .NET 1.1, but the addition of Generics in .NET 2.0 meant that boxing was far less common an occurrence.
My coworker made the claim that there is never a need to use Object when declaring variables, return parameters, etc in .NET 2.0 and newer.
He went further and said in all such cases, a Generic should be used as the alternative.
Is there any validity to this claim? Off the top of my head I use Object for locking concurrent threads...
Generics do trump object in a lot of cases, but only where the type is known.
There are still times when you don't know the type - object, or some other relevant base type is the answer in those instances.
For example:
object o = Activator.CreateInstance(Type.GetType("Some type in an unreferenced assembly"));
You won't be able to cast that result or maybe even know what the type is at compile time, so object is a valid use.
Your co-worker is generalising too much - perhaps point him at this question. Generics are great, give him that much, but they do not "replace" object.
object is perfect for a lock. Generics allow you to keep it typed appropriately. You can even constrain it to an interface or base class. You can't do that with object.
Consider this:
void DoSomething(object foo)
{
foo.DoFoo();
}
That won't work without any casting. But with generics...
void DoSomething<T>(T foo) where T : IHasDoFoo
{
foo.DoFoo();
}
With C# 4.0 and dynamic, you could defer this to runtime, but I really haven't seen a need.
void DoSomething(dynamic foo)
{
foo.DoFoo();
}
When using interop with COM, you don't always have a choice... Generics don't really cater for the issues of interop.
Object is also the most lightweight option for a lock, as @Daniel A. White mentioned in his answer.
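For reference, the locking pattern mentioned above usually looks something like this sketch. SafeCounter and its members are hypothetical names; the iteration count is arbitrary.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical class for illustration.
class SafeCounter
{
    // A plain object whose only job is to be a private, unique lock token.
    private readonly object _gate = new object();
    private int _count;

    public void Increment()
    {
        lock (_gate) { _count++; }
    }

    public int Count
    {
        get { lock (_gate) { return _count; } }
    }
}

static class LockDemo
{
    public static int Run()
    {
        var counter = new SafeCounter();
        // Many concurrent increments; the lock keeps each one atomic.
        Parallel.For(0, 10000, i => counter.Increment());
        return counter.Count;
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

No generic type would add anything here, since nothing about the token matters except its identity.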
Yes there is validity. A good breakdown has already been made here.
However, I cannot confirm that there is no case at all where you would need object; personally I do not use it, and even before generics I avoided boxing/unboxing.
There are lots of counterexamples, including the one you mentioned, using an object for synchronisation.
Another example is the DataSource property used in databinding, which can be set to one of a variety of different object types.
Broad counterexample: The System.Collections namespace is alive and well in .NET 4, no sign of deprecation or warning against its use on MSDN. The methods you find there take and return Objects.
Inherent in the question are actually two questions:
When should storage locations of type `Object` be used
When should instances of type `Object` be used
Storage locations of type Object must obviously be used in any circumstance where it will be necessary to hold references to instances of that type (since references to such instances cannot be held in any other type). Beyond that, they should be used in cases where they will hold references to objects which have no single useful common base type. This is obviously true in many scenarios using Reflection (where the type of an object may depend upon a string computed at run-time), but can also apply to certain varieties of collection which are populated with things whose type is known at compile time. As a simple example, one could represent a hierarchical collection of string indexed by sequences of int by having each node be of type Object, and having it hold either a String or an Object[]. Reading out items from such a collection would be somewhat clunky, since one would have to examine each item and determine whether it was an instance of Object[] or String, but such a method of storage would be extremely memory-efficient, since the only object instances would be those which either held the strings or the arrays. One could define a Node type with a field of type String and one of type Node[], or even define an abstract Node type with derived types StringNode (including a field of type String) and ArrayNode (with a field of type Node[]) but such approaches would increase the number of heap objects used to hold a given set of data.
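A rough sketch of such a hierarchical collection, where every node is either a String or an Object[], might look like this. The helper names CountStrings and Lookup are invented for the illustration and are not part of any library.

```csharp
using System;

static class ObjectTreeDemo
{
    // Each node is either a string (leaf) or an object[] (branch).
    public static int CountStrings(object node)
    {
        if (node is string) return 1;

        var children = node as object[];
        if (children == null)
            throw new ArgumentException("Node must be a string or an object[].");

        int total = 0;
        foreach (object child in children) total += CountStrings(child);
        return total;
    }

    // Follows a sequence of int indexes from the root down to a string leaf.
    public static string Lookup(object root, params int[] path)
    {
        object current = root;
        foreach (int index in path) current = ((object[])current)[index];
        return (string)current;
    }

    public static object SampleTree()
    {
        return new object[] { "a", new object[] { "b", "c" }, "d" };
    }

    static void Main()
    {
        object tree = SampleTree();
        Console.WriteLine(CountStrings(tree));
        Console.WriteLine(Lookup(tree, 1, 0));
    }
}
```

The type tests and casts are the clunky part described above; the payoff is that the only heap objects are the strings and the arrays themselves.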
Note that in general it's better to design collections so that the type of an object to be retrieved won't depend upon what's been shoved into the collection (perhaps using "parallel collections" for different types) but not everything works out that way semantically.
With regard to instances of type Object, I'm not sure there's any role they can fill which wouldn't be just as well satisfied by a sealed type called something like TokenObject which inherits from Object. There are a number of situations where it is useful to have an object instance whose sole purpose is to be a unique token. Conceptually, it might have been nicer to say:
TokenObject myLock = new TokenObject();
than to say
Object myLock = new Object();
since the former declaration would make clear that the declared variable was never going to be used to hold anything other than a token object. Nonetheless, common practice is to use instances of type Object in cases where the only thing that matters about the object is that its reference will be unique throughout the lifetime of the program.
Quick note on the accepted answer: I disagree with a small part of Jeffrey's answer, namely the point that since Delegate had to be a reference type, it follows that all delegates are reference types. (It simply isn't true that a multi-level inheritance chain rules out value types; all enum types, for example, inherit from System.Enum, which in turn inherits from System.ValueType, which inherits from System.Object, all reference types.) However I think the fact that, fundamentally, all delegates in fact inherit not just from Delegate but from MulticastDelegate is the critical realization here. As Raymond points out in a comment to his answer, once you've committed to supporting multiple subscribers, there's really no point in not using a reference type for the delegate itself, given the need for an array somewhere.
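The inheritance chains mentioned in this note can be checked with reflection. The sketch below uses DayOfWeek and Action as arbitrary examples; the class and method names are made up.

```csharp
using System;

static class InheritanceChainDemo
{
    // enum -> System.Enum -> System.ValueType -> System.Object
    public static bool EnumChain()
    {
        return typeof(DayOfWeek).BaseType == typeof(Enum)
            && typeof(Enum).BaseType == typeof(ValueType)
            && typeof(ValueType).BaseType == typeof(object);
    }

    // Only the enum itself is a value type; its base types are classes.
    public static bool OnlyLeafIsValueType()
    {
        return typeof(DayOfWeek).IsValueType
            && !typeof(Enum).IsValueType
            && !typeof(ValueType).IsValueType;
    }

    // delegate -> System.MulticastDelegate -> System.Delegate
    public static bool DelegateChain()
    {
        return typeof(Action).BaseType == typeof(MulticastDelegate)
            && typeof(MulticastDelegate).BaseType == typeof(Delegate);
    }

    static void Main()
    {
        Console.WriteLine(EnumChain());
        Console.WriteLine(OnlyLeafIsValueType());
        Console.WriteLine(DelegateChain());
    }
}
```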
See update at bottom.
It has always seemed strange to me that if I do this:
Action foo = obj.Foo;
I am creating a new Action object, every time. I'm sure the cost is minimal, but it involves allocation of memory to later be garbage collected.
Given that delegates are inherently themselves immutable, I wonder why they couldn't be value types? Then a line of code like the one above would incur nothing more than a simple assignment to a memory address on the stack*.
Even considering anonymous functions, it seems (to me) this would work. Consider the following simple example.
Action foo = () => { obj.Foo(); };
In this case foo does constitute a closure, yes. And in many cases, I imagine this does require an actual reference type (such as when local variables are closed over and are modified within the closure). But in some cases, it shouldn't. For instance in the above case, it seems that a type to support the closure could look like this:
I take back my original point about this. The below really does need to be a reference type (or: it doesn't need to be, but if it's a struct it's just going to get boxed anyway). So, disregard the below code example. I leave it only to provide context for answers that specifically mention it.
struct CompilerGenerated
{
Obj obj;
public CompilerGenerated(Obj obj)
{
this.obj = obj;
}
public void CallFoo()
{
obj.Foo();
}
}
// ...elsewhere...
// This would not require any long-term memory allocation
// if Action were a value type, since CompilerGenerated
// is also a value type.
Action foo = new CompilerGenerated(obj).CallFoo;
Does this question make sense? As I see it, there are two possible explanations:
Implementing delegates properly as value types would have required additional work/complexity, since support for things like closures that do modify values of local variables would have required compiler-generated reference types anyway.
There are some other reasons why, under the hood, delegates simply can't be implemented as value types.
In the end, I'm not losing any sleep over this; it's just something I've been curious about for a little while.
Update: In response to Ani's comment, I see why the CompilerGenerated type in my above example might as well be a reference type, since if a delegate is going to comprise a function pointer and an object pointer it'll need a reference type anyway (at least for anonymous functions using closures, since even if you introduced an additional generic type parameter—e.g., Action<TCaller>—this wouldn't cover types that can't be named!). However, all this does is kind of make me regret bringing the question of compiler-generated types for closures into the discussion at all! My main question is about delegates, i.e., the thing with the function pointer and the object pointer. It still seems to me that could be a value type.
In other words, even if this...
Action foo = () => { obj.Foo(); };
...requires the creation of one reference type object (to support the closure, and give the delegate something to reference), why does it require the creation of two (the closure-supporting object plus the Action delegate)?
*Yes, yes, implementation detail, I know! All I really mean is short-term memory storage.
The question boils down to this: the CLI (Common Language Infrastructure) specification says that delegates are reference types. Why is this so?
One reason is clearly visible in the .NET Framework today. In the original design, there were two kinds of delegates: normal delegates and "multicast" delegates, which could have more than one target in their invocation list. The MulticastDelegate class inherits from Delegate. Since you can't inherit from a value type, Delegate had to be a reference type.
In the end, all actual delegates ended up being multicast delegates, but at that stage in the process, it was too late to merge the two classes. See this blog post about this exact topic:
We abandoned the distinction between Delegate and MulticastDelegate
towards the end of V1. At that time, it would have been a massive
change to merge the two classes so we didn’t do so. You should
pretend that they are merged and that only MulticastDelegate exists.
In addition, delegates currently have 4-6 fields, all pointers. 16 bytes is usually considered the upper bound where saving memory still wins out over extra copying. A 64-bit MulticastDelegate takes up 48 bytes. This, together with the fact that they were using inheritance, suggests that a class was the natural choice.
There is only one reason that Delegate needs to be a class, but it's a big one. While a delegate could be small enough to allow efficient storage as a value type (8 bytes on 32-bit systems, or 16 bytes on 64-bit systems), there's no way it could be small enough to be written atomically. If one thread attempted to write a delegate while another thread attempted to execute it, nothing could efficiently guarantee that the latter thread wouldn't end up invoking either the old method on the new target, or the new method on the old target. Allowing such a thing to occur would be a major security hole. Having delegates be reference types avoids this risk.
Actually, even better than having delegates be structure types would be having them be interfaces. Creating a closure requires creating two heap objects: a compiler-generated object to hold any closed-over variables, and a delegate to invoke the proper method on that object. If delegates were interfaces, the object which held the closed-over variables could itself be used as the delegate, with no other object required.
Imagine if delegates were value types.
public delegate void Notify();
void SignalTwice(Notify notify) { notify(); notify(); }
int counter = 0;
Notify handler = () => { counter++; };
SignalTwice(handler);
System.Console.WriteLine(counter); // what should this print?
Per your proposal, this would internally be converted to
struct CompilerGenerated
{
int counter = 0;
public void Execute() { ++counter; }
};
Notify handler = new CompilerGenerated();
SignalTwice(handler);
System.Console.WriteLine(counter); // what should this print?
If delegates were value types, then SignalTwice would get a copy of handler, which means that a brand new CompilerGenerated would be created (a copy of handler) and passed to SignalTwice. SignalTwice would execute the delegate twice, which increments the counter twice in the copy. And then SignalTwice returns, and the function prints 0, because the original was not modified.
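The thought experiment can be approximated in real code. In this sketch, StructHandler is a hypothetical stand-in for a value-type delegate (C# has no such thing), and it is compared against an actual Notify delegate.

```csharp
using System;

delegate void Notify();

// Hypothetical stand-in for a "value-type delegate" that carries its own state.
struct StructHandler
{
    public int Counter;
    public void Execute() { Counter++; }
}

static class DelegateCopyDemo
{
    static void SignalTwice(Notify notify) { notify(); notify(); }

    // The struct is copied on the way in, so only the copy is incremented.
    static void SignalTwice(StructHandler handler) { handler.Execute(); handler.Execute(); }

    public static int WithRealDelegate()
    {
        int counter = 0;
        Notify handler = () => { counter++; };
        SignalTwice(handler);
        return counter;          // 2: the delegate refers to shared closure state
    }

    public static int WithStructCopy()
    {
        var handler = new StructHandler();
        SignalTwice(handler);
        return handler.Counter;  // 0: the original struct was never touched
    }

    static void Main()
    {
        Console.WriteLine(WithRealDelegate());
        Console.WriteLine(WithStructCopy());
    }
}
```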
Here's an uninformed guess:
If delegates were implemented as value types, instances would be very expensive to copy around since a delegate instance is relatively heavy. Perhaps MS felt it would be safer to design them as immutable reference types: copying machine-word-sized references to instances is relatively cheap.
A delegate instance needs, at the very least:
An object reference (the "this" reference for the wrapped method if it is an instance method).
A pointer to the wrapped function.
A reference to the object containing the multicast invocation list. Note that a delegate-type should support, by design, multicast using the same delegate type.
Let's assume that value-type delegates were implemented in a similar manner to the current reference-type implementation (this is perhaps somewhat unreasonable; a different design may well have been chosen to keep the size down) to illustrate. Using Reflector, here are the fields required in a delegate instance:
System.Delegate: _methodBase, _methodPtr, _methodPtrAux, _target
System.MulticastDelegate: _invocationCount, _invocationList
If implemented as a struct (no object header), these would add up to 24 bytes on x86 and 48 bytes on x64, which is massive for a struct.
On another note, I want to ask how, in your proposed design, making the CompilerGenerated closure-type a struct helps in any way. Where would the created delegate's object pointer point to? Leaving the closure type instance on the stack without proper escape analysis would be extremely risky business.
I can tell you that making delegates reference types is definitely a bad design choice. They could be value types and still support multicast delegates.
Imagine that Delegate is a struct composed of, let's say:
object target;
pointer to the method
It can be a struct, right?
The boxing will only occur if the target is a struct (but the delegate itself will not be boxed).
You may think it will not support MultiCastDelegate, but then we can:
Create a new object that will hold the array of normal delegates.
Return a Delegate (as struct) to that new object, which will implement Invoke iterating over all its values and calling Invoke on them.
So, for normal delegates, that are never going to call two or more handlers, it could work as a struct.
Unfortunately, that is not going to change in .Net.
As a side note, variance does not require delegates to be reference types. It is the parameters of the delegate that should be reference types. After all, if you pass a string where an object is required (for input, not ref or out), then no cast is needed, as a string is already an object.
I saw this interesting conversation on the Internet:
Immutable doesn't mean it has to be a value type. And something that
is a value type is not required to be immutable. The two often go
hand-in-hand, but they are not actually the same thing, and there are
in fact counter-examples of each in the .NET Framework (the String
class, for example).
And the answer:
The difference being that while immutable reference types are
reasonably common and perfectly reasonable, making value types mutable
is almost always a bad idea, and can result in some very confusing
behaviour!
Taken from here
So, in my opinion the decision was made by language usability aspects, and not by compiler technological difficulties. I love nullable delegates.
I guess one reason is support for multicast delegates. Multicast delegates are more complex than simply a few fields indicating target and method.
Another thing that's only possible in this form is delegate variance. This kind of variance requires a reference conversion between the two types.
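As a sketch of what delegate variance looks like with the generic Func types (C# 4 and later), and of the fact that the conversion keeps the same delegate instance; the method names here are made up for the example:

```csharp
using System;

static class VarianceDemo
{
    static string MakeGreeting() { return "hello"; }
    static int LengthOf(object o) { return o.ToString().Length; }

    // Covariance: a Func<string> can be used where a Func<object> is expected.
    public static object Covariant()
    {
        Func<string> source = MakeGreeting;
        Func<object> widened = source;
        return widened();
    }

    // The conversion is a reference conversion: no new delegate is created.
    public static bool SameInstance()
    {
        Func<string> source = MakeGreeting;
        Func<object> widened = source;
        return ReferenceEquals(source, widened);
    }

    // Contravariance: a Func<object, int> can be used as a Func<string, int>.
    public static int Contravariant()
    {
        Func<object, int> takesObject = LengthOf;
        Func<string, int> narrowed = takesObject;
        return narrowed("hello");
    }

    static void Main()
    {
        Console.WriteLine(Covariant());
        Console.WriteLine(SameInstance());
        Console.WriteLine(Contravariant());
    }
}
```

Note that this only applies when the type arguments are reference types; converting a Func<int> to a Func<object> does not compile.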
Interestingly, F# defines its own function pointer type that's similar to delegates, but more lightweight. But I'm not sure if it's a value or reference type.
What is the definition of a value class and reference class in C#?
How does this differ from a value type and reference type?
I ask this question because I read this in the MCTS Self-Paced Training Kit (Exam 70-536). Chapter 1, Lesson 1, Lesson review 4 :
You need to create a simple class or
structure that contains only value
types. You must create the class or
structure so that it runs as
efficiently as possible. You must be
able to pass the class or structure to
a procedure without concern that the
procedure will modify it. Which of the
following should you create?
A reference class
B reference structure
C value class
D value structure
Correct Answer: D
A Incorrect: You could create a
reference class; however, it could be
modified when passed to a procedure.
B Incorrect: You cannot create a
reference structure.
C Incorrect: You could create a value
class; however, structures tend to be
more efficient.
D Correct: Value structures are
typically the most efficient.
You may be thinking of C++/CLI which, unlike C#, allows the user to declare a "value class" or a "ref class."
In C#, any class you declare will implicitly be a reference class - only built-in types, structs, and enums have value semantics.
To read about value class in C++/CLI, look here:
http://www.ddj.com/cpp/184401955
Value classes have very little functionality compared to ref classes, and are useful for "plain old data"; that is, data which has no identity. Since you're copying the data when you assign one to another, the system provides you with a default (and mandatory) copy constructor which simply copies the data over to the other object.
To convert a value class into a reference class (thereby putting it on the garbage-collected heap) you can "box" it.
To decide whether a class you are writing is one or the other, ask yourself whether it has an identity. That usually means that it has some state, or has an identifier or a name, or a notion of its own context (for example a node pointing to nearby nodes).
If it doesn't, it's probably a value class.
In C#, however, value classes are declared as "structs".
See the overview on the subject, but seriously, follow the MSDN links and read the full Common Type System chapter. (You could also have asked this in a comment on the first question.)
Value types are passed by value, while for reference types it is the reference that is passed, so the method works on the same object as the caller.
Edit: value/reference classes
There is no concept of a 'value class' or 'reference class' in C#, so asking for its definition is moot.
Value types store the actual data while reference types store references to the data. Reference type instances are allocated dynamically on the heap, while value types are typically stored on the stack (as local variables) or inline within their containing object.
Value Types: http://msdn.microsoft.com/en-us/library/s1ax56ch.aspx
Reference Types: http://msdn.microsoft.com/en-us/library/490f96s2.aspx
When you refer to a value type (that is, by using its name), you're talking about the place in memory where the data is. As such, value types can't be null because there's no way for the memory location to say "I don't represent anything." By default, you pass value types by value (that is, the object you pass in to methods doesn't change as a result of the method's execution).
When you use a reference type object, you're actually using a pointer in disguise. The name refers to a memory location, which then references a place in memory where the object actually lives. Hence you can assign null to a reference type, because they have a way of saying "I point to nowhere." Reference types also allow the object to be changed as a result of methods executing, so you can change myReferenceObject's properties by passing it into a method call.
Reference types are passed to methods by reference and value types by value; in the latter case a method receives a copy of the variable and in the former it receives a reference to the original data. If you change your copy, the original does not change. If you change the original data you have a reference to, the change is visible everywhere a reference to that data is held. If a program similar to your C# program were written in C, reference types would generally be like data accessed through pointers and value types would be like normal data on the stack.
Numeric types, char, date, enumerations, and structures are all value types. Strings, arrays, delegates and classes (i.e., most things, really) are reference types.
If my understanding is correct, you can accomplish a "value class", or immutable class, through the use of readonly member variables initialized through the constructor. Once created, these cannot be changed.
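A minimal sketch of that idea, assuming a hypothetical Money class (the name and members are invented for the illustration):

```csharp
using System;

// Hypothetical immutable class: all state is readonly and set in the constructor.
sealed class Money
{
    private readonly decimal _amount;
    private readonly string _currency;

    public Money(decimal amount, string currency)
    {
        _amount = amount;
        _currency = currency;
    }

    public decimal Amount { get { return _amount; } }
    public string Currency { get { return _currency; } }

    // "Changing" a value means creating a new instance; the original is untouched.
    public Money Add(decimal extra)
    {
        return new Money(_amount + extra, _currency);
    }
}

static class ImmutableDemo
{
    public static decimal OriginalAfterAdd()
    {
        var original = new Money(10m, "EUR");
        Money bigger = original.Add(5m);
        return original.Amount;   // still 10
    }

    public static decimal ResultOfAdd()
    {
        return new Money(10m, "EUR").Add(5m).Amount;   // 15
    }

    static void Main()
    {
        Console.WriteLine(OriginalAfterAdd());
        Console.WriteLine(ResultOfAdd());
    }
}
```

Such a class can be passed to a procedure without concern that it will be modified, but it is still a reference type: it lives on the heap and variables of its type can be null.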