Should I use default(Foo), Foo.Empty, or null? - c#

So C# now allows you to use default(Foo) to get a recognized "not filled in yet"/empty instance of a class -- I'm not sure if it is exactly the same as new Foo() or not. Many library classes also implement a Foo.Empty property, which returns a similar instance. And of course any reference type can point to null. So really, what's the difference? When is one right or wrong? What's more consistent, or performs better? What tests should I use when checking if an object is conceptually "not ready for prime time"? Not everybody has Foo.IsNullOrEmpty().

default(Foo) returns null when Foo is a class type, zero when Foo is a simple value type (such as int), and, when Foo is a struct, an instance of Foo with every field initialized to its own default() value. It was added to the language so that generics could support both value and reference types - more info at MSDN
Use default(Foo) when you're testing a T in the context of SomeClass<T> or MyMethod<T> and you don't know whether T will be a simple value type, a class type or a struct.
Otherwise, null should mean "unknown", and empty should mean "I know this is empty". Use the Foo.Empty pattern if you genuinely need an empty - but non-null - instance of your class; e.g. String.Empty as an alternative to "" if you need to initialize some variable to the empty string.
Use null if you know you're working with reference types (classes), there's no generics involved, and you're explicitly testing for uninitialized references.
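As a minimal sketch of the three cases described above, the following program prints what default(T) yields for a class, an int and a struct. The types Customer and Point and the helper GetDefault are hypothetical names invented for this illustration.

```csharp
using System;

// Hypothetical types used only for this illustration.
class Customer { }
struct Point { public int X; public int Y; }

static class DefaultDemo
{
    // Returns default(T) for whatever T the caller supplies.
    public static T GetDefault<T>() => default(T);

    static void Main()
    {
        Console.WriteLine(GetDefault<Customer>() == null); // True: class types default to null
        Console.WriteLine(GetDefault<int>());              // 0: numeric types default to zero

        Point p = GetDefault<Point>();
        Console.WriteLine(p.X + "," + p.Y);                // 0,0: every struct field is defaulted

        Console.WriteLine(string.Empty.Length);            // 0: Empty is a real, non-null instance
    }
}
```

Note that string.Empty is an actual object you can call members on, whereas a null string would throw on the same call.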

When you know the actual type involved, or if you've got a type parameter constrained with ": class" it's simplest to use the known value (null, 0 etc).
When you've just got a type parameter which is unconstrained or constrained other than to be a reference type, you need to use default(T).
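The distinction between constrained and unconstrained type parameters can be sketched roughly as follows. The helper names IsDefault and IsMissing are assumptions made for this example; EqualityComparer<T>.Default is used because the == operator is not available on an unconstrained T.

```csharp
using System;
using System.Collections.Generic;

static class Checks
{
    // Unconstrained T: we cannot write "value == null" meaningfully for structs,
    // so compare against default(T) through the default equality comparer.
    public static bool IsDefault<T>(T value) =>
        EqualityComparer<T>.Default.Equals(value, default(T));

    // T constrained to reference types: a plain null test is enough.
    public static bool IsMissing<T>(T value) where T : class => value == null;

    static void Main()
    {
        Console.WriteLine(IsDefault(0));                 // True
        Console.WriteLine(IsDefault(5));                 // False
        Console.WriteLine(IsDefault<string>(null));      // True
        Console.WriteLine(IsDefault(""));                // False: empty is not null
        Console.WriteLine(IsDefault(default(DateTime))); // True
        Console.WriteLine(IsMissing<string>(null));      // True
    }
}
```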

default(Foo) works for both value types and reference types. new Foo(), null and Foo.Empty do not. This makes it a good choice for use with generic types, for example, when you may not know which you're dealing with. But in most reference-type cases, null is probably good enough.

Related

Why is Nullable type supported by CLR?

I heard that adding the Nullable<T> type in C# 2.0 required a small revision to the CLR (the runtime). Was this change necessary? Could the same goal have been reached by adding only a new Nullable<T> generic class?
Nullable isn't a generic class as you indicate; Nullable<T> is generic (it has a type parameter, T). That's why Nullable<T> only arrived in C# 2.0: it came with the addition of generics to the CLR.
You could do the same thing with a non-generic Nullable, but you couldn't do things like this:
int? myInt = 123;
int result = myInt.Value;
You would instead have to:
int result = (int)myInt.Value;
...and it might not be type-safe, by that I mean what if myInt.Value is a string? The generic version, Nullable<T>, would only allow an int into the Value property.
I don't completely understand what you're asking, though.. "why are generic classes useful"?
If I understand you correctly, you are asking why it couldn't just be a type introduced in the framework library, and why the CLR needed to be changed instead?
Well, based on my understanding, Nullable<T> is a special type, not quite like other container types. First, it is actually a value type, defined as a struct, not a class. Second, it can be assigned null (for a value type, that is a special case), and it supports the '?' suffix and the '??' operator, which were both new. Nullable<T> also became part of the Common Type System. So I guess from these perspectives, the specification, compiler and CLR all needed to be changed.
There are only two reasons I know of for the special handling of Nullable<T>:
To allow a typecast from null Object reference to a Nullable<T> with HasValue = false.
To allow a Nullable<T> where HasValue is false, to compare equal to null.
Frankly, I think it would have been better to let Nullable<T> box like any other value type, and define Nullable<T>.Empty as a value which may be compared against (for those cases where one might want to compare against a variable that might be null or might hold a value). To my mind, there's no reason why Object.Equals should report that an "int?" which is equal to null is equal to a "long?" which is also equal to null. The first should be viewed as an empty int-sized box, and the latter as an empty long-sized box.
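The special handling discussed above can be observed directly. This is a small sketch, with hypothetical helper names BoxesToNull and Unbox, showing that an empty Nullable<int> boxes to a null reference and that a null object reference unboxes to an empty Nullable<int>.

```csharp
using System;

static class NullableBoxingDemo
{
    // Boxes the nullable and reports whether the resulting reference is null.
    public static bool BoxesToNull(int? value)
    {
        object boxed = value;   // the runtime boxes the underlying int, or yields null
        return boxed == null;
    }

    // Unboxes an object reference (possibly null) back into an int?.
    public static int? Unbox(object boxed) => (int?)boxed;

    static void Main()
    {
        int? empty = null;
        int? full = 42;

        Console.WriteLine(BoxesToNull(empty));   // True
        Console.WriteLine(BoxesToNull(full));    // False

        object b = full;
        Console.WriteLine(b.GetType());          // System.Int32: the box holds a plain int

        Console.WriteLine(Unbox(null).HasValue); // False
        Console.WriteLine(empty == null);        // True
    }
}
```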

Definition of Type in .NET

This question gives me curiosity... When you want to define a type you must say GetType(Type) ex.: GetType(string), but ain't String a type itself?
Why do you need to use GetType in those situations? And, if the reason is because it is expecting a Type 'Type'... why isn't the conversion implicit... I mean, all the data is there...
What you're doing is getting a reference to the meta-data of the type ... it might be a little more obvious if you look at the C# version of the API, which is typeof(string) ... which returns a Type object with information about the string type.
You would generally do this when using reflection or other meta-programming techniques
string is a type, int is a type, and Type is a type, and they are not the same. As for why there is no implicit conversion, MSDN advises caution with implicit conversions:
By eliminating unnecessary casts, implicit conversions can improve source code readability. However, because implicit conversions can occur without the programmer's specifying them, care must be taken to prevent unpleasant surprises. In general, implicit conversion operators should never throw exceptions and never lose information so that they can be used safely without the programmer's awareness. If a conversion operator cannot meet those criteria, it should be marked explicit.
Pay attention to:
never lose information so that they can be used safely without the programmer's awareness
When you want to define a type you must say GetType(Type) ex.: GetType(string)...
That's not true. Every time you do any of the following
class MyClass
{
///...
}
class MyChildClass : MyClass
{
}
struct MyStruct
{
///...
}
you're defining a new type.
if the reason is because it is expecting a Type 'Type'... why isn't the conversion implicit... I mean, all the data is there...
One reason for this is polymorphism. For instance, if we were allowed to do the following:
MyChildClass x;
....GetType(x)
GetType(x) could return MyChildClass, MyClass, or Object, since x is really an instance of all of those types.
It's also worth noting that Type is itself a class (ie, it inherits from Object), so you can inherit from it. Although I'm not sure why you'd want to do this other than overriding the default reflection behavior (for instance, to hide the internals from prying eyes).
GetType(string) will return the same information. Look at it like you would a constant. The only other way to get the Type object that represents a string would be to instantiate the string object and call o.GetType(). Also, this is not possible for interfaces and abstract types.
If you want to know the runtime type of a variable, call the .GetType() method off of it, as the runtime type may not be the same as the declared type of the variable.
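To illustrate the difference between asking for a type's metadata by name and asking an instance for its runtime type, here is a rough sketch in C#, where typeof(...) plays the role of the GetType(...) operator mentioned in the question. Animal, Dog and RuntimeTypeOf are hypothetical names used only for this example.

```csharp
using System;

// Hypothetical class hierarchy for the illustration.
class Animal { }
class Dog : Animal { }

static class TypeDemo
{
    // Returns the runtime type of whatever object is passed in.
    public static Type RuntimeTypeOf(object o) => o.GetType();

    static void Main()
    {
        // typeof works on a type name and needs no instance.
        Type t = typeof(string);
        Console.WriteLine(t.FullName);                          // System.String

        // The declared type is Animal, but the runtime type is Dog.
        Animal pet = new Dog();
        Console.WriteLine(RuntimeTypeOf(pet) == typeof(Dog));    // True
        Console.WriteLine(RuntimeTypeOf(pet) == typeof(Animal)); // False
        Console.WriteLine(typeof(Animal).IsAssignableFrom(pet.GetType())); // True

        // typeof also works for interfaces, which cannot be instantiated.
        Console.WriteLine(typeof(IDisposable).IsInterface);      // True
    }
}
```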

What is a Value Class and what is a reference Class in C#?

What is the definition of a value class and reference class in C#?
How does this differ from a value type and reference type?
I ask this question because I read this in the MCTS Self-Paced Training Kit (Exam 70-536). Chapter 1, Lesson 1, Lesson review 4 :
You need to create a simple class or structure that contains only value types. You must create the class or structure so that it runs as efficiently as possible. You must be able to pass the class or structure to a procedure without concern that the procedure will modify it. Which of the following should you create?
A reference class
B reference structure
C value class
D value structure
Correct Answer: D
A Incorrect: You could create a reference class; however, it could be modified when passed to a procedure.
B Incorrect: You cannot create a reference structure.
C Incorrect: You could create a value class; however, structures tend to be more efficient.
D Correct: Value structures are typically the most efficient.
You may be thinking of C++/CLI which, unlike C#, allows the user to declare a "value class" or a "ref class."
In C#, any class you declare is implicitly a reference type; only structs, enums, and the built-in simple types (int, bool, double and so on) have value semantics.
To read about value class in C++/CLI, look here:
http://www.ddj.com/cpp/184401955
Value classes have very little functionality compared to ref classes, and are useful for "plain old data"; that is, data which has no identity. Since you're copying the data when you assign one to another, the system provides you with a default (and mandatory) copy constructor which simply copies the data over to the other object.
To convert a value class into a reference class (thereby putting it on the garbage-collected heap) you can "box" it.
To decide whether a class you are writing is one or the other, ask yourself whether it has an identity. That usually means that it has some state, or has an identifier or a name, or a notion of its own context (for example a node pointing to nearby nodes).
If it doesn't, it's probably a value class.
In C#, however, value classes are declared as "structs".
See the overview on the subject, but seriously, follow the MSDN links and read the full Common Type System chapter. (You could also have asked in a comment on the first question.)
By default, arguments of both kinds are passed by value: for a value type the data itself is copied, while for a reference type only the reference is copied, so the method works on the same object as the caller.
Edit: value/reference classes
There is no concept of a 'value class' or 'reference class' in C#, so asking for its definition is moot.
Value types store the actual data while reference types store references to the data. Instances of reference types live on the heap, while a value type is stored wherever its variable lives: typically on the stack for locals, or inline inside the containing object for fields.
Value Types: http://msdn.microsoft.com/en-us/library/s1ax56ch.aspx
Reference Types: http://msdn.microsoft.com/en-us/library/490f96s2.aspx
When you refer to a value type (that is, by using its name), you're talking about the place in memory where the data is. As such, value types can't be null because there's no way for the memory location to say "I don't represent anything." By default, you pass value types by value (that is, the object you pass in to methods doesn't change as a result of the method's execution).
When you use a reference type object, you're actually using a pointer in disguise. The name refers to a memory location, which then references a place in memory where the object actually lives. Hence you can assign null to a reference type, because they have a way of saying "I point to nowhere." Reference types also allow the object to be changed as a result of methods executing, so you can change myReferenceObject's properties by passing it into a method call.
When a reference type is passed to a method, the method receives a copy of the reference, which still points to the original data; when a value type is passed, the method receives a copy of the data itself. If you change your copy of a value, the original does not change. If you change the data you have a reference to, the change is visible everywhere a reference to that data is held. If a program similar to your C# program were written in C, reference types would generally correspond to data accessed through pointers and value types to ordinary data on the stack.
Numeric types, char, date, enumerations, and structures are all value types. Strings, arrays, delegates and classes (i.e., most things, really) are reference types.
If my understanding is correct, you can accomplish a "value class", or immutable class, through the use of readonly member variables initialized through the constructor. Once created, these cannot be changed.
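The copying behavior described in these answers can be sketched with a struct and a class that are otherwise identical. PointValue, PointRef, Bump and Replace are hypothetical names for this illustration; no ref or out modifiers are used, so both arguments are passed by value.

```csharp
using System;

// Two hypothetical types that differ only in being a struct or a class.
struct PointValue { public int X; }
class PointRef { public int X; }

static class PassingDemo
{
    // Receives a copy of the struct; the caller's variable is untouched.
    public static void Bump(PointValue p) { p.X++; }

    // Receives a copy of the reference; both copies point at the same object.
    public static void Bump(PointRef p) { p.X++; }

    // Reassigning the parameter only rebinds the local copy of the reference.
    public static void Replace(PointRef p) { p = new PointRef { X = 100 }; }

    static void Main()
    {
        var v = new PointValue { X = 1 };
        var r = new PointRef { X = 1 };

        Bump(v);
        Bump(r);
        Console.WriteLine(v.X); // 1: the method changed its own copy
        Console.WriteLine(r.X); // 2: the method changed the shared object

        Replace(r);
        Console.WriteLine(r.X); // 2: the caller's reference was not replaced
    }
}
```

The last call is why "passed by reference" is a slightly misleading description of classes: the method can mutate the object, but it cannot make the caller's variable point somewhere else unless the parameter is declared with ref.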

Why is null not allowed for DateTime in C#?

Why is it not allowed to assign null to a DateTime in C#? How has this been implemented? And can this feature be used to make your own classes non-nullable?
Example:
string stringTest = null; // Okay
DateTime dateTimeTest = null; // Compile error
I know that I can use DateTime? in C# 2.0 to allow null to be assigned to dateTimeTest and that I could use Jon Skeet's NonNullable class on my string to get a run time error on the assignment of stringTest. I'm just wondering why the two types behave differently.
DateTime is a value type (struct), whereas string is a reference type (class etc). That is the key difference. A reference can always be null; a value can't (unless it uses Nullable<T>, i.e. DateTime?), although it can be zeroed (DateTime.MinValue), which is often interpreted as the same thing as null (esp. in 1.1).
DateTime is a struct and not a class. Do a 'go to definition' or look at it in the object browser to see.
HTH!
The important distinction between ValueTypes and reference types is that value types have these "value semantics". A DateTime, Int32 and all other value types have no identity, an Int32 "42" is essentially indistinguishable from any other Int32 with the same value.
All value type "objects" exist either on stack or as a part of a reference type object. One special case is when you cast a value type instance to an Object or an interface - this is called "boxing", and it simply creates a dummy reference-type object which only contains the value that can be extracted back ("unboxed").
Reference types, on the other hand, have an identity. a "new Object()" does not equal any other "new Object()", because they are separate instances on the GC heap. Some reference types provide Equals method and overloaded operators so that they behave more value-like, eg. a String "abc" equals other "abc" String even if they are in fact two different objects.
So when you have a reference, it can either contain the address of a valid object, or it can be null. When value type objects are all-zero, they are simply zero. Eg. an integer zero, a float zero, Boolean false, or DateTime.MinValue. If you need to distinguish between "zero" and "value missing/null", you need to use either a separate Boolean flag, or, better yet, use the Nullable<T> struct in .NET 2.0, which is simply the value plus a Boolean flag. There's also support in the CLR so that boxing of a Nullable<T> with HasValue=false results in a null reference, not in a boxed structure with false+zero, as it would if you were to implement this structure yourself.
DateTime is a value type, same as an int. Only reference types (like string or MyCustomObject) can be null. Reference types really store "references" to the objects location on the heap.
Here's an article I found that explains it better, and here's the MSDN article on it.
For a value type to be null, there must be some value it can hold which would have no other legitimate meaning, and which the system will somehow know should be regarded as "null". Some value types could meet the first criterion without requiring any extra storage. If .NET had been designed from the ground up with the concept of nullable values in mind, it could have had Object include a virtual IsLogicalNull property, and a non-virtual IsNull which would return true if this is null and otherwise invoke its IsLogicalNull property and return the result. If .NET had done this, it would have avoided the need for the quirky boxing behavior and struct constraint of Nullable<T> (an empty Nullable<T> could be boxed as an empty Nullable<T>, and still be recognized as null).
By the time it was decided to provide support for nullable value types in .net framework 2.0, however, a lot of code had been written which assumed that the default values for things like Guid and DateTime would not be regarded as null. Since much of the value in nullable types lies with their predictable default value (i.e. null) , having types which had a null value, but defaulted to something else, would have added more confusion than value.
string is a class whereas DateTime is a structure. That's why you cannot set it to null.
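A short sketch of the points made in these answers: a plain DateTime always holds a value (its zeroed state is DateTime.MinValue), while DateTime? can genuinely be empty. The helper Describe is a hypothetical name, and the invariant culture is used here only so the output does not depend on the machine's settings.

```csharp
using System;
using System.Globalization;

static class DateDemo
{
    // Hypothetical helper: formats an optional date, or reports that it is missing.
    public static string Describe(DateTime? when) =>
        when.HasValue
            ? when.Value.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)
            : "(not set)";

    static void Main()
    {
        // A DateTime cannot be null; its default is the all-zero value.
        DateTime zeroed = default(DateTime);
        Console.WriteLine(zeroed == DateTime.MinValue);          // True

        // DateTime? is shorthand for Nullable<DateTime> and can hold null.
        DateTime? missing = null;
        Console.WriteLine(Describe(missing));                    // (not set)
        Console.WriteLine(Describe(new DateTime(2009, 1, 15)));  // 2009-01-15

        // ?? substitutes a fallback when the nullable is empty.
        Console.WriteLine((missing ?? DateTime.MinValue) == DateTime.MinValue); // True
    }
}
```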

Confusion with NULLs in C#

I am always confused with the different ways of expressing nulls. There is the null reference type (aka "null"). Then I've seen that throughout my application, developers have used MinValue to represent nulls. Ex: Double.MinValue or DateTime.MinValue except for a String for which they use "null"
Then there is System.DBNull (and System.DBNull.Value - not sure what to use when). To add to the confusion, there are also System.Nullable and System.Nullable<T> namespaces.
Can someone help me clear this null confusion?
Thanks
Sure.
System.DBNull is a class that was (and still is) used by ADO.NET to represent a null value in a database.
null is actually a null reference, and in your application code any reference type should use null as its, well, null value.
The usage of MinValue for various primitive types (which, since they are value types cannot be assigned null) dates back to the dark days before C# 2.0, which introduced generics. Now the preferred method for representing a nullable primitive type is to use the Nullable<T> generic type, usually represented in shorthand with a question mark after the primitive type. For example, I could declare a nullable int variable named foo in C# as:
Nullable<int> foo;
or
int? foo;
Both are identical.
null is only valid for reference types: types that are a class rather than a structure.
.NET also has value types: int, double, DateTime, etc. Value types cannot be null, so you normally compare them to their default value (0 for numeric types, false for Boolean, DateTime.MinValue for DateTime) or to a sentinel chosen by convention, which is very often type.MinValue.
Nullable<T> is for when you have a value type that might genuinely be null. The default value (maybe MinValue) is also valid and you need to distinguish it from when the variable has not been assigned yet. In C#, you can use a ? with a value type as a short hand notation (for example: int?), but you're still creating the same Nullable<T>.
DBNull specifically refers to NULL values from a database. It's not the same thing as using null elsewhere in the language: it's only for talking to a database so you can know when a query returned a null value.
When using generics, you'll also see the default(T) construct used, where T is a type parameter. This allows you to set a type's default value without knowing whether that type is a reference type or a value type, much less what a specific value type's default value might be.
A null value means that a reference-type variable does not refer to any object.
MinValue does not represent null; it is a constant holding the smallest possible value that a given value type can have.
DBNull.Value is the single instance of the DBNull class, which stands in for the NULLs returned from or passed to a database.
The Nullable<T> generic type enables you to assign null to a value type.
null for reference types is the actual null value.
Nullable<T>, added in .NET 2.0, is a way of expressing nulls for value types, which by definition cannot be null.
Same with DateTime.MinValue: DateTime is a value type and cannot be null, so you can have a convention that a well-known value like DateTime.MinValue is treated as if it were null. It also has other usages.
"MinValue"s were used with value types before nullable types came around in C# 2.0. So there is a lot of legacy code which uses the old style of knowing when a value type doesn't have a value. Nowadays it's much easier to use DateTime? date = null than DateTime date = DateTime.MinValue. As far as DBNull goes, that is something that is required as an abstraction layer to databases, but you can avoid having to deal with it yourself by employing an ORM such as NHibernate or some such; from an app development standpoint you will pretty much only have to deal with built-in C# types.
MinValue is not null. It's MinValue. It is sometimes used "as null" by people using C# 1.0, which did not have nullable value types. Instead, they should use Nullable<T>, or DateTime?.
DBNull is a .NET value that is used to communicate with a database ("DB" null). The database concept of NULL means something different from the .NET null reference. In database terms, NULL means "unknown" or "absent".
Reference types (aka objects) can be set to null since a reference type variable is just a pointer to the actual instance. Indicating the lack of an instance is easy since you can set the variable to null directly.
For value types this is a bit harder since a value type variable always contains a value.
Therefore, the use of Double.MinValue or DateTime.MinValue was somewhat valid in the pre-Nullable days. Back then there was no easy way of expressing the lack of a value in value types.
Now, with nullable types you can say:
double? d = null;
And thus you can also have value type variables containing null.
System.DBNull is a different story since it is directly tied to expressing the value "NULL" in datasets. This was introduced before nullable types which imo supersedes DBNull.
Most types in .NET are reference types, and null is "not a reference" which can be assigned to a variable or field to indicate there is no object.
All other types in .NET are value types. There is always a value with a value type, so no ability to indicate "no value". Therefore an implementation may define specific special values to indicate this. Often the value type's MinValue (constant) field is used for this. So using DateTime.MinValue for a date of birth can indicate "not known" or "not applicable" (e.g. for a corporate entity).
DBNull exists to express an RDBMS's meaning of NULL, which is a little different from .NET's. In most cases this will be translated to null or similar.
Finally, Nullable<T> is a wrapper for value types to more directly express "not known", and is generally a better option than using MinValue, but it was added in .NET 2, so older designs may have started using MinValue before Nullable<T> was available.
The best way for you to understand these different null values is to try them and see what each one is used for, but to keep things simple, here are some guidelines:
null can be used only on reference types like collections and custom classes (string is special)
DBNull is for null values that came out of the db query.
Because we can't assign null to a decimal or double, we assign them the MinValue field instead (they are value types)
Nullable<T> is a way for the developer to assign null values to value types like decimal
Nullable<int> is the same as int?
I hope I helped you understand the differences.
The important difference here is that between value types and reference types.
Value types directly represent a certain value, whereas reference types point to a memory location that should represent a value. When a reference type does not actually point to a memory location with valid contents, the reference is null.
Since value types are direct representations of a value, they cannot be null. However, it might be the case that the actual value of a value type is unknown. For this case, .NET provides the Nullable<T> construct. When such a construct is not available, however, people tend to use special or default values, such as MinValue.
When communicating with databases, a lot of things we expect to be value types can actually be NULL, since that is the way the database handles unknown values. This can also be solved by nullable types, but these were not always available. That's why DBNull exists: to deal with a possible null in a database.
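To show how the database NULL and the language-level null relate, here is a minimal sketch that translates between DBNull.Value and int?. The helpers ToNullableInt and ToDbValue are hypothetical; they stand in for the kind of conversion an application might write around values read from or sent to a data provider, and no real database is involved.

```csharp
using System;

static class DbNullDemo
{
    // Converts a raw column value (as a data provider might hand it back) to int?.
    // Both a null reference and DBNull.Value are treated as "no value".
    public static int? ToNullableInt(object raw) =>
        (raw == null || raw is DBNull) ? (int?)null : Convert.ToInt32(raw);

    // Converts an int? into something suitable for a command parameter value.
    public static object ToDbValue(int? value) =>
        value.HasValue ? (object)value.Value : DBNull.Value;

    static void Main()
    {
        Console.WriteLine(ToNullableInt(DBNull.Value).HasValue); // False
        Console.WriteLine(ToNullableInt(7));                     // 7

        Console.WriteLine(ToDbValue(null) is DBNull);            // True
        Console.WriteLine(ToDbValue(7));                         // 7

        // DBNull.Value is an actual object, so it is never equal to a null reference.
        Console.WriteLine(DBNull.Value == null);                 // False
    }
}
```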
