Why can an int? set to null have instance properties? - c#

I'm curious as to why the following code works (run under the VS debugger):
int? x = null;
null
x.HasValue
false
If x is indeed null, what instance does HasValue refer to? Is HasValue implemented as an extension method, or does the compiler special case this to make it magically work?

Because x isn't a reference type. The ? is just syntactic sugar for Nullable<T>, which is a struct (value type).

int? is actually the structure Nullable<int>. Hence your x cannot be null, because it is always an instance of a structure.

Hand-waving answer: Nullable structs are magic.
Longer answer: Null is not actually what is represented by the value. When you assign null to a nullable struct, what you will see happen behind the scenes is different.
int? val = null; // what you wrote
Nullable<Int32> val = new Nullable<Int32>(); // what is actually there
In this case, an instance of the struct is created that has the T Value set to a default value and the bool HasValue property set to false.
The only time you will actually obtain a null reference from a Nullable<T> is when it is boxed, as a Nullable<T> boxes directly to T or null, depending upon if the value is set.
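The boxing behaviour described above can be checked with a small sketch. The class and method names here (NullableBoxingDemo, BoxesToNull, BoxedTypeName) are made up for illustration; only Nullable<T> and its members come from the framework.

```csharp
using System;

public static class NullableBoxingDemo
{
    // Returns true when boxing the given nullable produces a null reference.
    public static bool BoxesToNull(int? value)
    {
        object boxed = value;   // an empty Nullable<int> boxes to a null reference
        return boxed == null;
    }

    // Returns the runtime type name of the boxed value, or "null" if boxing gave null.
    public static string BoxedTypeName(int? value)
    {
        object boxed = value;   // a non-empty Nullable<int> boxes as a plain int
        return boxed == null ? "null" : boxed.GetType().FullName;
    }

    public static void Main()
    {
        int? empty = null;      // equivalent to new Nullable<int>()
        int? five = 5;

        Console.WriteLine(empty.HasValue);        // False (no exception: empty is a struct)
        Console.WriteLine(BoxesToNull(empty));    // True
        Console.WriteLine(BoxedTypeName(five));   // System.Int32
    }
}
```

Note that the boxed form never has the type Nullable<int>: it is either a boxed Int32 or a null reference.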

There are several meanings to null.
One in programming languages which present variables and memory in a pointer-based manner (which includes C#'s references though it hides some of the details) is "this doesn't point to anything".
Another is "this has no meaningful value".
With reference types, we often use the former to represent the latter. We might use string label = null to mean "no meaningful label". It remains, though, that it's still also a matter of what's going on in terms of what's where in memory and what's pointing to it. Still, it's pretty darn useful; what a shame we couldn't do so with int and DateTime in C# 1.1.
That's what Nullable<T> provides, a means to say "no meaningful value", but at the level below it's not null in the same way a null string is (unless boxed). It's been assigned null and is equal to null so it's logically null and null according to some other semantics, but it's not null in the "doesn't point to anything" implementation difference between reference and value types.
It's only the "doesn't point to anything" aspect of reference-type null that stops you from calling instance methods on it.
And actually, even that isn't strictly true. IL lets you call instance methods on a null reference, and as long as the method doesn't interact with any fields, it will work. It can't work if it needs (directly or indirectly) those fields, since they don't exist on a null reference, but it could call null.FineWithNull() if that method was defined as:
int FineWithNull()
{
//note that we don't actually do anything relating to the state of this object.
return 43;
}
With C# it was decided to disallow this, but it's not a rule for all .NET (I think F# allows it, but I'm not sure, I know unmanaged C++ allowed it and it was useful in some very rare cases).

When using int? x = null, x is assigned a new instance of Nullable<int> whose HasValue is false, which is what represents null.
I don't exactly know the internals but I would assume that the assignment operator itself is somewhat overloaded.

A nullable type isn't actually null since it still doesn't get around the fact that value types can't be null. It is, instead, a reference to the Nullable<> struct (which is also a value type and can't be null).
More information here.
Basically, you're always referring to an instance of something when you use a nullable type. The default information returned by that reference is the stored value (or null if there is no stored value).

Nullable isn't really a reference type, and its instance methods are one of the places where this shows up. Fundamentally, it is a struct type containing a boolean flag and a value of the type it is a nullable of. The languages special-case various operators [to be lifting, or to consider (bool?)null false in some cases] and the null literal, and the runtime special-cases boxing, but apart from that it's just a struct.
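As a rough illustration of the operator lifting mentioned above, here is a small sketch. The class and method names (LiftedOperatorsDemo, Add, LessThan, AreEqual) are invented for this example; the behaviour shown is that of the built-in lifted operators on int?.

```csharp
using System;

public static class LiftedOperatorsDemo
{
    // Lifted arithmetic: the result is null if either operand is null.
    public static int? Add(int? a, int? b)
    {
        return a + b;
    }

    // Lifted ordering comparison: false whenever either operand is null.
    public static bool LessThan(int? a, int? b)
    {
        return a < b;
    }

    // Lifted equality: two empty nullables compare equal.
    public static bool AreEqual(int? a, int? b)
    {
        return a == b;
    }

    public static void Main()
    {
        int? none = null;
        int? one = 1;

        Console.WriteLine(Add(none, one).HasValue);   // False
        Console.WriteLine(Add(one, one));             // 2
        Console.WriteLine(LessThan(none, one));       // False
        Console.WriteLine(LessThan(one, none));       // False
        Console.WriteLine(AreEqual(none, null));      // True
    }
}
```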

It's a completely new type. Nullable is not T.
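What you have is a generic struct something like this:
public struct Nullable<T> where T : struct
{
    private readonly bool hasValue; // false for default(Nullable<T>), i.e. "null"
    private readonly T value;
    public Nullable(T value) { this.value = value; this.hasValue = true; }
    public bool HasValue { get { return hasValue; } }
    // Simplified: the real getter throws InvalidOperationException when HasValue is false.
    public T Value { get { return value; } }
}
I'm sure there's more to it (particularly in the Value getter), but that's it in a nutshell.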

The nullable type (in this case: nullable int) has a property of HasValue which is boolean. If HasValue is True, the Value property (of Type T, in this case, an int) will have a valid value.

Related

Difference between Nullable types "not having a value" and "being Null" in C#

What is the difference between the terms
1. Not having a value
and
2. Being null
in nullable types in C#?
In other words, when do we say a type does not have a value, and when do we say the value is null?
Nullable<T> is a struct, and being a struct, it cannot ever actually be null. The only actual data stored in a Nullable<T> is an underlying value of type T and a boolean HasValue flag. A Nullable<T> value in which HasValue is false can be thought of as being null, even though it technically is not null.
There is some compiler magic such that someNullable == null can be called, and the result of that expression is actually the same as !someNullable.HasValue. You can also assign null to a Nullable<T>. What this will actually do is create a new Nullable<T> struct where HasValue is false. (Interestingly, it's not possible for you to do that yourself if you were to write your own type.)
There is some more magic such that if you box a Nullable<T> it doesn't actually box the nullable type. If HasValue is true then it will pull out the underlying value and box that. If HasValue is false then the result is just a null reference. When unboxing something to a Nullable<T> it then does the reverse, creating a Nullable<T> struct based on whether the reference is null or not. (This is another thing that you couldn't do with a custom type.)
Because of these special compiler features it does a fairly good job of making it appear as if Nullable can actually be null, but the reality is that it is not actually null.
A Nullable<T> itself cannot be null because it's a structure. So you may find a Nullable<T> that does not have a value but is not actually null.
EDIT:
Type            | Not having a value   | Being null
-----------------------------------------------------
Reference types | (same as being null) | Possible
Value types     | Not possible         | Not possible
Nullable        | Possible             | Not possible
Here, the separation of Nullable and value types does not mean Nullable is not a value type.
It's been mentioned that the generic nullable type, Nullable<T> (where T must be a value type) has the important properties:
bool HasValue // Indicating that a real value is present, or if the instance should be considered null
T Value // Your value, if present. (non-nullable)
So you could conceivably have these cases for a Nullable<int> myInt:
HasValue | Value | Represents
------------------------------
false | 0 | No Value / null
true | 0 | 0
true | 3 | 3
In the first case, HasValue indicates that the myInt has no value, and represents the null state; a comparison for equality with the null literal will be considered true:
(myInt == null) // true.
The nullable type itself will not actually be null (it's a struct, it can't be), but the compiler will resolve the null comparison by querying HasValue, as mentioned in one of the other answers linked.
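The three rows of the table above can be reproduced with a short sketch. NullableStatesDemo and Describe are hypothetical names used only for this illustration; HasValue, Value and GetValueOrDefault() are real members of Nullable<T>.

```csharp
using System;

public static class NullableStatesDemo
{
    // Formats a nullable int as "HasValue | stored value | what it represents".
    public static string Describe(int? myInt)
    {
        // The == null comparison is resolved by checking HasValue.
        string meaning = (myInt == null) ? "null" : myInt.Value.ToString();

        // GetValueOrDefault() returns the stored value, or 0 when HasValue is false.
        return myInt.HasValue + " | " + myInt.GetValueOrDefault() + " | " + meaning;
    }

    public static void Main()
    {
        Console.WriteLine(Describe(null)); // False | 0 | null
        Console.WriteLine(Describe(0));    // True | 0 | 0
        Console.WriteLine(Describe(3));    // True | 3 | 3
    }
}
```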
If we are discussing terms, let's be clear with language. As many of the other answers have said, in C#, when a non-nullable value type is made nullable using the ? suffix, it becomes the Nullable<T> struct, where T will always be a value type.
Ref: http://msdn.microsoft.com/en-us/library/b3h38hb0(v=vs.110).aspx
see Servy's answer for a good explanation of this.
However, once any type is nullable (regardless of if it is a reference type or has been made to use the Nullable struct) it would be correct to say that if the type has no value, the type is null, and vice versa. From the above Microsoft documentation:
A type is said to be nullable if it can be assigned a value or can be
assigned null, which means the type has no value whatsoever.

Why is the default value of the string type null instead of an empty string?

It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartsWith() etc...
If the default value of string were the empty string, I would not have to test, and I would feel it to be more consistent with the other value types like int or double for example.
Additionally Nullable<String> would make sense.
So why did the designers of C# choose to use null as the default value of strings?
Note: This relates to this question, but is more focused on the why instead of what to do with it.
Why is the default value of the string type null instead of an empty
string?
Because string is a reference type and the default value for all reference types is null.
It's quite annoying to test all my strings for null before I can
safely apply methods like ToUpper(), StartsWith() etc...
That is consistent with the behaviour of reference types. Before invoking their instance members, one should put a check in place for a null reference.
If the default value of string were the empty string, I would not have
to test, and I would feel it to be more consistent with the other
value types like int or double for example.
Assigning the default value to a specific reference type other than null would make it inconsistent.
Additionally Nullable<String> would make sense.
Nullable<T> works only with value types. Of note is the fact that Nullable<T> was not introduced on the original .NET platform, so there would have been a lot of broken code had they changed that rule. (Courtesy @jcolebrand)
Habib is right -- because string is a reference type.
But more importantly, you don't have to check for null each time you use it. You probably should throw an ArgumentNullException if someone passes your function a null reference, though.
Here's the thing -- the framework would throw a NullReferenceException for you anyway if you tried to call .ToUpper() on a null string. Remember that this case still can happen even if you test your arguments for null, since any property or method on the objects passed to your function as parameters may evaluate to null.
That being said, checking for empty strings or nulls is a common thing to do, so they provide String.IsNullOrEmpty() and String.IsNullOrWhiteSpace() for just this purpose.
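A minimal sketch of the two ideas in this answer: guard an argument with ArgumentNullException, and use the built-in helpers for null/empty checks. StringGuardDemo, Shout and IsBlank are names invented for this example.

```csharp
using System;

public static class StringGuardDemo
{
    // Fails fast with a descriptive exception instead of a later NullReferenceException.
    public static string Shout(string text)
    {
        if (text == null)
        {
            throw new ArgumentNullException("text");
        }
        return text.ToUpper();
    }

    // True for null, "" and whitespace-only strings.
    public static bool IsBlank(string text)
    {
        return string.IsNullOrWhiteSpace(text);
    }

    public static void Main()
    {
        Console.WriteLine(Shout("abc"));                 // ABC
        Console.WriteLine(IsBlank(null));                // True
        Console.WriteLine(IsBlank("   "));               // True
        Console.WriteLine(string.IsNullOrEmpty("   "));  // False: whitespace is not empty
    }
}
```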
You could write an extension method (for what it's worth):
public static string EmptyNull(this string str)
{
return str ?? "";
}
Now this works safely:
string str = null;
string upper = str.EmptyNull().ToUpper();
You could also use the following, as of C# 6.0
string myString = null;
string result = myString?.ToUpper();
The string result will be null.
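If a null result is not wanted, the null-conditional operator can be combined with ??. This is only a sketch; NullConditionalDemo and its method names are made up for illustration, and it needs C# 6.0 or later.

```csharp
using System;

public static class NullConditionalDemo
{
    // Returns null when the input is null, because ?. short-circuits the call.
    public static string UpperOrNull(string text)
    {
        return text?.ToUpper();
    }

    // Falls back to "" so callers never see null.
    public static string UpperOrEmpty(string text)
    {
        return text?.ToUpper() ?? "";
    }

    // On a value-typed member, ?. produces a nullable: int? here.
    public static int? LengthOrNull(string text)
    {
        return text?.Length;
    }

    public static void Main()
    {
        Console.WriteLine(UpperOrNull(null) == null);     // True
        Console.WriteLine(UpperOrEmpty(null).Length);     // 0
        Console.WriteLine(LengthOrNull("abc"));           // 3
        Console.WriteLine(LengthOrNull(null).HasValue);   // False
    }
}
```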
Empty strings and nulls are fundamentally different. A null is an absence of a value and an empty string is a value that is empty.
The programming language making assumptions about the "value" of a variable, in this case an empty string, would be as good as initializing the string with any other value that will not cause a null reference problem.
Also, if you pass the handle to that string variable to other parts of the application, then that code will have no ways of validating whether you have intentionally passed a blank value or you have forgotten to populate the value of that variable.
Another occasion where this would be a problem is when the string is a return value from some function. Since string is a reference type and can technically be either null or empty, the function can also technically return either (there is nothing to stop it from doing so). Now, since there are 2 notions of the "absence of a value", i.e. an empty string and a null, all the code that consumes this function will have to do 2 checks: one for empty and the other for null.
In short, it's always good to have only 1 representation for a single state. For a broader discussion on empty and nulls, see the links below.
https://softwareengineering.stackexchange.com/questions/32578/sql-empty-string-vs-null-value
NULL vs Empty when dealing with user input
Why the designers of C# chose to use null as the default value of
strings?
Because strings are reference types, and the default value of a reference type is null. Variables of reference types store references to the actual data.
Let's use the default keyword to illustrate:
string str = default(string);
str is a string, so it is a reference type, so its default value is null.
int num = default(int);
num is an int, so it is a value type, so its default value is zero.
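The default values discussed here can be compared side by side with a small sketch. DefaultValuesDemo and DefaultOf are hypothetical names; default(T) itself is standard C#.

```csharp
using System;

public static class DefaultValuesDemo
{
    // Returns the default value of any type: null for reference types,
    // all-zero bits for value types (which for Nullable<T> means HasValue == false).
    public static T DefaultOf<T>()
    {
        return default(T);
    }

    public static void Main()
    {
        Console.WriteLine(DefaultOf<string>() == null);                 // True
        Console.WriteLine(DefaultOf<int>());                            // 0
        Console.WriteLine(DefaultOf<int?>().HasValue);                  // False
        Console.WriteLine(DefaultOf<DateTime>() == DateTime.MinValue);  // True

        // Array elements and fields start out with the same defaults.
        string[] names = new string[3];
        Console.WriteLine(names[0] == null);                            // True
    }
}
```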
The fundamental reason/problem is that the designers of the CLS specification (which defines how languages interact with .net) did not define a means by which class members could specify that they must be called directly, rather than via callvirt, without the caller performing a null-reference check; nor did it provide a means of defining structures which would not be subject to "normal" boxing.
Had the CLS specification defined such a means, then it would be possible for .net to consistently follow the lead established by the Component Object Model (COM), under which a null string reference was considered semantically equivalent to an empty string, and for other user-defined immutable class types which are supposed to have value semantics to likewise define default values. Essentially, what would happen would be for each member of String, e.g. Length to be written as something like [InvokableOnNull()] int String Length { get { if (this==null) return 0; else return _Length;} }. This approach would have offered very nice semantics for things which should behave like values, but because of implementation issues need to be stored on the heap. The biggest difficulty with this approach is that the semantics of conversion between such types and Object could get a little murky.
An alternative approach would have been to allow the definition of special structure types which did not inherit from Object but instead had custom boxing and unboxing operations (which would convert to/from some other class type). Under such an approach, there would be a class type NullableString which behaves as string does now, and a custom-boxed struct type String, which would hold a single private field Value of type String. Attempting to convert a String to NullableString or Object would return Value if non-null, or String.Empty if null. Attempting to cast to String, a non-null reference to a NullableString instance would store the reference in Value (perhaps storing null if the length was zero); casting any other reference would throw an exception.
Even though strings have to be stored on the heap, there is conceptually no reason why they shouldn't behave like value types that have a non-null default value. Having them be stored as a "normal" structure which held a reference would have been efficient for code that used them as type "string", but would have added an extra layer of indirection and inefficiency when casting to "object". While I don't foresee .net adding either of the above features at this late date, perhaps designers of future frameworks might consider including them.
Because a string variable is a reference, not an instance.
Initializing it to Empty by default would have been possible but it would have introduced a lot of inconsistencies all over the board.
If the default value of string were the empty string, I would not have to test
Wrong! Changing the default value doesn't change the fact that it's a reference type and someone can still explicitly set the reference to be null.
Additionally Nullable<String> would make sense.
True point. It would make more sense to not allow null for any reference types, instead requiring Nullable<TheRefType> for that feature.
So why did the designers of C# choose to use null as the default value of strings?
Consistency with other reference types. Now, why allow null in reference types at all? Probably so that it feels like C, even though this is a questionable design decision in a language that also provides Nullable.
Perhaps if you'd use ?? operator when assigning your string variable, it might help you.
string str = SomeMethodThatReturnsaString() ?? "";
// if SomeMethodThatReturnsaString() returns a null value, "" is assigned to str.
A String is an immutable object which means when given a value, the old value doesn't get wiped out of memory, but remains in the old location, and the new value is put in a new location. So if the default value of String a was String.Empty, it would waste the String.Empty block in memory when it was given its first value.
Although it seems minuscule, it could turn into a problem when initializing a large array of strings with default values of String.Empty. Of course, you could always use the mutable StringBuilder class if this was going to be a problem.
Because string is a reference type, and the default value for a reference type is null.
Since you mentioned ToUpper(), and this usage is how I found this thread, I will share this shortcut (string ?? "").ToUpper():
private string _city;
public string City
{
get
{
return (this._city ?? "").ToUpper();
}
set
{
this._city = value;
}
}
Seems better than:
if(null != this._city)
{ this._city = this._city.ToUpper(); }
Maybe the string keyword confused you, as it looks exactly like any other value type declaration, but it is actually an alias to System.String as explained in this question.
Also the dark blue color in Visual Studio and the lowercase first letter may mislead into thinking it is a struct.
Nullable types did not come in until 2.0.
If nullable types had been made in the beginning of the language then string would have been non-nullable and string? would have been nullable. But they could not do this due to backward compatibility.
A lot of people talk about ref-type or not ref type, but string is an out of the ordinary class and solutions would have been found to make it possible.

Why does .ToString() on a null string cause a null error, when .ToString() works fine on a nullable int with null value?

selectedItem has two fields:
int? _cost
string _serialNumber
In this example, _cost and _serialNumber of selectedItem are BOTH null. I am reading through the fields of selectedItem via their properties, and filling in textboxes with their values, when...
TextBox1.Text = selectedItem.Cost.ToString(); //no error
TextBox2.Text = selectedItem.SerialNumber.ToString(); //error
I understand that SerialNumber.ToString() is redundant (because it is already a string), but I don't understand why this causes this exception:
Nullable object must have a value.
int? _cost is nullable, and does not have a value, yet it does not give me the exception.
string _serialNumber is nullable, and does not have a value, yet it does give me the exception.
This question touches on it; the asker is essentially asking the same thing, but there is no accepted answer, and it doesn't explain why this works for a nullable int. Why can I use .ToString() on a nullable int but not on a null string?
Because a string's null really points to nothing; there isn't any object in memory. But an int? (nullable), even with its value set to null, still points to some object. If you read Jeffrey Richter's "CLR via C#" you'll find out that nullable types are just facade classes for common types with some encapsulated logic in order to make working with DB nulls more convenient.
Check msdn to learn about nullable types.
A Nullable<int> is a struct and can't really be null. So a method call on a "null" struct still works.
There is some "compiler magic" that makes _cost == null a valid expression.
int? is not actually an object on its own; it's a Nullable<int> object.
So when you declare int? _Cost, you are actually declaring Nullable<int> _Cost and the property of _Cost.Value is undefined not the _Cost object itself.
It is actually syntactic sugar that makes it easy to use nullable versions of non-nullable types like int, bool or decimal.
According to MSDN:
The syntax T? is shorthand for System.Nullable<T>, where T is a value type. The two forms are interchangeable.
A string is a reference type, but a nullable int is a value type. Here is a Good discussion of the differences http://www.albahari.com/valuevsreftypes.aspx.
The Nullable is actually a struct exposing two properties: HasValue and Value. If you do this you will get your error:
int? i = null;
i.Value.ToString(); // throws InvalidOperationException: Nullable object must have a value.
In order to check whether or not your int? has a value you can access i.HasValue
The reason is simple. int? or Nullable<int> is a struct or a value type, it can never be null.
So what happens when we do:
int? _cost = null;
_cost will have two fields, Value and HasValue. When we assign null to _cost, its HasValue flag is set to false and the Value field is assigned default(T), which in the case of int? is 0.
Now when we call ToString on _cost, Nullable<T> has an override definition of ToString, which if we look at Microsoft's provided Source Reference is implemented like:
public override string ToString() {
return HasValue ? value.ToString() : "";
}
Thus it returns an empty string, since _cost is assigned null.
Now comes the case of string _serialNumber. Being string it is a reference type and it can purely hold null. If it is holding null then calling ToString on it would produce the Null Reference Exception as expected.
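The difference between the two ToString() calls can be seen in a short sketch. ToStringDemo, CostText and SerialThrows are illustrative names only, standing in for the Cost and SerialNumber properties from the question.

```csharp
using System;

public static class ToStringDemo
{
    // Nullable<int>.ToString() is called on a struct, so it always has an instance;
    // it returns "" when HasValue is false.
    public static string CostText(int? cost)
    {
        return cost.ToString();
    }

    // Calling an instance method through a null string reference throws.
    public static bool SerialThrows(string serialNumber)
    {
        try
        {
            serialNumber.ToString();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CostText(null) == "");   // True
        Console.WriteLine(CostText(42));           // 42
        Console.WriteLine(SerialThrows(null));     // True
        Console.WriteLine(SerialThrows("SN-1"));   // False
    }
}
```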
You may see: Value Types and Reference Types - MSDN
What I think the reason is: when the compiler encounters a primitive data type, it wraps it in its corresponding object. The ToString() method call is just an indirect call (wrapping and then calling the method) here, and the exception is handled there.
While in the case of String, we are directly calling the method. When pointing to a null, the method throws the exception.
TextBox2.Text = selectedItem.SerialNumber.ToString(); //error
yields an error because it's calling the function ToString(), which is a member of System.String. This function returns this instance of System.String; no actual conversion is performed. Also, String is a reference type. A reference type contains a pointer to another memory location that holds the data.
TextBox1.Text = selectedItem.Cost.ToString(); //no error
yields no error because it's calling the function ToString(), which is a member of System.Int32. This function converts the numeric value of this instance to its equivalent string representation. Also, Int32 is a value type. A data type is a value type if it holds the data within its own memory allocation.
The function name ToString() is the same, but it performs a different task in each case.
String.ToString Method
Int32.ToString Method
Value types and reference types

Comparing boxed value types

Today I stumbled upon an interesting bug I wrote. I have a set of properties which can be set through a general setter. These properties can be value types or reference types.
public void SetValue( TEnum property, object value )
{
if ( _properties[ property ] != value )
{
// Only come here when the new value is different.
}
}
When writing a unit test for this method I found out the condition is always true for value types. It didn't take me long to figure out this is due to boxing/unboxing. It didn't take me long either to adjust the code to the following:
public void SetValue( TEnum property, object value )
{
if ( !_properties[ property ].Equals( value ) )
{
// Only come here when the new value is different.
}
}
The thing is I'm not entirely satisfied with this solution. I'd like to keep a simple reference comparison, unless the value is boxed.
The current solution I am thinking of is only calling Equals() for boxed values. Doing a check for boxed values seems a bit overkill. Isn't there an easier way?
If you need different behaviour when you're dealing with a value-type then you're obviously going to need to perform some kind of test. You don't need an explicit check for boxed value-types, since all value-types will be boxed** due to the parameter being typed as object.
This code should meet your stated criteria: If value is a (boxed) value-type then call the polymorphic Equals method, otherwise use == to test for reference equality.
public void SetValue(TEnum property, object value)
{
bool equal = ((value != null) && value.GetType().IsValueType)
? value.Equals(_properties[property])
: (value == _properties[property]);
if (!equal)
{
// Only come here when the new value is different.
}
}
( ** And, yes, I know that Nullable<T> is a value-type with its own special rules relating to boxing and unboxing, but that's pretty much irrelevant here.)
Equals() is generally the preferred approach.
The default implementation of .Equals() does a simple reference comparison for reference types, so in most cases that's what you'll be getting. Equals() might have been overridden to provide some other behavior, but if someone has overridden .Equals() in a class it's because they want to change the equality semantics for that type, and it's better to let that happen if you don't have a compelling reason not to. Bypassing it by using == can lead to confusion when your class sees two things as different when every other class agrees that they're the same.
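To make the difference concrete, here is a minimal sketch comparing == on object with the static object.Equals, which also handles null on either side. BoxedEqualityDemo, ReferenceCompare and ValueCompare are names invented for this example.

```csharp
using System;

public static class BoxedEqualityDemo
{
    // == on two object-typed operands is a reference comparison.
    public static bool ReferenceCompare(object a, object b)
    {
        return a == b;
    }

    // object.Equals(a, b) handles nulls, then defers to a.Equals(b).
    public static bool ValueCompare(object a, object b)
    {
        return object.Equals(a, b);
    }

    public static void Main()
    {
        object first = 5;    // each assignment creates its own box
        object second = 5;

        Console.WriteLine(ReferenceCompare(first, second)); // False: two separate boxes
        Console.WriteLine(ReferenceCompare(first, first));  // True: same box
        Console.WriteLine(ValueCompare(first, second));     // True: Int32.Equals compares values
        Console.WriteLine(ValueCompare(null, null));        // True
        Console.WriteLine(ValueCompare(null, second));      // False
    }
}
```

Using the static object.Equals also avoids the NullReferenceException that an instance call such as stored.Equals(value) would raise when the stored value is null.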
Since the input parameter's type is object, you will always get a boxed value inside the method's context.
I think your only chance is to change the method's signature and to write different overloads.
How about this:
if(object.ReferenceEquals(first, second)) { return; }
if(first.Equals(second)) { return; }
// they must differ, right?
Update
I realized this doesn't work as expected for a certain case:
For value types, ReferenceEquals returns false so we fall back to Equals, which behaves as expected.
For reference types where ReferenceEquals returns true, we consider them "same" as expected.
For reference types where ReferenceEquals returns false and Equals returns false, we consider them "different" as expected.
For reference types where ReferenceEquals returns false and Equals returns true, we consider them "same" even though we want "different"
So the lesson is "don't get clever"
I suppose
I'd like to keep a simple reference comparison, unless the value is boxed.
is somewhat equivalent to
If the value is boxed, I'll do a non-"simple reference comparison".
This means the first thing you'll need to do is to check whether the value is boxed or not.
If there exists a method to check whether an object is a boxed value type or not, it should be at least as complex as that "overkill" method you provided the link to unless that is not the simplest way. Nonetheless, there should be a "simplest way" to determine if an object is a boxed value type or not. It's unlikely that this "simplest way" is simpler than simply using the object Equals() method, but I've bookmarked this question to find out just in case.
(not sure if I was logical)

Is this explicit cast necessary when I know it isn't null?

I have the following:
MyEnum? maybeValue = GetValueOrNull();
if (null != maybeValue)
{
MyEnum value = (MyEnum)maybeValue;
}
What I want to know is if that explicit (MyEnum) cast is necessary on an instance of type MyEnum?. This seems like a simple question, I just felt paranoid that there could possibly be some runtime error if I just did MyEnum value = maybeValue within that if statement.
For a nullable type, you can do
if (maybeValue.HasValue)
{
MyEnum value = maybeValue.Value;
}
Since you're using nullable types, try using
if(maybeValue.HasValue)
{
MyEnum value = maybeValue.Value; // no cast needed! Yay!
}
A number of answers have noted that using the Value property avoids the cast. This is correct and it answers the question:
is that explicit (MyEnum) cast necessary on an instance of type "MyEnum?" ?
However, we haven't addressed the other concern:
I just felt paranoid that there could possibly be some runtime error if I just did MyEnum value = maybeValue within that if statement.
Well, first off, you cannot simply assign a nullable value to a variable of its underlying type. You have to do something to do the conversion explicitly.
However, if you do so in the manner you describe -- checking whether there is a value first -- then this is safe. (Provided of course that no one is mutating the variable containing the nullable value between the call to HasValue and the fetch of the value.)
If you use ILDASM or some other tool to examine the generated code you will discover that casting a nullable value to its underlying type is simply generated as an access to the Value property; using the cast operator or accessing the Value property is a difference which actually makes no difference at all. The Value property accessor will throw if HasValue is false.
Use whichever syntax you feel looks better. I personally would probably choose the "Value" syntax over the cast syntax because I think it reads better to say "if it has a value, gimme the value" than "if it has a value, convert it to its underlying type".
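The equivalence of the cast and the Value property can be checked with a small sketch. The Color enum and the NullableCastDemo helpers are hypothetical stand-ins for MyEnum and the asker's code.

```csharp
using System;

// Hypothetical enum standing in for MyEnum.
public enum Color { Red = 0, Green = 1 }

public static class NullableCastDemo
{
    // Explicit cast from Color? to Color: throws when there is no value.
    public static string ViaCast(Color? maybe)
    {
        try { return ((Color)maybe).ToString(); }
        catch (InvalidOperationException) { return "threw"; }
    }

    // Value property: behaves the same way as the cast.
    public static string ViaValue(Color? maybe)
    {
        try { return maybe.Value.ToString(); }
        catch (InvalidOperationException) { return "threw"; }
    }

    // GetValueOrDefault() never throws; it yields the enum member with value 0.
    public static Color OrDefault(Color? maybe)
    {
        return maybe.GetValueOrDefault();
    }

    public static void Main()
    {
        Console.WriteLine(ViaCast(Color.Green));   // Green
        Console.WriteLine(ViaValue(Color.Green));  // Green
        Console.WriteLine(ViaCast(null));          // threw
        Console.WriteLine(ViaValue(null));         // threw
        Console.WriteLine(OrDefault(null));        // Red
    }
}
```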
I believe that when you define a variable as a nullable type, it adds a .HasValue property and a .Value property. You should be able to use those to avoid any casting.
You can rewrite your code like this
MyEnum? maybeValue = GetValueOrNull();
if (maybeValue.HasValue == true)
{
MyEnum value = maybeValue.Value;
}
Another option would be to set up your enum to have a Default (0) value and then you would be able to just do the following:
MyEnum value = GetValueOrNull().GetValueOrDefault();
However, this would require that your code have knowledge of what the default value means and possibly handle it differently from what the other enum types do.
