How does Nullable<T> = null work? - c#

int? num = null;
How does it work under the hood? I always assumed Nullable is a class and today I was debugging and was surprised to see it's a struct. I checked the source code and found this implicit operator:
[System.Runtime.Versioning.NonVersionable]
public static implicit operator Nullable<T>(T value)
{
    return new Nullable<T>(value);
}
It's obviously not the answer here since null is not an int. In addition the constructor is this:
[System.Runtime.Versioning.NonVersionable]
public Nullable(T value)
{
    this.value = value;
    this.hasValue = true;
}
So the constructor is not called either.
Is the compiler using some syntactic sugar here?

type? t = null;
is recognized by the compiler and replaced with type? t = new type?(), which invokes the default constructor, which in turn sets the struct to all zeros. The HasValue property is backed by a boolean field, so false is the correct initial value in this case.
In just about every place, these constructions are treated as though they were references to an object of type type. To observe that they aren't, you would need a mutable struct. The compiler knows about type? in quite a few ways, including checking the HasValue property when boxing and boxing either to null or to a boxed type. If you manage to get a boxed type?, the unboxer should be able to handle it, but there doesn't seem to be a way to get a boxed type? anymore unless you're writing extern functions.
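The behaviour described above can be checked with a small program. This is only a sketch: the class and method names (NullableDemo, NullEqualsDefault, Box) are made up for illustration, and it relies solely on the public HasValue and GetValueOrDefault members of Nullable<T>.

```csharp
using System;

public static class NullableDemo
{
    // True when "= null", "new int?()" and "default(int?)" all produce the same empty value.
    public static bool NullEqualsDefault()
    {
        int? fromNull = null;              // the compiler emits a default-initialized Nullable<int>
        int? fromNew = new int?();         // explicit parameterless construction
        int? fromDefault = default(int?);  // default value: all fields zeroed
        return !fromNull.HasValue
            && !fromNew.HasValue
            && !fromDefault.HasValue
            && fromNull.GetValueOrDefault() == 0;
    }

    // Converting int? to object boxes it: an empty value becomes a null reference,
    // a non-empty value becomes a boxed int (never a boxed Nullable<int>).
    public static object Box(int? value) => value;

    public static void Main()
    {
        Console.WriteLine(NullEqualsDefault());  // True
        Console.WriteLine(Box(null) == null);    // True
        Console.WriteLine(Box(5).GetType());     // System.Int32
    }
}
```

The last line is the interesting one: the boxed object reports System.Int32, which is why a boxed type? is not something ordinary C# code gets to see.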

Related

Making value type variables nullable by using "?": does it imply boxing?

Given the following assumptions in C#:
boxing and unboxing let me convert any value type variable to an object type, which is a reference type (therefore it's also nullable), like in the example:
int i = 123;
object box = i;
The "?" operator lets me convert a non-nullable integer into a nullable variable, like in the example:
int? n = 0;
My question: since only reference-type variables are nullable, can I say that in the second example I'm doing implicit boxing? In other words, when I use the "?" operator to make an integer nullable, is a boxing operation implied as well (even if it's not explicit)?
int? is sugar for Nullable<int>. See the documentation. If we look at the declaration of this type, we see:
public struct Nullable<T> where T : struct
{
    ...
    public override bool Equals(object other)
    {
        if (!this.hasValue)
            return other == null;
        return other != null && this.value.Equals(other);
    }
}
Since it is a struct the value will not be boxed.
If you need to compare values, n.Equals(1) would cause boxing of the argument. I cannot find any documentation about the equality operator ==, but I think it would be fairly safe to assume that it should not cause boxing.
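A minimal sketch comparing the two spellings. The class and method names are hypothetical. The comments about boxing describe what the signatures imply (Equals takes an object parameter, while == on int? is a lifted operator defined by the language); the program itself only shows that both forms give the same answers.

```csharp
using System;

public static class NullableEqualityDemo
{
    // Nullable<T>.Equals(object): the int argument is converted to object, i.e. boxed.
    public static bool EqualsViaMethod(int? n, int other) => n.Equals(other);

    // Lifted == operator: both operands stay typed as int?, no object is involved.
    public static bool EqualsViaOperator(int? n, int other) => n == other;

    public static void Main()
    {
        Console.WriteLine(EqualsViaMethod(1, 1));        // True
        Console.WriteLine(EqualsViaOperator(1, 1));      // True
        Console.WriteLine(EqualsViaMethod(null, 0));     // False
        Console.WriteLine(EqualsViaOperator(null, 0));   // False
    }
}
```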

Why does Nullable<T>, a struct, allow 'null'?

Nullable is a struct, and I know structs can't be assigned 'null'. So how can we assign null to a Nullable? What is the reason?
It doesn't actually accept a value of null; it simply has syntactic sugar that allows it to act like it's null. The type actually looks a bit like this:
struct Nullable<T>
{
    private bool hasValue;
    private T value;
}
So when it's initialized hasValue == false, and value == default(T).
When you write code like
int? i = null;
The C# compiler is actually doing this behind the scenes:
Nullable<int> i = new Nullable<int>();
And similar checks are made when casting null to an int? at run-time as well.
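To see why no real null is needed, here is a sketch of a hand-written look-alike. MyNullable<T> is a hypothetical stand-in, not the actual framework source, and it gets none of the compiler support that Nullable<T> has (so "= null" would not compile for it). It only shows that the zero-initialized state of such a struct already means "no value".

```csharp
using System;

// Hypothetical stand-in for Nullable<T>; illustration only.
public struct MyNullable<T> where T : struct
{
    private readonly bool hasValue; // false in the zero-initialized state
    private readonly T value;       // default(T) in the zero-initialized state

    public MyNullable(T value)
    {
        this.value = value;
        this.hasValue = true;
    }

    public bool HasValue => hasValue;

    public T GetValueOrDefault() => value;
}

public static class MyNullableDemo
{
    public static void Main()
    {
        MyNullable<int> empty = new MyNullable<int>(); // what "int? i = null" boils down to
        MyNullable<int> full = new MyNullable<int>(42);

        Console.WriteLine(empty.HasValue);            // False
        Console.WriteLine(empty.GetValueOrDefault()); // 0
        Console.WriteLine(full.HasValue);             // True
        Console.WriteLine(full.GetValueOrDefault());  // 42
    }
}
```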
Further Reading
Why can assign “null” to nullable types

Boxing Occurrence in C#

I'm trying to collect all of the situations in which boxing occurs in C#:
Converting value type to System.Object type:
struct S { }
object box = new S();
Converting value type to System.ValueType type:
struct S { }
System.ValueType box = new S();
Converting value of enumeration type to System.Enum type:
enum E { A }
System.Enum box = E.A;
Converting value type into interface reference:
interface I { }
struct S : I { }
I box = new S();
Using value types in C# string concatenation:
char c = F();
string s1 = "char value will box" + c;
note: constants of char type are concatenated at compile time
note: since version 6.0 C# compiler optimizes concatenation involving bool, char, IntPtr, UIntPtr types
Creating delegate from value type instance method:
struct S { public void M() {} }
Action box = new S().M;
Calling non-overridden virtual methods on value types:
enum E { A }
E.A.GetHashCode();
Using C# 7.0 constant patterns under is expression:
int x = …;
if (x is 42) { … } // boxes both 'x' and '42'!
Boxing in C# tuple types conversions:
(int, byte) _tuple;
public (object, object) M() {
return _tuple; // 2x boxing
}
Optional parameters of object type with value type default values:
void M([Optional, DefaultParameterValue(42)] object o);
M(); // boxing at call-site
Checking value of unconstrained generic type for null:
bool M<T>(T t) => t != null;
string M<T>(T t) => t?.ToString(); // ?. checks for null
M(42);
note: this may be optimized by JIT in some .NET runtimes
Type testing value of unconstrained or struct generic type with is/as operators:
bool M<T>(T t) => t is int;
int? M<T>(T t) => t as int?;
IEquatable<T> M<T>(T t) => t as IEquatable<T>;
M(42);
note: this may be optimized by JIT in some .NET runtimes
Are there any more situations of boxing, maybe hidden, that you know of?
That’s a great question!
Boxing occurs for exactly one reason: when we need a reference to a value type. Everything you listed falls into this rule.
For example since object is a reference type, casting a value type to object requires a reference to a value type, which causes boxing.
If you wish to list every possible scenario, you should also include derivatives, such as returning a value type from a method that returns object or an interface type, because this automatically casts the value type to the object / interface.
By the way, the string concatenation case you astutely identified also derives from casting to object. The + operator is translated by the compiler to a call to the Concat method of string, which accepts an object for the value type you pass, so casting to object and hence boxing occurs.
Over the years I’ve always advised developers to remember the single reason for boxing (specified above) instead of memorizing every single case, because the list is long and hard to remember. This also promotes understanding of what IL code the compiler generates for our C# code (for example, + on string yields a call to String.Concat). When you’re in doubt about what the compiler generates and whether boxing occurs, you can use IL Disassembler (ILDASM.exe). Typically you should look for the box opcode (there is just one case where boxing might occur even though the IL doesn't include the box opcode; more detail below).
But I do agree that some boxing occurrences are less obvious. You listed one of them: calling a non-overridden method of a value type. In fact, this is less obvious for another reason: when you check the IL code you don’t see the box opcode, but the constrained. prefix, so even in the IL it’s not obvious that boxing happens! I won't get into the exact details of why, to prevent this answer from becoming even longer...
Another case for less obvious boxing is when calling a base class method from a struct. Example:
struct MyValType
{
    public override string ToString()
    {
        return base.ToString();
    }
}
Here ToString is overridden, so calling ToString on MyValType won’t generate boxing. However, the implementation calls the base ToString and that causes boxing (check the IL!).
By the way, these two non-obvious boxing scenarios also derive from the single rule above. When a method is invoked on the base class of a value type, there must be something for the this keyword to refer to. Since the base class of a value type is (always) a reference type, the this keyword must refer to a reference type, and so we need a reference to a value type and so boxing occurs due to the single rule.
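The single rule can be made visible without reading IL. The following is a small sketch (class and method names are hypothetical): every conversion of a value to object produces a separate heap object holding a copy of the value, so two boxes of the same variable are different references, and later changes to the variable do not reach the box.

```csharp
using System;

public static class BoxingDemo
{
    // Two conversions of the same int give two distinct objects with equal contents.
    public static bool TwoBoxesAreDistinct()
    {
        int i = 123;
        object first = i;   // box
        object second = i;  // another, independent box
        return !ReferenceEquals(first, second) && first.Equals(second);
    }

    // The box holds a copy, so changing the original variable does not affect it.
    public static int UnboxAfterChange()
    {
        int i = 123;
        object box = i;
        i = 456;
        return (int)box;
    }

    public static void Main()
    {
        Console.WriteLine(TwoBoxesAreDistinct()); // True
        Console.WriteLine(UnboxAfterChange());    // 123
    }
}
```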
Here is a direct link to the section of my online .NET course that discusses boxing in detail: http://motti.me/mq
If you are only interested in more advanced boxing scenarios here is a direct link there (though the link above will take you there as well once it discusses the more basic stuff): http://motti.me/mu
I hope this helps!
Motti
Calling non-virtual GetType() method on value type:
struct S { };
S s = new S();
s.GetType();
Mentioned in Motti's answer, just illustrating with code samples:
Parameters involved
public void Bla(object obj)
{
}
Bla(valueType);
public void Bla(IBla i) // where IBla is an interface
{
}
Bla(valueType);
But this is safe:
public void Bla<T>(T obj) where T : IBla
{
}
Bla(valueType);
Return type
public object Bla()
{
    return 1;
}
public IBla Bla() // IBla is an interface that the type of 1 implements
{
    return 1;
}
Checking unconstrained T against null
public void Bla<T>(T obj)
{
    if (obj == null) { } // boxes
}
Use of dynamic
dynamic x = 42; // boxes
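The difference between the interface-typed parameter and the constrained generic parameter above can be observed with a mutable struct. This is a sketch under assumptions: ICounter, Counter and the helper methods are invented for the demonstration, and the generic version takes its argument by ref (unlike Bla<T> above) purely so that the effect is visible to the caller.

```csharp
using System;

public interface ICounter
{
    void Increment();
    int Count { get; }
}

// Deliberately mutable struct, used only to make the copy made by boxing observable.
public struct Counter : ICounter
{
    private int count;
    public int Count => count;
    public void Increment() => count++;
}

public static class GenericBoxingDemo
{
    // Interface-typed parameter: a struct argument is boxed, and the method mutates the box.
    public static void IncrementBoxed(ICounter counter) => counter.Increment();

    // Constrained generic parameter: the call goes to the struct itself, with no box in between.
    public static void IncrementInPlace<T>(ref T counter) where T : ICounter => counter.Increment();

    public static int CountAfterBoxedCall()
    {
        var c = new Counter();
        IncrementBoxed(c);       // increments a boxed copy
        return c.Count;          // the original is unchanged
    }

    public static int CountAfterGenericCall()
    {
        var c = new Counter();
        IncrementInPlace(ref c); // increments c itself
        return c.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterBoxedCall());   // 0
        Console.WriteLine(CountAfterGenericCall()); // 1
    }
}
```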
Another one
enumValue.HasFlag
Using the non-generic collections in System.Collections such as ArrayList or Hashtable.
Granted these are specific instances of your first case, but they can be hidden gotchas. It's amazing how much code I still come across today that uses these instead of List<T> and Dictionary<TKey,TValue>.
Adding any value type value into the ArrayList causes boxing:
ArrayList items = ...
items.Add(1); // boxing to object

implicit operator on generic types

Is there anything wrong with using an implicit operator like the following:
// LINQPad C# program example
void Main()
{
    var testObject = new MyClass<int>() { Value = 1 };
    var add = 10 + testObject; // implicit conversion to int here
    add.Dump(); // 11
}
class MyClass<T>
{
    public T Value { get; set; }
    public static implicit operator T (MyClass<T> myClassToConvert)
    {
        return myClassToConvert.Value;
    }
}
I was thinking I could treat an instance of the object as a value type this way, but since I've never seen an example of this, I thought maybe there was a reason not to do it that someone could point out.
In my actual code I was thinking of doing this as part of a data abstraction layer, so that I could return objects with information describing the underlying data, but allow the logic code to treat it as a value type when all it needs to know about is the value, and at the same time keep it all nice and type safe with the generics.
If all of the following are true:
all possible values of your MyClass<T> type (including null if it’s not a value type!) map to a valid value of T
the implicit operator never throws (not even for null!)
the implicit conversion makes semantic sense and is not confusing to the client programmer
then there is nothing wrong with this. Of course you could violate any of these three conditions, but it would be bad design. In particular, an implicit operator that throws can be very hard to debug because the place where it is called doesn’t say that it is being called.
For example, consider that T? has no implicit conversion to T (where T is, of course, a value type). If there were such an implicit operator, it would have to throw when the T? is null, as there is no obvious value to convert null to that would make sense for any value type T.
Let me give an example where I had trouble debugging an issue where the implicit operator threw:
public string Foo()
{
    return some_condition ? GetSomething() : null;
}
Here, GetSomething returned something of a type I wrote which has a user-defined implicit conversion to string. I made absolutely sure that GetSomething could never return null, and yet I got a NullReferenceException! Why? Because the above code is not equivalent to
return some_condition ? (string)GetSomething() : (string)null;
but to
return (string)(some_condition ? GetSomething() : (Something)null);
Now you can see where the null came from!
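The trap can be reproduced in a few lines. In this sketch, Something, its Text property and the demo class are hypothetical stand-ins for the answerer's real types; the condition is passed as a parameter so both branches can be exercised. The conditional expression gets the type Something, so the implicit operator runs on its result, including when that result is null.

```csharp
using System;

// Hypothetical type with a user-defined implicit conversion to string.
public sealed class Something
{
    public string Text { get; }

    public Something(string text) { Text = text; }

    // Dereferences its argument, so it throws NullReferenceException when given null.
    public static implicit operator string(Something s) => s.Text;
}

public static class ConditionalConversionDemo
{
    public static Something GetSomething() => new Something("something");

    // The conditional has type Something; the conversion to string is applied afterwards,
    // so a false condition feeds (Something)null into the operator.
    public static string Foo(bool condition)
    {
        return condition ? GetSomething() : null;
    }

    // Converting the non-null branch first makes the conditional a string expression.
    public static string FooFixed(bool condition)
    {
        return condition ? (string)GetSomething() : null;
    }

    public static void Main()
    {
        Console.WriteLine(Foo(true));               // something
        Console.WriteLine(FooFixed(false) == null); // True
        try
        {
            Foo(false);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("implicit operator was called with null");
        }
    }
}
```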
That's a great pattern. Just keep in mind that in order to use it as a variable of type T, you have to either explicitly cast it to T, or assign it to a variable of type T. The cast will take place automatically in method calls and other things (such as your addition example) that take a T.
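A short sketch of that caveat, reusing the MyClass<T> shape from the question. The demo class and the Twice helper are hypothetical: with var the variable keeps the wrapper type, while a typed declaration, an arithmetic expression, or an int parameter triggers the conversion.

```csharp
using System;

public class MyClass<T>
{
    public T Value { get; set; }

    public static implicit operator T(MyClass<T> myClassToConvert)
    {
        return myClassToConvert.Value;
    }
}

public static class ImplicitUsageDemo
{
    // Hypothetical helper taking a plain int.
    public static int Twice(int x) => 2 * x;

    // With var, no conversion happens: the copy is still a MyClass<int>.
    public static Type TypeOfVarCopy()
    {
        var wrapper = new MyClass<int> { Value = 1 };
        var copy = wrapper;
        return copy.GetType();
    }

    public static void Main()
    {
        var wrapper = new MyClass<int> { Value = 1 };

        int unwrapped = wrapper;    // conversion on assignment to an int variable
        int sum = 10 + wrapper;     // conversion inside the addition
        int doubled = Twice(wrapper); // conversion at the method call

        Console.WriteLine(TypeOfVarCopy() == typeof(MyClass<int>)); // True
        Console.WriteLine(unwrapped); // 1
        Console.WriteLine(sum);       // 11
        Console.WriteLine(doubled);   // 2
    }
}
```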
Implicit conversion without assignment?

Overwriting Default values in C# structs

For an assignment I have to write a Tribool class in C# using a struct. There are only three possible tribools, True, False, and Unknown, and I have these declared as static readonly. Like this:
public static readonly Tribool True, False, Unknown;
I need my default constructor to provide a False Tribool, but I'm not sure how to go about this. I've tried Tribool() { this = False; } and Tribool() { False; } but I keep getting a "Structs cannot contain explicit parameterless constructors" error.
The assignment specified that the default constructor for Tribool should provide a False Tribool. Otherwise, a user should not be able to create any other Tribools. I don't really know what to do at this point. Any advice would be greatly appreciated. Thanks.
Just to add a bit to Jason's answer: design your struct carefully so that the default value is meaningful. As an example, consider the Nullable<T> struct. The desired behaviour is that when you say
Nullable<int> x = new Nullable<int>(); // default constructor
that the resulting value is logically null. How do we do that? The struct is defined something like this:
struct Nullable<T>
{
    private readonly T value;
    private readonly bool hasValue;
    public Nullable(T value)
    {
        this.value = value;
        this.hasValue = true;
    }
    ...
}
So when the default constructor runs, hasValue is automatically set to false and value is set to the default value of T. A nullable with hasValue set to false is treated as null, which is the desired behaviour. That's why the bool is hasValue and not isNull.
As the error is telling you, you cannot declare a parameterless instance constructor for a struct (C# 10 later lifted this restriction, but default(T) still produces the all-zero value). See §11.3.8 of the C# 3.0 language specification:
Unlike a class, a struct is not permitted to declare a parameterless instance constructor.
The language provides one for you, known as the default constructor. This constructor returns the value of the struct where all fields have been set to their default values.
Now, what you could do is have the default value of the struct represent the False value. I'll leave the details to you.
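One possible shape for that idea, offered as a sketch rather than the expected solution to the assignment: keep a single private field whose zero value is chosen to mean False, and make the only constructor private so callers can only use the three static values. The field name, the byte encoding and the equality members are all assumptions made for this example.

```csharp
using System;

public struct Tribool : IEquatable<Tribool>
{
    // 0 deliberately encodes False, so default(Tribool) and new Tribool() are both False.
    private readonly byte state;

    // Private: users cannot create any Tribool other than the three below (and the default).
    private Tribool(byte state) { this.state = state; }

    public static readonly Tribool False = new Tribool(0);
    public static readonly Tribool True = new Tribool(1);
    public static readonly Tribool Unknown = new Tribool(2);

    public bool Equals(Tribool other) => state == other.state;

    public override bool Equals(object obj) => obj is Tribool other && Equals(other);

    public override int GetHashCode() => state;

    public override string ToString() =>
        state == 0 ? "False" : state == 1 ? "True" : "Unknown";
}

public static class TriboolDemo
{
    public static void Main()
    {
        Console.WriteLine(new Tribool());                       // False
        Console.WriteLine(default(Tribool).Equals(Tribool.False)); // True
        Console.WriteLine(Tribool.Unknown);                     // Unknown
    }
}
```

Because the default value is the all-zero bit pattern, array elements and uninitialized fields of type Tribool also come out as False without any constructor running.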
A little late to the game, but is there any reason you're not using an enum?
After all, if you just need a trinary value, then the struct is massive overkill.
public enum Tribool
{
    Unknown = 0,
    True = -1,
    False = 1
}
