In a C# 8 project with nullable reference types enabled, I have the following code which I think should give me a warning about a possible null dereference, but doesn't:
using System;

public class ExampleClassMember
{
    public int Value { get; }
}

public struct ExampleStruct
{
    public ExampleClassMember Member { get; }
}

public class Program
{
    public static void Main(string[] args)
    {
        var instance = new ExampleStruct();
        Console.WriteLine(instance.Member.Value); // expected warning here about possible null dereference
    }
}
When instance is initialized with the default constructor, instance.Member is set to the default value of ExampleClassMember, which is null. Thus, instance.Member.Value will throw a NullReferenceException at runtime. As I understand C# 8's nullability detection, I should get a compiler warning about this possibility, but I don't; why is that?
Note that there is no reason for there to be a warning on the call to Console.WriteLine(). The reference type property is not a nullable type, and so there's no need for the compiler to warn that it might be null.
You might argue that the compiler should warn about the reference in the struct itself. That would seem reasonable to me. But, it doesn't. This seems to be a loophole, caused by the default initialization for value types, i.e. there must always be a default (parameterless) constructor, which always just zeroes out all the fields (nulls for reference type fields, zeroes for numeric types, etc.).
I call it a loophole, because in theory non-nullable reference values should in fact always be non-null! Duh. :)
This loophole appears to be addressed in this blog article: Introducing Nullable Reference Types in C#
Avoiding nulls
So far, the warnings were about protecting nulls in nullable references from being dereferenced. The other side of the coin is to avoid having nulls at all in the nonnullable references.
There are a couple of ways null values can come into existence, and most of them are worth warning about, whereas a couple of them would cause another “sea of warnings” that is better to avoid:
…
Using the default constructor of a struct that has a field of nonnullable reference type. This one is sneaky, since the default constructor (which zeroes out the struct) can even be implicitly used in many places. Probably better not to warn [emphasis mine - PD], or else many existing struct types would be rendered useless.
In other words, yes this is a loophole, but no it's not a bug. The language designers are aware of it, but have chosen to leave this scenario out of the warnings, because to do otherwise would be impractical given the way struct initialization works.
Note that this is also in keeping with the broader philosophy behind the feature. From the same article:
So we want it to complain about your existing code. But not obnoxiously. Here’s how we are going to try to strike that balance:
…
There is no guaranteed null safety [emphasis mine - PD], even if you react to and eliminate all the warnings. There are many holes in the analysis by necessity, and also some by choice.
To that last point: Sometimes a warning is the “correct” thing to do, but would fire all the time on existing code, even when it is actually written in a null safe way. In such cases we will err on the side of convenience, not correctness. We cannot be yielding a “sea of warnings” on existing code: too many people would just turn the warnings back off and never benefit from it.
Also note that this same issue exists with arrays of nominally non-nullable reference types (e.g. string[]). When you create the array, all of the reference values are null, and yet this is legal and won't generate any warnings.
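For example, all of the following compiles without a single warning, even though it crashes at runtime:

string[] names = new string[3]; // every element starts out null, despite the non-nullable element type
string first = names[0];        // no warning here either
Console.WriteLine(first.Length); // NullReferenceException at runtime, still no warning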
So much for explaining why things are the way they are. Then the question becomes, what to do about it? That's a lot more subjective, and I don't think there's a right or wrong answer. That said…
I personally would treat my struct types on a case-by-case basis. For those where the intent is actually a nullable reference type, I would apply the ? annotation. Otherwise, I would not.
Technically, every single reference value in a struct should be "nullable", i.e. include the ? nullable annotation with the type name. But as with many similar features (like async/await in C# or const in C++), this has an "infectious" aspect, in that you'll either need to override that annotation later (with the ! annotation), or include an explicit null check, or only ever assign that value to another nullable reference type variable.
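For example, if Member in the question's ExampleStruct were declared as ExampleClassMember?, every use site would need one of these:

// Option 1: the null-forgiving operator (suppresses the warning, no runtime check)
Console.WriteLine(instance.Member!.Value);

// Option 2: an explicit null check
if (instance.Member != null)
{
    Console.WriteLine(instance.Member.Value);
}

// Option 3: only ever copy it into another nullable variable
ExampleClassMember? member = instance.Member;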
To me, this defeats a lot of the purpose of enabling nullable reference types. Since such members of struct types will require special-case handling at some point anyway, and since the only way to truly safely handle it while still being able to use non-nullable reference types is to put null checks everywhere you use the struct, I feel that it's a reasonable implementation choice to accept that when code initializes the struct, it is that code's responsibility to do so correctly and make sure the non-nullable reference type member is in fact initialized to a non-null value.
This can be aided by providing an "official" means of initialization, such as a non-default constructor (i.e. one with parameters) or factory method. There will still always be the risk of using the default constructor, or no constructor at all (as in array allocations), but by providing a convenient means to initialize the struct correctly, this will encourage code that uses it to avoid null references in non-nullable variables.
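A minimal sketch of that approach, using the question's types:

public struct ExampleStruct
{
    public ExampleClassMember Member { get; }

    // The "official" initialization path: guarantees Member is non-null.
    public ExampleStruct(ExampleClassMember member)
    {
        Member = member ?? throw new ArgumentNullException(nameof(member));
    }
}

// Note: default(ExampleStruct) and new ExampleStruct[10] still bypass this
// constructor and zero out Member, so the risk is reduced, not eliminated.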
That said, if what you want is 100% safety with respect to nullable reference types, then clearly the correct approach for that particular goal is to always annotate every reference type member in a struct with ?. This means every field and every auto-implemented property, along with any method or property getter that directly returns such values or the product of such values. Then the consuming code will need to include null checks or the null-forgiving operator at every point where such values are copied into non-nullable variables.
In the light of the excellent answer by Peter Duniho, it seems that as of Oct-2019 it's best to mark all non-value-type members as nullable references.
#nullable enable
using System;

public class C
{
    public int P1 { get; }
}

public struct S
{
    // Reluctantly marked as a nullable reference, because
    // https://devblogs.microsoft.com/dotnet/nullable-reference-types-in-csharp/
    // states:
    // "Using the default constructor of a struct that has a
    // field of nonnullable reference type. This one is
    // sneaky, since the default constructor (which zeroes
    // out the struct) can even be implicitly used in many
    // places. Probably better not to warn, or else many
    // existing struct types would be rendered useless."
    public C? Member { get; }
}

public class Program
{
    public static void Main()
    {
        var instance = new S();
        Console.WriteLine(instance.Member.P1); // warning CS8602: dereference of a possibly null reference
    }
}
I've been trying to understand the implications of nullable reference types for a while now, but I'm still confused. My understanding is that, in a nullable context, you're not supposed to be able to do this:
TestClass nonNullable = null;
public class TestClass { }
I thought you're supposed to need to declare a reference type as nullable to be able to assign it a value of null in a nullable context, like this:
TestClass? nullable = null;
When I compile the first block of code in net5.0 or net6.0 with the Nullable node in the project file set to "enabled", all I get is a compiler warning CS8600. Other than that, everything seems to work the same. However, I understand that nullable reference types are a massive breaking change vis-a-vis older libraries. Why? I've read the Microsoft API reference on this topic: Nullable reference types, as well as chapters in two books. The behavior I'm observing seems to violate Microsoft's specification. Either I'm stupid, the explanations are inaccurate/unclear, or maybe both of these.
Yes, reference variables in any nullable context can be assigned a null value.
"Nullable" and "non-nullable" reference variables are perhaps misleading terms here, because they only indicate whether the compiler should generate warnings for a given variable.
It's also a bit confusing that the context itself is called "nullable" and that enabling it makes reference variables non-nullable in that context, unless you specify otherwise.
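For example, with the Nullable node set to "enable" in the project file (or #nullable enable in the source), both declarations below compile; the only difference is the warnings:

TestClass nonNullable = null;  // warning CS8600: converting null literal to non-nullable type
TestClass? nullable = null;    // no warning: the variable is declared nullable

nonNullable.ToString();        // warning CS8602: dereference of a possibly null reference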
TLDR: enable some version of the nullable context feature to generate related warnings at compile time - that's all it does.
As a side note, if you actually want to block your builds based on those warnings, you'll need to take additional steps. See this related answer: https://stackoverflow.com/a/62116924/3743418
If F were "on" a, I could do this...
var obj = a?.F();
If F is not on a, I have to do this...
var obj = a == null ? null : MyFunc.F((A) a);
Or do I? Is there a more succinct way of skipping the method call if a parameter value is null?
The short answer is no, there's no succinct way to do that.
The slightly longer answer is still no, but there's an interesting language design point here. C# was designed by people with extremely good taste, if I say so myself, but it was designed over a very long period of time. Nowhere is this more obvious than with respect to its treatment of nullability.
In C# 1.0 we had a straightforward language in the tradition of C, C++, Java, JavaScript, and so on, where there are references and values, and references can be null. This has benefits; if it did not, Sir Tony would not have invented the null reference in the first place. But it has downsides: we have the possibility of null references, dereferencing null leads to program crashes, and there is an inconsistency between reference and value types: reference types have a natural "no value" value, and value types do not.
In C# 2.0 we added nullable value types, but nullable value types do not behave like nullable reference types. Of course nullable value types are not references, so you cannot "dereference" them, but if we squint a little, the .Value property looks a lot like "dereferencing", and it leads to a similar crash if the value is null. In that sense, they behave the same, but in other senses, they do not. Adding together two nullable integers does not crash if one of them is null; instead, the result is also null.
So at this point we have a contradiction built into the language:
Using a null value of a nullable value type usually automatically propagates the null, but using a null reference can crash.
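A short illustration of that contradiction:

int? a = null;
int? b = 2;
int? sum = a + b;       // no crash: the lifted + operator propagates null, so sum is null

string s = null;
int len = s.Length;     // crash: NullReferenceException at runtime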
And of course C# has then gone on to add a variety of features that make null references behave more like null values, like ?. and the related operations. There are also proposals for C# 8 that are very exciting, and will support "non nullable reference type" scenarios.
But the bolded text above is the fundamental problem you've pinpointed: the semantics of operators on nullable reference types are almost always "lift the non-nullable version of the operator to nullable types; if all the operands are non-null then the result is the same as the unlifted version; otherwise, the result is null". However, those semantics are not automatically extended to the . member access operator or the () call operator, regardless of whether the operands are nullable value types or nullable reference types. . can be lifted explicitly by ?. but the () operator does not get lifted to nullable semantics, ever.
Imagine a language like C# 1.0, but with Nullable<T> built in from the start, such that it applied to both reference and value types. In that world, you can see natural ways to implement generalized lifting, where if you have a method
class C { string M(double x, int[] y) { ... } }
and you call it with a Nullable<C> receiver, or Nullable<double> and Nullable<int[]> arguments, you automatically get the code that we build for you for nullable integer arithmetic: check whether the receiver or any arguments are null, and if they are, result in a null Nullable<string>. Otherwise, call the function normally and use the non-nullable result.
The C# compiler already implements these semantics for all user-defined operators declared on struct types; it would be hardly any difficulty at all to extend those semantics to other kinds of methods. But we can't do it now; there are far too many backwards-compatibility issues to solve.
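For example, here is the lifting the compiler already performs today for a user-defined operator on a struct (Money is a hypothetical type, for illustration):

struct Money
{
    public decimal Amount;
    public static Money operator +(Money a, Money b) =>
        new Money { Amount = a.Amount + b.Amount };
}

Money? m1 = new Money { Amount = 1m };
Money? m2 = null;
Money? total = m1 + m2;  // null: the compiler lifted operator+ to Nullable<Money>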
This design choice would also have the nice property that it would be a correct implementation of the "maybe monad".
But that's not the world we are in, and it's largely because these design considerations evolved over time, rather than being invented all at once. The next time you invent a new language, consider carefully how to represent nullability!
Does maintenance become a nightmare in code that uses nullable value types? I realize that int? is the equivalent of Nullable<int>, but my question is more geared towards the usability of it. We see value types and naturally overlook them as not allowing null. But bringing in Nullable<T> with a shorthand of the question mark, it's obvious what it does but not always noticeable.
Is this one of those features that just because you can do it, doesn't mean you should?
What should be the preference? A default value of a value type (i.e. int SomeConfigOption = -1;) or utilizing Nullable<T> (i.e. int? SomeConfigOption;)?
What should be the preference? A default value of a value type (i.e. int SomeConfigOption = -1;) or utilizing Nullable<T> (i.e. int? SomeConfigOption;)?
In this case you clearly want Nullable<T> whenever you have to account for the absence of a value. Magic numbers like -1 are a far worse maintenance nightmare.
This is a core feature of the C# language. As with other features, it can be abused, but it provides clear benefits as well; these benefits far outweigh any problems someone not proficient in the language might have reading the source code. Time to get up to speed.
I think Nullable<T> looks nice: code with nullable types is quite self-documenting.
Examples:
int? someConfigOption;

if (someConfigOption.HasValue)
{
    // Use someConfigOption.Value property.
}
else
{
    // Value is absent.
}
Another handy approach:
int value = someConfigOption.GetValueOrDefault();
Of course, the methods which take Nullable values as their parameters should be well documented.
I much prefer a nullable type to a value type with a default value (which the developer then treats as meaning null). I have found more issues in code where default values are used to mean nothing.
Nullable types should be used when "null" is a valid value: in that case, using them is good practice. But if null is not a valid value, then carrying the null value around is bad practice.
Suppose a function GetClientID that reads from the DB and returns a ClientID. Let's assume that GetClientID should never return NULL or empty.
While reading the value from the DB, a nullable type is the best practice for handling possible nulls from the DB. GetClientID therefore should use the nullable type internally to handle the exceptional situation correctly (failing / logging / etc.) and to make sure only valid values are returned.
The worst practice would be to return a nullable type. You can potentially be carrying an already invalid value that would be read somewhere in your code, failing far from where the invalid value was loaded (and becoming a maintenance nightmare).
So ask yourself whether the null value is a valid value (valid is not the same as possible), and code accordingly.
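A sketch of that pattern (GetClientId and the data-access call are hypothetical):

public int GetClientId(string clientName)
{
    // Nullable at the boundary: the DB can legitimately return NULL or no row.
    int? rawId = QueryClientIdFromDatabase(clientName);

    if (rawId == null)
    {
        // Handle the exceptional situation close to the source: fail, log, etc.
        throw new InvalidOperationException($"No client id found for '{clientName}'.");
    }

    // Return a plain int, so an invalid value never travels through the rest of the code.
    return rawId.Value;
}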
I heard that the addition of the Nullable<T> type in C# 2.0 required a small revision of the CLR (runtime); was this change necessary? Could the same goal have been achieved by adding only a new Nullable<T> generic class?
Nullable isn't a generic class as you indicate; Nullable<T> is generic (it has a type parameter, T). That's why Nullable<T> only arrived in C# 2.0: it came with the addition of generics to the CLR.
You could do the same thing with a general Nullable, but you couldn't do things like this:
int? myInt = 123;
int result = myInt.Value;
You would instead have to:
int result = (int)myInt.Value;
...and it might not be type-safe: what if myInt.Value were a string? The generic version, Nullable<T>, only allows an int into the Value property.
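A sketch of what such a pre-generics, non-generic Nullable might have looked like (hypothetical, for illustration only):

// The value has to be stored as object, so every access needs a cast.
public class Nullable
{
    public Nullable(object value) { Value = value; }
    public object Value { get; }
    public bool HasValue => Value != null;
}

Nullable myInt = new Nullable(123);
int result = (int)myInt.Value;  // cast required; throws at runtime if Value is actually a string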
I don't completely understand what you're asking, though: "why are generic classes useful?"
If I understand you correctly, you are asking why it couldn't just be a type introduced in the framework library, and why the CLR needed to be changed instead?
Based on my understanding, Nullable<T> is a special type, not quite like other container types. First, it is actually a value type: defined as a struct, not a class. Second, it allows being assigned a value of null (which is a special case for a value type), and it supports the '?' syntax and the '??' operator, which were all new. Nullable<T> also became part of the common type system. So I guess, from these perspectives, the specification, the compiler, and the CLR all needed to be changed.
There are only two reasons I know of for the special handling of Nullable<T>:
To allow a typecast from null Object reference to a Nullable<T> with HasValue = false.
To allow a Nullable<T> where HasValue is false, to compare equal to null.
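Both special cases in action:

object nothing = null;
int? fromNull = (int?)nothing;        // (1) casting a null reference yields HasValue == false
Console.WriteLine(fromNull.HasValue); // False

int? none = null;
Console.WriteLine(none == null);      // (2) True: an empty Nullable<T> compares equal to null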
Frankly, I think it would have been better to let Nullable<T> box like any other value type, and to define Nullable<T>.Empty as a value which may be compared against (for those cases where one might want to compare against a variable that might be null or might hold a value). To my mind, there's no reason why Object.Equals should report that an int? which is equal to null is equal to a long? which is also equal to null. The first should be viewed as an empty int-sized box, and the latter as an empty long-sized box.
Why is it not allowed to assign null to a DateTime in C#? How has this been implemented? And can this feature be used to make your own classes non-nullable?
Example:
string stringTest = null; // Okay
DateTime dateTimeTest = null; // Compile error
I know that I can use DateTime? in C# 2.0 to allow null to be assigned to dateTimeTest and that I could use Jon Skeet's NonNullable class on my string to get a run time error on the assignment of stringTest. I'm just wondering why the two types behave differently.
DateTime is a value type (struct), whereas string is a reference type (class, etc.). That is the key difference. A reference can always be null; a value can't (unless it uses Nullable<T>, i.e. DateTime?), although it can be zeroed (DateTime.MinValue), which is often interpreted as the same thing as null (especially in .NET 1.1).
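For example:

DateTime d1 = default(DateTime); // legal: the zeroed value, DateTime.MinValue
DateTime? d2 = null;             // legal: Nullable<DateTime> can represent "no value"
// DateTime d3 = null;           // error CS0037: cannot convert null to 'DateTime'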
DateTime is a struct and not a class. Do a 'go to definition' or look at it in the object browser to see.
HTH!
The important distinction between value types and reference types is that value types have "value semantics". A DateTime, an Int32, and all other value types have no identity; an Int32 "42" is essentially indistinguishable from any other Int32 with the same value.
All value type "objects" exist either on the stack or as part of a reference type object. One special case is when you cast a value type instance to an Object or an interface: this is called "boxing", and it simply creates a dummy reference-type object which only contains the value, which can be extracted back ("unboxed").
Reference types, on the other hand, have an identity. A "new Object()" does not equal any other "new Object()", because they are separate instances on the GC heap. Some reference types provide an Equals method and overloaded operators so that they behave more value-like, e.g. a String "abc" equals another "abc" String even if they are in fact two different objects.
So when you have a reference, it can either contain the address of a valid object, or it can be null. When value type objects are all-zero, they are simply zero: e.g. an integer zero, a float zero, a Boolean false, or DateTime.MinValue. If you need to distinguish between "zero" and "value missing/null", you need to use either a separate Boolean flag or, better yet, the Nullable<T> struct in .NET 2.0, which is simply the value plus a Boolean flag. There's also support in the CLR so that boxing a Nullable<T> with HasValue=false results in a null reference, not in a boxed structure with false+zero, as it would if you were to implement this structure yourself.
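A sketch contrasting the CLR's special boxing support with a hand-rolled equivalent (MyNullable is hypothetical):

struct MyNullable<T> where T : struct
{
    public bool HasValue;
    public T Value;
}

int? builtIn = null;
object a = builtIn;            // null: the CLR boxes an empty Nullable<T> to a null reference
Console.WriteLine(a == null);  // True

MyNullable<int> handRolled = default(MyNullable<int>);
object b = handRolled;         // not null: an ordinary boxed struct holding false + zero
Console.WriteLine(b == null);  // False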
DateTime is a value type, same as an int. Only reference types (like string or MyCustomObject) can be null. Reference types really store "references" to the object's location on the heap.
Here's an article I found that explains it better, and here's the MSDN article on it.
For a value type to be null, there must be some value it can hold which would have no other legitimate meaning, and which the system will somehow know should be regarded as "null". Some value types could meet the first criterion without requiring any extra storage. If .net had been designed from the ground up with the concept of nullable values in mind, it could have had Object include a virtual IsLogicalNull property, and a non-virtual IsNull which would return true if this is null and otherwise invoke its IsLogicalNull property and return the result. If .net had done this, it would have avoided the need for the quirky boxing behavior and struct constraint of Nullable<T> (an empty Nullable<T> could be boxed as an empty Nullable<T>, and still be recognized as null).
By the time it was decided to provide support for nullable value types in .net framework 2.0, however, a lot of code had been written which assumed that the default values for things like Guid and DateTime would not be regarded as null. Since much of the value in nullable types lies with their predictable default value (i.e. null), having types which had a null value but defaulted to something else would have added more confusion than value.
string is a class whereas DateTime is a structure. That's why you cannot set it to null.