What does the ! in a C# null! expression mean? [duplicate] - c#

I've recently seen the following code:
public class Person
{
//line 1
public string FirstName { get; }
//line 2
public string LastName { get; } = null!;
//assign null is possible
public string? MiddleName { get; } = null;
public Person(string firstName, string lastName, string middleName)
{
FirstName = firstName;
LastName = lastName;
MiddleName = middleName;
}
public Person(string firstName, string lastName)
{
FirstName = firstName;
LastName = lastName;
MiddleName = null;
}
}
Basically, I'm trying to dig into the new C# 8 features. One of them is nullable reference types.
Actually, there are a lot of articles and information about it already. E.g. this article is quite good.
But I didn't find any information about this new expression, null!
Can someone provide me an explanation for it?
Why do I need to use this?
And what is the difference between line1 and line2?

TL;DR
The key to understanding what null! means is understanding the ! operator. You may have used it before as the "not" operator. However, since C# 8.0 and its new "nullable reference types" feature, the operator has a second meaning. Written after an expression, it controls nullability; it is then called the "null-forgiving operator".
Basically, null! applies the ! operator to the value null. This overrides the nullability of the value null to non-nullable, telling the compiler to treat null as a non-null value.
Typical usage
Assuming this definition:
class Person
{
// Not every person has a middle name. We express "no middle name" as "null"
public string? MiddleName;
}
The usage would be:
void LogPerson(Person person)
{
Console.WriteLine(person.MiddleName.Length); // WARNING: may be null
Console.WriteLine(person.MiddleName!.Length); // No warning
}
This operator basically turns off the compiler null checks for this usage.
Technical Explanation
The groundwork that you will need to understand what null! means.
Null Safety
C# 8.0 tries to help you manage your null values. Instead of allowing you to assign null to everything by default, the language designers flipped things around: you now have to explicitly mark everything you want to be able to hold a null value.
This is a super useful feature. It helps you avoid NullReferenceExceptions by forcing you to make a decision and then enforcing it.
How it works
There are 2 states a variable can be in - when talking about null-safety.
Nullable - Can be null.
Non-Nullable - Can not be null.
Since C# 8.0, with the nullable context enabled, all reference types are non-nullable by default.
Value types have been non-nullable since C# 2.0!
The "nullability" can be modified by 2 new pieces of syntax:
! = from Nullable to Non-Nullable (a postfix operator applied to an expression)
? = from Non-Nullable to Nullable (an annotation applied to a type)
They are counterparts to one another.
The compiler uses the information you provide with them to ensure null-safety.
Examples
? Operator usage.
This operator tells the compiler that a variable can hold a null value. It is used when defining variables.
Nullable string? x;
x is a reference type - So by default non-nullable.
We apply the ? operator - which makes it nullable.
x = null Works fine.
Non-Nullable string y;
y is a reference type - So by default non-nullable.
y = null Generates a warning since you assign a null value to something that is not supposed to be null.
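Nice to know: int? is syntactic sugar for System.Nullable<int>, but object? is not sugar for System.Nullable<object> (Nullable<T> only accepts value types). On a reference type, ? is purely a compile-time annotation; the runtime type is unchanged.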
! Operator usage.
This operator tells the compiler that something that could be null, is safe to be accessed. You express the intent to "not care" about null safety in this instance. It is used when accessing variables.
string x;
string? y;
x = y
Not OK! Warning: "y" may be null
The left side of the assignment is non-nullable but the right side is nullable.
The code still compiles, but it is flagged because it is semantically incorrect.
x = y!
OK, no warning!
y is a reference type with the ? type modifier applied, so it is nullable if not proven otherwise.
We apply ! to y, which overrides its nullability and makes the compiler treat it as non-nullable.
The right and left sides of the assignment are now both non-nullable, which is semantically correct.
WARNING The ! operator only turns off the compiler-checks at a type-system level - At runtime, the value may still be null.
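To make the warning above concrete, here is a minimal sketch (the class and method names are made up for illustration) showing that ! leaves runtime behavior untouched: a null that was "forgiven" still throws when it is dereferenced.

```csharp
#nullable enable
using System;

// Hypothetical helper, not part of any library.
public static class NullForgivingDemo
{
    // Returns true if dereferencing the forgiven value throws at runtime.
    public static bool ThrowsWhenForgiven(string? maybeNull)
    {
        // `!` only silences the compiler warning; no runtime check is emitted.
        string treatedAsNonNull = maybeNull!;
        try
        {
            _ = treatedAsNonNull.Length; // throws if the value is actually null
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }
}
```

Calling NullForgivingDemo.ThrowsWhenForgiven(null) returns true, while passing a real string such as "abc" returns false.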
Use carefully!
You should try to avoid the null-forgiving operator. Its use may be the symptom of a design flaw in your system, since it negates the null-safety guarantees you get from the compiler.
Reasoning
Using the ! operator can create very hard-to-find bugs. If you have a property that is marked non-nullable, you will assume you can use it safely. But at runtime, you suddenly run into a NullReferenceException and scratch your head, because a value actually became null after bypassing the compiler checks with !.
Why does this operator exist then?
There are valid use-cases (outlined in detail below) where usage is appropriate. However, in 99% of the cases, you are better off with an alternative solution. Please do not slap dozens of !'s in your code, just to silence the warnings.
In some (edge) cases, the compiler is not able to detect that a nullable value is actually non-nullable.
Easier legacy code-base migration.
In some cases, you just don't care if something becomes null.
When working with Unit-tests you may want to check the behavior of code when a null comes through.
Ok!? But what does null! mean?
It tells the compiler to treat null as a non-nullable value. Sounds weird, doesn't it?
It is the same as y! from the example above. It only looks weird since you apply the operator to the null literal. But the concept is the same. In this case, the null literal is treated like any other expression.
The null literal is the only expression that is nullable by default! But as we learned, the nullability of any expression can be overridden with ! to non-nullable.
The type system does not care about the actual runtime value of a variable, only about its compile-time type. In your example, the value you assign to LastName (null!) is considered non-nullable, which is valid as far as the type system is concerned.
Consider this (invalid) piece of code.
object? null;
LastName = null!;

When the "nullable reference types" feature is turned on, the compiler tracks which values in your code it thinks may be null or not. There are times where the compiler could have insufficient knowledge.
For example, you may be using a delayed initialization pattern, where the constructor doesn't initialize all the fields with actual (non-null) values, but you always call an initialization method which guarantees the fields are non-null. In such case, you face a trade-off:
if you mark the field as nullable, the compiler is happy, but you have to unnecessarily check for null when you use the field,
if you leave the field as non-nullable, the compiler will complain that it is not initialized by the constructors (you can suppress that with null!), then the field can be used without null check.
Note that by using the ! suppression operator, you are taking on some risk. Imagine that you are not actually initializing all the fields as consistently as you thought. Then the use of null! to initialize a field covers up the fact that a null is slipping in. Some unsuspecting code can receive a null and therefore fail.
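The delayed initialization trade-off can be sketched roughly as follows. ReportBuilder, Initialize and Build are hypothetical names chosen for illustration; the point is only that null! moves the responsibility for the invariant from the compiler to you.

```csharp
#nullable enable
using System;

// Hypothetical class illustrating delayed initialization.
public sealed class ReportBuilder
{
    // Non-nullable field; `null!` suppresses the "not initialized" warning.
    private string _title = null!;

    // The invariant the compiler cannot see: this must run before Build().
    public void Initialize(string title) => _title = title;

    // No null check and no warning here, but a NullReferenceException
    // is thrown if Initialize was skipped.
    public string Build() => _title.ToUpperInvariant();
}
```

If Initialize is always called first, Build works without any null checks. If it is ever skipped, the failure only shows up at runtime, which is exactly the risk described above.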
More generally, you may have some domain knowledge: "if I checked a certain method, then I know that some value isn't null":
if (CheckEverythingIsReady())
{
// you know that `field` is non-null, but the compiler doesn't. The suppression can help
UseNonNullValueFromField(this.field!);
}
Again, you must be confident of your code's invariant to do this ("I know better").

null! is used to assign null to non-nullable variables, which is a way of promising that the variable won't be null when it is actually used.
I'd use null! in a Visual Studio extension, where properties are initialized by MEF via reflection:
[Import] // Set by MEF
VSImports vs = null!;
[Import] // Set by MEF
IClassificationTypeRegistryService classificationRegistry = null!;
(I hate how variables magically get values in this system, but it is what it is.)
I also use it in unit tests to mark variables initialized by a setup method:
public class MyUnitTests
{
IDatabaseRepository _repo = null!;
[OneTimeSetUp]
public void PrepareTestDatabase()
{
...
_repo = ...
...
}
}
If you don't use null! in such cases, you'll have to use an exclamation mark every single time you read the variable, which would be a hassle without benefit.
Note: cases where null! is a good idea are fairly rare. I treat it as somewhat of a last resort.

I don't believe this question can be discussed without specific C# version being used.
public string RpyDescription { get; set; } = null!;
This is common in .NET 6 ... it is one way to declare a string property as non-nullable without assigning it in a constructor. One reason this exists is when one is working with SQL databases where a field can be set to "Allow Null". Even more so when working with JSON structures that accept null. EF needs to know.
Reference Type (Heap - pointer to memory location where the data is stored) vs. Value Type (Stack - memory location where data is stored).
.NET 6 (C# 10) enables the nullable context in the project templates by default (prior to this, the nullable context was disabled by default).
In EF/Core, it's very important to understand relationship between database null and model/entities null.

This question needs to be updated. In C# 10, in object-relational mapping, this operator combined with the ? operator is critical: it is a way to tell other coders on the project and future coders, and to remind yourself, how the data is going to end up being consumed and what the null rules regarding that data are.
public string Data { get; set; } = null!;
public string? Nullable { get; set; }
The beauty of this is that you don't have to go and look at the API docs(which you might not have), or go look through your database (which you might not even have access to). You already know by glancing at the class what the rules regarding null values are.
The downside is that if you instantiate the class and don't have the information you need to set the NOT NULL values, you will get strange default values that don't always make sense. First of all, this doesn't happen nearly as commonly as people think, and it often comes from lazy programming. When you instantiate a class, you should be calculating or assigning those NOT NULL properties right away. If you declare a car, and in your database or API cars have wheels, and you declare the car without assigning values to the wheel properties,
then did you truly instantiate the car in a meaningful way? You certainly didn't define the car the way the database understands a car: it's a car with no wheels, and it shouldn't be. Doing this might be convenient, but it goes against basic principles of object-oriented programming.
Are there exceptions? For example, perhaps the value can't be known at that point in time, or not until other events have transpired. In these edge cases, create a meaningful default value manually; if your default value hits the database, you will know EXACTLY what's wrong.
For people discussing unit tests: why would you test the null case when the null case is impossible in the context of the object? There is no need to understand how a car without wheels would behave, for example. This is a silly, badly designed test.
I would go so far as to say that when it comes to strings this operator should be used the overwhelming majority of the time. When we declare int, or bool, or float, we have the understanding that these cannot be null unless we say so explicitly. Why in the world would you advocate for a different rule for strings!? The fact that we couldn't do this with strings previously was a design flaw.

Related

Wondering why to use `string?` instead of `string` in a property declaration

I have always developed using ASP.NET Framework, where I used a string property this way: public string FirstName { get; set; }.
Now I started a new .NET 6 project and declared the same property in a custom IdentityUser class.
In this case, Visual Studio told me that it is better to use nullable string. Why does it suggest that since a string type can be already null?
Suggestion message appears in Spanish but it is basically what I have described.
That suggestion is gone when I use string?. Note that if I use Nullable<string> shows a compiler error since string is a reference type.
Just wondering.
Thanks
Jaime
No need to use Nullable<> type. That's for value types. Just decide if you want to use the Nullable Context. You will find this setting under
Project Properties >> Build >> General >> Nullable.
You have two basic options here.
Turn off the setting to make your code work like before (I would not do this)
Make your code honor the setting.
When this Nullable Context is enabled, you are telling the compiler the following:
Any reference type declared without the ? symbol may never be null. It must always refer to a valid object.
Likewise, any reference type declared with the ? symbol right after the type may be null, just like old-school C# code.
That goes for strings, or any other reference type. The code analyzer checks for this and complains. So when you declare your property like this...
public string FirstName { get; set; }
Then you need to ensure in your constructor that this property is initialized to some valid string object or the code-analyzer will complain. Or you could declare it with a default value like this:
public string FirstName { get; set; } = string.Empty;
If you want to be able to set the FirstName to null -- if that actually makes sense for your project -- then declare it with the question mark
public string? FirstName { get; set; }
Now it acts like an old style reference type, before C# 8.0.
Once you turn this setting on you'll find yourself dealing with this a lot. If you have warnings set to the highest level you'll be forced to chase down all these things in your code and address them. This is a good thing. Don't avoid it. Yes it is a pain to address up front but it saves you countless headaches down the road.
I once did it for an established application and spent two full days fixing all the warnings it generated. But in the year or two since then, I have lost count of the number of times this feature made the compiler/code-analyzer catch me failing to initialize a non-nullable reference type and saved me from a potential NullReferenceException down the line. I think this feature is priceless.
Nullable Context forces you to think about every reference type you have when you write the API. Can that reference be null or not? It makes you set the rule and honor it and is a lifesaver in a large project with multiple developers
Note: If you do use Nullable Context, you will want to be familiar with a very useful code attribute to use on some out values for functions: the [MaybeNullWhen] attribute (in System.Diagnostics.CodeAnalysis). It comes in handy when you write a Try-type function to retrieve a reference type.
For example, suppose you wrote a function like this. This would work fine before but generates warnings with Nullable Context enabled
public bool TryGetWidget(out Widget value)
{
value = null; // *** WARNING: null assigned to a non-nullable out parameter
if (/* some code that tries to retrieve the widget */)
value = retrievedWidget;
return value != null;
}
The implication here is that the function might not succeed and return false. But if Widget is a reference type, then it will have to set the out value to null in this case. Nullable Context will complain about that. You declared the out value as out Widget, not out Widget?, so you cannot set it to null. So what do you do?
You could change the argument to out Widget?
public bool TryGetWidget(out Widget? value)
That would make the function build. But then the Nullable Context would complain if you tried to use the out value after the function returned true.
if (TryGetWidget(out var widget))
widget.SomeFunction(); // WARNING: `widget` might be null
Seems you can't win either way, doesn't it? Unless you use [MaybeNullWhen(false)] on the out parameter
public bool TryGetWidget([MaybeNullWhen(false)] out Widget value)
Now, your function will compile and your code that calls it will compile. The compiler/code-analyzer is smart enough to realize that a true return means you can use the out reference and a false returns means you cannot.
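Here is a self-contained sketch of that pattern. Widget and WidgetStore are hypothetical types invented for this example, and the dictionary lookup merely stands in for "some code that tries to retrieve the widget"; the attribute itself lives in System.Diagnostics.CodeAnalysis.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

// Hypothetical types used only for illustration.
public sealed class Widget
{
    public string Name { get; }
    public Widget(string name) => Name = name;
}

public sealed class WidgetStore
{
    private readonly Dictionary<string, Widget> _widgets =
        new Dictionary<string, Widget>();

    public void Add(Widget widget) => _widgets[widget.Name] = widget;

    // When this returns false, `value` may be null; when it returns true,
    // the compiler treats `value` as non-null at the call site.
    public bool TryGetWidget(string name, [MaybeNullWhen(false)] out Widget value)
        => _widgets.TryGetValue(name, out value);
}
```

A caller can then write if (store.TryGetWidget("gear", out var widget)) and use widget.Name inside the if block without a warning, while using widget after a false return is flagged.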
In addition to the ? operator there is also the ! operator. This one tells the compiler, "Ignore nullability. Assume this reference is valid." So if you had a function like this
Widget? GetWidget();
The following code would produce a warning
GetWidget().SomeFunction(); // WARNING: Return value of GetWidget() might be null
but this would compile cleanly by forcing the compiler to pretend the return value is valid.
GetWidget()!.SomeFunction(); // Better hope that returned reference is valid.

Intended or less-wrong pattern for using non-nullable types of class properties

There's a great question here about nullable types:
Non-nullable property must contain a non-null value when exiting constructor. Consider declaring the property as nullable
Which has some great ways to overcome the warning, but I'm interested in the reasoning behind using any particular method.
I'm working with EF Core to craft an API. I'm being careful to never instantiate a model class without having the correct data, but the class doesn't know that so we get the typical "Non-nullable property must contain..." warning.
At least in my head, I want the null reference error to happen when appropriate, to be passed back to a client or to prevent inappropriately null/empty values being passed down.
Example: a "this field required situation", if it somehow gets past the front-end validation an exception should be thrown, and the exception handler returns an appropriate BadRequest response to the client. I'm currently doing this rather than handling each individual property with an explicit validation method, at least for required properties.
It seems disingenuous to try to accommodate the possibility of a null being passed in by:
Declaring the property nullable: this property really shouldn't be nullable, so I would just be allowing it in order to suppress the warning, which may cause functional issues or extraneous considerations elsewhere.
Setting a null-forgiveness null!: which seems appropriate in some situations, but not anywhere where the property should be set and not-null at all times.
Setting a default such as string.Empty: again appropriate in some situations, but not where the property should be set to something substantial and meaningful.
Setting up a constructor: this seems a lot of extra work to do essentially the same thing, and I'm not certain this overcomes the warning as much as moves it.
Am I wrong in wanting to use an exception to catch when an instantiation doesn't set or sets null a non-nullable property? I only question this because of the warning about the non-nullable property.
Is there some other pattern I'm unaware of that would be more appropriate, and that's why the warning?
Am I just putting too much stock in the warning?
Declaring the property nullable: this property really shouldn't be nullable...
Shouldn't be nullable ... or cannot be? The compiler can only tell about can be or cannot be null.
Declaring a type as non-nullable means that you really don't expect this value ever to be null. The compiler helps you to ensure that you never assign null to the field or forget to initialize it. That is the reason why the compiler expects you to initialize all non-nullable fields and properties to non-null values.
If you know for sure that the fields are initialized to non-null values you can declare them as non-nullable and use the null-forgiving operator: private string _name = default!; This can be useful when you know that the fields will definitely be initialized by an initializer method rather than the constructor before you use them.
If you want to throw an exception at runtime if the value is null you actually DO expect that it might be null in some cases. In that case you really should mark those fields as nullable. You then decide whether a null check is needed or when the warning can be suppressed with ! or when an exception should be thrown.
There are also a bunch of attributes that assist the compiler in its null-state analysis. This is an example of a property with a non-nullable type where the setter allows null anyway:
private string? _name;
[AllowNull]
public string Name
{
get => _name ?? string.Empty;
set => _name = value;
}
Here we tell the compiler that the out parameter is not null when the method returns true:
public bool TryGetValue(string key, [NotNullWhen(true)] out string? value) ...
This can also be useful in some scenarios:
private string? _name;
[MemberNotNull(nameof(_name))]
private void NullGuard()
{
if (_name is null)
{
throw new InvalidOperationException("Name really shouldn't be null");
}
// _name must be known to be not null when returning from the method
}
The attribute tells the compiler that _name is not null when NullGuard returns:
NullGuard();
int length = _name.Length; // no warning because NullGuard has the MemberNotNull attribute
Am I just putting too much stock in the warning?
Probably yes. Don't forget that nullable reference types is just a tool which tries to help you not to run into unwanted NullReferenceExceptions. It also is a better way of expressing your intent.

What's the difference between a is string and a !is string in c# [duplicate]

I've recently seen the following code:
public class Person
{
//line 1
public string FirstName { get; }
//line 2
public string LastName { get; } = null!;
//assign null is possible
public string? MiddleName { get; } = null;
public Person(string firstName, string lastName, string middleName)
{
FirstName = firstName;
LastName = lastName;
MiddleName = middleName;
}
public Person(string firstName, string lastName)
{
FirstName = firstName;
LastName = lastName;
MiddleName = null;
}
}
Basically, I try to dig into new c# 8 features. One of them is NullableReferenceTypes.
Actually, there're a lot of articles and information about it already. E.g. this article is quite good.
But I didn't find any information about this new statement null!
Can someone provide me an explanation for it?
Why do I need to use this?
And what is the difference between line1 and line2?
TL;DR
The key to understanding what null! means is understanding the ! operator. You may have used it before as the "not" operator. However, since C# 8.0 and its new "nullable-reference-types" feature, the operator got a second meaning. It can be used on a type to control Nullability, it is then called the "Null Forgiving Operator".
Basically, null! applies the ! operator to the value null. This overrides the nullability of the value null to non-nullable, telling the compiler that null is a "non-null" type.
Typical usage
Assuming this definition:
class Person
{
// Not every person has a middle name. We express "no middle name" as "null"
public string? MiddleName;
}
The usage would be:
void LogPerson(Person person)
{
Console.WriteLine(person.MiddleName.Length); // WARNING: may be null
Console.WriteLine(person.MiddleName!.Length); // No warning
}
This operator basically turns off the compiler null checks for this usage.
Technical Explanation
The groundwork that you will need to understand what null! means.
Null Safety
C# 8.0 tries to help you manage your null-values. Instead of allowing you to assign null to everything by default, they have flipped things around and now require you to explicitly mark everything you want to be able to hold a null value.
This is a super useful feature, it allows you to avoid NullReferenceExceptions by forcing you to make a decision and enforcing it.
How it works
There are 2 states a variable can be in - when talking about null-safety.
Nullable - Can be null.
Non-Nullable - Can not be null.
Since C# 8.0 all reference types are non-nullable by default.
Value types have been non-nullable since C# 2.0!
The "nullability" can be modified by 2 new (type-level) operators:
! = from Nullable to Non-Nullable
? = from Non-Nullable to Nullable
These operators are counterparts to one another.
The Compiler uses the information, you define with those operators, to ensure null-safety.
Examples
? Operator usage.
This operator tells the compiler that a variable can hold a null value. It is used when defining variables.
Nullable string? x;
x is a reference type - So by default non-nullable.
We apply the ? operator - which makes it nullable.
x = null Works fine.
Non-Nullable string y;
y is a reference type - So by default non-nullable.
y = null Generates a warning since you assign a null value to something that is not supposed to be null.
Nice to know: Using object? is basically just syntactic sugar for System.Nullable<object>
! Operator usage.
This operator tells the compiler that something that could be null, is safe to be accessed. You express the intent to "not care" about null safety in this instance. It is used when accessing variables.
string x;
string? y;
x = y
Illegal! Warning: "y" may be null
The left side of the assignment is non-nullable but the right side is nullable.
So it does not work, since it is semantically incorrect
x = y!
Legal!
y is a reference type with the ? type modifier applied so it is nullable if not proven otherwise.
We apply ! to y which overrides its nullability settings to make it non-nullable
The right and left side of the assignment are non-nullable. Which is semantically correct.
WARNING The ! operator only turns off the compiler-checks at a type-system level - At runtime, the value may still be null.
Use carefully!
You should try to avoid using the Null-Forgiving-Operator, usage may be the symptom of a design flaw in your system since it negates the effects of null-safety you get guaranteed by the compiler.
Reasoning
Using the ! operator will create very hard to find bugs. If you have a property that is marked non-nullable, you will assume you can use it safely. But at runtime, you suddenly run into a NullReferenceException and scratch your head. Since a value actually became null after bypassing the compiler-checks with !.
Why does this operator exist then?
There are valid use-cases (outlined in detail below) where usage is appropriate. However, in 99% of the cases, you are better off with an alternative solution. Please do not slap dozens of !'s in your code, just to silence the warnings.
In some (edge) cases, the compiler is not able to detect that a nullable value is actually non-nullable.
Easier legacy code-base migration.
In some cases, you just don't care if something becomes null.
When working with Unit-tests you may want to check the behavior of code when a null comes through.
Ok!? But what does null! mean?
It tells the compiler that null is not a nullable value. Sounds weird, doesn't it?
It is the same as y! from the example above. It only looks weird since you apply the operator to the null literal. But the concept is the same. In this case, the null literal is the same as any other expression/type/value/variable.
The null literal type is the only type that is nullable by default! But as we learned, the nullability of any type can be overridden with ! to non-nullable.
The type system does not care about the actual/runtime value of a variable. Only its compile-time type and in your example the variable you want to assign to LastName (null!) is non-nullable, which is valid as far as the type-system is concerned.
Consider this (invalid) piece of code.
object? null;
LastName = null!;
When the "nullable reference types" feature is turned on, the compiler tracks which values in your code it thinks may be null or not. There are times where the compiler could have insufficient knowledge.
For example, you may be using a delayed initialization pattern, where the constructor doesn't initialize all the fields with actual (non-null) values, but you always call an initialization method which guarantees the fields are non-null. In such case, you face a trade-off:
if you mark the field as nullable, the compiler is happy, but you have to un-necessarily check for null when you use the field,
if you leave the field as non-nullable, the compiler will complain that it is not initialized by the constructors (you can suppress that with null!), then the field can be used without null check.
Note that by using the ! suppression operator, you are taking on some risk. Imagine that you are not actually initializing all the fields as consistently as you thought. Then the use of null! to initialize a field covers up the fact that a null is slipping in. Some unsuspecting code can receive a null and therefore fail.
More generally, you may have some domain knowledge: "if I checked a certain method, then I know that some value isn't null":
if (CheckEverythingIsReady())
{
// you know that `field` is non-null, but the compiler doesn't. The suppression can help
UseNonNullValueFromField(this.field!);
}
Again, you must be confident of your code's invariant to do this ("I know better").
null! is used to assign null to non-nullable variables, which is a way of promising that the variable won't be null when it is actually used.
I'd use null! in a Visual Studio extension, where properties are initialized by MEF via reflection:
[Import] // Set by MEF
VSImports vs = null!;
[Import] // Set by MEF
IClassificationTypeRegistryService classificationRegistry = null!;
(I hate how variables magically get values in this system, but it is what it is.)
I also use it in unit tests to mark variables initialized by a setup method:
public class MyUnitTests
{
IDatabaseRepository _repo = null!;
[OneTimeSetUp]
public void PrepareTestDatabase()
{
...
_repo = ...
...
}
}
If you don't use null! in such cases, you'll have to use an exclamation mark every single time you read the variable, which would be a hassle without benefit.
Note: cases where null! is a good idea are fairly rare. I treat it as somewhat of a last resort.
I don't believe this question can be discussed without specific C# version being used.
public string RpyDescription { get; set; } = null!;
This is common in .NET 6 Core ... in fact it's required if you want to describe a string as not being nullable. One reason this exists is when one is working with SQL databases where a field can be set to "Allow Null". Even more so when working with JSON structures that accept null. EF needs to know.
Reference Type (Heap - pointer to memory location where the data is stored) of Value Type (Stack - memory location where data is stored).
.NET 6 (C#10) enables nullable context for the project templates by default (prior to this nullable context is disabled by default).
In EF/Core, it's very important to understand relationship between database null and model/entities null.
This question needs to be updated, in C# 10 in object relational mapping, this operator, combined with the ? operator is critical, this is a way to tell other coders on the project, future coders, and remind yourself how the data is going to end up being consumed and the null rules regarding that data.
public string Data { get; set; } = null!;
public string? Nullable { get; set; }
The beauty of this is that you don't have to go and look at the API docs(which you might not have), or go look through your database (which you might not even have access to). You already know by glancing at the class what the rules regarding null values are.
The downside is that if you instantiate the class and don't have the information you need to fill in the NOT NULL values, you will get strange default values that don't always make sense. First of all, this doesn't happen nearly as commonly as people think, and it often comes from lazy programming. When you instantiate a class, you should be calculating or assigning those NOT NULL properties right away. If you declare a car, and in your database or API cars have wheels, and you declare the car without assigning any property values for the wheels, then did you truly instantiate the car in a meaningful way? You certainly didn't define the car the way the database understands a car: it's a car with no wheels, and it shouldn't be. Doing this might be convenient, but it goes against basic principles of object-oriented programming.
Are there exceptions? Yes, for example when the value can't be known at that point in time, or not until other events have transpired. In these edge cases, create a meaningful default value manually; if your default value hits the database, you will know EXACTLY what's wrong.
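One way to read "a meaningful default value" is a sentinel that is easy to recognize if it ever reaches the database. The sketch below is only an illustration: the Car class, the WheelModel property, and the sentinel text are all assumptions, not part of any real model.

```csharp
using System;

// Hypothetical model; the sentinel text is an assumption chosen to be easy to spot.
public class Car
{
    public const string UnknownWheels = "<wheels not assigned>";

    // Non-nullable, and initialized to a value that is obviously a placeholder.
    public string WheelModel { get; set; } = UnknownWheels;

    // Lets calling code detect a car that was never completed.
    public bool HasRealWheels => WheelModel != UnknownWheels;
}
```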
For people discussing unit tests: why would you test the null case when the null case is impossible in the context of the object? There is no need to understand how a car without wheels would behave, for example. That is a silly, badly designed test.
I would go so far as to say that when it comes to strings, this operator should be used the overwhelming majority of the time. When we declare an int, bool, or float, we understand that it cannot be null unless we say so explicitly. Why in the world would you advocate for a different rule for strings? The fact that we couldn't do this with strings previously was a design flaw.

Why is the default value of the string type null instead of an empty string?

It's quite annoying to test all my strings for null before I can safely apply methods like ToUpper(), StartsWith() etc...
If the default value of string were the empty string, I would not have to test, and I would feel it to be more consistent with the other value types like int or double for example.
Additionally Nullable<String> would make sense.
So why did the designers of C# choose to use null as the default value of strings?
Note: This relates to this question, but is more focused on the why instead of what to do with it.
Why is the default value of the string type null instead of an empty
string?
Because string is a reference type and the default value for all reference types is null.
It's quite annoying to test all my strings for null before I can
safely apply methods like ToUpper(), StartsWith() etc...
That is consistent with the behaviour of reference types. Before invoking their instance members, one should put a check in place for a null reference.
If the default value of string were the empty string, I would not have
to test, and I would feel it to be more consistent with the other
value types like int or double for example.
Assigning the default value to a specific reference type other than null would make it inconsistent.
Additionally Nullable<String> would make sense.
Nullable<T> works only with value types. Of note is the fact that Nullable was not part of the original .NET platform, so a lot of code would have broken had they changed that rule. (Courtesy @jcolebrand)
Habib is right -- because string is a reference type.
But more importantly, you don't have to check for null each time you use it. You probably should throw an ArgumentNullException if someone passes your function a null reference, though.
Here's the thing -- the framework would throw a NullReferenceException for you anyway if you tried to call .ToUpper() on a string. Remember that this case still can happen even if you test your arguments for null since any property or method on the objects passed to your function as parameters may evaluate to null.
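A minimal sketch of such an argument check; the Guarded class and the Shout method are hypothetical names. Throwing ArgumentNullException up front names the offending parameter, whereas letting the call fail later produces a less specific NullReferenceException.

```csharp
using System;

public static class Guarded
{
    // Rejects a null argument explicitly instead of failing later on text.ToUpper().
    public static string Shout(string text)
    {
        if (text == null)
        {
            throw new ArgumentNullException(nameof(text));
        }
        return text.ToUpper() + "!";
    }
}
```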
That being said, checking for empty strings or nulls is a common thing to do, so they provide String.IsNullOrEmpty() and String.IsNullOrWhiteSpace() for just this purpose.
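For illustration, a small sketch that uses String.IsNullOrWhiteSpace() to treat null, empty, and whitespace-only input alike; the StringChecks class and the UpperOrEmpty method are hypothetical names.

```csharp
using System;

public static class StringChecks
{
    // Upper-cases the input, treating null, "" and whitespace-only as "no value".
    public static string UpperOrEmpty(string input)
    {
        if (string.IsNullOrWhiteSpace(input))
        {
            return "";
        }
        return input.ToUpper();
    }
}
```

String.IsNullOrEmpty() would let a whitespace-only string through, so which of the two fits depends on whether blank input counts as a value.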
You could write an extension method (for what it's worth):
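// Extension methods must be declared in a static class.
public static class StringExtensions
{
    // Returns the string itself, or "" when it is null.
    public static string EmptyNull(this string str)
    {
        return str ?? "";
    }
}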
Now this works safely:
string str = null;
string upper = str.EmptyNull().ToUpper();
You could also use the following, as of C# 6.0
string myString = null;
string result = myString?.ToUpper();
The string result will be null.
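If a null result is not wanted, the null-conditional operator can be combined with ??. The following is a sketch with hypothetical names (NullConditionalDemo and its methods) showing the three common shapes.

```csharp
using System;

public static class NullConditionalDemo
{
    // Null in, null out: ?. skips the call when value is null.
    public static string UpperOrNull(string value) => value?.ToUpper();

    // ?? supplies a fallback, so the result is never null.
    public static string UpperOrFallback(string value) => value?.ToUpper() ?? "";

    // With a value-type member, the result type becomes nullable (int?).
    public static int? LengthOrNull(string value) => value?.Length;
}
```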
Empty strings and nulls are fundamentally different. A null is an absence of a value and an empty string is a value that is empty.
If the programming language assumes a "value" for a variable, in this case an empty string, that is only as good as initializing the string with any other value that does not cause a null reference problem.
Also, if you pass the handle to that string variable to other parts of the application, that code will have no way of telling whether you intentionally passed a blank value or forgot to populate the variable.
Another case where this would be a problem is when the string is the return value of some function. Since string is a reference type and can technically be either null or empty, the function can technically return either one (there is nothing to stop it from doing so). Now there are 2 notions of the "absence of a value", i.e. an empty string and a null, so all the code that consumes this function has to do 2 checks: one for empty and one for null.
In short, it's always good to have only 1 representation for a single state. For a broader discussion on empty and nulls, see the links below.
https://softwareengineering.stackexchange.com/questions/32578/sql-empty-string-vs-null-value
NULL vs Empty when dealing with user input
Why the designers of C# chose to use null as the default value of
strings?
Because strings are reference types, and the default value of a reference type is null. Variables of reference types store references to the actual data.
Let's use the default keyword to show this:
string str = default(string);
str is a string, so it is a reference type, so default value is null.
int num = default(int);
num is an int, so it is a value type, so its default value is zero.
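The same rule applies to array elements and fields, which start out at the default of their type. A small sketch, with hypothetical names (DefaultsDemo and its methods), that makes the defaults observable:

```csharp
using System;

public static class DefaultsDemo
{
    // Reference type: the default is the null reference.
    public static bool StringDefaultIsNull() => default(string) == null;

    // Value type: the default is the zero value.
    public static int IntDefault() => default(int);

    // Elements of a new array start at the element type's default.
    public static bool NewStringArrayHoldsNulls()
    {
        string[] items = new string[3];
        return items[0] == null;
    }

    public static int FirstOfNewIntArray()
    {
        int[] numbers = new int[3];
        return numbers[0];
    }
}
```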
The fundamental reason/problem is that the designers of the CLS specification (which defines how languages interact with .net) did not define a means by which class members could specify that they must be called directly, rather than via callvirt, without the caller performing a null-reference check; nor did it provide a means of defining structures which would not be subject to "normal" boxing.
Had the CLS specification defined such a means, then it would be possible for .net to consistently follow the lead established by the Component Object Model (COM), under which a null string reference was considered semantically equivalent to an empty string, and for other user-defined immutable class types which are supposed to have value semantics to likewise define default values. Essentially, what would happen would be for each member of String, e.g. Length, to be written as something like [InvokableOnNull()] int String Length { get { if (this==null) return 0; else return _Length;} }. This approach would have offered very nice semantics for things which should behave like values, but because of implementation issues need to be stored on the heap. The biggest difficulty with this approach is that the semantics of conversion between such types and Object could get a little murky.
An alternative approach would have been to allow the definition of special structure types which did not inherit from Object but instead had custom boxing and unboxing operations (which would convert to/from some other class type). Under such an approach, there would be a class type NullableString which behaves as string does now, and a custom-boxed struct type String, which would hold a single private field Value of type String. Attempting to convert a String to NullableString or Object would return Value if non-null, or String.Empty if null. Attempting to cast to String, a non-null reference to a NullableString instance would store the reference in Value (perhaps storing null if the length was zero); casting any other reference would throw an exception.
Even though strings have to be stored on the heap, there is conceptually no reason why they shouldn't behave like value types that have a non-null default value. Having them be stored as a "normal" structure which held a reference would have been efficient for code that used them as type "string", but would have added an extra layer of indirection and inefficiency when casting to "object". While I don't foresee .net adding either of the above features at this late date, perhaps designers of future frameworks might consider including them.
Because a string variable is a reference, not an instance.
Initializing it to Empty by default would have been possible but it would have introduced a lot of inconsistencies all over the board.
If the default value of string were the empty string, I would not have to test
Wrong! Changing the default value doesn't change the fact that it's a reference type and someone can still explicitly set the reference to be null.
Additionally Nullable<String> would make sense.
True point. It would make more sense to not allow null for any reference types, instead requiring Nullable<TheRefType> for that feature.
So why did the designers of C# choose to use null as the default value of strings?
Consistency with other reference types. Now, why allow null in reference types at all? Probably so that it feels like C, even though this is a questionable design decision in a language that also provides Nullable.
Perhaps using the ?? operator when assigning your string variable might help you.
string str = SomeMethodThatReturnsaString() ?? "";
// if SomeMethodThatReturnsaString() returns a null value, "" is assigned to str.
A String is an immutable object which means when given a value, the old value doesn't get wiped out of memory, but remains in the old location, and the new value is put in a new location. So if the default value of String a was String.Empty, it would waste the String.Empty block in memory when it was given its first value.
Although it seems minuscule, it could turn into a problem when initializing a large array of strings with default values of String.Empty. Of course, you could always use the mutable StringBuilder class if this was going to be a problem.
Because string is a reference type, and the default value for reference types is null.
Since you mentioned ToUpper(), and this usage is how I found this thread, I will share this shortcut (string ?? "").ToUpper():
private string _city;
public string City
{
get
{
return (this._city ?? "").ToUpper();
}
set
{
this._city = value;
}
}
Seems better than:
if(null != this._city)
{ this._city = this._city.ToUpper(); }
Maybe the string keyword confused you, as it looks exactly like any other value type declaration, but it is actually an alias to System.String as explained in this question.
Also the dark blue color in Visual Studio and the lowercase first letter may mislead into thinking it is a struct.
Nullable types did not come in until 2.0.
If nullable types had been part of the language from the beginning, then string would have been non-nullable and string? would have been nullable. But they could not do this due to backward compatibility.
A lot of people talk about ref-type or not ref type, but string is an out of the ordinary class and solutions would have been found to make it possible.

Local constant initialised to null reference

I have read that C# allows local constants to be initialised to the null reference, for example:
const string MyString = null;
Is there any point in doing so however? What possible uses would this have?
My guess is that it's because null is a valid value that can be assigned to reference types and nullable value types.
I can't see any reason to forbid this.
There might be some far-off edge cases where this can be useful, for example with multi-targeting and conditional compilation, i.e. you want to define a constant for one platform but define it as null for another due to missing functionality.
An example of a possibly useful usage:
#if SILVERLIGHT
public const string DefaultName = null;
#else
public const string DefaultName = "Win7";
#endif
Indeed, you can initialize local const strings and readonly reference types to null, even though it seems to be redundant since their default value is null anyway.
However, the fact remains that null is a compile-time constant suitable enough to initialize strings and reference types. Therefore, the compiler would have to go out of its way in order to consider, identify and reject this special case, even though it's still perfectly valid from a language standpoint.
The merits of doing that would be disputable, and it probably wouldn't be worth the effort in the first place.
It could be used if you want to write code without keywords, if that strikes your fancy:
const string UninitializedSetting = null;
if (mySetting == UninitializedSetting)
{
Error("mySetting string not initialized");
}
Choosing to name a value (rather than using an in-place magic constant), using const, and setting it to null are more or less orthogonal issues, although I agree that the Venn diagram might have a very small area of overlap for the three :)
A case that I can think of is when you have as much or more throw-away data than you do code, but you want to ensure the values don't accidentally get changed while writing or maintaining your code. This situation is fairly common in unit tests:
[Test]
public void CtorNullUserName()
{
const string expectedUserName = null;
var user = new User(expectedUserName);
Assert.AreEqual(expectedUserName, user.Name, "Expected user name to be unchanged from ctor");
}
You could arguably structure such code in a plethora of ways that didn't involve assigning null to a const, but this is still a valid option.
This might also be useful to help resolve method overloading issues:
public class Something
{
public void DoSomething(int? value) { Console.WriteLine("int?"); }
public void DoSomething(string value) { Console.WriteLine("string"); }
}
// ...
new Something().DoSomething(null); // This is ambiguous, and won't compile
const string value = null;
new Something().DoSomething(value); // Not ambiguous
If you use constants, for example, for configuration of your application then why not? The null value can represent a valid state - e.g. that a logger is not installed. Also note that when you declare a local constant, you can initialize it to a value given by global constant (which may be a more interesting scenario).
EDIT: Another question is, what are good situations for using const anyway? For configuration, you'd probably want a configuration file and other variables usually change (unless they are immutable, but then readonly is a better fit...)
Besides the situations already pointed out, it may have to do with a quirk of the C# language. The C# Specification 3.0, section 8.5.2 states:
The type and constant-expression of a local constant declaration must follow the same rules as those of a constant member declaration (§10.4).
And within 10.4 reads as follows:
As described in §7.18, a constant-expression is an expression that can be fully evaluated at compile-time. Since the only way to create a non-null value of a reference-type other than string is to apply the new operator, and since the new operator is not permitted in a constant-expression, the only possible value for constants of reference-types other than string is null.
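The quoted rule can be seen in a short sketch; the ConstDemo class and its member names are hypothetical. A string constant may hold a literal, while a constant of any other reference type can only be null because new is not allowed in a constant expression.

```csharp
using System;

public static class ConstDemo
{
    // Allowed: a string literal is a constant expression.
    public const string Name = "abc";

    // Allowed: null is the only constant value for non-string reference types.
    public const object Nothing = null;

    // Does not compile: "new" is not permitted in a constant expression.
    // public const object Thing = new object();
}
```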
