Never use Nulls? - c#

We are currently going through the long process of writing some coding standards for C#.
I've written a method recently with the signature
string GetUserSessionID(int UserID)
GetUserSessionID() returns null in the case that a session is not found for the user.
In my calling code I say:
string sessionID = GetUserSessionID(1);
if (null == sessionID && userIsAllowedToGetSession)
{
session = GetNewUserSession(1);
}
In a recent code review, the reviewer said "you should never return null from a method as it puts more work on the calling method to check for nulls."
Immediately I cried shenanigans, as if you return string.Empty you still have to perform some sort of check on the returned value.
if (string.Empty == sessionID)
However, thinking about this further I would never return null in the case of a Collection/Array/List.
I would return an empty list.
The solution to this (I think) would be to refactor this in to 2 methods.
bool SessionExists(int userID);
and
string GetUserSessionID(int UserID);
This time, GetUserSessionID() would throw a SessionNotFound exception (as it should not return null)
now the code would look like...
if (!SessionExists(1) && userIsAllowedToGetSession)
{
session = GetNewUserSession(1);
}
else
{
session = GetUserSessionID(1);
}
This now means that there are no nulls, but to me this seems a bit more complicated. This is also a very simple example and I was wondering how this would impact more complicated methods.
There is plenty of best-practice advice around about when to throw exceptions and how to handle them, but there seems to be less information regarding the use of null.
Does anyone else have any solid guidelines (or even better standards) regarding the use of nulls, and what does this mean for nullable types (should we be using them at all?)
Thanks in advance,
Chris.
Thanks everyone! LOTS of very interesting discussion there.
I've given the answer to egaga as I like their suggestion of Get vs Find as a coding guideline, but all were interesting answers.

Nulls are definitely better, i.e. more honest, than "magic values". But they should not be returned when an error has happened; that's what exceptions are for. When it comes to returning collections, I agree: better an empty collection than null.

Returning null is fine, and the following is concise and easy to understand:
var session = GetUserSessionID(1) ?? GetNewUserSession(1);

A possible practice is to use a Get prefix for methods that throw an exception if the result is not found, and a Find prefix if null is possible. That way it's easy to see on the client side whether the code has to deal with null.
Of course one should avoid nulls, and Anders Hejlsberg has said in an interview that if C# were created now, it would have better ways of dealing with nullability. http://www.computerworld.com.au/article/261958/-z_programming_languages_c?pp=3&fp=&fpid=
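A minimal sketch of that Get/Find convention, assuming a hypothetical in-memory SessionStore (the class, its Add method and the choice of KeyNotFoundException are made up for illustration, not taken from the question):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory store, used only to illustrate the naming convention.
public class SessionStore
{
    private readonly Dictionary<int, string> _sessions = new Dictionary<int, string>();

    public void Add(int userId, string sessionId)
    {
        _sessions[userId] = sessionId;
    }

    // "Find": absence is a normal outcome, so the caller must expect null.
    public string FindSessionId(int userId)
    {
        string sessionId;
        return _sessions.TryGetValue(userId, out sessionId) ? sessionId : null;
    }

    // "Get": absence is treated as an error, so the result is never null.
    public string GetSessionId(int userId)
    {
        string sessionId = FindSessionId(userId);
        if (sessionId == null)
        {
            throw new KeyNotFoundException("No session for user " + userId);
        }
        return sessionId;
    }
}
```

With this split the caller picks the behaviour by name: store.FindSessionId(1) ?? CreateSession(1) when a missing session is expected (CreateSession being whatever the caller uses to make one), and store.GetSessionId(1) when it would be a bug.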

In my opinion you shouldn't rule out the use of null as a return value. I think it is valid in many cases. But you should carefully consider the pros and cons for every method. It totally depends on the purpose of the method and the expectations callers might have.
I personally use a null return value a lot in cases where you might expect that the method does not return an object (e.g. a database search by primary key might return exactly one instance or none / null). When callers can rightly expect a method to return a value, you should throw an exception if it cannot.
In your particular example I think it depends on the context of the system. If the call, for instance, is only made from code where you would expect a logged-in user, you should throw an exception. If however it is likely that no user is logged in and you therefore don't have a session ID to return, you should choose to return null.

Why don't you use the Null design pattern instead?

Keep things Simple. Make it as painless as possible for the consumer of your type. It's a tradeoff : time to implement and benefits gained.
When dealing with things like fetching lists/collections, return an empty list as you rightly pointed out.
Similarly for cases where you need to return a 'null' value to signify 'not-found' - try using the Null Object pattern for a heavily used type. For rarely used types, I guess you could live with a few checks for null or -1 (String.IndexOf).
If you have more categories, please post as comments and I'll attempt solutions.

Nulls are far better than magic values and make much more sense.
You could also try and write a TryGetUserSession method.
bool TryGetUserSession(int userId, out string sessionId)
Also try not to write null == ???, as some developers find it harder to read.
Kind regards,
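A rough sketch of what such a Try method could look like, assuming a hypothetical SessionLookup class backed by a dictionary (the names are illustrative). It follows the same shape as Dictionary.TryGetValue and int.TryParse:

```csharp
using System.Collections.Generic;

// Hypothetical session lookup following the Try pattern.
public class SessionLookup
{
    private readonly Dictionary<int, string> _sessions = new Dictionary<int, string>();

    public void Add(int userId, string sessionId)
    {
        _sessions[userId] = sessionId;
    }

    // Returns true and sets sessionId when a session exists;
    // returns false and sets sessionId to null otherwise.
    public bool TryGetUserSession(int userId, out string sessionId)
    {
        return _sessions.TryGetValue(userId, out sessionId);
    }
}
```

The caller is forced to look at the bool before using the out value, which makes the "not found" case hard to overlook.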

I'm wary of returning nulls myself, though in your example it certainly seems like the right thing to do.
What is important is being clear about nullability both for arguments and return values. This must be specified as part of your API documentation if your language can't express this concept directly. (Java can't, I've read that C# can.) If an input parameter or return value can be null, be sure to tell users what that means in the context of your API.
Our largest code base is a Java app that has grown steadily over the past decade. Many of the internal APIs are very unclear about their behavior WRT null. This leads to cancerous growth of null-checks all over the place because everyone is programming defensively.
Particularly ugly: functions which return collections returning null instead of an empty collection. WTF!? Don't do that!
In the case of Strings be sure to distinguish between Strings that must be there, but can be empty and strings that are truly optional (can be null). In working with XML we come across this distinction often: elements that are mandatory, but can be empty, versus elements that are optional.
One part of the system, which provides querying over XML-like structures for implementing business rules, is very radical about being null-free. It uses the Null Object Pattern everywhere. In this context, this turns out to be useful because it allows you to do things like:
A.children().first().matches("Blah")
The idea being that this expression should return true if the first child of A is named "Blah". It won't crash if A has no children because in that case first() returns a NullNode which implements the Node interface, but never 'matches' anything.
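Translated to C#, the idea might look roughly like this. It is a sketch only: INode, Node and NullNode are hypothetical names, and FirstChild() stands in for children().first() from the Java code base described above:

```csharp
using System.Collections.Generic;

public interface INode
{
    INode FirstChild();
    bool Matches(string name);
}

// The null object: implements the interface but never matches anything.
public sealed class NullNode : INode
{
    public static readonly NullNode Instance = new NullNode();

    private NullNode() { }

    public INode FirstChild() { return this; }
    public bool Matches(string name) { return false; }
}

public sealed class Node : INode
{
    private readonly string _name;
    private readonly List<INode> _children = new List<INode>();

    public Node(string name) { _name = name; }

    public Node Add(INode child)
    {
        _children.Add(child);
        return this;
    }

    // Returns the shared NullNode instead of null when there are no children.
    public INode FirstChild()
    {
        return _children.Count > 0 ? _children[0] : NullNode.Instance;
    }

    public bool Matches(string name) { return _name == name; }
}
```

A chain such as a.FirstChild().Matches("Blah") is then safe whether or not a has children.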
In summary:
Be clear about the meaning and permissibility of null in your APIs.
Use your brain.
Don't be dogmatic.

One solution might be to declare a simple return type for such things:
class Session
{
public bool Exists;
public string ID;
}
Session GetUserSession(int userID) {...}
...
Session session = GetUserSession(1);
if (session.Exists)
{
... use session.ID
}
I do generally like to avoid nulls, but strings are a bit of a special case. I often end up calling string.IsNullOrEmpty(). So much so that we use an extension method for string:
static class StringExtensions
{
public static bool IsNullOrEmpty(this string value)
{
return string.IsNullOrEmpty(value);
}
}
Then you can do:
string s = ... something
if (s.IsNullOrEmpty())
{
...
}
Also, since we're on the subject of coding style:
What's with the unnatural-looking "if (null == ...)" style? That's completely unnecessary for C# - it looks like someone's brought some C++ baggage into the C# coding styles!

I prefer returning empty collections instead of nulls, because that helps avoid cluttering the caller code like the following:
if( list != null) {
foreach( Item item in list ) {
...
}
}
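On the producing side, a sketch of a method that never returns null could look like this (Item and ItemRepository are hypothetical, and Array.Empty needs .NET Framework 4.6 or later):

```csharp
using System;
using System.Collections.Generic;

public class Item
{
    public string Name { get; set; }
}

// Hypothetical repository, for illustration only.
public class ItemRepository
{
    private readonly Dictionary<int, List<Item>> _itemsByOwner = new Dictionary<int, List<Item>>();

    public void Add(int ownerId, Item item)
    {
        List<Item> items;
        if (!_itemsByOwner.TryGetValue(ownerId, out items))
        {
            items = new List<Item>();
            _itemsByOwner[ownerId] = items;
        }
        items.Add(item);
    }

    // Never returns null: an unknown owner yields an empty list.
    public IReadOnlyList<Item> GetItems(int ownerId)
    {
        List<Item> items;
        return _itemsByOwner.TryGetValue(ownerId, out items)
            ? (IReadOnlyList<Item>)items
            : Array.Empty<Item>();
    }
}
```

The caller can then write foreach (Item item in repository.GetItems(ownerId)) without any null check.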

The design by contract answer you have would also be my solution:
if (myClass.SessionExists)
{
// Do something with myClass.Session
}

We always return empty lists / collections and do not return NULL lists / collections because handling empty collections/lists reduces the burden on the calling functions.
Nullable - nullables are particularly useful when you want to avoid handling the "null" values from the database. You can simply use GetValueOrDefault on any Nullable type without worrying about the database value being null in your business objects. We typically define Nullables ONLY in our business objects, and only where we really want to escape the db null value checks.
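For illustration, a small sketch of that usage. NullableColumns and its methods are hypothetical helpers; GetValueOrDefault and DBNull are the real framework members:

```csharp
using System;

public static class NullableColumns
{
    // Converts a raw ADO.NET value (possibly DBNull.Value) to int?.
    public static int? ToNullableInt(object raw)
    {
        return (raw == null || raw is DBNull) ? (int?)null : Convert.ToInt32(raw);
    }

    // A missing value becomes default(int), i.e. 0.
    public static int AgeOrZero(int? age)
    {
        return age.GetValueOrDefault();
    }

    // A missing value becomes the caller's fallback.
    public static int AgeOr(int? age, int fallback)
    {
        return age.GetValueOrDefault(fallback);
    }
}
```

Whether 0 is a sensible stand-in for "unknown" depends on the column; if it is not, keep the int? and let the caller decide.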

The inventor of nulls thinks that they are a bad idea..
see: http://lambda-the-ultimate.org/node/3186

I've had a similar issue, though I threw an exception and was told I should have returned null.
We ended up reading this blog post about vexing exceptions. I'd recommend having a read.

NULL means unknown/undetermined.
You mostly need it when working with a database where data is simply 'not determined (yet)'. In your application, NULL is a good return type for non-exceptional but not-default behavior.
You use NULL when the output is not the expected one, but the situation is not serious enough for an exception.

Related

Idiomatic optional return values in C#

Coming from the Swift programming language, I've grown accustomed to the paradigm of optional values. There are many cool aspects to this language feature, but one in particular that I am wondering if C# has an equivalent way of handling, which is nil function return values. Take, for example, a simple lookup function that returns a value for a given key. In Swift, the idiomatic way of handling a missing key would be to simply return nil. In effect, this prevents programmers from having to throw errors or perform error handling in the vast majority of cases, and it works out quite elegantly. Similarly, in Javascript, I might just return undefined.
From what I read on MSDN, it seems that the C# programmer would typically favor classical exceptions in such cases, and handle them in try/catch blocks. Is this generally the case? Would it be unusual, for example, to wrap integer return values in a Nullable object and let the function caller check whether the value is null? What would be the orthodox approach for these situations?
Exceptions are for exceptional scenarios, not for regular program flow. I guess we're on the same side there.
You would only need an additional return "type" like null when your method both checks for availability and fetches the resource. Take for example the code to get the contents of a file:
if (!File.Exists(...))
{
return false;
}
string contents = File.ReadAllText(...);
...
return true;
This requires no exceptions or null return values, and if the file is deleted between checking and opening the file, that could then be considered an exceptional scenario.
However, when trying to get an entity from the database, you'd want to combine the availability check and fetching the resource, so that only one query is required. There are a lot of programmers that do in fact return null, but since there is no C# law it's hard to keep track of which method does what.
I personally like the bool TryGet(int id, out Entity entity) style, which clearly indicates it does an availability check and fetches the resource if found. If you prefer returning null and want to be more explicit about what your function does, you could use Entity GetOrDefault(int id) as a naming convention. Some framework methods (e.g. in LINQ) are named that way.
In case of returning value types, returning a Nullable<> probably already states what your method is up to. To complete the circle, C# 7 will (as far as I know) add non-nullable reference types.
In the end it's up to you (and your team) what convention you prefer.
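For the value-type case, a sketch of a lookup that returns int? instead of throwing (Inventory and its contents are made up for the example):

```csharp
using System.Collections.Generic;

// Hypothetical lookup, for illustration only.
public static class Inventory
{
    private static readonly Dictionary<string, int> Stock = new Dictionary<string, int>
    {
        { "apple", 3 }
    };

    // Returns the stock count, or null when the item is unknown.
    public static int? FindCount(string item)
    {
        int count;
        return Stock.TryGetValue(item, out count) ? count : (int?)null;
    }
}
```

The caller can test HasValue, or collapse the missing case with the null-coalescing operator, as in int shown = Inventory.FindCount("pear") ?? 0;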
There is no option type in C# (yet), in F# there's an option type which you could abuse in C#.
See here for a detailed description:
Using F# option types in C#
or refer to C.Evenhuis post for the C# way to do it.

Trying to refactor to Null object pattern but the end result seems worse

I'm refactoring a big class that has a lot of checks for null all over the place into using the null object pattern. So far it's been an almost smooth change but I am having a couple of issues with the end result and I would like to know if there is a better or different approach or even to go back the way it was.
The biggest problem is that I have the following code:
IMyObject myObject = GetMyObject();
if(myObject != null && !myObject.BooleanProperty)
DoSomething();
So as you can see I could probably remove the null check from this condition but I have a boolean property which if set to the default value will execute a piece of code. If I always return true I might introduce subtle bugs that would be painful to find and eliminate.
Another problem is that I've had to modify a check from null like this:
if(myObject.GetType() != typeof(MyNullObject))
return false;
DoSomething();
This is just plain ugly since instead of just checking for null, now I have to check the type. This kind of situation happens three times in the class since I am not returning one of the object's properties or executing one of its methods I must do this check.
And lastly the object has a couple of DateTime properties which are not nullable and the architect does not want them to be nullable. Again by having the MinDate value as default some nasty bugs could crawl into the code.
So there you have it. Is this a case in which the null object pattern is just worse than the spaghetti null checks scattered all over? Is there a better way to accomplish this?
Thanks for your answers.
Would it be better to refactor your code so that DoSomething() is the method on Null Object and simply implemented as no-op? Another alternative to Null Object is Maybe<T>. It makes null checks a bit more readable and calling code safer.
The Null Object pattern is about a tradeoff - you gain with the elimination of null checks, but pay with another class to maintain.
I'd suggest adding an IsNull boolean property to your interface/base class.
The normal implementations return false; your null object returns true.
This allows you to avoid tests against exact types, and is clearer about your intent. You can also use this as a test in the code that deals with dates, allowing you to retain your non-nullable date properties.
You might add a boolean IsNull (IsEmpty) property in the interface IMyObject, then you implement MyNullObject that returns true for that property. Obviously you have to trust that in the other cases this should return false, otherwise you will have the wrong behaviour.
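A sketch of the IsNull suggestion, under the assumption that IMyObject has one boolean and one DateTime property (the question does not show the real interface, so the members and the Caller helper are invented for illustration):

```csharp
using System;

public interface IMyObject
{
    bool IsNull { get; }
    bool BooleanProperty { get; }
    DateTime Created { get; }
}

public sealed class MyObject : IMyObject
{
    public bool IsNull { get { return false; } }
    public bool BooleanProperty { get; set; }
    public DateTime Created { get; set; }
}

public sealed class MyNullObject : IMyObject
{
    public bool IsNull { get { return true; } }
    public bool BooleanProperty { get { return false; } }
    public DateTime Created { get { return DateTime.MinValue; } }
}

public static class Caller
{
    // Replaces "myObject != null && !myObject.BooleanProperty"
    // without testing against the concrete null-object type.
    public static bool ShouldDoSomething(IMyObject myObject)
    {
        return !myObject.IsNull && !myObject.BooleanProperty;
    }
}
```

The explicit IsNull test keeps the default BooleanProperty or DateTime.MinValue of the null object from being mistaken for real data.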
You should take a look at Code Contracts. This is an elegant way to enforce pre and post conditions on methods to avoid the sorts of issues you are referring to. It does this by raising exceptions early when the contract is broken at runtime, or in some cases through static analysis (Microsoft framework).
I prefer CuttingEdge.Conditions. It doesn't support static analysis, but it has a nice fluent interface and is more intuitive than the Microsoft equivalent.
It allows you to write code like this, and even extend the interface:
public class MyClass
{
public void MyMethod(string param1, int param2)
{
Condition.Requires(param1).IsNotNullOrWhiteSpace();
Condition.Requires(param2).IsGreaterThan(0);
...
}
}
You would implement conditions on all data input into the system, i.e. all public methods and prevent developers writing code that violates this. When it happens due to a bug, the exception and stack trace tells you exactly where the problem is.
There are also ways to set up conditions that monitor properties to ensure certain conditions never arise, through Invariants. This can act as an anti-corruption layer which again catches bugs, and stops developers writing code that breaks the system.

Why does trying to access a property of null cause an exception in some languages?

The thing that really bothers me the most about some programming languages (e.g. C#, Javascript) is that trying to access a property of null causes an error or exception to occur.
For example, in the following code snippet,
foo = bar.baz;
if bar is null, C# will throw a nasty NullReferenceException and my Javascript interpreter will complain with Unable to get value of the property 'baz': object is null or undefined.
I can understand this, in theory, but in real code I often have somewhat deep objects, like
foo.bar.baz.qux
and if any of foo, bar, or baz is null, my code breaks. :( Further, if I evaluate the following expressions in a console, there seem to be inconsistent results:
true.toString() //evaluates to "true"
false.toString() //evaluates to "false"
null.toString() //should evaluate to "null", but interpreter spits in your face instead
I absolutely despise writing code to handle this problem, because it is always verbose, smelly code. The following are not contrived examples, I grabbed these from one of my projects (the first is in Javascript, the second is in C#):
if (!(solvedPuzzles &&
solvedPuzzles[difficulty] &&
solvedPuzzles[difficulty][index])) {
return undefined;
}
return solvedPuzzles[difficulty][index].star
and
if (context != null &&
context.Request != null &&
context.Request.Cookies != null &&
context.Request.Cookies["SessionID"] != null)
{
SessionID = context.Request.Cookies["SessionID"].Value;
}
else
{
SessionID = null;
}
Things would be so much easier if the whole expression returned null if any one of the properties was null. The above code examples could have been so much simpler:
return solvedPuzzles[difficulty][index].star;
//Will return null if solvedPuzzles, difficulty, index, or star is null.
SessionID = context.Request.Cookies["SessionID"].Value;
//SessionID will be null if the context, Request, Cookies,
//Cookies["SessionID"], or Value is null.
Is there something I'm missing? Why don't these languages use this behavior instead? Is it hard to implement for some reason? Would it cause problems that I'm overlooking?
Would it cause problems that I'm overlooking?
Yes - it would cause the problem where you expect there to be non-null values, but due to a bug, you've got a null value. In that situation you want an exception. Failing silently and keeping going with bad data is a really bad idea.
Some languages (such as Groovy) provide a null-safe dereference operator which can help. In C# it might look something like:
SessionID = context?.Request?.Cookies?["SessionID"]?.Value;
I believe the C# team have considered this in the past and found it problematic, but that doesn't mean they won't revisit it in the future of course.
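C# 6 did later add the null-conditional operators ?. and ?[], so an expression of that shape now compiles. A minimal sketch, assuming hypothetical Context, Request, CookieJar and Cookie classes that stand in for the ASP.NET types (CookieJar's indexer returns null for a missing name):

```csharp
using System.Collections.Generic;

public class Cookie
{
    public string Value { get; set; }
}

// Hypothetical cookie collection whose indexer returns null for unknown names.
public class CookieJar
{
    private readonly Dictionary<string, Cookie> _items = new Dictionary<string, Cookie>();

    public Cookie this[string name]
    {
        get
        {
            Cookie cookie;
            return _items.TryGetValue(name, out cookie) ? cookie : null;
        }
        set { _items[name] = value; }
    }
}

public class Request
{
    public CookieJar Cookies { get; set; }
}

public class Context
{
    public Request Request { get; set; }
}

public static class SessionReader
{
    // Evaluates to null as soon as any link in the chain is null.
    public static string ReadSessionId(Context context)
    {
        return context?.Request?.Cookies?["SessionID"]?.Value;
    }
}
```

The short-circuit is opt-in per member access, so a plain context.Request still throws NullReferenceException where a null would indicate a bug.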
And then if the tail end of that was a call to a method that returned bool, then what?
Main.Child.Grandchild.IsEdit() and bool IsEdit();
I think it would be better to have a "null" instance that returns default behavior (this is known as the null object pattern).
That way, for others who expect a null to indicate a problem, they get their exception, but if you know a default object is acceptable, you can implement this and not worry about it. Both cases are then solved.
You could derive all "null" objects from INull, put them into a hash against their class name, then refer to that if a member is null. Then you can control default "null" implementation (what would be acceptable if the object is "null")
Also, "null" objects can throw exceptions if they are accessed, that way you know that you accessed one, and choose when this would be ok. So you can implement your own "null object exception".
If they put any language support for this, it would either be for everyone, across the board for the translation unit, or object by object with a qualifier (the last being the preferable option), at that point it wouldn't be default, and you could have already implemented a default "null" instance.
Might I suggest that if you have to check for that many nulls, you are structuring your code poorly. Obviously, as I do not have your code, I cannot say that conclusively. I would suggest that you split your code into smaller functions. This way, your code only has to check one or maybe two nulls at a time, and can have more control over what happens if at any point something is null.
In general foo.Baz means "on the instance of foo, execute the Baz member". This could be a dynamic language, a virtual member, or anything - but we've failed at the first step in the premise: there is no instance.
What you are describing is a null-safe member-access operator. Pretty rare, and frankly I disagree with your claim that this makes for smelly code (although I could argue that you are querying down too many levels in a single expression to be healthy).
Likewise, I disagree that null.toString() should return "null" - IMO it is perfectly normal for this to fail. Treating access on a null as "normal" is inherently wrong IMO.
Is there something I'm missing? Why don't these languages use this
behavior instead? Is it hard to implement for some reason? Would it
cause problems that I'm overlooking?
When a variable is null when that is not expected, the languages you mention choose to fail early. This is a central concept when creating correctly working programs.
The alternative would be to hide the null pointer problem, and the program would run one more line, but would probably crash later or, even worse, produce the wrong output!
Also, when trying to fix a bug, it's way easier to fix it if it throws a NullReferenceException right away and the debugger shows you the line to start looking at. The alternative would be vague symptoms - much harder to debug!
It's very common to program with null when you should not. In your examples above, what does null mean? The answer is that it doesn't mean anything. As a reader of the code I would assume the writer forgot something.
What is usually the best solution is to introduce a small, what I call, "null object" that can be returned instead of null. In your example perhaps a "NoDifficulty" or "UnknownDifficulty" class could be returned to yield the result you are after in the end - clean code.

Extension Methods - IsNull and IsNotNull, good or bad use?

I like readability.
So, I came up with an extension method a few minutes ago for the (x != null) type syntax, called IsNotNull. Conversely, I also created an IsNull extension method, thus
if(x == null) becomes if(x.IsNull())
and
if(x != null) becomes if(x.IsNotNull())
However, I'm worried I might be abusing extension methods. Do you think that this is a bad use of extension methods?
It doesn't seem any more readable and could confuse people reading the code, wondering if there's any logic they're unaware of in those methods.
I have used a PerformIfNotNull(Func method) (as well as an overload that takes an action) which I can pass a quick lambda expression to replace the whole if block, but if you're not doing anything other than checking for null it seems like it's not providing anything useful.
I don't find that incredibly useful, but this:
someString.IsNullOrBlank() // Tests if it is empty after Trimming, too
someString.SafeTrim() // Avoiding Exception if someString is null
is useful, because those methods actually save you from having to do multiple checks. Replacing a single check with a method call seems useless to me.
It is perfectly valid to do but I don't think it is incredibly useful. Since extension methods are simply compiler trickery I struggle to call any use of them "abuse" since they are just fluff anyhow. I only complain about extension methods when they hurt readability.
Instead I'd go with something like:
static class Check {
public static T NotNull<T>(T instance) {
... assert logic
return instance;
}
}
Then use it like this:
Check.NotNull(x).SomeMethod();
y = Check.NotNull(x);
Personally it's much clearer what is going on than to be clever and allow the following:
if( ((Object)null).IsNull() ) ...
I don't entirely agree with the reasoning saying "it may confuse".
To some extent I can see what is meant, that there is no reason to venture outside "common understanding" -- everybody understands object != null.
But in Visual Studio, we have wonderful tools where you can simply hover over the method, to reveal some additional information.
If we were to say that the extension-method was annotated with a good explanation, then I feel that the argument of confusion falls apart.
The methods .IsNotNull() and .IsNull() explain exactly what they are. I feel they are very reasonable and useful.
In all honesty it is a matter of "what you like". If you feel the methods will make it more readable in the context of your project, then go for it. If you are breaking convention in your project, then I would say the opposite.
I have had the same thoughts as you have on the subject and have asked several very very experienced developers at my place of work. And none of them have come up with a good reason (except what has been mentioned about -confusion- here) that would explain why you shouldn't do this.
Go for it :-)
There is precedent, in as much as the string class has IsNullOrEmpty
You're also introducing method call overhead for something that's a CLR intrinsic operation. The JIT might inline it away, but it might not. It's a micro-perf nitpick, to be sure, but I'd agree that it's not particularly useful. I do things like this when there's a significant readability improvement, or if I want some other behavior like "throw an ArgumentNullException and pass the arg name" that's dumb to do inline over and over again.
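A sketch of the kind of helper that does earn its keep, i.e. one that throws ArgumentNullException with the argument name. Guard and its signature are hypothetical:

```csharp
using System;

public static class Guard
{
    // Throws when value is null, otherwise returns it unchanged
    // so the call can be used inline in an assignment.
    public static T NotNull<T>(T value, string paramName) where T : class
    {
        if (value == null)
        {
            throw new ArgumentNullException(paramName);
        }
        return value;
    }
}
```

In a constructor this reads as _name = Guard.NotNull(name, "name"); which replaces the usual three-line if/throw block.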
It can make sense if you, for instance, assume that you might want to throw an exception whenever x is null (just do it in the extension method). However, my personal preference in this particular case is to check explicitly (a null object should be null :-) ).
To follow the pattern it should be a property rather than a method (but of course that doesn't work with extensions).
Data values in the System.Data namespace have an IsNull property that determines if the value contains a DBNull value.
The DataRow class has an IsNull method, but it doesn't determine if the DataRow is null; it determines if one of the fields in the data row contains a DBNull value.

How to indicate when purposely ignoring a return value

In some situations using C/C++, I can syntactically indicate to the compiler that a return value is purposely ignored:
int SomeOperation()
{
// Do the operation
return report_id;
}
int main()
{
// We execute the operation, but in this particular context we
// have no use of the report id returned.
(void)SomeOperation();
}
I find this to be a fair practice, firstly because most compilers won't generate a warning here, and secondly because it explicitly shows future developers that the author made a conscious choice to ignore the return value. It makes the author's train of thought unambiguous.
As far as I know, the C# compiler won't complain about implicitly ignored return values, but I would like to know if there's a similar convention to use in order to make a clear indication to other developers.
In response to some people here who question the actual use of this convention (or say that it would show bad design to have a method with a potentially unimportant return value):
A real life .NET example (which I maybe should have based the question on from the start) is the Mutex::WaitOne() overload which takes no arguments. It will only return if the mutex was safely acquired, otherwise it never returns. The boolean return value is for the other overloads where you might end up not being in possession of the mutex when it returns.
So along my reasoning, I would like to indicate in my multi-threaded code that I have made a choice to ignore the return:
Mutex mtx = new Mutex();
(void)mtx.WaitOne();
Since the return value never can be anything but true.
With C# 7.0 onward you can indicate purposely ignored return values with the discard operator '_'.
int SomeOperation()
{
return report_id;
}
int main()
{
_ = SomeOperation();
}
For more information you can have a look at the Microsoft docs here.
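The same idea as a compilable C# 7 sketch. ReportService, its call counter and the report id 42 are invented for the example; the second discard shows that _ also works for out arguments:

```csharp
public static class ReportService
{
    // Counts calls so the side effect of SomeOperation is observable.
    public static int Calls;

    public static int SomeOperation()
    {
        Calls++;
        return 42; // hypothetical report id
    }

    public static void Run()
    {
        // The operation runs; its return value is explicitly discarded.
        _ = SomeOperation();

        // Discards also work for out arguments we do not care about.
        int.TryParse("123", out _);
    }
}
```

Unlike a dummy variable, the discard allocates no named local and cannot trigger an unused-variable warning.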
If you want to indicate to other developers and make it crystal clear that the return value is intentionally ignored, just comment it.
SomeMethod(); // return value ignored - $REASON
I can only think of one situation, when a "return value" is not allowed to be ignored in C#: when an error occurred. This should be provided by throwing an exception, which makes it impossible to be ignored.
In other cases, it is (or better: must be) completely safe and not smelly at all to ignore return values.
I still can't see the point. Why should this improve the code? You specify to ignore the return value by purpose by not assigning it to a variable.
If you don't need this value in your code, everything is fine.
If you need it, you won't be able to write your code.
If there is a special case which must be handled and must never be implicitly ignored, an exception should be thrown.
If the called method did not have a return value and gets one later, it must be designed to not break existing code which ignores it. The existing calling code does not change.
Did I forget a case?
The Microsoft C# compiler doesn't generate a warning on ignoring returns. It doesn't need to since there is a garbage collector so there won't be any memory leakage because of ignoring returned objects (unless they are IDisposable of course). Hence, there's no need to explicitly "override" the compiler.
EDIT: Also, I believe "maintainability" issue is more like a documentation and naming practice issue. I understand that this was only an example, but you wouldn't expect a method called SomeOperation to return a ReportId. You would, however, expect a GetReportId method to return a ReportId without having a lot of side effects. Indeed, ignoring the return value of a method called GetReportId would be rather strange. So, make sure that you name your methods well and people won't have doubts about the effects of your function calls.
EDIT 2: In this example of mutexes, I believe that the right usage would be actually not ignoring the return value. Even if the current implementation will never return false, I think it's good practice to still check the return value, just in case you will end up using another implementation in the future or they change the behaviour in a future release of the .NET Framework or something:
if (mutex.WaitOne())
{
// Your code here
}
else
{
// Optionally, some error handling here
}
object dummy = JustDontCare();
No standard conventions I'm aware of.
But I'm struggling to find a good reason for needing this. It sounds like SomeOperation() should really be two separate methods. Have you got an example of a method which really should behave this way? Why should a method bother returning a result if it's going to be ignored?
Sometimes it's useful to be able to put in (void) to indicate to a future coder looking at the code that you know perfectly well it returns something, and you are deliberately ignoring it.
That said, the C# compiler will error on the syntax.
I've seen:
var notUsed = SomeOperation();
Not so fond of it though.
The convention in .Net is, if you don't store or use a return value that means you ignore it implicitly, there's no explicit convention, and the API is generally designed so return values can be generally ignored, with the exception of boolean values representing fail, success state.
But even in the case of Boolean return values representing success/fail status, the convention is that if you ignore the return value (don't use it) that means the code doesn't depend on the success status of previous call.
