In C#, should one check references passed to methods against null? - c#

Well, a few months ago I asked a similar question about C and C++, but I've been paying more attention to C# lately due to the whole "Windows Phone" thing.
So, in C#, should one bother to check against NULL at method boundaries? I think this is different than in C and C++, because in C# one generally can determine whether a given reference is valid -- the compiler will prevent one from passing uninitialized references anywhere, and therefore the only remaining possible mistake is for it to be null. Furthermore, there's a specific exception defined inside the .NET Framework for these things, the ArgumentNullException, which seems to codify what programmers think they should be getting when an invalid null was passed.
My personal opinion is once again that a caller doing this is broken, and that said caller should have NREs thrown at them until the end of days. However, I'm much less sure about this than I am in native code land -- C# has quite a different programming style in places compared to either C or C++ in this regard.
So... should you check for null parameters in C# methods?

Yes, check for them. Preferably use Code Contracts to tell the caller that you require non-null parameters
void Foo(Bar bar) {
Contract.Requires(bar != null);
}
This is particularly advantageous since the client can see exactly what is required of parameters.
If you can't use Code Contracts, use a guard clause
void Foo(Bar bar) {
Guard.Against<ArgumentNullException>(bar == null);
}
Fail fast.

I think it's better to throw an ArgumentNullException (or use a contract as Jason suggests) if your method doesn't want a particular parameter to be null. It's a much better indicator by the contract to the client not to pass null in the first place when calling your method.
Null-checking is then the client's responsibility which makes for more maintainable code on both sides (often a win-win situation)... at least that's what I feel.

It is a good practice to validate all the parameters at least in public methods. This helps to locate bugs faster and also helps while reading the code, as it describes method's contract.
We implemented a static class AssertUtilities that has a bunch of method for parameters and state validation. I believe some people call such classes Guard.
For example:
public static void ArgumentNotNull(object argument, string argumentName)
{
if (argument == null)
throw new ArgumentNullException(argumentName);
}
public static void ArgumentInRange(decimal argument, decimal minValue, decimal maxValue, string argumentName)
{
if (argument < minValue || argument > maxValue)
{
throw new ArgumentOutOfRangeException(
argumentName,
FormatUtilities.Format("Argument out of range: {0}. Must be between {1} and {2}.", argument, minValue, maxValue));
}
}
public static void ArgumentState(bool expression, string argumentName, string formatText, params object[] args)
{
if (!expression)
throw new ArgumentException(FormatUtilities.Format(formatText, args), argumentName);
}
public static void ArgumentHasText(string argument, string argumentName)
{
ArgumentNotNull(argument, argumentName);
ArgumentState(argument.Length > 0, argumentName, "Argument cannot be an empty string.");
}

So... should you check for null parameters in C# methods?
Yes, unless of course when null is permitted.
A better perspective: You should always check for valid parameter values. Sometimes a reference is allowed to be null, sometimes an int must be >= 0 or a string may not be NullOrWhiteSpace.
So it is a good practice to start every method with a parameter validation block.
From there on there are a few choices. Validation on internal/private methods is usually considered less critical, and for performance reasons it can be made conditional (or even left out).
Validation at a boundary is usually left on in Release builds. There is very little reason to ever turn it off.
Most projects will use some helper classes for validation, to minimize repetitive coding inside the methods. A very good but not yet very popular toolkit is the built-in System.Diagnostics.Contracts class. The Code-Contracts tools have many settings concerning parameter validation.

Related

C# nullable types - common pattern

Often when using nullable types I get following pattern inside my code (test for value, call a function):
Type? a;
if(a.HasValue)
{
SomeFunction(a.Value);
}
Is there some clever shorthand for this? Does this pattern have a name? Perhaps this question is misguided and I should be using nullables differently...
The null conditional can do this, but to make it work you'll want to make SomeFunction an extension method of Type
public static void SomeFunction(this Type? a)
{
// do stuff
}
then you can do
Type? a;
a?.SomeFunction()
In WPF/UWP, we often have to deal with adding change notification to every single property. This code is repeating and verbose. People did learn tricks to absract this. Often storing the actuall value in some kind of String Key collection. Wich then takes care of any change notification.
A simple way would (as others have said) be to use Expansion methods with the null-conditional operatiors. Wich of coruse might become quite tedious quickly.
Maybe you could make something similar to the TryParse pattern? It gives you your normal parsing results via a out parameter, and the return value tells if the parsing failed or not. Except in this case, you hand in the function to be called as a delegate and would get the return if it failed due to null values?
The Static and thus non-overrideable Equals Method, calls the instance (and thus overrideable) one after doing null checks. But in this case, null is a valid input for both values.
But in all cases you still need a way to tell that the operation failed due to a null. So you will not really get around checking something.
If SomeFunction takes Type you're need to pass a.Value, no shortcuts here.
BTW. If you have an aversion to a.HasValue then if(p != null) is another option.
void SomeFuntion(int i) {Console.WriteLine(i);}
int? p = null;
if(p != null) SomeFuntion(p.Value);

Does [Pure] have any implications other than "no visible side-effects" to Code Contracts?

The PureAttribute documentation says:
Indicates that a type or method is pure, that is, it does not make any visible state changes
Is this the only requirement of a Pure function under in Microsoft Code Contracts?
And; does this model assume that Exceptions are results (as opposed to side-effects)?
I ask because, in a more general context, a pure function also implies that the output is only dependent upon the input; ie. it cannot be the result of I/O or a stochastic function.
One might also argue that a pure function always yields a value to the outer expression, possibly in opposition to Exceptions.
If [Pure] is indeed limited to the less restrictive form, is there an equivalent of a "[FunctionalPure]"?
The static analyser assumes that calling the same pure function with the same arguments twice in a row produces the same result.
Given
[Pure]
public delegate int F(int i);
public class A
{
public void f(F f)
{
var i = f(1);
Contract.Assert(i == f(1));
}
}
a warning is generated: "Suggested assumption: Assumption can be proven: Consider changing it into an assert."
So for instance DateTime.Now must not be annoted with the Pure attribute.
As for exceptions, there doesn't seem to be anything disallowing them, nor for requiring them to be thrown consistently. In general, there cannot be. You can always get an OutOfMemoryException for pretty much any code, even a pure function with the same arguments that previously succeeded.

When to check for nulls

This is kinda an open ended question but I'm trying to sharpen my skills with good practices in exception handling specifically when to check for nulls in general.
I know mostly when to check for nulls but I have to say half the time I don't and that's bothering to me. I know ints cannot be set to null unless you use a nullable int. I know strings can be set to null OR empty hence you can check for IsNullOrEmpty.
Obviously in constructors you want to add your explicit checks there also. That's a given. And that I think you should check for null anytime you're passing in an array, generic object, or other objects into a method that can be set to null basically right?
But there is more here to exception handling. For instance I don't know when to always check and throw a null ref exception explicitely in code. If I've got incoming params, usually it's pretty straight forward but there are always cases where I ask myself, is an explicit null throw needed?
I don't really have specific examples but wondering if there's a good reference out there that really talks about exception handling in reference to when to throw them (inside methods, classes, you name it).
You shouldn't throw NullReferenceException. If it is an argument that is null throw ArgumentNullException instead.
I prefer to check for null for all reference types parameters of public/protected methods. For private methods, you may omit the check if you're sure all calls will always have valid data.
Unless you're using Code Contracts, I'd say it's good practice to check arguments for any public/protected member - as well as explicitly documenting whether they can or can't be null. For private/internal methods it becomes a matter of verifying your own code rather than someone else's... it's a judgement call at that point.
I sometimes use a helper extension method for this, so I can write:
public void Foo(string x, string y, string z)
{
x.ThrowIfNull("x");
y.ThrowIfNull("y");
// Don't check z - it's allowed to be null
// Method body here
}
ThrowIfNull will throw an ArgumentNullException with the appropriate name.
By checking explicitly before you make any other calls, you know that if an exception is thrown, nothing else will have happened so you won't have corrupted state.
I admit I'll omit such checks where I know for certain that the first call will do the same checks - e.g. if it's one overload piggybacking onto another.
With Code Contracts, I'd write:
public void Foo(string x, string y, string z)
{
Contract.Requires(x != null);
Contract.Requires(y != null);
// Rest of method here
}
This will throw a ContractException instead of ArgumentNullException, but all the information is still present and no-one should be catching ArgumentNullException explicitly anyway (other than possibly to cope with third party badness).
Of course the decision of whether you should allow null values is an entirely different matter. That depends entirely on the situation. Preventing nulls from ever entering your world does make some things easier, but at the same time nulls can be useful in their own right.
My rules of thumb for when to check for nulls are:
Always check the arguments passed to a public/protected method of any class.
Always check the arguments to constructors and initialize methods.
Always check the arguments in an indexer or property setter.
Always check the arguments to methods that are interface implementations.
With multiple overloads, try to put all argument preconditions in a single method that other overloads delegate to.
Early notification of null arguments (closer to the point of error) are much better than random NullReferenceExceptions that occur far from the point of the error.
You can use utility methods to clean up the typical if( arg == null ) throw new ArgumentNullException(...); constructs. You may also want to look into code contracts in C# 4.0 as a way to improve your code. Code contracts perform both static and runtime checks which can help identify problems even at compile time.
In general, writing preconditions for methods is a time consuming (and therefore often omitted) practice. However, the benefit to this type of defensive coding is in the hours of potentially saved time debugging red herrings that occur because you didn't validate your inputs.
As a side note, ReSharper is a good tool for identifying potential access to arguments or local members that may be null at runtime.
The earlier the better. The sooner you catch the (invalid) null, the less likely you'll be throwing out in the middle of some critical process. One other thing to consider is using your properties (setters) to centralize your validation. Often I reference my properties in the constructor, so I get good reuse on that validation.
class A
{
private string _Name
public string Name
{
get { return _Name; }
set
{
if (value == null)
throw new ArgumentNullException("Name");
_Name = value;
}
}
public A(string name)
{
//Note the use of property with built in validation
Name = name;
}
}
Depends on your type of method/api/library/framework. I think its ok for private methods that they dont check for null values unless they are kind of assertion methods.
If you programm defensive and litter your code with "not null"-tests or "if"-statements your code will be unreadable.
Define the areas where a client can mess up your code by submitting wrong/empty/null arguments. If you know where, you can draw the line where to do your null checks. Do it at your api entry points just like the security works in real life.
Imagine you would be bothered every minute to show your ticket in a cinema every movie would suck.
I prefer using static template methods for checking input constraints. the general format for this would be something like:
static class Check {
public static T NotNull(T arg) {
if( arg == null ) throw new ArgumentNullException();
}
}
The benefit to using this is that your method or constructor can now wrap the first use of the argument with Check.NotNull() as exampled here:
this.instanceMember = Check.NotNull(myArgument);
When I get to .Net 4, I'll likely convert to code contracts, but until then this works. BTW, You can find a more complete definition of the Check class I use at the following url:
http://csharptest.net/browse/src/Shared/Check.cs

When should I use out parameters?

I don't understand when an output parameter should be used, I personally wrap the result in a new type if I need to return more than one type, I find that a lot easier to work with than out.
I have seen method like this,
public void Do(int arg1, int arg2, out int result)
are there any cases where that actually makes sense?
how about TryParse, why not return a ParseResult type? or in the newer framework return a null-able type?
Out is good when you have a TryNNN function and it's clear that the out-parameter will always be set even if the function does not succeed. This allows you rely on the fact that the local variable you declare will be set rather than having to place checks later in your code against null. (A comment below indicates that the parameter could be set to null, so you may want to verify the documentation for the function you're calling to be sure if this is the case or not.) It makes the code a little clearer and easier to read. Another case is when you need to return some data and a status on the condition of the method like:
public bool DoSomething(int arg1, out string result);
In this case the return can indicate if the function succeeded and the result is stored in the out parameter. Admittedly, this example is contrived because you can design a way where the function simply returns a string, but you get the idea.
A disadvantage is that you have to declare a local variable to use them:
string result;
if (DoSomething(5, out result))
UpdateWithResult(result);
Instead of:
UpdateWithResult(DoSomething(5));
However, that may not even be a disadvantage, it depends on the design you're going for. In the case of DateTime, both means (Parse and TryParse) are provided.
I think out is useful for situations where you need to return both a boolean and a value, like TryParse, but it would be nice if the compiler would allow something like this:
bool isValid = int.TryParse("100", out int result = 0);
Well as with most things it depends.
Let us look at the options
you could return whatever you want as the return value of the function
if you want to return multiple values or the function already has a return value, you can either use out params or create a new composite type that exposes all these values as properties
In the case of TryParse, using an out param is efficient - you dont have to create a new type which would be 16B of overhead (on 32b machines) or incur the perf cost of having them garbage collected post the call. TryParse could be called from within a loop for instance - so out params rule here.
For functions that would not be called within a loop (i.e. performance is not a major concern), returning a single composite object might be 'cleaner' (subjective to the beholder). Now with anonymous types and Dynamic typing , it might become even easier.
Note:
out params have some rules that need to be followed i.e. the compiler will ensure that the function does initialize the value before it exits. So TryParse has to set the out param to some value even if parse operation failed
The TryXXX pattern is a good example of when to use out params - Int32.TryParse was introduced coz people complained of the perf hit of catching exceptions to know if parse failed. Also the most likely thing you'd do in case parse succeeded, is to obtain the parsed value - using an out param means you do not have to make another method call to Parse
Years late with an answer, I know.
out (and ref as well) is also really useful if you do not wish your method do instantiate a new object to return. This is very relevant in high-performance systems where you want to achieve sub microsecond performance for your method. instantiating is relatively expensive seen from a memory access perspective.
Definitely, out parameters are intended to be used when you have a method that needs to return more than one value, in the example you posted:
public void Do(int arg1, int arg2, out int result)
It doesn't makes much sense to use an out parameter, since you are only returning one value, and that method could be used better if you remove the out parameter and put a int return value:
public int Do(int arg1, int arg2)
There are some good things about out parameters:
Output parameters are initially considered unassigned.
Every out parameter must be definitely assigned before the method returns, your code will not compile if you miss an assignment.
In conclusion, I basically try use out params in my private API to avoid creating separate types to wrap multiple return values, and on my public API, I only use them on methods that match with the TryParse pattern.
Yes, it does make sense. Take this for example.
String strNum = "-1";
Int32 outNum;
if (Int32.TryParse(strNum, out outNum)) {
// success
}
else {
// fail
}
What could you return if the operation failed in a normal function with a return value? You most certainly could not return -1 to represent a fail, because then there would be no differentiation between the fail-return value and the actual value that was being parsed to begin with. This is why we return a Boolean value to see if it succeeded, and if it did then we have our "return" value safely assigned already.
Creating a type just for returning values sounds little painful to me :-)
First i will have to create a type for returning the value then in the calling method i have assign the value from the returned type to the actual variable that needs it.
Out parameters are simipler to use.
It does annoy me that I can't pass in null to the out parameter for the TryParse functions.
Still, I prefer it in some cases to returning a new type with two pieces of data. Especially when they're unrelated for the most part or one piece is only needed for a single operation a moment after. When I do need to save the resulting value of a TryParse function I really like having an out parameter rather than some random ResultAndValue class that I have to deal with.
If you always create a type, then you can end up with a lot of clutter in your application.
As said here, one typical use case is a TrySomething Method where you want to return a bool as an indicator for success and then the actual value. I also find that a little bit cleaner in an if-statement - all three options roughly have the same LOC anyway.
int myoutvalue;
if(int.TryParse("213",out myoutvalue){
DoSomethingWith(myoutvalue);
}
vs.
ParseResult<int> myoutvalue = int.TryParse("213");
if ( myoutvalue.Success ) {
DoSomethingWith(myoutvalue.Value);
}
vs.
int? myoutvalue = int.TryParse("213");
if(myoutvalue.HasValue){
DoSomethingWith(myoutvalue.Value);
}
As for the "Why not return a Nullable Type": TryParse exists since Framework 1.x, whereas Nullable Types came with 2.0 (As they require Generics). So why unneccessarily break compatibility or start introducing inconsistencies between TryParse on some types? You can always write your own extension Method to duplicate functionality already existing (See Eric Lipperts Post on an unrelated subject that includes some reasoning behind doing/not doing stuff)
Another use case is if you have to return multiple unrelated values, even though if you do that that should trigger an alarm that your method is possibly doing too much. On the other hand, if your Method is something like an expensive database or web service call and you want to cache the result, it may make sense to do that. Sure, you could create a type, but again, that means one more type in your application.
I use out parameters sometimes for readability, when reading the method name is more important than whatever the output of the method is—particularly for methods that execute commands in addition to returning results.
StatusInfo a, b, c;
Initialize(out a);
Validate(a, out b);
Process(b, out c);
vs.
StatusInfo a = Initialize();
StatusInfo b = Validate(a);
StatusInfo c = Process(b);
At least for me, I put a lot of emphasis on the first few characters of each line when I'm scanning. I can easily tell what's going on in the first example after acknowledging that some "StatusInfo" variables are declared. In the second example, the first thing I see is that a bunch of StatusInfo is retrieved. I have to scan a second time to see what kind of effects the methods may have.

What idiom (if any) do you prefer for naming the "this" parameter to extension methods in C#, and why?

The first parameter to a C# extension method is the instance that the extension method was called on. I have adopted an idiom, without seeing it elsewhere, of calling that variable "self". I would not be surprised at all if others are using that as well. Here's an example:
public static void Print(this string self)
{
if(self != null) Console.WriteLine(self);
}
However, I'm starting to see others name that parameter "#this", as follows:
public static void Print(this string #this)
{
if(#this != null) Console.WriteLine(#this);
}
And as a 3rd option, some prefer no idiom at all, saying that "self" and "#this" don't give any information. I think we all agree that sometimes there is a clear, meaningful name for the parameter, specific to its purpose, which is better than "self" or "#this". Some go further and say you can always come up with a more valuable name. So this is another valid point of view.
What other idioms have you seen? What idiom do you prefer, and why?
I name it fairly normally, based on the use. So "source" for the source sequence of a LINQ operator, or "argument"/"parameter" for an extension doing parameter/argument checking, etc.
I don't think it has to be particularly related to "this" or "self" - that doesn't give any extra information about the meaning of the parameter. Surely that's the most important thing.
EDIT: Even in the case where there's not a lot of obvious meaning, I'd prefer some meaning to none. What information is conferred by "self" or "#this"? Merely that it's the first parameter in an extension method - and that information is already obvious by the fact that the parameter is decorated with this. In the example case where theStringToPrint/self option is given, I'd use outputText instead - it conveys everything you need to know about the parameter, IMO.
I name the variable exactly how I would name it if it were a plain old static method. The reason being that it can still be called as a static method and you must consider that use case in your code.
The easiest way to look at this is argument validation. Consider the case where null is passed into your method. You should be doing argument checking and throwing an ArgumentNullException. If it's implemented properly you'll need to put "this" as the argument name like so.
public static void Print(this string #this) {
if ( null == #this ) {
throw new ArgumentNullException("this");
}
...
}
Now someone is coding against your library and suddenly gets an exception dialog which says "this is null". They will be most confused :)
This is a bit of a contrived example, but in general I treat extension methods no different that a plain old static method. I find it makes them easier to reason about.
I have seen obj and val used. I do not like #this. We should try to avoid using keywords. I have never seen self but I like it.
I call it 'target', since the extension method will operate on that parameter.
I believe #this should be avoided as it makes use of the most useless language-specific feature ever seen (#). In fact, anything that can cause confusion or decrease readability such as keywords appearing where they are not keywords should be avoided.
self reminds me of python but could be good for a consistent naming convention as it's clear that it's referring to the instance in use while not requiring some nasty syntactic trickery.
You could do something like this...
public static void Print(this string extended)
{
if(extended != null) Console.WriteLine(extended);
}

Categories