Coming from the Swift programming language, I've grown accustomed to the paradigm of optional values. There are many cool aspects to this language feature, but there is one in particular that I wonder whether C# has an equivalent for: nil function return values. Take, for example, a simple lookup function that returns a value for a given key. In Swift, the idiomatic way of handling a missing key would be to simply return nil. In effect, this prevents programmers from having to throw errors or perform error handling in the vast majority of cases, and it works out quite elegantly. Similarly, in JavaScript, I might just return undefined.
From what I read on MSDN, it seems that the C# programmer would typically favor classical exceptions in such cases, and handle them in try/catch blocks. Is this generally the case? Would it be unusual, for example, to wrap integer return values in a Nullable object and let the function caller check whether the value is null? What would be the orthodox approach for these situations?
Exceptions are for exceptional scenarios, not for regular program flow. I guess we're on the same side there.
You would only need an additional return "type" like null when your method both checks for availability and fetches the resource. Take for example the code to get the contents of a file:
if (!File.Exists(...))
{
return false;
}
string contents = File.ReadAllText(...);
...
return true;
This requires no exceptions or null return values, and if the file is deleted between checking and opening the file, that could then be considered an exceptional scenario.
However, when trying to get an entity from the database, you'd want to combine the availability check and fetching the resource, so that only one query is required. There are a lot of programmers who do in fact return null, but since there is no C# law it's hard to keep track of which method does what.
I personally like the bool TryGet(int id, out Entity entity) style, which clearly indicates it does an availability check and fetches the resource if found. If you prefer returning null and want to be more explicit about what your function does, you could use Entity GetOrDefault(int id) as a naming convention. Some framework methods (e.g. in LINQ) are named that way.
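A minimal sketch of the two conventions side by side may help. Entity, EntityStore and the in-memory dictionary are hypothetical stand-ins for a real database lookup, not anything from the framework:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity type, used only for illustration.
public sealed class Entity
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical store; a dictionary stands in for the database.
public sealed class EntityStore
{
    private readonly Dictionary<int, Entity> _entities = new Dictionary<int, Entity>();

    public void Add(Entity entity) { _entities[entity.Id] = entity; }

    // Try-pattern: the bool reports availability, the out parameter carries the result.
    public bool TryGet(int id, out Entity entity)
    {
        return _entities.TryGetValue(id, out entity);
    }

    // OrDefault naming: the name itself warns the caller that null may come back.
    public Entity GetOrDefault(int id)
    {
        Entity entity;
        return _entities.TryGetValue(id, out entity) ? entity : null;
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new EntityStore();
        store.Add(new Entity { Id = 1, Name = "first" });

        Entity found;
        if (store.TryGet(1, out found))
        {
            Console.WriteLine(found.Name); // prints "first"
        }

        Console.WriteLine(store.GetOrDefault(2) == null); // prints "True"
    }
}
```

Either way, a single lookup answers both "is it there?" and "what is it?", which is the point of combining the check and the fetch.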
In case of returning value types, returning a Nullable<> probably already states what your method is up to. To complete the circle, non-nullable reference types were (as far as I know) planned for C# 7; they eventually shipped in C# 8 as nullable reference types.
In the end it's up to you (and your team) what convention you prefer.
There is no option type in C# (yet). F# has an option type, which you could abuse in C#.
See here for a detailed description:
Using F# option types in C#
or refer to C.Evenhuis's post for the C# way to do it.
Related
As far as I know, int.TryParse(string, out int) has existed since Framework 2.0. So has int?.
Is there a reason to use an out parameter instead of returning an int? with HasValue set to true or false depending on the ability to convert?
The simple reason is that when int.TryParse was added to the framework, Nullable<T> didn't exist.
In this blog post by Eric Lippert, there's a line towards the bottom that reads:
The solution is to write your own extension method version of TryParse the way it would have been written had there been nullable value types available in the first place
which makes it clear that nullable types were not available to be used in the original implementation of TryParse. Eric Lippert was on the team that wrote the C# compiler, so I'd say that's a pretty authoritative source.
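As a rough sketch of what such an extension method could look like (the name ParseOrNull is made up here and is not taken from the blog post):

```csharp
using System;

public static class ParseExtensions
{
    // Hypothetical helper: wraps int.TryParse and returns null on failure
    // instead of reporting success through a bool and an out parameter.
    public static int? ParseOrNull(this string text)
    {
        int value;
        return int.TryParse(text, out value) ? value : (int?)null;
    }
}

public static class Program
{
    public static void Main()
    {
        int? parsed = "42".ParseOrNull();
        Console.WriteLine(parsed.HasValue);            // prints "True"
        Console.WriteLine("abc".ParseOrNull() ?? -1);  // prints "-1"
    }
}
```

The null-coalescing operator (??) then gives the caller a one-line way to supply a fallback value.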
I cannot tell about the actual reasons, but I see three possible reasons:
1) Nullable types were introduced in .NET 2.0, while the first TryParse methods were already around since .NET 1.1. Thus, when nullable types were introduced, it was too late for such an API change; and new classes wouldn't implement TryParse differently because the pattern had already been set.
2) Not all types can be used with the Nullable structure, only value types can. However, there are methods following the Try* pattern that have to return reference types. For example, a dictionary may totally legitimately contain null as an item, hence its TryGetValue method needs an additional way to express that a key was not found.
3) The way the Try*-methods are written, it is possible to write code like this:
int myValue;
if (int.TryParse("42", out myValue)) {
// do something with myValue
}
// do something else
Now, imagine if TryParse only returned an int?. You can either dispose of the myValue variable and lose the result:
if (int.TryParse("42").HasValue) {
// do something with ... what? You didn't store the conversion result!
}
// do something else
Or you can add a nullable variable:
int? myValue = int.TryParse("42");
if (myValue.HasValue) {
// do something with myValue.Value
}
// do something else
This isn't an advantage over the current version any more, and instead it requires writing myValue.Value at some later instances, where otherwise a simple value would have sufficed. Note that in many cases, you only need the information about whether the operation was successful for the if statement.
Here's a quote from Julie Lerman's blog (back from 2004):
I have played with nullable in the March preview bits, but not yet in the May and disappointed with the current (but slated for serious improvement by the bcl team!!!) performance when I compared the using nullable<t> over current options. So for example with value types:
comparing myNullableInt.HasValue to (in VB) is myInt < 0
or with reference types
comparing myNullableThing.HasValue to “if not myThing=null”
the nullable type is currently much much slower. I have been promised by a few on the BCL team that the plan is to make the nullable MUCH more performant.
I have also been given the hint that in the future, the following will be possible:
Nullable<T> Parse(string value);
Nullable<Int32> i = Int32.Parse( some String );
And will be more performant than TryParse. So that, too will be interesting.
I assume that as always, the benefit outweighs the cost.
Anyway, in the upcoming C# vNext, you can do:
DateTime.TryParse(s, out var parsedDateTime);
This turns TryParse into a one-liner.
One other possible reason:
Generics for .NET and C# in their current form almost didn't happen: it was a very close call, and the feature almost didn't make the cut for Whidbey (Visual Studio 2005). Features such as running CLR code on the database were given higher priority.
...
Ultimately, an erasure model of generics would have been adopted, as for Java, since the CLR team would never have pursued a in-the-VM generics design without external help.
source: http://blogs.msdn.com/b/dsyme/archive/2011/03/15/net-c-generics-history-some-photos-from-feb-1999.aspx
My point being: the majority of changes in the BCL (or at least those not directly related to generics) probably needed to work both with and without generics, in case that feature was cut in the final RTM.
Of course, this also makes sense from a calling client perspective: all the consuming languages (ok, there weren't as many back then) would ideally have been able to use them - and out parameters weren't as cutting-edge as generics.
As to reasons we can only guess, but some possible reasons are:
Assignment overhead: a value wrapped in Nullable<T> incurs some (small) performance overhead over a plain built-in type.
No real gains:
int res;
if (int.TryParse("one", out res)) {
//something
}
isn't much worse than
int? res = int.TryParse("one");
if (res.HasValue){
int realres = res.Value;
//something
}
I just came across this today, if you convert null to int32
Convert.ToInt32(null)
it returns 0
I was expecting an InvalidCastException...
Any idea why this happens?
Because that's the documented behaviour? Whether it's Convert.ToInt32(object) or Convert.ToInt32(string), the documentation states quite clearly:
(Under return value)
A 32-bit signed integer that is equivalent to the number in value, or 0 (zero) if value is null.
or
A 32-bit signed integer equivalent to value, or zero if value is null.
As always, if reality doesn't match expectations, the first thing you should do is check whether your expectations match the documented behaviour.
Personally I don't fully buy the "compatibility with VB6" argument shown by Gavin. I realize it comes from Microsoft, and it may well be the genuine reason why it behaves that way - but I don't think it's a good reason to behave that way. There are plenty of VB-specific conversion methods - so if the framework designers genuinely thought that returning zero was a non-ideal result, they should have done whatever they thought best, and provided a VB6 compatible conversion for use by VB6 programmers.
Obviously once the behavior was defined in .NET 1.0, it couldn't be changed for later versions - but that's not the same as saying it had to behave the same way as VB6.
See http://msdn.microsoft.com/en-us/library/sf1aw27b.aspx
Edit
The URL above automatically reverts to the latest Framework version, whereas the text below was specifically posted on version 4. See the revised URL below, which shows the text.
http://msdn.microsoft.com/en-us/library/sf1aw27b(v=vs.100).aspx
It explains:
All of the string-to-numeric conversion methods in the Convert class return zero if the string is null. The original motivation for this behavior was that they would provide a set of conversion methods for programmers migrating from Visual Basic 6 to Visual Basic .NET that mirrored the behavior of the existing Visual Basic 6 conversion methods. The assumption was that C# programmers would be more comfortable with casting operators, whereas Visual Basic had traditionally used conversion methods for type conversion.
Traditionally, the .NET Framework has tried to maintain a high degree of compatibility from version to version. Effectively, this means that, absent an extremely compelling reason, once a method has been implemented in a particular way and that implementation is publicly exposed (as in a method returning 0 if the string parameter is null), it cannot be changed, since that would break code that depends on the established behavior. This makes both of your proposed solutions very problemmatic. In the first case, throwing an exception changes the implementation of a method for customers who are likely to depend on the method returning zero for a null string. In the second case, it is important to remember that the .NET Framework does not consider return type in overload resolution. This means that your method would have to replace the existing Convert.ToInt32(String value) method, and that all code that does not expect to handle a nullable type would now be broken.
This concern for compatibility is even stronger in the case of the string-to-numeric conversion methods in the Convert class, since Parse is the recommended method for performing string-to-numeric conversion for each of the primitive numeric types supported by the .NET Framework, and each Parse method behaves differently that its corresponding Convert method. Unlike the string-to-numeric conversion method in the Convert class, which return zero if the string to be converted is null, each Parse method throws an ArgumentNullException, which is the behavior that you are arguing for. The overloads of the numeric Parse methods, such as Int32.Parse and Double.Parse, also have the advantage of allowing much finer-grained control over the parsing operation.
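The difference described in the quoted text is easy to see side by side. This is just a small demonstration of the three standard calls on a null string; the variable name missing is only for illustration:

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        string missing = null;

        // Convert treats a null string as zero.
        Console.WriteLine(Convert.ToInt32(missing)); // prints "0"

        // Parse rejects null outright.
        try
        {
            int.Parse(missing);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("Parse threw ArgumentNullException");
        }

        // TryParse reports failure through its return value without throwing.
        int value;
        Console.WriteLine(int.TryParse(missing, out value)); // prints "False"
    }
}
```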
Because the default value of an Int32 is zero. Int32's cannot be null as they are value types, not reference types, so you get the default value instead.
Because that's what it is documented to return. Perhaps you were thinking of (int)(object)null, which would throw a NullReferenceException (not InvalidCastException; I'm not sure why).
Because this is how the method is written in the Convert class. If the parameter value is null, it simply returns 0.
public static int ToInt32(object value)
{
if (value == null)
{
return 0;
}
else
{
return ((IConvertible)value).ToInt32(null);
}
}
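To get an InvalidCastException you have to make an unchecked cast on a value of the wrong type. Note that int i = (int)null; does not compile at all, because null cannot be converted to a non-nullable value type.
For example:
object o = "42";
int i = (int)o;
If you execute it, the exception is raised.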
The use of
Convert.ToInt32(var)
is useful when you distrust the value in var, as when reading from a database.
Imagine someone coding the following:
string s = "SomeString";
s.ToUpper();
We all know that in the example above, the call to the “ToUpper()” method is meaningless because the returned string is not handled at all. Yet many people make that mistake and spend time troubleshooting the problem by asking themselves “Why aren’t the characters in my ‘s’ variable capitalized?”
So wouldn’t it be great if there was an attribute that could be applied to the “ToUpper()” method that would yield a compiler error if the return object is not handled? Something like the following:
[MustHandleReturnValueAttribute]
public string ToUpper()
{
…
}
In order for this code to compile correctly, the user would have to handle the return value like this:
string s = "SomeString";
string uppers = s.ToUpper();
I think this would make it crystal clear that you must handle the return value; otherwise there is no point in calling that function.
In the case of the string example this may not be a big deal but I can think of other more valid reasons why this would come in handy.
What do you guys think?
Thanks.
Does one call a method for its side-effects, for its return value, or for both? "Pure" functions (which have no effects and only serve to compute a return value) would be good to annotate as such, both to eliminate the type of error you describe, as well as to enable some potential optimizations/analyses. Maybe in another 5 years we'll see this happen.
(Note that the F# compiler will warn any time you implicitly ignore a return value. The 'ignore' function can be used when you want to explicitly ignore it.)
If you have ReSharper, it will highlight things like this for you. I can't recommend ReSharper highly enough; it has lots of useful IDE additions, especially around refactoring.
http://www.jetbrains.com/resharper/
I am not sure that I like this. I have called many a method that returns a value that I choose not to capture. Adding some type of default (the compiler generates a warning when a return value is not handled) just seems wrong to me.
I do agree that something along the suggested lines might help out new programmers, but adding an attribute at this point in the game will only affect a very small number of methods relative to the large existing body. That same junior programmer will never get their head around the issue when most of their unhandled return values are not flagged by the compiler.
Might have been nice way back when but the horses are out of the barn.
I'd actually prefer a way to flag a struct or class as [Immutable], and have this handled automatically (with a warning) for methods called without using their return values on immutable objects. This could also protect the object by the compiler from changes after creation.
If the object is truly an immutable object, there really would be no way to handle it. It also could potentially be used by compilers to catch other common mistakes.
Tagging the method itself seems less useful to me, though. I agree with most of the other comments regarding that. If the object is mutable, calling a method could have other side-effects, so the above code could be perfectly valid.
I'm having flashbacks to putting (void) casts on all printf() calls because I couldn't get Lint to shut up about the return value being ignored.
That said, it seems like this functionality should be in some code checker tool rather than the compiler itself.
At least a compiler warning would be helpful. Perhaps they will add something similar for C# 4.0 (Design-By-Contract).
This doesn't warrant a warning or pragma. There are too many places where the result is intentionally discarded, and I'd be quite annoyed getting a warning/error from the compiler just because the method was decorated with some dodgy attribute.
This kind of 'warning' should be annotated in the IDE's Editor, like a small icon on the gutter "Warning: Discarding return value" or similar.
In some situations using C/C++, I can syntactically indicate to the compiler that a return value is purposely ignored:
int SomeOperation()
{
// Do the operation
return report_id;
}
int main()
{
// We execute the operation, but in this particular context we
// have no use of the report id returned.
(void)SomeOperation();
}
I find this to be a fair practice, firstly because most compilers won't generate a warning here, and secondly because it explicitly shows future developers that the author made a conscious choice to ignore the return value. It makes the author's train of thought unambiguous.
As far as I know, the C# compiler won't complain about implicitly ignored return values, but I would like to know if there's a similar convention to use in order to make a clear indication to other developers.
In response to some people here who question the actual use of this convention (or suggest that a method with a potentially unimportant return value shows bad design):
A real life .NET example (which I maybe should have based the question on from the start) is the Mutex::WaitOne() overload which takes no arguments. It will only return if the mutex was safely acquired, otherwise it never returns. The boolean return value is for the other overloads where you might end up not being in possession of the mutex when it returns.
So along my reasoning, I would like to indicate in my multi-threaded code that I have made a choice to ignore the return:
Mutex mtx = new Mutex();
(void)mtx.WaitOne();
Since the return value never can be anything but true.
With C# 7.0 onward you can mark purposely ignored return values with the discard '_'.
int SomeOperation()
{
return report_id;
}
int main()
{
_ = SomeOperation();
}
For more information you can have a look at the Microsoft docs here.
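For a version that compiles as C#, here is a small sketch covering both the generic case and the Mutex example from the question. SomeOperation and its return value are placeholders:

```csharp
using System;
using System.Threading;

public static class Program
{
    // Hypothetical operation whose return value this caller does not need.
    public static int SomeOperation()
    {
        return 42;
    }

    public static void Main()
    {
        // Discard: the call is evaluated and the result is explicitly thrown away.
        _ = SomeOperation();

        using (var mtx = new Mutex())
        {
            // The parameterless WaitOne() blocks until the mutex is acquired,
            // so its bool result carries no extra information here.
            _ = mtx.WaitOne();
            try
            {
                Console.WriteLine("in critical section");
            }
            finally
            {
                mtx.ReleaseMutex();
            }
        }
    }
}
```

The discard reads much like the (void) cast in C/C++: it changes nothing at runtime, but it records that ignoring the value was a deliberate choice.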
If you want to indicate to other developers and make it crystal clear that the return value is intentionally ignored, just comment it.
SomeMethod(); // return value ignored - $REASON
I can only think of one situation, when a "return value" is not allowed to be ignored in C#: when an error occurred. This should be provided by throwing an exception, which makes it impossible to be ignored.
In other cases, it is (or better: must be) completely safe and not smelly at all to ignore return values.
I still can't see the point. Why should this improve the code? You already signal that you are ignoring the return value on purpose by not assigning it to a variable.
If you don't need this value in your code, everything is fine.
If you need it, you won't be able to write your code.
If there is a special case which must be handled and must never be implicitly ignored, an exception should be thrown.
If the called method did not have a return value and gets one later, it must be designed to not break existing code which ignores it. The existing calling code does not change.
Did I forget a case?
The Microsoft C# compiler doesn't generate a warning on ignoring returns. It doesn't need to since there is a garbage collector so there won't be any memory leakage because of ignoring returned objects (unless they are IDisposable of course). Hence, there's no need to explicitly "override" the compiler.
EDIT: Also, I believe "maintainability" issue is more like a documentation and naming practice issue. I understand that this was only an example, but you wouldn't expect a method called SomeOperation to return a ReportId. You would, however, expect a GetReportId method to return a ReportId without having a lot of side effects. Indeed, ignoring the return value of a method called GetReportId would be rather strange. So, make sure that you name your methods well and people won't have doubts about the effects of your function calls.
EDIT 2: In this example of mutexes, I believe that the right usage would be actually not ignoring the return value. Even if the current implementation will never return false, I think it's good practice to still check the return value, just in case you will end up using another implementation in the future or they change the behaviour in a future release of the .NET Framework or something:
if (mutex.WaitOne())
{
// Your code here
}
else
{
// Optionally, some error handling here
}
object dummy = JustDontCare();
No standard conventions I'm aware of.
But I'm struggling to find a good reason for needing this. It sounds like SomeOperation() should really be two separate methods. Have you got an example of a method which really should behave this way? Why should a method bother returning a result if it's going to be ignored?
Sometimes it's useful to be able to put in (void) to indicate to a future coder looking at the code that you know perfectly well it returns something, and you are deliberately ignoring it.
That said, the C# compiler will error on the syntax.
I've seen:
var notUsed = SomeOperation();
Not so fond of it though.
The convention in .NET is that if you don't store or use a return value, you ignore it implicitly. There's no explicit convention, and the API is generally designed so that return values can be ignored, with the exception of Boolean values representing success or failure.
But even in the case of Boolean return values representing success/fail status, the convention is that if you ignore the return value (don't use it) that means the code doesn't depend on the success status of previous call.
We are currently going through the long process of writing some coding standards for C#.
I've written a method recently with the signature
string GetUserSessionID(int UserID)
GetUserSessionID() returns null in the case that a session is not found for the user.
in my calling code... I say...
string sessionID = GetUserSessionID(1);
if (null == sessionID && userIsAllowedToGetSession)
{
sessionID = GetNewUserSession(1);
}
In a recent code review, the reviewer said "you should never return null from a method as it puts more work on the calling method to check for nulls."
Immediately I cried shenanigans, as if you return string.Empty you still have to perform some sort of check on the returned value.
if (string.Empty == sessionID)
However, thinking about this further I would never return null in the case of a Collection/Array/List.
I would return an empty list.
The solution to this (I think) would be to refactor this into 2 methods.
bool SessionExists(int userID);
and
string GetUserSessionID(int UserID);
This time, GetUserSessionID() would throw a SessionNotFound exception (as it should not return null)
now the code would look like...
if (!SessionExists(1) && userIsAllowedToGetSession)
{
session = GetNewUserSession(1);
}
else
{
session = GetUserSessionID(1);
}
This now means that there are no nulls, but to me this seems a bit more complicated. This is also a very simple example and I was wondering how this would impact more complicated methods.
There is plenty of best-practice advice around about when to throw exceptions and how to handle them, but there seems to be less information regarding the use of null.
Does anyone else have any solid guidelines (or even better standards) regarding the use of nulls, and what does this mean for nullable types (should we be using them at all?)
Thanks in advance,
Chris.
Thanks everyone! LOTS of very interesting discussion there.
I've given the answer to egaga as I like their suggestion of Get vs Find as a coding guideline, but all were interesting answers.
nulls are definitely better, i.e., more honest, than "magic values". But they should not be returned when an error has happened - that's what exceptions are made for. When it comes to returning collections... better an empty collection than null, I agree.
returning null is fine and the following is concise and easy to understand:
var session = GetUserSessionID(1) ?? GetNewUserSession(1);
A possible practice is to use get prefix for methods that throw an exception if result is not found, and find prefix, if null is possible. Thus it's easy to see in client side whether the code could have a problem dealing with null.
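A sketch of how that naming convention could look for the session example. SessionStore and the dictionary behind it are hypothetical, and KeyNotFoundException is just one reasonable choice of exception:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical store illustrating the Get/Find naming convention.
public sealed class SessionStore
{
    private readonly Dictionary<int, string> _sessionIds = new Dictionary<int, string>();

    public void Add(int userId, string sessionId) { _sessionIds[userId] = sessionId; }

    // Find*: absence is a normal outcome, so null is a legal result.
    public string FindUserSessionID(int userId)
    {
        string sessionId;
        return _sessionIds.TryGetValue(userId, out sessionId) ? sessionId : null;
    }

    // Get*: the caller expects the session to exist; absence is an error.
    public string GetUserSessionID(int userId)
    {
        string sessionId = FindUserSessionID(userId);
        if (sessionId == null)
        {
            throw new KeyNotFoundException("No session for user " + userId);
        }
        return sessionId;
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new SessionStore();
        store.Add(1, "abc");

        Console.WriteLine(store.GetUserSessionID(1));          // prints "abc"
        Console.WriteLine(store.FindUserSessionID(2) == null); // prints "True"
    }
}
```

The caller picks the method that matches its expectations, and a reviewer can tell from the name alone whether a null check is needed.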
Of course one should avoid nulls, and Anders Hejlsberg has said in an interview that if C# were created now, it would have better ways of dealing with nullability. http://www.computerworld.com.au/article/261958/-z_programming_languages_c?pp=3&fp=&fpid=
In my opinion you shouldn't rule out the use of null as a return value. I think it is valid in many cases. But you should carefully consider the pros and cons for every method. It totally depends on the purpose of the method and the expectations callers might have.
I personally use a null return value a lot in cases where you might expect that the method does not return an object (e.g. a database search by the primary key might return exactly one instance or none / null). When you can correctly expect a method to return a value, you should use an exception.
In your particular example I think it depends on the context of the system. If the call, for instance, is only made from code where you would expect a logged-in user, you should throw an exception. If however it is likely that no user is logged in and you therefore don't have a session id to return, you should choose to return null.
Why don't you use the Null design pattern instead?
Keep things simple. Make it as painless as possible for the consumer of your type. It's a tradeoff: time to implement versus benefits gained.
When dealing with things like fetching lists/collections, return an empty list as you rightly pointed out.
Similarly for cases where you need to return a 'null' value to signify 'not-found' - try using the Null Object pattern for a heavily used type. For rarely used types, I guess you could live with a few checks for null or -1 (String.IndexOf).
If you have more categories, please post as comments and I'll attempt solutions.
Nulls are far better than magic values and make much more sense.
You could also try and write a TryGetUserSession method.
bool TryGetUserSession(int userId, out string sessionId)
Also try not to write null == ???, as some developers find it harder to read.
Kind regards,
I'm wary of returning nulls myself, though in your example it certainly seems like the right thing to do.
What is important is being clear about nullability both for arguments and return values. This must be specified as part of your API documentation if your language can't express this concept directly. (Java can't, I've read that C# can.) If an input parameter or return value can be null, be sure to tell users what that means in the context of your API.
Our largest code base is a Java app that has grown steadily over the past decade. Many of the internal APIs are very unclear about their behavior WRT null. This leads to cancerous growth of null-checks all over the place because everyone is programming defensively.
Particularly ugly: functions which return collections returning null instead of an empty collection. WTF!? Don't do that!
In the case of Strings be sure to distinguish between Strings that must be there, but can be empty and strings that are truly optional (can be null). In working with XML we come across this distinction often: elements that are mandatory, but can be empty, versus elements that are optional.
One part of the system, which provides querying over XML-like structures for implementing business rules, is very radical about being null-free. It uses the Null Object Pattern everywhere. In this context, this turns out to be useful because it allows you to do things like:
A.children().first().matches("Blah")
The idea being that this expression should return true if the first child of A is named "Blah". It won't crash if A has no children because in that case first() returns a NullNode which implements the Node interface, but never 'matches' anything.
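Translated to C#, the idea could be sketched roughly like this. INode, Node, NullNode and FirstChild are hypothetical names, and the chain is shortened to a single FirstChild() call instead of children().first():

```csharp
using System;
using System.Collections.Generic;

public interface INode
{
    INode FirstChild();
    bool Matches(string name);
}

// An ordinary node with a name and zero or more children.
public sealed class Node : INode
{
    private readonly string _name;
    private readonly List<INode> _children = new List<INode>();

    public Node(string name, params INode[] children)
    {
        _name = name;
        _children.AddRange(children);
    }

    // Never returns null: a missing child is represented by the null object.
    public INode FirstChild()
    {
        return _children.Count > 0 ? _children[0] : NullNode.Instance;
    }

    public bool Matches(string name) { return _name == name; }
}

// Null object: implements the interface but never matches anything.
public sealed class NullNode : INode
{
    public static readonly NullNode Instance = new NullNode();

    private NullNode() { }

    public INode FirstChild() { return this; }
    public bool Matches(string name) { return false; }
}

public static class Program
{
    public static void Main()
    {
        var parent = new Node("A", new Node("Blah"));
        var leaf = new Node("B");

        Console.WriteLine(parent.FirstChild().Matches("Blah")); // prints "True"
        Console.WriteLine(leaf.FirstChild().Matches("Blah"));   // prints "False"
    }
}
```

The chained call on the childless node simply evaluates to false, with no null check and no exception.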
In summary:
Be clear about the meaning and permissibility of null in your APIs.
Use your brain.
Don't be dogmatic.
One solution might be to declare a simple return type for such things:
class Session
{
public bool Exists;
public string ID;
}
Session GetUserSession(int userID) {...}
...
Session session = GetUserSession(1);
if (session.Exists)
{
... use session.ID
}
I do generally like to avoid nulls, but strings are a bit of a special case. I often end up calling string.IsNullOrEmpty(). So much so that we use an extension method for string:
static class StringExtensions
{
public static bool IsNullOrEmpty(this string value)
{
return string.IsNullOrEmpty(value);
}
}
Then you can do:
string s = ... something
if (s.IsNullOrEmpty())
{
...
}
Also, since we're on the subject of coding style:
What's with the unnatural-looking "if (null == ...)" style? That's completely unnecessary for C# - it looks like someone's brought some C++ baggage into the C# coding styles!
I prefer returning empty collections instead of nulls, because that helps avoid cluttering the caller code like the following:
if( list != null) {
foreach( Item item in list ) {
...
}
}
The design by contract answer you have would also be my solution:
if (myClass.SessionExists)
{
// Do something with myClass.Session
}
We always return empty lists / collections and do not return NULL lists / collections because handling empty collections/lists reduces the burden on the calling functions.
Nullable - nullables are particularly useful when you want to avoid handling the "null" values from the databases. You can simply use GetValueOrDefault on any Nullable type without worrying about the database value being null in your business objects. We typically define Nullables ONLY in our business objects, and only where we really want to escape the db null value checks, etc.
The inventor of nulls thinks that they are a bad idea..
see: http://lambda-the-ultimate.org/node/3186
I've had a similar issue, though I threw an Exception and was told I should have returned null.
We ended up reading
this blog post about vexing exceptions
I'd recommend having a read.
NULL means unknown/undetermined.
You mostly need it when working with a database where data is simply 'not determined (yet)'. In your application, NULL is a good return type for non-exceptional but not-default behavior.
You use NULL when the output is not what was expected but not serious enough for an exception.