Throwing exceptions in switch statements when no specified case can be handled

Throwing exceptions in switch statements when no specified case can be handled - c#

Let's say we have a function that changes a password for a user in a system in an MVC app.:
public JsonResult ChangePassword
(string username, string currentPassword, string newPassword)
{
switch (this.membershipService.ValidateLogin(username, currentPassword))
{
case UserValidationResult.BasUsername:
case UserValidationResult.BadPassword:
// abort: return JsonResult with localized error message
// for invalid username/pass combo.
case UserValidationResult.TrialExpired
// abort: return JsonResult with localized error message
// that user cannot login because their trial period has expired
case UserValidationResult.Success:
break;
}
// NOW change password now that user is validated
}
membershipService.ValidateLogin() returns a UserValidationResult enum that is defined as:
enum UserValidationResult
{
BadUsername,
BadPassword,
TrialExpired,
Success
}
Being a defensive programmer, I would change the above ChangePassword() method to throw an exception if there's an unrecognized UserValidationResult value back from ValidateLogin():
public JsonResult ChangePassword
(string username, string currentPassword, string newPassword)
{
switch (this.membershipService.ValidateLogin(username, currentPassword))
{
case UserValidationResult.BasUsername:
case UserValidationResult.BadPassword:
// abort: return JsonResult with localized error message
// for invalid username/pass combo.
case UserValidationResult.TrialExpired
// abort: return JsonResult with localized error message
// that user cannot login because their trial period has expired
case UserValidationResult.Success:
break;
default:
throw new NotImplementedException
("Unrecognized UserValidationResult value.");
// or NotSupportedException()
break;
}
// Change password now that user is validated
}
I always considered a pattern like the last snippet above a best practice. For example, what if one developer gets a requirement that now when users try to login, if for this-or-that business reason, they're suppose to contact the business first? So UserValidationResult's definition is updated to be:
enum UserValidationResult
{
BadUsername,
BadPassword,
TrialExpired,
ContactUs,
Success
}
The developer changes the body of the ValidateLogin() method to return the new enum value (UserValidationResult.ContactUs) when applicable, but forgets to update ChangePassword(). Without the exception in the switch, the user is still allowed to change their password when their login attempt shouldn't even be validated in the first place!
Is it just me, or is this default: throw new Exception() a good idea? I saw it some years ago and always (after groking it) assume it to be a best practice.

I always throw an exception in this case. Consider using InvalidEnumArgumentException, which gives richer information in this situation.

With what you have it is fine although the break statement after it will never be hit because execution of that thread will cease when an exception is thrown and unhandled.

I always like to do exactly what you have, although I usually throw an ArgumentException if it's an argument that was passed in, but I kind of like NotImplementedException better since the error is likely that a new case statement should be added rather than the caller should change the argument to a supported one.

I've used this practice before for specific instances when the enumeration lives "far" from it's use, but in cases where the enumeration is really close and dedicated to specific feature it seems like a little bit much.
In all likelihood, I suspect an debug assertion failure is probably more appropriate.
http://msdn.microsoft.com/en-us/library/6z60kt1f.aspx
Note the second code sample...

I think approaches like this can be very useful (for example, see Erroneously empty code paths). It's even better if you can have that default-throw be compiled away in released code, like an assert. That way you have the extra defensiveness during development and testing, but don't incur the extra code overhead when releasing.

You do state that you're being very defensive and I might almost say this is overboard. Isn't the other developer going to test their code? Surely, when they do the simplest of tests they'll see that the user can still log in - so, then they'll realize what they need to fix. What you're doing isn't horrible or wrong, but if you're spending a lot of your time doing it, it might just be too much.

For a web application I would prefer a default that generates a result with an error message that asks the user to contact an adminstrator and logs an error rather than throws an exception that might result in something less meaningful to the user. Since I can anticipate, in this case, that the return value might be something other than what I expect I would not consider this truly exceptional behavior.
Also, your use case that results in the additional enum value shouldn't introduce an error into your particular method. My expectation would be that the use case only applies on login -- which would happen before the ChangePassword method could legally be called, presumably, since you'd want to make sure that the person was logged in before changing their password -- you should never actually see the ContactUs value as a return from the password validation. The use case would specify that if a ContactUs result is return, that authentication fails until the condition resulting in the result is cleared. If it were otherwise, I'd expect that you'd have requirements for how other parts of the system should react under that condition and you'd write tests that would force you to change the code in this method to conform to those tests and address the new return value.

Validating runtime constraints is usually a good thing, so yes, best practice, helps with 'fail fast' principle and you are halting (or logging) when detecting an error condition rather than continuing silently. There are cases where that is not true, but given switch statements I suspect we do not see them very often.
To elaborate, switch statements can always be replaced by if/else blocks. Looking at it from this perspective, we see the throw vs do not throw in a default switch case is about equivalent to this example:
if( i == 0 ) {
} else { // i > 0
}
vs
if( i == 0 ) {
} else if ( i > 0 ) {
} else {
// i < 0 is now an enforced error
throw new Illegal(...)
}
The second example is usually considered better since you get a failure when an error constraint is violated rather than continuing under a potentially incorrect assumption.
If instead we want:
if( i == 0 ) {
} else { // i != 0
// we can handle both positve or negative, just not zero
}
Then, I suspect in practice that last case will likely appear as an if/else statement. Because of that the switch statement so often resembles the second if/else block, that the throws is usually a best practice.
A few more considerations are worthwhile:
- consider a polymorphic approach or an enum method to replace the switch statement altogether, eg: Methods inside enum in C#
- if throwing is the best, as noted in other answers prefer to use a specific exception type to avoid boiler plate code, eg: InvalidEnumArgumentException

I never add a break after a throw statement contained in a switch statement. Not the least of the concerns is the annoying "unreachable code detected" warning. So yes, it is a good idea.

Related

How to avoid convoluted logic for custom log messages in code?

I know the title is a little too broad, but I'd like to know how to avoid (if possible) this piece of code I've just coded on a solution of ours.
The problem started when this code resulted in not enough log information:
...
var users = [someRemotingProxy].GetUsers([someCriteria]);
try
{
var user = users.Single();
}
catch (InvalidOperationException)
{
logger.WarnFormat("Either there are no users corresponding to the search or there are multiple users matching the same criteria.");
return;
}
...
We have a business logic in a module of ours that needs there to be a single 'User' that matches some criteria. It turned out that, when problems started showing up, this little 'inconclusive' information was not enough for us to properly know what happened, so I coded this method:
private User GetMappedUser([searchCriteria])
{
var users = [remotingProxy]
.GetUsers([searchCriteria])
.ToList();
switch (users.Count())
{
case 0:
log.Warn("No user exists with [searchCriteria]");
return null;
case 1:
return users.Single();
default:
log.WarnFormat("{0} users [{1}] have been found"
users.Count(),
String.Join(", ", users);
return null;
}
And then called it from the main code like this:
...
var user = GetMappedUser([searchCriteria]);
if (user == null) return;
...
The first odd thing I see there is the switch statement over the .Count() on the list. This seems very strange at first, but somehow ended up being the cleaner solution. I tried to avoid exceptions here because these conditions are quite normal, and I've heard that it is bad to try and use exceptions to control program flow instead of reporting actual errors. The code was throwing the InvalidOperationException from Single before, so this was more of a refactor on that end.
Is there another approach to this seemingly simple problem? It seems to be kind of a Single Responsibility Principle violation, with the logs in between the code and all that, but I fail to see a decent or elegant way out of it. It's even worse in our case because the same steps are repeated twice, once for the 'User' and then for the 'Device', like this:
Get unique user
Get unique device of unique user
For both operations, it is important to us to know exactly what happened, what users/devices were returned in case it was not unique, things like that.

#AntP hit upon the answer I like best. I think the reason you are struggling is that you actually have two problems here. The first is that the code seems to have too much responsibility. Apply this simple test: give this method a simple name that describes everything it does. If your name includes the word "and", it's doing too much. When I apply that test, I might name it "GetUsersByCriteriaAndValidateOnlyOneUserMatches()." So it is doing two things. Split it up into a lookup function that doesn't care how many users are returned, and a separate function that evaluates your business rule regarding "I can handle only one user returned".
You still have your original problem, though, and that is the switch statement seems awkward here. The strategy pattern comes to mind when looking at a switch statement, although pragmatically I'd consider it overkill in this case.
If you want to explore it, though, think about creating a base "UserSearchResponseHandler" class, and three sub classes: NoUsersReturned; MultipleUsersReturned; and OneUserReturned. It would have a factory method that would accept a list of Users and return a UserSearchResponseHandler based on the count of users (encapsulating the logic of the switch inside the factory.) Each handler method would do the right thing: log something appropriate then return null, or return a single user.
The main advantage of the Strategy pattern comes when you have multiple needs for the data it identifies. If you had switch statements buried all over your code that all depended on the count of users found by a search, then it would be very appropriate. The factory can also encapsulate substantially more complex rules, such as "user.count must = 1 AND the user[0].level must = 42 AND it must be a Tuesday in September". You can also get really fancy with a factory and use a registry, allowing for dynamic changes to the logic. Finally, the factory nicely separates the "interpreting" of the business rule from the "handling" of the rule.
But in your case, probably not so much. I'm guessing you likely have only the one occurrence of this rule, it seems pretty static, and it's already appropriately located near the point where you acquired the information it's validating. While I'd still recommend splitting out the search from the response parser, I'd probably just use the switch.
A different way to consider it would be with some Goldilocks tests. If it's truly an error condition, you could even throw:
if (users.count() < 1)
{
throw TooFewUsersReturnedError;
}
if (users.count() > 1)
{
throw TooManyUsersReturnedError;
}
return users[0]; // just right

How about something like this?
public class UserResult
{
public string Warning { get; set; }
public IEnumerable<User> Result { get; set; }
}
public UserResult GetMappedUsers(/* params */) { }
public void Whatever()
{
var users = GetMappedUsers(/* params */);
if (!String.IsNullOrEmpty(users.Warning))
log.Warn(users.Warning);
}
Switch for a List<string> Warnings if required. This treats your GetMappedUsers method more like a service that returns some data and some metadata about the result, which allows you to delegate your logging to the caller - where it belongs - so your data access code can get on with just doing its job.
Although, to be honest, in this scenario I would prefer simply to return a list of user IDs from GetMappedUsers and then use users.Count to evaluate your "cases" in the caller and log as appropriate.

Try - Catch return strategy

When using try-catch in methods, if you want you application to continue even if errors come along, is it okay to return the default value as return through the catch block, and log the error for later review?
For Example:
public static string GetNameById(int id)
{
string name;
try
{
//Connect to Db and get name - some error occured
}
catch(Exception ex)
{
Log(ex);
name = String.Empty;
}
return name;
}
Example 2:
public static string GetIdByName(string name)
{
int id;
try
{
//Connect to Db and get id- some error occured
}
catch(Exception ex)
{
Log(ex);
id = 0;
}
return id;
}
Is it okay to return any default value (depending on the return type of the method ...???) so that the application logic that required the result from this method do not crash and keeps going ....
Thanks in advance...
Regards.

The advice for exception handling is that mostly you should only catch exceptions that you can do something about (e.g. retry an operation, try a different approach, elevate security etc). If you have logging elsewhere in your application at a global level, this kind of catch-log-return is unnecessary.
Personally - typically - in this situation I'd do this:
public static string GetNameById(int id)
{
string name;
try
{
//Connect to Db and get name - some error occured
}
catch(Exception ex)
{
Log(ex);
throw; // Re-throw the exception, don't throw a new one...
}
return name;
}
So as usual - it depends.
Be aware of other pitfalls though, such as the calling method not being aware that there was a failure, and continuing to perform work on the assumption that the method throwing the exception actually worked. At this point you start the conversation about "return codes vs. throwing exceptions", which you'll find a lot of resources for both on SO.com and the internets in general.

I do not think that is a good solution. In my opinion it would be better to let the caller handle the exception. Alternatively you can catch the exception in the method and throw a custom exception (with the caught exception as the inner exception).
Another way of going about it would be to make a TryGet method, such as:
public static bool TryGetNameById(int id, out string name)
{
try
{
//Connect to Db and get name - some error occured
name = actualName
return true;
}
catch(Exception ex)
{
Log(ex);
name = String.Empty;
return false;
}
}
I think this approach is more intention revealing. The method name itself communicates that it may not always be able to produce a useful result. In my opinion this is better than returning some default value that the caller has to be able to know about to do the correct business logic.

My opinion is I'll never mute errors.
If some exception is thrown, it should be treated as a fatal error.
Maybe the problem is throwing exceptions for things that aren't exceptions. For example, business validation shoudn't be throwing such exceptions.
I'd prefer to validate the business and translate the errors in "broken rules" and transmit them to the presentation or wherever, that would save CPU and memory because there's no stack trace.
Another situation is a data connection loss or another situation that makes the application fall in a wrong state. Then, I'd prefer to log the error and prompt the user to re-open the application or maybe the application may restart itself.
I want to make you some suggestion: have you ever heard about PostSharp? Check it, you can implement that exception logging with an AOP paradigm.

It is advised that you only catch errors that you can handle and recover from. Catching and consuming exceptions that you cannot handle is bad form. However, some environments / applications define strict rules that go against this ideal behaviour.
In those cases, I would say in cases you don't have a choice, you can do what you like with returns - it is up to your application to work with them.
Based on the assumption that your application can handle any sort of failure in trying to get an ID, then returning a default ID is a good enough choice. You can go further and use something like the special case pattern to return empty / missing objects, but for simple data types this seems unwarranted.

this depends very much on the context and on the way your calling method is designed. I have used to return default values in the past as you are doing now and I understood only afterwards that it was deeply wrong...
you should imagine the catch block to throw the exception back, after logging it properly, then the calling method could have another try catch and depending on the severity of the error could inform the user or behave accordingly.
the fact is that if you return an empty string, in some cases, could be that the caller "thinks" there is a user with empty name, while would probably be better to notify the user that the record was not found, for example.
depending on the way your logger works you could log the exception only where you handle it and not in all catches...

That depends on what you want. If it's important for you that you log the exception, but that everything else keeps working, then this is - in my honest opinion- ok.
On the other hand: if an exception occurs, you have to make sure this does not have an impact on the further working of your application.

The point of a try/catch is to allow you to catch and handle errors in a graceful manor, allowing application execution to continue, rather than simply crashing the application and stopping execution.
Therefore it is perfectly acceptable to return a default value. However, be sure that all following code will continue to function if a default value is returned rather than causing further errors.
Eg - in your example, if the DB connection fails, ensure there are no further commands to edit / retrieve values from the database.

It REALLY depends. In most scenarios it is not - problems i that if there is a scenario specific issue you amy never find out. Soemtiems a recovery attempt is good. This normalyl depends on circumstances and actual logic.
The scenarios where "swallow and document" are really valid are rare and far in between in the normal world. They come in more often when writing driver type of thigns (like loaded moduels talkign to an external system). Whether a return value or default(T) equivalent maeks sense also depends.
I would say in 99% of the cases let the exception run up. In 1% of the cases it may well make sense to return some sensible default.

It is better to log the exception but then re-trow it so that it can be properly handled in the upper layers.
Note: the "log the exception part" is of course optional and depends a lot on your handling strategy(will you log it again somewhere else?)
Instead of making the app not crash by swalowing exception, it is better to let them pass and try to find the root cause why they were thrown in the first place.

Depends on what you want your app to do...
For example, display a friendly message, "Cannot Login" if you get a SqlException when trying to connect to the Database is OK.
Handling all errors is sometimes considered bad, although people will disagree...
Your application encountered an exception you did not expect, you can no longer be 100% sure what state your application is in, which lines of codes executed and which didn't, etc. Further reading on this: http://blogs.msdn.com/b/larryosterman/archive/2005/05/31/423507.aspx.
And more here : http://blogs.msdn.com/b/oldnewthing/archive/2011/01/20/10117963.aspx

I think the answer would have to be "it depends".
In some cases it may be acceptable to return an empty string from a function in case of an error. If you are looking up somebody's address to display then an empty string works fine rather than crashing the whole thing.
In other cases it may not work so well. If you are returning a string to use for a security lookup for example (eg getSecurityGroup) then you are going to have very undesired behaviour if you return the wrong thing and you might be better off keeping the error thrown and letting the user know something has gone wrong rather than pretending otherwise.
Even worse might be if you are persisting data provided to you by the user. Something goes wrong when you are getting their data, you return a default and store that without even telling them... That's gotta be bad.
So I think you need to look at each method where you are considering this and decide if it makes sense to throw an error or return a default value.
In fact as a more general rule any time you catch an error you should be thinking hard about whether it is an acceptable error and continuing is permissable or whether it is a show stopping error and you should just give up.

Which Exception Should I Throw to Signal an Internal Error in my Program?

Which exception should I use when the program reaches a logic state that I "know" won't happen, and if it does, something is terribly bad?
For example:
int SomeFunction(int arg) {
SomeEnum x = Whatever(arg, somePrivateMember);
switch (x) {
case SomeEnum.Value1:
return SomeFunction1();
case SomeEnum.Value1:
return SomeFunction2();
default:
throw new WhatTypeToThrow();
}
}
Clearly, ArgumentException is a long-shot here since the invalid value for x could have come from a bug in Whatever(), or an invalid combination of any arguments and/or the current instance state.
I'm looking for something such as an InvalidProgramStateException, InternalErrorException or similar.
Of course I could define my own, but I wonder if there is a suitable exception in the framework.
Edit: Removed the simple sample code to reduce amount of ArgumentException answers.

What about InvalidOperationException?

Why not the InvalidEnumArgumentException? It looks like it was designed specifically for this use-case.

Don't throw any specific exception type in the code you're looking at. Call Trace.Assert, or in this case even Trace.Fail to get a similar effect to Debug.Assert, except enabled even in release builds (assuming the settings aren't changed).
If the default trace listener, the one that offers a UI that offers to kill the whole program or launch a debugger, isn't appropriate for your needs, set up a custom trace listener in Trace.Listeners that causes a private exception type to be thrown whenever Trace.Fail is called (including when a Trace.Assert fails).
The exception type should be a private exception type because otherwise, callers may be tempted to try catching whatever exception type you're going to throw. For this particular exception type, you will want to make it as clear as possible that a future version of the method will no longer throw this particular exception. You don't want to be forced to throw a TraceFailedException or whatever you call it from now until eternity to preserve backwards compatibility.
Another answer mentioned Code Contracts already as an alternative. It is, in a similar way: you can call Contract.Assert(false). This takes the same approach of making it customisable what happens if an assertion fails, but in this case, the default behaviour is to throw an exception, again of a type that is not externally accessible. To make the most of Code Contracts, however, you should be using the static rewriter, which has both pros and cons that I will not get into here. If for you the pros outweigh the cons, then by all means use it. If you prefer to avoid the static rewriter though, then I'd recommend avoiding the Contract class entirely, since it is not at all obvious which methods do and don't work.

I think ArgumentOutOfRangeException is valid here and it's what I use. It's the argument to the switch statement that is not handled as it's out of the range of handled values. I tend to code it like this, where the message tells it like it is:
switch (test)
{
case SomeEnum.Woo:
break;
case SomeEnum.Yay:
break;
default:
{
string msg = string.Format("Value '{0}' for enum '{1}' is not handled.",
test, test.GetType().Name);
throw new ArgumentOutOfRangeException(msg);
}
}
Obviously the message is to your own tastes, but the basics are in that one. Adding the value of the enum to the message is useful not only to give detail concerning what known enum member was not handled, but also when there is an invalid enum i.e. the old "(666)SomeEnum" issue.
Value 'OhNoes' for enum 'SomeEnum' is not handled.
vs
Value '666' for enum 'SomeEnum' is not handled.

Here are suggestions that I've been given:
ArgumentException: something is wrong with the value
ArgumentNullException: the argument is null while this is not allowed
ArgumentOutOfRangeException: the argument has a value outside of the valid range
Alternatively, derive your own exception class from ArgumentException.
An input is invalid if it is not valid at any time. While an input is unexpected if it is not valid for the current state of the system (for which InvalidOperationException is a reasonable choice in some situations).
See similar question and answer that I was given.

You should consider using Code Contracts to not only throw exceptions in this case, but document what the failed assumption is, perhaps with a friendly message to the programmer. If you were lucky, the function you called (Whatever) would have a Contract.Ensures which would catch this error before you got your hands on it.

program reaches a logic state that I "know" won't happen, and if it does, something is terribly bad.
In this case, I would throw an ApplicationException, log what you can, and exit the app. If things are that screwed up, you certainly shouldn't try to recover and/or continue.

Can/Should you throw exceptions in a c# switch statement?

I have an insert query that returns an int. Based on that int I may wish to throw an exception. Is this appropriate to do within a switch statement?
switch (result)
{
case D_USER_NOT_FOUND:
throw new ClientException(string.Format("D User Name: {0} , was not found.", dTbx.Text));
case C_USER_NOT_FOUND:
throw new ClientException(string.Format("C User Name: {0} , was not found.", cTbx.Text));
case D_USER_ALREADY_MAPPED:
throw new ClientException(string.Format("D User Name: {0} , is already mapped.", dTbx.Text));
case C_USER_ALREADY_MAPPED:
throw new ClientException(string.Format("C User Name: {0} , is already mapped.", cTbx.Text));
default:
break;
}
I normally add break statements to switches but they will not be hit. Is this a bad design? Please share any opinions/suggestions with me.
Thanks,
~ck in San Diego

Why not?
From The C# Programming Language, Third Ed. by Anders Hejlsberg et al, page 362:
The statement list of a switch section typically ends in a break, goto case, or goto default statement, but any construct that renders the end point of the statement list unreachable is permitted. [...] Likewise, a throw or return statement always transfers control elsewhere and never reaches its end point. Thus the following example is valid:
switch(i) {
case 0:
while(true) F();
case 1:
throw new ArgumentException();
case 2:
return;
}

No problem... why would this be bad design?
Alternatively, since the exception type is the same in all cases, you could construct a lookup table for the error messages. That would save you some code duplication. For example:
static private Dictionary<int, string> errorMessages;
static
{
// create and fill the Dictionary
}
// meanwhile, elsewhere in the code...
if (result is not ok) {
throw new ClientException(string.Format(errorMessages[result], cTbx.Text, dTbx.Text));
}
In the messages themselves, you can select the appropriate parameter with {0}, {1}, etc.

I don't really agree with your variable naming conventions ;), but I don't see why what you did would be inappropriate. It seems like a fairly elegant way of translating an error from one medium to another.
I'm assuming that your "insert query" is some form of stored proc.

If possible it would probably be better if you threw wherever the failed result is set, otherwise you end up with having to do both result checking and throwing. But might not be possible of course.
Also, I'd make your error strings into resources or constants with placeholders so that you don't have to change multiple places if you want to change the wording.

I think this is OK. It seems like you are mapping an return code to an exception, using a switch statement. As long as there are not too many cases, this is not a problem.

I don't see any issue with using a switch in your case.
The stronger consideration should go to whether exceptions themselves are appropriate. Generally, exceptions should be used only when a situation arises that is outside the bounds of expected behavior. Exceptions shouldn't be used as program flow logic. In your case you're probably okay to use them based on the code I see.

There is nothing wrong with the approach you are taking. Switch statements are a lot easier to read than if/then statements (and potentially faster too). The other thing you could do is load the possible exceptions in a
Dictionary<Result_Type, Exception>
and pull the exception from there. If you have lots of switch statements this would make the code more compact (and you can add and remove them at runtime if you need to).

Maybe I beg to differ from all the answers here..
Instead of having to switch in the code, I would rather pass in the result to the ClientException class & let it decide what string it needs to show rather then have an ugly switch all over the place to create various messages
My code would look like:
throw new ClientException(result, cTbx.Text);
So, even if you can throw errors in switch...case, you can avoid it all is my take

Should functions return null or an empty object?

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
What is the best practice when returning data from functions. Is it better to return a Null or an empty object? And why should one do one over the other?
Consider this:
public UserEntity GetUserById(Guid userId)
{
//Imagine some code here to access database.....
//Check if data was returned and return a null if none found
if (!DataExists)
return null;
//Should I be doing this here instead?
//return new UserEntity();
else
return existingUserEntity;
}
Lets pretend that there would be valid cases in this program that there would be no user information in the database with that GUID. I Would imagine that it would not be appropriate to throw an exception in this case?? Also I am under the impression that exception handling can hurt performance.

Returning null is usually the best idea if you intend to indicate that no data is available.
An empty object implies data has been returned, whereas returning null clearly indicates that nothing has been returned.
Additionally, returning a null will result in a null exception if you attempt to access members in the object, which can be useful for highlighting buggy code - attempting to access a member of nothing makes no sense. Accessing members of an empty object will not fail meaning bugs can go undiscovered.

It depends on what makes the most sense for your case.
Does it make sense to return null, e.g. "no such user exists"?
Or does it make sense to create a default user? This makes the most sense when you can safely assume that if a user DOESN'T exist, the calling code intends for one to exist when they ask for it.
Or does it make sense to throw an exception (a la "FileNotFound") if the calling code is demanding a user with an invalid ID?
However - from a separation of concerns/SRP standpoint, the first two are more correct. And technically the first is the most correct (but only by a hair) - GetUserById should only be responsible for one thing - getting the user. Handling its own "user does not exist" case by returning something else could be a violation of SRP. Separating into a different check - bool DoesUserExist(id) would be appropriate if you do choose to throw an exception.
Based on extensive comments below: if this is an API-level design question, this method could be analogous to "OpenFile" or "ReadEntireFile". We are "opening" a user from some repository and hydrating the object from the resultant data. An exception could be appropriate in this case. It might not be, but it could be.
All approaches are acceptable - it just depends, based on the larger context of the API/application.

Personally, I use NULL. It makes clear that there is no data to return. But there are cases when a Null Object may be usefull.

If your return type is an array then return an empty array otherwise return null.

You should throw an exception (only) if a specific contract is broken.
In your specific example, asking for a UserEntity based on a known Id, it would depend on the fact if missing (deleted) users are an expected case. If so, then return null but if it is not an expected case then throw an exception.
Note that if the function was called UserEntity GetUserByName(string name) it would probably not throw but return null. In both cases returning an empty UserEntity would be unhelpful.
For strings, arrays and collections the situation is usually different. I remember some guideline form MS that methods should accept null as an 'empty' list but return collections of zero-length rather than null. The same for strings. Note that you can declare empty arrays: int[] arr = new int[0];

This is a business question, dependent on whether the existence of a user with a specific Guid Id is an expected normal use case for this function, or is it an anomaly that will prevent the application from successfully completing whatever function this method is providing the user object to...
If it's an "exception", in that the absence of a user with that Id will prevent the application from successfully completing whatever function it is doing, (Say we're creating an invoice for a customer we've shipped product to...), then this situation should throw an ArgumentException (or some other custom exception).
If a missing user is ok, (one of the potential normal outcomes of calling this function) then return a null....
EDIT: (to address comment from Adam in another answer)
If the application contains multiple business processes, one or more of which require a User in order to complete successfully, and one or more of which can complete successfully without a user, then the exception should be thrown further up the call stack, closer to where the business processes which require a User are calling this thread of execution. Methods between this method and that point (where the exception is being thrown) should just communicate that no user exists (null, boolean, whatever - this is an implementation detail).
But if all processes within the application require a user, I would still throw the exception in this method...

I personally would return null, because that is how I would expect the DAL/Repository layer to act.
If it doesn't exist, don't return anything that could be construed as successfully fetching an object, null works beautifully here.
The most important thing is to be consistant across your DAL/Repos Layer, that way you don't get confused on how to use it.

I tend to
return null if the object id doesn't exist when it's not known beforehand whether it should exist.
throw if the object id doesn't exist when it should exist.
I differentiate these two scenarios with these three types of methods.
First:
Boolean TryGetSomeObjectById(Int32 id, out SomeObject o)
{
if (InternalIdExists(id))
{
o = InternalGetSomeObject(id);
return true;
}
else
{
return false;
}
}
Second:
SomeObject FindSomeObjectById(Int32 id)
{
SomeObject o;
return TryGetObjectById(id, out o) ? o : null;
}
Third:
SomeObject GetSomeObjectById(Int32 id)
{
SomeObject o;
if (!TryGetObjectById(id, out o))
{
throw new SomeAppropriateException();
}
return o;
}

Yet another approach involves passing in a callback object or delegate that will operate on the value. If a value is not found, the callback is not called.
public void GetUserById(Guid id, UserCallback callback)
{
// Lookup user
if (userFound)
callback(userEntity); // or callback.Call(userEntity);
}
This works well when you want to avoid null checks all over your code, and when not finding a value isn't an error. You may also provide a callback for when no objects are found if you need any special processing.
public void GetUserById(Guid id, UserCallback callback, NotFoundCallback notFound)
{
// Lookup user
if (userFound)
callback(userEntity); // or callback.Call(userEntity);
else
notFound(); // or notFound.Call();
}
The same approach using a single object might look like:
public void GetUserById(Guid id, UserCallback callback)
{
// Lookup user
if (userFound)
callback.Found(userEntity);
else
callback.NotFound();
}
From a design perspective, I really like this approach, but has the disadvantage of making the call site bulkier in languages that don't readily support first class functions.

We use CSLA.NET, and it takes the view that a failed data fetch should return an "empty" object. This is actually quite annoying, as it demands the convention of checking whether obj.IsNew rathern than obj == null.
As a previous poster mentioned, null return values will cause code to fail straight away, reducing the likelihood of stealth problems caused by empty objects.
Personally, I think null is more elegant.
It's a very common case, and I'm surprised that people here seem surprised by it: on any web application, data is often fetched using a querystring parameter, which can obviously be mangled, so requiring that the developer handle incidences of "not found".
You could handle this by:
if (User.Exists(id)) {
this.User = User.Fetch(id);
} else {
Response.Redirect("~/notfound.aspx");
}
...but that's an extra call to the database every time, which may be an issue on high-traffic pages. Whereas:
this.User = User.Fetch(id);
if (this.User == null) {
Response.Redirect("~/notfound.aspx");
}
...requires only one call.

I prefer null, since it's compatible with the null-coalescing operator (??).

I'd say return null instead of an empty object.
But the specific instance that you have mentioned here,
you are searching for an user by user id, which is sort
of the key to that user, in that case I'd probably want
to to throw an exception if no user instance instance is
found.
This is the rule I generally follow:
If no result found on a find by primary key operation,
throw ObjectNotFoundException.
If no result found on a find by any other criteria,
return null.
If no result found on a find by a non-key criteria that may return a multiple objects
return an empty collection.

It will vary based on context, but I will generally return null if I'm looking for one particular object (as in your example) and return an empty collection if I'm looking for a set of objects but there are none.
If you've made a mistake in your code and returning null leads to null pointer exceptions, then the sooner you catch that the better. If you return an empty object, initial use of it may work, but you may get errors later.

The best in this case return "null" in a case there no a such user. Also make your method static.
Edit:
Usually methods like this are members of some "User" class and don't have an access to its instance members. In this case the method should be static, otherwise you must create an instance of "User" and then call GetUserById method which will return another "User" instance. Agree this is confusing. But if GetUserById method is member of some "DatabaseFactory" class - no problem to leave it as an instance member.

I personally return a default instance of the object. The reason is that I expect the method to return zero to many or zero to one (depending on the method's purpose). The only reason that it would be an error state of any kind, using this approach, is if the method returned no object(s) and was always expected to (in terms of a one to many or singular return).
As to the assumption that this is a business domain question - I just do not see it from that side of the equation. Normalization of return types is a valid application architecture question. At the very least, it is subject for standardization in coding practices. I doubt that there is a business user who is going to say "in scenario X, just give them a null".

In our Business Objects we have 2 main Get methods:
To keep things simple in the context or you question they would be:
// Returns null if user does not exist
public UserEntity GetUserById(Guid userId)
{
}
// Returns a New User if user does not exist
public UserEntity GetNewOrExistingUserById(Guid userId)
{
}
The first method is used when getting specific entities, the second method is used specifically when adding or editing entities on web pages.
This enables us to have the best of both worlds in the context where they are used.

I'm a french IT student, so excuse my poor english. In our classes we are being told that such a method should never return null, nor an empty object. The user of this method is supposed to check first that the object he is looking for exists before trying to get it.
Using Java, we are asked to add a assert exists(object) : "You shouldn't try to access an object that doesn't exist"; at the beginning of any method that could return null, to express the "precondition" (I don't know what is the word in english).
IMO this is really not easy to use but that's what I'm using, waiting for something better.

If the case of the user not being found comes up often enough, and you want to deal with that in various ways depending on circumstance (sometimes throwing an exception, sometimes substituting an empty user) you could also use something close to F#'s Option or Haskell's Maybe type, which explicitly seperates the 'no value' case from 'found something!'. The database access code could look like this:
public Option<UserEntity> GetUserById(Guid userId)
{
//Imagine some code here to access database.....
//Check if data was returned and return a null if none found
if (!DataExists)
return Option<UserEntity>.Nothing;
else
return Option.Just(existingUserEntity);
}
And be used like this:
Option<UserEntity> result = GetUserById(...);
if (result.IsNothing()) {
// deal with it
} else {
UserEntity value = result.GetValue();
}
Unfortunately, everybody seems to roll a type like this of their own.

I typically return null. It provides a quick and easy mechanism to detect if something screwed up without throwing exceptions and using tons of try/catch all over the place.

For collection types I would return an Empty Collection, for all other types I prefer using the NullObject patterns for returning an object that implements the same interface as that of the returning type. for details about the pattern check out link text
Using the NullObject pattern this would be :-
public UserEntity GetUserById(Guid userId)
{
//Imagine some code here to access database.....
//Check if data was returned and return a null if none found
if (!DataExists)
return new NullUserEntity(); //Should I be doing this here instead? return new UserEntity();
else
return existingUserEntity;
}
class NullUserEntity: IUserEntity { public string getFirstName(){ return ""; } ...}

To put what others have said in a pithier manner...
Exceptions are for Exceptional circumstances
If this method is pure data access layer, I would say that given some parameter that gets included in a select statement, it would expect that I may not find any rows from which to build an object, and therefore returning null would be acceptable as this is data access logic.
On the other hand, if I expected my parameter to reflect a primary key and I should only get one row back, if I got more than one back I would throw an exception. 0 is ok to return null, 2 is not.
Now, if I had some login code that checked against an LDAP provider then checked against a DB to get more details and I expected those should be in sync at all times, I might toss the exception then. As others said, it's business rules.
Now I'll say that is a general rule. There are times where you may want to break that. However, my experience and experiments with C# (lots of that) and Java(a bit of that) has taught me that it is much more expensive performance wise to deal with exceptions than to handle predictable issues via conditional logic. I'm talking to the tune of 2 or 3 orders of magnitude more expensive in some cases. So, if it's possible your code could end up in a loop, then I would advise returning null and testing for it.

Forgive my pseudo-php/code.
I think it really depends on the intended use of the result.
If you intend to edit/modify the return value and save it, then return an empty object. That way, you can use the same function to populate data on a new or existing object.
Say I have a function that takes a primary key and an array of data, fills the row with data, then saves the resulting record to the db. Since I'm intending to populate the object with my data either way, it can be a huge advantage to get an empty object back from the getter. That way, I can perform identical operations in either case. You use the result of the getter function no matter what.
Example:
function saveTheRow($prim_key, $data) {
$row = getRowByPrimKey($prim_key);
// Populate the data here
$row->save();
}
Here we can see that the same series of operations manipulates all records of this type.
However, if the ultimate intent of the return value is to read and do something with the data, then I would return null. This way, I can very quickly determine if there was no data returned and display the appropriate message to the user.
Usually, I'll catch exceptions in my function that retrieves the data (so I can log error messages, etc...) then return null straight from the catch. It generally doesn't matter to the end user what the problem is, so I find it best to encapsulate my error logging/processing directly in the function that gets the data. If you're maintaining a shared codebase in any large company this is especially beneficial because you can force proper error logging/handling on even the laziest programmer.
Example:
function displayData($row_id) {
// Logging of the error would happen in this function
$row = getRow($row_id);
if($row === null) {
// Handle the error here
}
// Do stuff here with data
}
function getRow($row_id) {
$row = null;
try{
if(!$db->connected()) {
throw excpetion("Couldn't Connect");
}
$result = $db->query($some_query_using_row_id);
if(count($result) == 0 ) {
throw new exception("Couldn't find a record!");
}
$row = $db->nextRow();
} catch (db_exception) {
//Log db conn error, alert admin, etc...
return null; // This way I know that null means an error occurred
}
return $row;
}
That's my general rule. It's worked well so far.

Interesting question and I think there is no "right" answer, since it always depends on the responsibility of your code. Does your method know if no found data is a problem or not? In most cases the answer is "no" and that's why returning null and letting the caller handling he situation is perfect.
Maybe a good approach to distinguish throwing methods from null-returning methods is to find a convention in your team: Methods that say they "get" something should throw an exception if there is nothing to get. Methods that may return null could be named differently, perhaps "Find..." instead.

If the object returned is something that can be iterated over, I would return an empty object, so that I don't have to test for null first.
Example:
bool IsAdministrator(User user)
{
var groupsOfUser = GetGroupsOfUser(user);
// This foreach would cause a run time exception if groupsOfUser is null.
foreach (var groupOfUser in groupsOfUser)
{
if (groupOfUser.Name == "Administrators")
{
return true;
}
}
return false;
}

I like not to return null from any method, but to use Option functional type instead. Methods that can return no result return an empty Option, rather than null.
Also, such methods that can return no result should indicate that through their name. I normally put Try or TryGet or TryFind at the beginning of the method's name to indicate that it may return an empty result (e.g. TryFindCustomer, TryLoadFile, etc.).
That lets the caller apply different techniques, like collection pipelining (see Martin Fowler's Collection Pipeline) on the result.
Here is another example where returning Option instead of null is used to reduce code complexity: How to Reduce Cyclomatic Complexity: Option Functional Type

More meat to grind: let's say my DAL returns a NULL for GetPersonByID as advised by some. What should my (rather thin) BLL do if it receives a NULL? Pass that NULL on up and let the end consumer worry about it (in this case, an ASP.Net page)? How about having the BLL throw an exception?
The BLL may be being used by ASP.Net and Win App, or another class library - I think it is unfair to expect the end consumer to intrinsically "know" that the method GetPersonByID returns a null (unless null types are used, I guess).
My take (for what it's worth) is that my DAL returns NULL if nothing is found. FOR SOME OBJECTS, that's ok - it could be a 0:many list of things, so not having any things is fine (e.g. a list of favourite books). In this case, my BLL returns an empty list. For most single entity things (e.g. user, account, invoice) if I don't have one, then that's definitely a problem and a throw a costly exception. However, seeing as retrieving a user by a unique identifier that's been previously given by the application should always return a user, the exception is a "proper" exception, as in it's exceptional. The end consumer of the BLL (ASP.Net, f'rinstance) only ever expects things to be hunky-dory, so an Unhandled Exception Handler will be used instead of wrapping every single call to GetPersonByID in a try - catch block.
If there is a glaring problem in my approach, please let me know as I am always keen to learn. As other posters have said, exceptions are costly things, and the "checking first" approach is good, but exceptions should be just that - exceptional.
I'm enjoying this post, lot's of good suggestions for "it depends" scenarios :-)

I think functions should not return null, for the health of your code-base. I can think of a few reasons:
There will be a large quantity of guard clauses treating null reference if (f() != null).
What is null, is it an accepted answer or a problem? Is null a valid state for a specific object? (imagine that you are a client for the code). I mean all reference types can be null, but should they?
Having null hanging around will almost always give a few unexpected NullRef exceptions from time to time as your code-base grows.
There are some solutions, tester-doer pattern or implementing the option type from functional programming.

I am perplexed at the number of answers (all over the web) that say you need two methods: an "IsItThere()" method and a "GetItForMe()" method and so this leads to a race condition. What is wrong with a function that returns null, assigning it to a variable, and checking the variable for Null all in one test? My former C code was peppered with
if ( NULL != (variable = function(arguments...)) ) {
So you get the value (or null) in a variable, and the result all at once. Has this idiom been forgotten? Why?

I agree with most posts here, which tend towards null.
My reasoning is that generating an empty object with non-nullable properties may cause bugs. For example, an entity with an int ID property would have an initial value of ID = 0, which is an entirely valid value. Should that object, under some circumstance, get saved to database, it would be a bad thing.
For anything with an iterator I would always use the empty collection. Something like
foreach (var eachValue in collection ?? new List<Type>(0))
is code smell in my opinion. Collection properties shouldn't be null, ever.
An edge case is String. Many people say, String.IsNullOrEmpty isn't really necessary, but you cannot always distinguish between an empty string and null. Furthermore, some database systems (Oracle) won't distinguish between them at all ('' gets stored as DBNULL), so you're forced to handle them equally. The reason for that is, most string values either come from user input or from external systems, while neither textboxes nor most exchange formats have different representations for '' and null. So even if the user wants to remove a value, he cannot do anything more than clearing the input control. Also the distinction of nullable and non-nullable nvarchar database fields is more than questionable, if your DBMS is not oracle - a mandatory field that allows '' is weird, your UI would never allow this, so your constraints do not map.
So the answer here, in my opinion is, handle them equally, always.
Concerning your question regarding exceptions and performance:
If you throw an exception which you cannot handle completely in your program logic, you have to abort, at some point, whatever your program is doing, and ask the user to redo whatever he just did. In that case, the performance penalty of a catch is really the least of your worries - having to ask the user is the elephant in the room (which means re-rendering the whole UI, or sending some HTML through the internet). So if you don't follow the anti-pattern of "Program Flow with Exceptions", don't bother, just throw one if it makes sense. Even in borderline cases, such as "Validation Exception", performance is really not an issue, since you have to ask the user again, in any case.

An Asynchronous TryGet Pattern:
For synchronous methods, I believe #Johann Gerell's answer is the pattern to use in all cases.
However the TryGet pattern with the out parameter does not work with Async methods.
With C# 7's Tuple Literals you can now do this:
async Task<(bool success, SomeObject o)> TryGetSomeObjectByIdAsync(Int32 id)
{
if (InternalIdExists(id))
{
o = await InternalGetSomeObjectAsync(id);
return (true, o);
}
else
{
return (false, default(SomeObject));
}
}

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.