When should one try to eliminate a switch statement? [duplicate] - c#
This question already has answers here:
Is Switch (Case) always wrong?
(8 answers)
Closed 9 years ago.
I've come across a switch statement in the codebase I'm working on and I'm trying to figure out how to replace it with something better, since switch statements are considered a code smell. However, having read through several posts on Stack Overflow about replacing switch statements, I can't seem to think of an effective way to replace this particular one.
It's left me wondering whether this particular switch statement is OK, and whether there are particular circumstances where switch statements are considered appropriate.
In my case the code (slightly obfuscated naturally) that I'm struggling with is like this:
private MyType DoSomething(IDataRecord reader)
{
    var p = new MyType
    {
        Id = (int)reader[idIndex],
        Name = (string)reader[nameIndex]
    };
    switch ((string)reader[discountTypeIndex])
    {
        case "A":
            p.DiscountType = DiscountType.Discountable;
            break;
        case "B":
            p.DiscountType = DiscountType.Loss;
            break;
        case "O":
            p.DiscountType = DiscountType.Other;
            break;
    }
    return p;
}
Can anyone suggest a way to eliminate this switch? Or is this an appropriate use of a switch? And if it is, are there other appropriate uses for switch statements? I'd really like to know where they are appropriate so I don't waste too much time trying to eliminate every switch statement I come across just because they are considered a smell in some circumstances.
Update: At the suggestion of Michael I did a bit of searching for duplication of this logic and discovered that someone had created logic in another class that effectively made the whole switch statement redundant. So in the context of this particular bit of code the switch statement was unnecessary. However, my question is more about the appropriateness of switch statements in code and whether we should always try to replace them whenever they are found so in this case I'm inclined to accept the answer that this switch statement is appropriate.
This is an appropriate use for a switch statement: it makes the choices readable, and it is easy to add or remove a case.
See this link.
Switch statements (especially long ones) are considered bad, not because they are switch statements, but because their presence suggests a need to refactor.
The problem with switch statements is they create a bifurcation in your code (just like an if statement does). Each branch must be tested individually, and each branch within each branch and... well, you get the idea.
That said, the following article has some good practices on using switch statements:
http://elegantcode.com/2009/01/10/refactoring-a-switch-statement/
In the case of your code, the article in the above link suggests that, if you're performing this type of conversion from one enumeration to another, you should put your switch in its own method, and use return statements instead of the break statements. I've done this before, and the code looks much cleaner:
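private DiscountType GetDiscountType(string discount)
{
    switch (discount)
    {
        case "A": return DiscountType.Discountable;
        case "B": return DiscountType.Loss;
        case "O": return DiscountType.Other;
        // Without a default the method does not compile: not all code paths return a value.
        default: throw new ArgumentException("Unknown discount code: " + discount, "discount");
    }
}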
I think changing code for the sake of changing code is not the best use of one's time. Changing code to make it more readable, faster, more efficient, and so on makes sense. Don't change it merely because someone says you're doing something 'smelly'.
-Rick
This switch statement is fine. Do you guys not have any other bugs to attend to? lol
However, there is one thing I noticed: you shouldn't be passing index ordinals to the IDataRecord indexer. What if the column order changes? Try using field names instead, i.e. reader["id"] and reader["name"].
In my opinion, it's not switch statements that are the smell, it's what's inside them. This switch statement is OK to me until it gains a couple more cases. Then it may be worth creating a lookup table:
private static Dictionary<string, DiscountType> DiscountTypeLookup =
new Dictionary<string, DiscountType>(StringComparer.Ordinal)
{
{"A", DiscountType.Discountable},
{"B", DiscountType.Loss},
{"O", DiscountType.Other},
};
Depending on your point-of-view, this may be more or less readable.
Where things start getting smelly is if the contents of your case are more than a line or two.
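As a rough illustration of how such a lookup table might be consulted, here is a minimal sketch. The DiscountType members come from the question, but the DiscountTypes class and the FromCode method are hypothetical names, and throwing on an unknown code is an assumption about the desired behavior rather than something the original code does.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the enum; the question does not show its definition.
public enum DiscountType { Discountable, Loss, Other }

public static class DiscountTypes
{
    // The same mapping as the switch, expressed as data.
    private static readonly Dictionary<string, DiscountType> Lookup =
        new Dictionary<string, DiscountType>(StringComparer.Ordinal)
        {
            { "A", DiscountType.Discountable },
            { "B", DiscountType.Loss },
            { "O", DiscountType.Other },
        };

    // Hypothetical helper: translates a database code into the enum value.
    public static DiscountType FromCode(string code)
    {
        DiscountType result;
        if (code != null && Lookup.TryGetValue(code, out result))
        {
            return result;
        }
        // Assumption: an unknown code is an error, so fail loudly.
        throw new ArgumentException("Unknown discount code: " + code, "code");
    }
}
```

With this in place the body of DoSomething could shrink to a single assignment such as p.DiscountType = DiscountTypes.FromCode((string)reader[discountTypeIndex]).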
Robert Harvey and Talljoe have provided excellent answers - what you have here is a mapping from a character code to an enumerated value. This is best expressed as a mapping where the details of the mapping are provided in one place, either in a map (as Talljoe suggests) or in a function that uses a switch statement (as suggested by Robert Harvey).
Both of those techniques are probably fine in this case, but I'd like to draw your attention to a design principle that may be useful here or in other similar cases, the Open/Closed principle:
http://en.wikipedia.org/wiki/Open/closed_principle
http://www.objectmentor.com/resources/articles/ocp.pdf (make sure you read this!)
If the mapping is likely to change over time, or possibly be extended at runtime (e.g. through a plugin system or by reading parts of the mapping from a database), then using the Registry Pattern will help you adhere to the open/closed principle, in effect allowing the mapping to be extended without affecting any code that uses the mapping (as they say: open for extension, closed for modification).
I think this is a nice article on the Registry Pattern - see how the registry holds a mapping from some key to some value? In that way it's similar to your mapping expressed as a switch statement. Of course, in your case you will not be registering objects that all implement a common interface, but you should get the gist:
http://sinnema313.wordpress.com/2009/03/01/the-registry-pattern/
So, to answer the original question: the case statement is poor form because I expect the mapping from the character code to an enumerated value will be needed in multiple places in your application, so it should be factored out. The two answers I referenced give you good advice on how to do that; take your pick as to which you prefer. If, however, the mapping is likely to change over time, consider the Registry Pattern as a way of insulating your code from the effects of such change.
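To make the registry idea concrete for this mapping, here is a minimal sketch. DiscountTypeRegistry, Register and Resolve are hypothetical names chosen for illustration; they are not taken from the linked article, and the enum shape is assumed.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the enum; the question does not show its definition.
public enum DiscountType { Discountable, Loss, Other }

// Hypothetical registry: startup code or plugins register codes,
// and callers resolve them without knowing where the mapping came from.
public sealed class DiscountTypeRegistry
{
    private readonly Dictionary<string, DiscountType> entries =
        new Dictionary<string, DiscountType>(StringComparer.Ordinal);

    public void Register(string code, DiscountType type)
    {
        if (code == null) throw new ArgumentNullException("code");
        entries[code] = type; // a later registration replaces an earlier one
    }

    public DiscountType Resolve(string code)
    {
        DiscountType type;
        if (code != null && entries.TryGetValue(code, out type))
        {
            return type;
        }
        throw new KeyNotFoundException("No discount type registered for code: " + code);
    }
}
```

Code that calls Resolve stays unchanged when a new code is registered, which is the "closed for modification" half of the principle; whether that flexibility is worth the extra indirection depends on how often the mapping really changes.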
I wouldn't use an if. An if would be less clear than the switch. The switch is telling me that you are comparing the same thing throughout.
Just to scare people, this is less clear than your code:
if ((string)reader[discountTypeIndex] == "A")
    p.DiscountType = DiscountType.Discountable;
else if ((string)reader[discountTypeIndex] == "B")
    p.DiscountType = DiscountType.Loss;
else if ((string)reader[discountTypeIndex] == "O")
    p.DiscountType = DiscountType.Other;
This switch may be OK; you might want to look at @Talljoe's suggestion.
Are switches on discount type located throughout your code? Would adding a new discount type require you to modify several such switches? If so you should look into factoring the switch out. If not, using a switch here should be safe.
If there is a lot of discount specific behavior spread throughout your program, you might want to refactor this like:
p.Discount = DiscountFactory.Create(reader[discountTypeIndex]);
Then the discount object contains all the attributes and methods related to figuring out discounts.
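A sketch of what such a factory could look like follows. Only DiscountFactory.Create appears in the answer above; the Discount base class, its subclasses, the Apply method and the percentages are all invented for illustration.

```csharp
using System;

// Hypothetical base type: each kind of discount owns its own behavior.
public abstract class Discount
{
    public abstract decimal Apply(decimal price);
}

public sealed class DiscountableDiscount : Discount
{
    // Made-up rule for illustration: 10% off.
    public override decimal Apply(decimal price) { return price * 0.9m; }
}

public sealed class LossDiscount : Discount
{
    // Made-up rule for illustration: 50% off.
    public override decimal Apply(decimal price) { return price * 0.5m; }
}

public sealed class OtherDiscount : Discount
{
    // Made-up rule for illustration: price unchanged.
    public override decimal Apply(decimal price) { return price; }
}

public static class DiscountFactory
{
    // The only switch on the discount code lives here.
    public static Discount Create(string code)
    {
        switch (code)
        {
            case "A": return new DiscountableDiscount();
            case "B": return new LossDiscount();
            case "O": return new OtherDiscount();
            default: throw new ArgumentException("Unknown discount code: " + code, "code");
        }
    }
}
```

The switch does not disappear, but it is confined to one place; everywhere else the code calls methods on the Discount object instead of branching on its type.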
You are right to suspect this switch statement: any switch statement that is contingent on the type of something may be indicative of missing polymorphism (or missing subclasses).
TallJoe's dictionary is a good approach, however.
Note that if your enum and database values were integers instead of strings, or if your database values were the same as the enum names, then reflection would work, e.g. given
public enum DiscountType : int
{
Unknown = 0,
Discountable = 1,
Loss = 2,
Other = 3
}
then
p.DiscountType = (DiscountType)Enum.Parse(typeof(DiscountType),
    (string)reader[discountTypeIndex]);
would suffice.
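One caveat with parsing is that a numeric string that matches no member still parses successfully. The following is a minimal sketch of a stricter variant, assuming the enum shown above; DiscountTypeParser and its Parse method are hypothetical names.

```csharp
using System;

public enum DiscountType : int
{
    Unknown = 0,
    Discountable = 1,
    Loss = 2,
    Other = 3
}

public static class DiscountTypeParser
{
    // Accepts either a member name ("Loss") or its numeric value ("2").
    public static DiscountType Parse(string text)
    {
        DiscountType parsed;
        // TryParse succeeds for any numeric string, so IsDefined is used
        // to reject values such as "42" that match no member.
        if (Enum.TryParse<DiscountType>(text, out parsed)
            && Enum.IsDefined(typeof(DiscountType), parsed))
        {
            return parsed;
        }
        throw new ArgumentException("Not a valid DiscountType: " + text, "text");
    }
}
```

Note that this only helps when the database stores enum names or numbers; the single-letter codes in the question would still need a switch or a lookup table.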
Yes, this looks like a correct usage of switch statement.
However, I have another question for you.
Why haven't you included a default label? Throwing an exception in the default label makes sure the program fails properly when a new discount type code is added and someone forgets to modify the code.
Also, if you wanted to map a string value to an Enum, you can use Attributes and reflection.
Something like:
public enum DiscountType
{
None,
[Description("A")]
Discountable,
[Description("B")]
Loss,
[Description("O")]
Other
}
public DiscountType GetDiscountType(string discountTypeIndex)
{
    foreach (DiscountType type in Enum.GetValues(typeof(DiscountType)))
    {
        // Implementing GetDescription should be easy. Search on Google.
        if (string.Compare(discountTypeIndex, GetDescription(type)) == 0)
            return type;
    }
    throw new ArgumentException("DiscountTypeIndex " + discountTypeIndex + " is not valid.");
}
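The GetDescription helper is left unimplemented above. One possible sketch, using reflection over the enum's fields, is shown here; the EnumDescriptions class name is hypothetical, and returning null for members without a [Description] attribute is an assumption.

```csharp
using System;
using System.ComponentModel;
using System.Reflection;

public enum DiscountType
{
    None,
    [Description("A")]
    Discountable,
    [Description("B")]
    Loss,
    [Description("O")]
    Other
}

public static class EnumDescriptions
{
    // Returns the [Description] text of an enum member,
    // or null when the member has no such attribute.
    public static string GetDescription(Enum value)
    {
        FieldInfo field = value.GetType().GetField(value.ToString());
        if (field == null)
        {
            return null; // the value is not a named member of the enum
        }
        var attribute = (DescriptionAttribute)Attribute.GetCustomAttribute(
            field, typeof(DescriptionAttribute));
        return attribute == null ? null : attribute.Description;
    }
}
```

Reflection on every lookup is comparatively slow, so if this ran per row it would probably be worth building a dictionary from the descriptions once and reusing it.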
I think this depends on whether you are creating MyType in many different places or only in this one. If you create MyType in many places, and each time have to switch on the discount type or perform some other checks, then this could be a code smell.
I would try to put the creation of MyType in a single spot in your program, maybe in the constructor of MyType itself or in some kind of factory method. Having random parts of your program assign values could lead to somebody not knowing what the values should be and doing something wrong.
So the switch is fine, but maybe it needs to be moved closer to where your type is created.
I'm not absolutely opposed to switch statements, but in the case you present, I'd have at least eliminated the duplication of assigning the DiscountType; I might have instead written a function that returns a DiscountType given a string. That function could have simply had the return statements for each case, eliminating the need for a break. I find the need for breaks between switch cases very treacherous.
private MyType DoSomething(IDataRecord reader)
{
    var p = new MyType
    {
        Id = (int)reader[idIndex],
        Name = (string)reader[nameIndex]
    };
    p.DiscountType = FindDiscountType((string)reader[discountTypeIndex]);
    return p;
}
private DiscountType FindDiscountType(string key)
{
    switch (key)
    {
        case "A":
            return DiscountType.Discountable;
        case "B":
            return DiscountType.Loss;
        case "O":
            return DiscountType.Other;
        default:
            // handle the default case as appropriate; throwing is one option
            throw new ArgumentException("Unknown discount code: " + key, "key");
    }
}
Pretty soon, I'd have noticed that FindDiscountType() really belongs to the DiscountType class and moved the function.
When you design a language and finally have a chance to remove the ugliest, most non-intuitive error prone syntax in the whole language.
THAT is when you try and remove a switch statement.
Just to be clear, I mean the syntax. This is something taken from C/C++ which should have been changed to conform with the more modern syntax in C#. I wholeheartedly agree with the concept of providing the switch so the compiler can optimise the jump.
Related
I was surprised this code compiled (C#). Curious of why? [duplicate]
This question already has answers here: Case Statement Block Level Declaration Space in C# (6 answers) Closed 12 months ago.
Found this construction while reviewing some code and I was expecting it to not compile at all, tbh. Any reasons why this is permitted?
int i = 0;
switch (i)
{
    case 0:
        int k = 0;
        break;
    case 1:
        k = 1;
        break;
}
Edit: even more strange, adding Console.Out.WriteLine(k); after case 1: gives the error use of unassigned variable 'k'...
Any reasons why this is permitted?
We probably cannot say for certain. The essential answer is "because it is" or "because your reasons for thinking it shouldn't be differ from the thinking of those who designed the language", but we can't really speak to questions like "what were Microsoft thinking when they designed it such that...", unless perhaps someone is one of the privileged few to have sat in that design meeting and can be authoritative.
SharpLab.io will, however, tell you what happens under the hood: it compiles then decompiles your code and shows you the result, so you can get an idea of what your code was transformed into by the compiler.
Note: I swapped your numbers for other, non-default ones so that what was what can still be identified after the compiler changes the names.
A lot of the code you write is syntactic sugar for something else; here you can see your int k isn't buried within the switch, scoped to only "within the first case", but transformed into something else entirely. It's thus legal C# because nothing in the language forbids it, and you can rationalize that in a way that will help you remember it.
In a similar way, perhaps this looks like it shouldn't work:
object o = "";
if (o is string s) { }
s = "";
s looks, to me, like it's created within the scope of the if, yet it's accessible outside the if. You'll find a similar explanatory transformation if you run that through SharpLab.
'||' operators to be used between two enums in a switch statement
I want to check if a function returns either of two enum values, each of the same enumeration type. For simplicity's sake, I attempted to create a case as follows:
case EnumerationTypeExample.TypeA || EnumerationTypeExample.TypeB:
Unfortunately, this does not please C#, which says I cannot use an '||' operator despite them being of the same type; strange. Might there be a way this could be done otherwise? An if statement perhaps might work and I may retreat to that, however, I would much rather use a switch statement if possible.
Case labels must be constants, not computations. However, you can stack the case labels so that both share one body:
switch (something)
{
    case EnumerationTypeExample.TypeA:
    case EnumerationTypeExample.TypeB:
    {
        DoSomething();
        break;
    }
}
Now the code will run in both situations.
Counting the number of cases in a switch statement
I would like to have a C# method which counts the number of cases in a switch statement. (Let's call it CaseCounter.) For example, let's say you have this enum and these two methods:
enum Painter { Rubens, Rembrandt, Vermeer, Picasso, Kandinsky }
int GetYearOfBirth(Painter painter)
{
    switch (painter)
    {
        case Painter.Kandinsky: return 1866;
        case Painter.Picasso: return 1881;
        case Painter.Rembrandt: return 1606;
        case Painter.Rubens: return 1577;
        case Painter.Vermeer: return 1632;
        default: return 0;
    }
}
bool IsModern(Painter painter)
{
    switch (painter)
    {
        case Painter.Kandinsky:
        case Painter.Picasso:
            return true;
        default:
            return false;
    }
}
Now the following equalities should hold:
CaseCounter("GetYearOfBirth") == 5
CaseCounter("IsModern") == 2
(It is not important whether or not you include the default case in the count. Also, the parameter to CaseCounter doesn't need to be a string; any type which can somehow be used to represent a method will do.)
So, how would you go about implementing CaseCounter? Is it even possible?
--- ADDENDUM (EDIT) ---
Here's a bit of background info in case you're wondering why I asked this question. I'm maintaining a code base which contains a lot of methods that switch on enums. The methods are spread across various classes; this makes life difficult when an enum is extended (for example, when you add a new Painter). To make maintenance a little bit easier, I have written some unit tests of the following form:
// If this test fails please check that the following methods are still OK:
// MyClass.GetYearOfBirth, MyOtherClass.IsModernAllCases .
// (There is no guarantee that the above list is up-to-date or complete.)
[Test]
public void PainterCountTest()
{
    int numberOfMembers = Enum.GetValues(typeof(NotificationID)).Length;
    Assert.AreEqual(5, numberOfMembers);
}
(In this example, IsModernAllCases is just a variation of IsModern which refers explicitly to all 5 possible values of the Painter enum.)
This test is better than nothing, but it's clumsy. It would be a little less clumsy if you could write something like this:
[Test]
public void PainterCountTest()
{
    int numberOfMembers = Enum.GetValues(typeof(NotificationID)).Length;
    int numberOfCases_getYearOfBirth = CaseCounter("MyClass.GetYearOfBirth");
    Assert.AreEqual(numberOfCases_getYearOfBirth, numberOfMembers);
    int numberOfCases_modern = CaseCounter("MyOtherClass.IsModernAllCases");
    Assert.AreEqual(numberOfCases_modern, numberOfMembers);
}
In this scenario, at least you don't have to modify the unit test when you extend the enum.
It should be possible to do with the Roslyn CTP, Microsoft's compiler-as-a-service pre-release product. It has APIs to let you inspect C# code and represent it as a tree of instructions. Unfortunately, it is a CTP, which may or may not be a problem for you; just remember it is pre-release software if you try to use it. If you can't use Roslyn, I think the only way would be to inspect the generated IL. A daunting task, to say the least. I haven't got any sample code for that, but if you want to try it, I would start by looking at the Mono.Cecil project, which has some APIs for inspecting IL. You could also use MethodInfo.GetMethodBody to get the raw IL bytes and then parse those bytes yourself, but it does require some work to do that. Also, see this related question.
This can't be done via Reflection, but could be done via the Roslyn CTP. Roslyn provides the tooling required to analyze the source code itself, and determine information about it. You could walk the trees in this code to find the methods containing switch statements, and make a count of individual cases.
Without knowing why you are doing this, it's tough to know if this will be an acceptable approach for you. Since these C# files are basically just text files on your computer, you could create a separate application that loops through all the files. It would need to recognize methods, count the number of occurrences of case words (that aren't comments), and report the results. Now, if you need this functionality to run at runtime, within your app, then this approach won't work.
Doing a Find in Visual Studio on the word case is enough to carry out your refactoring.
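A different angle, not proposed in the answers above, is to skip counting cases altogether and have the test call the method with every enum member. This is only a sketch under one big assumption: the switch's default branch must throw instead of returning a fallback such as 0. SwitchCoverage and HandlesEveryMember are hypothetical names.

```csharp
using System;

public enum Painter { Rubens, Rembrandt, Vermeer, Picasso, Kandinsky }

public static class Painters
{
    public static int GetYearOfBirth(Painter painter)
    {
        switch (painter)
        {
            case Painter.Kandinsky: return 1866;
            case Painter.Picasso: return 1881;
            case Painter.Rembrandt: return 1606;
            case Painter.Rubens: return 1577;
            case Painter.Vermeer: return 1632;
            // Assumption: unhandled members throw instead of returning 0.
            default: throw new ArgumentOutOfRangeException("painter", painter, "Unhandled painter");
        }
    }
}

public static class SwitchCoverage
{
    // Calls the action once per enum member and reports whether
    // every call completed without hitting the throwing default.
    public static bool HandlesEveryMember<TEnum>(Action<TEnum> action) where TEnum : struct
    {
        foreach (TEnum value in Enum.GetValues(typeof(TEnum)))
        {
            try
            {
                action(value);
            }
            catch (ArgumentOutOfRangeException)
            {
                return false;
            }
        }
        return true;
    }
}
```

A unit test could then assert SwitchCoverage.HandlesEveryMember<Painter>(p => Painters.GetYearOfBirth(p)), and it would start failing as soon as a Painter is added without a matching case.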
Why does the c# compiler create a PrivateImplementationDetails from this code?
I've discovered that the following code:
public static class MimeHelper
{
    public static string GetMimeType(string strFileName)
    {
        string retval;
        switch (System.IO.Path.GetExtension(strFileName).ToLower())
        {
            case ".3dm": retval = "x-world/x-3dmf"; break;
            case ".3dmf": retval = "x-world/x-3dmf"; break;
            case ".a": retval = "application/octet-stream"; break;
            // etc...
            default: retval = "application/octet-stream"; break;
        }
        return retval;
    }
}
causes the compiler to create this namespaceless, internal class (copied from Reflector):
<PrivateImplementationDetails>{621DEE27-4B15-4773-9203-D6658527CF2B}
- $$method0x60000b0-1 : Dictionary<String, Int32>
- Used By: MimeHelper.GetMimeType(String) : String
Why is that? How would I change the above code so it doesn't happen (just out of interest)?
Thanks
Andrew
It's creating the dictionary to handle the lookups of the various cases in the switch statement, instead of making several branching ifs out of it to set the return value. Trust me: you don't want to change how it's doing it, unless you want to make the map explicit.
ASIDE: I had originally assumed that the dictionary stored a map from each case to an index into another map for the return values. According to @Scott (see comments), it actually stores an index to a label for the code that should be executed for that case. This makes absolute sense when you consider that the code that would be executed for each case may differ and may be much longer than in the given example.
EDIT: Based on your comment, I think I might be tempted to store the mappings in an external configuration file, read them in during start up, and construct the actual map: either a single level map from key to value or a similar multilevel map from key to index and index to value. I think it would be easier to maintain these mappings in a configuration file than to update the code every time you needed to add or remove a particular case.
What is happening is that the compiler is creating an internal class that it emits at compile time. This class is called <PrivateImplementationDetails>{99999999-9999-9999-9999-999999999999}; the GUID component of the name is generated at compile time, so it changes with every build. Inside this class there is a dictionary that contains the different case values and an int corresponding to each one. The compiler then replaces the switch statement with a lookup in the dictionary to get the corresponding int, and does a switch on the int value (much more efficient than doing a bunch of string compares).
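Written out by hand, the transformation described above might look roughly like the following. This is an approximation for illustration only, not actual compiler output; the MimeHelperLowered class and the caseIndex field are hypothetical stand-ins for the generated members, and other compiler versions may use a different strategy entirely.

```csharp
using System;
using System.Collections.Generic;

public static class MimeHelperLowered
{
    // Stands in for the generated Dictionary<String, Int32>.
    private static Dictionary<string, int> caseIndex;

    public static string GetMimeType(string extension)
    {
        // Built lazily on first use, then reused for every call.
        if (caseIndex == null)
        {
            caseIndex = new Dictionary<string, int>
            {
                { ".3dm", 0 },
                { ".3dmf", 1 },
                { ".a", 2 },
            };
        }

        int index;
        if (extension != null && caseIndex.TryGetValue(extension, out index))
        {
            // Switching on a small integer is cheap compared with
            // a chain of string comparisons.
            switch (index)
            {
                case 0: return "x-world/x-3dmf";
                case 1: return "x-world/x-3dmf";
                case 2: return "application/octet-stream";
            }
        }
        return "application/octet-stream"; // the original default case
    }
}
```

Seen this way, the generated class is just the explicit lookup table from the earlier answers, built by the compiler on your behalf.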
See this, for example.
What is the preferred method for handling unexpected enum values?
Suppose we have a method that accepts a value of an enumeration. After this method checks that the value is valid, it switches over the possible values. So the question is, what is the preferred method of handling unexpected values after the value range has been validated? For example:
enum Mood { Happy, Sad }
public void PrintMood(Mood mood)
{
    if (!Enum.IsDefined(typeof(Mood), mood))
    {
        throw new ArgumentOutOfRangeException("mood");
    }
    switch (mood)
    {
        case Mood.Happy: Console.WriteLine("I am happy"); break;
        case Mood.Sad: Console.WriteLine("I am sad"); break;
        default: // what should we do here?
    }
}
What is the preferred method of handling the default case?
Leave a comment like // can never happen
Debug.Fail() (or Debug.Assert(false))
throw new NotImplementedException() (or any other exception)
Some other way I haven't thought of
I guess most of the above answers are valid, but I'm not sure any are correct. The correct answer is, you very rarely switch in an OO language; it indicates you are doing your OO wrong. In this case, it's a perfect indication that your Enum class has problems. You should just be calling Console.WriteLine(mood.moodMessage()), and defining moodMessage for each of the states. If a new state is added--All Your Code Should Adapt Automatically, nothing should fail, throw an exception or need changes.
Edit: response to comment. In your example, to be "Good OO" the functionality of the file mode would be controlled by the FileMode object. It could contain a delegate object with "open, read, write..." operations that are different for each FileMode, so File.open("name", FileMode.Create) could be implemented as (sorry about the lack of familiarity with the API):
open(String name, FileMode mode)
{
    // May throw an exception if, for instance, mode is Open and file doesn't exist
    // May also create the file depending on Mode
    FileHandle fh = mode.getHandle(name);
    ... code to actually open fh here...
    // Let Truncate and append do their special handling
    mode.setPosition(fh);
}
This is much neater than trying to do it with switches... (by the way, the methods would be both package-private and possibly delegated to "Mode" classes)
When OO is done well, every single method looks like a few lines of really understandable, simple code--TOO simple. You always get the feeling that there is some big messy "Cheese Nucleus" holding together all the little nacho objects, but you can't ever find it--it's nachos all the way down...
I prefer to throw new NotImplementedException("Unhandled Mood: " + mood). The point is that the enumeration may change in the future, and this method may not be updated accordingly. Throwing an exception seems to be the safest method. I don't like the Debug.Fail() method, because the method may be part of a library, and the new values might not be tested in debug mode. Other applications using that library can face weird runtime behaviour in that case, while in the case of throwing an exception the error will be known immediately. Note: NotImplementedException exists in commons.lang.
In Java, the standard way is to throw an AssertionError, for two reasons: This ensures that even if asserts are disabled, an error is thrown. You're asserting that there are no other enum values, so AssertionError documents your assumptions better than NotImplementedException (which Java doesn't have anyway).
My opinion is that since it is a programmer error you should either assert on it or throw a RuntimeException (Java, or whatever the equivalent is for other languages). I have my own UnhandledEnumException that extends from RuntimeException that I use for this.
The correct program response would be to die in a manner that will allow the developer to easily spot the problem. mmyers and JaredPar both gave good ways to do that. Why die? That seems so extreme! The reason being that if you're not handling an enum value properly and just fall through, you're putting your program into an unexpected state. Once you're in an unexpected state, who knows what's going on. This can lead to bad data, errors that are harder to track down, or even security vulnerabilities. Also, if the program dies, there's a much greater chance that you're going to catch it in QA and thus it doesn't even go out the door.
For pretty much every switch statement in my code base, I have the following default case:
switch( value )
{
    ...
    default:
        Contract.InvalidEnumValue(value);
}
The method will throw an exception detailing the value of the enum at the point an error was detected.
public static void InvalidEnumValue<T>(T value) where T: struct
{
    ThrowIfFalse(typeof(T).IsEnum, "Expected an enum type");
    Violation("Invalid Enum value of Type {0} : {1}", new object[] { typeof(T).Name, value });
}
For C#, something worth knowing is that Enum.IsDefined() is dangerous. You can't rely on it like you are. Getting something not of the expected values is a good case for throwing an exception and dying loudly. In Java, it's different because enums are classes not integers so you really can't get unexpected values (unless the enum is updated and your switch statement isn't), which is one big reason why I much prefer Java enums. You also have to cater for null values. But getting a non-null case you don't recognize is a good case for throwing an exception too.
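To illustrate why a C# switch can see values outside the declared members, here is a small sketch. It assumes the Mood enum from the question; MoodPrinter and Describe are hypothetical names, and the choice of ArgumentOutOfRangeException is just one of the options discussed in these answers.

```csharp
using System;

public enum Mood { Happy, Sad }

public static class MoodPrinter
{
    public static string Describe(Mood mood)
    {
        switch (mood)
        {
            case Mood.Happy: return "I am happy";
            case Mood.Sad: return "I am sad";
            default:
                // Reached for values such as (Mood)42, or for members
                // added to the enum after this switch was written.
                throw new ArgumentOutOfRangeException("mood", mood, "Unhandled Mood value");
        }
    }
}
```

Because a C# enum is an integer underneath, a cast such as (Mood)42 compiles and runs without complaint, and Enum.IsDefined(typeof(Mood), (Mood)42) returns false. The reverse problem is the subtler one: a newly added member passes the IsDefined check but still has no case, so the throwing default remains the real safety net.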
You could have a trace for the default calling out the value of the passed enum. Throwing exceptions is OK but in a large application there will be several places where your code does not care about other values of the enum. So, unless you are sure that the code intends to handle all possible values of the enum, you'll have to go back later and remove the exception.
This is one of those questions that proves why test driven development is so important. In this case I'd go for a NotSupportedException because literally the value was unhandled and therefore not supported. A NotImplementedException gives more the idea of: "This is not finished yet" ;) The calling code should be able to handle a situation like this and unit tests can be created to easily test these kind of situations.
Call it an opinion or a preference, but the idea behind the enum is that it represents the full list of possible values. If an "unexpected value" is being passed around in code, then the enum (or purpose behind the enum) is not up to date. My personal preference is that every enum carry a default assignment of Undefined. Given that the enum is a defined list, it should never be out-of-date with your consuming code. As far as what to do if your function is getting either an unexpected value or Undefined in my case, a generic answer doesn't seem possible. For me, it depends on the context of the reason for evaluating the enum value: is it a situation where code execution should halt, or can a default value be used?
It is the responsibility of the calling function to provide valid input, and implicitly anything not in the enum is invalid (Pragmatic Programmer seems to imply this). That said, this implies that any time you change your enum, you must change ALL code that accepts it as input (and SOME code that yields it as output). But that is probably true anyway. If you have an enum that changes often, you probably should be using something other than an enum, considering that enums are normally compile-time entities.
I usually try to define an undefined value (0):
enum Mood { Undefined = 0, Happy, Sad }
That way I can always say:
switch (mood)
{
    case Mood.Happy: Console.WriteLine("I am happy"); break;
    case Mood.Sad: Console.WriteLine("I am sad"); break;
    case Mood.Undefined:
        // handle undefined case
        break;
    default:
        // at this point it is obvious that there is an unknown problem
        // -> throw -> die :-)
        throw new ArgumentOutOfRangeException("mood");
}
This is at least how I usually do this.