Replacing If Else unique conditional nested statements - c#

Switch case statements are good to replace nested if statements if we have the same condition but different criteria. But what is a good approach if those nested if statements all have different and unique conditions? Do I have any alternate options to replace a dozen if else statements nested inside each other?
Sample Code:
Note: I know this is extremely unreadable - which is the whole point.
Note: All conditions are unique.
...
if (condition) {
// do A
} else {
if (condition) {
// do B
if (condition) {
if (condition) {
if (condition) {
// do C
if (condition) {
// do D
if (condition) {
// do E
} else {
if (condition) {
// do F
}
}
}
}
if (condition) {
// do G
if (condition) {
// do H
if (condition) {
// do I
} else {
// do J
}
}
}
}
}
}

The best approach in this case is to chop up the thing into appropriately named separate methods.
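As a rough sketch of that idea (not the asker's real logic): the Order type, its properties, and the method names below are all hypothetical stand-ins for the anonymous conditions in the question. Each former nesting level becomes a small named method with a guard clause at the top.

```csharp
using System.Collections.Generic;

// Hypothetical type; the properties stand in for the question's
// anonymous "condition" expressions.
public sealed class Order
{
    public bool IsCancelled { get; set; }
    public bool IsPaid { get; set; }
    public bool IsInStock { get; set; }
    public bool IsExpress { get; set; }
}

public static class OrderProcessor
{
    // The top level now reads as a short list of named steps.
    public static List<string> Process(Order order)
    {
        var log = new List<string>();
        if (order.IsCancelled)
        {
            log.Add("A: refund");
            return log;
        }
        if (!order.IsPaid) return log;

        log.Add("B: reserve");
        ShipIfPossible(order, log);
        return log;
    }

    // A former inner block, extracted and given its own guard clause.
    private static void ShipIfPossible(Order order, List<string> log)
    {
        if (!order.IsInStock) return;
        log.Add("C: pack");
        log.Add(order.IsExpress ? "D: courier" : "E: standard post");
    }
}
```

No method here goes more than one level deep, and the method names document what each group of conditions is for.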

I had to check this was Stack Overflow and not The Daily WTF when I saw the code!
The solution is to change the architecture and use interfaces and polymorphism to get around all the conditions. However, that may be a huge job and out of scope for an acceptable answer, so I will recommend another way: you can get something like switch-style handling of unique conditions with a flags enum:
[Flags]
public enum FilterFlagEnum
{
None = 0,
Condition1 = 1,
Condition2 = 2,
Condition3 = 4,
Condition4 = 8,
Condition5 = 16,
Condition6 = 32,
Condition7 = 64
};
public void foo(FilterFlagEnum filterFlags = 0)
{
if ((filterFlags & FilterFlagEnum.Condition1) == FilterFlagEnum.Condition1)
{
//do this
}
if ((filterFlags & FilterFlagEnum.Condition2) == FilterFlagEnum.Condition2)
{
//do this
}
}
foo(FilterFlagEnum.Condition1 | FilterFlagEnum.Condition2);
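For what it's worth, Enum.HasFlag (available since .NET 4) expresses the same mask-and-compare test a little more readably. A minimal sketch, using a shortened enum and a hypothetical Foo that returns the names of the branches it took so the behaviour can be checked:

```csharp
using System;
using System.Collections.Generic;

[Flags]
public enum FilterFlags
{
    None = 0,
    Condition1 = 1,
    Condition2 = 2,
    Condition3 = 4
}

public static class FlagDemo
{
    // Returns the names of the branches taken (illustration only).
    public static List<string> Foo(FilterFlags flags)
    {
        var actions = new List<string>();
        // flags.HasFlag(x) is equivalent to (flags & x) == x.
        if (flags.HasFlag(FilterFlags.Condition1)) actions.Add("one");
        if (flags.HasFlag(FilterFlags.Condition2)) actions.Add("two");
        if (flags.HasFlag(FilterFlags.Condition3)) actions.Add("three");
        return actions;
    }
}
```

Calling FlagDemo.Foo(FilterFlags.Condition1 | FilterFlags.Condition3) takes the first and third branches only.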

@Tar suggested one way of looking at it. Another might be to invert it.
if (myObject.HasThing1)
{
if(myObject.HasThing2)
{
DoThing1();
}
else
{
DoThing2();
}
}
else
{
DoThing3();
}
could be
DoThing1(myObject.HasThing1);
DoThing2(myObject.HasThing2);
DoThing3(myObject.HasThing3);
So each Do method makes the minimum number of tests; if any fail, then it does nothing.
You can make it a bit cleverer, in a few ways, if you want to break out of the sequence.
No idea whether it would work for you, but delegating the testing of the conditions is often enough of a new way of looking at things that some simplifying factor might just appear, as if by magic.
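One possible way to "break out of the sequence", sketched under assumptions: each step returns a bool saying whether the chain should continue, and && does the short-circuiting. The Thing type and the log list are hypothetical, added only so the flow is observable.

```csharp
using System.Collections.Generic;

// Hypothetical object; HasThing1..3 mirror the property names used above.
public sealed class Thing
{
    public bool HasThing1 { get; set; }
    public bool HasThing2 { get; set; }
    public bool HasThing3 { get; set; }
}

public static class Steps
{
    // Each step tests its own condition and reports whether to continue.
    private static bool DoThing1(Thing t, List<string> log)
    {
        if (!t.HasThing1) return false;
        log.Add("thing1");
        return true;
    }

    private static bool DoThing2(Thing t, List<string> log)
    {
        if (!t.HasThing2) return false;
        log.Add("thing2");
        return true;
    }

    private static bool DoThing3(Thing t, List<string> log)
    {
        if (!t.HasThing3) return false;
        log.Add("thing3");
        return true;
    }

    public static List<string> Run(Thing t)
    {
        var log = new List<string>();
        // && short-circuits: the first step that returns false ends the chain.
        bool completed = DoThing1(t, log) && DoThing2(t, log) && DoThing3(t, log);
        if (!completed) log.Add("stopped early");
        return log;
    }
}
```

Whether this matches the original nesting depends on the real conditions; it only fits when a failed test should stop everything that follows.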

In my view there are two main ways to eliminate nested conditions. The first applies to the more specific case where each nesting level has only one condition, as here:
function A(){
if (condition1){
if (condition2){
if (condition3){
// do something
}
}
}
}
We can simply return early whenever the opposite condition holds:
function A(){
if (condition1 == false) return;
if (condition2 == false) return;
if (condition3 == false) return;
// do something
}
The second uses condition decomposition and is more general than the first. Suppose we have a condition structure like this, for example:
if (condition1)
{
// do this 1
}
else
{
if (condition2)
{
// do this 2
}
}
We can introduce a variable for each particular condition, like this:
bool Cond1 = condition1;
bool Cond2 = !condition1 && condition2;
if (Cond1) { /* do this 1 */ }
if (Cond2) { /* do this 2 */ }

If that really is the business logic then the syntax is OK. But I have never seen business logic that complex. Draw up a flow chart and see if that cannot be simplified.
if (condition)
{
// do this
}
else
{
if (condition)
{
// do this
}
}
can be replaced with
if (condition)
{
// do this
}
else if (condition)
{
// do this
}
But again, step back and review the design. It needs more than just an else-if cleanup.

I feel your pain.
My situation required writing many (>2000) functional tests that had been customer-specified for a large, expensive piece of equipment. While most (>95%) of these tests are simple and have a straightforward pass/fail check, dozens fall into the "multiple nested if this do that else do something different" category, at depths similar to or worse than yours.
The solution I came up with was to host Windows Workflow within my test application.
All complex tests became Workflows that I run with the results reported back to my test app.
The customer was happy because they had the ability to:
Verify the test logic (hard for non programmers looking at deeply nested if/else C# - easy looking at a graphical flowchart)
Edit tests graphically
Add new tests
Hosting Windows Workflow (in .NET 4/4.5) is very easy - although it may take you a while to get your head around "communications" between the Workflows and your code - mostly because there are multiple ways to do it.
Good Luck

Related

C# else if confusion

I am currently studying conditional constructs. Correct me if I am wrong, but else if and else(if(){}) are the same thing... Example:
a=5;
if(a==6)
{
Console.WriteLine("Variable 'a' is 6");
}
else if(a==5)
{
Console.WriteLine("Variable 'a' is 5");
}
And
a=5;
if(a==6)
{
Console.WriteLine("Variable 'a' is 6");
}
else
{
if(a==5)
{
Console.WriteLine("Variable 'a' is 5");
}
}
Are these things the same? And if yes why does else if exist if I can write it the "second way"(the second example that I wrote)?
Yes, these are effectively identical.
The reason the "else if" statement exists is to make cleaner code when there are many conditions to test for. For example:
if (a==b) {
//blah
} else if (a==c) {
//blah
} else if (a==d) {
//blah
} else if (a==e) {
//blah
}
is much cleaner than the nested approach
if (a==b) {
//blah
} else {
if (a==c) {
//blah
} else {
if (a==d) {
//blah
} else {
if (a==e) {
//blah
}
}
}
}
why does else if exist
It doesn't. It's not a keyword or construct on its own. Your two examples are identical except that in the second case you've added some superfluous braces and whitespace into the code.
if and else are both simply followed by a single statement. In your first example the statement following the else is:
if(a==5)
{
Console.WriteLine("Variable 'a' is 5");
}
The second example just wraps that same statement in braces, and adds a new line at the start. The new line is ignored, so it doesn't change the semantics, and as the code is already a single statement, wrapping it in braces doesn't change it in any way.
Strictly speaking, there is no such thing as an else if statement. An "else if" is actually just in essence an else with a single line body that happens to be the start of an entirely separate if statement. You can visualize it like this:
var a = 5;
// This if uses a single line
if (a == 6) DoSomething();
// This else is a single line that is also a single-line if
else if (a == 4) DoAnotherThing();
// This else uses a single line as well, but is referring instead to the second if
else DoSomethingElse();
The code above is identical to the following:
if (a == 6)
{
DoSomething();
}
else
{
if (a == 4)
{
DoAnotherThing();
}
else
{
DoSomethingElse();
}
}
Or even this:
if (a == 6)
DoSomething();
else
if (a == 4)
DoAnotherThing();
else
DoSomethingElse();
The reason it is so commonly written as else if is that it complements the logical flow of the code. That, and it just looks so much prettier.
Mostly it makes the code cleaner and easier to read, and it keeps indentation under control, particularly if you have a convention on line length (without else if, a long chain of conditions ends up thirty indents deep, which is a pain to read). It also saves space: a few extra characters and indents across thousands of lines may not be much, but why add them if you don't have to? When the code compiles, both forms will be pretty much exactly the same in the DLLs anyway.

Skip first and last in IEnumerable, deferring execution

I have this huge JSON file, neatly formatted, starting with the characters "[\r\n" and ending with "]". I have this piece of code:
foreach (var line in File.ReadLines(@"d:\wikipedia\wikipedia.json").Skip(1))
{
if (line[0] == ']') break;
// Do stuff
}
I'm wondering, what would be best performance-wise, what machine code would be the most optimal in regards to how many clock cycles and memory is consumed if I were to compare the above code to one where I have replaced "break" with "continue", or would both of those pieces of code compile to the same MSIL and machine code? If you know the answer, please explain exactly how you reached your conclusion? I'd really like to know.
EDIT: Before you close this as nonsensical, consider that this code is equivalent to the above code and consider that the c# compiler optimizes when the code path is flat and does not fork in a lot of ways, would all of the following examples generate the same amount of work for the CPU?
IEnumerable<char> text = new[] {'[', 'a', 'b', 'c', ']'};
foreach (var c in text.Skip(1))
{
if (c == ']') break;
// Do stuff
}
foreach (var c in text.Skip(1))
{
if (c == ']') continue;
// Do stuff
}
foreach (var c in text.Skip(1))
{
if (c != ']')
{
// Do stuff
}
}
foreach (var c in text.Skip(1))
{
if (c != ']')
{
// Do stuff
}
}
foreach (var c in text.Skip(1))
{
if (c != ']')
{
// Do stuff
}
else
{
break;
}
}
EDIT2: Here's another way of putting it: what's the prettiest way to skip the first and last item in an IEnumerable while still deferring execution until //Do stuff?
Q: Different MSIL for break or continue in loop?
Yes, that's because it works like this:
foreach (var item in foo)
{
// more code...
if (...) { continue; } // jump to #1
if (...) { break; } // jump to #2
// more code...
// #1 -- just before the '}'
}
// #2 -- after the exit of the loop.
Q: What will give you the most performance?
Branches are branches for the compiler. If you have a goto, a continue or a break, it will eventually be compiled as a branch (opcode br), which will be analyzed as such. In other words: it doesn't make a difference.
What does make a difference is having predictable patterns of both data and code flow in the code. Branching breaks code flow, so if you want performance, you should avoid irregular branches.
In other words, prefer:
for (int i=0; i<10 && someCondition; ++i)
to:
for (int i=0; i<10; ++i)
{
// some code
if (someCondition) { ... }
// some code
}
As always with performance, the best thing to do is to run benchmarks. There's no substitute.
Q: What will give you the most performance? (#2)
You're doing a lot with IEnumerables. If you want raw performance and have the option, it's best to use an array or a string. There's no better alternative in terms of raw performance for sequential access of elements.
If an array isn't an option (for example because it doesn't match the access pattern), it's best to use a data structure that best suits the access pattern. Learn about the characteristics of hash tables (Dictionary), red black trees (SortedDictionary) and how List works. Knowledge about how stuff really works is the thing you need. If unsure, test, test and test again.
Q: What will give you the most performance? (#3)
I'd also try JSON libraries if your intent is to parse that. These people probably already invented the wheel for you - if not, it'll give you a baseline "to beat".
Q: [...] what's the prettiest way to skip the first and last item [...]
If the underlying data structure is a string, List or array, I'd simply do this:
for (int i=1; i<str.Length-1; ++i)
{ ... }
To be frank, other data structures don't really make sense here IMO. That said, people sometimes like to put Linq code everywhere, so...
Using an enumerator
You can easily make a method that returns all but the first and last element. In my book, enumerators should always be consumed through constructs like foreach, to ensure that Dispose is called correctly.
public static IEnumerable<T> GetAllButFirstAndLast<T>(IEnumerable<T> myEnum)
{
T jtem = default(T);
bool first = true;
foreach (T item in myEnum.Skip(1))
{
if (first) { first = false; } else { yield return jtem; }
jtem = item;
}
}
Note that this has little to do with "getting the best performance out of your code". One look at the IL tells you all you need to know.
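To see that the iterator method above really defers execution, here is a small self-contained sketch. The method body repeats the same logic (with the variable renamed) so that the example compiles on its own; the TrimDemo class and the sample data are made up for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class TrimDemo
{
    // Same one-item-lag logic as the answer's method.
    public static IEnumerable<T> GetAllButFirstAndLast<T>(IEnumerable<T> source)
    {
        T previous = default(T);
        bool first = true;
        foreach (T item in source.Skip(1))
        {
            // Yield the previous item only once a later one exists,
            // so the final item is never yielded.
            if (first) { first = false; } else { yield return previous; }
            previous = item;
        }
    }

    public static string Run()
    {
        var lines = new List<string> { "[", "a", "b", "]" };
        // Nothing is enumerated yet; the iterator body has not started.
        IEnumerable<string> inner = GetAllButFirstAndLast(lines);
        // A change made before enumeration is visible to the query.
        lines.Insert(3, "c");
        return string.Join(",", inner);
    }
}
```

Run() returns "a,b,c": the element inserted after the query was built still shows up, because the source is only read when string.Join enumerates it.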

bugs in java code after converting from C#

Here is a function that finds the repeated ints in an array.
In C#:
int [] ReturnDups(int[] a)
{
int repeats = 0;
Dictionary<int, bool> hash = new Dictionary<int, bool>();
for(int i = 0; i < a.Length; i++)
{
bool repeatSeen;
if (hash.TryGetValue(a[i], out repeatSeen))
{
if (!repeatSeen)
{
hash[a[i]] = true;
repeats ++;
}
}
else
{
hash[a[i]] = false;
}
}
int[] result = new int[repeats];
int current = 0;
if (repeats > 0)
{
foreach(KeyValuePair<int,bool> p in hash)
{
if(p.Value)
{
result[current++] = p.Key;
}
}
}
return result;
}
Now converted to Java by Tangible Software's tool.
In Java:
private int[] ReturnDups(int[] a)
{
int repeats = 0;
java.util.HashMap<Integer, Boolean> hash = new java.util.HashMap<Integer, Boolean>();
for (int i = 0; i < a.length; i++)
{
boolean repeatSeen = false;
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen : false)
{
if (!repeatSeen)
{
hash.put(a[i], true);
repeats++;
}
}
else
{
hash.put(a[i], false);
}
}
int[] result = new int[repeats];
int current = 0;
if (repeats > 0)
{
for (java.util.Map.Entry<Integer,Boolean> p : hash.entrySet())
{
if (p.getValue())
{
result[current++] = p.getKey();
}
}
}
return result;
}
But FindBugs flags this line of code as a bug, and it looks very odd to me too.
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen : false)
Can someone please explain to me what this line does, and how I should write it properly in Java?
Thanks
You have overcomplicated the code for TryGetValue - this simple translation should work:
if ( hash.containsKey(a[i]) ) {
if (!hash.get(a[i])) {
hash.put(a[i], true);
repeats++; // keep the counter from the original; it sizes the result array
}
} else {
hash.put(a[i], false);
}
C# has a way to get the value and a flag that tells you if the value has been found in a single call; Java does not have a similar API, because it lacks an ability to pass variables by reference.
Do not convert the C# implementation directly. Assign repeatSeen only if the key is present:
if (hash.containsKey(a[i]))
{
repeatSeen = hash.get(a[i]);
if (!repeatSeen)
{
hash.put(a[i], true);
repeats++;
}
}
else
{
hash.put(a[i], false);
}
To answer the actual question that was asked:
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen : false)
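is hard to read, but as written it is syntactically valid Java: the parentheses balance, and == binds more tightly than ?:. I haven't looked at the rest of the code, but having written parsers/code-generators in my time, I read it as the following, with the comparison grouped explicitly:
if (hash.containsKey(a[i]) ? ((repeatSeen = hash.get(a[i])) == repeatSeen) : false)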
It's gratuitously ugly -- which often happens with code generators, especially ones without an optimizing pass -- but it's syntactically correct. Let's see if it actually does have a well-defined meaning.
CAVEAT: I haven't crosschecked this by running it -- if someone spots an error, please tell me!
First off, x?y:z is indeed a ternary operator, which Java inherited from C via C++. It's an if-then-else expression -- if x is true it has the value y, whereas if x is false it has the value z. So this one-liner means the same thing as:
boolean implied;
if (hash.containsKey(a[i]))
    implied = ((repeatSeen = hash.get(a[i])) == repeatSeen);
else
    implied = false;
if (implied)
... and so on.
Now, the remaining bit of ugliness is the middle operand of that conditional expression. I don't know if you're familiar with the use of = (assignment) as an expression operator; its value as an operator is the same value being assigned to the variable. That's mostly intended to let you do things like a=b=0;, but it can also be used to set variables "in passing" in the middle of an expression. Hardcore C hackers do some very clever, and ugly, things with it (he says, being one)... and here it's being used to get the value from the hashtable, assign it to repeatSeen, and then -- via the == -- test that same value against repeatSeen.
Now the question is, in what order are the two operands of == evaluated? If the left side is evaluated first, the == must always be true, because the assignment will occur before the right-hand side retrieves the value. If the right side is evaluated first, we'd be comparing the new value against the previous value, in a very non-obvious way.
Well, in fact, there's another StackOverflow entry which addresses that question:
What are the rules for evaluation order in Java?
According to that, the rule for Java is that the left argument of an operator is always evaluated before the right argument. So the first case applies, the == always returns true.
Rewriting our translation one more time to reflect that, it turns into
boolean implied;
if (hash.containsKey(a[i]))
{
    repeatSeen = hash.get(a[i]);
    implied = true;
}
else
    implied = false;
if (implied)
Which could be further rewritten as
if (hash.containsKey(a[i]))
{
    repeatSeen = hash.get(a[i]);
    // and go on to do whatever else was in the body of the original if statement
}
"If that's what they meant, why didn't they just write it that way?" ... As I say, I've written code generators, and in many cases the easiest thing to do is just make sure all the fragments you're writing are individually correct for what they're trying to do, and not worry about whether they at all resemble what a human would have written to do the same thing. In particular, it's tempting to generate code according to templates which allow for cases you may not actually use, rather than trying to recognize the simpler situation and generate code differently.
I'm guessing that the compiler was drawing in and translating bits of computation as it realized it needed them, and that this created the odd nesting as it started the if, then realized it needed a conditional assignment to repeatSeen, and for whatever reason tried to make that happen in the if's test rather than in its body. Believe me, I've seen worse kluging from code generators.

Which is the best practice: MethodReturnsBoolean == true/false OR true/false == MethodReturnsBoolean

I have been writing:
if(Class.HasSomething() == true/false)
{
// do something
}
else
{
// do something else
}
but I've also seen people that do the opposite:
if(true/false == Class.HasSomething())
{
// do something
}
else
{
// do something else
}
Is there any advantage in doing one or the other in terms of performance and speed? I'm NOT talking about coding style here.
They're both equivalent, but my preference is
if(Class.HasSomething())
{
// do something
}
else
{
// do something else
}
...for simplicity.
Certain older-style C programmers prefer "Yoda Conditions", because if you accidentally use a single-equals sign instead, you'll get a compile time error about assigning to a constant:
if (true = Foo()) { ... } /* Compile time error! Stops typo-mistakes */
if (Foo() = true) { ... } /* Will actually compile for certain Foo() */
Even though that mistake will no longer compile in C#, old habits die hard, and many programmers stick to the style developed in C.
Personally, I like the very simple form for True statements:
if (Foo()) { ... }
But for False statements, I like an explicit comparison.
If I write the shorter !Foo(), it is easy to over-look the ! when reviewing code later.
if (false == Foo()) { ... } /* Obvious intent */
if (!Foo()) { ... } /* Easy to overlook or misunderstand */
The second example is what I've heard called "Yoda conditions"; "False, this method's return value must be". It's not the way you'd say it in English and so among English-speaking programmers it's generally looked down on.
Performance-wise, there's really no difference. The first example is generally better grammatically (and thus for readability), but given the name of your method the "grammar" involved (and the fact you're comparing bool to bool) would make the equality check redundant anyway. So, for a true statement, I would simply write:
if(Class.HasSomething())
{
// do something
}
else
{
// do something else
}
This would be incrementally faster, as the if() block basically has a built-in equality comparison, so if you code if(Class.HasSomething() == true) the CLR will evaluate if((Class.HasSomething() == true) == true). But, we're talking a gain of maybe a few clocks here (not milliseconds, not ticks, but clocks; the ones that happen 2 billion times a second in modern processors).
For a false condition, it's a toss-up between using the not operator: if(!Class.HasSomething()) and using a comparison to false: if(Class.HasSomething() == false). The first is more concise, but it can be easy to miss that little exclamation point in a complex expression (especially since it occurs before the entire expression) and so I'd consider equating with false to ensure that the code is readable.
You will not see any performance difference.
The correct option is
if (Whatever())
The only time you should write == false or != true is when dealing with bool?s. (in which case all four options have different meanings)
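A small sketch of why the four spellings differ for bool?: the C# equality operators are lifted over nullable types, so a null value is neither == true nor == false. The NullableBoolDemo class is just a hypothetical wrapper to make the comparisons callable.

```csharp
public static class NullableBoolDemo
{
    // Returns the four comparisons for one input, in the order:
    // == true, != false, == false, != true.
    public static bool[] Compare(bool? value)
    {
        return new[]
        {
            value == true,   // true only when value is true
            value != false,  // true when value is true OR null
            value == false,  // true only when value is false
            value != true    // true when value is false OR null
        };
    }
}
```

For a null input the results are false, true, false, true, so choosing between == true and != false decides how null is treated.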
You will not see any performance difference, either comparison is translated into the same IL...
if(Class.HasSomething())
{
// do something
}
is my way. But try to avoid calling HasSomething() multiple times; it is better to store the return value once and reuse it.
You should write neither.
Write
if(Class.HasSomething())
{
// do something
}
else
{
// do something else
}
instead. If Class.HasSomething() already returns a bool, it's pointless to compare it to another boolean.
There is no perf advantage here. This coding style is used to guard against the situation where the programmer types = instead of ==. The compiler will catch this because true/false are constants and cannot be assigned a new value.
For the case of booleans, I'd recommend neither: just use if (method()) and if (!method()). For the case of things besides booleans, the convention of using yoda-speak, e.g. if (1 == x) came about to prevent mistakes, because if (1 = x) will throw a compiler error while if (x = 1) will not (it is valid code in C, but is probably not what you intended). In C#, such a statement is only valid if the variable was a boolean, which reduces the need to do that.

How to improve Cyclomatic Complexity?

Cyclomatic Complexity will be high for methods with a high number of decision statements including if/while/for statements. So how do we improve on it?
I am handling a big project where I am supposed to reduce the CC for methods that have CC > 10, and there are many methods with this problem. Below I list some examples of code patterns (not the actual code) showing the problems I have encountered. Can they be simplified?
Example of cases resulting in many decision statements:
Case 1)
if(objectA != null) //objectA is passed in as a parameter
{
objectB = doThisMethod();
if(objectB != null)
{
objectC = doThatMethod();
if(objectC != null)
{
doXXX();
}
else{
doYYY();
}
}
else
{
doZZZ();
}
}
Case 2)
if(a < min)
min = a;
if(a > max)
max = a;
if(b > 0)
doXXX();
if(c > 0)
{
doYYY();
}
else
{
doZZZ();
if(c > d)
isTrue = false;
for(int i=0; i<d; i++)
s[i] = i*d;
if(isTrue)
{
if(e > 1)
{
doALotOfStuff();
}
}
}
Case 3)
// note that these String Constants are used elsewhere as diff combination,
// so you can't combine them as one
if(e.PropertyName.Equals(StringConstants.AAA) ||
e.PropertyName.Equals(StringConstants.BBB) ||
e.PropertyName.Equals(StringConstants.CCC) ||
e.PropertyName.Equals(StringConstants.DDD) ||
e.PropertyName.Equals(StringConstants.EEE) ||
e.PropertyName.Equals(StringConstants.FFF) ||
e.PropertyName.Equals(StringConstants.GGG) ||
e.PropertyName.Equals(StringConstants.HHH) ||
e.PropertyName.Equals(StringConstants.III) ||
e.PropertyName.Equals(StringConstants.JJJ) ||
e.PropertyName.Equals(StringConstants.KKK))
{
doStuff();
}
Case 1 - deal with this simply by refactoring into smaller functions. E.g. the following snippet could be a function:
objectC = doThatMethod();
if(objectC != null)
{
doXXX();
}
else{
doYYY();
}
Case 2 - exactly the same approach. Take the contents of the else clause out into a smaller helper function
Case 3 - make a list of the strings you want to check against, and make a small helper function that compares a string against many options (could be simplified further with LINQ):
var stringConstants = new string[] { StringConstants.AAA, StringConstants.BBB /* etc. */ };
if (stringConstants.Any(s => e.PropertyName.Equals(s)))
{
...
}
You should use the refactoring Replace Conditional with Polymorphism to reduce CC.
The difference between conditional and polymorphic code is that in polymorphic code the decision is made at run time. This gives you more flexibility to add, change, or remove conditions without modifying the code. You can test the behaviors separately using unit tests, which improves testability. And since there is less conditional code, the code is easier to read and the CC is lower.
For more look into behavioral design patterns esp. Strategy.
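As a rough illustration of the Strategy idea (not taken from the question's code): the shipping example, the interface, and all names below are hypothetical. Each branch of a would-be if/else chain becomes a class, and a lookup picks the right one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example: cost rules that would otherwise be an if/else chain.
public interface IShippingStrategy
{
    decimal Cost(decimal weightKg);
}

public sealed class StandardShipping : IShippingStrategy
{
    public decimal Cost(decimal weightKg) => 5m + 1m * weightKg;
}

public sealed class ExpressShipping : IShippingStrategy
{
    public decimal Cost(decimal weightKg) => 10m + 2m * weightKg;
}

public static class ShippingCalculator
{
    // The lookup replaces "if (kind == ...) else if (kind == ...)".
    private static readonly Dictionary<string, IShippingStrategy> Strategies =
        new Dictionary<string, IShippingStrategy>
        {
            ["standard"] = new StandardShipping(),
            ["express"] = new ExpressShipping()
        };

    public static decimal Cost(string kind, decimal weightKg)
    {
        if (!Strategies.TryGetValue(kind, out IShippingStrategy strategy))
            throw new ArgumentException("Unknown shipping kind: " + kind, nameof(kind));
        return strategy.Cost(weightKg);
    }
}
```

Adding a new kind means adding a class and a dictionary entry; the Cost method keeps a single decision point however many strategies exist.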
I would do the first case like this to remove the conditionals and, consequently, the CC. The code is also more object-oriented, readable, and testable.
// Interfaces added so both branches share a type (names are illustrative).
interface IObjectA { void DoMyTask(); }
interface IObjectC { void DoAnotherTask(); }
void Main() {
    IObjectA objectA = GetObjectA();
    objectA.DoMyTask();
}
IObjectA GetObjectA() {
    return If_All_Is_Well ? (IObjectA)new ObjectA() : new EmptyObjectA();
}
class ObjectA : IObjectA {
    public void DoMyTask() {
        var objectB = GetObjectB();
        IObjectC objectC = GetObjectC();
        objectC.DoAnotherTask(); // I am assuming that you would call the doXXX or doYYY methods on objectB or C because otherwise there is no need to create them
    }
    IObjectC GetObjectC() {
        return If_All_Is_Well_Again ? (IObjectC)new ObjectC() : new EmptyObjectC();
    }
}
class EmptyObjectA : IObjectA { // http://en.wikipedia.org/wiki/Null_Object_pattern
    public void DoMyTask() {
        doZZZ();
    }
}
class ObjectC : IObjectC {
    public void DoAnotherTask() {
        doXXX();
    }
}
class EmptyObjectC : IObjectC {
    public void DoAnotherTask() {
        doYYY();
    }
}
In the second case, do it the same way as the first.
In the third case -
var myCriteria = GetCriteria();
if(myCriteria.Contains(currentCase))
doStuff();
IEnumerable<Names> GetCriteria() {
// return new list of criteria.
}
I'm not a C# programmer, but I will take a stab at it.
In the first case I would say that the objects should not be null in the first place. If this is unavoidable (it is usually avoidable) then I would use the early return pattern:
if ( objectA == null ) {
return;
}
// rest of code here
The second case is obviously not realistic code, but I would at least rather say:
if ( isTrue && e > 1 ) {
DoStuff();
}
rather than use two separate ifs.
And in the last case, I would store the strings to be tested in an array/vector/map and use that container's methods to do the search.
And finally, although using cyclomatic complexity is "a good thing" (tm) and I use it myself, there are some functions which naturally have to be a bit complicated - validating user input is an example. I often wish that the CC tool I use (Source Monitor at http://www.campwoodsw.com - free and very good) supported a white-list of functions that I know must be complex and which I don't want it to flag.
The last if in case 2 can be simplified:
if(isTrue)
{
if(e > 1)
{
can be replaced by
if(isTrue && (e>1))
Case 3 can be rewritten as:
new string[]{StringConstants.AAA,...}
.Contains(e.PropertyName)
You can even make the string array into a HashSet<String> to get O(1) lookups.
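A minimal sketch of the HashSet variant, assuming placeholder string values: "AAA", "BBB" and "CCC" stand in for the real StringConstants members, and WatchedProperties is a hypothetical helper name.

```csharp
using System;
using System.Collections.Generic;

public static class WatchedProperties
{
    // Placeholder values standing in for StringConstants.AAA ... KKK.
    private static readonly HashSet<string> Names =
        new HashSet<string>(StringComparer.Ordinal) { "AAA", "BBB", "CCC" };

    // One membership test replaces the chain of || comparisons.
    public static bool IsWatched(string propertyName)
    {
        return propertyName != null && Names.Contains(propertyName);
    }
}
```

The call site then becomes if (WatchedProperties.IsWatched(e.PropertyName)) doStuff();, which contributes one decision point instead of eleven.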
