Cyclomatic Complexity will be high for methods with a high number of decision statements including if/while/for statements. So how do we improve on it?
I am handling a big project where I am supposed to reduce the CC for methods that have CC > 10, and there are many methods with this problem. Below I list some examples of code patterns (not the actual code) showing the problems I have encountered. Is it possible to simplify them?
Example of cases resulting in many decision statements:
Case 1)
if(objectA != null) //objectA is a pass in as a parameter
{
objectB = doThisMethod();
if(objectB != null)
{
objectC = doThatMethod();
if(objectC != null)
{
doXXX();
}
else{
doYYY();
}
}
else
{
doZZZ();
}
}
Case 2)
if(a < min)
min = a;
if(a > max)
max = a;
if(b > 0)
doXXX();
if(c > 0)
{
doYYY();
}
else
{
doZZZ();
if(c > d)
isTrue = false;
for(int i=0; i<d; i++)
s[i] = i*d;
if(isTrue)
{
if(e > 1)
{
doALotOfStuff();
}
}
}
Case 3)
// note that these String Constants are used elsewhere as diff combination,
// so you can't combine them as one
if(e.PropertyName.Equals(StringConstants.AAA) ||
e.PropertyName.Equals(StringConstants.BBB) ||
e.PropertyName.Equals(StringConstants.CCC) ||
e.PropertyName.Equals(StringConstants.DDD) ||
e.PropertyName.Equals(StringConstants.EEE) ||
e.PropertyName.Equals(StringConstants.FFF) ||
e.PropertyName.Equals(StringConstants.GGG) ||
e.PropertyName.Equals(StringConstants.HHH) ||
e.PropertyName.Equals(StringConstants.III) ||
e.PropertyName.Equals(StringConstants.JJJ) ||
e.PropertyName.Equals(StringConstants.KKK))
{
doStuff();
}
Case 1 - deal with this simply by refactoring into smaller functions. E.g. the following snippet could be a function:
objectC = doThatMethod();
if(objectC != null)
{
doXXX();
}
else{
doYYY();
}
Case 2 - exactly the same approach. Take the contents of the else clause out into a smaller helper function.
Case 3 - make a list of the strings you want to check against, and make a small helper function that compares a string against many options (could be simplified further with linq)
var stringConstants = new string[] { StringConstants.AAA, StringConstants.BBB /* etc. */ };
if(stringConstants.Any(s => e.PropertyName.Equals(s)))
{
...
}
You should use the refactoring Replace Conditional with Polymorphism to reduce CC.
The difference between conditional and polymorphic code is that in polymorphic code the decision is made at run time. This gives you more flexibility to add, change or remove conditions without modifying the calling code. You can test the behaviors separately using unit tests, which improves testability. Also, since there is less conditional code, the code is easier to read and the CC is lower.
For more, look into behavioral design patterns, especially Strategy.
I would do the first case like this to remove the conditionals and consequently the CC. Moreover the code is more Object Oriented, readable and testable as well.
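// Sketch only: If_All_Is_Well, GetObjectB and the doXXX/doYYY/doZZZ calls are placeholders.
// The two interfaces are added so that both branches of each ternary share a type.
interface IObjectA { void DoMyTask(); }
interface IObjectC { void DoAnotherTask(); }
void Main() {
    var objectA = GetObjectA();
    objectA.DoMyTask();
}
IObjectA GetObjectA() {
    return If_All_Is_Well ? (IObjectA)new ObjectA() : new EmptyObjectA();
}
class ObjectA : IObjectA {
    public void DoMyTask() {
        var objectB = GetObjectB();
        var objectC = GetObjectC();
        objectC.DoAnotherTask(); // I am assuming that you would call the doXXX or doYYY methods on objectB or C because otherwise there is no need to create them
    }
    IObjectC GetObjectC() {
        return If_All_Is_Well_Again ? (IObjectC)new ObjectC() : new EmptyObjectC();
    }
}
class EmptyObjectA : IObjectA { // http://en.wikipedia.org/wiki/Null_Object_pattern
    public void DoMyTask() {
        doZZZ();
    }
}
class ObjectC : IObjectC {
    public void DoAnotherTask() {
        doXXX();
    }
}
class EmptyObjectC : IObjectC {
    public void DoAnotherTask() {
        doYYY();
    }
}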
In the second case, do it the same way as the first.
In the third case -
var myCriteria = GetCriteria();
if(myCriteria.Contains(currentCase))
doStuff();
IEnumerable<Names> GetCriteria() {
// return new list of criteria.
}
I'm not a C# programmer, but I will take a stab at it.
In the first case I would say that the objects should not be null in the first place. If this is unavoidable (it is usually avoidable) then I would use the early return pattern:
if ( objectA == null ) {
return;
}
// rest of code here
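A minimal sketch of that early-return shape applied to Case 1, written in Java (the C# version reads almost identically). The method names mirror the question's placeholders; the Steps interface and the returned strings are assumptions added only so the control flow can be checked.

```java
final class GuardClauses {
    /** Hypothetical stand-ins for doThisMethod()/doThatMethod() from the question. */
    interface Steps {
        Object doThisMethod();
        Object doThatMethod();
    }

    // Returns the name of the action that would run, instead of running it.
    static String process(Object objectA, Steps steps) {
        if (objectA == null) {
            return "nothing";      // guard: the original does nothing in this case
        }
        Object objectB = steps.doThisMethod();
        if (objectB == null) {
            return "doZZZ";        // guard: the former outer else branch
        }
        Object objectC = steps.doThatMethod();
        return (objectC != null) ? "doXXX" : "doYYY";
    }
}
```

This flattens the nesting but keeps the same number of decisions, so the CC figure itself only drops if the pieces are also split into separate methods.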
The second case is obviously not realistic code, but I would at least rather say:
if ( isTrue && e > 1 ) {
DoStuff();
}
rather than use two separate ifs.
And in the last case, I would store the strings to be tested in an array/vector/map and use that container's methods to do the search.
And finally, although using cyclomatic complexity is "a good thing" (tm) and I use it myself, there are some functions which naturally have to be a bit complicated - validating user input is an example. I often wish that the CC tool I use (Source Monitor at http://www.campwoodsw.com - free and very good) supported a white-list of functions that I know must be complex and which I don't want it to flag.
The last if in case 2 can be simplified:
if(isTrue)
{
if(e > 1)
{
can be replaced by
if(isTrue && (e>1))
Case 3 can be rewritten as:
new string[]{StringConstants.AAA,...}
.Contains(e.PropertyName)
You can even make the string array into a HashSet<String> to get O(1) lookups.
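For illustration, the set-based lookup could look roughly like this in Java (assuming Java 9+ for Set.of). The string values are placeholders standing in for the question's StringConstants, and WatchedProperties and isWatched are hypothetical names.

```java
import java.util.Set;

final class WatchedProperties {
    // Placeholder values standing in for StringConstants.AAA, BBB, ... from the question.
    private static final Set<String> WATCHED = Set.of("AAA", "BBB", "CCC");

    // One membership test replaces the long chain of || comparisons.
    static boolean isWatched(String propertyName) {
        // Set.of(...) sets reject null lookups, so guard against null first.
        return propertyName != null && WATCHED.contains(propertyName);
    }
}
```

The set is built once and shared, so each call costs a single hash lookup regardless of how many names are watched.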
I have to check that writerId is not equal to 0 or 1. Here is what I wrote.
int writerId = foo();
if(writerId != 0 && writerId != 1)
{
// do something
}
Is there any short way to write the same if statement?
This is shorter, but considerably harder to understand:
if ((writerId & ~1) != 0)
The writerId & ~1 operation unsets the least significant bit in the number -- the only two numbers that would be equal to 0 after this operation are 1 and 0, so if the result is not 0 then it must not have been 0 or 1.
However, you are severely sacrificing readability. Sometimes the shortest code is not the most readable. I would stick with what you have.
If readability is a concern when viewing this piece of code.. you could always move it into its own boolean method so that it reads nicer in the context of the other code:
bool IsValid(int writerId) {
return writerId != 0 && writerId != 1;
}
Then your code can at least read a little bit nicer:
if (IsValid(writerId)) {
// do something
}
..I will leave the appropriate naming for the method up to you. I generally do this if there is no easier way to make the code read nicer without it becoming more complex.
You can try this:
if (foo() >> 1 != 0)
{
// do something
}
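If you want to convince yourself that the mask and shift forms really match the plain comparison, a small sketch like the following can be used. It is in Java, where int uses the same 32-bit two's-complement arithmetic and arithmetic right shift as a C# int; the class and method names are made up for the example.

```java
final class WriterIdChecks {
    // The readable original.
    static boolean plain(int id)   { return id != 0 && id != 1; }

    // Clears the lowest bit; only 0 and 1 become 0.
    static boolean masked(int id)  { return (id & ~1) != 0; }

    // Drops the lowest bit; only 0 and 1 become 0 (negative values stay negative).
    static boolean shifted(int id) { return (id >> 1) != 0; }
}
```

All three agree for every int, negatives included, which supports the point made above: the short forms buy nothing except obscurity.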
This happened to be a recurring thing in my daily work. I wrote an extension for it at some point:
public static class GenericExtensions
{
public static bool EqualsAny<T>(this T value, params T[] comparables)
{
return comparables.Any(element => object.Equals(value, element));
}
public static bool EqualsNone<T>(this T value, params T[] comparables)
{
return !EqualsAny(value, comparables);
}
}
So instead of (writerId != 0 && writerId != 1) you can write (!writerId.EqualsAny(0, 1)) or (writerId.EqualsNone(0, 1)). In this case, I probably wouldn't use the EqualsNone method, because it actually lowers readability. I might not use the method for this case at all anyway. It mostly helps with readability with long enum names that would cause long or wrapped lines. And as always it's a matter of opinion at any rate. ;)
Here is a function that returns the repeated ints in an array.
In C#:
int [] ReturnDups(int[] a)
{
int repeats = 0;
Dictionary<int, bool> hash = new Dictionary<int, bool>();
for(int i = 0; i < a.Length; i++)
{
bool repeatSeen;
if (hash.TryGetValue(a[i], out repeatSeen))
{
if (!repeatSeen)
{
hash[a[i]] = true;
repeats ++;
}
}
else
{
hash[a[i]] = false;
}
}
int[] result = new int[repeats];
int current = 0;
if (repeats > 0)
{
foreach(KeyValuePair<int,bool> p in hash)
{
if(p.Value)
{
result[current++] = p.Key;
}
}
}
return result;
}
Now converted to Java by Tangible Software's tool.
In Java:
private int[] ReturnDups(int[] a)
{
int repeats = 0;
java.util.HashMap<Integer, Boolean> hash = new java.util.HashMap<Integer, Boolean>();
for (int i = 0; i < a.length; i++)
{
boolean repeatSeen = false;
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen : false)
{
if (!repeatSeen)
{
hash.put(a[i], true);
repeats++;
}
}
else
{
hash.put(a[i], false);
}
}
int[] result = new int[repeats];
int current = 0;
if (repeats > 0)
{
for (java.util.Map.Entry<Integer,Boolean> p : hash.entrySet())
{
if (p.getValue())
{
result[current++] = p.getKey();
}
}
}
return result;
}
But FindBugs flags this line of code as a bug, and it looks very odd to me too.
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen : false)
Can someone please explain to me what this line does and how I should write it properly in Java?
thanks
You have overcomplicated the code for TryGetValue - this simple translation should work:
if ( hash.containsKey(a[i]) ) {
if (!hash.get(a[i])) {
hash.put(a[i], true);
repeats++;
}
} else {
hash.put(a[i], false);
}
C# has a way to get the value and a flag that tells you if the value has been found in a single call; Java does not have a similar API, because it lacks an ability to pass variables by reference.
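As a rough sketch of how the whole method might look in idiomatic Java: Map.get returns null for a missing key, which gives a single-lookup stand-in for TryGetValue as long as null is never stored as a value. Using LinkedHashMap is my own substitution (it makes the output order predictable); the original uses a plain HashMap.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class Duplicates {
    /** Returns each value that occurs more than once in a, in order of first appearance. */
    static int[] returnDups(int[] a) {
        // value -> true once the value has been seen a second time
        Map<Integer, Boolean> seen = new LinkedHashMap<>();
        int repeats = 0;
        for (int value : a) {
            Boolean repeatSeen = seen.get(value);   // null means "key not present"
            if (repeatSeen == null) {
                seen.put(value, false);             // first sighting
            } else if (!repeatSeen) {
                seen.put(value, true);              // second sighting: count it once
                repeats++;
            }
        }
        int[] result = new int[repeats];
        int current = 0;
        for (Map.Entry<Integer, Boolean> entry : seen.entrySet()) {
            if (entry.getValue()) {
                result[current++] = entry.getKey();
            }
        }
        return result;
    }
}
```

The `if (repeats > 0)` guard from the original is dropped because iterating when nothing is flagged simply copies nothing.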
Do not convert the C# implementation directly. Assign repeatSeen only if the key is present.
if (hash.containsKey(a[i]))
{
repeatSeen = hash.get(a[i]);
if (!repeatSeen)
{
hash.put(a[i], true);
repeats++;
}
}
To answer the actual question that was asked:
if (hash.containsKey(a[i]) ? (repeatSeen = hash.get(a[i])) == repeatSeen : false)
is in fact syntactically valid, although it is hard to read. I haven't looked at the rest of the code, but having written parsers/code-generators in my time, I would read it with the grouping made explicit:
if (hash.containsKey(a[i]) ? ((repeatSeen = hash.get(a[i])) == repeatSeen) : false)
It's gratuitously ugly -- which often happens with code generators, especially ones without an optimizing pass -- but it's syntactically correct. Let's see if it actually does have a well-defined meaning.
CAVEAT: I haven't crosschecked this by running it -- if someone spots an error, please tell me!
First off, x?y:z is indeed a ternary operator, which Java inherited from C via C++. It's an if-then-else expression -- if x is true it has the value y, whereas if x is false it has the value z. So this one-liner means the same thing as:
boolean implied;
if (hash.containsKey(a[i]))
implied = (repeatSeen = hash.get(a[i])) == repeatSeen;
else
implied = false;
if(implied)
... and so on.
Now, the remaining bit of ugliness is the first branch of that conditional expression. I don't know if you're familiar with the use of = (assignment) as an expression operator; its value as an operator is the same value being assigned to the variable. That's mostly intended to let you do things like a=b=0;, but it can also be used to set variables "in passing" in the middle of an expression. Hardcore C hackers do some very clever, and ugly, things with it (he says, being one)... and here it's being used to get the value from the hashtable, assign it to repeatSeen, and then -- via the == -- test that same value against repeatSeen.
Now the question is, what order are the two arguments of == evaluated in? If the left side is evaluated first, the == must always be true because the assignment will occur before the right-hand side retrieves the value. If the right side is evaluated first, we'd be comparing the new value against the previous value, in a very non-obvious way.
Well, in fact, there's another StackOverflow entry which addresses that question:
What are the rules for evaluation order in Java?
According to that, the rule for Java is that the left argument of an operator is always evaluated before the right argument. So the first case applies, the == always returns true.
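A tiny sketch that isolates just that expression. The method name is hypothetical, and the variable deliberately starts out holding the opposite of the stored value, so a right-to-left evaluation would give false.

```java
final class EvalOrderDemo {
    // Mirrors (repeatSeen = hash.get(a[i])) == repeatSeen from the generated code.
    static boolean assignThenCompare(boolean stored) {
        boolean repeatSeen = !stored;              // stale value, differs from 'stored'
        // Left operand first: the assignment happens, then the right operand
        // reads the freshly assigned variable, so both sides are 'stored'.
        return (repeatSeen = stored) == repeatSeen;
    }
}
```

Because the comparison can never be false, the generated condition reduces to hash.containsKey(a[i]) with an assignment as a side effect, which is presumably why a static analyser complains about it.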
Rewriting our translation one more time to reflect that, it turns into
boolean implied;
if (hash.containsKey(a[i]))
{
repeatSeen = hash.get(a[i]);
implied = true;
}
else
implied = false;
if(implied)
Which could be further rewritten as
if (hash.containsKey(a[i]))
{
repeatSeen = hash.get(a[i]);
// and go on to do whatever else was in the body of the original if statement
}
"If that's what they meant, why didn't they just write it that way?" ... As I say, I've written code generators, and in many cases the easiest thing to do is just make sure all the fragments you're writing are individually correct for what they're trying to do and not worry about whether they at all resemble what a human would have written to do the same thing. In particular, it's tempting to generate code according to templates which allow for cases you may not actually use, rather than trying to recognize the simpler situation and generate code differently.
I'm guessing that the compiler was drawing in and translating bits of computation as it realized it needed them, and that this created the odd nesting as it started the if, then realized it needed a conditional assignment to repeatSeen, and for whatever reason tried to make that happen in the if's test rather than in its body. Believe me, I've seen worse kluging from code generators.
Switch case statements are good to replace nested if statements if we have the same condition but different criteria. But what is a good approach if those nested if statements all have different and unique conditions? Do I have any alternate options to replace a dozen if else statements nested inside each other?
Sample Code:
Note: I know this is extremely unreadable - which is the whole point.
Note: All conditions are unique.
...
if (condition) {
// do A
} else {
if (condition) {
// do B
if (condition) {
if (condition) {
if (condition) {
// do C
if (condition) {
// do D
if (condition) {
// do E
} else {
if (condition) {
// do F
}
}
}
}
if (condition) {
// do G
if (condition) {
// do H
if (condition) {
// do I
} else {
// do J
}
}
}
}
}
}
The best approach in this case is to chop up the thing into appropriately named separate methods.
I had to check this was Stackoverflow not DailyWTF when I saw the code!!
The solution is to change the architecture and use interfaces and polymorphism to get around all the conditions. However, that may be a huge job and out of the scope of an acceptable answer, so I will recommend another way in which you can more or less use switch-like checks with unique conditions:
[Flags]
public enum FilterFlagEnum
{
None = 0,
Condition1 = 1,
Condition2 = 2,
Condition3 = 4,
Condition4 = 8,
Condition5 = 16,
Condition6 = 32,
Condition7 = 64
};
public void foo(FilterFlagEnum filterFlags = 0)
{
if ((filterFlags & FilterFlagEnum.Condition1) == FilterFlagEnum.Condition1)
{
//do this
}
if ((filterFlags & FilterFlagEnum.Condition2) == FilterFlagEnum.Condition2)
{
//do this
}
}
foo(FilterFlagEnum.Condition1 | FilterFlagEnum.Condition2);
@Tar suggested one way of looking at it. Another might be:
Invert it.
if (myObject.HasThing1)
{
if(myObject.HasThing2)
{
DoThing1();
}
else
{
DoThing2();
}
}
else
{
DoThing3();
}
could be
DoThing1(myObject.HasThing1);
DoThing2(myObject.HasThing2);
DoThing3(myObject.HasThing3);
So each Do method makes the minimum number of tests, and if any fail then it does nothing.
You can make it a bit cleverer if you want to break out of the sequence in few ways.
No idea whether it would work for you, but delegating the testing of the conditions is often enough of a new way of looking at things, that some simplifying factor might just appear as if by magic.
In my view there are two main ways to eliminate nested conditions. The first applies to the more specific case where each level of nesting contains only one condition, like here:
function A(){
if (condition1){
if (condition2){
if (condition3){
// do something
}
}
}
}
we can just go out from the opposite condition with return:
function A(){
if (condition1 == false) return;
if (condition2 == false) return;
if (condition3 == false) return;
// do something
}
The second one is using a condition decomposition and can be treated as more universal than the first one. In the case when we have a condition structure like this, for example:
if (condition1)
{
// do this 1
}
else
{
if (condition2)
{
// do this 2
}
}
We can introduce a variable for each particular condition, like here:
bool Cond1 = condition1;
bool Cond2 = !condition1 && condition2;
if (Cond1) { /* do this 1 */ }
if (Cond2) { /* do this 2 */ }
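To check that such a decomposition preserves behaviour, the nested and flattened forms can be put side by side. This is only a sketch in Java with made-up names; the returned strings stand in for "do this 1" and "do this 2".

```java
final class Decomposed {
    // The nested form from the example above.
    static String nested(boolean condition1, boolean condition2) {
        if (condition1) {
            return "1";
        } else {
            if (condition2) {
                return "2";
            }
        }
        return "none";
    }

    // The flattened form: each flag carries the full path needed to reach its branch.
    static String flat(boolean condition1, boolean condition2) {
        boolean cond1 = condition1;
        boolean cond2 = !condition1 && condition2;
        if (cond1) return "1";
        if (cond2) return "2";
        return "none";
    }
}
```

The flags must be mutually exclusive for this to work, which is why cond2 includes the negation of condition1.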
If that really is the business logic then the syntax is OK. But I have never seen business logic that complex. Draw up a flow chart and see if that cannot be simplified.
if (condition)
{
// do this
}
else
{
if (condition)
{
// do this
}
}
can be replaced with
if (condition)
{
// do this
}
else if (condition)
{
// do this
}
But again step back and review the design. Need more than just an else if clean up.
I feel your pain.
My situation required writing many (>2000) functional tests that had been specified by the customer for a large, expensive piece of equipment. While most (>95%) of these tests are simple and have a straightforward pass/fail check, dozens fall into the "multiple nested if this do that else do something different" category, at depths similar to or worse than yours.
The solution I came up with was to host Windows Workflow within my test application.
All complex tests became Workflows that I run with the results reported back to my test app.
The customer was happy because they had the ability to:
Verify the test logic (hard for non programmers looking at deeply nested if/else C# - easy looking at a graphical flowchart)
Edit tests graphically
Add new tests
Hosting Windows Workflow (in .NET 4/4.5) is very easy - although it may take you a while to get your head around "communications" between the Workflows and your code - mostly because there are multiple ways to do it.
Good Luck
Is there a shorter version of IF statement to do this?
if (el.type == ElementType.Type1 || el.type == ElementType.Type2)
You could use an extension method, but would this really be much better?
Throw this on a static class:
public static bool IsOneOf(this ElementType self, params ElementType[] options)
{
return options.Contains(self);
}
And then you can do:
if (el.type.IsOneOf(ElementType.Type1, ElementType.Type2)) {
However, this will be a lot slower than your if statement, as there is an implicit array initialization followed by an array traversal, as opposed to (at the most) two compares and branches.
Consider ElementType is defined as
enum ElementType
{
Type1,
Type2,
Type3
}
In this particular case you may write if(el.type < ElementType.Type3)
By default Type1 equals 0, Type2 equals 1, etc.
If you have only 2 values, I strongly suggest using the code you posted, because it is likely the most readable, elegant and fast code possible (IMHO).
But if you have more cases like that and more complicated, you could think to use a switch statement:
switch (el.type)
{
case ElementType.Type1:
case ElementType.Type2:
case ElementType.Type3:
//code here
break;
case ElementType.Type4:
case ElementType.Type5:
//code here
break;
case ElementType.Type6:
//code here
break;
}
that translated in if statements would be:
if (el.type == ElementType.Type1 ||
el.type == ElementType.Type2 ||
el.type == ElementType.Type3 )
{
// code here
}else if(el.type == ElementType.Type4 ||
el.type == ElementType.Type5)
{
// code here
}else if(el.type == ElementType.Type6)
{
// code here
}
They're perfectly equal to me, but the switch seems more readable/clearer, and you need to type less (i.e. it's "shorter" in term of code length) :)
You can try this:
if(new [] { ElementType.Type1, ElementType.Type2 }.Contains(el.type))
(turns out, that takes even more characters)
I guess you're referring to an IN() clause or some such? Not really... Well, sort of... You can do something like:
if ((new [] { ElementType.Type1, ElementType.Type2}).Contains(el.type)) {...}
But that's not going to be anywhere near as performant (or brief) as what you're already doing. You can also do
if (el.type == ElementType.Type1 | el.type == ElementType.Type2)
but that doesn't do short-circuit evaluation, so you rarely want to use that operator. My advice is to stick with what you have.
The brief answer is no. There is no C# language construct that lets you combine object comparisons. But as many people have mentioned before, creating a collection of your types is probably your best bet for a shorter if statement. However, that sacrifices quite a bit in the area of performance. I would stick with the OR statement.
There is no better way to optimize your code. As other users have shown, you can optimize an if else.
But a type of if statement I have thought about, in your case especially, would be
if(X > [Y || Z || A])
But that doesn't exist, and isn't as clean as the current if (X > Y || X > Z || X > A)
(This is more of a response to Cody Gray)
If this is a common logic comparison in your code that shows up a lot, I'd just write a method to handle it.
private bool isType1OrType2(ElementType type)
{
return type == ElementType.Type1 || type == ElementType.Type2;
}
then you can do
if(isType1OrType2(el.type))
You could also make this an extension method like so
public static bool isType1OrType2(this ElementType type)
{
return type == ElementType.Type1 || type == ElementType.Type2;
}
so the code would read a little nicer
if(el.type.isType1OrType2())
But then you have to have a static class but you can decide if it's worth it. I personally would not write a method to take a collection of types to compare to unless you find that you are comparing the type to many different combinations. I also would not even bother changing the code at all if this is the only place you make this type of comparison.
I don't think there is a way to optimize your statement.
In short: nothing reasonable (reasonable in terms of code readability and performance optimisation). I wouldn't recommend the ternary operator for this kind of comparison either.
The actual if can be shortened to 5 characters ;)
bool b = (el.type == ElementType.Type1) | (el.type == ElementType.Type2);
if(b){...}
Don't do this, it is stupid and confusing unless you have a finite-state automaton.
enum MyEnum
{
A,
B,
C
}
private static readonly Dictionary<MyEnum, Action> _handlers = new Dictionary<MyEnum, Action>
{
{MyEnum.A,()=>Console.Out.WriteLine("Foo")},
{MyEnum.B,()=>Console.Out.WriteLine("Bar")},
};
public static void ActOn(MyEnum e)
{
Action handler = null;
if (_handlers.TryGetValue(e, out handler) && handler != null)
{
handler();
}
}
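Another approach would be to do some bitwise comparison, but again it is really not worth it. Note that this only works if each enum member is a distinct power of two; with the default values above, MyEnum.A is 0 and can never match.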
private void ActWithCast(MyEnum e)
{
const int interest = (int)MyEnum.A | (int)MyEnum.B;
if (0 != ((int)e & interest))
{
Console.Out.WriteLine("Blam");
}
}
If the ElementType is an enum there is a shorter way to do it:
[Flags]
public enum ElementType
{
Type1 = 1,
Type2 = 2,
Type3 = 4,
}
...
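(ElementType.Type1 | ElementType.Type2).HasFlag(el.type);
You do not need the [Flags] attribute to use HasFlag, but the values do need to follow that power-of-two pattern. Note that HasFlag checks that every bit of its argument is set in the instance, so the combined mask has to be the instance and the value being tested has to be the argument.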
I'm reading the Cormen algorithms book (binary search tree chapter) and it says that there are two ways to traverse the tree without recursion:
using a stack, and
a more complicated but elegant solution that uses no stack but assumes that two pointers can be tested for equality
I've implemented the first option (using stack), but don't know how to implement the latter.
This is not homework, I am just reading to educate myself.
Any clues as to how to implement the second one in C#?
Sure thing. You didn't say what kind of traversal you wanted, but here's the pseudocode for an in-order traversal.
t = tree.Root;
while (true) {
while (t.Left != t.Right) {
while (t.Left != null) { // Block one.
t = t.Left;
Visit(t);
}
if (t.Right != null) { // Block two.
t = t.Right;
Visit(t);
}
}
while (t != tree.Root && (t.Parent.Right == t || t.Parent.Right == null)) {
t = t.Parent;
}
if (t != tree.Root) { // Block three.
t = t.Parent.Right;
Visit(t);
} else {
break;
}
}
To get pre- or post-order, you rearrange the order of the blocks.
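For comparison, here is a self-contained sketch of one common stackless in-order traversal, written in Java. Like the pseudocode above it assumes every node keeps a parent pointer, and it decides where to go next purely by comparing references (the "pointers can be tested for equality" assumption). TreeNode, with and StacklessTraversal are names invented for the example, and this is not necessarily the exact solution the book has in mind.

```java
import java.util.ArrayList;
import java.util.List;

final class TreeNode {
    final int value;
    TreeNode left, right, parent;

    TreeNode(int value) { this.value = value; }

    // Attaches children and sets their parent pointers; returns this for chaining.
    TreeNode with(TreeNode l, TreeNode r) {
        left = l;
        right = r;
        if (l != null) l.parent = this;
        if (r != null) r.parent = this;
        return this;
    }
}

final class StacklessTraversal {
    /** In-order traversal using O(1) extra space: no stack, no recursion. */
    static List<Integer> inOrder(TreeNode root) {
        List<Integer> visited = new ArrayList<>();
        TreeNode prev = null;   // node we were at on the previous step
        TreeNode cur = root;
        while (cur != null) {
            TreeNode next;
            if (prev == cur.parent) {
                // Arrived from above: descend left first if there is a left child.
                if (cur.left != null) {
                    next = cur.left;
                } else {
                    visited.add(cur.value);
                    next = (cur.right != null) ? cur.right : cur.parent;
                }
            } else if (prev == cur.left) {
                // Came back up from the left subtree: visit, then go right or up.
                visited.add(cur.value);
                next = (cur.right != null) ? cur.right : cur.parent;
            } else {
                // Came back up from the right subtree: this node is finished.
                next = cur.parent;
            }
            prev = cur;
            cur = next;
        }
        return visited;
    }
}
```

Moving the `visited.add` calls to the "arrived from above" step gives pre-order, and moving them to the steps that leave the node upward gives post-order.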
Assuming that the nodes in the tree are references and the values are references, you can always call the static ReferenceEquals method on the Object class to compare to see if the references for any two nodes/values are the same.