Which code is more readable? - c#

Suppose I have two methods bool Foo() and bool Bar(). Which of the following is more readable?
if(Foo())
{
SomeProperty = Bar();
}
else
{
SomeProperty = false;
}
or
SomeProperty = Foo() && Bar();
On the one hand, I consider the short-circuiting && to be a useful feature and the second code sample is much shorter. On the other hand, I'm not sure people are generally accustomed to seeing && outside a conditional statement, so I wonder if that would introduce some cognitive dissonance that makes the first sample the better option.
What do you think? Are there other factors that affect the decision? Like, if the && expression is longer than one line that can fit on the screen, should I prefer the former?
Post-answer clarifications:
A few things that I should have included in the initial question that the answers brought up.
Bar() may be more expensive to execute than Foo(), but neither method should have side effects.
The methods are both named more appropriately than in this example. Foo() boils down to something like CurrentUserAllowedToDoX(), and Bar() is more like XCanBeDone().

I agree with the general consensus that the Foo() && Bar() form is reasonable unless it is the case that Bar() is useful for its side effects as well as its value.
If it is the case that Bar() is useful for its side effects as well as its value, my first choice would be to redesign Bar() so that production of its side effects and computation of its value were separate methods.
If for some reason that was impossible, then I would greatly prefer the original version. To me the original version more clearly emphasizes that the call to Bar() is part of a statement that is useful for its side effects. The latter form to me emphasizes that Bar() is useful for its value.
For example, given the choice between
if (NetworkAvailable())
success = LogUserOn();
else
success = false;
and
success = NetworkAvailable() && LogUserOn();
I would take the former; to me, it is too easy to overlook the important side effect in the latter.
However, if it were a choice between
if (NetworkAvailable())
tryWritingToNetworkStorage = UserHasAvailableDiskQuota();
else
tryWritingToNetworkStorage = false;
and
tryWritingToNetworkStorage = NetworkAvailable() && UserHasAvailableDiskQuota();
I'd choose the latter.
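To make the side-effect point concrete, here is a minimal sketch. Foo(bool), Bar(), BarCalls, WithIfElse and WithAnd are hypothetical stand-ins invented for this illustration, not anything from the question. Bar() counts its calls so that the side effect is observable, and both forms skip it whenever Foo() is false.

```csharp
using System;

static class ShortCircuitDemo
{
    // Hypothetical stand-in for a side effect: Bar() counts how often it runs.
    public static int BarCalls = 0;

    // Hypothetical Foo(): simply returns the value it is given.
    public static bool Foo(bool result) => result;

    public static bool Bar()
    {
        BarCalls++;
        return true;
    }

    // The explicit form from the question.
    public static bool WithIfElse(bool fooResult)
    {
        if (Foo(fooResult))
        {
            return Bar();
        }
        else
        {
            return false;
        }
    }

    // The short form: && evaluates Bar() only when Foo() returned true.
    public static bool WithAnd(bool fooResult) => Foo(fooResult) && Bar();

    public static void Main()
    {
        Console.WriteLine(WithIfElse(false)); // False
        Console.WriteLine(WithAnd(false));    // False
        Console.WriteLine(BarCalls);          // 0
        Console.WriteLine(WithAnd(true));     // True
        Console.WriteLine(BarCalls);          // 1
    }
}
```

The behavior is identical; the only difference is how visibly each form signals that Bar() may not run.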

I like this shorthand notation assuming your language permits it:
SomeProperty = Foo() ? Bar() : false;

SomeProperty = Foo() && Bar();
This is way more readable. I don't see how anyone could choose the other one, frankly. If the && expression is longer than one line, split it into two lines...
SomeProperty =
Foo() && Bar();
-or-
SomeProperty = Foo() &&
Bar();
That barely hurts readability and understanding IMHO.

It depends on what Foo and Bar do.
For example, IsUserLoggedIn() && IsUserAdmin() would definitely be better as an &&, but there are some other combinations (I can't think of any offhand) where the ifs would be better.
Overall, I would recommend &&.

Neither. I'd start with renaming SomeProperty, Foo and Bar.
What I mean is, you should structure your code as to convey your intentions clearly. With different functions, I might use different forms. As it stands, however, either form is fine. Consider:
IsFather = IsParent() && IsMale();
and
if (FPUAvailable()) {
SomeProperty = LengthyFPUOperation();
} else {
SomeProperty = false;
}
Here, the first form stresses the logical-and relationship. The second one stresses the short-circuit. I would never write the first example in the second form. I would probably prefer the second form for the second example, especially if I was aiming for clarity.
Point is, it's hard to answer this question with SomeProperty and Foo() and Bar(). There are some good, generic answers defending && for the usual cases, but I would never completely rule out the second form.

In what way do you think people might misinterpret the second one? Do you think they'll forget that && shortcircuits, and worry about what will happen if the second condition is called when the first is false? If so, I wouldn't worry about that - they'd be equally confused by:
if (x != null && x.StartsWith("foo"))
which I don't think many people would rewrite as the first form.
Basically, I'd go with the second. With a suitably descriptive variable name (and conditions) it should be absolutely fine.

In the case where the conditional expressions are short and succinct, as is this case, I find the second to be much more readable. It's very clear at a glance what you are trying to do here. The first example though took me 2 passes to understand what was happening.

I would go for the second, but I would probably write it like the first the first time around, and then change it :)

When I read the first one, it wasn't immediately obvious SomeProperty was a boolean property, nor that Bar() returned a boolean value until you read the else part of the statement.
My personal view is that this approach should be a best practice: Every line of code should be as interpretable as it can with reference to as little other code as possible.
Because statement one requires me to reference the else part of the statement to infer that both SomeProperty and Bar() are boolean in nature, I would use the second.
In the second, it is immediately obvious in a single line of code all of the following facts:
SomeProperty is boolean.
Foo() and Bar() both return values that can be interpreted as boolean.
SomeProperty should be set to false unless both Foo() and Bar() are interpreted as true.

The first one; it's much more debuggable.

I think the best way is to use SomeProperty = Foo() && Bar(), because it is much shorter. Any competent .NET programmer should know how the && operator works.

I would choose the long version because it is clear at first glance what it does. In the second version, you have to stop for a few seconds until you realize the short-circuit behavior of the && operator. It is clever code, but not readable code.

Wait a minute. These statements aren't even equivalent, are they? I've looked at them several times and they don't evaluate to the same thing.
The shorthand for the first version would be using the ternary operator, not the "and" operator.
SomeProperty = Foo() ? Bar() : false;
However, logic error aside, I agree with everyone else that the shorter version is more readable.
Edit - Added
Then again, if I'm wrong and there IS no logic error, then the second is way more readable because it's very obvious what the code is doing.
Edit again - added more:
Now I see it. There's no logic error. However, I think this makes the point that the longer version is clearer for those of us who haven't had our coffee yet.

If, as you indicate, Bar() has side effects, I would prefer the more explicit way. Otherwise some people might not realize that you are intending to take advantage of short circuiting.
If Bar() is self contained then I would go for the more succinct way of expressing the code.

If the first version looked like this:
if (Foo() && Bar())
{
SomeProperty = true;
}
else
{
SomeProperty = false;
}
or like this:
if (Foo())
{
if (Bar())
SomeProperty = true;
else
SomeProperty = false;
}
else
{
SomeProperty = false;
}
Then it would at least be consistent. This way the logic is much harder to follow than the second case.

It is often useful to be able to breakpoint any specific function call individually, and also any specific setting of a variable to a particular value -- so, perhaps controversially, I would be inclined to go for the first option. This stands a greater chance of allowing fine-grained breakpointing across all debuggers (which in the case of C# is not a very large number).
That way, a breakpoint may be set before the call to Foo(), before the call to Bar() (and the set of the variable), and when the variable is set to false -- and obviously the same breakpoints could also be used to trap the Foo() vs !Foo() cases, depending on the question one has in mind.
This is not impossible to do when it is all on one line, but it takes attention that could be used to work out the cause of whatever problem is being investigated.
(C# compiles quickly, so it is usually not a big problem to rearrange the code, but it is a distraction.)

It comes down to being intentional and clear, in my mind.
The first way makes it clear to the casual observer that you aren't executing Bar() unless Foo() returns true. I get that the short circuit logic will keep Bar() from being called in the second example (and I might write it that way myself) but the first way is far more intentional at first glance.

Short-circuiting of AND behavior is not standard in all languages, so I tend to be wary of using it implicitly if that's essential to my code.
I don't trust myself to see the short-circuit immediately after I've switched languages for the fifth time in the day.
So if Bar() should never be called when Foo() returns false, I'd go with version #2, if only as a defensive programming technique.

The first method is more readable but takes more lines of code. The second is more EFFECTIVE: no comparison, only one line! I think the second is better.

Since you're using c#, I would choose the first version or use the ternary ?: operator. Using && in that manner isn't a common idiom in c#, so you're more likely to be misunderstood. In some other languages (Ruby comes to mind) the && pattern is more popular, and I would say it's appropriate in that context.

I would say code is readable if it is readable to a person with no knowledge of the actual programming language so therefore code like
if(Foo() == true and Bar() == true)
then
SomeProperty = true;
else
SomeProperty = false;
is much more readable than
SomeProperty = Foo() && Bar();
If the programmer taking over the code after you isn't familiar with the latter syntax it will cost him 10 times the time it takes you to write those extra few characters.

Related

how to make code more readable with compiler optimizations in place?

Code is read more often than it is updated. Writing more readable code is better than writing powerful and geeky code when compilers can optimize for best execution.
For example see below code - this code can be compressed by combining the nested if statements, but will the compiler not optimize this code for best execution anyway while we get to maintain the readability of it?
// yield sunRays when sky is blue.
// yield sunRays when sky is not blue and sun is not present.
if (yieldWhenSkyIsBlue)
{
// if sky is blue and sun is present -> yield sunRaysObjB.
if (sunObjA != null)
{
yield return sunRaysObjB;
}
else
{
// do not yield ;
}
}
else
{
// if sky is not blue and sun is not present -> yield sunRaysObjB.
if (sunObjA == null)
{
yield return sunRaysObjB;
}
}
As opposed to something like this :
// yield sunRays when (sky is blue) or (sun is not present and sky is blue).
// (this interpretation is a bit misleading as compared to first one?)
if(( sunObjA == null && yieldWhenSkyIsBlue ==false) || (yieldWhenSkyIsBlue && sunObjA != null) )
{
yield return sunRaysObjB;
}
Does reading the first version depict the use case better for future enhancements or updates? The second version of the code is shorter, but does reading it make the use case apparent? Are there other advantages to the second version apart from concise code?
update #1: yes, it returns ObjB in both cases, but based on the condition it may not yield at all. So the strategy decides when to yield and when not to (one more reason why readability is important).
update #2: updated to cite a better example. Copied the syntax from stripplingWarrior.
update #3 : updated for "What do you expect to happen when the sun is out and the sky is blue".
I think the second code example is much more readable, and has the advantage of being pretty optimal anyway.
Most programmers will find this logic flow to be obvious and natural: you will return ObjB if ObjA is null, or if it's not null and howtoYieldFalg is set.
But if I had to choose between making code like this more readable and making it optimal, I'd make it readable first. Only if I discovered that it's the source of a bottleneck would I bother optimizing it. In this particular case, I can pretty much guarantee that your use of yield return will introduce way more overhead than a suboptimal evaluation of your conditionals.
Update
Take another look at your code samples: they are not logically equivalent. What do you expect to happen when the sun is out and the sky is blue? The second code sample correctly allows sun rays to shine in that case, whereas the first example does not.
The fact that it was so easy to introduce a bug in the first case which so many people failed to catch for so long should be ample evidence to show that the second approach is better. All those nested if/else statements can be tricky to keep straight, even to an experienced programmer. Simple boolean logic is a lot easier to keep straight, especially once you use variable names that give it meaning.
Update 2
Based on the further explanation, and with a little creativity, I'm going to suggest an approach that uses both comments and variable names to increase clarity:
/* Explanation: We live on a strange planet where the sun's
* rays can shine if the sky is blue while the sun is out,
* or if the sky is not blue and there is no sun. */
bool sunIsPresent = sunObjA != null;
if ((skyIsBlue && sunIsPresent) ||
(!skyIsBlue && !sunIsPresent))
{
yield return sunRaysObjB;
}
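Whether a nested form and a combined condition really agree can be checked mechanically instead of by eye. Below is a minimal sketch, assuming the yield decision is reduced to two booleans; NestedForm, CombinedForm and FormsAgree are names made up for this illustration. It enumerates all four input combinations and compares the nested structure of the question's first listing with the combined condition.

```csharp
using System;

static class SunRaysCheck
{
    // Mirrors the nested if/else structure of the first listing:
    // returns true where that listing would yield.
    public static bool NestedForm(bool skyIsBlue, bool sunIsPresent)
    {
        if (skyIsBlue)
        {
            return sunIsPresent;
        }
        else
        {
            return !sunIsPresent;
        }
    }

    // Mirrors the single combined condition.
    public static bool CombinedForm(bool skyIsBlue, bool sunIsPresent) =>
        (skyIsBlue && sunIsPresent) || (!skyIsBlue && !sunIsPresent);

    // Exhaustively compares both forms over the whole truth table.
    public static bool FormsAgree()
    {
        foreach (bool sky in new[] { false, true })
        {
            foreach (bool sun in new[] { false, true })
            {
                if (NestedForm(sky, sun) != CombinedForm(sky, sun))
                {
                    return false;
                }
            }
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(FormsAgree()); // True
    }
}
```

With only two inputs the table has four rows, so an exhaustive check like this is cheap. It also shows that both forms reduce to skyIsBlue == sunIsPresent.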
The compiler optimizes right through any way you organize your program's control flow, so you really do not have to worry about it.
The weakness of compilers, though, is that they only optimize based on preserving code semantics, not preserving the meaning you intend. I compiled both your examples in LLVM and looked at the control flow graphs generated for each.
[Figures: control flow graphs for the first and the second example]
I was surprised to find the two CFG's are slightly different. You will note that first is an instruction smaller, but in the second graph, there exists a path to the exit node which only passes through one comparison, whereas in the first, two comparisons are always necessary.
In fact, further tracing of possible routes yields that the first example has possible routes of 6,8,8,6 instructions long, while the second has routes of 8,10,10 respectively. In BOTH cases the average run length is 7 instructions long, but we can see that the first case has better best-time run lengths. Without more information the compiler cannot tell which is better.
tldr: Compilers do magic stuff, don't worry about it, code how you think is best.
This is probably not the popular opinion but I'd definitely not rely on the compiler to perform optimizations of this type. (It may do it, I don't know.) I don't see the second example as geeky - for me it describes more clearly that the two conditions are connected.
Typically I try to write as optimal code as possible without making it very cryptic and then let the compiler optimize that.
Though I haven't tested this particular case, I'm willing to bet that there will be no significant difference between the generated code, if any at all.
Unless you're doing it for fun or a specialized use case, I would argue human-readability is by far the more important quality of good code. The compiler is going to collapse much of your expressive code into more efficient forms, and what it misses you probably won't ever notice.
Given that, idiomatic code is easier to read even when it's less concise. Experienced readers of a language are going to recognize a common pattern more quickly than unfamiliar code that is, arguably 'more human' but breaks the familiar pattern. Looping/incrementing constructs are a good example of code that should be unsurprising. So, my approach is: Be expressive but not too clever.

Is it bad practice to use return inside a void method?

Imagine the following code:
void DoThis()
{
if (!isValid) return;
DoThat();
}
void DoThat() {
Console.WriteLine("DoThat()");
}
Is it OK to use a return inside a void method? Does it have any performance penalty? Or it would be better to write a code like this:
void DoThis()
{
if (isValid)
{
DoThat();
}
}
A return in a void method is not bad; it is common practice to invert if statements to reduce nesting.
And having less nesting on your methods improves code readability and maintainability.
Actually if you have a void method without any return statement, the compiler will always generate a ret instruction at the end of it.
There is another great reason for using guards (as opposed to nested code): If another programmer adds code to your function, they are working in a safer environment.
Consider:
void MyFunc(object obj)
{
if (obj != null)
{
obj.DoSomething();
}
}
versus:
void MyFunc(object obj)
{
if (obj == null)
return;
obj.DoSomething();
}
Now, imagine another programmer adds the line: obj.DoSomethingElse();
void MyFunc(object obj)
{
if (obj != null)
{
obj.DoSomething();
}
obj.DoSomethingElse();
}
void MyFunc(object obj)
{
if (obj == null)
return;
obj.DoSomething();
obj.DoSomethingElse();
}
Obviously this is a simplistic case, but the programmer has added a crash to the program in the first (nested code) instance. In the second example (early-exit with guards), once you get past the guard, your code is safe from unintentional use of a null reference.
Sure, a great programmer doesn't make mistakes like this (often). But prevention is better than cure - we can write the code in a way that eliminates this potential source of errors entirely. Nesting adds complexity, so best practices recommend refactoring code to reduce nesting.
Bad practice??? No way. In fact, it is always better to handle validations by returning from the method as early as possible when a validation fails. Otherwise you end up with a huge number of nested ifs and elses. Terminating early improves code readability.
Also check the responses on a similar question: Should I use return/continue statement instead of if-else?
It's not bad practice (for all reasons already stated). However, the more returns you have in a method, the more likely it should be split into smaller logical methods.
The first example is using a guard statement. From Wikipedia:
In computer programming, a guard is a boolean expression that must evaluate to true if the program execution is to continue in the branch in question.
I think having a bunch of guards at the top of a method is a perfectly understandable way to program. It is basically saying "do not execute this method if any of these are true".
So in general it would look like this:
void DoThis()
{
if (guard1) return;
if (guard2) return;
...
if (guardN) return;
DoThat();
}
I think that's a lot more readable than:
void DoThis()
{
if (!guard1 && !guard2 && ... && !guardN)
{
DoThat();
}
}
There is no performance penalty, however the second piece of code is more readable and hence easier to maintain.
In this case, your second example is better code, but that has nothing to do with returning from a void function, it's simply because the second code is more direct. But returning from a void function is entirely fine.
It's perfectly okay and no 'performance penalty', but never ever write an 'if' statement without brackets.
Always
if( foo ){
return;
}
It's way more readable; and you'll never accidentally assume that some parts of the code are within that statement when they're not.
I'm going to disagree with all you young whippersnappers on this one.
Using return in the middle of a method, void or otherwise, is very bad practice, for reasons that were articulated quite clearly, nearly forty years ago, by the late Edsger W. Dijkstra, starting in the well-known "Go To Statement Considered Harmful", and continuing in "Structured Programming", by Dahl, Dijkstra, and Hoare.
The basic rule is that every control structure, and every module, should have exactly one entry and one exit. An explicit return in the middle of the module breaks that rule, and makes it much harder to reason about the state of the program, which in turn makes it much harder to say whether the program is correct or not (which is a much stronger property than "whether it appears to work or not").
"Go To Statement Considered Harmful" and "Structured Programming" kicked off the "Structured Programming" revolution of the 1970s. Those two pieces are the reasons we have if-then-else, while-do, and other explicit control constructs today, and why GOTO statements in high-level languages are on the Endangered Species list. (My personal opinion is that they need to be on the Extinct Species list.)
It is worth noting that the Message Flow Modulator, the first piece of military software that EVER passed acceptance testing on the first try, with no deviations, waivers, or "yeah, but" verbiage, was written in a language that did not even have a GOTO statement.
It is also worth mentioning that Niklaus Wirth changed the semantics of the RETURN statement in Oberon-07, the latest version of the Oberon programming language, making it a trailing piece of the declaration of a typed procedure (i.e., function), rather than an executable statement in the body of the function. His explication of the change said that he did it precisely because the previous form WAS a violation of the one-exit principle of Structured Programming.
While using guards, make sure you follow certain guidelines to not confuse readers.
the function does one thing
guards are only introduced as the first logic in the function
the unnested part contains the function's core intent
Example
// guards point you to the core intent
void Remove(RayCastResult rayHit){
if(rayHit == RayCastResult.Empty)
return;
rayHit.Collider.Parent.Remove();
}
// no guards needed: function split into multiple cases
int WonOrLostMoney(int flaw)=>
flaw==0 ? 100 :
flaw<10 ? 30 :
flaw<20 ? 0 :
-20;
Throw an exception instead of silently returning when the object is null, and in similar cases.
Your method expects the object to be non-null. When that is not the case, you should throw an exception and let the caller handle it.
But early return is not bad practice otherwise.
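The difference between the two kinds of guard can be sketched as follows. ProcessOrSkip, ProcessOrThrow and the Processed counter are hypothetical names used only for this illustration; ArgumentNullException is the standard .NET exception for this situation.

```csharp
using System;

static class GuardDemo
{
    // Counts how many objects were actually processed.
    public static int Processed = 0;

    // Guard with early return: a null argument is silently skipped.
    public static void ProcessOrSkip(object obj)
    {
        if (obj == null) return;
        Processed++;
    }

    // Guard with exception: a null argument is reported to the caller.
    public static void ProcessOrThrow(object obj)
    {
        if (obj == null) throw new ArgumentNullException(nameof(obj));
        Processed++;
    }

    public static void Main()
    {
        ProcessOrSkip(null);
        Console.WriteLine(Processed); // 0

        try
        {
            ProcessOrThrow(null);
        }
        catch (ArgumentNullException e)
        {
            Console.WriteLine(e.ParamName); // obj
        }
    }
}
```

Both are early exits. Which one fits depends on whether null is a legitimate input for the method or a caller bug.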

Which is preferable and less expensive: class matching vs exception?

Which is less expensive and preferable: put1 or put2?
Map<String, Animal> map = new HashMap<String, Animal>();
void put1(){
for (.....)
if (Animal.class.isAssignableFrom(item[i].getClass()))
map.put(key[i], item[i]);
}
void put2(){
for (.....)
try{
map.put(key[i], item[i]);}
catch (...){}
}
Question revision:
The question wasn't that clear, so let me revise it a little. I forgot the cast, on whose failure put2 depends. isAssignableFrom(), isInstance() and instanceof are functionally similar and therefore incur the same expense; the first is a method that includes subclasses, the second is for exact type matching, and the third is the operator version. Both reflective methods and exceptions are expensive operations.
My question is for those who have done some benchmarking in this area: which is less expensive and preferable, instanceof/isAssignableFrom or a cast exception?
void put1(){
for (.....)
if (Animal.class.isAssignableFrom(item[i].getClass()))
map.put(key[i], (Animal)item[i]);
}
void put2(){
for (.....)
try{
map.put(key[i], (Animal)item[i]);}
catch (...){}
}
Probably you want:
if (item[i] instanceof Animal)
map.put(key[i], (Animal) item[i]);
This is almost certainly much better than calling isAssignableFrom.
Or in C# (since you added the c# tag):
var a = item[i] as Animal;
if (a != null)
map[key[i]] = a;
EDIT: The updated question is which is better: instanceof or cast-and-catch. The functionality is basically the same. The performance difference might not be significant and I would have to measure it; generally throwing an exception is slow, but I don't know about the rest. So I would decide based on style. Say what you mean.
If you always expect item[i] to be an Animal, and you're just being extra careful, cast-and-catch. Otherwise I find it much clearer to use instanceof, because that plainly says what you mean: "if this object is an Animal, put it in the map".
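Here is a rough side-by-side of the two styles in C#, as a sketch only. Animal, Dog, PutWithTypeTest and PutWithCastAndCatch are illustrative names, and a Dictionary stands in for the map. Each method returns how many items it stored.

```csharp
using System;
using System.Collections.Generic;

class Animal { }
class Dog : Animal { }

static class TypeCheckDemo
{
    // Type test: says plainly "if this object is an Animal, put it in the map".
    public static int PutWithTypeTest(string[] keys, object[] items,
                                      Dictionary<string, Animal> map)
    {
        int added = 0;
        for (int i = 0; i < items.Length; i++)
        {
            if (items[i] is Animal a)   // also matches subclasses such as Dog
            {
                map[keys[i]] = a;
                added++;
            }
        }
        return added;
    }

    // Cast-and-catch: treats a non-Animal as an exceptional case.
    public static int PutWithCastAndCatch(string[] keys, object[] items,
                                          Dictionary<string, Animal> map)
    {
        int added = 0;
        for (int i = 0; i < items.Length; i++)
        {
            try
            {
                map[keys[i]] = (Animal)items[i]; // throws InvalidCastException on mismatch
                added++;
            }
            catch (InvalidCastException)
            {
                // Not an Animal: skip it.
            }
        }
        return added;
    }

    public static void Main()
    {
        string[] keys = { "a", "b", "c" };
        object[] items = { new Dog(), "not an animal", new Animal() };

        Console.WriteLine(PutWithTypeTest(keys, items, new Dictionary<string, Animal>()));     // 2
        Console.WriteLine(PutWithCastAndCatch(keys, items, new Dictionary<string, Animal>())); // 2
    }
}
```

One behavioral difference to keep in mind: a null item fails the is test, but casting null to Animal succeeds, so the cast-and-catch version would store a null entry.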
I'm confused. If item[i] is not an Animal, then how does map.put(key[i], item[i]) even compile?
That said, the first method says what you're intending to do, although I believe instanceof would be an even better check.
Typically exception handling will be significantly slower because, since it is supposed to be used for exceptional things (rarely occurring) not much work is spent by VM makers on speeding it up.
I would consider the try/catch version of your code to be an abuse of exception handling and would never consider doing it. The fact that you are thinking of doing something like this probably means you have a poor design: item should probably be an Animal[], not something else, in which case you don't need to check at runtime at all. Let the compiler do the work for you.
I agree with a previous answer - this will not compile.
But, in my opinion, whether it is an exception or a check depends on the purpose of the function.
Is item[i] not being an Animal an error/exceptional case? Is it expected to happen rarely? In this case, it should be an exception.
If it is part of the logic - meaning you expect item[i] to be many things - and only if it is an Animal you want to put in a map. In this case, the instanceof check is the right way.
UPDATE :
I'll also add an example (bit lame) :
Which is better :
(1)
if ( aNumber < 100 ) {
processNumber(aNumber);
}
or (2)
try {
processNumber(aNumber); //Throws exception if aNumber >= 100
} catch () {
}
This depends on what the program does. (1) may be used for counting numbers < 100 for any integer input. (2) will be used if processNumber expects a percentage value which cannot be greater than 100.
The difference is, it is an error for program (2) to get aNumber > 100. However, for program (1) aNumber > 100 is valid, but "something" happens only when aNumber is < 100.
PS - This may not be helpful to you at all, and I apologize if this is the case.
Your two alternatives are not really equivalent. Which one to choose, depends totally on what your code is supposed to do:
If the item is expected to always be an Animal, then you should use put2 (which will throw, if that's not the case...)
If the item may or may not be an Animal, you should use put1 (which checks a condition, not an error...)
Never care about performance in the first place, if you're writing code!

Calling methods inside if() - C#

I have a couple of methods that return a bool depending on their success. Is there anything wrong with calling those methods inside of the if()?
//&& makes sure that Method2() will only get called if Method1() returned true, use & to call both methods
if(Method1() && Method2())
{
// do stuff if both methods returned TRUE
}
Method2() doesn't need to fire if Method1() returns FALSE.
Let me know if there's any problem with the code above.
thank you.
EDIT: since there was nothing wrong with the code, I'll accept the most informative answer ... added the comment to solve the "newbie & &&" issue
I'll throw in that you can use the & operator (as opposed to &&) to guarantee that both methods are called even if the left-hand side is false, if for some reason in the future you wish to avoid short-circuiting.
The inverse works for the | operator, where even if the left-hand condition evaluates to true, the right-hand condition will be evaluated as well.
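The difference between the operators is easy to observe by counting calls. This is a small sketch: Method1 and Method2 here are stand-ins with fixed return values, and the Calls counter and the four helper methods are invented for the illustration.

```csharp
using System;

static class OperatorDemo
{
    public static int Calls = 0;

    // Stand-ins: Method1 always fails, Method2 always succeeds.
    public static bool Method1() { Calls++; return false; }
    public static bool Method2() { Calls++; return true; }

    // &&: the right-hand side is skipped when the left-hand side is false.
    public static int CallsWithConditionalAnd()
    {
        Calls = 0;
        if (Method1() && Method2()) { /* not reached */ }
        return Calls;
    }

    // &: both sides are always evaluated.
    public static int CallsWithLogicalAnd()
    {
        Calls = 0;
        if (Method1() & Method2()) { /* not reached */ }
        return Calls;
    }

    // ||: the right-hand side is skipped when the left-hand side is true.
    public static int CallsWithConditionalOr()
    {
        Calls = 0;
        if (Method2() || Method1()) { /* reached */ }
        return Calls;
    }

    // |: both sides are always evaluated.
    public static int CallsWithLogicalOr()
    {
        Calls = 0;
        if (Method2() | Method1()) { /* reached */ }
        return Calls;
    }

    public static void Main()
    {
        Console.WriteLine(CallsWithConditionalAnd()); // 1
        Console.WriteLine(CallsWithLogicalAnd());     // 2
        Console.WriteLine(CallsWithConditionalOr());  // 1
        Console.WriteLine(CallsWithLogicalOr());      // 2
    }
}
```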
No, there is nothing wrong with method calls in the if condition. Actually, that can be a great way to make your code more readable!
For instance, it's a lot cleaner to write:
private bool AllActive()
{
return x.IsActive && y.IsActive && z.IsActive;
}
if(AllActive())
{
//do stuff
}
than:
if(x.IsActive && y.IsActive && z.IsActive)
{
//do stuff
}
As useful as they are, sequence points can be confusing. Unless you really understand that, it is not clear that Method2() might not get called at all. If on the other hand you needed BOTH methods to be called AND they had to return true, what would you write? You could go with
bool result1 = Method1();
bool result2 = Method2();
if (result1 && result2)
{
}
or you could go with
if (Method1())
if (Method2())
{
}
So I guess the answer to your question IMHO is, no, it's not exactly clear what you mean even though the behavior will be what you describe.
I would only recommend it if the methods are pure (side-effect-free) functions.
As everyone says, there's nothing "wrong" with doing things this way, and in many cases you're doing precisely what the language was designed for.
Bear in mind, however, that for maintainability's sake, if Method2 has side effects (that is, it changes something's state), it may not be obvious that this function is not being called (a good programmer will usually know, but even good programmers sometimes have brain farts).
If the short-circuited expression has some kind of side effect, it may be more readable to separate the statements, strictly from a maintenance perspective.
Looks good to me, multiple clauses in the if() block will short circuit if an earlier condition fails.
There shouldn't be any problem.
The normal behavior is that Method1() will execute, and if that returns true Method2() will execute, and depending on what Method2() returns, you may / may not enter the if() statement.
Now, this assumes that the compiler generates code that executes that way. If you want to be absolutely sure that Method2() doesn't execute unless Method1() returns true you could write it like this
if( Method1() )
{
if( Method2() )
{
// do stuff if both methods returned TRUE
}
}
But, I've always observed that your code will run as expected, so this is probably not necessary.
Nothin' wrong.
Actually...I wouldn't name them Method1 and Method2. Something more descriptive. Maybe passive sounding too (like StuffHasHappened or DataHasLoaded)
Looks good to me, but there are some caveats... This is NOT the kind of thing where blanket rules apply.
My guidelines are:
If the method names are short, and there are not too many of them, then it's all good.
If you have too many statements/method calls inside the if statement, you most likely are comparing more than one "set" of things. Break those "sets" out and introduce temporary variables.
"Too many" is subjective, but usually more than around 3
When I say "method names are short" I'm talking not just about the names, but the parameters they take as well. Basically the effort required for someone to read it. For example if( Open(host) ) is shorter than if( WeCouldConnectToTheServer ). The total size of all these items is what it comes down to.
Personally, I would consider
if(Method1() && Method2())
{
// do stuff if both methods returned TRUE
}
to be a bad practice. Yes, it works in the current environment, but so does
if(Method1())
{
if (Method2())
{
// do stuff if both methods returned TRUE
}
}
But will it work in ALL environments? Will future, possibly non-Microsoft, C# compilers work this way? What if your next job involves another language where both methods will always be called? I wouldn't rely on that particular construct not because it's wrong, but because it doesn't solve any serious problem, and it may become wrong in the future

Implementing a "LazyProperty" class - is this a good idea?

I often find myself writing a property that is evaluated lazily. Something like:
if (backingField == null)
backingField = SomeOperation();
return backingField;
It is not much code, but it does get repeated a lot if you have a lot of properties.
I am thinking about defining a class called LazyProperty:
public class LazyProperty<T>
{
private readonly Func<T> getter;
public LazyProperty(Func<T> getter)
{
this.getter = getter;
}
private bool loaded = false;
private T propertyValue;
public T Value
{
get
{
if (!loaded)
{
propertyValue = getter();
loaded = true;
}
return propertyValue;
}
}
public static implicit operator T(LazyProperty<T> rhs)
{
return rhs.Value;
}
}
This would enable me to initialize a field like this:
first = new LazyProperty<HeavyObject>(() => new HeavyObject { MyProperty = Value });
And then the body of the property could be reduced to:
public HeavyObject First { get { return first; } }
This would be used by most of the company, since it would go into a common class library shared by most of our products.
I cannot decide whether this is a good idea or not. I think the solution has some pros, like:
Less code
Prettier code
On the downside, it would be harder to look at the code and determine exactly what happens - especially if a developer is not familiar with the LazyProperty class.
What do you think? Is this a good idea or should I abandon it?
Also, is the implicit operator a good idea, or would you prefer to use the Value property explicitly if you were using this class?
Opinions and suggestions are welcomed :-)
Just to be overly pedantic:
Your proposed solution to avoid repeating code:
private LazyProperty<HeavyObject> first =
    new LazyProperty<HeavyObject>(() => new HeavyObject { MyProperty = Value });
public HeavyObject First {
    get {
        return first;
    }
}
Is actually more characters than the code that you did not want to repeat:
private HeavyObject first;
public HeavyObject First {
    get {
        if (first == null) first = new HeavyObject { MyProperty = Value };
        return first;
    }
}
Apart from that, I think the implicit cast makes the code very hard to understand. I would not have guessed that a getter that simply returns first actually ends up creating a HeavyObject. I would at least have dropped the implicit conversion and returned first.Value from the property.
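To make the difference concrete, here is a sketch that puts the two getters side by side. LazyProperty<T> is the asker's class, condensed so the example compiles on its own; the Created counter and the Holder class are assumptions added purely for illustration:

```csharp
using System;

// The asker's class, condensed.
public class LazyProperty<T>
{
    private readonly Func<T> getter;
    private bool loaded;
    private T propertyValue;

    public LazyProperty(Func<T> getter) { this.getter = getter; }

    public T Value
    {
        get
        {
            if (!loaded) { propertyValue = getter(); loaded = true; }
            return propertyValue;
        }
    }

    public static implicit operator T(LazyProperty<T> rhs) { return rhs.Value; }
}

public class HeavyObject
{
    public static int Created;            // counts constructions, for illustration only
    public HeavyObject() { Created++; }
}

public class Holder
{
    private readonly LazyProperty<HeavyObject> first =
        new LazyProperty<HeavyObject>(() => new HeavyObject());

    // Implicit conversion: reads like a plain field access, but may construct the object.
    public HeavyObject FirstImplicit { get { return first; } }

    // Explicit .Value: the deferred work is visible in the getter.
    public HeavyObject FirstExplicit { get { return first.Value; } }
}
```

Both getters behave identically at run time; the only difference is how much the reader is told.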
Don't do it at all.
Generally, using this kind of lazily initialized property is a valid design choice in one case: when SomeOperation() is an expensive operation (in terms of I/O, such as a DB hit, or computationally) AND you are certain you will often NOT need to access it.
That said, by default you should go for eager initialization, and switch to lazy initialization only when the profiler says it's your bottleneck.
If you feel the urge to create that kind of abstraction, it's a smell.
Surely you'd at least want LazyProperty<T> to be a value type; otherwise you've added memory and GC pressure for every "lazily-loaded" property in your system.
Also, what about multithreaded scenarios? Consider two threads requesting the property at the same time. Without locking, you could potentially create two instances of the underlying object. To avoid locking in the common case, you would want to use double-checked locking.
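A rough sketch of what double-checked locking could look like for this class. The name ThreadSafeLazyProperty, the sync field, and the null check in the constructor are assumptions, not part of the asker's code:

```csharp
using System;

public class ThreadSafeLazyProperty<T>
{
    private readonly Func<T> getter;
    private readonly object sync = new object();
    private volatile bool loaded;   // volatile so the flag is not read stale outside the lock
    private T propertyValue;

    public ThreadSafeLazyProperty(Func<T> getter)
    {
        if (getter == null) throw new ArgumentNullException(nameof(getter));
        this.getter = getter;
    }

    public T Value
    {
        get
        {
            if (!loaded)                    // first check: no lock on the common path
            {
                lock (sync)
                {
                    if (!loaded)            // second check: another thread may have won the race
                    {
                        propertyValue = getter();
                        loaded = true;      // set only after the value has been stored
                    }
                }
            }
            return propertyValue;
        }
    }
}
```

The order of the two writes inside the lock matters: setting loaded before storing the value would let another thread skip the lock and read an uninitialized propertyValue.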
I prefer the first code, because a) it is such a common pattern with properties that I immediately understand it, and b) the point you raised: that there is no hidden magic that you have to go look up to understand where and when the value is being obtained.
I like the idea in that it is much less code and more elegant, but I would be very worried about the fact that it becomes hard to look at it and tell what is going on. The only way I would consider it is to have a convention for variables set using the "lazy" way, and also to comment anywhere it is used. Now there isn't going to be a compiler or anything that will enforce those rules, so still YMMV.
In the end, for me, decisions like this boil down to who is going to be looking at it and the quality of those programmers. If you can trust your fellow developers to use it right and comment well, then go for it, but if not, you are better off doing it in an easily understood and followed way. /my 2cents
I don't think worrying about a developer not understanding is a good argument against doing something like this...
If you think that way, you couldn't do anything for fear of someone not understanding what you did.
You could write a tutorial or something in a central repository; we have a wiki here for these kinds of notes.
Overall, I think it's a good implementation idea (not wanting to start a debate about whether lazy loading is a good idea or not).
What I do in this case is I create a Visual Studio code snippet. I think that's what you really should do.
For example, when I create ASP.NET controls, I often have data that gets stored in the ViewState, so I created a code snippet like this:
public Type Value
{
    get
    {
        if (ViewState["key"] == null)
            ViewState["key"] = someDefaultValue;
        return (Type)ViewState["key"];
    }
    set { ViewState["key"] = value; }
}
This way, the code can be easily created with only a little work (defining the type, the key, the name, and the default value). It's reusable, but you don't have the disadvantage of a complex piece of code that other developers might not understand.
I like your solution, as it is very clever, but I don't think you win much by using it. Lazy loading a private field in a public property is definitely a place where code can be duplicated. However, this has always struck me as a pattern to use rather than code that needs to be refactored into a common place.
Your approach may become a concern in the future if you do any serialization. It is also more confusing at first to work out what you are doing with the custom type.
Overall I applaud your attempt and appreciate its cleverness but would suggest that you revert to your original solution for the reasons stated above.
Personally, I don't think the LazyProperty class as it stands offers enough value to justify using it, especially considering the drawbacks of using it for value types (as Kent mentioned). If you needed other functionality (like making it multithreaded), it might be justified as a ThreadSafeLazyProperty class.
Regarding the implicit operator, I like the "Value" property better. It's a little more typing, but a lot clearer to me.
I think this is an interesting idea. First, I would recommend that you hide the LazyProperty from the calling code. You don't want to leak into your domain model that it is lazy. You're already doing that with the implicit operator, so keep that.
I like how you can use this approach to handle and abstract away the details of locking, for example. If you do that, then I think there is value and merit. If you do add locking, watch out for the double-checked locking pattern; it's very easy to get it wrong.
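One way to avoid hand-rolling the locking: .NET Framework 4 and later ship System.Lazy<T>, which covers the same ground as the proposed class and handles thread safety internally. A minimal sketch, assuming that framework version is available; the Owner class and the FirstIsLoaded property are hypothetical and exist only to make the example complete:

```csharp
using System;
using System.Threading;

public class HeavyObject
{
    public int MyProperty { get; set; }
}

public class Owner
{
    private readonly Lazy<HeavyObject> first;

    public Owner(int value)
    {
        // ExecutionAndPublication: the factory runs at most once,
        // and every thread sees the same instance.
        first = new Lazy<HeavyObject>(
            () => new HeavyObject { MyProperty = value },
            LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // Callers only see a HeavyObject; the laziness stays private.
    public HeavyObject First { get { return first.Value; } }

    // Exposed here only so the behavior can be observed.
    public bool FirstIsLoaded { get { return first.IsValueCreated; } }
}
```

Lazy<T> has no implicit conversion, so the getter has to say first.Value, which keeps the laziness hidden from callers but visible to whoever reads the class.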
