For my software development programming class we were supposed to make a "Feed Manager" type program for RSS feeds. Here is how I handled the implementation of FeedItems.
Nice and simple:
struct FeedItem {
string title;
string description;
string url;
};
I got marked down for that; the "correct" example answer is as follows:
class FeedItem
{
public:
FeedItem(string title, string description, string url);
inline string getTitle() const { return this->title; }
inline string getDescription() const { return this->description; }
inline string getURL() const { return this->url; }
inline void setTitle(string title) { this->title = title; }
inline void setDescription(string description){ this->description = description; }
inline void setURL(string url) { this->url = url; }
private:
string title;
string description;
string url;
};
Now to me, this seems stupid. I honestly can't believe I got marked down, when this does the exact same thing that mine does with a lot more overhead.
It reminds me of how in C# people always do this:
public class Example
{
private int _myint;
public int MyInt
{
get
{
return this._myint;
}
set
{
this._myint = value;
}
}
}
I mean I GET why they do it, maybe later on they want to validate the data in the setter or increment it in the getter. But why don't you people just do THIS UNTIL that situation arises?
public class Example
{
public int MyInt;
}
Sorry this is kind of a rant and not really a question, but the redundancy is maddening to me. Why are getters and setters so loved, when they are unneeded?
It's an issue of "best practice" and style.
You don't ever want to expose your data members directly. You always want to be able to control how they are accessed. I agree, in this instance, it seems a bit ridiculous, but it is intended to teach you that style so you get used to it.
It helps to define a consistent interface for classes. You always know how to get to something --> calling its get method.
Then there's also the reusability issue. Say, down the road, you need to change what happens when somebody accesses a data member. You can do that without forcing clients to rewrite their code: you change the method in the class, and every caller picks up the new logic. (Whether callers must recompile depends on the language and on whether the accessor is inlined, as it is in the C++ example above.)
Here's a nice long SO discussion on the subject: Why use getters and setters.
The question you want to ask yourself is "What's going to happen 3 months from now when you realize that FeedItem.url does need to be validated but it's already referenced directly from 287 other classes?"
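To make that concrete, here is a minimal C++ sketch of the accessor version after validation has been bolted on. The "must start with http:// or https://" rule is only a hypothetical stand-in for real URL validation, and the trailing-underscore member names are my own choice; the point is that callers keep writing item.setURL(...) and none of those 287 call sites has to change.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Sketch only: the prefix rule below is a hypothetical stand-in for real URL validation.
class FeedItem {
public:
    FeedItem(std::string title, std::string description, std::string url)
        : title_(std::move(title)), description_(std::move(description)) {
        setURL(std::move(url));  // the constructor goes through the same check
    }

    const std::string& getTitle() const { return title_; }
    const std::string& getDescription() const { return description_; }
    const std::string& getURL() const { return url_; }

    void setTitle(std::string title) { title_ = std::move(title); }
    void setDescription(std::string description) { description_ = std::move(description); }

    // Validation added after the fact; the signature callers see is unchanged.
    void setURL(std::string url) {
        const bool hasHttp = url.rfind("http://", 0) == 0;
        const bool hasHttps = url.rfind("https://", 0) == 0;
        if (!hasHttp && !hasHttps) {
            throw std::invalid_argument("FeedItem: URL must start with http:// or https://");
        }
        url_ = std::move(url);
    }

private:
    std::string title_;
    std::string description_;
    std::string url_;
};
```

With the plain struct, the same change would mean finding and editing every place that assigns to url directly.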
The main reason to do this before it's needed is for versioning.
Fields behave differently than properties, especially when using them as an lvalue (where it's often not allowed, especially in C#). Also, if you need to, later, add property get/set routines, you'll break your API - users of your class will need to rewrite their code to use the new version.
It's much safer to do this up front.
C# 3, btw, makes this easier:
public class Example
{
public int MyInt { get; set; }
}
I absolutely agree with you. But in life you should probably do The Right Thing: in school, it's to get good marks. In your workplace it's to fulfill specs. If you want to be stubborn, then that's fine, but do explain yourself -- cover your bases in comments to minimize the damage you might get.
In your particular example above I can see you might want to validate, say, the URL. Maybe you'd even want to sanitize the title and the description, but either way I think this is the sort of thing you can tell early on in the class design. State your intentions and your rationale in comments. If you don't need validation then you don't need a getter and setter, you're absolutely right.
Simplicity pays, it's a valuable feature. Never do anything religiously.
If something's a simple struct, then yes it's ridiculous because it's just DATA.
This is really just a throwback to the beginning of OOP where people still didn't get the idea of classes at all. There's no reason to have hundreds of get and set methods just in case you might change getId() to be a remote call to the Hubble telescope some day.
You really want that functionality at the TOP level; at the bottom it's worthless. I.e., you would have a complex method that was sent a pure virtual class to work on, guaranteeing it can still work no matter what happens below. Just placing it randomly in every struct is a joke, and it should never be done for a POD.
Maybe both options are a bit wrong, because neither version of the class has any behaviour. It's hard to comment further without more context.
See http://www.pragprog.com/articles/tell-dont-ask
Now let's imagine that your FeedItem class has become wonderfully popular and is being used by projects all over the place. You decide you need to (as other answers have suggested) validate the URL that has been provided.
Happy days, you have written a setter for the URL. You edit this, validate the URL and throw an exception if it is invalid. You release your new version of the class and everyone using it is happy. (Let's ignore checked vs unchecked exceptions to keep this on-track).
Except, then you get a call from an angry developer. They were reading a list of feed items from a file when their application starts up. And now, if someone makes a little mistake in the configuration file your new exception is thrown and the whole system doesn't start up, just because one frigging feed item was wrong!
You may have kept the method signature the same, but you have changed the semantics of the interface and so it breaks dependent code. Now, you can either take the high ground and tell them to rewrite their program right or you humbly add setURLAndValidate.
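A rough C++ sketch of the "humble" option, under the assumption that the original setter accepted anything: setURL keeps its old contract, and the new behaviour is opt-in under a new name. looksLikeUrl is a hypothetical helper, not a real validation routine.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical check; a stand-in for whatever validation the new release needs.
inline bool looksLikeUrl(const std::string& s) {
    return s.rfind("http://", 0) == 0 || s.rfind("https://", 0) == 0;
}

class FeedItem {
public:
    const std::string& getURL() const { return url_; }

    // Original contract preserved: accepts anything, never throws on bad input.
    void setURL(std::string url) { url_ = std::move(url); }

    // New, opt-in contract: throws std::invalid_argument and leaves the item unchanged.
    void setURLAndValidate(std::string url) {
        if (!looksLikeUrl(url)) {
            throw std::invalid_argument("invalid feed URL");
        }
        url_ = std::move(url);
    }

private:
    std::string url_;
};
```

Existing callers that load sloppy configuration files keep working, and new callers choose the stricter method deliberately.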
Keep in mind that coding "best practices" are often made obsolete by advances in programming languages.
For example, in C# the getter/setter concept has been baked into the language in the form of properties. C# 3.0 made this easier with the introduction of automatic properties, where the compiler automatically generates the getter/setter for you. C# 3.0 also introduced object initializers, which means that in most cases you no longer need to declare constructors which simply initialize properties.
So the canonical C# way to do what you're doing would look like this:
class FeedItem
{
public string Title { get; set; } // automatic properties
public string Description { get; set; }
public string Url { get; set; }
};
And the usage would look like this (using object initializer):
FeedItem fi = new FeedItem() { Title = "Some Title", Description = "Some Description", Url = "Some Url" };
The point is that you should try to learn what the best practice or canonical way of doing things is for the particular language you are using, and not simply copy old habits which no longer make sense.
As a C++ developer I make my members always private simply to be consistent. So I always know that I need to type p.x(), and not p.x.
Also, I usually avoid implementing setter methods. Instead of changing an object I create a new one:
p = Point(p.x(), p.y() + 1);
This preserves encapsulation as well.
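For anyone who hasn't seen the style, a minimal sketch of such a Point might look like the following. The int coordinates and the movedUp helper are assumptions for illustration only.

```cpp
// Minimal sketch of a value type with accessors but no setters.
class Point {
public:
    Point(int x, int y) : x_(x), y_(y) {}
    int x() const { return x_; }
    int y() const { return y_; }

private:
    int x_;
    int y_;
};

// "Moving" a point means building a new one; the original is left untouched.
inline Point movedUp(const Point& p) {
    return Point(p.x(), p.y() + 1);
}
```

Assignment such as p = Point(p.x(), p.y() + 1); still works because the members are private but not const, so the class keeps its implicit copy assignment.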
There absolutely is a point where encapsulation becomes ridiculous.
The more abstraction that is introduced into code the greater your up-front education, learning-curve cost will be.
Everyone who knows C can debug a horribly written 1000 line function that uses just the basic language and the C standard library. Not everyone can debug the framework you've invented. Every introduced level of encapsulation/abstraction must be weighed against the cost. That's not to say it's not worth it, but as always you have to find the optimal balance for your situation.
One of the problems that the software industry faces is the problem of reusable code. It's a big problem. In the hardware world, hardware components are designed once, then the design is reused later when you buy the components and put them together to make new things.
In the software world every time we need a component we design it again and again. It's very wasteful.
Encapsulation was proposed as a technique for ensuring that modules that are created are reusable. That is, there is a clearly defined interface that abstracts the details of the module and makes it easier to use that module later. The interface also prevents misuse of the object.
The simple classes that you build in class do not adequately illustrate the need for the well-defined interface. Saying "But why don't you people just do THIS UNTIL that situation arises?" will not work in real life. What you are learning in your software engineering course is to engineer software that other programmers will be able to use. Consider that the creators of libraries such as those provided by the .NET Framework and the Java API absolutely require this discipline. If they decided that encapsulation was too much trouble these environments would be almost impossible to work with.
Following these guidelines will result in high quality code in the future. Code that adds value to the field because more than just yourself will benefit from it.
One last point: encapsulation also makes it possible to adequately test a module and be reasonably sure that it works. Without encapsulation, testing and verification of your code would be that much more difficult.
Getters/Setters are, of course, good practice but they are tedious to write and, even worse, to read.
How many times have we read a class with half a dozen member variables and accompanying getters/setters, each with the full hog #param/#return HTML encoded, famously useless comment like 'get the value of X', 'set the value of X', 'get the value of Y', 'set the value of Y', 'get the value of Z', 'set the value of Zzzzzzzzzzzzz. thump!
This is a very common question: "But why don't you people just do THIS UNTIL that situation arises?".
The reason is simple: usually it is much cheaper not to fix/retest/redeploy it later, but to do it right the first time.
Old estimates say that maintenance costs are 80%, and much of that maintenance is exactly what you are suggesting: doing the right thing only after someone had a problem. Doing it right the first time allows us to concentrate on more interesting things and to be more productive.
Sloppy coding is usually very unprofitable - your customers are unhappy because the product is unreliable and they are not productive when they are using it. Developers are not happy either - they spend 80% of time doing patches, which is boring. Eventually you can end up losing both customers and good developers.
I agree with you, but it's important to survive the system. While in school, pretend to agree. In other words, being marked down is detrimental to you and it is not worth it to be marked down for your principles, opinions, or values.
Also, while working on a team or at an employer, pretend to agree. Later, start your own business and do it your way. While you try the ways of others, be calmly open-minded toward them -- you may find that these experiences re-shape your views.
Encapsulation is theoretically useful in case the internal implementation ever changes. For example, if the per-object URL became a calculated result rather than a stored value, then the getUrl() encapsulation would continue to work. But I suspect you already have heard this side of it.
I would like to do a very simple test for the Constructor of my class,
[Test]
public void InitLensShadingPluginTest()
{
_lensShadingStory.WithScenario("Init Lens Shading plug-in")
.Given(InitLensShadingPlugin)
.When(Nothing)
.Then(PluginIsCreated)
.Execute();
}
This could go in either Given or When... I think it should be in When(), but it doesn't really matter.
private void InitLensShadingPlugin()
{
_plugin = new LSCPlugin(_imagesDatabaseProvider, n_iExternalToolImageViewerControl);
}
Since the constructor is the thing being tested, I do not have anything to do inside the When() statement,
and in Then() I assert that the plugin was created.
private void PluginIsCreated()
{
Assert.NotNull(_plugin);
}
My question is about StoryQ: I do not want to do anything inside When().
I tried to use When(()=>{}), but this is not supported by StoryQ.
This means I need to implement something like
private void Nothing()
{
}
and call When(Nothing).
Is there a better practice?
It's strange that StoryQ doesn't support missing steps; your scenario is actually pretty typical of other examples I've used of starting applications, games etc. up:
Given the chess program is running
Then the pieces should be in the starting positions
for instance. So your desire to use a condition followed by an outcome is perfectly valid.
Looking at StoryQ's API, it doesn't look as if it supports these empty steps. You could always make your own method and call both the Given and When steps inside it, returning the operation from the When:
.GivenIStartedWith(InitLensShadingPlugin)
.Then(PluginIsCreated)
If that seems too clunky, I'd do as you suggested and move the Given to a When, initializing the Given with an empty method with a more meaningful name instead:
Given(NothingIsInitializedYet)
.When(InitLensShadingPlugin)
.Then(PluginIsCreated)
Either of these will solve your problem.
However, if all you're testing is a class, rather than an entire application, using StoryQ is probably overkill. The natural-language BDD frameworks like StoryQ, Cucumber, JBehave etc. are intended to help business and development teams collaborate in their exploration of requirements. They incur significant setup and maintenance overhead, so if the audience of your class-level scenarios / examples is technical, there may be an easier way.
For class-level examples of behaviour I would just go with a plain unit testing tool like NUnit or MSpec. I like using NUnit and putting my "Given / When / Then" in comments:
// Given I initialized the lens shading plugin on startup
_plugin = new LSCPlugin(_imagesDatabaseProvider, n_iExternalToolImageViewerControl);
// Then the plugin should have been created
Assert.NotNull(_plugin);
Steps at a class level aren't reused in the same way they are in full-system scenarios, because classes have much smaller, more encapsulated responsibilities; and developers benefit from reading the code rather than having it hidden away in the step definitions.
Your Given/When/Then comments here might still echo scenarios at a higher level, if the class is directly driving the functionality that the user sees.
Normally for full-system scenarios we would derive the steps from conversations with the "3 amigos":
a business representative (PO, SME, someone who has a problem to be solved)
a tester (who spots scenarios we might otherwise miss)
the dev (who's going to solve the problem).
There might be a pair of devs. UI designers can get involved if they want to. Matt Wynne says it's "3 amigos, where 3 is any number between 3 and 7". The best time to have the conversations is right before the devs pick up the work to begin coding it.
However, if you're working on your own, whether it's a toy or a real application, you might benefit just from having imaginary conversations. I use a pixie called Thistle for mine.
I would love to write code like this:
class Zebra
{
public lazy int StripeCount
{
get { return ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce(); }
}
}
EDIT: Why? I think it looks better than:
class Zebra
{
private Lazy<int> _StripeCount;
public Zebra()
{
this._StripeCount = new Lazy<int>(() => ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce());
}
public int StripeCount
{
get { return this._StripeCount.Value; }
}
}
The first time you call the property, it would run the code in the get block, and afterward would just return the value from it.
My questions:
What costs would be involved with adding this kind of keyword to the library?
What situations would this be problematic in?
Would you find this useful?
I'm not starting a crusade to get this into the next version of the library, but I am curious what kind of considerations a feature such as this should have to go through.
I am curious what kind of considerations a feature such as this should have to go through.
First off, I write a blog about this subject, amongst others. See my old blog:
http://blogs.msdn.com/b/ericlippert/
and my new blog:
http://ericlippert.com
for many articles on various aspects of language design.
Second, the C# design process is now open for view to the public, so you can see for yourself what the language design team considers when vetting new feature suggestions. See https://github.com/dotnet/roslyn/ for details.
What costs would be involved with adding this kind of keyword to the library?
It depends on a lot of things. There are, of course, no cheap, easy features. There are only less expensive, less difficult features. In general, the costs are those involving designing, specifying, implementing, testing, documenting and maintaining the feature. There are more exotic costs as well, like the opportunity cost of not doing a better feature, or the cost of choosing a feature that interacts poorly with future features we might want to add.
In this case the feature would probably be simply making the "lazy" keyword a syntactic sugar for using Lazy<T>. That's a pretty straightforward feature, not requiring a lot of fancy syntactic or semantic analysis.
What situations would this be problematic in?
I can think of a number of factors that would cause me to push back on the feature.
First off, it is not necessary; it's merely a convenient sugar. It doesn't really add new power to the language. The benefits don't seem to be worth the costs.
Second, and more importantly, it enshrines a particular kind of laziness into the language. There is more than one kind of laziness, and we might choose wrong.
How is there more than one kind of laziness? Well, think about how it would be implemented. Properties are already "lazy" in that their values are not calculated until the property is called, but you want more than that; you want a property that is called once, and then the value is cached for the next time. By "lazy" essentially you mean a memoized property. What guarantees do we need to put in place? There are many possibilities:
Possibility #1: Not threadsafe at all. If you call the property for the "first" time on two different threads, anything can happen. If you want to avoid race conditions, you have to add synchronization yourself.
Possibility #2: Threadsafe, such that two calls to the property on two different threads both call the initialization function, and then race to see who fills in the actual value in the cache. Presumably the function will return the same value on both threads, so the extra cost here is merely in the wasted extra call. But the cache is threadsafe, and doesn't block any thread. (Because the threadsafe cache can be written with low-lock or no-lock code.)
Code to implement thread safety comes at a cost, even if it is low-lock code. Is that cost acceptable? Most people write what are effectively single-threaded programs; does it seem right to add the overhead of thread safety to every single lazy property call whether it's needed or not?
Possibility #3: Threadsafe such that there is a strong guarantee that the initialization function will only be called once; there is no race on the cache. The user might have an implicit expectation that the initialization function is only called once; it might be very expensive and two calls on two different threads might be unacceptable. Implementing this kind of laziness requires full-on synchronization where it is possible that one thread blocks indefinitely while the lazy method is running on another thread. It also means there could be deadlocks if there's a lock-ordering problem with the lazy method.
That adds even more cost to the feature, a cost that is borne equally by people who do not take advantage of it (because they are writing single-threaded programs).
So how do we deal with this? We could add three features: "lazy not threadsafe", "lazy threadsafe with races" and "lazy threadsafe with blocking and maybe deadlocks". And now the feature just got a whole lot more expensive and way harder to document. This produces an enormous user education problem. Every time you give a developer a choice like this, you present them with an opportunity to write terrible bugs.
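To illustrate how different those three guarantees are in code, here is a rough sketch. It is written in C++ rather than C#, it is not how Lazy<T> or any proposed keyword is implemented, and the class names LazyUnsafe, LazyRacy and LazyBlocking are made up for the example.

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <utility>

// 1. Not thread-safe: callers must synchronize if several threads may call get().
class LazyUnsafe {
public:
    explicit LazyUnsafe(std::function<int()> init) : init_(std::move(init)) {}
    int get() {
        if (!ready_) {
            value_ = init_();
            ready_ = true;
        }
        return value_;
    }
private:
    std::function<int()> init_;
    bool ready_ = false;
    int value_ = 0;
};

// 2. Thread-safe with races: several threads may each run init, but only the
//    first published result is kept, and no thread ever blocks.
class LazyRacy {
public:
    explicit LazyRacy(std::function<int()> init) : init_(std::move(init)) {}
    ~LazyRacy() { delete slot_.load(); }
    LazyRacy(const LazyRacy&) = delete;
    LazyRacy& operator=(const LazyRacy&) = delete;
    int get() {
        int* current = slot_.load(std::memory_order_acquire);
        if (current) return *current;
        int* fresh = new int(init_());
        int* expected = nullptr;
        if (slot_.compare_exchange_strong(expected, fresh,
                                          std::memory_order_acq_rel,
                                          std::memory_order_acquire)) {
            return *fresh;  // this call published the value
        }
        delete fresh;       // lost the race; 'expected' now points at the winner
        return *expected;
    }
private:
    std::function<int()> init_;
    std::atomic<int*> slot_{nullptr};
};

// 3. Thread-safe with blocking: init runs once; other callers wait on the mutex
//    while it runs (and can deadlock if init takes locks in a conflicting order).
class LazyBlocking {
public:
    explicit LazyBlocking(std::function<int()> init) : init_(std::move(init)) {}
    int get() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!ready_) {
            value_ = init_();
            ready_ = true;
        }
        return value_;
    }
private:
    std::function<int()> init_;
    std::mutex mutex_;
    bool ready_ = false;
    int value_ = 0;
};
```

All three look identical at the call site, which is exactly the education problem: nothing about x.get() tells the reader which guarantee they are getting.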
Third, the feature seems weak as stated. Why should laziness be applied merely to properties? It seems like this could be applied generally through the type system:
lazy int x = M(); // doesn't call M()
lazy int y = x + x; // doesn't add x + x
int z = y * y; // now M() is called once and cached.
// x + x is computed and cached
// y * y is computed
We try to not do small, weak features if there is a more general feature that is a natural extension of it. But now we're talking about really serious design and implementation costs.
Would you find this useful?
Personally? Not really useful. I write lots of simple low-lock lazy code mostly using Interlocked.Exchange. (I don't care if the lazy method gets run twice and one of the results discarded; my lazy methods are never that expensive.) The pattern is straightforward, I know it to be safe, there are never extra objects allocated for the delegate or the locks, and if I have something a little more complex I can always use Lazy<T> to do the work for me. It would be a small convenience.
The system library already has a class that does what you want: System.Lazy<T>
I'm sure it could be integrated into the language, but as Eric Lippert will tell you adding features to a language is not something to take lightly. Many things have to be considered, and the benefit/cost ratio needs to be very good. Since System.Lazy already handles this pretty well, I doubt we will see this anytime soon.
Do you know about the Lazy<T> class that was added in .NET 4.0?
http://sankarsan.wordpress.com/2009/10/04/laziness-in-c-4-0-lazyt/
Have you tried this? Do you mean something like this?
private Lazy<int> MyExpensiveCountingValue = new Lazy<int>(new Func<int>(()=> ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce()));
public int StripeCount
{
get
{
return MyExpensiveCountingValue.Value;
}
}
EDIT:
After your edit, I would add that your idea is definitely more elegant, but it still has the same functionality.
This is unlikely to be added to the C# language because you can easily do it yourself, even without Lazy<T>.
A simple, but not thread-safe, example:
class Zebra
{
private int? stripeCount;
public int StripeCount
{
get
{
if (this.stripeCount == null)
{
this.stripeCount = ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce();
}
return this.stripeCount.Value;
}
}
}
If you don't mind using a post-compiler, CciSharp has this feature:
class Zebra {
[Lazy] public int StripeCount {
get { return ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce(); }
}
}
Have a look at the Lazy<T> type. Also ask Eric Lippert about adding things like this to the language, he would no doubt have a view.
We have a lot of code that passes about “Ids” of data rows; these are mostly ints or guids. I could make this code safer by creating a different struct for the id of each database table. Then the type checker will help to find cases when the wrong ID is passed.
E.g. the Person table has a column called PersonId and we have code like:
DeletePerson(int personId)
DeleteCar(int carId)
Would it be better to have:
struct PersonId
{
private int id;
// GetHashCode etc....
}
DeletePerson(PersonId personId)
DeleteCar(CarId carId)
Has anyone got real-life experience of doing this?
Is it worth the overhead?
Or more pain than it is worth?
(It would also make it easier to change the data type of the primary key in the database; that is why I thought of this idea in the first place.)
Please don’t say use an ORM or some other big change to the system design. I know an ORM would be a better option, but that is not under my power at present. However, I can make minor changes like the above to the module I am working on at present.
Update:
Note this is not a web application and the Ids are kept in memory and passed about with WCF, so there is no conversion to/from strings at the edge. There is no reason that the WCF interface can’t use the PersonId type etc. The PersonId type etc. could even be used in the WPF/Winforms UI code.
The only inherently "untyped" bit of the system is the database.
This seems to be down to the cost/benefit of spending time writing code that the compiler can check better, or spending the time writing more unit tests. I am coming down more on the side of spending the time on testing, as I would like to see at least some unit tests in the code base.
It's hard to see how it could be worth it: I recommend doing it only as a last resort and only if people are actually mixing identifiers during development or reporting difficulty keeping them straight.
In web applications in particular it won't even offer the safety you're hoping for: typically you'll be converting strings into integers anyway. There are just too many cases where you'll find yourself writing silly code like this:
int personId;
if (Int32.TryParse(Request["personId"], out personId)) {
this.person = this.PersonRepository.Get(new PersonId(personId));
}
Dealing with complex state in memory certainly improves the case for strongly-typed IDs, but I think Arthur's idea is even better: to avoid confusion, demand an entity instance instead of an identifier. In some situations, performance and memory considerations could make that impractical, but even those should be rare enough that code review would be just as effective without the negative side-effects (quite the reverse!).
I've worked on a system that did this, and it didn't really provide any value. We didn't have ambiguities like the ones you're describing, and in terms of future-proofing, it made it slightly harder to implement new features without any payoff. (No ID's data type changed in two years, at any rate - it could certainly happen at some point, but as far as I know, the return on investment for that is currently negative.)
I wouldn't make a special id for this. This is mostly a testing issue. You can test the code and make sure it does what it is supposed to.
You can create a standard way of doing things in your system that helps future maintenance (similar to what you mention) by passing in the whole object to be manipulated. Of course, if you named your parameter (int personID) and had documentation, then any non-malicious programmer should be able to use the code effectively when calling that method. Passing a whole object will do the type matching that you are looking for, and that should be enough of a standardized way.
I just see having a special structure made to guard against this as adding more work for little benefit. Even if you did this, someone could come along and find a convenient way to make a 'helper' method and bypass whatever structure you put in place anyway so it really isn't a guarantee.
You can just opt for GUIDs, like you suggested yourself. Then, you won't have to worry about passing a person ID of "42" to DeleteCar() and accidentally delete the car with ID of 42. GUIDs are unique; if you pass a person GUID to DeleteCar in your code because of a programming typo, that GUID will not be a PK of any car in the database.
You could create a simple Id class which can help differentiate in code between the two:
public class Id<T>
{
private int RawValue
{
get;
set;
}
public Id(int value)
{
this.RawValue = value;
}
public static explicit operator int (Id<T> id) { return id.RawValue; }
// this cast is optional and can be excluded for further strictness
public static implicit operator Id<T> (int value) { return new Id<T>(value); }
}
Used like so:
class SomeClass
{
public Id<Person> PersonId { get; set; }
public Id<Car> CarId { get; set; }
}
Assuming your values would only be retrieved from the database, unless you explicitly cast the value to an integer, it is not possible to use the two in each other's place.
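The same tagged-identifier idea can be sketched in C++, where the compiler check is easy to demonstrate with static_assert. This is only an illustration of the technique, not a port of the class above: the raw() accessor, the deletePerson/deleteCar stand-ins and the empty Person/Car tags are all hypothetical.

```cpp
#include <type_traits>

// Phantom-tagged identifier: Id<Person> and Id<Car> are unrelated types
// even though both wrap a plain int.
template <typename Tag>
class Id {
public:
    explicit Id(int value) : value_(value) {}
    int raw() const { return value_; }  // explicit escape hatch back to the integer
    friend bool operator==(Id a, Id b) { return a.value_ == b.value_; }
    friend bool operator!=(Id a, Id b) { return !(a == b); }
private:
    int value_;
};

// Hypothetical entity tags; they are never instantiated, only used as labels.
struct Person;
struct Car;
using PersonId = Id<Person>;
using CarId = Id<Car>;

// Stand-in bodies: each returns the raw key that would be sent to the database.
inline int deletePerson(PersonId id) { return id.raw(); }
inline int deleteCar(CarId id) { return id.raw(); }

// The mistakes the question worries about no longer compile.
static_assert(!std::is_convertible<PersonId, CarId>::value,
              "a PersonId must not be accepted where a CarId is expected");
static_assert(!std::is_convertible<int, PersonId>::value,
              "a raw int needs an explicit PersonId(...) conversion");
```

Here deleteCar(PersonId(42)) is a compile error, while deleteCar(CarId(42)) is fine. Whether that check is worth the extra wrapping and unwrapping at the database boundary is the cost/benefit question the other answers argue about.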
I don't see much value in custom checking in this case. You might want to beef up your testing suite to check that two things are happening:
Your data access code always works as you expect (i.e., you aren't loading inconsistent Key information into your classes and getting misuse because of that).
That your "round trip" code is working as expected (i.e., that loading a record, making a change and saving it back isn't somehow corrupting your business logic objects).
Having a data access (and business logic) layer you can trust is crucial to being able to address the bigger-picture problems you will encounter attempting to implement the actual business requirements. If your data layer is unreliable you will be spending a lot of effort tracking (or worse, working around) problems at that level that surface when you put load on the subsystem.
If instead your data access code is robust in the face of incorrect usage (what your test suite should be proving to you) then you can relax a bit on the higher levels and trust they will throw exceptions (or however you are dealing with it) when abused.
The reason you hear people suggesting an ORM is that many of these issues are dealt with in a reliable way by such tools. If your implementation is far enough along that such a switch would be painful, just keep in mind that your low-level data access layer needs to be as robust as a good ORM if you really want to be able to trust (and thus, to a certain extent, forget about) your data access.
Instead of custom validation, your testing suite could inject code (via dependency injection) that does robust tests of your Keys (hitting the database to verify each change) as the tests run and that injects production code that omits or restricts such tests for performance reasons. Your data layer will throw errors on failed keys (if you have your foreign keys set up correctly there) so you should also be able to handle those exceptions.
My gut says this just isn't worth the hassle. My first question to you would be whether you actually have found bugs where the wrong int was being passed (a Car ID instead of a Person ID in your example). If so, it is probably more of a case of worse overall architecture in that your Domain objects have too much coupling, and are passing too many arguments around in method parameters rather than acting on internal variables.
And if so, why?
and what constitutes "long running"?
Doing magic in a property accessor seems like my prerogative as a class designer. I always thought that is why the designers of C# put those things in there - so I could do what I want.
Of course it's good practice to minimize surprises for users of a class, and so embedding truly long-running things - e.g., a 10-minute Monte Carlo analysis - in a method makes sense.
But suppose a prop accessor requires a db read. I already have the db connection open. Would db access code be "acceptable", within the normal expectations, in a property accessor?
Like you mentioned, it's a surprise for the user of the class. People are used to being able to do things like this with properties (contrived example follows:)
foreach (var item in bunchOfItems)
foreach (var slot in someCollection)
slot.Value = item.Value;
This looks very natural, but if item.Value actually is hitting the database every time you access it, it would be a minor disaster, and should be written in a fashion equivalent to this:
foreach (var item in bunchOfItems)
{
var temp = item.Value;
foreach (var slot in someCollection)
slot.Value = temp;
}
Please help steer people using your code away from hidden dangers like this, and put slow things in methods so people know that they're slow.
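To put a number on the difference, here is a small C++ sketch that counts how often the "expensive" accessor runs in each version of the loop. The Item and Slot types and the fetchCount() counter are hypothetical; the counter simply stands in for a database round trip.

```cpp
#include <vector>

// Global counter standing in for "number of database round trips".
inline int& fetchCount() {
    static int n = 0;
    return n;
}

// Hypothetical item whose innocent-looking accessor is secretly expensive.
class Item {
public:
    explicit Item(int v) : value_(v) {}
    int value() const {
        ++fetchCount();  // pretend this is a query
        return value_;
    }
private:
    int value_;
};

struct Slot {
    int value = 0;
};

// Natural-looking version: one fetch per (item, slot) pair.
inline void copyNaive(const std::vector<Item>& items, std::vector<Slot>& slots) {
    for (const Item& item : items)
        for (Slot& slot : slots)
            slot.value = item.value();
}

// Hoisted version: one fetch per item.
inline void copyHoisted(const std::vector<Item>& items, std::vector<Slot>& slots) {
    for (const Item& item : items) {
        const int temp = item.value();
        for (Slot& slot : slots)
            slot.value = temp;
    }
}
```

With 3 items and 4 slots, copyNaive performs 12 fetches and copyHoisted performs 3, and both leave the slots in the same final state. A caller has no reason to write the second form unless something tells them the accessor is slow.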
There are some exceptions, of course. Lazy-loading is fine as long as the lazy load isn't going to take some insanely long amount of time, and sometimes making things properties is really useful for reflection- and data-binding-related reasons, so maybe you'll want to bend this rule. But there's not much sense in violating the convention and violating people's expectations without some specific reason for doing so.
In addition to the good answers already posted, I'll add that the debugger automatically displays the values of properties when you inspect an instance of a class. Do you really want to be debugging your code and have database fetches happening in the debugger every time you inspect your class? Be nice to the future maintainers of your code and don't do that.
Also, this question is extensively discussed in the Framework Design Guidelines; consider picking up a copy.
A db read in a property accessor would be fine; that's actually the whole point of lazy-loading. I think the most important thing would be to document it well so that users of the class understand that there might be a performance hit when accessing that property.
You can do whatever you want, but you should keep the consumers of your API in mind. Accessors and mutators (getters and setters) are expected to be very lightweight. With that expectation, developers consuming your API might make frequent and chatty calls to these properties. If you are consuming external resources in your implementation, there might be an unexpected bottleneck.
For consistency's sake, it's good to stick with convention for public APIs. If your implementations will be exclusively private, then there's probably no harm (other than an inconsistent approach to solving problems privately versus publicly).
It is just "good practice" not to write property accessors that take a long time to execute.
That's because properties look like fields to the caller, and hence the caller (that is, a user of your API) usually assumes there is nothing more behind them than a "return something;"
If you really need some "action" behind the scenes, consider creating a method for that...
I don't see what the problem is with that, as long as you provide XML documentation so that IntelliSense tells the object's consumer what they're getting themselves into.
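For illustration, here is a minimal sketch of what such documentation might look like. The names `Report`, `Total`, and `queryTotal` are hypothetical, and the database query is replaced by a delegate supplied by the caller so the example stays self-contained:

```csharp
using System;

public class Report
{
    // Hypothetical stand-in for a real database query.
    private readonly Func<decimal> queryTotal;

    public Report(Func<decimal> queryTotal)
    {
        this.queryTotal = queryTotal;
    }

    /// <summary>Gets the current total for this report.</summary>
    /// <remarks>
    /// Each access runs a database query. Read the value once and keep it
    /// in a local variable if you need it inside a loop.
    /// </remarks>
    public decimal Total
    {
        get { return queryTotal(); }
    }
}
```

The `<summary>` and `<remarks>` text is what shows up in the editor tooltip, so the warning reaches the caller at the point of use.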
I think this is one of those situations where there is no one right answer. My motto is "Saying always is almost always wrong." You should do what makes the most sense in any given situation without regard to broad generalizations.
A database access in a property getter is fine, but try to limit the number of times the database is hit by caching the value.
There are many times that people use properties in loops without thinking about the performance, so you have to anticipate this use. Programmers don't always store the value of a property when they are going to use it many times.
Cache the value returned from the database in a private variable, if it is feasible for this piece of data. This way the accesses are usually very quick.
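A minimal sketch of that caching approach, under a few assumptions: `UserProfile`, `DisplayName`, and `loadDisplayName` are hypothetical names, the delegate stands in for the real database read, and a null field is taken to mean "not loaded yet":

```csharp
using System;

public class UserProfile
{
    // Hypothetical stand-in for the database read.
    private readonly Func<string> loadDisplayName;

    // Cached value; null means the database has not been queried yet.
    private string displayName;

    public UserProfile(Func<string> loadDisplayName)
    {
        this.loadDisplayName = loadDisplayName;
    }

    public string DisplayName
    {
        get
        {
            // Only the first access pays for the database read.
            // Note: if the loader itself returns null, it runs again next time.
            if (displayName == null)
                displayName = loadDisplayName();
            return displayName;
        }
    }
}
```

With this in place, a caller that reads `DisplayName` inside a loop triggers a single read rather than one per iteration.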
This isn't directly related to your question, but have you considered a load-once approach in combination with a refresh parameter?
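class Example
{
    private bool userNameLoaded = false;
    private string userName = "";

    // Pass refresh = true to force a reload on this call.
    public string UserName(bool refresh)
    {
        // Only invalidate the cache when a refresh is requested. Assigning
        // !refresh here would mark the value as loaded before it was fetched.
        if (refresh)
            userNameLoaded = false;
        return UserName();
    }

    public string UserName()
    {
        if (!userNameLoaded)
        {
            /*
            userName=SomeDBMethod();
            */
            userNameLoaded = true;
        }
        return userName;
    }
}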
I often find myself writing a property that is evaluated lazily. Something like:
if (backingField == null)
backingField = SomeOperation();
return backingField;
It is not much code, but it does get repeated a lot if you have a lot of properties.
I am thinking about defining a class called LazyProperty:
public class LazyProperty<T>
{
    private readonly Func<T> getter;
    public LazyProperty(Func<T> getter)
    {
        this.getter = getter;
    }
    private bool loaded = false;
    private T propertyValue;
    public T Value
    {
        get
        {
            if (!loaded)
            {
                propertyValue = getter();
                loaded = true;
            }
            return propertyValue;
        }
    }
    public static implicit operator T(LazyProperty<T> rhs)
    {
        return rhs.Value;
    }
}
This would enable me to initialize a field like this:
first = new LazyProperty<HeavyObject>(() => new HeavyObject { MyProperty = Value });
And then the body of the property could be reduced to:
public HeavyObject First { get { return first; } }
This would be used by most of the company, since it would go into a common class library shared by most of our products.
I cannot decide whether this is a good idea or not. I think the solution has some pros, like:
Less code
Prettier code
On the downside, it would be harder to look at the code and determine exactly what happens - especially if a developer is not familiar with the LazyProperty class.
What do you think? Is this a good idea, or should I abandon it?
Also, is the implicit operator a good idea, or would you prefer to use the Value property explicitly if you were using this class?
Opinions and suggestions are welcomed :-)
Just to be overly pedantic:
Your proposed solution to avoid repeating code:
private LazyProperty<HeavyObject> first =
    new LazyProperty<HeavyObject>(() => new HeavyObject { MyProperty = Value });
public HeavyObject First {
    get {
        return first;
    }
}
Is actually more characters than the code that you did not want to repeat:
private HeavyObject first;
public HeavyObject First {
    get {
        if (first == null) first = new HeavyObject { MyProperty = Value };
        return first;
    }
}
Apart from that, I think the implicit cast makes the code very hard to understand. I would not have guessed that a property that simply returns first actually ends up creating a HeavyObject. I would at least drop the implicit conversion and return first.Value from the property.
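As a rough sketch of that suggestion: the version below is a trimmed copy of the question's LazyProperty<T> with the implicit operator removed. The `Owner` class and its constructor argument are hypothetical, added only so the example compiles on its own:

```csharp
using System;

// Trimmed copy of the question's class, without the implicit operator.
public class LazyProperty<T>
{
    private readonly Func<T> getter;
    private bool loaded;
    private T propertyValue;

    public LazyProperty(Func<T> getter)
    {
        this.getter = getter;
    }

    public T Value
    {
        get
        {
            if (!loaded)
            {
                propertyValue = getter();
                loaded = true;
            }
            return propertyValue;
        }
    }
}

public class HeavyObject
{
    public int MyProperty { get; set; }
}

// Hypothetical owning class, for illustration only.
public class Owner
{
    private readonly LazyProperty<HeavyObject> first;

    public Owner(int value)
    {
        first = new LazyProperty<HeavyObject>(() => new HeavyObject { MyProperty = value });
    }

    // The explicit .Value signals that this is more than a plain field read.
    public HeavyObject First { get { return first.Value; } }
}
```

Without the implicit operator, `return first;` no longer compiles, so the lazy evaluation cannot hide behind something that looks like a field access.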
Don't do it at all.
Generally, lazily initialized properties like this are a valid design choice in one case: when SomeOperation(); is an expensive operation (computationally, or in terms of I/O, such as when it requires a DB hit) AND when you are certain you will often NOT need to access it.
That said, by default you should go for eager initialization, and switch to lazy initialization only when the profiler says it's your bottleneck.
If you feel the urge to create that kind of abstraction, it's a smell.
Surely you'd at least want the LazyProperty<T> to be a value type, otherwise you've added memory and GC pressure for every "lazily-loaded" property in your system.
Also, what about multithreaded scenarios? Consider two threads requesting the property at the same time. Without locking, you could potentially create two instances of the underlying property. To avoid locking in the common case, you would want to use double-checked locking.
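A minimal sketch of double-checked locking applied to the question's class, offered as one possible shape rather than a vetted implementation. The class name `ThreadSafeLazyProperty` and the `gate` field are illustrative assumptions:

```csharp
using System;

public sealed class ThreadSafeLazyProperty<T>
{
    private readonly object gate = new object();
    private readonly Func<T> getter;

    // volatile so the unlocked read below cannot observe loaded == true
    // before the write to propertyValue is visible.
    private volatile bool loaded;
    private T propertyValue;

    public ThreadSafeLazyProperty(Func<T> getter)
    {
        if (getter == null) throw new ArgumentNullException("getter");
        this.getter = getter;
    }

    public T Value
    {
        get
        {
            // First check: no lock taken once the value has been loaded.
            if (!loaded)
            {
                lock (gate)
                {
                    // Second check: another thread may have loaded the value
                    // while this one was waiting for the lock.
                    if (!loaded)
                    {
                        propertyValue = getter();
                        loaded = true;
                    }
                }
            }
            return propertyValue;
        }
    }
}
```

If you target .NET 4 or later, the built-in System.Lazy<T> covers the same ground and is thread-safe by default, so a hand-rolled version like this is mainly useful for seeing how the pattern works.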
I prefer the first code, because a) it is such a common pattern with properties that I immediately understand it, and b) the point you raised: that there is no hidden magic that you have to go look up to understand where and when the value is being obtained.
I like the idea in that it is much less code and more elegant, but I would be very worried about the fact that it becomes hard to look at it and tell what is going on. The only way I would consider it is to have a convention for variables set using the "lazy" way, and also to comment anywhere it is used. Now there isn't going to be a compiler or anything that will enforce those rules, so still YMMV.
In the end, for me, decisions like this boil down to who is going to be looking at it and the quality of those programmers. If you can trust your fellow developers to use it right and comment well, then go for it, but if not, you are better off doing it in an easily understood and followed way. /my 2cents
I don't think worrying about a developer not understanding is a good argument against doing something like this...
If you think that way, then you couldn't do anything for fear of someone not understanding what you did.
You could write a tutorial or something similar in a central repository; we have a wiki here for these kinds of notes.
Overall, I think it's a good implementation idea (not wanting to start a debate whether lazyloading is a good idea or not)
What I do in this case is I create a Visual Studio code snippet. I think that's what you really should do.
For example, when I create ASP.NET controls, I often times have data that gets stored in the ViewState a lot, so I created a code snippet like this:
public Type Value
{
    get
    {
        if(ViewState["key"] == null)
            ViewState["key"] = someDefaultValue;
        return (Type)ViewState["key"];
    }
    set{ ViewState["key"] = value; }
}
This way, the code can be easily created with only a little work (defining the type, the key, the name, and the default value). It's reusable, but you don't have the disadvantage of a complex piece of code that other developers might not understand.
I like your solution as it is very clever but I don't think you win much by using it. Lazy loading a private field in a public property is definitely a place where code can be duplicated. However this has always struck me as a pattern to use rather than code that needs to be refactored into a common place.
Your approach may become a concern in the future if you do any serialization. The custom type also makes the code harder to understand at first glance.
Overall I applaud your attempt and appreciate its cleverness but would suggest that you revert to your original solution for the reasons stated above.
Personally, I don't think the LazyProperty class as it stands offers enough value to justify using it, especially considering the drawbacks of using it for value types (as Kent mentioned). If you needed other functionality (like making it thread-safe), it might be justified as a ThreadSafeLazyProperty class.
Regarding the implicit operator, I like the "Value" property better. It's a little more typing, but a lot clearer to me.
I think this is an interesting idea. First, I would recommend that you hide the LazyProperty from the calling code; you don't want to leak into your domain model that it is lazy. You're already doing that with the implicit operator, so keep that.
I like how you can use this approach to handle and abstract away details such as locking. If you do that, then I think there is value and merit. If you do add locking, watch out for the double-checked locking pattern; it's very easy to get wrong.