Declaring variables - best practices - c#

I found a while ago (and I want to confirm again) that if you declare a class level variable, you should not call its constructor until the class constructor or load has been called. The reason was performance - but are there other reasons to do or not do this? Are there exceptions to this rule?
ie: this is what I do based on what I think the best practice is:
public class SomeClass
{
private PersonObject _person;
public SomeClass()
{
_person = new PersonObject("Smitface");
}
}
opposed to:
public class SomeClass
{
private PersonObject _person = new PersonObject("Smitface");
public SomeClass()
{
}
}

If you set your variable outside of the constructor then there is no error handling (handeling) available. While in your example it makes no difference, but there are many cases that you may want to have some sort of error handling. In that case using your first option would be correct.
Nescio talked about what implication this would have on your applicaiton if there were some constructor failures.
For that reason, I always use Option #1.

Honestly, if you look at the IL, all that happens in the second case is the compiler moves the initialization to the constructor for you.
Personally, I like to see all initialization done in the constructor. If I'm working on a throwaway prototype project, I don't mind having the initialization and declaration in the same spot, but for my "I want to keep this" projects I do it all in the constructor.

Actually, in spite of what others have said, it can be important whether your initialization is inside or outside the constructor, as there is different behaviour during object construction if the object is in a hierarchy (i.e. the order in which things get run is different).
See this post and this post from Eric Lippert which explains the semantic difference between the two in more detail.
So the answer is that in the majority of cases it doesn't make any difference, and it certainly doesn't make any difference in terms of performance, but in a minority of cases it could make a difference, and you should know why, and make a decision based on that.

There's a common pattern called Dependency Injection or Inversion of Control (IOC) that offers exactly these two mechanisms for "injecting" a dependant object (like a DAL class) into a class that's furthur up the dependency chain (furthur from the database)
In this pattern, using a ctor, you would
public class SomeClass
{
private PersonObject per;
public SomeClass(PersonObject person)
{
per = person;
}
}
private PersonObject Joe = new PersonObject("Smitface");
SomeClass MyObj = new SomeClass(Joe);
Now you could for example, pass in a real DAL class for production call
or a test DAL class in a unit test method...

It depends on the context of how the variable will be used. Naturally constants and static or readonly should be initialized on declaration, otherwise they typically should be initialized in the constructor. That way you can swap out design patterns for how your objects are instatiated fairly easy without having to worry about when the variables will be initialized.

You should generally prefer the second variant. It's more robust to changes in your code. Suppose you add a constructor. Now you have to remember to initialize your variables there as well, unless you use the second variant.
Of course, this only counts if there are no compelling reasons to use in-constructor initialization (as mentioned by discorax).

I prefer the latter, but only becuase I find it neater.
It's personal preference really, they both do the same thing.

I like to initialize in the constructor because that way it all happens in one place, and also because it makes it easier if you decide to create an overloaded constructor later on.
Also, it helps to remind me of things that I'll want to clean up in a deconstructor.

the latter can take advantage of lazy instantiation, i.e. it won't initialize the variable until it is referenced

i think this type of question is stylistic only, so who cares what way you do it. the language allows both, so other people are going to do both. don't make bad assumptions.

The first declaration is actually cleaner. The second conceals the fact the constructor initializes the class in the static constructor. If for any reason the constructor fails, the whole type is unusable for the rest of the applicaiton.

I prefer to initialize variables as soon as possible, since it avoids (some) null errors.
Edit: Obviously in this simplified example there is no difference, however in the general case I believe it is good practice to initialize class variables when they are declared, if possible. This makes it impossible to refer to the variable before it it initialized, which eliminates some errors which would be possible when you initialize fields in the constructor.
As you get more class variables and the initialization sequence in the constructor gets more complex, it gets easier to introduce bugs where one initialization depends on another initialization which haven't happend yet.

Related

Passing constructor delegate or object for unmanaged resources

In my (simplified) problem I have a method "Reading" that can use many different implementation of some IDisposableThing. I am passing delegates to the constructor right now so I can use the using statement.
Is this approach of passing a delegate of the constructor of my object appropriate?
My problem is that things like List<Func<IDisposable>> etc start looking bit scary (because delegates look like crap in c#) and passing in a object seems more usual and a clearer statement of intent.
Is there a better/different way of managing this situation without delegates?
public void Main()
{
Reading(() => new DisposableThingImplementation());
Reading(() => new AnotherDisposableThingImplementation());
}
public void Reading(Func<IDisposableThing> constructor)
{
using (IDisposableThing streamReader = constructor())
{
//do things
}
}
As I said in the comment, it's difficult to say what's best for your situation, so instead I'll just list your options so you can make an informed decision:
Continue doing what you're doing
Having to use around objects with an unpleasantly complicated-looking type is maybe not ideal visually, but in your situation it may well be perfectly appropriate
Use a custom delegate type
You can define a delegate like:
public delegate IDisposableThing DisposableThingConstructor();
Then anywhere you would write Func<IDisposableThing>, you can just write DisposableThingConstructor instead. For a commonly used delegate type, this may improve code readability, though this too is a matter of taste.
Move the using statements out of Reading
This really depends on whether it's sensible for the lifecycle management of these objects to be a responsibility of the Reading method or not. Given what we have of your code at the moment, we can't really judge this for you. An implementation with the lifecycle management moved out would look like:
public void Main()
{
using(var disposableThing = new DisposableThingImplementation())
Reading(disposableThing);
}
public void Reading(IDisposableThing disposableThing)
{
//do things
}
Use a factory pattern
In this option, you create a class which returns new IDisposableThing implementations. Lots of information can be found on the factory pattern which you may well already know, so I won't repeat it all here. This option may well be overkill for your purposes here, adding a lot of pointless complexity, but depending on how those DisposableThings are constructed, it may have additional benefits which make it worthwhile.
Use a generic argument
This option will only work if all of your IDisposableThing implementations have a parameterless constructor. I'm guessing that's not the case, but in case it is, it's a relatively straightforward approach:
public void Reading<T>() where T : IDisposableThing, new()
{
using(var disposableThing = new T())
{
//do things
}
}
Use an Inversion of Control container
This is another option which would certainly be overkill if used for this purpose alone. I include it mostly for completeness. Inversion of control containers like Ninject will give you easy ways to manage the lifecycles of objects passed into others.
I very much doubt this would be an appropriate solution in your case, especially since the disposable objects are not being used in another class's constructor. If you later run into a situation where you're trying to manage object lifecycle in a larger, complex object graph, this option might be worth revisiting.
Construct the objects outside of the using statement
This is specifically described as "not a best practice" in the MSDN documentation, but it is an option. You can do:
public void Main()
{
Reading(new DisposableThingImplementation());
}
public void Reading(IDisposableThing disposableThing)
{
using (disposableThing)
{
//do things
}
}
At the end of the using statement, the Dispose method will be called, but the object will not be garbage collected because it is still in scope. Trying to use the object after that would be very likely to cause problems because it is not fully initialized. So again, while this is an option, it's unlikely to be a good one.
Is this approach of passing a delegate of the constructor of my object appropriate? My problem is that things like List<Func<IDisposable>> etc start looking bit scary (because delegates look like crap in c#) and passing in a object seems more usual and a clearer statement of intent.
Yes, it's fine. However I understand your concern about passing a list of those things... Perhaps creating a custom delegate with the same signature as Func<IDisposable> and a more explicit name (e.g. SomethingFactory) would be clearer.
Is there a better/different way of managing this situation without delegates?
You could pass a factory or a list of factories to the method. I don't think it's really "better", though; it's mostly the same, since your factory would typically be represented as an interface with a single method, which is essentially the same as a delegate.

C# Object construction outside the constructor

When it comes to designing classes and "communication" between them, I always try to design them in such way that all object construction and composing take place in object constructor. I don't like the idea of object construction and composition taking place from outside, like other objects setting properties and calling methods on my object to initialize it. This especially gets ugly when multiple object try to do thisto your object and you never know in what order your props\methods will be executed.
Unforunatly I stumbl on such situations quite often, especially now with the growing popularity of dependecy injection frameworks, lots of libraries and frameworks rely on some kind of external object initialization, and quite often require not only constructor injection on our object but property injection too.
My question are:
Is it ok to have objects that relly on some method, or property to be called on them after which they can consider them initialzied?
Is ther some kind of pattern for situations when your object acting is receiver, and must support multiple interfaces that call it, and the order of these calls does matter? (something better than setting flags, like ThisWasDone, ThatWasCalled)
Is it ok to have objects that relly on some method, or property to be called on them after which they can consider them initialzied?
No. Init methods are a pain since there is no guarantee that they will get called. A simple solution is to switch to interfaces and use factory or builder pattern to compose the implementation.
#Mark Seemann has written a article about it: http://blog.ploeh.dk/2011/05/24/DesignSmellTemporalCoupling.aspx
Is there some kind of pattern for situations when your object acting is receiver, and must support multiple interfaces that call it, and the order of these calls does matter? (something better than setting flags, like ThisWasDone, ThatWasCalled)
Builder pattern.
I think it is OK, but there are implications. If this is an object to be used by others, you need to ensure that an exception is thrown any time a method or property is set or accessed and the initialization should have been called but isn't.
Obviously it is much more convenient and intuitive if you can take care of this in the constructor, then you don't have to implement these checks.
I don't see anything wrong in this. It may be not so convinient, but you can not ALWAYS use initialization in ctor, like you can not alwats drive under green light. These are dicisions that you made based on your app requirements.
It's ok. Immagine if your object, for example, need to read data from TCP stream or a file that ciuld be not present or corrupted. Raise an exception from ctor is baaad.
It's ok. If you think, for example, about some your DSL language compiler, it can looks like:
A) find all global variables and check if there mem allocation sum sutisfies your device requierements
B) parse for errors
C) check for self cycling
And so on...
Hoe this helps.
Answering (1)
Why not? An engine needs the driver because this must enter the key for the car, and later power-on. Will a car do things like detecting current speed if engine is stopeed? Or Will the car show remaining oil without powering-on it?
Some programming goals won't be able to have their actors initialized during its object construction, and this isn't because it's a non-proper way of doing things but because it's the natural, regular and/or semantically-wise way of representing its whole behavior.
Answering (2)
A decent class usage documentation will be your best friend. Like answer to (1), there're some things in this world that should be done in order to get them done rightly, and it's not a problem but a requirement.
Checking objects' state using flags isn't a problem too, it's a good way of adding reliability to your object models, because its own behaviors and consumers of them will be aware about if things got done as expected or not.
First of all, Factory Method.
public class MyClass
{
private MyClass()
{
}
public Create()
{
return new MyClass();
}
}
Second of all, why do you not want another class creating an object for you? (Factory)
public class MyThingFactory
{
IThing CreateThing(Speed speed)
{
if(speed == Speed.Fast)
{
return new FastThing();
}
return new SlowThing();
}
}
Third, why do multiple classes have side effects on new instances of your class? Don't you have declarative control over what other classes have access to your object?

Are global static classes and methods bad?

It's generally agreed upon that relying heavily on global stuff is to be avoided. Wouldn't using static classes and methods be the same thing?
Global data is bad. However many issues can be avoided by working with static methods.
I'm going to take the position of Rich Hickey on this one and explain it like this:
To build the most reliable systems in C# use static methods and classes, but not global data. For instance if you hand in a data object into a static method, and that static method does not access any static data, then you can be assured that given that input data the output of the function will always be the same. This is the position taken by Erlang, Lisp, Clojure, and every other Functional Programming language.
Using static methods can greatly simplify multi-threaded coding, since, if programmed correctly, only one thread will have access to a given set of data at a time. And that is really what it comes down to. Having global data is bad since it is a state that can be changed by who knows what thread, and any time. Static methods however allow for very clean code that can be tested in smaller increments.
I know this will be hotly debated, as it flies in the face of C#'s OOP thought process, but I have found that the more static methods I use, the cleaner and more reliable my code is.
This video explains it better than I can, but shows how immutable data, and static methods can produce some extremely thread-safe code.
Let me clarify a bit more some issues with Global Data. Constant (or read-only) global data isn't nearly as big of an issue as mutable (read/write) global data. Therefore if it makes sense to have a global cache of data, use global data! To some extent every application that uses a database will have that, since we could say that all a SQL Database is one massive global variable that holds data.
So making a blanket statement like I did above is probably a bit strong. Instead, let's say that having global data introduces many issues that can be avoid by having local data instead.
Some languages such as Erlang get around this issue by having the cache in a separate thread that handles all requests for that data. This way you know that all requests and modifications to that data will be atomic and the global cache will not be left in some unknown state.
static doesn't necessarely mean global. Classes and members can be static private, hence only applying to the specific class. That said, having too many public static members instead of using appropriate ways to pass data (method calls, callbacks, etc.) is generally bad design.
If you're trying to be purist about your OO development, then statics probably don't fit the mold.
However the real world is messier than theory, and statics are often a very useful way to solve some development problems. Use them when appropriate, and in moderation.
As an addition to whatever else is said, final static variables are just fine; constants are a good thing. The only exception to that is when/if you should just move them to a properties file, to be easier to change.
Mutable static variables are bad because they're just global state. The best discussion I know of about this is here, under the heading "Why Global Variables Should Be Avoided When Unnecessary".
Static methods have several drawbacks that often make them undesirable - the biggest one being that they cannot be used polymorphically.
First, why are the old global variables so bad? Because it is state that is accessible from anywhere, any time. Hard to track.
There are no such problems with static methods.
That leaves static fields (variables). If you declared a public static field in a class, that would truly be a global variable and it would be bad.
But make the static field private and most problems are solved. Or better, they are limited to the containing class and that makes them solvable.
public class Foo
{
private static int counter = 0;
public static int getCounterValue()
{
return counter;
}
//...
public Foo()
{
//other tasks
counter++;
}
}
In the code above you can see that we count how many Foo objects were created. This can be useful in many cases.
The static keyword is not global, it's telling you that it's on class level, which can be very useful in various cases. So, in conclusion, class level things are static, object level things are not static.
Static methods are used to implement traits in Scala. In C#, extension methods (which are static) fulfill that role in part. That could be seen, as the DCI proponents state, as a "higher order form of polymorphism".
Also, static methods can be used to implement functions. This is what F# uses to implement modules. (And also VB.NET.) Functions are useful for (unsurprisingly) functional-programming. And sometimes they're just the way something should be modeled (like the "functions" in the Math class). Again, C# comes close here.
I think the bad thing about global variables is the idea of having global state - variables that can be manipulated anywhere and tend to cause unintended side effects in far-flung areas of a program.
Static data would be similar to global variables in that they introduce a kind of global state. Static methods though are not nearly as bad, assuming they are stateless.
Not entirely. Static actually determines when, where and how often something is instantiated, not who has access to it.

Is object creation in getters bad practice?

Let's have an object created in a getter like this :
public class Class1
{
public string Id { get; set; }
public string Oz { get; set; }
public string Poznamka { get; set; }
public Object object
{
get
{
// maybe some more code
return new Object { Id = Id, poznamla = Poznamka, Oz = OZ };
}
}
}
Or should I rather create a Method that will create and return the object ?
Yes, it is bad practice.
Ideally, a getter should not be changing or creating anything (aside from lazy loading, and even then I think it leads to less clear code...). That way you minimise the risk of unintentional side effects.
Properties look like fields but they are methods. This has been known to cause a phenomenal amount of confusion. When a programmer sees code that appears to be accessing a field, there are many assumptions that the programmer makes that may not be true for a property.So there are some common properties design guidelines.
Avoid returning different values from the property getter. If called multiple times in a row, a property method may return a different value each time; a field returns the same value each time.
A property method may require additional memory or return a reference to something that is not actually part of the object's state, so modifying the returned object has no effect on the original object; querying a field always returns a reference to an object that is guaranteed to be part of the original object's state. Working with a property that returns a copy can be very confusing to developers, and this characteristic is frequently not documented.
Consider that a property cannot be passed as an out or ref parameter to a method; a field can.
Avoid long running property getters. A property method can take a long time to execute; field access always completes immediately.
Avoid throwing exceptions from getters.
Do preserve previous values if a property setter throws an exception
Avoid observable side effects.
Allow properties to be set in any order even if this results in a temporary invalid state of objects.
Sources
"CLR via C#", Jeffrey Richter. Chapter 9. Defining Properties Intelligently
"Framework Design Guidelines" 2nd edition, Brad Abrams, Krzysztof Cwalina, Chapter 5.2 Property Design
If you want your getter to create a new object every time it is accessed, that's the way to do it. This pattern is normally refered to as a Factory Method.
However, this is not normally needed on properties (ie. getters and setters), and as such is considered bad practice.
yes, it is ... from the outside, it should be transparent, whether you access a property or a field ...
when reading twice from field, or a property, you expect two things:
there is no impact on the object's (external) behaviour
you get identical results
I have no real knowledge of C#, but I hope, the following makes my point clear. let's start like this:
Object o1 = myInst.object;
Object o2 = myInst.object;
o1.poznamka = "some note";
in the case of a field, conditions like the following will be true:
o1 == o2;
o2.poznamka == "some note";
if you use a property with a getter, that returns a new object every time called, both conditions will be false ...
your getter seems to be meant to produce a temporary snapshot of your instance ... if that is what you want to do, than make it a plain method ... it avoids any ambiguities ...
A property should, to all intents and purposes, act like a field. That means no exceptions should be thrown, and no new objects should be created (so you don't create lots of unneccessary objects if the property is used in a loop)
Use a wrapper class or similar instead.
According to me if something is 'property' the getter should return you a property (basically a data that is already existing) relevant to the object.
In your case, you are returning something that is not a property of that object at that moment. You are not returning a property of your object but a product of some action.
I would go with a method something like GetMyObject() instead. Especially if there is an 'action' will take place, I think it is most of the time best to have a method than a property name.
And try to imagine what would other developers who are not familiar with your code expect after seeing your property.
A property is just a convenient way to express a calculated field.
It should still represent something about an object, regardless of how the value itself is arrived at. For example, if the object in question is an invoice, you might have to add up the cost of each line item, and return the total.
What's written in the question breaks that rule, because returning a copy of the object isn't something that describes the object. If the return value changes between calls to the property without an explicit change of object state, then the object model is broken.
Speaking in generalities, returning a new object like this will just about always break the rule (I can't think of a counter-example right now), so I would say that it's bad practice.
There's also the gotcha of properties where you can so easily and innocently call on a property multiple times and end up running the same code (which hopefully isn't slow!).
For writing code that is easily tested, you have to maintain separation of Object initialization.
i.e while in test cases you do not have hold on test some specific items.
like in House object you dont want to test anything related to kitchen object.
and you wana test only the garden. so while you initiate a house class and initiate object in some constructors or in getters you wont be coding good that will support testing.
As an aside to the comments already made, you can run into some real debugging headaches when lazy loading fields via a property.
I had a class with
private Collection<int> moo;
public Collection<int> Moo
{
get
{
if (this.moo == null) this.moo = new Collection<int>();
return this.moo;
}
}
Then somewhere else in the class there was a public method that referenced
this.moo.Add(baa);
without checking it was instantiated.
It threw a null reference exception, as expected. But the exception was on a UI thread so not immediately obvious where it was coming from. I started tracing through, and everytime I traced through, the error dissapeared.
For a while I have to admit I thought I was going crazy. Debugger - no error. Runtime, error. Much scratching of head later I spotted the error, and realised that the Visual Studio debugger was instantiating the Collection as it displayed the public properties of the class.
It's maybe at most acceptable for structs. For reference types, I would only create a new object in a getter when it's only done once using some lazy-load pattern.
It depends on the use of the getter. It's a great place to include this kind of code for lazy loading.
It is a bad practice. In your example, you should be able to expect the same Object every time you access the object property.
As you have it it is bad but not dis similar to an acceptable practice called lazy loading which can be read about here.
http://www.aspcode.net/Lazy-loading-of-structures-in-C-howto-part-8.aspx
It is a bad practice. But if you are thinking of objects as a bunch of getters & setters you should check the classical discussions about the topic.
As some folks mentioned, lazy loading could be a reason to do so. Depends on the actual business logic you are modeling here. You should create a separate method if it is better for legibility purposes, but if the code to create the object is simple you could avoid the indirection.

Holding onto object references

Given the below setup and code snippets, what reasons can you come up with for using one over the other? I have a couple arguments using either of them, but I am curious about what others think.
Setup
public class Foo
{
public void Bar()
{
}
}
Snippet One
var foo = new Foo();
foo.Bar();
Snippet Two
new Foo().Bar();
The first version means you can examine foo in a debugger more easily before calling Bar().
The first version also means you associate a name with the object (it's actually the name of variable of course, but there's clearly a mental association) which can be useful at times:
var customersWithBadDebt = (some big long expression)
customersWithBadDebt.HitWithBaseballBat();
is clearer than
(some big long expression).HitWithBaseballBat();
If neither of those reasons apply, feel free to go for the single line version of course. It's personal taste which is then applied to each specific context.
If you are not going to use the Foo instance for more than simply doing that call I would probably go for the second one, but I can't see that there should be any practical difference, unless the call happens extremely often (but that does not strike me as very likely with that construct).
I would favour the second as it makes it pretty clear that I don't need the object reference at all.
I'd question your motives for having this as a instance method.
If I'd only new up the object to call one method on it and discard it afterwards, a static method with the classes dependencies as parameters should be the way to go (although I despise statics).
What leads to the more interesting problem: If you don't rely on internal state at all, then you are just a function and therefore I'd say you belong to some entity, not in your own class.
I'd go for snippet number one.
If you're going to have class Foo do Bar() then Bar2() it will be easier.
Plus its easier to debug because you can watch foo, something that you can't with snippet 2

Categories