Holding onto object references - C#

Given the setup and code snippets below, what reasons can you come up with for using one over the other? I have a couple of arguments for using each of them, but I am curious what others think.
Setup
public class Foo
{
public void Bar()
{
}
}
Snippet One
var foo = new Foo();
foo.Bar();
Snippet Two
new Foo().Bar();

The first version means you can examine foo in a debugger more easily before calling Bar().
The first version also means you associate a name with the object (it's actually the name of variable of course, but there's clearly a mental association) which can be useful at times:
var customersWithBadDebt = (some big long expression)
customersWithBadDebt.HitWithBaseballBat();
is clearer than
(some big long expression).HitWithBaseballBat();
If neither of those reasons apply, feel free to go for the single line version of course. It's personal taste which is then applied to each specific context.

If you are not going to use the Foo instance for more than that single call, I would probably go for the second one. I can't see any practical difference between them unless the call happens extremely often (which does not strike me as very likely with that construct).

I would favour the second as it makes it pretty clear that I don't need the object reference at all.

I'd question your motives for having this as an instance method.
If I only new up the object to call one method on it and discard it afterwards, a static method with the class's dependencies as parameters should be the way to go (although I despise statics).
Which leads to the more interesting problem: if the method doesn't rely on internal state at all, then it is really just a function, and I'd say it belongs on some other entity rather than in its own class.
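A minimal sketch of the static-method suggestion above, assuming Bar's only dependency is some hypothetical message it would otherwise have read from instance state (FooOperations and the message parameter are made up for illustration):
public static class FooOperations
{
    // Hypothetical: "message" stands in for whatever Bar would otherwise
    // have read from the Foo instance.
    public static void Bar(string message)
    {
        System.Console.WriteLine(message);
    }
}
// Call site: no throwaway instance is created.
FooOperations.Bar("hello");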

I'd go for snippet number one.
If you're later going to have class Foo do Bar() and then Bar2(), it will be easier.
Plus, it's easier to debug, because you can watch foo, something you can't do with snippet two.

Related

Using inline object method call vs. declaring a new variable

I have been working with Java and C# for a while now and I've been asking myself this many times but haven't ever found the answer I was looking for.
When I have to call an instance method (that is, one that's not static), I have to call it through an instance of the class, for example:
MyClass myInstance = new MyClass();
myInstance.nonStaticMethod();
I see this kind of code everywhere, so I was wondering whether the one-line call (example below) behaves differently performance-wise, or whether it's just a matter of convention.
This is what I meant with one-line call:
new MyClass().nonStaticMethod();
The performance would probably be the same.
However, having calls such as new MyClass().nonStaticMethod(); usually reeks of a code smell - what state was encapsulated by the object that you only needed to invoke a method on it? (i.e. Why was that not a static method?)
EDIT: I do not intend to say it is always bad - in some cases, such idioms are encouraged (such as in the case of fluent builder objects) - but you will notice that in these cases, the resulting object is still significant in some way.
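For example, StringBuilder is a builder where chaining straight off new is idiomatic, because the end of the chain produces a value that is actually kept (name is assumed to be a string already in scope):
// The chain ends in a result we keep, so starting from "new" is fine.
string greeting = new System.Text.StringBuilder()
    .Append("Hello, ")
    .Append(name)
    .Append("!")
    .ToString();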
I would go with the first way i.e.
MyClass myInstance = new MyClass();
myInstance.nonStaticMethod();
The problem with the second one, i.e. new MyClass().nonStaticMethod();,
is that if you want to call another method on the same object you don't have any choice: the only thing you can do is new MyClass().nonStaticMethod1();, which actually creates a new object every time. IMHO neither of them would perform better than the other. In the absence of a performance gain I would definitely choose the one which is clearer and more understandable, hence my choice.
If you look at the bytecode generated, you can see there is absolutely no difference in performance or anything else, except that you are using a local variable which is then discarded.
Unless you have measured that you have a performance problem, you should assume you don't; guessing where a performance problem might be is just that, guessing, no matter how much experience you have optimizing Java applications.
When faced with a question like this, you should first consider what is simplest and clearest, because the optimizer looks for standard patterns; if you write confusing code, you will not only confuse yourself and others but the optimizer as well, and the code is more likely to be slower as a result.
Assuming you never need to access the object's instance again, there is no difference. Do whichever you prefer.
Of course if you want to do anything else with that object later on you'll need it in a variable.

Best practice for method chaining ("return this")

I have an abstract class called DatabaseRow that, after being derived and constructed, is mainly loaded from a Load(object id) method.
I have lots of code that creates a new instance of the class, loads it from an ID, then returns the class. I'd like to simplify this code into one line of code (just to be neat, there are plenty of classes that will just have lists of properties that return these Loaded instances).
There are two ways I can think of doing it, but neither seem 'right' to me.
1. I could return this; at the end of my Load method and use return new Derived().Load(id);
2. I could create a generic method to return a loaded instance:
public static T LoadRow<T>(object id) where T : DatabaseRow, new()
{
T row = new T();
row.Load(id);
return row;
}
I've seen some other code that uses the same approach as number 1, but I've never seen any experienced developer recommend it, nor have I come across any methods in the .NET framework that do the same thing, so maybe this isn't best practice?
Does anybody know of any other solutions that could be better than both of these?
Solution:
After reading SirViver's answer and comment, I realised that all the properties being returned need to be cached anyway. The solution was slightly different from, but similar to, option 2 (I wouldn't expect anyone to have come up with this answer, as I didn't explain this part of the design).
All these instances will be loaded from a value retrieved from another column in the database (database relationships if you like). Instead of trying to load the new instance from this value, I made a method to load the instance from the column name and cache the loaded value in a Dictionary. This works well as this is one of the primary functions of the DatabaseRow class.
private Dictionary<string, DatabaseRow> linkedRows = new Dictionary<string, DatabaseRow>();
protected T GetLinkedRow<T>(string key) where T : DatabaseRow, new()
{
if (linkedRows.ContainsKey(key)) return (T)linkedRows[key];
else
{
T row = new T();
row.Load(this[key]);
linkedRows.Add(key, row);
return row;
}
}
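A hypothetical caller of that helper, where Customer is a made-up DatabaseRow subclass and "CustomerId" a made-up column name, might look like this:
// Expose a related row as a lazily loaded, cached property.
public Customer Customer
{
    get { return GetLinkedRow<Customer>("CustomerId"); }
}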
Personally, I think it's bad practice to chain method calls that have actual side effects on the object instance. To be honest, I think both examples are quite ugly "hacks" whose only purpose is saving two lines of code. I don't think the result is actually more readable either.
If you want a record to be loaded immediately, I'd probably rather supply a constructor variant that takes the ID to load from and have the object populate itself on construction. Though when I think about it, I wouldn't bother at all, to be honest: cramming more information into a single line does not make for more readable or maintainable code.
Number 1 is gaining in popularity in some circles; it's often called fluent programming when there's an interface that defines great numbers of these methods and long chains can be assembled. The key to doing this successfully is never to define a method that sometimes returns this, but other times returns null. If this is always returned (unless there's an exception) it's perfectly good style.
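A sketch of that "always return this" style applied to the question's Load, assuming the derived class redeclares Load so the chain keeps its concrete type (the new/base combination here is illustrative only):
public class Derived : DatabaseRow
{
    // Populate the row, then hand the same instance back so the call
    // can be chained directly off the constructor.
    public new Derived Load(object id)
    {
        base.Load(id);   // reuse the existing loading logic
        return this;
    }
}
// Call site: return new Derived().Load(id);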
Personally I'm not as fond of solutions like number 2, as it could be said to violate the single responsibility principle.
Option 1 is only acceptable if it is valid to have a row that has not been loaded. This is because the pattern allows you to do this:
return new Derived();
Personally I would prefer the static method. But I suspect it is simply a matter of personal preference.
As ssg says, the other option (let's call it option 3) is to overload the constructor in Derived, which would also work; however, having lots of constructors can often be confusing, as the calling code contains nothing that describes what is going on.
Option 1:
return new Derived().Load(10);
Option 2:
return Derived.Load(10);
Option 3:
return new Derived(10);
Option 1 looks like you are creating a superfluous object. Option 2 is good because it does what it looks like it is doing. Option 3 leaves it unclear what it is doing.
Although I personally like methods returning this as it allows them to be chained, I think in the .NET framework (up until Linq), this was frowned upon. The reason that pops to mind is this:
Methods either return a result or change the state of the object. Returning this is a bit of both: Changing the state of the object and then returning a "result" - except that result is the modified original object. This doesn't fit the expectations of the user.
What about:
public class Derived : DatabaseRow
{
public Derived(object id)
{
Load(id);
}
}
And use it like this:
return new Derived(id);

How to enforce the use of a method's return value in C#?

I have a piece of software written with fluent syntax. The method chain has a definitive "ending", before which nothing useful is actually done in the code (think NBuilder, or Linq-to-SQL's query generation not actually hitting the database until we iterate over our objects with, say, ToList()).
The problem I am having is there is confusion among other developers about proper usage of the code. They are neglecting to call the "ending" method (thus never actually "doing anything")!
I am interested in enforcing the usage of the return value of some of my methods so that we can never "end the chain" without calling that "Finalize()" or "Save()" method that actually does the work.
Consider the following code:
//The "factory" class the user will be dealing with
public class FluentClass
{
//The entry point for this software
public IntermediateClass<T> Init<T>()
{
return new IntermediateClass<T>();
}
}
//The class that actually does the work
public class IntermediateClass<T>
{
private List<T> _values;
//The user cannot call this constructor
internal IntermediateClass()
{
_values = new List<T>();
}
//Once generated, they can call "setup" methods such as this
public IntermediateClass<T> With(T value)
{
var instance = new IntermediateClass<T>() { _values = _values };
instance._values.Add(value);
return instance;
}
//Picture "lazy loading" - you have to call this method to
//actually do anything worthwhile
public void Save()
{
var itemCount = _values.Count();
. . . //save to database, write a log, do some real work
}
}
As you can see, proper usage of this code would be something like:
new FluentClass().Init<int>().With(-1).With(300).With(42).Save();
The problem is that people are using it this way (thinking it achieves the same as the above):
new FluentClass().Init<int>().With(-1).With(300).With(42);
So pervasive is this problem that, with entirely good intentions, another developer once actually changed the name of the "Init" method to indicate that THAT method was doing the "real work" of the software.
Logic errors like these are very difficult to spot, and, of course, it compiles, because it is perfectly acceptable to call a method with a return value and just "pretend" it returns void. Visual Studio doesn't care if you do this; your software will still compile and run (although in some cases I believe it throws a warning). This is a great feature to have, of course. Imagine a simple "InsertToDatabase" method that returns the ID of the new row as an integer - it is easy to see that there are some cases where we need that ID, and some cases where we could do without it.
In the case of this piece of software, there is definitively never any reason to eschew that "Save" function at the end of the method chain. It is a very specialized utility, and the only gain comes from the final step.
I want somebody's software to fail at the compiler level if they call "With()" and not "Save()".
It seems like an impossible task by traditional means - but that's why I come to you guys. Is there an Attribute I can use to prevent a method from being "cast to void" or some such?
Note: The alternate way of achieving this goal that has already been suggested to me is writing a suite of unit tests to enforce this rule, and using something like http://www.testdriven.net to bind them to the compiler. This is an acceptable solution, but I am hoping for something more elegant.
I don't know of a way to enforce this at a compiler level. It's often requested for objects which implement IDisposable as well, but isn't really enforceable.
One potential option which can help, however, is to set up your class, in DEBUG only, to have a finalizer that logs/throws/etc. if Save() was never called. This can help you discover these runtime problems while debugging instead of relying on searching the code, etc.
However, make sure that, in release mode, this is not used, as it will incur a performance overhead since the addition of an unnecessary finalizer is very bad on GC performance.
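A rough sketch of that approach; the _saved flag and the Debug.Fail call are illustrative additions, not part of the original class:
public class IntermediateClass<T>
{
    private bool _saved;

    public void Save()
    {
        _saved = true;
        GC.SuppressFinalize(this);   // once saved, the finalizer has nothing to do
        // ... do the real work ...
    }

#if DEBUG
    // The finalizer exists only in DEBUG builds, so release builds pay no GC cost.
    ~IntermediateClass()
    {
        if (!_saved)
            System.Diagnostics.Debug.Fail("IntermediateClass was dropped without calling Save().");
    }
#endif
}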
You could require specific methods to use a callback like so:
new FluentClass().Init<int>(x =>
{
x.Save(y =>
{
y.With(-1);
y.With(300);
});
});
The With method returns some specific object, and the only way to get that object is by calling x.Save(), which itself takes a callback that lets you set up your indeterminate number of With calls. So Init takes something like this:
public T Init<T>(Func<MyInitInputType, MySaveResultType> initSetup)
I can think of a few solutions, none of them ideal.
AIUI what you want is a function which is called when the temporary object goes out of scope (that is, when it becomes available for garbage collection, though it will probably not actually be collected for some time). (See: The difference between a destructor and a finalizer?) This hypothetical function would say "if you've constructed a query in this object but not called Save, produce an error". C++/CLI supports this idiom (RAII): it has a "destructor", run when the object is no longer used, and a separate "finalizer", run when it is finally garbage collected. Very confusingly, C# has only a so-called destructor, and it is only called by the garbage collector (it would be valid for the framework to call it earlier, as if it were partially cleaning up the object immediately, but AFAIK it doesn't do anything like that). So what you would like is the C++/CLI-style destructor. Unfortunately, AIUI this maps onto IDisposable, which exposes a Dispose() method that can be called where a C++/CLI destructor would run, or when the C# destructor runs - but you still have to call Dispose manually, which defeats the point?
Refactor the interface slightly to convey the concept more accurately. Call the init function something like "prepareQuery" or "AAA" or "initRememberToCallSaveOrThisWontDoAnything". (The last is an exaggeration, but it might be necessary to make the point).
This is more of a social problem than a technical problem. The interface should make it easy to do the right thing, but programmers do have to know how to use the code! Get all the programmers together and explain this simple fact once and for all. If necessary, have them all sign a piece of paper saying they understand, and if they wilfully continue to write code which doesn't do anything, they're worse than useless to the company and will be fired.
Fiddle with the way the operators are chained. For example, have each of the IntermediateClass functions assemble an aggregate IntermediateClass object containing all of the parameters (you mostly do this already?), but require an Init-like function on the original class to take that object as an argument rather than having the calls chained after it. Then Save and the other functions can return two different class types (with essentially the same contents), and Init can accept only the class of the correct type.
The fact that it's still a problem suggests that either your coworkers need a helpful reminder, or they're rather sub-par, or the interface wasn't very clear (perhaps it's perfectly good, but the author didn't realise it wouldn't be clear to someone using it only in passing rather than getting to know it), or you yourself have misunderstood the situation. A technical solution would be good, but you should probably think about why the problem occurred and how to communicate more clearly, probably asking someone senior for input.
After great deliberation and trial and error, it turns out that throwing an exception from the Finalize() method was not going to work for me. Apparently, you simply can't do that; the exception gets eaten up, because garbage collection operates non-deterministically. I was unable to get the software to call Dispose() automatically from the destructor either. Jack V.'s comment explains this well; here was the link he posted, for redundancy/emphasis:
The difference between a destructor and a finalizer?
Changing the syntax to use a callback was a clever way to make the behavior foolproof, but the agreed-upon syntax was fixed, and I had to work with it. Our company is all about fluent method chains. I was also a fan of the "out parameter" solution to be honest, but again, the bottom line is the method signatures simply could not change.
Helpful information about my particular problem includes the fact that my software is only ever to be run as part of a suite of unit tests - so efficiency is not a problem.
What I ended up doing was using Mono.Cecil to reflect on the calling assembly (the code calling into my software). Note that System.Reflection was insufficient for my purposes, because it cannot pinpoint method references, but I still needed(?) to use it to get the calling assembly itself (Mono.Cecil remains under-documented, so it's possible I just need to get more familiar with it in order to do away with System.Reflection altogether; that remains to be seen...).
I placed the Mono.Cecil code in the Init() method, and the structure now looks something like:
public IntermediateClass<T> Init<T>()
{
ValidateUsage(Assembly.GetCallingAssembly());
return new IntermediateClass<T>();
}
void ValidateUsage(Assembly assembly)
{
// 1) Use Mono.Cecil to inspect the codebase inside the assembly
var assemblyLocation = assembly.CodeBase.Replace("file:///", "");
var monoCecilAssembly = AssemblyFactory.GetAssembly(assemblyLocation);
// 2) Retrieve the list of Instructions in the calling method
var methods = monoCecilAssembly.Modules...Types...Methods...Instructions
// (It's a little more complicated than that...
// if anybody would like more specific information on how I got this,
// let me know... I just didn't want to clutter up this post)
// 3) Those instructions refer to OpCodes and Operands....
// Defining "invalid method" as a method that calls "Init" but not "Save"
var methodCallingInit = method.Body.Instructions.Any
(instruction => instruction.OpCode.Name.Equals("callvirt")
&& instruction.Operand is IMethodReference
&& instruction.Operand.ToString().Equals(INITMETHODSIGNATURE));
var methodNotCallingSave = !method.Body.Instructions.Any
(instruction => instruction.OpCode.Name.Equals("callvirt")
&& instruction.Operand is IMethodReference
&& instruction.Operand.ToString().Equals(SAVEMETHODSIGNATURE));
var methodInvalid = methodCallingInit && methodNotCallingSave;
// Note: this is partially pseudocode;
// It doesn't 100% faithfully represent either Mono.Cecil's syntax or my own
// There are actually a lot of annoying casts involved, omitted for sanity
// 4) Obviously, if the method is invalid, throw
if (methodInvalid)
{
throw new Exception(String.Format("Bad developer! BAD! {0}", method.Name));
}
}
Trust me, the actual code is even uglier looking than my pseudocode.... :-)
But Mono.Cecil just might be my new favorite toy.
I now have a method that refuses to run its main body unless the calling code "promises" to also call a second method afterwards. It's like a strange kind of code contract. I'm actually thinking about making this generic and reusable. Would any of you have a use for such a thing? Say, if it were an attribute?
What if you made it so Init and With don't return objects of type FluentClass? Have them return, e.g., an UninitializedFluentClass which wraps a FluentClass object. Then calling .Save() on the UninitializedFluentClass object calls it on the wrapped FluentClass object and returns it. If they don't call Save, they don't get a FluentClass object.
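A rough sketch of that wrapper idea (the type and member names are illustrative):
// Only Save() hands back the useful object; dropping the wrapper gets you nothing.
public class UninitializedFluentClass<T>
{
    private readonly IntermediateClass<T> _inner;

    internal UninitializedFluentClass(IntermediateClass<T> inner) { _inner = inner; }

    public UninitializedFluentClass<T> With(T value)
    {
        return new UninitializedFluentClass<T>(_inner.With(value));
    }

    public IntermediateClass<T> Save()
    {
        _inner.Save();
        return _inner;
    }
}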
In Debug mode, besides implementing IDisposable, you can set up a timer that will throw an exception after one second if the result method has not been called.
Use an out parameter! All the outs must be used.
Edit: I am not sure it will help, though...
It would break the fluent syntax.

Can I detect whether I've been given a new object as a parameter?

Short Version
For those who don't have the time to read my reasoning for this question below:
Is there any way to enforce a policy of "new objects only" or "existing objects only" for a method's parameters?
Long Version
There are plenty of methods which take objects as parameters, and it doesn't matter whether the method has the object "all to itself" or not. For instance:
var people = new List<Person>();
Person bob = new Person("Bob");
people.Add(bob);
people.Add(new Person("Larry"));
Here the List<Person>.Add method has taken an "existing" Person (Bob) as well as a "new" Person (Larry), and the list contains both items. Bob can be accessed as either bob or people[0]. Larry can be accessed as people[1] and, if desired, cached and accessed as larry (or whatever) thereafter.
OK, fine. But sometimes a method really shouldn't be passed a new object. Take, for example, Array.Sort<T>. The following doesn't make a whole lot of sense:
Array.Sort<int>(new int[] {5, 6, 3, 7, 2, 1});
All the above code does is take a new array, sort it, and then forget it (as its reference count reaches zero after Array.Sort<int> exits and the sorted array will therefore be garbage collected, if I'm not mistaken). So Array.Sort<T> expects an "existing" array as its argument.
There are conceivably other methods which may expect "new" objects (though I would generally think that to have such an expectation would be a design mistake). An imperfect example would be this:
DataTable firstTable = myDataSet.Tables["FirstTable"];
DataTable secondTable = myDataSet.Tables["SecondTable"];
firstTable.Rows.Add(secondTable.Rows[0]);
As I said, this isn't a great example, since DataRowCollection.Add doesn't actually expect a new DataRow, exactly; but it does expect a DataRow that doesn't already belong to a DataTable. So the last line in the code above won't work; it needs to be:
firstTable.ImportRow(secondTable.Rows[0]);
Anyway, this is a lot of setup for my question, which is: is there any way to enforce a policy of "new objects only" or "existing objects only" for a method's parameters, either in its definition (perhaps by some custom attributes I'm not aware of) or within the method itself (perhaps by reflection, though I'd probably shy away from this even if it were available)?
If not, any interesting ideas as to how to possibly accomplish this would be more than welcome. For instance I suppose if there were some way to get the GC's reference count for a given object, you could tell right away at the start of a method whether you've received a new object or not (assuming you're dealing with reference types, of course--which is the only scenario to which this question is relevant anyway).
EDIT:
The longer version gets longer.
All right, suppose I have some method that I want to optionally accept a TextWriter to output its progress or what-have-you:
static void TryDoSomething(TextWriter output) {
// do something...
if (output != null)
output.WriteLine("Did something...");
// do something else...
if (output != null)
output.WriteLine("Did something else...");
// etc. etc.
if (output != null)
// do I call output.Close() or not?
}
static void TryDoSomething() {
TryDoSomething(null);
}
Now, let's consider two different ways I could call this method:
string path = GetFilePath();
using (StreamWriter writer = new StreamWriter(path)) {
TryDoSomething(writer);
// do more things with writer
}
OR:
TryDoSomething(new StreamWriter(path));
Hmm... it would seem that this poses a problem, doesn't it? I've constructed a StreamWriter, which implements IDisposable, but TryDoSomething isn't going to presume to know whether it has exclusive access to its output argument or not. So the object either gets disposed prematurely (in the first case), or doesn't get disposed at all (in the second case).
I'm not saying this would be a great design, necessarily. Perhaps Josh Stodola is right and this is just a bad idea from the start. Anyway, I asked the question mainly because I was just curious if such a thing were possible. Looks like the answer is: not really.
No, basically.
There's really no difference between:
var x = new ...;
Foo(x);
and
Foo(new ...);
and indeed sometimes you might convert between the two for debugging purposes.
Note that in the DataRow/DataTable example, there's an alternative approach though - that DataRow can know its parent as part of its state. That's not the same thing as being "new" or not - you could have a "detach" operation for example. Defining conditions in terms of the genuine hard-and-fast state of the object makes a lot more sense than woolly terms such as "new".
Yes, there is a way to do this.
Sort of.
If you make your parameter a ref parameter, you'll have to have an existing variable as your argument. You can't do something like this:
DoSomething(ref new Customer());
If you do, you'll get the error "A ref or out argument must be an assignable variable."
Of course, using ref has other implications. However, if you're the one writing the method, you don't need to worry about them. As long as you don't reassign the ref parameter inside the method, it won't make any difference whether you use ref or not.
I'm not saying it's good style, necessarily. You shouldn't use ref or out unless you really, really need to and have no other way to do what you're doing. But using ref will make what you want to do work.
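A tiny sketch of the ref trick (SortInPlace is a made-up method name):
static void SortInPlace(ref int[] values)
{
    Array.Sort(values);
}

int[] data = { 5, 6, 3, 7, 2, 1 };
SortInPlace(ref data);                   // fine: data is an assignable variable
// SortInPlace(ref new int[] { 1, 2 });  // compile error: a ref or out argument must be an assignable variable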
No. And if there is some reason that you need to do this, your code has improper architecture.
Short answer - no there isn't
In the vast majority of cases I usually find that the issues you've listed above don't really matter all that much. When they do, you could overload the method so that it accepts something else as a parameter instead of the object you are worried about sharing.
// For example create a method that allows you to do this:
people.Add("Larry");
// Instead of this:
people.Add(new Person("Larry"));
// The new method might look a little like this:
public void Add(string name)
{
Person person = new Person(name);
this.Add(person); // This method could be private if necessary
}
I can think of a way to do this, but I would definitely not recommend this. Just for argument's sake.
What does it mean for an object to be a "new" object? It means there is only one reference keeping it alive. An "existing" object would have more than one reference to it.
With this in mind, look at the following code:
class Program
{
static void Main(string[] args)
{
object o = new object();
Console.WriteLine(IsExistingObject(o));
Console.WriteLine(IsExistingObject(new object()));
o.ToString(); // Just something to simulate further usage of o. If we didn't do this, in a release build, o would be collected by the GC.Collect call in IsExistingObject. (not in a Debug build)
}
public static bool IsExistingObject(object o)
{
var oRef = new WeakReference(o);
#if DEBUG
o = null; // In Debug, we need to set o to null. This is not necessary in a release build.
#endif
GC.Collect();
GC.WaitForPendingFinalizers();
return oRef.IsAlive;
}
}
This prints True on the first line, False on the second.
But again, please do not use this in your code.
Let me rewrite your question to something shorter.
Is there any way, in my method, which takes an object as an argument, to know if this object will ever be used outside of my method?
And the short answer to that is: No.
Let me venture an opinion at this point: There should not be any such mechanism either.
This would complicate method calls all over the place.
If there were a mechanism by which I could, inside a method, tell whether the object I'm given will really be used afterwards or not, then that's a signal to me, as the developer of that method, to take it into account.
Basically, you'd see this type of code all over the place (hypothetical, since it isn't available/supported:)
if (ReferenceCount(obj) == 1) return; // only reference is the one we have
My opinion is this: If the code that calls your method isn't going to use the object for anything, and there are no side-effects outside of modifying the object, then that code should not exist to begin with.
It's like code that looks like this:
1 + 2;
What does this code do? Well, depending on the C/C++ compiler, it might compile into something that evaluates 1+2. But then what, where is the result stored? Do you use it for anything? No? Then why is that code part of your source code to begin with?
Of course, you could argue that the code is actually a+b;, and the purpose is to ensure that evaluating a+b isn't going to throw an overflow exception, but such a case is so vanishingly rare that special-casing it would just mask real problems, and it would be really simple to fix by just assigning the result to a temporary variable.
In any case, for any feature in any programming language and/or runtime and/or environment, where a feature isn't available, the reasons for why it isn't available are:
It wasn't designed properly
It wasn't specified properly
It wasn't implemented properly
It wasn't tested properly
It wasn't documented properly
It wasn't prioritized above competing features
All of these are required to get a feature to appear in version X of application Y, be it C# 4.0 or MS Works 7.0.
Nope, there's no way of knowing.
All that gets passed in is the object reference. Whether it is 'newed' in-situ, or is sourced from an array, the method in question has no way of knowing how the parameters being passed in have been instantiated and/or where.
One way to know whether an object passed to a function (or method) was created right before the call is to give the object a property that is initialized with a timestamp obtained from a system function; by looking at that property, the method could then decide whether the object is freshly created.
Frankly, I would not use such a method, because:
I don't see any reason why the code should know whether the passed parameter was just created or was created at some earlier moment.
The method I suggest depends on a system function that on some systems may not be present, or may be less reliable.
With modern CPUs, which are far faster than the CPUs of ten years ago, it would be hard to pick the right threshold for deciding whether an object was freshly created or not.
The other solution would be to use an object property that is set to one value by the object's creator, and set to a different value by every method of the object.
In that case the problem would be forgetting to add the code that changes that property in each method.
Once again I would ask myself, "Is there really a need to do this?"
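Purely for illustration, and with all the caveats above, the timestamp idea might look something like this (Widget, CreatedAtUtc and the five-millisecond threshold are all made up):
public class Widget
{
    private readonly DateTime _createdAtUtc = DateTime.UtcNow;  // stamped at construction

    public DateTime CreatedAtUtc
    {
        get { return _createdAtUtc; }
    }
}

static bool LooksFreshlyCreated(Widget w)
{
    // Arbitrary threshold; fragile for exactly the reasons given above.
    return (DateTime.UtcNow - w.CreatedAtUtc).TotalMilliseconds < 5;
}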
As a possible partial solution, if you only wanted a single instance of an object to be consumed by a method, you could look at a Singleton. That way the method in question could not create another instance if one existed already.

Declaring variables - best practices

I found a while ago (and I want to confirm it again) that if you declare a class-level variable, you should not call its constructor until the class constructor or load has been called. The reason was performance - but are there other reasons to do or not do this? Are there exceptions to this rule?
ie: this is what I do based on what I think the best practice is:
public class SomeClass
{
private PersonObject _person;
public SomeClass()
{
_person = new PersonObject("Smitface");
}
}
opposed to:
public class SomeClass
{
private PersonObject _person = new PersonObject("Smitface");
public SomeClass()
{
}
}
If you set your variable outside of the constructor then there is no error handling available. In your example it makes no difference, but there are many cases where you may want some sort of error handling. In those cases, using your first option would be correct.
Nescio talked about the implications this would have for your application if there were constructor failures.
For that reason, I always use option #1.
Honestly, if you look at the IL, all that happens in the second case is the compiler moves the initialization to the constructor for you.
Personally, I like to see all initialization done in the constructor. If I'm working on a throwaway prototype project, I don't mind having the initialization and declaration in the same spot, but for my "I want to keep this" projects I do it all in the constructor.
Actually, in spite of what others have said, it can be important whether your initialization is inside or outside the constructor, as there is different behaviour during object construction if the object is in a hierarchy (i.e. the order in which things get run is different).
See this post and this post from Eric Lippert which explains the semantic difference between the two in more detail.
So the answer is that in the majority of cases it doesn't make any difference, and it certainly doesn't make any difference in terms of performance, but in a minority of cases it could make a difference, and you should know why, and make a decision based on that.
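A small sketch of that ordering difference in a hierarchy (class names are illustrative; the relative order is noted in the comments):
public class BaseClass
{
    public BaseClass()
    {
        Console.WriteLine("BaseClass constructor");   // runs second
    }
}

public class SomeClass : BaseClass
{
    // Field initializers on the derived class run first, before the base constructor...
    private PersonObject _person = new PersonObject("Smitface");

    public SomeClass()
    {
        // ...and the derived constructor body runs last.
        Console.WriteLine("SomeClass constructor");   // runs third
    }
}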
There's a common pattern called Dependency Injection or Inversion of Control (IoC) that offers exactly these two mechanisms for "injecting" a dependent object (like a DAL class) into a class that's further up the dependency chain (further from the database).
In this pattern, using a constructor, you would write:
public class SomeClass
{
private PersonObject per;
public SomeClass(PersonObject person)
{
per = person;
}
}
private PersonObject Joe = new PersonObject("Smitface");
SomeClass MyObj = new SomeClass(Joe);
Now you could, for example, pass in a real DAL class in production code, or a test DAL class in a unit test method...
It depends on the context of how the variable will be used. Naturally, constants and static or readonly fields should be initialized at declaration; otherwise, they typically should be initialized in the constructor. That way you can swap out design patterns for how your objects are instantiated fairly easily, without having to worry about when the variables will be initialized.
You should generally prefer the second variant. It's more robust to changes in your code. Suppose you add a constructor. Now you have to remember to initialize your variables there as well, unless you use the second variant.
Of course, this only counts if there are no compelling reasons to use in-constructor initialization (as mentioned by discorax).
I prefer the latter, but only because I find it neater.
It's personal preference really; they both do the same thing.
I like to initialize in the constructor because that way it all happens in one place, and also because it makes it easier if you decide to create an overloaded constructor later on.
Also, it helps to remind me of things that I'll want to clean up in a destructor.
the latter can take advantage of lazy instantiation, i.e. it won't initialize the variable until it is referenced
I think this type of question is stylistic only, so who cares which way you do it? The language allows both, so other people are going to do both. Don't make bad assumptions.
The first declaration is actually cleaner. The second conceals the fact that the initialization still happens in the constructor (for static fields, in the static constructor). If for any reason that constructor fails, the whole type is unusable for the rest of the application.
I prefer to initialize variables as soon as possible, since it avoids (some) null errors.
Edit: Obviously in this simplified example there is no difference; however, in the general case I believe it is good practice to initialize class variables when they are declared, if possible. This makes it impossible to refer to a variable before it is initialized, which eliminates some errors that would be possible when you initialize fields in the constructor.
As you get more class variables and the initialization sequence in the constructor gets more complex, it gets easier to introduce bugs where one initialization depends on another initialization which hasn't happened yet.
