I've been programming in C# and Java recently and I am curious where the best place is to initialize my class fields.
Should I do it at declaration?:
public class Dice
{
private int topFace = 1;
private Random myRand = new Random();
public void Roll()
{
// ......
}
}
or in a constructor?:
public class Dice
{
private int topFace;
private Random myRand;
public Dice()
{
topFace = 1;
myRand = new Random();
}
public void Roll()
{
// .....
}
}
I'm really curious what some of you veterans think is the best practice. I want to be consistent and stick to one approach.
My rules:
Don't initialize with the default values in declaration (null, false, 0, 0.0…).
Prefer initialization in declaration if you don't have a constructor parameter that changes the value of the field.
If the value of the field changes because of a constructor parameter put the initialization in the constructors.
Be consistent in your practice (the most important rule).
In C# it doesn't matter. The two code samples you give are utterly equivalent. In the first example the C# compiler (or is it the CLR?) will construct an empty constructor and initialise the variables as if they were in the constructor (there's a slight nuance to this that Jon Skeet explains in the comments below).
If there is already a constructor then any initialisation "above" will be moved into the top of it.
In terms of best practice the former is less error prone than the latter as someone could easily add another constructor and forget to chain it.
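The same hazard exists in Java. A minimal sketch of it follows, written in Java rather than C#; the class names `DiceInCtor` and `DiceAtDecl` are hypothetical and only loosely follow the question's `Dice` example.

```java
import java.util.Random;

// Hypothetical demo; run with: java ForgottenInitDemo.java
class ForgottenInitDemo {
    public static void main(String[] args) {
        System.out.println(new DiceInCtor(6).hasRandom()); // false
        System.out.println(new DiceAtDecl(6).hasRandom()); // true
    }
}

// Variant 1: fields are assigned inside the constructor.
class DiceInCtor {
    private int topFace;
    private Random myRand;

    DiceInCtor() {
        topFace = 1;
        myRand = new Random();
    }

    // Added later: the author forgot to chain with this() or to set myRand.
    DiceInCtor(int startFace) {
        topFace = startFace;
    }

    int topFace() { return topFace; }
    boolean hasRandom() { return myRand != null; }
}

// Variant 2: fields are initialized at declaration,
// so every constructor gets them without any chaining.
class DiceAtDecl {
    private int topFace = 1;
    private final Random myRand = new Random();

    DiceAtDecl() { }

    DiceAtDecl(int startFace) {
        topFace = startFace;
    }

    int topFace() { return topFace; }
    boolean hasRandom() { return myRand != null; }
}
```

In the first variant the new constructor silently leaves `myRand` as null; in the second, the declaration initializer covers it.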
I think there is one caveat. I once made this mistake: inside a derived class, I tried to "initialize at declaration" fields inherited from an abstract base class. The result was two sets of fields, the inherited "base" ones and the newly declared ones that hid them, and it cost me quite some time to debug.
The lesson: to initialize inherited fields, you'd do it inside of the constructor.
The semantics of C# differ slightly from Java here. In C#, assignment in the declaration is performed before calling the superclass constructor. In Java it is done immediately after the superclass constructor returns, which allows 'this' to be used (particularly useful for anonymous inner classes), and means that the semantics of the two forms really do match.
If you can, make the fields final.
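The Java ordering described above can be observed with a small sketch. The classes `Base` and `Child` are hypothetical; the base constructor calls an overridable method only to make the ordering visible, which is not something to do in real code.

```java
// Hypothetical demo; run with: java InitOrderDemo.java
class InitOrderDemo {
    public static void main(String[] args) {
        Child c = new Child();
        System.out.println(c.seenByBaseConstructor); // null
        System.out.println(c.describe());            // child
    }
}

class Base {
    String seenByBaseConstructor;

    Base() {
        // Dispatches to Child.describe() before Child's
        // field initializers have run.
        seenByBaseConstructor = describe();
    }

    String describe() {
        return "base";
    }
}

class Child extends Base {
    // Deliberately not final: a final field with a constant initializer
    // would be inlined as a compile-time constant and hide the effect.
    private String label = "child";

    @Override
    String describe() {
        return label;
    }
}
```

Because the initializer for `label` runs only after the `Base` constructor returns, the base constructor sees the default value `null`.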
Assuming the type in your example, definitely prefer to initialize fields in the constructor. The exceptional cases are:
Fields in static classes/methods
Fields typed as static/final/et al
I always think of the field listing at the top of a class as the table of contents (what is contained herein, not how it is used), and the constructor as the introduction. Methods of course are chapters.
In Java, an initializer with the declaration means the field is always initialized the same way, regardless of which constructor is used (if you have more than one) or the parameters of your constructors (if they have arguments), although a constructor might subsequently change the value (if it is not final). So using an initializer with a declaration suggests to a reader that the initialized value is the value that the field has in all cases, regardless of which constructor is used and regardless of the parameters passed to any constructor. Therefore use an initializer with the declaration only if, and always if, the value for all constructed objects is the same.
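As an illustration of that rule, here is a minimal Java sketch with a hypothetical `Team` class: the list is the same for every constructed object, so it gets a declaration initializer, while the name depends on the constructor used, so it is assigned in the constructors.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo; run with: java InitializerRuleDemo.java
class InitializerRuleDemo {
    public static void main(String[] args) {
        Team anonymous = new Team();
        Team ops = new Team("ops");
        System.out.println(anonymous.name() + " " + anonymous.members()); // unnamed []
        System.out.println(ops.name() + " " + ops.members());             // ops []
    }
}

class Team {
    // Same starting value whichever constructor runs: initialize at declaration.
    private final List<String> members = new ArrayList<>();

    // Value depends on the constructor: assign it there.
    private final String name;

    Team() {
        this("unnamed");
    }

    Team(String name) {
        this.name = name;
    }

    String name() { return name; }
    List<String> members() { return members; }
}
```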
There are many and various situations.
I just need an empty list
The situation is clear. I just need to prepare my list and prevent an exception from being thrown when someone adds an item to the list.
public class CsvFile
{
private List<CsvRow> lines = new List<CsvRow>();
public CsvFile()
{
}
}
I know the values
I exactly know what values I want to have by default or I need to use some other logic.
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = new List<string>() {"usernameA", "usernameB"};
}
}
or
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = GetDefaultUsers(2);
}
}
Empty list with possible values
Sometimes I expect an empty list by default with a possibility of adding values through another constructor.
public class AdminTeam
{
private List<string> usernames = new List<string>();
public AdminTeam()
{
}
public AdminTeam(List<string> admins)
{
admins.ForEach(x => usernames.Add(x));
}
}
What if I told you, it depends?
In general I initialize everything, and I do it in a consistent way. Yes, it's overly explicit, but it's also a little easier to maintain.
If we are worried about performance, well then I initialize only what has to be done and place it in the areas it gives the most bang for the buck.
In a real time system, I question if I even need the variable or constant at all.
And in C++ I often do next to no initialization in either place and move it into an Init() function. Why? Well, in C++ if you're initializing something that can throw an exception during object construction you open yourself to memory leaks.
The design of C# suggests that inline initialization is preferred, or it wouldn't be in the language. Any time you can avoid a cross-reference between different places in the code, you're generally better off.
There is also the matter of consistency with static field initialization, which needs to be inline for best performance. The Framework Design Guidelines for Constructor Design say this:
✓ CONSIDER initializing static fields inline rather than explicitly using static constructors, because the runtime is able to optimize the performance of types that don’t have an explicitly defined static constructor.
"Consider" in this context means to do so unless there's a good reason not to. In the case of static initializer fields, a good reason would be if initialization is too complex to be coded inline.
Being consistent is important, but this is the question to ask yourself:
"Do I have a constructor for anything else?"
Typically, I am creating models for data transfer, where the class itself does nothing except act as housing for variables.
In these scenarios, I usually don't have any methods or constructors. It would feel silly to me to create a constructor for the exclusive purpose of initializing my lists, especially since I can initialize them in-line with the declaration.
So as many others have said, it depends on your usage. Keep it simple, and don't make anything extra that you don't have to.
Consider the situation where you have more than one constructor. Will the initialization be different for the different constructors? If it will be the same, then why repeat it for each constructor? This is in line with kokos's statement, but may not be related to parameters. Let's say, for example, you want to keep a flag which shows how the object was created. Then that flag would be initialized differently for different constructors regardless of the constructor parameters. On the other hand, if you repeat the same initialization for each constructor, you leave open the possibility that you (unintentionally) change the initialization value in some of the constructors but not in others. So, the basic concept here is that common code should have a common location and not be potentially repeated in different locations. So I would say always put it in the declaration until you have a specific situation where that no longer works for you.
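Any performance difference between the two forms is negligible. In both cases the runtime first zeroes the field and the assignment then runs as part of construction (in C#, a field initializer is compiled into the constructor), so setting the value in the constructor does not cost an extra write compared with setting it in the declaration.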
When you don't need some logic or error handling:
Initialize class fields at declaration
When you need some logic or error handling:
Initialize class fields in constructor
This works well when the initialization value is available and the
initialization can be put on one line. However, this form of
initialization has limitations because of its simplicity. If
initialization requires some logic (for example, error handling or a
for loop to fill a complex array), simple assignment is inadequate.
Instance variables can be initialized in constructors, where error
handling or other logic can be used.
From https://docs.oracle.com/javase/tutorial/java/javaOO/initial.html .
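A minimal Java sketch of the case the tutorial describes, where a loop and an argument check make a one-line initializer inadequate. The class `SquareTable` is hypothetical.

```java
// Hypothetical demo; run with: java ConstructorLogicDemo.java
class ConstructorLogicDemo {
    public static void main(String[] args) {
        SquareTable table = new SquareTable(5);
        System.out.println(table.get(4)); // 16

        try {
            new SquareTable(-1);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}

class SquareTable {
    private final int[] squares;

    SquareTable(int size) {
        // Error handling that a declaration initializer cannot express.
        if (size < 0) {
            throw new IllegalArgumentException("size must be non-negative: " + size);
        }
        squares = new int[size];
        // Loop that fills the array.
        for (int i = 0; i < size; i++) {
            squares[i] = i * i;
        }
    }

    int get(int index) {
        return squares[index];
    }
}
```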
I normally try to have the constructor do nothing but receive the dependencies and initialize the related instance members with them. This will make your life easier if you want to unit test your classes.
If the value you are going to assign to an instance variable is not influenced by any of the parameters you are going to pass to your constructor, then assign it at declaration time.
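One way to read that advice, sketched in Java with a hypothetical `InjectedDice` class: the `Random` dependency comes in through the constructor, while `topFace` does not depend on any parameter and is assigned at declaration. A test can then pass a seeded `Random` to get repeatable rolls.

```java
import java.util.Random;

// Hypothetical demo; run with: java InjectedDiceDemo.java
class InjectedDiceDemo {
    public static void main(String[] args) {
        // Two dice fed identically seeded generators roll the same sequence.
        InjectedDice a = new InjectedDice(new Random(42));
        InjectedDice b = new InjectedDice(new Random(42));
        a.roll();
        b.roll();
        System.out.println(a.topFace() == b.topFace()); // true
    }
}

class InjectedDice {
    // Dependency: supplied by the caller, assigned in the constructor.
    private final Random random;

    // Not influenced by any constructor parameter: assigned at declaration.
    private int topFace = 1;

    InjectedDice(Random random) {
        this.random = random;
    }

    void roll() {
        topFace = random.nextInt(6) + 1; // value in 1..6
    }

    int topFace() {
        return topFace;
    }
}
```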
This is not a direct answer to your question about best practice, but an important and related point: in a generic class definition, either leave it to the compiler to initialize fields with their default values, or use a special expression to initialize them to their default values (if that is absolutely necessary for code readability).
class MyGeneric<T>
{
T data;
//T data = ""; // <-- ERROR
//T data = 0; // <-- ERROR
//T data = null; // <-- ERROR
public MyGeneric()
{
// All of the above errors would be errors here in constructor as well
}
}
And the special method to initialize a generic field to its default value is the following:
class MyGeneric<T>
{
T data = default(T);
public MyGeneric()
{
// The same method can be used here in constructor
}
}
"Prefer initialization in declaration", seems like a good general practice.
Here is an example which cannot be initialized in the declaration so it has to be done in the constructor.
"Error CS0236 A field initializer cannot reference the non-static field, method, or property"
class UserViewModel
{
// Cannot be set here
public ICommand UpdateCommand { get; private set; }
public UserViewModel()
{
UpdateCommand = new GenericCommand(Update_Method); // <== THIS WORKS
}
void Update_Method(object? parameter)
{
}
}
While refactoring some code, I came across this strange compile error:
The constructor call needs to be dynamically dispatched, but cannot be because it is part of a constructor initializer. Consider casting the dynamic arguments.
It seems to occur when trying to call base methods/constructors that take dynamic arguments. For example:
class ClassA
{
public ClassA(dynamic test)
{
Console.WriteLine("ClassA");
}
}
class ClassB : ClassA
{
public ClassB(dynamic test)
: base(test)
{
Console.WriteLine("ClassB");
}
}
It works if I cast the argument to object, like this:
public ClassB(dynamic test)
: base((object)test)
So, I'm a little confused. Why do I have to put this nasty cast in - why can't the compiler figure out what I mean?
The constructor chain has to be determined for certain at compile-time - the compiler has to pick an overload so that it can create valid IL. Whereas normally overload resolution (e.g. for method calls) can be deferred until execution time, that doesn't work for chained constructor calls.
EDIT: In "normal" C# code (before C# 4, basically), all overload resolution is performed at compile-time. However, when a member invocation involves a dynamic value, that is resolved at execution time. For example consider this:
using System;
class Program
{
static void Foo(int x)
{
Console.WriteLine("int!");
}
static void Foo(string x)
{
Console.WriteLine("string!");
}
static void Main(string[] args)
{
dynamic d = 10;
Foo(d);
}
}
The compiler doesn't emit a direct call to Foo here - it can't, because in the call Foo(d) it doesn't know which overload it would resolve to. Instead it emits code which does a sort of "just in time" mini-compilation to resolve the overload with the actual type of the value of d at execution time.
Now that doesn't work for constructor chaining, as valid IL has to contain a call to a specific base class constructor. (I don't know whether the dynamic version can't even be expressed in IL, or whether it can, but the result would be unverifiable.)
You could argue that the C# compiler should be able to tell that there's only actually one visible constructor which can be called, and that constructor will always be available... but once you start down that road, you end up with a language which is very complicated to specify. The C# designers usually take the position of having simpler rules which occasionally aren't as powerful as you'd like them to be.
Consider this code block:
struct Animal
{
public string name = ""; // Error
public static int weight = 20; // OK
// initialize the non-static field here
public void FuncToInitializeName()
{
name = ""; // Now correct
}
}
Why can we initialize a static field inside a struct but not a non-static field?
Why do we have to initialize non-static in methods bodies?
Have a look at Why Can't Value Types have Default Constructors?
The CLI expects to be able to allocate and create new instances of any value type that would require 'n' bytes of memory, by simply allocating 'n' bytes and filling them with zero. There's no reason the CLI "couldn't" provide a means of specifying either that before any entity containing structs is made available to outside code, a constructor must be run on every struct therein, or that whenever an instance of a particular n-byte struct is created, the compiler should copy a 'template instance'. As it is, however, the CLI doesn't allow such a thing. Consequently, there's no reason for a compiler to pretend it has a means of assuring that structs will be initialized to anything other than the memory-filled-with-zeroes default.
You cannot write a custom default constructor in a structure. The instance field initializers will eventually need to get moved to the constructor which you can't define.
Static field initializers are moved to a static constructor. You can write a custom static constructor in a struct.
You can do exactly what you're trying. All you're missing is a custom constructor that calls the default constructor:
struct Animal
{
public string name = "";
public static int weight = 20;
public Animal(bool someArg) : this() { }
}
The constructor has to take at least one parameter, and then it has to forward to this() to get the members initialised.
The reason this works is that the compiler now has a way to discover the times when the code should run to initialise the name field: whenever you write new Animal(someBool).
With any struct you can say new Animal(), but "blank" animals can be created implicitly in many circumstances in the workings of the CLR, and there isn't a way to ensure custom code gets run every time that happens.
I have an assignment for my first OOP class, and I understand all of it including the following statement:
You should create a class called ComplexNumber. This class will contain the real and imaginary parts of the complex number in private data members defined as doubles. Your class should contain a constructor that allows the data members of the imaginary number to be specified as parameters of the constructor. A default (non-parameterized) constructor should initialize the data members to 0.0.
Of course I know how to create these constructors without chaining them together, and the assignment does not require chaining them, but I want to as I just like to.
Without chaining them together, my constructors look like this:
class ComplexNumber
{
private double realPart;
private double complexPart;
public ComplexNumber()
{
realPart = 0.0;
complexPart = 0.0;
}
public ComplexNumber(double r, double c)
{
realPart = r;
complexPart = c;
}
// the rest of my class code...
}
Is this what you're looking for?
public ComplexNumber()
: this(0.0, 0.0)
{
}
public ComplexNumber(double r, double c)
{
realPart = r;
complexPart = c;
}
@Rex has the correct answer for chaining.
However, in this case neither chaining nor any explicit initialization is necessary. The CLR will initialize fields to their default values during object construction. For doubles this means they are initialized to 0.0. So the assignment in the default constructor is not strictly necessary.
Some people prefer to explicitly initialize their fields for documentation or readability though.
I am still trying to grasp the concept of constructor-chaining, so it works, but why/how?
The 'how' of constructor chaining is the 'this' keyword in the constructor definition, as shown in Rex M's example.
The 'why' of constructor chaining is to reuse the implementation of a constructor. If the implementation (body) of the 2nd constructor were long and complicated, then you'd want to reuse it (i.e. chain to it or invoke it) instead of copy-and-pasting it into other constructors. An alternative might be to put that code, which is shared between several constructors, into a common subroutine which is invoked from several constructors: however, this subroutine wouldn't be allowed to initialize readonly fields (which can only be initialized from a constructor and not from a subroutine), so constructor chaining is a work-around for that.
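The same idea can be sketched in Java, where `final` plays roughly the role of C#'s `readonly`: a final field must be assigned in a constructor or initializer, not in an ordinary helper method, so chaining with `this(...)` is the way to share the assignment. This is a Java rendering of the question's `ComplexNumber`, not the original C# code; the accessor names are assumptions.

```java
// Hypothetical demo; run with: java ChainingDemo.java
class ChainingDemo {
    public static void main(String[] args) {
        ComplexNumber zero = new ComplexNumber();
        ComplexNumber z = new ComplexNumber(1.5, -2.0);
        System.out.println(zero.real() + " " + zero.complex()); // 0.0 0.0
        System.out.println(z.real() + " " + z.complex());       // 1.5 -2.0
    }
}

class ComplexNumber {
    // final fields can only be assigned during construction.
    private final double realPart;
    private final double complexPart;

    // Default constructor reuses the two-argument constructor.
    ComplexNumber() {
        this(0.0, 0.0);
    }

    // The single place where the fields are actually assigned.
    ComplexNumber(double r, double c) {
        realPart = r;
        complexPart = c;
    }

    double real() { return realPart; }
    double complex() { return complexPart; }
}
```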
Consider the following class,
class Foo
{
public Foo(int count)
{
/* .. */
}
public Foo(int count)
{
/* .. */
}
}
Above code is invalid and won't compile. Now consider the following code,
class Foo<T>
{
public Foo(int count)
{
/* .. */
}
public Foo(T t)
{
/* .. */
}
}
static void Main(string[] args)
{
Foo<int> foo = new Foo<int>(1);
}
Above code is valid and compiles well. It calls Foo(int count).
My question is: if the first one is invalid, how can the second one be valid? I know class Foo<T> is valid because T and int are different types. But when it is used like Foo<int> foo = new Foo<int>(1), T becomes int and both constructors will have the same signature, right? Why doesn't the compiler show an error rather than choosing an overload to execute?
There is no ambiguity, because the compiler will choose the most specific overload of Foo(...) that matches. Since a method with a generic type parameter is considered less specific than a corresponding non-generic method, Foo(T) is therefore less specific than Foo(int) when T == int. Accordingly, you are invoking the Foo(int) overload.
Your first case (with two Foo(int) definitions) is an error because the compiler will allow only one definition of a method with precisely the same signature, and you have two.
Your question was hotly debated when C# 2.0 and the generic type system in the CLR were being designed. So hotly, in fact, that the "bound" C# 2.0 specification published by A-W actually has the wrong rule in it! There are four possibilities:
1) Make it illegal to declare a generic class that could POSSIBLY be ambiguous under SOME construction. (This is what the bound spec incorrectly says is the rule.) So your Foo<T> declaration would be illegal.
2) Make it illegal to construct a generic class in a manner which creates an ambiguity. Declaring Foo<T> would be legal, constructing Foo<double> would be legal, but constructing Foo<int> would be illegal.
3) Make it all legal and use overload resolution tricks to work out whether the generic or nongeneric version is better. (This is what C# actually does.)
4) Do something else I haven't thought of.
Rule #1 is a bad idea because it makes some very common and harmless scenarios impossible. Consider for example:
class C<T>
{
public C(T t) { ... } // construct a C that wraps a T
public C(Stream state) { ... } // construct a C based on some serialized state from disk
}
You want that to be illegal just because C<Stream> is ambiguous? Yuck. Rule #1 is a bad idea, so we scrapped it.
Unfortunately, it is not as simple as that. IIRC the CLI rules say that an implementation is allowed to reject as illegal constructions that actually do cause signature ambiguities. That is, the CLI rules are something like Rule #2, whereas C# actually implements Rule #3. Which means that there could in theory be legal C# programs that translate into illegal code, which is deeply unfortunate.
For some more thoughts on how these sorts of ambiguities make our lives wretched, here are a couple of articles I wrote on the subject:
http://blogs.msdn.com/ericlippert/archive/2006/04/05/569085.aspx
http://blogs.msdn.com/ericlippert/archive/2006/04/06/odious-ambiguous-overloads-part-two.aspx
Eric Lippert blogged about this recently.
The fact is that they do not both have the same signature: one uses generics while the other does not.
With those methods in place you could also call it using a non-int object:
Foo<string> foo = new Foo<string>("Hello World");