C# initial value in constructor OR class variable declaration? [duplicate]

I've been programming in C# and Java recently and I am curious where the best place is to initialize my class fields.
Should I do it at declaration?:
public class Dice
{
private int topFace = 1;
private Random myRand = new Random();
public void Roll()
{
// ......
}
}
or in a constructor?:
public class Dice
{
private int topFace;
private Random myRand;
public Dice()
{
topFace = 1;
myRand = new Random();
}
public void Roll()
{
// .....
}
}
I'm really curious what some of you veterans think is the best practice. I want to be consistent and stick to one approach.

My rules:
Don't initialize with the default values in declaration (null, false, 0, 0.0…).
Prefer initialization in declaration if you don't have a constructor parameter that changes the value of the field.
If the value of the field changes because of a constructor parameter put the initialization in the constructors.
Be consistent in your practice (the most important rule).

In C# it mostly doesn't matter. For the two code samples you give, the result is effectively the same. In the first example the C# compiler (not the CLR) generates a parameterless constructor and emits the field initialisers at the start of it, as if they had been written in the constructor (there's a slight nuance to this that Jon Skeet explains in the comments below).
If there is already a constructor then any initialisation "above" will be moved into the top of it. The same happens for every constructor that does not chain to another constructor of the same class with this(...).
In terms of best practice the former is less error prone than the latter, as someone could easily add another constructor and forget to chain it.
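To make the "forget to chain" risk concrete, here is a minimal sketch. The Inventory class and its members are hypothetical names invented for illustration; they are not from the question.

```csharp
using System.Collections.Generic;

// Hypothetical class for illustration only.
public class Inventory
{
    // Initialised at declaration: the compiler emits this into every
    // constructor that does not chain to another via this(...).
    private readonly List<string> inlineItems = new List<string>();

    // Initialised only in the parameterless constructor below.
    private List<string> ctorItems;

    public Inventory()
    {
        ctorItems = new List<string>();
    }

    // Added later; the author forgot to write ": this()".
    public Inventory(string owner)
    {
        Owner = owner;
    }

    public string Owner { get; } = "nobody";

    public bool InlineReady => inlineItems != null;
    public bool CtorReady => ctorItems != null;
}
```

With this sketch, new Inventory("ann").InlineReady is true but CtorReady is false, because the second constructor never runs the assignment that lives in the first one.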

I think there is one caveat. I once made this mistake: inside a derived class, I tried to "initialize at declaration" fields inherited from an abstract base class. The result was two sets of fields, the "base" ones and the newly declared ones, and it cost me quite some time to debug.
The lesson: to initialize inherited fields, do it inside the constructor.
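A small sketch of that pitfall, using hypothetical Animal, Dog and Cat classes (none of these names come from the answer). Dog re-declares the field to give it an initializer, Cat assigns the inherited field in its constructor.

```csharp
// Hypothetical types for illustration only.
public abstract class Animal
{
    protected int legs;                 // the inherited field
    public int LegsSeenByBase => legs;  // base-class code reads Animal.legs
}

public class Dog : Animal
{
    // Re-declaring the field to attach an initializer creates a second field
    // that hides Animal.legs (the compiler reports warning CS0108).
    protected int legs = 4;
    public int LegsSeenByDog => legs;   // reads Dog.legs, not Animal.legs
}

public class Cat : Animal
{
    public Cat()
    {
        legs = 4;                       // assigns the inherited Animal.legs
    }
}
```

Here new Dog().LegsSeenByBase is still 0 while LegsSeenByDog is 4, which is the "two sets of fields" situation. new Cat().LegsSeenByBase is 4.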

The semantics of C# differ slightly from Java here. In C#, assignment in declaration is performed before calling the superclass constructor. In Java it is done immediately after, which allows 'this' to be used (particularly useful for anonymous inner classes), and means that the semantics of the two forms really do match.
If you can, make the fields final.
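The C# ordering can be observed with a sketch like the following. Shape, Square, Describe and Log are hypothetical names, and calling a virtual member from a base constructor is done here only to peek at the derived object's state at that moment. It is not a recommendation.

```csharp
// Hypothetical types for illustration only.
public abstract class Shape
{
    public string Log { get; }

    protected Shape()
    {
        // Runs after Square's field initializers but before Square's constructor body.
        Log = Describe();
    }

    protected abstract string Describe();
}

public class Square : Shape
{
    private readonly int inlineSide = 4;  // field initializer
    private readonly int ctorSide;        // assigned in the constructor body

    public Square()
    {
        ctorSide = 4;
    }

    protected override string Describe() => $"inline={inlineSide}, ctor={ctorSide}";
}
```

new Square().Log is "inline=4, ctor=0": the initializer has already run when the base constructor executes, while the constructor-body assignment has not. This is the one observable difference between the two styles in C#.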

Assuming the type in your example, definitely prefer to initialize fields in the constructor. The exceptional cases are:
Fields in static classes/methods
Fields typed as static/final/et al
I always think of the field listing at the top of a class as the table of contents (what is contained herein, not how it is used), and the constructor as the introduction. Methods of course are chapters.

In Java, an initializer with the declaration means the field is always initialized the same way, regardless of which constructor is used (if you have more than one) or the parameters of your constructors (if they have arguments), although a constructor might subsequently change the value (if it is not final). So using an initializer with a declaration suggests to a reader that the initialized value is the value that the field has in all cases, regardless of which constructor is used and regardless of the parameters passed to any constructor. Therefore use an initializer with the declaration only if, and always if, the value for all constructed objects is the same.

There are many and various situations.
I just need an empty list
The situation is clear. I just need to prepare my list and prevent an exception from being thrown when someone adds an item to the list.
public class CsvFile
{
private List<CsvRow> lines = new List<CsvRow>();
public CsvFile()
{
}
}
I know the values
I know exactly what values I want to have by default, or I need to use some other logic.
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = new List<string>() {"usernameA", "usernameB"};
}
}
or
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = GetDefaultUsers(2);
}
}
Empty list with possible values
Sometimes I expect an empty list by default with a possibility of adding values through another constructor.
public class AdminTeam
{
private List<string> usernames = new List<string>();
public AdminTeam()
{
}
public AdminTeam(List<string> admins)
{
admins.ForEach(x => usernames.Add(x));
}
}

What if I told you, it depends?
I in general initialize everything and do it in a consistent way. Yes it's overly explicit but it's also a little easier to maintain.
If we are worried about performance, well then I initialize only what has to be done and place it in the areas it gives the most bang for the buck.
In a real time system, I question if I even need the variable or constant at all.
And in C++ I often do next to no initialization in either place and move it into an Init() function. Why? Well, in C++ if you're initializing something that can throw an exception during object construction you open yourself to memory leaks.

The design of C# suggests that inline initialization is preferred, or it wouldn't be in the language. Any time you can avoid a cross-reference between different places in the code, you're generally better off.
There is also the matter of consistency with static field initialization, which needs to be inline for best performance. The Framework Design Guidelines for Constructor Design say this:
✓ CONSIDER initializing static fields inline rather than explicitly using static constructors, because the runtime is able to optimize the performance of types that don’t have an explicitly defined static constructor.
"Consider" in this context means to do so unless there's a good reason not to. In the case of static initializer fields, a good reason would be if initialization is too complex to be coded inline.

Being consistent is important, but this is the question to ask yourself:
"Do I have a constructor for anything else?"
Typically, I am creating models for data transfer, where the class itself does nothing except act as housing for variables.
In these scenarios, I usually don't have any methods or constructors. It would feel silly to me to create a constructor for the exclusive purpose of initializing my lists, especially since I can initialize them in-line with the declaration.
So as many others have said, it depends on your usage. Keep it simple, and don't make anything extra that you don't have to.

Consider the situation where you have more than one constructor. Will the initialization be different for the different constructors? If it will be the same, then why repeat it for each constructor? This is in line with kokos's statement, but may not be related to parameters. Let's say, for example, you want to keep a flag which shows how the object was created. Then that flag would be initialized differently for different constructors regardless of the constructor parameters. On the other hand, if you repeat the same initialization for each constructor you leave the possibility that you (unintentionally) change the initialization parameter in some of the constructors but not in others. So, the basic concept here is that common code should have a common location and not be potentially repeated in different locations. So I would say always put it in the declaration until you have a specific situation where that no longer works for you.

There is a slight performance benefit to setting the value in the declaration. If you set it in the constructor it is actually being set twice (first to the default value, then reset in the ctor).

When you don't need some logic or error handling:
Initialize class fields at declaration
When you need some logic or error handling:
Initialize class fields in constructor
This works well when the initialization value is available and the
initialization can be put on one line. However, this form of
initialization has limitations because of its simplicity. If
initialization requires some logic (for example, error handling or a
for loop to fill a complex array), simple assignment is inadequate.
Instance variables can be initialized in constructors, where error
handling or other logic can be used.
From https://docs.oracle.com/javase/tutorial/java/javaOO/initial.html .

I normally try to have the constructor do nothing but take in the dependencies and initialize the related instance members with them. This will make your life easier if you want to unit test your classes.
If the value you are going to assign to an instance variable is not influenced by any of the parameters you are going to pass to your constructor, then assign it at declaration time.

Not a direct answer to your question about best practice, but an important and related refresher: in a generic class definition, either leave it to the compiler to initialize fields with default values, or use a special method to initialize them to their default values (if that is absolutely necessary for code readability).
class MyGeneric<T>
{
T data;
//T data = ""; // <-- ERROR
//T data = 0; // <-- ERROR
//T data = null; // <-- ERROR
public MyGeneric()
{
// All of the above errors would be errors here in constructor as well
}
}
And the special method to initialize a generic field to its default value is the following:
class MyGeneric<T>
{
T data = default(T);
public MyGeneric()
{
// The same method can be used here in constructor
}
}

"Prefer initialization in declaration", seems like a good general practice.
Here is an example which cannot be initialized in the declaration so it has to be done in the constructor.
"Error CS0236 A field initializer cannot reference the non-static field, method, or property"
class UserViewModel
{
// Cannot be set here
public ICommand UpdateCommand { get; private set; }
public UserViewModel()
{
UpdateCommand = new GenericCommand(Update_Method); // <== THIS WORKS
}
void Update_Method(object? parameter)
{
}
}

Related

Can A subclass be downcast while sent as a parameter to an overloaded function

I have a superclass called Block and three subclasses. The class I want to implement contains three overloaded functions, each of which takes an object of one of the subclasses as a parameter. When I call one of these functions, I only have a Block object (an object of the superclass). My question is: what is the cleanest way to choose which function to call?
What I have done so far is use if conditions on the object type and then cast it, but that seems unclean.
These are the overloaded functions.
public void WriteBlock(TableBlock block) { }
public void WriteBlock(TextBlock block) { }
public void WriteBlock(ListBlock block) { }
And this is the function I want to implement.
public void WriteBlocks(List<Block> blocks)
{
BlockWriter w = new BlockWriter();
foreach (var block in blocks)
{
w.WriteBlock(block);
}
}
Note that I cannot modify the Block classes.
Yes, it is possible using the dynamic type which allows for this.
If you use:
foreach (var block in blocks)
{
w.WriteBlock(block as dynamic);
}
It should call the intended WriteBlock overload.
This is described in greater length in another question: https://stackoverflow.com/a/40618674/3195477
And also here: method overloading and dynamic keyword in C#.
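A self-contained sketch of the dynamic approach follows. Two things are assumptions added for illustration: the WriteBlock overloads return a string (the originals return void) so the chosen overload can be observed, and there is an extra WriteBlock(Block) fallback overload. Without a matching overload, the runtime binder throws a RuntimeBinderException.

```csharp
using System.Collections.Generic;

public class Block { }
public class TableBlock : Block { }
public class TextBlock : Block { }
public class ListBlock : Block { }

public class BlockWriter
{
    // Return values are an assumption made so the dispatch is visible.
    public string WriteBlock(TableBlock block) => "table";
    public string WriteBlock(TextBlock block) => "text";
    public string WriteBlock(ListBlock block) => "list";

    // Hypothetical fallback for Block subtypes that have no dedicated overload.
    public string WriteBlock(Block block) => "unknown";
}

public static class DynamicDispatchDemo
{
    public static List<string> WriteBlocks(List<Block> blocks)
    {
        var w = new BlockWriter();
        var results = new List<string>();
        foreach (var block in blocks)
        {
            // Overload resolution is deferred to run time and uses the
            // runtime type of the argument instead of the static type Block.
            string written = w.WriteBlock((dynamic)block);
            results.Add(written);
        }
        return results;
    }
}
```

Passing a list containing a TextBlock, a TableBlock and a plain Block yields "text", "table", "unknown" in that order.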
Caveats:
I am not sure if there is any runtime penalty associated with this type of dynamic "cast".
Also whenever I see this pattern it makes me wonder if the class hierarchy could be improved, i.e., should whatever WriteBlock does actually be moved inside the Block classes? That might be "more polymorphic". Also, using dynamic could be a somewhat fragile approach, as you can add new Block derived types and forget to add an overloaded WriteBlock for them, which may cause an error. (This is more evidence that some of WriteBlock should be incorporated into the Block classes themselves.)
For instance, add a virtual PrepareForWriting() to the base Block class, which returns a BlockWritable. Then you only need one WriteBlock(BlockWritable data) to do the writing work. BlockWritable could be a string, Json, XML, etc. This assumes you are able to modify the Block classes (which it seems you cannot).
No. Given this:
public void WriteBlocks(List<Block> blocks)
the only thing the compiler knows about each item in the list is that it is a Block. That's all it should know. That's what makes polymorphism possible. There can be any number of classes that inherit from Block, but within this context those distinctions don't matter.
But if all the compiler knows is that each item is a Block, it can't know whether any individual item might be a TableBlock, TextBlock, or some other inherited type. If, at compile time, it doesn't know what the runtime type will be, it can't know whether there even is an overload for that specific type.
Suppose what you're trying to do could compile, because you have an overload for every type that inherited from Block. What would or should happen if you added a new type - class PurpleBlock : Block - and there was no overload for it? Should this no longer compile just because you added a new type?
If the method that calls WriteBlocks knows what sort of Block is in the list, then it can supply that information:
public void WriteBlocks<TBlock>(List<TBlock> blocks) where TBlock : Block
Now you can call WriteBlocks<TextBlock>(listOfTextBlocks) and the compiler will know that each item in the list is a TextBlock, not just a Block.
It follows, then, that BlockWriter would have to be generic also so that you could have different implementations for different types of Block. It might make more sense to inject it. Either way, you're likely to perceive that you've "moved" the problem. If the class that calls WriteBlocks "knows" the type of the Block, then it might make more sense for that method to determine the type of BlockWriter to use.
As mentioned in your comment, the list might include different types of Block, not just one. That requires either a method or a class that returns a specific BlockWriter depending on the type of Block. That means runtime type-checking, which isn't ideal, but it's not too bad if you keep it in one place.
Here's a simple example:
public class BlockWriterFactory
{
public BlockWriter GetBlockWriter(Block block)
{
if (block is TextBlock)
return new TextBlockWriter();
if (block is TableBlock)
return new TableBlockWriter();
if (block is ListBlock)
return new ListBlockWriter();
// this could be a "null" class or some fallback
// default implementation. You could also choose to
// throw an exception.
return new NullBlockWriter();
}
}
(A NullBlockWriter would just be a class that does nothing when you call its Write method.)
This sort of type-checking isn't ideal, but at least this keeps it isolated into one class. Now you can create (or inject) an instance of the factory, and call GetBlockWriter, and the rest of your code in that method still wouldn't "know" anything about the different types of Block or BlockWriter.
BlockWriter w = new BlockWriter();
would become
BlockWriter w = blockWriterFactory.GetBlockWriter(block);
...and then the rest would still be the same.
That's the simplest possible factory example. There are other approaches to creating such a factory. You could store all of your implementations in a Dictionary<Type, BlockWriter> and attempt to retrieve an instance using block.GetType().
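As a rough sketch of the dictionary variant: the abstract BlockWriter base class and the string return value of Write are assumptions made so the example is self-contained and observable. The writer class names follow the factory example above.

```csharp
using System;
using System.Collections.Generic;

public class Block { }
public class TextBlock : Block { }
public class TableBlock : Block { }
public class ListBlock : Block { }

// Assumed shape of the writer hierarchy; Write returns a string only for demonstration.
public abstract class BlockWriter
{
    public abstract string Write(Block block);
}

public class TextBlockWriter : BlockWriter { public override string Write(Block block) => "text"; }
public class TableBlockWriter : BlockWriter { public override string Write(Block block) => "table"; }
public class ListBlockWriter : BlockWriter { public override string Write(Block block) => "list"; }
public class NullBlockWriter : BlockWriter { public override string Write(Block block) => ""; }

public class BlockWriterFactory
{
    // One writer instance per concrete Block type.
    private readonly Dictionary<Type, BlockWriter> writers = new Dictionary<Type, BlockWriter>
    {
        { typeof(TextBlock), new TextBlockWriter() },
        { typeof(TableBlock), new TableBlockWriter() },
        { typeof(ListBlock), new ListBlockWriter() },
    };

    private readonly BlockWriter fallback = new NullBlockWriter();

    public BlockWriter GetBlockWriter(Block block)
    {
        // Exact-type lookup; anything unregistered gets the do-nothing writer.
        return writers.TryGetValue(block.GetType(), out var writer) ? writer : fallback;
    }
}
```

One design difference from the if/is chain: the lookup matches the exact runtime type only, so a class derived from TextBlock would fall through to the NullBlockWriter unless it is registered too.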

Verifying code against template patterns using reflection

I am working on a large project where a base class has thousands of classes derived from it (multiple developers are working on them). Each class is expected to override a set of methods. I first generated these thousands of class files with a code template that conforms to an acceptable pattern. I am now writing unit tests to ensure that developers have not deviated from this pattern. Here is a sample generated class:
// Base class.
public abstract partial class BaseClass
{
protected abstract bool OnTest ();
}
// Derived class. DO NOT CHANGE THE CLASS NAME!
public sealed partial class DerivedClass_00000001: BaseClass
{
/// <summary>
/// Do not modify the code template in any way.
/// Write code only in the try and finally blocks in this method.
/// </summary>
protected override void OnTest ()
{
bool result = false;
ComObject com = null;
// Declare ALL value and reference type variables here. NOWHERE ELSE!
// Variables that would otherwise be narrowly scoped should also be declared here.
// Initialize all reference types to [null]. [object o;] does not conform. [object o = null;] conforms.
// Initialize all value types to their default values. [int x;] does not conform. [int x = 0;] conforms.
try
{
com = new ComObject();
// Process COM objects here.
// Do NOT return out of this function yourself!
}
finally
{
// Release all COM objects.
System.Runtime.InteropServices.Marshal.ReleaseComObject(com);
// Set all COM objects to [null].
// The base class will take care of explicit garbage collection.
com = null;
}
return (result);
}
}
In the unit tests, I have been able to verify the following via reflection:
The class derives from [BaseClass] and does not implement any interfaces.
The class name conforms to a pattern.
The catch block has not been filtered.
No other catch blocks have been added.
No class level fields or properties have been declared.
All method value type variables have been manually initialized upon declaration.
No other methods have been added to the derived classes.
The above is easily achieved via reflection but I am struggling with asserting the following list:
The catch block re-throws the caught exception rather than wrapping it or throwing some other exception.
The [return (result);] line at the end has not been modified and no other [return (whatever);] calls have been added. No idea how to achieve this.
Verify that all reference types implementing IDisposable have been disposed.
Verify that all reference types of type [System.__ComObject] have been manually de-referenced and set to [null] in the finally block.
I have thought about parsing the source code but I don't like that solution unless absolutely necessary. It is messy and unless I have expression trees, almost impossible to guarantee success.
Any tips would be appreciated.
Some thoughts:
If the methods need to be overridden, why are they virtual instead of abstract?
Code that should not be changed doesn't belong in the derived class. It belongs in the base class.
catch { throw; } is useless. Remove it.
Returning a boolean value from a void method causes a compiler error.
Setting local variables to null is useless.
Not all reference types implement IDisposable.
Generally: Most of your requirements seem to have no business value.
Why prohibit implementation of an interface?
Why prohibit declaration of other methods?
Why prohibit catch clauses?
etc.
You should really think about what your actual business requirements are and model your classes after them. If the classes need to fulfill a certain contract, model that contract. Leave the implementation to the implementor.
About the actual questions raised:
You can't use reflection here. You can either analyze the original source code or the IL code of the compiled assembly.
Both options are pretty tricky and most likely impossible to achieve within your limited time. I am positive that fixing the architecture would take less time than implementing one of those options.
You could try to use Roslyn CTP here if the fully automated code analysis is what you really need. It has more advanced syntax and semantics analysis than reflection does. But it is still a lot of work. Working directly with developers, not with their code, preparing templates, guidelines may be more time efficient.
While I'm sure you have a very good reason for such rigid requirements... have you considered passing a lambda/delegate/Action to the Test function instead?
It can't solve everything, but it would more naturally give you some of the behaviours you want (e.g. can't return, can't have class level variables, can't write code anywhere but where specified).
The biggest concern with it would be captured variables... but there may be workarounds for that.
Example Code:
//I'd make a few signatures....
bool OnTest<T1, T2> (Action<ComObject, T1, T2> logic, T1 first, T2 second)
{
bool result = false;
ComObject com = null;
//no checks needed re parameters
//Can add reflection tests here if wanted before code is run.
try
{
com = new ComObject();
//can't return
logic(com, first,second);
}
finally
{
// Release all COM objects.
System.Runtime.InteropServices.Marshal.ReleaseComObject(com);
// Set all COM objects to [null].
// The base class will take care of explicit garbage collection.
com = null;
//If you want, we can check each argument and if it is disposable dispose.
if (first is IDisposable && first != null) ((IDisposable) first).Dispose();
...
}
return (result); //can't be changed
}
No idea if this'll work, but it's just a thought. Oh, and as a thought it's not thorough or tested - I'd expect you to develop it drastically.

C# myths about best practices?

My colleague keeps telling me of the things listed in comments.
I am confused.
Can somebody please demystify these things for me?
class Bar
{
private int _a;
public int A
{
get { return _a; }
set { _a = value; }
}
private Foo _objfoo;
public Foo OFoo
{
get { return _objfoo; }
set { _objfoo = value; }
}
public Bar(int a, Foo foo)
{
// this is a bad idea
A = a;
OFoo = foo;
}
// MYTHS
private void Method()
{
this.A //1 -
this._a //2 - use this when inside the class e.g. if(this._a == 2)
A //3 - use this outside the class e.g. barObj.A
_a //4 -
// Not using this.xxx creates threading issues.
}
}
class Foo
{
// implementation
}
The this. is redundant if there isn't a name collision. You only need it when you need a reference to the current object or if you have an argument with the same name as a field.
Threading issues have nothing to do with it. The confusion maybe comes from the fact that most static members are implemented so that they are thread-safe and static members cannot (!) be called with this. since they aren't bound to the instance.
"Not using this.xxx creates threading issues"
is a complete myth. Just ask your co-worker to check the generated IL and have him explain why it is the same whether you add this or not.
"use this when inside the class e.g. if(this._a == 2)"
is down to what you want to achieve. What your co-worker seems to be saying is always reference the private field, which does not seem to me sensible. Often you want to access the public property, even inside a class, since the getter may modify the value (for instance, a property of type List may return a new List instance when the list is null to avoid null reference exceptions when accessing the property).
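The lazy-list case mentioned above could look roughly like this. Playlist and its members are hypothetical names invented for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical class for illustration only.
public class Playlist
{
    private List<string> _tracks;   // stays null until the property is first read

    public List<string> Tracks
    {
        get
        {
            // The getter guarantees a non-null list.
            if (_tracks == null)
            {
                _tracks = new List<string>();
            }
            return _tracks;
        }
    }

    // Goes through the property, so it is safe on a fresh instance.
    public int CountViaProperty() => Tracks.Count;

    // Bypasses the property; throws NullReferenceException if Tracks was never read.
    public int CountViaField() => _tracks.Count;
}
```

On a fresh instance CountViaProperty() returns 0, while CountViaField() throws. That is the kind of difference that makes "always use the private field inside the class" a questionable rule.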
My personal "best practice" is to always use this. Yes, it's redundant, but it's a great way to identify at first glance where the state of the instance is changed or retrieved when you consider a multi-threaded app.
It may help to ask your co-worker why he considers these suggestions are best practice? Often people quote best-practice "rules" that they have picked up somewhere without any real understanding of the reasons behind the practices.
As Lucero says, the "this" is not required unless there is a name collision. However, some people like to include the "this" when it is not strictly required, because they believe it enhances readability or more clearly shows the programmer's intentions. In my opinion, this is a matter of personal preference rather than anything else.
As for the "bad idea" in your "Bar" constructor: your co-worker may consider this bad practice for the following reason: if the setter for "A" is altered to have some side effect then A = a; will also produce this side effect, whereas _a = a; will just set the private variable. In my view, best practice is a matter of being aware of the difference rather than preferring one over another.
Finally, the "threading issues" are nonsense - AFAIK "this" has nothing to do with threading.
The number 2 is a myth that is easily debunked by mentioning automatic properties. Automatic properties allow you to define a property without the backing field which is automatically generated by the compiler. So ask your co-worker what is his opinion about automatic properties.
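For reference, the Bar class from the question could be written with automatic properties roughly as follows (a sketch, not the asker's code). There is no _a or _objfoo in the source at all, so "always use the private field inside the class" has nothing to point at.

```csharp
public class Foo
{
    // implementation
}

public class Bar
{
    // The compiler generates the hidden backing fields.
    public int A { get; set; }
    public Foo OFoo { get; set; }

    public Bar(int a, Foo foo)
    {
        // The properties are the only way to reach the state, even inside the class.
        A = a;
        OFoo = foo;
    }
}
```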

Proper way to accomplish this construction using constructor chaining?

I have an assignment for my first OOP class, and I understand all of it including the following statement:
You should create a class called ComplexNumber. This class will contain the real and imaginary parts of the complex number in private data members defined as doubles. Your class should contain a constructor that allows the data members of the imaginary number to be specified as parameters of the constructor. A default (non-parameterized) constructor should initialize the data members to 0.0.
Of course I know how to create these constructors without chaining them together, and the assignment does not require chaining them, but I want to as I just like to.
Without chaining them together, my constructors look like this:
class ComplexNumber
{
private double realPart;
private double complexPart;
public ComplexNumber()
{
realPart = 0.0;
complexPart = 0.0;
}
public ComplexNumber(double r, double c)
{
realPart = r;
complexPart = c;
}
// the rest of my class code...
}
Is this what you're looking for?
public ComplexNumber()
: this(0.0, 0.0)
{
}
public ComplexNumber(double r, double c)
{
realPart = r;
complexPart = c;
}
@Rex has the correct answer for chaining.
However, in this case neither chaining nor any initialization is necessary. The CLR will initialize fields to their default values during object construction. For doubles this means they will be initialized to 0.0. So the assignment in the default constructor is not strictly necessary.
Some people prefer to explicitly initialize their fields for documentation or readability though.
I am still trying to grasp the concept of constructor-chaining, so it works, but why/how?
The 'how' of constructor chaining is the 'this' keyword in the constructor definition, as shown in Rex M's example.
The 'why' of constructor chaining is to reuse the implementation of a constructor. If the implementation (body) of the 2nd constructor were long and complicated, then you'd want to reuse it (i.e. chain to it or invoke it) instead of copy-and-pasting it into other constructors. An alternative might be to put that code, which is shared between several constructors, into a common subroutine which is invoked from several constructors: however, this subroutine wouldn't be allowed to initialize readonly fields (which can only be initialized from a constructor and not from a subroutine), so constructor chaining is a work-around for that.
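A sketch of the readonly point, reusing the ComplexNumber fields from the question. Making them readonly, the commented-out Init method and the Real/Imaginary properties are additions for illustration.

```csharp
public class ComplexNumber
{
    private readonly double realPart;
    private readonly double complexPart;

    // Chains to the two-argument constructor, which does the actual work.
    public ComplexNumber()
        : this(0.0, 0.0)
    {
    }

    public ComplexNumber(double r, double c)
    {
        realPart = r;       // allowed: readonly fields may be assigned in a constructor
        complexPart = c;
    }

    // A shared helper cannot do the same job; uncommenting it gives error CS0191
    // (a readonly field cannot be assigned to outside a constructor or initializer).
    // private void Init(double r, double c) { realPart = r; complexPart = c; }

    public double Real => realPart;
    public double Imaginary => complexPart;
}
```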

C# this.everything?

I've decided to use this.variableName when referring to string/int etc.. fields.
Would that include ArrayList, ListBox etc too?
Like:
private ListBox usersListBox;
private void PopulateListBox()
{
this.usersListBox.Items.Add(...);
}
...Or not?
And what about classes?
MyClass myClass;
private void PlayWithMyClass()
{
this.myClass = new MyClass();
this.myClass.Name = "Bob";
}
?
This looks kind of odd to me.
And I don't know if I should use this.PublicProperty or only private fields.
I'm not 100% with the C# terminology, but hopefully what I said makes sense.
I used to do that sort of thing, but now I find that IDEs are pretty smart about giving me a visual indication that I'm dealing with a member variable. I only use "this" when it's necessary to distinguish the member variable from a parameter of the same name.
The this. prefix lets you access anything that is in scope in the class you are executing in. You can access private and public variables, and since everything in C# is an object, accessing a class instance works the same way as accessing a string.
You don't have to use this in your code if you don't want to, as it is implied in C# unless a method parameter and a field have the same name.
Less is more. Less text to parse is more readable.
I use this in the constructors since my parameters and member variables have the same names (I don't like marking member variables with _).
public class A
{
int a;
public A(int a)
{
this.a = a;
}
}
If your class is small enough and does one thing well, then usually you wouldn't need to add this for the sake of readability.
If you can read the whole class easily, what would be the point? It'd be more typing and clutter the code, thus possibly degrading the readability.
The 'this' keyword can be used with any instance member of an object. So this means you can use it to reference a field that holds a class instance (e.g. usersListBox, myClass, etc.).
It's perfectly fine.
Some people use it to clearly explain what they are referencing so people understand that the instances are in the scope of the code and not external or part of another instance or static member elsewhere.
Finally, you can use it to reference both private and/or public properties and fields and members.
This is nothing more than a keyword pointing to the current instance. In a function, this.foo is generally the same as foo.
As msdn tells you:
The this keyword refers to the current instance of the class.
The page about the this keyword contains a lot more info.
As the this. is implicit you only need to actually use it when disambiguating between class variables and local variables of the same name.
The examples you've given would work how you've written them or like this:
private ListBox usersListBox;
private void PopulateListBox()
{
usersListBox.Items.Add(...);
}
MyClass myClass;
private void PlayWithMyClass()
{
myClass = new MyClass();
myClass.Name = "Bob";
}
It's just a matter of personal preference. If you do choose one over the other, try to be consistent.
