During my research into the best way to build a Singleton in C# I stumbled across the following article where there is a brief mention that in C++
"The C++ specification left some ambiguity around the initialization
order of static variables."
I looked into the question and found this and this. The basic point (as far as I understand) is that the initialization order of static variables in C++ is undefined. So far so good, but then I wanted to understand the following statement that the article makes later:
"Fortunately, the .NET Framework resolves this ambiguity through its
handling of variable initialization."
So I found this page where they say
The static field variable initializers of a class correspond to a
sequence of assignments that are executed in the textual order in
which they appear in the class declaration.
and give the example of
using System;
class Test
{
    static void Main() {
        Console.WriteLine("{0} {1}", B.Y, A.X);
    }
    public static int F(string s) {
        Console.WriteLine(s);
        return 1;
    }
}
class A
{
    static A() {}
    public static int X = Test.F("Init A");
}
class B
{
    static B() {}
    public static int Y = Test.F("Init B");
}
the output must be:
Init B
Init A
1 1
"Because the rules for when static constructors execute (as defined in
Section 10.11) provide that B's static constructor (and hence B's
static field initializers) must run before A's static constructor and
field initializers."
What confuses me is this: my understanding was that the initialization order of static variables in these examples depends on when a method or field of the class is first accessed, which in turn depends on the execution order of the code (in this case, left to right). In other words, it is completely independent of where, or in what order, the classes are declared. Yet as I read that article, it says the output is a result of the order in which those classes are declared, which my testing doesn't back up.
Could someone please clarify this (and the point the article is trying to make) for me, and perhaps provide a better example that illustrates the behaviour described?
The static field variable initializers of a class correspond to a
sequence of assignments that are executed in the textual order in
which they appear in the class declaration.
This means that within the same class, static fields are initialized in order of appearance in the source code. For example:
class A
{
    public static int X = Test.F("Init A.X");
    public static int Y = Test.F("Init A.Y");
}
When it's time for the static fields to be initialized, X is guaranteed to be initialized before Y.
"Because the rules for when static constructors execute (as defined in
Section 10.11) provide that B's static constructor (and hence B's
static field initializers) must run before A's static constructor and
field initializers."
This means that the static constructor and member initialization for each class will run in evaluation order when expressions that access these classes appear¹. The relative order of appearance of the class definitions in source code does not play any role, even if they appear in the same source file (which they most certainly are not obliged to do). For example:
static void Main() {
Console.WriteLine("{0} {1}", B.Y, A.X);
}
Assuming that neither A nor B has already been statically initialized, order of evaluation guarantees that all the fields of B will be initialized before any field of A. The fields of each class will be initialized in the order specified by the first rule.
¹ for the purposes of this discussion I am ignoring the existence of beforefieldinit.
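Both rules can be seen together in a small program. This is only a sketch: it assumes explicit static constructors on A and B (so that beforefieldinit does not apply), and the Program.Log list is a hypothetical helper added here to record the order. A is deliberately declared before B in the source.

```csharp
using System;
using System.Collections.Generic;

class A // declared first in the source, yet initialized second
{
    static A() { } // explicit static constructor: no beforefieldinit
    public static int X = Program.F("Init A.X");
    public static int Y = Program.F("Init A.Y");
}

class B
{
    static B() { }
    public static int Y = Program.F("Init B.Y");
}

class Program
{
    // Hypothetical helper: records the order in which initializers ran.
    public static readonly List<string> Log = new List<string>();

    public static int F(string s)
    {
        Log.Add(s);
        Console.WriteLine(s);
        return 1;
    }

    static void Main()
    {
        // Arguments are evaluated left to right, so B is touched before A.
        Console.WriteLine("{0} {1}", B.Y, A.X);
    }
}
```

Under those assumptions the program prints Init B.Y, then Init A.X, then Init A.Y, then 1 1: the order across classes follows first use, and the order within A follows the text.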
In C++ the order of initialization of variables with static storage duration in a single translation unit is the order in which the definitions of such variables occur. It is unspecified what the order of initialization of variables with static storage duration is across different translation units.
That is, the C++ standard does offer a similar guarantee to what you quoted, substituting the order of declaration in the class for the order of definition in the single translation unit that defines such variables. But that is not the important difference.
While that is the only guarantee in C++, C# adds the guarantee that a class's static members are initialized before the class is first used. So if your program depends on A (consider each type living in a different assembly, which is the worst case), the first use of A starts the initialization of all static fields in A. If any of those initializers in turn depends on B, the initialization of B's static members is triggered at that point.
Contrast that with C++, where during static initialization[*] all other variables with static storage duration are assumed to be initialized already. This is the main difference: C++ assumes that they are initialized, whereas C# ensures that they are before that use.
[*] Technically, the case where this is problematic is what the standard calls dynamic initialization. Initialization of variables with static storage duration inside each translation unit is a two-step process: in the first pass, static initialization sets the variables to a fixed constant expression, and in a second pass, called dynamic initialization, all variables with static storage duration whose initializer is not a constant expression are initialized.
I was reading MSDN Documentation and there seems to be a contradiction.
Static members are initialized before the static member is accessed
for the first time and before the static constructor, if there is one,
is called.
also in the next paragraph or so,
If your class contains static fields, provide a static constructor
that initializes them when the class is loaded.
If the static constructor's purpose is to initialize the static members of the class, how come it says that static members are initialized even before the static constructor is called?
Is it like this? If I write:
public static int age = 10;
static SimpleClass()
{
    age = 20;
}
Does that mean that age is first initialized to 10 and then overwritten with 20?
The second quote is a recommendation: Microsoft recommends initializing static fields in the static constructor rather than at the point of declaration, to avoid ordering issues, especially with partial classes, where a wrong order can cause null reference exceptions.
Indeed, with partial classes the order in which the field initializers run is not guaranteed. With a static constructor, it is.
You can also use properties, which avoid a null reference exception as long as the getters don't touch reference-type fields that have not been initialized yet.
So, because of the first quote, the answer to your question is yes: age is first initialized to 10 and then overwritten with 20. With partial classes, however, the order of the field initializers across the parts is not defined, so the result may be hazardous and it can turn into a fight against the debugger.
You can check and investigate this by playing around with breakpoints.
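If you would rather not rely on breakpoints, a minimal sketch like the following makes the order visible. SeenInConstructor is a hypothetical field added only to capture the value of age as the static constructor sees it, before it is overwritten.

```csharp
using System;

class SimpleClass
{
    public static int age = 10;          // field initializer runs first
    public static int SeenInConstructor; // hypothetical probe, defaults to 0

    static SimpleClass()
    {
        SeenInConstructor = age; // captures 10: the initializer has already run
        age = 20;                // then the value is overwritten
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine("{0} {1}", SimpleClass.SeenInConstructor, SimpleClass.age); // 10 20
    }
}
```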
The DependencyProperty.AddOwner MSDN page offers an example with two classes with static members, where the member of one class depends on the member of the other class for its initialization. I think MSDN is wrong: the initialization order of static variables is unreliable in C#, just like it is in C++ or anywhere else. I'm probably wrong, because the WPF library itself is written that way and it works just fine. What am I missing? How can the C# compiler possibly know the safe initialization order?
It's fine for one type to depend on another type being initialized, so long as you don't end up in a cycle.
Basically this is fine:
public class Child
{
    static Child() {} // Added static constructor for extra predictability
    public static readonly int X = 10;
}
public class Parent
{
    static Parent() {} // Added static constructor for extra predictability
    public static readonly int Y = Child.X;
}
The result is well-defined. Child's static variable initializers are executed prior to the first access to any static field in the class, as per section 10.5.5.1 of the spec.
This isn't though:
public class Child
{
    public static readonly int Nasty = Parent.Y;
    public static readonly int X = 10;
}
public class Parent
{
    public static readonly int Y = Child.X;
}
In this latter case, you either end up with Child.Nasty=0, Parent.Y=10, Child.X=10 or Child.Nasty=0, Parent.Y=0, Child.X=10 depending on which class is accessed first.
Accessing Parent.Y first will start initializing Parent first, which triggers the initialization of Child. The initialization of Child will realise that Parent needs to be initialized, but the CLR knows that it's already being initialized, so carries on regardless, leading to the first set of numbers - because Child.X ends up being initialized before its value is used for Parent.Y.
Accessing Child.Nasty will start initializing Child first, which will then start to initialize Parent. The initialization of Parent will realise that Child needs to be initialized, but the CLR knows that it's already being initialized, so carries on regardless, leading to the second set of numbers.
Don't do this.
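For anyone who wants to watch the cycle happen, here is a runnable sketch of the first scenario (Parent touched first). It differs from the snippet above in one respect, which is an assumption made for the demo: both classes get an empty static constructor so that initialization is triggered exactly at first access.

```csharp
using System;

public class Child
{
    static Child() { } // added for the demo: precise initialization timing
    public static readonly int Nasty = Parent.Y; // Parent is mid-initialization here, so this reads 0
    public static readonly int X = 10;
}

public class Parent
{
    static Parent() { } // added for the demo: precise initialization timing
    public static readonly int Y = Child.X; // triggers Child's initialization, then reads 10
}

class Program
{
    static void Main()
    {
        int y = Parent.Y; // touch Parent first
        Console.WriteLine("Child.Nasty={0}, Parent.Y={1}, Child.X={2}", Child.Nasty, y, Child.X);
        // Child.Nasty=0, Parent.Y=10, Child.X=10
    }
}
```

Swapping the first access to Child.Nasty should give the second set of numbers instead, which is exactly why the cycle is best avoided.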
EDIT: Okay, more detailed explanation, as promised.
When is a type initialized?
If a type has a static constructor, it will only be initialized
when it's first used (either when a static member is referenced, or
when an instance is created). If it doesn't have a static
constructor, it can be initialized earlier. In theory, it could also
be initialized later; you could theoretically call a constructor or
a static method without the static variables being initialized - but
it must be initialized before static variables are referenced.
What happens during initialization?
First, all static variables receive their default values (0, null
etc).
Then the static variables of the type are initialized in textual
order. If the initializer expression for a static variable requires
another type to be initialized, then that other type will be
completely initialized before the variable's value is assigned -
unless that second type is already being initialized (due to a
cyclic dependency). Essentially, a type is either:
Already initialized
Being initialized at the moment
Not initialized
Initialization is only triggered if the type is not initialized.
This means that when there are cyclic dependencies, it is possible
to observe a static variable's value before its initial value has
been assigned. That's what my Child/Parent example shows.
After all the static variable initializers have executed, the static
constructor executes.
See section 10.12 of the C# spec for more details on all of this.
By popular demand, here was my original answer when I thought the question was about the initialization order of static variables within a class:
Static variables are initialized in textual order, as per section 10.5.5.1 of the C# spec:
The static field variable initializers
of a class correspond to a sequence of
assignments that are executed in the
textual order in which they appear in
the class declaration.
Note that partial types make this trickier as there's no one canonical "textual order" of the class.
If you are concerned about the order you could always place your code in the static constructor. This is where I register my dependency properties.
No, I don't think unreliable is the correct word here.
In a truly single-threaded scenario, the static members of a class are initialized when any static member of the type is first accessed in your code.
I don't know C++, but problems arise only in certain cases, such as a multithreaded environment: if two types try to access a shared static resource, it is impossible to tell which one will win and which one will behave correctly.
The MSDN example is correct and will work correctly.
I have C++ code wrapped in a C DLL. The DLL is called from my C# project.
In my wrapper functions I call a lot of singletons, which are set up as follows:
ComponentManager &ComponentManager::_cmpManager()
{
    static ComponentManager ONLY_ONE;
    return ONLY_ONE;
}
The above function is a static function inside my ComponentManager class.
Here is the specific problem:
bool createNewEntity(char *c)
{
    if (ComponentManager::_cmpManager().nameAvailable(c))
    {
        Entity e(c);
        Transform t;
        ComponentManager::_cmpManager().addComponent(c, t);
        SceneNode sc(CMP_MANAGER2.getComponent<Transform>(c));
        SCENE_MANAGER.addSceneNode(sc, e.entityName);
        return true;
    }
    return false;
}
Essentially, the singleton holds a hash map keyed by string, and this function checks whether the key already exists. The check always returns true. When I use a global object of type ComponentManager instead of the singleton, it behaves correctly, so something tells me the singleton keeps leaving scope and deleting itself. Also, if I use the singleton in an application exe rather than a DLL, it behaves correctly. So I have two questions:
Is there a way to keep my singleton from going out of scope?
If not, is there another way of setting up singletons so that they are not deleted after leaving scope?
The C++ static keyword is a bit different from the C# static keyword.
See https://msdn.microsoft.com/en-us/library/y5f6w579.aspx for a description.
Item 2 there says:
2. When you declare a variable in a function, the static keyword specifies that the variable retains its state between calls to that function.
Try declaring your static not inside the method but at class scope, as per item 3 in the above reference:
3. When you declare a data member in a class declaration, the static keyword specifies that one copy of the member is shared by all instances of the class. A static data member must be defined at file scope. An integral data member that you declare as const static can have an initializer.
You will also need to define that static member at file scope.
Why it works when it is not in a DLL is not exactly clear. It is probably a peculiarity of the linker. If the class is declared in a DLL, it probably tries to instantiate the class every time, so the static inside the function is a new one every time. But if the class is inside the exe file, it is somehow the same class every time, and when you call your method _cmpManager() it always accesses the same instance of the class.
Just my two cents :-).
After watching the webinar Jon Skeet Inspects ReSharper, I started to play a little with recursive constructor calls and found that the following code is valid C# code (by valid I mean it compiles).
class Foo
{
    int a = null;
    int b = AppDomain.CurrentDomain;
    int c = "string to int";
    int d = NonExistingMethod();
    int e = Invalid<Method>Name<<Indeeed();
    Foo() :this(0) { }
    Foo(int v) :this() { }
}
As we all probably know, the compiler moves field initialization into the constructors. So if you have a field like int a = 42;, you will have a = 42 in all constructors. But if one constructor calls another constructor, the initialization code appears only in the called one.
For example, if a constructor with parameters calls the default constructor, the assignment a = 42 appears only in the default constructor.
To illustrate the second case, the following code:
class Foo
{
    int a = 42;
    Foo() :this(60) { }
    Foo(int v) { }
}
Compiles into:
internal class Foo
{
    private int a;
    private Foo()
    {
        this.ctor(60);
    }
    private Foo(int v)
    {
        this.a = 42;
        base.ctor();
    }
}
So the main issue is that my code, given at the start of this question, is compiled into:
internal class Foo
{
    private int a;
    private int b;
    private int c;
    private int d;
    private int e;
    private Foo()
    {
        this.ctor(0);
    }
    private Foo(int v)
    {
        this.ctor();
    }
}
As you can see, the compiler can't decide where to put the field initialization and, as a result, doesn't put it anywhere. Also note that there are no base constructor calls. Of course, no objects can be created, and you will always end up with a StackOverflowException if you try to create an instance of Foo.
I have two questions:
Why does the compiler allow recursive constructor calls at all?
Why do we observe this compiler behavior for fields initialized within such a class?
Some notes: ReSharper warns you with Possible cyclic constructor calls. Moreover, in Java such constructor calls won't even compile, so the Java compiler is more restrictive in this scenario (Jon mentioned this at the webinar).
This makes these questions more interesting because, with all due respect to the Java community, the C# compiler is at least more modern.
This was compiled using the C# 4.0 and C# 5.0 compilers and decompiled using dotPeek.
Interesting find.
It appears that there are really only two kinds of instance constructors:
1. An instance constructor which chains another instance constructor of the same type, with the : this( ...) syntax.
2. An instance constructor which chains an instance constructor of the base class. This includes instance constructors where no chaining is specified, since : base() is the default.
(I disregarded the instance constructor of System.Object, which is a special case. System.Object has no base class! But System.Object has no fields either.)
Any instance field initializers present in the class need to be copied into the beginning of the body of every instance constructor of type 2 above, whereas instance constructors of type 1 do not need the field assignment code.
So apparently there's no need for the C# compiler to analyze the constructors of type 1 to see whether there are cycles.
Now your example gives a situation where all instance constructors are of type 1. In that situation the field initializer code does not need to be put anywhere. So it is not analyzed very deeply, it seems.
It turns out that when all instance constructors are of type 1, you can even derive from a base class that has no accessible constructor. The base class must be non-sealed, though. For example, if you write a class with only private instance constructors, people can still derive from your class if they make all instance constructors in the derived class be of type 1 above. However, a new object creation expression will never finish, of course. To create instances of the derived class, one would have to "cheat" and use something like the System.Runtime.Serialization.FormatterServices.GetUninitializedObject method.
Another example: The System.Globalization.TextInfo class has only an internal instance constructor. But you can still derive from this class in an assembly other than mscorlib.dll with this technique.
Finally, regarding the
Invalid<Method>Name<<Indeeed()
syntax. According to the C# rules, this is to be read as
(Invalid < Method) > (Name << Indeeed())
because the left-shift operator << has higher precedence than both the less-than operator < and the greater-than operator >. The latter two operators have the same precedence and are left-associative, so they are evaluated from left to right. If the types were
MySpecialType Invalid;
int Method;
int Name;
int Indeeed() { ... }
and if MySpecialType introduced a (MySpecialType, int) overload of the operator <, then the expression
Invalid < Method > Name << Indeeed()
would be legal and meaningful.
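A minimal sketch of such a setup follows. Everything in it is hypothetical: this MySpecialType, its Log field, and the Evaluate helper exist only to show how the expression parses. It also makes one assumption beyond the text above, namely that the < overload returns MySpecialType rather than bool, so that the result can be fed into the matching > overload (C# requires < and > to be declared as a pair).

```csharp
using System;

// Hypothetical type: both operators return MySpecialType so the comparison can be chained.
class MySpecialType
{
    public string Log = "";

    public static MySpecialType operator <(MySpecialType left, int right)
    {
        return new MySpecialType { Log = left.Log + "<" + right };
    }

    public static MySpecialType operator >(MySpecialType left, int right)
    {
        return new MySpecialType { Log = left.Log + ">" + right };
    }
}

class Program
{
    static int Indeeed() { return 3; }

    public static string Evaluate()
    {
        MySpecialType Invalid = new MySpecialType();
        int Method = 1;
        int Name = 2;
        // Parsed as (Invalid < Method) > (Name << Indeeed()), and 2 << 3 is 16.
        MySpecialType result = Invalid < Method > Name << Indeeed();
        return result.Log;
    }

    static void Main()
    {
        Console.WriteLine(Evaluate()); // <1>16
    }
}
```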
In my opinion, it would be better if the compiler issued a warning in this scenario. For example, it could say unreachable code detected and point to the line and column number of the field initializer that is never translated into IL.
I think because the language specification only rules out directly invoking the same constructor that is being defined.
From 10.11.1:
All instance constructors (except those for class object) implicitly include an invocation of another instance constructor immediately before the constructor-body. The constructor to implicitly invoke is determined by the constructor-initializer
...
An instance constructor initializer of the form this(argument-listopt) causes an instance constructor from the class itself to be invoked ... If an instance constructor declaration includes a constructor initializer that invokes the constructor itself, a compile-time error occurs
That last sentence seems to make only a direct call to itself a compile-time error, e.g.
Foo() : this() {}
is illegal.
I admit though - I can't see a specific reason for allowing it. Of course, at the IL level such constructs are allowed because different instance constructors could be selected at runtime, I believe - so you could have recursion provided it terminates.
I think the other reason it doesn't flag or warn on this is because it has no need to detect this situation. Imagine chasing through hundreds of different constructors, just to see if a cycle does exist - when any attempted usage will quickly (as we know) blow up at runtime, for a fairly edge case.
When it's doing code generation for each constructor, all it considers is constructor-initializer, the field initializers, and the body of the constructor - it doesn't consider any other code:
If constructor-initializer is an instance constructor for the class itself, it doesn't emit the field initializers - it emits the constructor-initializer call and then the body.
If constructor-initializer is an instance constructor for the direct base class, it emits the field initializers, then the constructor-initializer call, and then the body.
In neither case does it need to go looking elsewhere - so it's not a case of it being "unable" to decide where to place the field initializers - it's just following some simple rules that only consider the current constructor.
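The effect of those two rules can be observed at runtime with a small sketch. The Trace helper and the Log list are hypothetical additions used only to record the order; the constructors are made public so the class can be instantiated from Main.

```csharp
using System;
using System.Collections.Generic;

class Foo
{
    // Hypothetical helper state: records what ran, and in what order.
    public static readonly List<string> Log = new List<string>();

    static int Trace(string message)
    {
        Log.Add(message);
        Console.WriteLine(message);
        return 42;
    }

    int a = Trace("field initializer");

    // Chains to this(...): no field initializers are emitted here.
    public Foo() : this(60)
    {
        Trace("body of Foo()");
    }

    // Chains (implicitly) to base(): the field initializers are emitted here.
    public Foo(int v)
    {
        Trace("body of Foo(int)");
    }
}

class Program
{
    static void Main()
    {
        new Foo();
        // field initializer
        // body of Foo(int)
        // body of Foo()
    }
}
```

The initializer runs exactly once even though two constructors execute, which matches the idea that only the base-chaining constructor carries the field initialization code.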
Your example
class Foo
{
    int a = 42;
    Foo() :this(60) { }
    Foo(int v) { }
}
will work fine, in the sense that you can instantiate that Foo object without problems. However, the following would be more like the code that you're asking about
class Foo
{
    int a = 42;
    Foo() :this(60) { }
    Foo(int v) : this() { }
}
Both that and your code will create a stackoverflow (!), because the recursion never bottoms out. So your code is ignored because it never gets to execute.
In other words, the compiler can't decide where to put the faulty code because it can tell that the recursion never bottoms out. I think this is because it has to put it where it will only be called once, but the recursive nature of the constructors makes that impossible.
Recursion in the sense of a constructor creating instances of itself within the body of the constructor makes sense to me, because e.g. that could be used to instantiate trees where each node points to other nodes. But recursion via the pre-constructors of the sort illustrated by this question can't ever bottom out, so it would make sense for me if that was disallowed.
I think this is allowed because you can (or could) still catch the exception and do something meaningful with it.
The initialization will never run, and it will almost certainly throw a StackOverflowException. But this can still be wanted behaviour, and it didn't always mean the process had to crash.
As explained here https://stackoverflow.com/a/1599236/869482
If I have a class like:
public class A
{
    public A(string name)
    {
        Console.WriteLine("Mon");
    }
}
public class B
{
    private A m_a = new A("Tues");
    public B()
    {
        m_a = new A("Wed");
    }
}
I'm not on a Windows machine, so I can't test the output.
What would it be and, more importantly, why is it that way?
That is, why would the private variable be instantiated before the constructor runs, or vice versa? Or would one be ignored or simply overwritten?
Would Java behave the same way?
In both C# and Java all the initialization that is outside a constructor comes before any calls to the constructor. The assignment in the constructor would overwrite the other assignment.
For C# at least you can see the details of the language specification in section 10.11. That should answer any of the finer details of ordering, especially where inheritance is concerned.
I don't know how it works in Java (probably the same), but in C# member variables are instantiated before the constructor runs. As to why, I've never thought about it, and I don't know the best answer, but a pragmatic answer would be so that the member variables are available, already instantiated, in the constructor.
Variable initializers (within the class) are called before the constructor for that class. So private A m_a = new A("Tues") would be called before m_a = new A("Wed"), because the constructor might need to use the values of the private variables. I would assume Java does it the same way, but I can't test it right now.
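Here is a runnable variation of the question's classes for anyone who wants to check this. It departs from the original in two labeled ways: A prints the name it receives instead of the fixed string "Mon", so the two instances can be told apart, and the static Created list is a hypothetical addition that records the order.

```csharp
using System;
using System.Collections.Generic;

public class A
{
    // Hypothetical helper: records the order in which instances were created.
    public static readonly List<string> Created = new List<string>();

    public A(string name)
    {
        Created.Add(name);
        Console.WriteLine(name); // the original printed "Mon" regardless of name
    }
}

public class B
{
    private A m_a = new A("Tues"); // field initializer: runs before the constructor body

    public B()
    {
        m_a = new A("Wed"); // constructor body: replaces the first instance
    }
}

class Program
{
    static void Main()
    {
        new B();
        // Tues
        // Wed
    }
}
```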
As per the C# specification 17.4.5 Variable initializers:
For instance fields, variable
initializers correspond to assignment
statements that are executed when an
instance of the class is created.
All fields also have default value initializers that run regardless of whether the field has a variable initializer:
The default value initialization
described in 1.4.3 occurs for all
fields, including fields that have
variable initializers. Thus, when a
class is initialized, all static
fields in that class are first
initialized to their default values,
and then the static field initializers
are executed in textual order.
Likewise, when an instance of a class
is created, all instance fields in
that instance are first initialized to
their default values, and then the
instance field initializers are
executed in textual order.
So basically, a field initializer and an assignment in the constructor are both part of the object's instance initialization: the compiler emits the field initializers first, in textual order, followed by the constructor body.