The following two categories of variables are initially assigned:
Instance variables of class instances.
Instance variables of initially assigned struct variables.
Now what does initially assigned struct variables mean?
We're not talking about local variables, right? So we're talking about field variables (in both of these categories) that are used in function member definitions?
Clarifying this would be really appreciated. And thanks in advance!
The following two categories of variables are initially assigned: (1) Instance variables of class instances and (2) Instance variables of initially assigned struct variables. What does "initially assigned struct variables" mean?
It means an initially assigned variable of struct type.
Follow along.
class C
{
public int i;
}
...
C c = new C();
Console.WriteLine(c.i);
c.i is an instance variable of a class, so it is initially assigned.
struct S
{
public int j;
}
class D
{
public S t;
}
...
D d = new D();
Console.WriteLine(d.t.j);
d.t is an instance variable of a class, so it is initially assigned. d.t.j is an instance variable of a struct S, and the variable d.t of type S is initially assigned, therefore d.t.j is also initially assigned.
That is, a field of a struct is initially assigned if the variable that holds the value of the struct is itself initially assigned.
By contrast:
void M()
{
int q;
Console.WriteLine(q); // Error
S u;
Console.WriteLine(u.j); // Error
}
Neither q nor u is initially assigned; they are local variables, not fields of any class. Since u is not initially assigned, u.j is not either.
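As a rough sketch of the rule above, the following compares a struct field reached through a class field, a static field, and an array element (all initially assigned) with a local that has to be assigned by hand. The class InitialAssignmentDemo and its method names are made up for illustration; S and D mirror the declarations above.

```csharp
using System;

struct S
{
    public int j;
}

class D
{
    public S t;          // instance variable of a class: initially assigned
}

// Hypothetical demo class, not part of the original answer.
static class InitialAssignmentDemo
{
    static S staticS;    // static variable: initially assigned

    public static int FromClassField()
    {
        D d = new D();
        return d.t.j;    // d.t is initially assigned, so d.t.j is too
    }

    public static int FromStaticField()
    {
        return staticS.j;
    }

    public static int FromArrayElement()
    {
        S[] items = new S[1];
        return items[0].j;   // array elements are initially assigned
    }

    public static int FromLocal()
    {
        S u;
        u.j = 5;         // a local struct must have its field assigned first
        return u.j;      // legal only because u.j was assigned above
    }

    static void Main()
    {
        Console.WriteLine(FromClassField());    // 0
        Console.WriteLine(FromStaticField());   // 0
        Console.WriteLine(FromArrayElement());  // 0
        Console.WriteLine(FromLocal());         // 5
    }
}
```

Removing the line u.j = 5; should bring back the compile-time error shown in M() above.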
Make sense now?
Your question is not very clear. But it is correct that a field (instance or static) of a class or struct is always considered assigned. It will have the default value of the type, which is null for reference types and nullable types, and "zero" or something similar to zero for other value types.
In contrast a local variable, that is a variable declared inside a method (or constructor, or accessor, etc.) must be explicitly assigned to before it is used.
In the example:
class Example
{
int field;
void Method()
{
int local;
...
...
}
}
the field is considered assigned automatically and will have the initial value 0, whereas the variable local is unassigned and must be assigned to (later in the same method) before it can be used (even later in the same method).
An unassigned local variable may be passed as an out parameter of a method, though.
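A minimal sketch of the out case, assuming a hypothetical helper named Fill: the local is unassigned when it is passed, and the compiler treats it as definitely assigned once the call returns.

```csharp
using System;

// Hypothetical demo class; Fill is an illustrative helper, not a library method.
static class OutParameterDemo
{
    // An out parameter must be assigned before the method returns.
    static void Fill(out int value)
    {
        value = 42;
    }

    public static int Run()
    {
        int local;          // unassigned at this point
        Fill(out local);    // legal: passing as out does not read the variable
        return local;       // definitely assigned after the call
    }

    static void Main()
    {
        Console.WriteLine(Run());   // 42
    }
}
```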
EDIT: (after helpful comments)
My answer above gives a pretty precise description for (static and non-static) fields of classes and for static fields of structs. But, as the comments pointed out, something was missing in the case of instance fields of structs.
A struct instance is fully assigned when all its instance fields are fully assigned. Given the following (mutable!!) struct:
struct SomeStruct
{
public int AlphaField;
public int BetaField;
}
then the following is legal:
void M()
{
SomeStruct localSS;
// localSS and its fields are not assigned, and can't be read yet
localSS.AlphaField = 7; // legal
int useA = localSS.AlphaField; // legal, AlphaField is assigned
// localSS and its remaining field BetaField are not assigned
localSS.BetaField = 13;
string useB = localSS.ToString(); // legal, localSS variable is now fully assigned
}
Even if the above example seems crazy (because mutable structs are discouraged by most people), it is still entirely equivalent to what happens inside a user-defined instance constructor of a struct. The C# Specification uses this sentence: The this variable of an instance constructor of a struct behaves exactly the same as an out parameter of the struct type—in particular, this means that the variable must be definitely assigned in every execution path of the instance constructor.
Note that one way for an instance constructor of a struct to assign all fields, is to chain another instance constructor with the : this(...) constructor chaining syntax.
Also note that, before C# 10, instance constructors of structs must take parameters. Unless the struct declares its own parameterless constructor (possible from C# 10 onward), the expression new SomeStruct() (with empty parameter list) is equivalent to default(SomeStruct) and evaluates to the definitely assigned instance of SomeStruct where all fields have their default values.
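To illustrate the two constructor notes, here is a small sketch that reuses SomeStruct from above and adds two constructors that are assumptions for the example: one assigns every field itself, the other relies on : this(...) chaining to do so.

```csharp
using System;

struct SomeStruct
{
    public int AlphaField;
    public int BetaField;

    // Assigns every field, so 'this' is definitely assigned on exit.
    public SomeStruct(int alpha, int beta)
    {
        AlphaField = alpha;
        BetaField = beta;
    }

    // Chains to the constructor above, which assigns all fields.
    public SomeStruct(int both) : this(both, both)
    {
    }
}

// Hypothetical demo class for illustration.
static class StructConstructorDemo
{
    static void Main()
    {
        SomeStruct a = new SomeStruct(1, 2);
        SomeStruct b = new SomeStruct(7);
        SomeStruct c = default(SomeStruct);

        Console.WriteLine(a.AlphaField + " " + a.BetaField);   // 1 2
        Console.WriteLine(b.AlphaField + " " + b.BetaField);   // 7 7
        Console.WriteLine(c.AlphaField + " " + c.BetaField);   // 0 0
    }
}
```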
A class (in C#) is a definition of a reference type. This means that any members of that class are stored as references.
A struct (in C#) is a definition of a value type. Any members of a struct are stored as values, not as references.
Reference types are able to be unassigned, because they don't need to be assigned until they're used. Using an unassigned class instance member would result in a runtime error.
Value types need a value to store, because they would not be able to determine their memory space otherwise. Using an unassigned struct member would result in a compile time error.
I found it difficult to come up with a descriptive enough title for this scenario so I'll let the code do most of the talking.
Consider covariance where you can substitute a derived type for a base class.
class Base
{
}
class Derived : Base
{
}
Passing in typeof(Base) to this method and setting that variable to the derived type is possible.
private void TryChangeType(Base instance)
{
var d = new Derived();
instance = d;
Console.WriteLine(instance.GetType().ToString());
}
However, when checking the type from the caller of the above function, the instance will still be of type Base
private void CallChangeType()
{
var b = new Base();
TryChangeType(b);
Console.WriteLine(b.GetType().ToString());
}
I would assume since objects are inherently reference by nature that the caller variable would now be of type Derived. The only way to get the caller to be type Derived is to pass a reference object by ref like so
private void CallChangeTypeByReference()
{
var b = new Base();
TryChangeTypeByReference(ref b);
Console.WriteLine(b.GetType().ToString());
}
private void TryChangeTypeByReference(ref Base instance)
{
var d = new Derived();
instance = d;
}
Furthermore, I feel like it's common knowledge that passing in an object to a method, editing props, and passing that object down the stack will keep the changes made down the stack. This makes sense as the object is a reference object.
What causes an object to permanently change type down the stack, only if it's passed in by reference?
You have a great many confused and false beliefs. Let's fix that.
Consider covariance where you can substitute a derived type for a base class.
That is not covariance. That is assignment compatibility. An Apple is assignment compatible with a variable of type Fruit because you can assign an Apple to such a variable. Again, that is not covariance. Covariance is the fact that a transformation on a type preserves the assignment compatibility relationship. A sequence of apples can be used somewhere that a sequence of fruit is needed because apples are a kind of fruit. That is covariance. The mapping "apple --> sequence of apples, fruit --> sequence of fruit" is a covariant mapping.
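The distinction can be sketched with the fruit analogy. The Fruit and Apple classes and the CovarianceDemo class below are illustrative names; IEnumerable<T> is covariant in T, whereas List<T> is not.

```csharp
using System;
using System.Collections.Generic;

class Fruit { }
class Apple : Fruit { }

// Hypothetical demo class for illustration.
static class CovarianceDemo
{
    public static int CountFruit(IEnumerable<Fruit> fruits)
    {
        int count = 0;
        foreach (Fruit f in fruits)
        {
            count++;
        }
        return count;
    }

    public static int Run()
    {
        // Assignment compatibility: an Apple fits in a Fruit variable.
        Fruit single = new Apple();

        // Covariance: a sequence of Apple is usable as a sequence of Fruit.
        List<Apple> apples = new List<Apple> { new Apple(), new Apple() };
        IEnumerable<Fruit> fruits = apples;

        // List<Fruit> list = apples;   // would not compile: List<T> is invariant

        return CountFruit(fruits) + (single is Apple ? 1 : 0);
    }

    static void Main()
    {
        Console.WriteLine(Run());   // 3
    }
}
```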
Moving on.
Passing in typeof(Base) to this method and setting that variable to the derived type is possible.
You are confusing types with instances. You do not pass typeof(Base) to this method; you pass a reference to an instance of Base to this method. typeof(Base) is of type System.Type.
As you correctly note, formal parameters are variables. A formal parameter is a new variable, and it is initialized to the actual parameter aka argument.
However, when checking the type from the caller of the above function, the instance will still be of type Base
Correct. The argument is of type Base. You copy that to a variable, and then you reassign the variable. This is no different than saying:
Base x = new Base();
Base y = x;
y = new Derived();
And now x is still Base and y is Derived. You assigned the same variable twice; the second assignment wins. This is no different than if you said a = 1; b = a; b = 2; -- you would not expect a to be 2 afterwards just because you said b = a in the past.
I would assume since objects are inherently reference by nature that the caller variable would now be of type Derived.
That assumption is wrong. Again, you have made two assignments to the same variable, and you have two variables, one in the caller, and one in the callee. Variables contain values; references to objects are values.
The only way to get the caller to be type Derived is to pass a reference object by ref like so
Now we're getting to the crux of the problem.
The correct way to think about this is that ref makes an alias to a variable. A normal formal parameter is a new variable. A ref formal parameter makes the variable in the formal parameter an alias to the variable at the call site. So now you have one variable but it has two names, because the name of the formal parameter is an alias for the variable at the call. This is the same as:
Base x = new Base();
ref Base y = ref x; // x and y are now two names for the same variable
y = new Derived(); // this assigns both x and y because there is only one variable, with two names
Furthermore, I feel like it's common knowledge that passing in an object to a method, editing props, and passing that object down the stack will keep the changes made down the stack. This makes sense as the object is a reference object.
Correct.
The mistake you are making here is very common. It was a bad idea for the C# design team to name the variable aliasing feature "ref" because this causes confusion. A reference to a variable makes an alias; it gives another name to a variable. A reference to an object is a token that represents a specific object with a specific identity. When you mix the two it gets confusing.
The normal thing to do is to not pass variables by ref particularly if they contain references.
What causes an object to permanently change type down the stack, only if it's passed in by reference?
Now we have the most fundamental confusion. You have confused objects with variables. An object never changes its type, ever! An apple is an object, and an apple is now and forever an apple. An apple never becomes any other kind of fruit.
Stop thinking that variables are objects, right now. Your life will get so much better. Internalize these rules:
variables are storage locations that store values
references to objects are values
objects have a type that never changes
ref gives a new name to an existing variable
assigning to a variable changes its value
Now if we ask your question again using correct terminology, the confusion disappears immediately:
What causes the value of a variable to change its type down the stack, only if it's passed in by ref?
The answer is now very clear:
A variable passed by ref is an alias to another variable, so changing the value of the parameter is the same as changing the value of the variable at the call site
Assigning an object reference to a variable changes the value of that variable
An object has a particular type
If we don't pass by ref but instead pass normally:
A value passed normally is copied to a new variable, the formal parameter
We now have two variables with no connection; changing one of them does not change the other.
If that's still not clear, start drawing boxes, circles and arrows on a whiteboard, where objects are circles, variables are boxes, and object references are arrows from variables to objects. Making an alias via ref gives a new name to an existing box; calling without ref makes a second box and copies the arrow. It'll all make sense then.
This is not an issue with inheritance and polymorphism; what you're seeing is the difference between pass-by-value and pass-by-reference.
private void TryChangeType(Base instance)
The preceding method's instance parameter will be a copy of the caller's Base reference. You can change the object that is referenced and those changes will be visible to the caller because the caller and the callee both reference the same object. But any changes to the reference itself (such as pointing it to a new object) will not affect the caller's reference. This is why reassignment only reaches the caller when you pass by reference.
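A small sketch of the three cases side by side, using a hypothetical Box class so the effect is visible in a field value rather than a type name:

```csharp
using System;

// Hypothetical class used only for this illustration.
class Box
{
    public int Value;
}

static class PassingDemo
{
    // Changes the object the caller also references.
    static void Mutate(Box box) { box.Value = 1; }

    // Reassigns only the local copy of the reference.
    static void Reassign(Box box) { box = new Box { Value = 2 }; }

    // Reassigns the caller's variable, because 'box' aliases it.
    static void ReassignByRef(ref Box box) { box = new Box { Value = 3 }; }

    public static int[] Run()
    {
        Box b = new Box();

        Mutate(b);
        int afterMutate = b.Value;       // 1: same object, changed in place

        Reassign(b);
        int afterReassign = b.Value;     // still 1: caller's reference untouched

        ReassignByRef(ref b);
        int afterRef = b.Value;          // 3: caller now refers to the new object

        return new[] { afterMutate, afterReassign, afterRef };
    }

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));   // 1, 1, 3
    }
}
```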
When you call TryChangeType() you are passing a copy of the reference held in "b" into "instance". Any changes to members of "instance" are made in the same memory space still referenced by "b" in your calling method. However, the command "instance = d" reassigns the value stored in "instance". "b" and "instance" no longer point to the same memory. When you return to CallChangeType, "b" still references the original space and hence the original Type.
TryChangeTypeByReference passes a reference to where "b"'s pointer value is actually stored. Reassigning "instance" now changes the address that "b" is actually pointing to.
We know that classes are reference types, so in general when we pass an object we are passing a reference. But there is a difference between passing just b and passing ref b, which can be understood as follows:
In the first case, the reference is passed by value, which means a separate pointer to the same memory location is created internally. When the Derived object is assigned to that parameter, the new pointer starts pointing to another object in memory. When the method returns, only the original pointer remains, and it still refers to the same Base instance; the newly created Derived object becomes eligible for garbage collection.
However, when the object is passed with ref, a reference to the reference is passed, which is like a pointer to a pointer (a double pointer in C or C++). Assigning through it changes the caller's original variable, and thus you see the difference.
For the first case to show the same result, the value has to be returned from the method and assigned to the caller's variable, so that it starts pointing to the new Derived object.
Following is the modification to your program to get expected result in case 1:
private Base TryChangeType(Base instance)
{
var d = new Derived();
instance = d;
Console.WriteLine(instance.GetType().ToString());
return instance;
}
private void CallChangeType()
{
var b = new Base();
b = TryChangeType(b);
Console.WriteLine(b.GetType().ToString());
}
When you do not pass by reference, a copy of the reference to the Base object is passed into the function, and only this copy is reassigned inside the TryChangeType function. When you print the type of the caller's instance it is still of type "Base", because only the copy of the reference was pointed at a "Derived" object.
When you pass by reference, the caller's variable itself is passed to the function. So any reassignment made to it inside the function is visible to the caller.
What does happen behind the scenes when you make a struct without using the new keyword?
Let's say we have this struct:
struct Person
{
public int Age;
public string Name;
}
In the Main() method I decide to make an instance of it without the new keyword, like this:
Person p;
Now if I try to access p.Age I will get a compile-time error saying "Use of possibly unassigned field 'Age'". However, if I make an instance of the struct like this:
Person p = new Person();
and then I try to access p.Age I will get the value of 0. Now what exactly happens behind the scenes? Does the Runtime initialize these variables for me or the compiler places code that initializes them in the IL after compilation?
Edit:
Can anybody also explain this behavior:
Code:
struct Person
{
public string Name { get; set; }
}
If I make instance of struct like that:
Person p;
and I initialize the name manually
p.Name = "SomeRandomName";
I won't be able to use it. The compiler gives an error "Use of an unassigned local variable p" but If I make instance of the struct with the default (parameterless) constructor there isn't such an error.
Members don't have the same rules as locals.
Locals must be explicitly initialised before use. Members are initialised by the runtime to their respective default values.
If you want some more relevant information:
In the internal implementation details (non-contractual!), up to the current MS .NET runtime for Windows, objects are allocated in pre-zeroed memory on the heap (when they're on the heap at all, of course). All the default values are "physical" zeroes, so all you need is e.g. "200 consecutive bytes with value 0". In many cases, this is as simple as asking the OS for a pre-zeroed memory page. It's a performance compromise to keep memory safety - you can easily allocate an array of 2000 Person instances by just doing new Person[2000], which just requests 2000 * size of Person bytes with value zero; extremely cheap, while still keeping safe default values. No need to initialise 2000 Person instances, and 2000 int instances and 2000 string instances - they're all zero by default. At the same time, there's no chance you'd get a random value for the string reference that would point to some random place in memory (a very common error in unmanaged code).
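The observable side of this can be checked with a short sketch. It only demonstrates the language-level guarantee that array elements start at their default values; it says nothing about how the runtime obtains the zeroed memory. ZeroedArrayDemo is a made-up name, and Person matches the struct from the question.

```csharp
using System;
using System.Linq;

struct Person
{
    public int Age;
    public string Name;
}

// Hypothetical demo class for illustration.
static class ZeroedArrayDemo
{
    public static bool AllDefault(int count)
    {
        // No per-element initialisation code is written here.
        Person[] people = new Person[count];

        // Every element has Age == 0 and Name == null.
        return people.All(p => p.Age == 0 && p.Name == null);
    }

    static void Main()
    {
        Console.WriteLine(AllDefault(2000));   // True
    }
}
```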
The main reason for requiring explicit initialisation of locals is that it prevents stupid programming errors. You should never access an uninitialised value in the first place, and if you need a default value, you should be explicit about it - the default value then gets a meaning, and meanings should be explicit. You'll find that cases where you could use an uninitialised local meaningfully in the first place are pretty rare - you usually either declare the local right where it gets a value, or you need all possible branches to update a pre-declared local anyway. Both make it easier to understand code and avoid silly mistakes.
If you go through the short struct documentation, you will find this:
A struct type is a value type that is typically used to encapsulate small groups of related variables, such as the coordinates of a rectangle or the characteristics of an item in an inventory.
Normally, when you declare fields of these value types in a class, like:
int i; // By default it's equal to 0
bool b; // by default it's equal to false.
Or a reference type as:
string s; //By default it's null
The struct you have created is a value type, but a local variable of that type isn't initialized by default, and you can't read its fields or properties until it is. Therefore, you can't declare it as:
Person p;
Then use it directly.
Hence the error you got:
"Use of possibly unassigned field 'Age'"
Because p is still not initialized.
This also explains your second part of the question:
I won't be able to use it. The compiler gives an error "Use of an unassigned local variable p" but If I make instance of the struct with the default (parameterless) constructor there isn't such an error.
You couldn't directly assign p.Name = "something" for the same reason: Name is a property, so the assignment calls its setter on p, and p is still not initialized.
You must create a new instance of the struct as
Person p = new Person(); //or Person p = default(Person);
Now, what happens when you create a new instance of your struct without giving values to the struct's properties? Each one of them will hold its default value, such as Age = 0 because it's an int type.
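A minimal sketch of the working pattern, assuming a Person struct that exposes both Age and Name as auto-properties (the question's second struct only had Name). PropertyAssignmentDemo is a made-up name.

```csharp
using System;

struct Person
{
    public int Age { get; set; }
    public string Name { get; set; }
}

// Hypothetical demo class for illustration.
static class PropertyAssignmentDemo
{
    public static string Run()
    {
        Person p = new Person();      // p is definitely assigned from here on
        p.Name = "SomeRandomName";    // the property setter can now be called
        return p.Name + " " + p.Age;  // Age was never set, so it is 0
    }

    static void Main()
    {
        Console.WriteLine(Run());     // SomeRandomName 0
    }
}
```

Declaring Person p; without an initializer and then setting p.Name should reproduce the "Use of unassigned local variable" error from the question.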
Every datatype in .NET has a default value. For all reference types it's null. For the special type string it is also null. For all value types it is something akin to zero. For bool it's false because this is the equivalent to zero.
You can observe the same behaviour when you write a class with member fields. After construction all these fields will have a default value even when none was assigned during construction.
The same is also true when you use a struct as a member. Since a struct cannot be null, it will also be initialized and all its members (again) use their default values.
The difference in compiler output is that the compiler cannot determine if you have initialized the member field through any means. But it can determine if you have set the method variable value before reading it. Technically this wouldn't be necessary but since it reduces programming errors (why would you read a variable you have not written?), the compiler error appears.
While experimenting with static variables, I was surprised to find that a static int evaluates to 0 (zero) while a non-static one results in a compile-time error.
Consider Case 1
static int i;
static void Main()
{
Console.Write("Value of i = " + i);
Console.ReadKey();
}
the output is
Value of i = 0
Case 2, with static removed
static void Main()
{
int i;
Console.Write("Value of i = " + i);
Console.ReadKey();
}
And this results in a compile-time error
Error 1 Use of unassigned local variable 'i'
The question is: how do the two cases differ, i.e. why does the first one result in 0 while the other gets a compiler error?
The existing answers all miss something important here, which is where the variable is declared: is it a class variable or a local variable?
In the first scenario
class Program
{
static int i;
static void Main()
{
Console.Write("Value of i = " + i);
Console.ReadKey();
}
}
The variable i is declared as a class variable. Class variables always get initialized; it doesn't matter whether they are static or not. If you don't provide an initial value, the variable is assigned its default, which in the case of int is 0.
On the other hand, in the second example
class Program
{
static void Main()
{
int i;
Console.Write("Value of i = " + i);
Console.ReadKey();
}
}
Variable i is local variable. Unlike class variables, local variables are never initialized with a default value implicitly, but only when you explicitly initialize them. So the compiler error comes not from the variable being static or not, but from the difference in initialization between local and class variables.
The specification shares some more details on that: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/variables, especially sections 9.2 Variable Types and 9.3 Default Values.
The interesting parts are
9.2.2 The initial value of a static variable is the default value (§9.3) of the variable’s type.
9.2.3.2 The initial value of an instance variable of a class is the default value (§9.3) of the variable’s type.
9.2.8 A local variable introduced by a local_variable_declaration is not automatically initialized and thus has no default value. Such a local variable is considered initially unassigned.
9.3: The following categories of variables are automatically initialized to their default values:
Static variables.
Instance variables of class instances.
Array elements.
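The three categories listed above can be observed in one short sketch. Sample and DefaultCategoriesDemo are hypothetical names chosen for the example.

```csharp
using System;

// Hypothetical class used only for this illustration.
class Sample
{
    public static int StaticCount;   // static variable
    public bool InstanceFlag;        // instance variable of a class instance
    public string InstanceText;      // instance variable of a class instance
}

static class DefaultCategoriesDemo
{
    public static string Describe()
    {
        Sample s = new Sample();
        char[] letters = new char[2];   // array elements

        // None of these were assigned explicitly; all hold default values.
        return Sample.StaticCount + "," +
               s.InstanceFlag + "," +
               (s.InstanceText == null) + "," +
               (letters[0] == '\0');
    }

    static void Main()
    {
        Console.WriteLine(Describe());   // 0,False,True,True
    }
}
```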
The underlying reason for this has to do with memory management. When you initialize a class with new() the garbage collector zeros out all bytes on the heap, thus basically defaulting the value of the variable. In case of integers this is 0, for an object reference it would be a null.
Since local variables live on the stack, and the garbage collector doesn't work on the stack, this guarantee does not exist, so we have to explicitly initialize a variable before using it.
By definition of the C# language, types have "default values", which are assigned to them if you don't assign something else. Numbers have a default value of 0, booleans false, reference types null, and structs have each member set to the default of its type.
Based on my limited understanding, by declaring a variable as static, it becomes existing in memory and is assigned a default value of 0, and does not depend on an instance of the class it is in to exist.
If it is not defined as static, it needs to be initialized within the class(as in, given a value) before it can be used in any logic/math.
Now my confusion comes from being very new, and trying to understand exactly WHY you would choose to do something like this one way, instead of another. Perhaps this is a way to have some values that persist and may be necessary even when no instance of the class exists, while making everything static would result in inefficient use of memory.
The fundamental reason why you do not get a compilation error for the static scenario is because the compiler has no way to be sure that it is not initialized before being read.
Indeed, a static class member without visibility modifier is internal by default. This means that another class of the same assembly could define it before the Main method is called.
And even if it were private, an external code could still define its value using classes in the System.Reflection namespace.
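As a sketch of the reflection point, the following writes a private static field from outside its class. Holder and ReflectionWriteDemo are hypothetical names; Type.GetField, BindingFlags and FieldInfo.SetValue are the standard System.Reflection members.

```csharp
using System;
using System.Reflection;

// Hypothetical class whose private static field is never assigned in source.
class Holder
{
    private static int i;

    public static int Read()
    {
        return i;
    }
}

static class ReflectionWriteDemo
{
    public static int Run()
    {
        FieldInfo field = typeof(Holder).GetField(
            "i", BindingFlags.NonPublic | BindingFlags.Static);

        // For a static field the target object argument is null.
        field.SetValue(null, 5);

        return Holder.Read();
    }

    static void Main()
    {
        Console.WriteLine(Holder.Read());   // 0
        Console.WriteLine(Run());           // 5
    }
}
```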
Whereas, for the local variable case, the compiler is sure that no other code could define the variable between its declaration and its first reading.
Indeed, local variables are not accessible outside their declaration scope to non debugging code.
For C# in VS2005, what will be value of variables of the following types if they are simply declared and not assigned to any value? ie. What are their default values?
int
bool
string
char
enum
Here's what default value for each type you've mentioned would be.
int = 0
bool = false
string = null
char = '\0'
enum = 0 //behind the scenes enum is int
Taking this forward, if you wish to obtain the default value of any type at runtime, you can use the default operator in C# as follows.
//This will print 0 on screen.
Console.WriteLine(default(int));
Generally, this is used in generics for identifying default values of generic type arguments, where the type is only known at runtime.
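A rough sketch of that generic use, with a hypothetical helper named ElementOrDefault that falls back to default(T) when an index is out of range:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; ElementOrDefault is an illustrative helper.
static class GenericDefaultDemo
{
    // Returns the element at 'index', or default(T) when the index is out of range.
    public static T ElementOrDefault<T>(IList<T> items, int index)
    {
        if (index >= 0 && index < items.Count)
        {
            return items[index];
        }
        return default(T);
    }

    static void Main()
    {
        Console.WriteLine(ElementOrDefault(new List<int> { 10, 20 }, 1));          // 20
        Console.WriteLine(ElementOrDefault(new List<int> { 10, 20 }, 5));          // 0
        Console.WriteLine(ElementOrDefault(new List<string> { "a" }, 5) == null);  // True
    }
}
```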
If they're used locally (i.e. they're not members of a class or struct) they won't have default values. You won't be able to use them until they're assigned a value (unless you explicitly "new them up").
If they're not used locally, they'll default to 0, false, null, and '\0'.
Edit: You've added enum to your list. Enums default to a value of 0 because they use an int by default behind the scenes. So whichever enumerator is declared as 0 for the enum (typically the first one, but that's overridable) will be the default. If you don't have an enumerator with the value 0 for whatever reason, then you'll have an invalid value for your enum as the default value.
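The last case can be sketched with a hypothetical Color enum that has no member equal to 0; Enum.IsDefined reports that the default value does not correspond to any named member.

```csharp
using System;

// Hypothetical enum with no zero-valued member.
enum Color
{
    Red = 1,
    Green = 2
}

static class EnumDefaultDemo
{
    public static bool DefaultIsNamed()
    {
        Color c = default(Color);                  // underlying value 0
        return Enum.IsDefined(typeof(Color), c);   // no member has the value 0
    }

    static void Main()
    {
        Console.WriteLine((int)default(Color));    // 0
        Console.WriteLine(DefaultIsNamed());       // False
    }
}
```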
Please explain. I am talking about declaring the variables inside the scope of a class or method. – Craig Johnston
I told a little white lie to point out how the compiler works. If you declare a local variable, the compiler will not compile if you try to use it without having first explicitly assigned a value to that variable; that includes null for reference types.
For example:
public void Foo()
{
int bar;
int barPlus5 = bar + 5; // Compiler Error!
}
Technically, bar still has a default value of 0, but because the C# compiler will not allow you to use that default value in a locally scoped variable, a locally scoped variable effectively doesn't have a default value. <Ben Kenobi>So what I told you was true, from a certain point of view.</Ben Kenobi>.
There are exceptions to the rule: out parameters get a pass because the compiler enforces that an out parameter must be assigned by a method before it returns, and you can do int bar = new int(); to get the default value, since that's technically an assignment.
Now, if you declare a variable as a member of a class or a struct such as follows:
public class Foo
{
public int Bar {get;set;}
public Foo() { }
}
And then you instantiate Foo somewhere, Bar will have a default value of 0.
For example:
var foo = new Foo();
Console.WriteLine(foo.Bar); // output: 0
MSDN article - Value Types.
All value types implicitly declare a public parameterless instance constructor called the "default constructor". The default constructor returns a zero-initialized instance known as the default value for the value type.
In case of local variables (Value or Reference Types) in C#, they must be initialized before they are used.
MSDN Article - Types
A type that is defined as a class, delegate, array, or interface is a reference type. At run time, when you declare a variable of a reference type, the variable contains the value null until you explicitly create an instance of the object by using the new operator, or assign it an object that has been created elsewhere by using new.
Is it possible to hide the parameterless constructor from a user in C#?
I want to force them to always use the constructor with parameters
e.g. this Position struct
public struct Position
{
private readonly int _xposn;
private readonly int _yposn;
public int Xposn
{
get { return _xposn; }
}
public int Yposn
{
get { return _yposn; }
}
public Position(int xposn, int yposn)
{
_xposn = xposn;
_yposn = yposn;
}
}
I only want users to be able to new up a Position by specifying the x and y coordinates.
However, the parameterless constructor is ALWAYS available.
I cannot make it private. Or even define it as public.
I have read this:
Why can't I define a default constructor for a struct in .NET?
but it doesn't really help.
If this is not possible - what is the best way to detect if the Position I am being passed has values?
Explicitly checking each property field? Is there a slicker way?
No, you can't do this. As you said, similar question has been asked before - and I thought the answer was fairly clear that you couldn't do it.
You can create a private parameterless constructor for a struct, but not in C#. However, even if you do that it doesn't really help - because you can easily work around it:
MyStruct[] tmp = new MyStruct[1];
MyStruct gotcha = tmp[0];
That will be the default value of MyStruct - the "all zeroes" value - without ever calling a constructor.
You could easily add a Validate method to your struct and call that each time you received one as a parameter, admittedly.
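One possible shape for such a check, as a sketch only: it adds a private flag that the parameterized constructor sets, so the all-zero default value can be told apart from a real (0, 0). The _isSet field, the IsSet property and the body of Validate are assumptions for illustration, not part of the original Position struct.

```csharp
using System;

public struct Position
{
    private readonly int _xposn;
    private readonly int _yposn;
    private readonly bool _isSet;   // hypothetical flag: false in the default value

    public Position(int xposn, int yposn)
    {
        _xposn = xposn;
        _yposn = yposn;
        _isSet = true;
    }

    public int Xposn { get { return _xposn; } }
    public int Yposn { get { return _yposn; } }
    public bool IsSet { get { return _isSet; } }

    // Hypothetical Validate method: throws if the constructor was bypassed.
    public void Validate()
    {
        if (!_isSet)
        {
            throw new InvalidOperationException("Position was not created with (x, y).");
        }
    }
}

// Hypothetical demo class for illustration.
static class PositionDemo
{
    static void Main()
    {
        Position[] tmp = new Position[1];

        Console.WriteLine(new Position(0, 0).IsSet);   // True
        Console.WriteLine(default(Position).IsSet);    // False
        Console.WriteLine(tmp[0].IsSet);               // False
    }
}
```

The flag costs an extra field per value, which is the trade-off for distinguishing an explicit (0, 0) from a value that never went through the constructor.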
Nope can't hide it. Structs cannot redefine zero arg constructor, so therefore its visibility can't be redefined.
Well a struct is literally a declaration of how the memory will sit.
Even when you use an object, the objects pointer IS declared, regardless of whether it's null.
The reason that you can't hide the constructor is because the struct demands that the CLR can create it as internally it must do this.
You could convert this struct into an object to achieve this task. Or use static analysis to ensure it's initialized before you use it?
struct point
{
int xpos;
int ypos;
}
Have a google for immutable objects; this appears to be what you're after. I believe that they are looking to add this feature (but not in C# 4) to the language itself, because it is a common requirement. Is there a specific need for a struct here?
You cannot make a struct with a private parameterless constructor or even declare a parameterless constructor. You would have to change it to a class. Structs are not allowed to declare a parameterless constructor.
From the Structs Tutorial on MSDN:
Structs can declare constructors, but they must take parameters. It is an error to declare a default (parameterless) constructor for a struct. Struct members cannot have initializers. A default constructor is always provided to initialize the struct members to their default values.
From the C# specification on MSDN:
11.3 Class and struct differences
Structs differ from classes in several important ways:
Structs are value types (Section 11.3.1).
All struct types implicitly inherit from the class System.ValueType (Section 11.3.2).
Assignment to a variable of a struct type creates a copy of the value being assigned (Section 11.3.3).
The default value of a struct is the value produced by setting all value type fields to their default value and all reference type fields to null (Section 11.3.4).
Boxing and unboxing operations are used to convert between a struct type and object (Section 11.3.5).
The meaning of this is different for structs (Section 11.3.6).
Instance field declarations for a struct are not permitted to include variable initializers (Section 11.3.7).
A struct is not permitted to declare a parameterless instance constructor (Section 11.3.8).
A struct is not permitted to declare a destructor (Section 11.3.9).