Noninitialized variable in C#

I have the following piece of code:
class Foo
{
public Foo()
{
Bar bar;
if (null == bar)
{
}
}
}
class Bar { }
Code gurus will already see that this gives an error: bar might not be initialized before the if statement.
What is the value of bar? Shouldn't it be null? Aren't they set to null? (null pointer?)

No, local variables don't have a default value [1]. They have to be definitely assigned before you read them. This reduces the chance of you using a variable you think you've given a sensible value to, when actually it's got some default value. This can't be done for instance or static variables because you don't know in what order methods will be called.
See section 5.3 of the C# 3.0 spec for more details of definite assignment.
Note that this has nothing to do with this being a reference type variable. This will fail to compile in the same way:
int i;
if (i == 0) // Nope, i isn't definitely assigned
{
}
[1] As far as the language is concerned, anyway... clearly the storage location in memory has something in it, but it's irrelevant and implementation-specific. There is one way you can find out what that value is, by creating a method with an out parameter but then using IL to look at the value of that parameter within the method, without having given it another value. The CLR doesn't mind that at all. You can then call that method passing in a not-definitely-assigned variable, and lo and behold you can detect the value - which is likely to be the "all zeroes" value basically.
I suspect that the CLI specification does enforce local variables having a default value - but I'd have to check. Unless you're doing evil things like the above, it shouldn't matter to you in C#.
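To make the definite assignment rule above concrete, here is a minimal sketch (the class and method names are made up for illustration). The local is declared without an initializer, but because every branch assigns it before the read, the compiler accepts the code. Removing any one of the branches brings back the "use of unassigned local variable" error.

```csharp
using System;

public static class DefiniteAssignmentDemo
{
    // Hypothetical helper: returns -1, 0 or 1 depending on the sign of n.
    public static int Sign(int n)
    {
        int sign;               // declared, but not yet assigned
        if (n < 0)
            sign = -1;
        else if (n == 0)
            sign = 0;
        else
            sign = 1;
        // Every path through the if/else chain assigned sign,
        // so reading it here is allowed.
        return sign;
    }

    public static void Main()
    {
        Console.WriteLine(Sign(-5)); // prints -1
    }
}
```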

Fields (variables on classes / structs) are initialized to null/zero/etc. Local variables... well - since (by "definite assignment") you can't access them without assigning there is no sensible way of answering; simply, it isn't defined since it is impossible. I believe they happen to be null/zero/etc (provable by hacking some out code via dynamic IL generation), but that is an implementation detail.
For info, here's some crafty code that shows the value of a formally uninitialised variable:
using System;
using System.Reflection.Emit;
static class Program
{
delegate void Evil<T>(out T value);
static void Main()
{
MakeTheStackFilthy();
Test();
}
static void Test()
{
int i;
DynamicMethod mthd = new DynamicMethod("Evil", null, new Type[] { typeof(int).MakeByRefType()});
mthd.GetILGenerator().Emit(OpCodes.Ret); // just return; no assignments
Evil<int> evil = (Evil<int>)mthd.CreateDelegate(typeof(Evil<int>));
evil(out i);
Console.WriteLine(i);
}
static void MakeTheStackFilthy()
{
DateTime foo = new DateTime();
Bar(ref foo);
Console.WriteLine(foo);
}
static void Bar(ref DateTime foo)
{
foo = foo.AddDays(1);
}
}
The IL just does a "ret" - it never assigns anything.

Local variables do not get assigned a default value. You have to initialize them before you use them. You can explicitly initialize to null though:
public Foo()
{
Bar bar = null;
if (null == bar)
{
}
}

Local variables are not assigned a default value, not even a null.

The value of bar is undefined. There's space allocated for it on the stack, but the space isn't initialised to any value so it contains anything that happened to be there before.
(The local variable might however be optimised to use a register instead of stack space, but it's still undefined.)
The compiler won't let you use the undefined value, it has to be able to determine that the variable is initialised before you can use it.
As a comparison, VB does initialise local variables. While this can be practical sometimes, it can also mean that you unintentionally use a variable before you have given it a meaningful value, and the compiler can't determine whether that is what you intended to do or not.

It doesn't matter because no such code should be compilable by any compiler that implements C#.
If there was a default value, then it would be compilable. But there is none for local variables.

Besides "correctness", local variable initialization is also related to the CLR's verification process.
For more details, see my answer to this similar question: Why must local variables have initial values?

Related

dotNet/C#: Check if a variable was declared

I can't find an answer to my problem. In dotNet/C#, is it possible to check if a variable was declared to some type and if not, declare it?
Thanks
[Edit] In this case, C# is used as a preexecute language in Open Text CMS. C# code can be used in any module. Using a non-declared variable throws hard to debug errors, as does double-declaring a variable. That's why I'd like to check.
[Edit2] Yes, it is most probably compiled somewhere, but the errors are thrown (or rather not thrown) at runtime
[Edit3] Further explanation:
In Open Text, every page can hold several modules, several instances of a module and the same instance of a module several times. In each module, you can use C# as a "pre-execute" language. This is mostly really easy scripting to maneuver around the failings of OpenText. You introduce small variables, set them to true or false, and three lines later write a condition based on the variable. We could (and do) declare a bunch of variables in an initialization block of the page, but since there are so many, it would help to be able to check if a variable was declared and if not, declare it.
I like the idea of changing this to a key/value dictionary but this is a really large site with loads of pages/modules and instances and I'm looking for a working solution without changing the whole thing.
The actual code is really simple most of the time:
var hasHeadline = false; // this will throw an error if hasHeadline was declared before
hasHeadline = true; // if some CMS condition is met. this will throw an error if hasHeadline wasn't declared
if(hasHeadline) { ** CMS code ** }
As I said, this will show up in multiple instances over which I don't have full control. The resulting "error" will be that the whole code block is stripped from the page.

Declare a single variable that is dynamic, e.g. an ExpandoObject.
dynamic Globals = new ExpandoObject();
Use this variable to store all of your global state.
Globals.hasHeadline = false; //No declaration needed, so
Globals.hasHeadline = true; //no chance of a duplicate declaration
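If you also want the "check whether it was declared, and declare it if not" step from the question, one possible sketch relies on the fact that ExpandoObject implements IDictionary<string, object>. The helper name DeclareIfMissing and the demo class are assumptions made for this example, not anything Open Text provides.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

public static class GlobalsDemo
{
    // Hypothetical helper: adds the member only when it does not exist yet.
    // Returns true when the member was newly added.
    public static bool DeclareIfMissing(ExpandoObject globals, string name, object initialValue)
    {
        IDictionary<string, object> members = globals;
        if (members.ContainsKey(name))
            return false;
        members[name] = initialValue;
        return true;
    }

    public static void Main()
    {
        var store = new ExpandoObject();
        dynamic globals = store;            // same object, dynamic view

        DeclareIfMissing(store, "hasHeadline", false);
        DeclareIfMissing(store, "hasHeadline", false); // second call is a no-op

        globals.hasHeadline = true;         // plain assignment still works
        if (globals.hasHeadline)
            Console.WriteLine("headline block rendered");
    }
}
```

Since the members live in a dictionary, a second module instance running the same lines does not produce a duplicate-declaration error.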

There's no need to. C# is a statically typed programming language ("type" refers to more than just class, struct, and interface: "static typing" means the "types" (shapes) of data, objects and values in your program are known statically - i.e. at compile-time). If something isn't declared in scope then your code simply won't compile.
This also applies to locals (local variables, method parameters, etc).
This won't compile:
class Foo
{
void Baz( String x ) // a method cannot share its enclosing type's name
{
if( z > 0 ) { // `z` isn't declared as a field, parameter or local.
// ...
}
}
}
Similarly, this won't compile:
class Foo
{
public string x;
}
class Bar
{
void Baz( Foo foo )
{
if( foo.z > 0 ) { // `z` is not declared in `Foo`
}
}
}
That said, there are some things you do need to check-before-using in C#, such as:
Nullable references or nullable values.
Entries in a Dictionary or other keyed collection.
Type-checking when you want a known subclass or interface (As C# still does not natively support algebraic types, grrrr)
...but none of those involve checking for declarations.
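For comparison, here is a rough sketch of what those three kinds of check-before-use can look like. All names in it are invented for the example.

```csharp
using System;
using System.Collections.Generic;

public static class CheckBeforeUseDemo
{
    // Dictionary entry: TryGetValue reports whether the key exists.
    public static bool ReadFlag(Dictionary<string, bool> flags, string name)
    {
        return flags.TryGetValue(name, out bool value) && value;
    }

    // Nullable value: fall back to 0 when there is no value.
    public static int CountOrZero(int? count)
    {
        return count ?? 0;
    }

    // Type check: the pattern only matches when item really is a string.
    public static int LengthIfString(object item)
    {
        return item is string s ? s.Length : -1;
    }

    public static void Main()
    {
        var flags = new Dictionary<string, bool> { ["hasHeadline"] = true };
        Console.WriteLine(ReadFlag(flags, "hasHeadline"));  // True
        Console.WriteLine(ReadFlag(flags, "hasFooter"));    // False
        Console.WriteLine(CountOrZero(null));               // 0
        Console.WriteLine(LengthIfString("text"));          // 4
    }
}
```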

C# unassigned static int results to zero

While experimenting with static variables, I was surprised to find that an unassigned static int evaluates to 0 (zero), while a non-static one results in a compile-time error.
Consider Case 1
static int i;
static void Main()
{
Console.Write("Value of i = " + i);
Console.ReadKey();
}
the output is
Value of i = 0
Case 2, with static removed
static void Main()
{
int i;
Console.Write("Value of i = " + i);
Console.ReadKey();
}
And this results in a compile-time error
Error 1 Use of unassigned local variable 'i'
The question is how the two cases differ, i.e. why the first one results in 0 and the other gets a compiler error.

The existing answers all miss something important here, which is where the variable is declared. Is it a class variable or a local variable?
In the first scenario
class Program
{
static int i;
static void Main()
{
Console.Write("Value of i = " + i);
Console.ReadKey();
}
}
The variable i is declared as a class variable. Class variables always get initialized, it doesn't matter if it's static or not. If you don't provide a default value, the variable is assigned default, which in the case of int is 0.
On the other hand, in the second example
class Program
{
static void Main()
{
int i;
Console.Write("Value of i = " + i);
Console.ReadKey();
}
}
Variable i is local variable. Unlike class variables, local variables are never initialized with a default value implicitly, but only when you explicitly initialize them. So the compiler error comes not from the variable being static or not, but from the difference in initialization between local and class variables.
The specification shares some more details on that: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/language-specification/variables, especially sections 9.2 Variable Types and 9.3 Default Values.
The interesting parts are
9.2.2 The initial value of a static variable is the default value (§9.3) of the variable’s type.
9.2.3.2 The initial value of an instance variable of a class is the default value (§9.3) of the variable’s type.
9.2.8 A local variable introduced by a local_variable_declaration is not automatically initialized and thus has no default value. Such a local variable is considered initially unassigned.
9.3: The following categories of variables are automatically initialized to their default values:
Static variables.
Instance variables of class instances.
Array elements.
The underlying reason for this has to do with memory management. When you initialize a class with new() the garbage collector zeros out all bytes on the heap, thus basically defaulting the value of the variable. In case of integers this is 0, for an object reference it would be a null.
Since local variables live on the stack, and the garbage collector doesn't work on the stack, this guarantee does not exist, so we have to explicitly initialize a variable before using it.
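The categories quoted from the specification can be observed with a small sketch like the following (the type and member names are invented for the example; the compiler may warn that the fields are never assigned, which is the point here).

```csharp
using System;

public class DefaultsDemo
{
    public static int StaticCount;   // static variable
    public string Name;              // instance variable, reference type
    public bool Flag;                // instance variable, value type

    public static void Main()
    {
        var demo = new DefaultsDemo();
        int[] items = new int[2];    // array elements

        Console.WriteLine(StaticCount);        // 0
        Console.WriteLine(demo.Name == null);  // True
        Console.WriteLine(demo.Flag);          // False
        Console.WriteLine(items[1]);           // 0

        // A local gets no such treatment. Uncommenting the next two lines
        // produces "Use of unassigned local variable 'local'".
        // int local;
        // Console.WriteLine(local);
    }
}
```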

By definition of the C# language, types have "default values", which are assigned to them if you don't assign something else. Numbers have a default value of 0, booleans false, reference types null, and structs have each member defaulted according to its type.

Based on my limited understanding, by declaring a variable as static, it becomes existing in memory and is assigned a default value of 0, and does not depend on an instance of the class it is in to exist.
If it is not defined as static, it needs to be initialized within the class(as in, given a value) before it can be used in any logic/math.
Now my confusion comes from being very new, and trying to understand exactly WHY you would choose to do something like this one way, instead of another. Perhaps this is a way to have some values that persist that may be necessary even when the class it is in is not existing, and making everything static would result in inefficient use of memory.

The fundamental reason why you do not get a compilation error for the static scenario is because the compiler has no way to be sure that it is not initialized before being read.
Indeed, a static class member without visibility modifier is internal by default. This means that another class of the same assembly could define it before the Main method is called.
And even if it were private, an external code could still define its value using classes in the System.Reflection namespace.
Whereas, for the local variable case, the compiler is sure that no other code could define the variable between its declaration and its first reading.
Indeed, local variables are not accessible outside their declaration scope to non debugging code.

Avoid "Use of unassigned local variable" error

I have two methods that are equivalent to this (pardon the contrived example):
public void WithResource(Action<Resource> action) {
using (var resource = GetResource()) {
action(resource);
}
}
public void Test() {
int id;
SomeObject someObject;
WithResource((resource) => {
id = 1;
someObject = SomeClass.SomeStaticMethod(resource);
});
Assert.IsNotNull(someObject);
Assert.AreEqual(id, someObject.Id);
}
(There's some more logic in the WithResource call I'm trying to factor out.)
I'm getting "Use of unassigned local variable" compile-time errors because the assertions are... using unassigned variables. I'm currently avoiding the issue by assigning them -1 and null respectively.
Initializing to null doesn't feel bad, but I'd like to avoid putting the -1 in there... I would really like to tell the compiler "trust me, this will become initialized". Since it's a test, I don't actually care too much if it bombs, because that only means I'll have to fix the test.
I'm tempted to ask if there's a way to give that hint to the compiler, but have the feeling that's even uglier. Does such a hint exist or should I just initialize the variables like I'm doing now?

The compiler is not smart enough to determine whether the assignment will be made, hence the error.
You can't do anything about it. You have to assign some default value, such as 0, -1, default(int) or int.MinValue.

You should initialize the variables. The compiler will never trust you ;)

The other answers are correct. The easiest way to solve this problem is to initialize the locals.
I assume that you understand why the error is being produced: the compiler has no ability to know that the method called actually runs the lambda, and therefore no knowledge that the locals are initialized.
The only way to trick the compiler into not checking whether a variable is assigned is to make the variable non-local:
public void Test() {
int[] id = new int[1];
SomeObject[] someObject = new SomeObject[1];
WithResource((resource) => {
id[0] = 1;
someObject[0] = SomeClass.SomeStaticMethod(resource);
});
Assert.IsNotNull(someObject[0]);
Assert.AreEqual(id[0], someObject[0].Id);
}
Now you might say, well here I've clearly assigned id. Yes, but notice that the compiler does not complain that you've used id[0] before initializing it! The compiler knows that array element variables are initialized to zero.
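Another way to sidestep the error, assuming you are free to add an overload to WithResource, is to let the lambda return its results so that the locals are assigned by an ordinary assignment the compiler can see. This is only a sketch: Resource and GetResource below are stand-ins for the question's types, and the generic overload is not part of the original code.

```csharp
using System;

public sealed class Resource : IDisposable
{
    public void Dispose() { }
}

public static class ResourceDemo
{
    // Stand-in for the question's GetResource().
    private static Resource GetResource()
    {
        return new Resource();
    }

    // Hypothetical overload: runs func with the resource and returns its result.
    public static T WithResource<T>(Func<Resource, T> func)
    {
        using (var resource = GetResource())
        {
            return func(resource);
        }
    }

    public static void Main()
    {
        // Both locals are definitely assigned by the tuple assignment
        // (tuple syntax needs C# 7.0 or later).
        (int id, string name) = WithResource(resource => (1, "demo"));
        Console.WriteLine(id);    // 1
        Console.WriteLine(name);  // demo
    }
}
```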

C# Cannot use ref or out parameter inside an anonymous method body

I'm trying to create a function that can create an Action that increments whatever integer is passed in. However my first attempt is giving me an error "cannot use ref or out parameter inside an anonymous method body".
public static class IntEx {
public static Action CreateIncrementer(ref int reference) {
return () => {
reference += 1;
};
}
}
I understand why the compiler doesn't like this, but nonetheless I'd like to have a graceful way to provide a nice incrementer factory that can point to any integer. The only way I'm seeing to do this is something like the following:
public static class IntEx {
public static Action CreateIncrementer(Func<int> getter, Action<int> setter) {
return () => setter(getter() + 1);
}
}
But of course that is more of a pain for the caller to use; requiring the caller to create two lambdas instead of just passing in a reference. Is there any more graceful way of providing this functionality, or will I just have to live with the two-lambda option?

Okay, I've found that it actually is possible with pointers if in unsafe context:
public static class IntEx {
unsafe public static Action CreateIncrementer(int* reference) {
return () => {
*reference += 1;
};
}
}
However, the garbage collector can wreak havoc with this by moving your reference during garbage collection, as the following indicates:
class Program {
static void Main() {
new Program().Run();
Console.ReadLine();
}
int _i = 0;
public unsafe void Run() {
Action incr;
fixed (int* p_i = &_i) {
incr = IntEx.CreateIncrementer(p_i);
}
incr();
Console.WriteLine(_i); // Yay, incremented to 1!
GC.Collect();
incr();
Console.WriteLine(_i); // Uh-oh, still 1!
}
}
One can get around this problem by pinning the variable to a specific spot in memory. This can be done by adding the following to the constructor:
public Program() {
GCHandle.Alloc(_i, GCHandleType.Pinned);
}
That keeps the garbage collector from moving the object around, so exactly what we're looking for. However then you've got to add a destructor to release the pin, and it fragments the memory throughout the lifetime of the object. Not really any easier. This would make more sense in C++, where stuff doesn't get moved around, and resource management is par for the course, but not so much in C# where all that is supposed to be automatic.
So looks like the moral of the story is, just wrap that member int in a reference type and be done with it.
(And yes, that's the way I had it working before asking the question, but was just trying to figure out if there was a way I could get rid of all my Reference<int> member variables and just use regular ints. Oh well.)
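For reference, a wrapper along those lines might look roughly like this. Reference<T> is not a framework type; it is sketched here from the description above, and the real one may well differ.

```csharp
using System;

// Hypothetical wrapper: a heap object holding one value, so a closure
// can capture the wrapper instead of a ref parameter.
public sealed class Reference<T>
{
    public T Value { get; set; }

    public Reference(T value)
    {
        Value = value;
    }
}

public static class IntEx
{
    public static Action CreateIncrementer(Reference<int> reference)
    {
        return () => { reference.Value += 1; };
    }
}

public static class ReferenceDemo
{
    public static void Main()
    {
        var counter = new Reference<int>(0);
        Action incr = IntEx.CreateIncrementer(counter);
        incr();
        GC.Collect();   // moving the wrapper does not break the captured reference
        incr();
        Console.WriteLine(counter.Value); // 2
    }
}
```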

This is not possible.
The compiler will transform all local variables and parameters used by anonymous methods into fields in an automatically generated closure class.
The CLR does not allow ref types to be stored in fields.
For example, if you pass a value type in a local variable as such a ref parameter, the value's lifetime would extend beyond its stack frame.
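To picture the transformation, here is a hand-written approximation of what the compiler does with a captured local. The class name DisplayClass is made up; the real generated names are compiler-internal and not valid C# identifiers.

```csharp
using System;

// Roughly what the compiler generates for a lambda that captures "counter":
// the local becomes a field, the lambda body becomes a method.
public sealed class DisplayClass
{
    public int counter;

    public void Increment()
    {
        counter += 1;
    }
}

public static class ClosureDemo
{
    public static void Main()
    {
        // What you write:
        int counter = 0;
        Action viaLambda = () => { counter += 1; };
        viaLambda();
        Console.WriteLine(counter);          // 1

        // Approximately what runs:
        var closure = new DisplayClass();
        Action byHand = closure.Increment;
        byHand();
        Console.WriteLine(closure.counter);  // 1
    }
}
```

A ref parameter cannot be moved into such a field, which is why the compiler rejects the capture.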

It might have been a useful feature for the runtime to allow the creation of variable references with a mechanism to prevent their persistence; such a feature would have allowed an indexer to behave like an array (e.g. so a Dictionary<Int32, Point> could be accessed via "myDictionary[5].X = 9;"). I think such a feature could have been provided safely if such references could not be downcast to other types of objects, nor used as fields, nor passed by reference themselves (since anyplace such a reference could be stored would go out of scope before the reference itself would). Unfortunately, the CLR does not provide such a feature.
To implement what you're after would require that the caller of any function which uses a reference parameter within a closure must wrap within a closure any variable it wants to pass to such a function. If there were a special declaration to indicate that a parameter would be used in such a fashion, it might be practical for a compiler to implement the required behavior. Maybe in a .net 5.0 compiler, though I'm not sure how useful that would be.
BTW, my understanding is that closures in Java use by-value semantics, while those in .net are by-reference. I can understand some occasional uses for by-reference semantics, but using reference by default seems a dubious decision, analogous to the use of default by-reference parameter-passing semantics for VB versions up through VB6. If one wants to capture the value of a variable when creating a delegate to call a function (e.g. if one wants a delegate to call MyFunction(X) using the value of X when the delegate is created), is it better to use a lambda with an extra temp, or is it better to simply use a delegate factory and not bother with lambda expressions?

why C# does not provide internal helper for passing property as reference?

This is issue about LANGUAGE DESIGN.
Please do not answer to the question until you read entire post! Thank you.
With all helpers existing in C# (like lambdas, or automatic properties) it is very odd for me that I cannot pass property by a reference. Let's say I would like to do that:
foo(ref my_class.prop);
I get error so I write instead:
{
var tmp = my_class.prop;
foo(ref tmp);
my_class.prop = tmp;
}
And now it works. But please notice two things:
it is general template, I didn't put anywhere type, only "var", so it applies for all types and number of properties I have to pass
I have to do it over and over again, with no benefit -- it is mechanical work
The existing problem actually kills such useful functions as Swap. Swap is normally 3 lines long, but since it takes 2 references, calling it takes 5 lines. Of course it is nonsense and I simply write "swap" by hand each time I would like to call it. But this shows C# prevents reusable code, bad.
THE QUESTION
So -- what bad could happen if compiler automatically create temporary variables (as I do by hand), call the function, and assign the values back to properties? Is this any danger in it? I don't see it so I am curious what do you think why the design of this issue looks like it looks now.
Cheers,
EDIT As 280Z28 gave great examples for beating idea of automatically wrapping ref for properties I still think wrapping properties with temporary variables would be useful. Maybe something like this:
Swap(inout my_class.prop1,inout my_class.prop2);
Otherwise no real Swap for C# :-(
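For the specific Swap case, compilers from C# 7.0 onwards accept a tuple assignment between two properties, which does the copying through temporaries for you but keeps it visible at the call site. A minimal sketch, with a made-up Pair class standing in for my_class:

```csharp
using System;

public class Pair
{
    public int Prop1 { get; set; }
    public int Prop2 { get; set; }
}

public static class SwapDemo
{
    // Swaps two properties without ref: the right-hand tuple is evaluated
    // first (both getters), then the values are assigned (both setters).
    public static void SwapProps(Pair pair)
    {
        (pair.Prop1, pair.Prop2) = (pair.Prop2, pair.Prop1);
    }

    public static void Main()
    {
        var pair = new Pair { Prop1 = 1, Prop2 = 2 };
        SwapProps(pair);
        Console.WriteLine(pair.Prop1); // 2
        Console.WriteLine(pair.Prop2); // 1
    }
}
```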

There are a lot of assumptions you can make about the meaning and behavior of a ref parameter. For example,
Case 1:
int x;
Interlocked.Increment(ref x);
If you could pass a property by ref to this method, the call would be the same but it would completely defeat the semantics of the method.
Case 2:
void WaitForCompletion(ref bool trigger)
{
while (!trigger)
Thread.Sleep(1000);
}
Summary: A by-ref parameter passes the address of a memory location to the function. An implementation creating a temporary variable in order to "pass a property by reference" would be semantically equivalent to passing by value, which is precisely the behavior that you're disallowing when you make the parameter a ref one.

Your proposal is called "copy in - copy out" reference semantics. Copy-in-copy-out semantics are subtly different from what we might call "ref to variable" semantics; different enough to be confusing and wrong in many situations. Others have already given you some examples; there are plenty more. For example:
void M() { F(ref this.p); }
void F(ref int x) { x = 123; B(); }
void B() { Console.WriteLine(this.p); }
If "this.p" is a property, with your proposal, this prints the old value of the property. If it is a field then it prints the new value.
Now imagine that you refactor a field to be a property. In the real language, that causes errors if you were passing a field by ref; the problem is brought to your attention. With your proposal, there is no error; instead, behaviour changes silently and subtly. That makes for bugs.
Consistency is important in C#, particularly in parts of the language that people find confusing, like reference semantics. I would want either references to always be copy-in-copy-out or never copy-in-copy-out. Doing it one way sometimes and another way other times seems like really bad design for C#, a language which values consistency over brevity.
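The difference can be simulated in today's C# by writing the copy-in-copy-out steps by hand. In this sketch (all names invented for the example) the same "write through the reference, then read the member" sequence observes 123 for a field but 0 for a property, because the property is only updated after the call returns.

```csharp
using System;

public class CopyInCopyOutDemo
{
    public int Field;
    public int Prop { get; set; }
    public int Observed;

    // Write through the reference, then read the member the caller passed.
    public void WriteThenReadField(ref int x) { x = 123; Observed = Field; }
    public void WriteThenReadProp(ref int x) { x = 123; Observed = Prop; }

    public static void Main()
    {
        var demo = new CopyInCopyOutDemo();

        demo.WriteThenReadField(ref demo.Field);   // real "ref to variable"
        Console.WriteLine(demo.Observed);          // 123

        int tmp = demo.Prop;                       // copy in
        demo.WriteThenReadProp(ref tmp);
        demo.Prop = tmp;                           // copy out
        Console.WriteLine(demo.Observed);          // 0
        Console.WriteLine(demo.Prop);              // 123
    }
}
```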

Because a property is a method. It is a language construct responding to a pattern of encapsulating the setting and retrieval of a private field through a set of methods. It is functionally equivalent to this:
class Foo
{
private int _bar;
public int GetBar( ) { return _bar; }
public void SetBar( int value ) { _bar = value; }
}

With a ref argument, changes to the underlying variable will be observed by the method, this won't happen in your case. In other words, it is not exactly the same.

var t = obj.prop;
foo(ref t);
obj.prop = t;
Here, side effects of getter and setter are only visible once each, regardless of how many times the "by-ref" parameter got assigned to.
Imagine a dynamically computed property. Its value might change at any time. With this construct, foo is not kept up to date even though the code suggests this ("I'm passing the property to the method")

"So -- what bad could happen if compiler automatically create temporary variables (as I do by hand), call the function, and assign the values back to properties? Is this any danger in it?"
The danger is that the compiler would be doing something you don't know about. That makes the code confusing, because properties are methods, not variables.

I'll provide just one simple example where it would cause confusion. Assume it was possible (as it is in VB):
class Weird {
public int Prop { get; set; }
}
static void Test(ref int x) {
x = 42;
throw new Exception();
}
static void Main() {
int v = 10;
try {
Test(ref v);
} catch {}
Console.WriteLine(v); // prints 42
var c = new Weird();
c.Prop = 10;
try {
Test(ref c.Prop);
} catch {}
Console.WriteLine(c.Prop); // prints 10!!!
}
Nice. Isn't it?

Because, as Eric Lippert is fond of pointing out, every language feature must be understood, designed, specified, implemented, tested and documented. And it's obviously not a common scenario/pain point.
