dotNet/C#: Check if a variable was declared

dotNet/C#: Check if a variable was declared - c#

I can't find an answer to my problem. In dotNet/C#, is it possible to check if a variable was declared to some type and if not, declare it?
Thanks
[Edit] In this case, C# is used as a preexecute language in Open Text CMS. C# code can be used in any module. Using a non-declared variable throws hard to debug errors, as does double-declaring a variable. That's why I'd like to check.
[Edit2] Yes it is most probably compiled somewhere, but the errors are thrown (or rather not thrown) on runtime
[Edit3] Further explanation:
In Open Text, every page can hold several modules, several instances of a module and the same instance of a module several times. In each module, you can use C# as a "pre-execute" language. This is mostly really easy scripting to maneuver around the failings of OpenText. You introduce small variables, set them to true or false, and three lines later write a condition based on the variable. We could (and do) declare a bunch of variables in an initialization block of the page, but since there are so many, it would help to be able to check if a variable was declared and if not, declare it.
I like the idea of changing this to a key/value dictionary but this is a really large site with loads of pages/modules and instances and I'm looking for a working solution without changing the whole thing.
The actual code is really simple most oft he time:
var hasHeadline = false; // this will throw an error if hasHeadline was declared before
hasHeadline = true; // if some CMS condition is met. this will throw an error if hasHeadline wasn't declared
if(hasHeadline) { ** CMS code ** }
As I said, this will show up in multiple instances over which I don't have full control. The resulting "error" will be that the whole code block is stripped from the page.

Declare a single variable that is dynamic, e.g. an ExpandoObject.
dynamic Globals = new ExpandoObject();
Use this variable to store all of your global state.
Globals.hasHeadline = false; //No declaration needed, so
Globals.hasHeadline = true; //no chance of a duplicate declaration

There's no need to. C# is a statically typed programming language ("type" refers to more than just class, struct, and interface: "static typing" means the "types" (shapes) of data, objects and values in your program are known
statically - i.e. at compile-time). If something isn't declared in scope then your code simply won't compile.
This also applies to locals (local variables, method parameters, etc).
This won't compile:
class Foo
{
void Foo( String x )
{
if( z > 0 ) { // `z` isn't declared as a field, parameter or local.
// ...
}
}
}
Similarly, this won't compile:
class Foo
{
public string x;
}
class Bar
{
void Baz( Foo foo )
{
if( foo.z > 0 ) { // `z` is not declared in `Foo`
}
}
}
That said, there are some things you do need to check-before-using in C#, such as:
Nullable references or nullable values.
Entries in a Dictionary or other keyed collection.
Type-checking when you want a known subclass or interface (As C# still does not natively support algebraic types, grrrr)
...but none of those involve checking for declarations.

Related

C# initial value in constructor OR class variable declaration? [duplicate]

I've been programming in C# and Java recently and I am curious where the best place is to initialize my class fields.
Should I do it at declaration?:
public class Dice
{
private int topFace = 1;
private Random myRand = new Random();
public void Roll()
{
// ......
}
}
or in a constructor?:
public class Dice
{
private int topFace;
private Random myRand;
public Dice()
{
topFace = 1;
myRand = new Random();
}
public void Roll()
{
// .....
}
}
I'm really curious what some of you veterans think is the best practice. I want to be consistent and stick to one approach.

My rules:
Don't initialize with the default values in declaration (null, false, 0, 0.0…).
Prefer initialization in declaration if you don't have a constructor parameter that changes the value of the field.
If the value of the field changes because of a constructor parameter put the initialization in the constructors.
Be consistent in your practice (the most important rule).

In C# it doesn't matter. The two code samples you give are utterly equivalent. In the first example the C# compiler (or is it the CLR?) will construct an empty constructor and initialise the variables as if they were in the constructor (there's a slight nuance to this that Jon Skeet explains in the comments below).
If there is already a constructor then any initialisation "above" will be moved into the top of it.
In terms of best practice the former is less error prone than the latter as someone could easily add another constructor and forget to chain it.

I think there is one caveat. I once committed such an error: Inside of a derived class, I tried to "initialize at declaration" the fields inherited from an abstract base class. The result was that there existed two sets of fields, one is "base" and another is the newly declared ones, and it cost me quite some time to debug.
The lesson: to initialize inherited fields, you'd do it inside of the constructor.

The semantics of C# differs slightly from Java here. In C# assignment in declaration is performed before calling the superclass constructor. In Java it is done immediately after which allows 'this' to be used (particularly useful for anonymous inner classes), and means that the semantics of the two forms really do match.
If you can, make the fields final.

Assuming the type in your example, definitely prefer to initialize fields in the constructor. The exceptional cases are:
Fields in static classes/methods
Fields typed as static/final/et al
I always think of the field listing at the top of a class as the table of contents (what is contained herein, not how it is used), and the constructor as the introduction. Methods of course are chapters.

In Java, an initializer with the declaration means the field is always initialized the same way, regardless of which constructor is used (if you have more than one) or the parameters of your constructors (if they have arguments), although a constructor might subsequently change the value (if it is not final). So using an initializer with a declaration suggests to a reader that the initialized value is the value that the field has in all cases, regardless of which constructor is used and regardless of the parameters passed to any constructor. Therefore use an initializer with the declaration only if, and always if, the value for all constructed objects is the same.

There are many and various situations.
I just need an empty list
The situation is clear. I just need to prepare my list and prevent an exception from being thrown when someone adds an item to the list.
public class CsvFile
{
private List<CsvRow> lines = new List<CsvRow>();
public CsvFile()
{
}
}
I know the values
I exactly know what values I want to have by default or I need to use some other logic.
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = new List<string>() {"usernameA", "usernameB"};
}
}
or
public class AdminTeam
{
private List<string> usernames;
public AdminTeam()
{
usernames = GetDefaultUsers(2);
}
}
Empty list with possible values
Sometimes I expect an empty list by default with a possibility of adding values through another constructor.
public class AdminTeam
{
private List<string> usernames = new List<string>();
public AdminTeam()
{
}
public AdminTeam(List<string> admins)
{
admins.ForEach(x => usernames.Add(x));
}
}

What if I told you, it depends?
I in general initialize everything and do it in a consistent way. Yes it's overly explicit but it's also a little easier to maintain.
If we are worried about performance, well then I initialize only what has to be done and place it in the areas it gives the most bang for the buck.
In a real time system, I question if I even need the variable or constant at all.
And in C++ I often do next to no initialization in either place and move it into an Init() function. Why? Well, in C++ if you're initializing something that can throw an exception during object construction you open yourself to memory leaks.

The design of C# suggests that inline initialization is preferred, or it wouldn't be in the language. Any time you can avoid a cross-reference between different places in the code, you're generally better off.
There is also the matter of consistency with static field initialization, which needs to be inline for best performance. The Framework Design Guidelines for Constructor Design say this:
✓ CONSIDER initializing static fields inline rather than explicitly using static constructors, because the runtime is able to optimize the performance of types that don’t have an explicitly defined static constructor.
"Consider" in this context means to do so unless there's a good reason not to. In the case of static initializer fields, a good reason would be if initialization is too complex to be coded inline.

Being consistent is important, but this is the question to ask yourself:
"Do I have a constructor for anything else?"
Typically, I am creating models for data transfers that the class itself does nothing except work as housing for variables.
In these scenarios, I usually don't have any methods or constructors. It would feel silly to me to create a constructor for the exclusive purpose of initializing my lists, especially since I can initialize them in-line with the declaration.
So as many others have said, it depends on your usage. Keep it simple, and don't make anything extra that you don't have to.

Consider the situation where you have more than one constructor. Will the initialization be different for the different constructors? If they will be the same, then why repeat for each constructor? This is in line with kokos statement, but may not be related to parameters. Let's say, for example, you want to keep a flag which shows how the object was created. Then that flag would be initialized differently for different constructors regardless of the constructor parameters. On the other hand, if you repeat the same initialization for each constructor you leave the possibility that you (unintentionally) change the initialization parameter in some of the constructors but not in others. So, the basic concept here is that common code should have a common location and not be potentially repeated in different locations. So I would say always put it in the declaration until you have a specific situation where that no longer works for you.

There is a slight performance benefit to setting the value in the declaration. If you set it in the constructor it is actually being set twice (first to the default value, then reset in the ctor).

When you don't need some logic or error handling:
Initialize class fields at declaration
When you need some logic or error handling:
Initialize class fields in constructor
This works well when the initialization value is available and the
initialization can be put on one line. However, this form of
initialization has limitations because of its simplicity. If
initialization requires some logic (for example, error handling or a
for loop to fill a complex array), simple assignment is inadequate.
Instance variables can be initialized in constructors, where error
handling or other logic can be used.
From https://docs.oracle.com/javase/tutorial/java/javaOO/initial.html .

I normally try the constructor to do nothing but getting the dependencies and initializing the related instance members with them. This will make you life easier if you want to unit test your classes.
If the value you are going to assign to an instance variable does not get influenced by any of the parameters you are going to pass to you constructor then assign it at declaration time.

Not a direct answer to your question about the best practice but an important and related refresher point is that in the case of a generic class definition, either leave it on compiler to initialize with default values or we have to use a special method to initialize fields to their default values (if that is absolute necessary for code readability).
class MyGeneric<T>
{
T data;
//T data = ""; // <-- ERROR
//T data = 0; // <-- ERROR
//T data = null; // <-- ERROR
public MyGeneric()
{
// All of the above errors would be errors here in constructor as well
}
}
And the special method to initialize a generic field to its default value is the following:
class MyGeneric<T>
{
T data = default(T);
public MyGeneric()
{
// The same method can be used here in constructor
}
}

"Prefer initialization in declaration", seems like a good general practice.
Here is an example which cannot be initialized in the declaration so it has to be done in the constructor.
"Error CS0236 A field initializer cannot reference the non-static field, method, or property"
class UserViewModel
{
// Cannot be set here
public ICommand UpdateCommad { get; private set; }
public UserViewModel()
{
UpdateCommad = new GenericCommand(Update_Method); // <== THIS WORKS
}
void Update_Method(object? parameter)
{
}
}

Variable of indeterminate type

I have a class that contains a variable of indeterminate type, which must be overridden at runtime, how can I do this?
Sorry for the disgusting question(
Example:
public class MyClass
{
public e_Type TypeValue;
public (variable of indeterminate type) Value;
}
public enum e_Type
{
string, int, bool, byte
}
At runtime variable TypeValue should determine the type of variable Value

Depending on what you actually mean, you should use either var or dynamic.
The var keyword simply lets the compiler take care of deciding which type you are actually using. If the data you will be assigning is truly dynamic during runtime, it won't do you much good. You should mostly look at var as syntactic sugar (even if it at times can be very, very helpful sugar) - i.e. it just saves you typing.
The dynamic keyword lets you create an object that is truly dynamic, that is you will not get a compiler or runtime error no matter what you try to assign to it. The runtime errors will happen later down the road when you try to call on a property that doesn't exist on it. This is essentially you telling the compiler "Hey, look, just don't give me any fuss about this object, allow me to assign anything to it and call anything on it. If I mess up, it's my problem, not yours."
I think whenever you are thinking about using dynamic you should consider the problem at hand and see if it can be solved in a better way (interfaces, generics etc).

It sounds like you're really after generics:
class Foo<T>
{
public T Value { get; set; };
}
Then you can create instances for different types:
Foo<string> x = new Foo<string>();
x.Value = "fred";
Foo<int> y = new Foo<int>();
y.Value = 10;
This is still fixing the type at compile-time - but when the code using the type is compiled.
var is completely wrong here - var is just used for implicitly typed local variables. In particular, you can't apply it to fields.
It's possible that you want dynamic, but it's not really clear from your question at the moment.

I know that this must be done using the keyword var
Nope, that isn't what var does. There are 3 things that leap to mind that would work:
object; can store anything, but requires reflection to do anything useful
dynamic; a special-case of object, where the compiler performs voodoo such that obj.SomeMethod() (etc) is resolved at runtime
generics, i.e. have the class be SomeType<T>, with the variable typed as T; generic constraints can make this T more usable by declaring features (interfaces) that it must have

var has the purpose of referencing anything, not to declare anything. It's the other way around.
I acomplished this once leveraging the System.Dynamic.ExpandoObject (C# 4 only!), it allows for properties to be added at will without declaring them, and they will be resolved at runtime (it resembles how PHP treats objects and I'm a huge fan of it).
A quick example:
dynamic myObject = new ExpandoObject();
myObject.myProperty = "You can declare properties on-the-fly inside me !";
Console.WriteLine(myObject.myProperty);

Statements cannot exist outside of methods?

Reading a book (VS 2010), it says that commands (statements) in .NET Csharp cannot exist outside of method.
I am wondering - field declaration etc, these are commands, are they not? And they exist at class level. Can somebody elaborate at this a bit?

If you mean:
class Foo
{
int count = 0;
StringBuilder buffer = new StringBuilder();
}
The count and buffer are declarations using initializer expressions . But this code contains no statements.

A field initialiser is written with the code outside a method, but the compiler puts that code inside the constructor.
So a field initialiser like this:
class Foo {
int Bar = 42;
}
is basiclally a field and an initialiser in the constructor:
class Foo {
int Bar;
Foo() {
Bar = 42;
}
}

There's no such concept as a "command" in C#.
And a static / instance variable declaration isn't categorized as a statement within C# - it's a field-declaration (which is a type of class-member-declaration) as per the C# spec. See section 10.5 of the C# 4 spec for example.
Now the statements which declare local variables are statements, as defined by declaration-statement in the spec (section 8.5). They're only used for locals though. See section B.2.5 for a complete list of statement productions within C# 4.
Basically, the C# spec defines the terminology involved - so while you might think informally of "commands" and the like, in a matter of correctness the C# spec is the source of authority. (Except for where it doesn't say what the language designers meant to say, of course. That's pretty rare.)

As you said they're declarations, a statement is one which actually gets something done.

No, they're declarations. Class member declarations, to be precise.
And it's perfectly legal for those to exist outside of a method. Otherwise, you couldn't declare a method in the first place!
By "statements", the book is telling you that you can't have things like method calls outside of a method. For example, the following code is illegal:
public void DoSomething()
{
// Do something here...
}
MessageBox.Show("This statement is not allowed because it is outside a method.");

Classes, namespace, fields declarations are not declarations statements.
A field can be initialised outside a method with an expression but while an expression is a statement there are lots of statements that are not expressions (eg. if).
It all comes down to how the language grammar defines the terms, and the way C# does it is pretty common (eg. very similar to C and C++).

Noninitialized variable in C#

I have the following piece of code:
class Foo
{
public Foo()
{
Bar bar;
if (null == bar)
{
}
}
}
class Bar { }
Code gurus will already see that this gives an error. Bar might not be initialized before the if statement.
What is the value of bar? Shouldn't it be null? Aren't they set to null? (null pointer?)

No, local variables don't have a default value1. They have to be definitely assigned before you read them. This reduces the chance of you using a variable you think you've given a sensible value to, when actually it's got some default value. This can't be done for instance or static variables because you don't know in what order methods will be called.
See section 5.3 of the C# 3.0 spec for more details of definite assignment.
Note that this has nothing to do with this being a reference type variable. This will fail to compile in the same way:
int i;
if (i == 0) // Nope, i isn't definitely assigned
{
}
1 As far as the language is concerned, anyway... clearly the storage location in memory has something in it, but it's irrelevant and implementation-specific. There is one way you can find out what that value is, by creating a method with an out parameter but then using IL to look at the value of that parameter within the method, without having given it another value. The CLR doesn't mind that at all. You can then call that method passing in a not-definitely-assigned variable, and lo and behold you can detect the value - which is likely to be the "all zeroes" value basically.
I suspect that the CLI specification does enforce local variables having a default value - but I'd have to check. Unless you're doing evil things like the above, it shouldn't matter to you in C#.

Fields (variables on classes / structs) are initialized to null/zero/etc. Local variables... well - since (by "definite assignment") you can't access them without assigning there is no sensible way of answering; simply, it isn't defined since it is impossible. I believe they happen to be null/zero/etc (provable by hacking some out code via dynamic IL generation), but that is an implementation detail.
For info, here's some crafy code that shows the value of a formally uninitialised variable:
using System;
using System.Reflection.Emit;
static class Program
{
delegate void Evil<T>(out T value);
static void Main()
{
MakeTheStackFilthy();
Test();
}
static void Test()
{
int i;
DynamicMethod mthd = new DynamicMethod("Evil", null, new Type[] { typeof(int).MakeByRefType()});
mthd.GetILGenerator().Emit(OpCodes.Ret); // just return; no assignments
Evil<int> evil = (Evil<int>)mthd.CreateDelegate(typeof(Evil<int>));
evil(out i);
Console.WriteLine(i);
}
static void MakeTheStackFilthy()
{
DateTime foo = new DateTime();
Bar(ref foo);
Console.WriteLine(foo);
}
static void Bar(ref DateTime foo)
{
foo = foo.AddDays(1);
}
}
The IL just does a "ret" - it never assigns anything.

Local variables do not get assigned a default value. You have to initialize them before you use them. You can explicityly initialize to null though:
public Foo()
{
Bar bar = null;
if (null == bar)
{
}
}

Local variables are not assigned a default value, not even a null.

The value of bar is undefined. There's space allocated for it on the stack, but the space isn't initialised to any value so it contains anything that happened to be there before.
(The local variable might however be optimised to use a register instead of stack space, but it's still undefined.)
The compiler won't let you use the undefined value, it has to be able to determine that the variable is initialised before you can use it.
As a comparison, VB does initialise local variables. While this can be practical sometimes, it can also mean that you unintenionally use a variable before you have given it a meaningful value, and the compiler can't determine if it's what you indended to do or not.

It doesn't matter because no such code should be compilable by any compiler that implements C#.
If there was a default value, then it would be compilable. But there is none for local variables.

Besides "correctness", local variable initialization is also related to the CLR's verification process.
For more details, see my answer to this similar question: Why must local variables have initial values?

why C# does not provide internal helper for passing property as reference?

This is issue about LANGUAGE DESIGN.
Please do not answer to the question until you read entire post! Thank you.
With all helpers existing in C# (like lambdas, or automatic properties) it is very odd for me that I cannot pass property by a reference. Let's say I would like to do that:
foo(ref my_class.prop);
I get error so I write instead:
{
var tmp = my_class.prop;
foo(tmp);
my_class.prop = tmp;
}
And now it works. But please notice two things:
it is general template, I didn't put anywhere type, only "var", so it applies for all types and number of properties I have to pass
I have to do it over and over again, with no benefit -- it is mechanical work
The existing problem actually kills such useful functions as Swap. Swap is normally 3 lines long, but since it takes 2 references, calling it takes 5 lines. Of course it is nonsense and I simply write "swap" by hand each time I would like to call it. But this shows C# prevents reusable code, bad.
THE QUESTION
So -- what bad could happen if compiler automatically create temporary variables (as I do by hand), call the function, and assign the values back to properties? Is this any danger in it? I don't see it so I am curious what do you think why the design of this issue looks like it looks now.
Cheers,
EDIT As 280Z28 gave great examples for beating idea of automatically wrapping ref for properties I still think wrapping properties with temporary variables would be useful. Maybe something like this:
Swap(inout my_class.prop1,inout my_class.prop2);
Otherwise no real Swap for C# :-(

There are a lot of assumptions you can make about the meaning and behavior of a ref parameter. For example,
Case 1:
int x;
Interlocked.Increment(ref x);
If you could pass a property by ref to this method, the call would be the same but it would completely defeat the semantics of the method.
Case 2:
void WaitForCompletion(ref bool trigger)
{
while (!trigger)
Thread.Sleep(1000);
}
Summary: A by-ref parameter passes the address of a memory location to the function. An implementation creating a temporary variable in order to "pass a property by reference" would be semantically equivalent to passing by value, which is precisely the behavior that you're disallowing when you make the parameter a ref one.

Your proposal is called "copy in - copy out" reference semantics. Copy-in-copy-out semantics are subtly different from what we might call "ref to variable" semantics; different enough to be confusing and wrong in many situations. Others have already given you some examples; there are plenty more. For example:
void M() { F(ref this.p); }
void F(ref int x) { x = 123; B(); }
void B() { Console.WriteLine(this.p); }
If "this.p" is a property, with your proposal, this prints the old value of the property. If it is a field then it prints the new value.
Now imagine that you refactor a field to be a property. In the real language, that causes errors if you were passing a field by ref; the problem is brought to your attention. With your proposal, there is no error; instead, behaviour changes silently and subtly. That makes for bugs.
Consistency is important in C#, particularly in parts of the language that people find confusing, like reference semantics. I would want either references to always be copy-in-copy-out or never copy-in-copy-out. Doing it one way sometimes and another way other times seems like really bad design for C#, a language which values consistency over brevity.

Because a property is a method. It is a language construct responding to a pattern of encapsulating the setting and retrieval of a private field through a set of methods. It is functionally equivalent to this:
class Foo
{
private int _bar;
public int GetBar( ) { return _bar; }
public void SetBar( ) { _bar = value; }
}

With a ref argument, changes to the underlying variable will be observed by the method, this won't happen in your case. In other words, it is not exactly the same.

var t = obj.prop;
foo(ref t);
obj.prop = t;
Here, side effects of getter and setter are only visible once each, regardless of how many times the "by-ref" parameter got assigned to.
Imagine a dynamically computed property. Its value might change at any time. With this construct, foo is not kept up to date even though the code suggests this ("I'm passing the property to the method")

So -- what bad could happen if
compiler automatically create
temporary variables (as I do by hand),
call the function, and assign the
values back to properties? Is this any
danger in it?
The danger is that the compiler is doing something you don't know. Making the code confusing because properties are methods, not variables.

I'll provide just one simple example where it would cause confusion. Assume it was possible (as is in VB):
class Weird {
public int Prop { get; set; }
}
static void Test(ref int x) {
x = 42;
throw new Exception();
}
static void Main() {
int v = 10;
try {
Test(ref v);
} catch {}
Console.WriteLine(v); // prints 42
var c = new Weird();
c.Prop = 10;
try {
Test(ref c.Prop);
} catch {}
Console.WriteLine(c.Prop); // prints 10!!!
}
Nice. Isn't it?

Because, as Eric Lippert is fond of pointing out, every language feature must be understood, designed, specified, implemented, tested and documented. And it's obviously not a common scenario/pain point.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.