After migrating my project from VS2013 to VS2015, the project no longer builds. A compilation error occurs in the following LINQ query:
static void Main(string[] args)
{
    decimal a, b;
    IEnumerable<dynamic> array = new string[] { "10", "20", "30" };
    var result = (from v in array
                  where decimal.TryParse(v, out a) && decimal.TryParse("15", out b) && a <= b // Error here
                  orderby decimal.Parse(v)
                  select v).ToArray();
}
The compiler returns an error:
Error CS0165 Use of unassigned local variable 'b'
What causes this issue? Is it possible to fix it through a compiler setting?
What causes this issue?
Looks like a compiler bug to me. At least, it did. Although the decimal.TryParse(v, out a) and decimal.TryParse("15", out b) expressions are evaluated dynamically, I expected the compiler to still understand that by the time it reaches a <= b, both a and b are definitely assigned. Even with the weirdnesses you can come up with in dynamic typing, I'd expect to only ever evaluate a <= b after evaluating both of the TryParse calls.
However, it turns out that through operator and conversion trickery, it's entirely feasible to have an expression A && B && C which evaluates A and C but not B - if you're cunning enough. See the Roslyn bug report for Neal Gafter's ingenious example.
Making that work with dynamic is even harder - the semantics involved when the operands are dynamic are harder to describe, because in order to perform overload resolution, you need to evaluate operands to find out what types are involved, which can be counter-intuitive. However, again Neal has come up with an example which shows that the compiler error is required... this isn't a bug, it's a bug fix. Huge amounts of kudos to Neal for proving it.
Is it possible to fix it through compiler settings?
No, but there are alternatives which avoid the error.
Firstly, you could stop it from being dynamic - if you know that you'll only ever use strings, then you could use IEnumerable<string> or give the range variable v a type of string (i.e. from string v in array). That would be my preferred option.
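For example, here's a minimal sketch of that preferred fix - the same query as in the question, but with the element type statically known to be string:

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main()
    {
        decimal a, b;
        // v is statically a string, so TryParse binds to the ordinary
        // bool-returning overload at compile time and the definite
        // assignment analysis goes through.
        IEnumerable<string> array = new[] { "10", "20", "30" };
        var result = (from v in array
                      where decimal.TryParse(v, out a)
                            && decimal.TryParse("15", out b)
                            && a <= b
                      orderby decimal.Parse(v)
                      select v).ToArray();
        Console.WriteLine(string.Join(", ", result)); // 10
    }
}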
If you really need to keep it dynamic, just give b a value to start with:
decimal a, b = 0m;
This won't do any harm - we know that actually your dynamic evaluation won't do anything crazy, so you'll still end up assigning a value to b before you use it, making the initial value irrelevant.
Additionally, it seems that adding parentheses works too:
where decimal.TryParse(v, out a) && (decimal.TryParse("15", out b) && a <= b)
That changes the point at which various pieces of overload resolution are triggered, and happens to make the compiler happy.
There is one issue still remaining - the spec's rules on definite assignment with the && operator need to be clarified to state that they only apply when the && operator is being used in its "regular" implementation with two bool operands. I'll try to make sure this is fixed for the next ECMA standard.
This does appear to be a bug, or at the least a regression, in the Roslyn compiler. The following bug has been filed to track it:
https://github.com/dotnet/roslyn/issues/4509
In the meantime, Jon's excellent answer has a couple of workarounds.
Since I got schooled so hard in the bug report, I'm going to try to explain this myself.
Imagine T is some user-defined type with an implicit cast to bool that alternates between false and true, starting with false. As far as the compiler knows, the dynamic first argument to the first && might evaluate to that type, so it has to be pessimistic.
If, then, it let the code compile, this could happen:
When the dynamic binder evaluates the first &&, it does the following:
Evaluate the first argument
It's a T - implicitly cast it to bool.
Oh, it's false, so we don't need to evaluate the second argument.
Make the result of the && evaluate as the first argument. (No, not false, for some reason.)
When the dynamic binder evaluates the second &&, it does the following:
Evaluate the first argument.
It's a T - implicitly cast it to bool.
Oh, it's true, so evaluate the second argument.
... Oh crap, b isn't assigned.
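Here's a minimal sketch of such a type (hypothetical and simplified; Neal Gafter's actual example in the Roslyn issue is more elaborate):

// A type whose implicit conversion to bool alternates between
// false and true, starting with false.
class Flip
{
    private bool next; // starts out false

    public static implicit operator bool(Flip f)
    {
        bool result = f.next;
        f.next = !f.next; // each conversion flips the answer
        return result;
    }
}

If the first operand dynamically evaluated to a Flip, the inner && would convert it to bool once (false, so the second operand - and the assignment to b - is skipped), and the outer && would convert the same object again (now true), sending execution straight on to a <= b with b unassigned.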
In spec terms, in short, there are special "definite assignment" rules that let us say not only whether a variable is "definitely assigned" or "not definitely assigned", but also whether it is "definitely assigned after false expression" or "definitely assigned after true expression".
These exist so that when dealing with && and || (and ! and ?? and ?:) the compiler can examine whether variables may be assigned in particular branches of a complex boolean expression.
However, these only work while the expressions' types remain boolean. When part of the expression is dynamic (or a non-boolean static type) we can no longer reliably say that the expression is true or false - the next time we cast it to bool to decide which branch to take, it may have changed its mind.
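For illustration, in the all-bool case those rules are what make code like this compile (hypothetical TryGet helper):

static bool TryGet(out int value)
{
    value = 42;
    return true;
}

static void Demo()
{
    int x;
    // x is "definitely assigned after true expression" of TryGet(out x),
    // so the compiler accepts both the x > 0 check and the use below.
    if (TryGet(out x) && x > 0)
    {
        Console.WriteLine(x);
    }
}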
Update: this has now been resolved and documented:
The definite assignment rules implemented by previous compilers for dynamic expressions allowed some cases of code that could result in variables being read that are not definitely assigned. See https://github.com/dotnet/roslyn/issues/4509 for one report of this.
...
Because of this possibility the compiler must not allow this program to be compiled if val has no initial value. Previous versions of the compiler (prior to VS2015) allowed this program to compile even if val has no initial value. Roslyn now diagnoses this attempt to read a possibly uninitialized variable.
This is not a bug. See https://github.com/dotnet/roslyn/issues/4509#issuecomment-130872713 for an example of how a dynamic expression of this form can leave such an out variable unassigned.
Related
Given code like:
a.b.c = 12;
Is there a way to use operators like ?. and ?? to safely handle the case that a or b is null, and do nothing in that case. a?.b?.c = 12 gives a compiler error, presumably because the L-value might be null and you cannot assign to null.
I'm using C# 7.3 so ??= operator is not available but even if it was, I don't think that is the solution. Is it possible or must I do explicit checks?
As @user2864740 wrote in his answer, the C# language doesn't support such a thing.
The most concise way you can write it in C# is probably this:
if (A?.B != null) A.B.C = 12;
However, I find the need for such a null-safe property assignment rather strange - I mean, if you need to populate a property of some instance, surely you need that instance to actually be there. If A or B is null at that point, your program probably shouldn't silently treat A.B.C = 12; as a no-op - it should throw a NullReferenceException.
That being said, you don't want a NullReferenceException thrown in production code; rather, you want to write your code in such a way that it is null safe.
IMHO, the way to handle such cases is not to avoid the value assignment to the property, but to make sure the reference that holds the property is actually not null before attempting to populate it.
Is there a way to use operators like ?. and ?? to safely handle the case that a or b [on the LHS of an assignment] is null?
No, there is no syntax to get the same short-hand bypass effect for an assignment. Such has not been added to the language.
In its current form, a?.b?.c only ever evaluates to a value, so using it as an assignment target would produce a compiler error comparable to:
(a.b.c) = 12; // LHS is a ‘value’
However, a?.b?.c merely expands internally to an equivalent chain of conditions, and if it were defined/implemented for assignment, the same expansion could apply to this hypothetical case. In any case, it can still be done manually, as sketched below.
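That manual version, using the names from the question, would look something like:

// Hand-written equivalent of the hypothetical a?.b?.c = 12; assignment:
if (a != null && a.b != null)
{
    a.b.c = 12;
}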
Today I discovered some very strange behavior with C# method overloading. The problem occurs when I have a method with two overloads, one accepting object and the other accepting an enum of any type. When I pass 0 as the parameter, the enum version of the method is called. When I use any other integer value, the object version is called. I know this can be easily fixed by using an explicit cast, but I want to know why the compiler behaves that way. Is this a bug, or just some strange language rule I don't know about?
The code below explains the problem (checked with runtime 2.0.50727)
Thanks for any help on this,
Grzegorz Kyc
class Program
{
    enum Bar
    {
        Value1,
        Value2,
        Value3
    }

    static void Main(string[] args)
    {
        Foo(0);
        Foo(1);
        Console.ReadLine();
    }

    static void Foo(object a)
    {
        Console.WriteLine("object");
    }

    static void Foo(Bar a)
    {
        Console.WriteLine("enum");
    }
}
It may be that you're not aware that there's an implicit conversion from a constant¹ of 0 to any enum:
Bar x = 0; // Implicit conversion
Now, the conversion from 0 to Bar is more specific than the conversion from 0 to object, which is why the Foo(Bar) overload is used.
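For instance, given the Foo overloads and Bar enum from the question:

Foo(0);         // prints "enum":   the literal 0 converts implicitly to Bar
Foo(1);         // prints "object": 1 has no implicit conversion to Bar
Foo((object)0); // prints "object": the cast forces the object overload
Foo((Bar)1);    // prints "enum":   an explicit cast picks the enum overload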
Does that clear everything up?
¹ There's actually a bug in the Microsoft C# compiler which lets it be any zero constant, not just an integer:
const decimal DecimalZero = 0.0m;
...
Bar x = DecimalZero;
It's unlikely that this will ever be fixed, as it could break existing working code. I believe Eric Lippert has a couple of blog posts which go into much more detail.
The C# specification section 6.1.3 (C# 4 spec) has this to say about it:
An implicit enumeration conversion permits the decimal-integer-literal 0 to be converted to any enum-type and to any nullable-type whose underlying type is an enum-type. In the latter case the conversion is evaluated by converting to the underlying enum-type and wrapping the result (§4.1.10).
That actually suggests that the bug isn't just in allowing the wrong type, but allowing any constant 0 value to be converted rather than only the literal value 0.
EDIT: It looks like the "constant" part was partially introduced in the C# 3 compiler. Previously it was some constant values, now it looks like it's all of them.
I know I have read somewhere else that the .NET system always treats zero as a valid enumeration value, even if it actually isn't. I will try to find some reference for this...
OK, well I found this, which quotes the following and attributes it to Eric Gunnerson:
Enums in C# do dual purpose. They are used for the usual enum use, and they're also used for bit fields. When I'm dealing with bit fields, you often want to AND a value with the bit field and check if it's true.
Our initial rules meant that you had to write:
if ((myVar & MyEnumName.ColorRed) != (MyEnumName) 0)
which we thought was difficult to read. One alternative was to define a zero entry:
if ((myVar & MyEnumName.ColorRed) != MyEnumName.NoBitsSet)
which was also ugly.
We therefore decided to relax our rules a bit, and permit an implicit conversion from the literal zero to any enum type, which allows you to write:
if ((myVar & MyEnumName.ColorRed) != 0)
which is why PlayingCard(0, 0) works.
So it appears that the whole reason behind this was to simply allow equating to zero when checking flags without having to cast the zero.
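Here's a short sketch of the bit-field pattern the quote describes, using a hypothetical Colors enum:

[Flags]
enum Colors
{
    Red = 1,
    Green = 2,
    Blue = 4
}

static void Demo()
{
    Colors myVar = Colors.Red | Colors.Blue;

    // The literal 0 converts implicitly to Colors, so no cast is needed:
    if ((myVar & Colors.Red) != 0)
    {
        Console.WriteLine("red bit set"); // printed
    }
}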
I've recently been coding a lot in Objective C while also working on several C# projects. In this process, I've found that I miss things in both directions.
In particular, when I code in C# I find I miss the short null check syntax of Objective C.
Why do you suppose in C# you can't check an object for null with a syntax like:
if (maybeNullObject) // works in Objective C, but not C# :(
{
    ...
}
I agree that if (maybeNullObject != null) is a clearer syntax, but it feels not only tedious to write out in code all the time but also overly verbose. In addition, I believe the if (maybeNullObject) syntax is generally understood by most developers (JavaScript, Obj C, and I assume others).
I throw this out as a question assuming that perhaps there is a specific reason C# disallows the if (maybeNullObject) syntax. I would think however that the compiler could easily convert an object expression such as if (maybeNullObject) automatically (or automagically) to if (maybeNullObject != null).
Great reference to this question is How an idea becomes a C# language feature?.
Edit
The short null check syntax that I am suggesting would only apply to objects. The short null check would not apply to primitives and types like bool?.
Because if statements in C# are strict. They take only boolean values, nothing else, and there are no further levels of "truthiness" - 0, null, and the like are their own animals, and no implicit conversion to bool exists for them.
The compiler could "easily convert" almost any expression to a boolean, but that can cause subtle problems (believe me...) and a conscious decision was made to disallow these implicit conversions.
IMO this was a good choice. You are essentially asking for a one-off implicit conversion where the compiler assumes that, if the expression does not return a boolean result, then the programmer must have wanted to perform a null check. Aside from being a very narrow feature, it is purely syntactic sugar and provides little to no appreciable benefit. As Eric Lippert would say, every feature has a cost...
You are asking for a feature which adds needless complexity to the language (yes, it is complex because a type may define an implicit conversion to bool. If that is the case, which check is performed?) only to allow you to not type != null once in a while.
EDIT:
Here's an example of how to define an implicit conversion to bool, for @Sam (too long for comments).
class Foo
{
    public int SomeVar;

    public Foo(int i)
    {
        SomeVar = i;
    }

    public static implicit operator bool(Foo f)
    {
        return f.SomeVar != 0;
    }
}

static void Main()
{
    var f = new Foo(1);
    if (f)
    {
        Console.Write("It worked!");
    }
}
One potential collision is with a reference object that defines an implicit conversion to bool.
There is no delineation for the compiler between if(myObject) checking for null or checking for true.
The intent is to leave no ambiguity. You may find it tedious, but that shorthand has been responsible for a number of bugs over the years. C# rightly has a dedicated type for booleans, and it was a conscious decision not to make 0 mean false and every other value true.
You could write an extension method against System.Object, perhaps called IsNull()?
Of course, that's still an extra 8 or 9 characters on top of the code you'd have to write for the extension class. I think most people are happy with the clarity that an explicit null test brings.
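For what it's worth, a minimal sketch of that extension (the IsNull name is just the suggestion above):

public static class ObjectExtensions
{
    // An extension method is compiled to a static call, so invoking
    // it on a null reference does not throw.
    public static bool IsNull(this object o)
    {
        return o == null;
    }
}

// Usage:
// if (!maybeNullObject.IsNull()) { ... }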
1) According to my book, the is operator (in an expression of the form E is T) can report that expression E is convertible to the target type only when the conversion is a reference conversion, a boxing conversion, or an unboxing conversion. Since in the following example is doesn't check for any of those three kinds of conversion, the code shouldn't work, but it does:
long l; // EDIT - I forgot to add this line of code in my initial post
int i = 100;
if (i is long) // EDIT - in my initial post I claimed the condition returns true, but it really returns false
    l = i;
2)
a)
B b;
A a = new A();
if (a is B)
    b = (B)a;
int i = b.l;

class A { public int l = 100; }
class B : A { }
The above code always causes the compile-time error "Use of unassigned local variable". If the condition a is B evaluates to false, then b won't be assigned a value, but if the condition is true, it will. So by allowing such code the compiler would have no way of knowing whether the use of b in the code following the if statement is valid (since it can't tell whether a is B evaluates to true or false) - but why should it have to know that? Instead, why couldn't the runtime handle this?
b) But if instead we're dealing with non-reference types, the compiler doesn't complain, even though the code is identical. Why?
int i = 100;
long l;
if (i is long)
    l = i;
Thank you.
This has nothing to do with the is operator. The compiler sees that there are two possible paths, only one of which will assign a value to b.
When dealing with value types, the compiler knows that l gets implicitly initialized to the value 0.
The real difference is that in the int case, you are talking about the definite assignment of a field (l). Fields are always definitely assigned (even without the =100). In the B case, you are talking about the definite assignment of the local variable (b); local variables do not start as definitely assigned.
That's all it is.
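A small sketch of that field/local difference:

class C
{
    long field; // a field is definitely assigned: it defaults to 0

    void M()
    {
        long local; // a local starts out unassigned
        Console.WriteLine(field);    // fine
        // Console.WriteLine(local); // CS0165: use of unassigned local variable
    }
}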
int i=100;
if (i is long) //returns true, indicating that conversion is possible
1: I don't think this returns true at all; for me it shows an IDE warning about never being true. Looking in Reflector, the compiler completely removes this branch. I guess the compiler is obliged to at least compile it on the grounds that it could (in theory) box and test - but it already knows the answer, so it snips it.
2: I still get the "unassigned variable" compiler error; due to "definite assignment"
The compiler behaves correctly - why should it compile without errors if there is a use of an unassigned variable? You cannot work with b.l if b is unassigned; the compiler sees that there is a code path that does not assign b, which is why it raises an error.
In your code, class B derives from A. This means:
a is B // evaluates to false
b is A // evaluates to true
This means that the body of the if block won't be entered, and b will not be assigned.
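In other words:

A a = new A();
Console.WriteLine(a is B);  // False: the runtime type is A, not B

A a2 = new B();
Console.WriteLine(a2 is B); // True: the runtime type really is B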
Stephen Cleary also has a point. I don't know how sophisticated the compiler is when evaluating if values are assigned.
Okay, here's what MSDN says about is:
The is operator is used to check whether the run-time type of an object is compatible with a given type.
An is expression evaluates to true if both of the following conditions are met:
expression is not null.
expression can be cast to type. That is, a cast expression of the form (type)(expression) will complete without throwing an exception.
That fits pretty well with 1, but 2 is another topic, and the compiler is correct there (think about it).
However, the following code writes 0 to the output:
int i = 1;
long l = 0;
if (i is long)
{
    l = i;
}
Console.WriteLine(l);
Therefore it seems that the note in the is MSDN documentation is correct as well:
Note that the is operator only considers reference conversions, boxing conversions, and unboxing conversions. Other conversions, such as user-defined conversions, are not considered by the is operator.
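A quick demonstration of that note:

int i = 100;
Console.WriteLine(i is long);     // False (with a compile-time warning):
                                  // int -> long is a numeric conversion,
                                  // which 'is' does not consider

object boxed = i;
Console.WriteLine(boxed is int);  // True: an unboxing conversion exists
Console.WriteLine(boxed is long); // False: a boxed int cannot unbox to long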