"is" operator behaving a bit strangely - c#

1) According to my book, the is operator (in an expression of the form E is type) can check whether E can be converted to the target type only if the conversion is a reference conversion, a boxing conversion, or an unboxing conversion. Since in the following example is doesn’t check for any of these three kinds of conversion, the code shouldn’t work, but it does:
long l; // EDIT - I forgot to add this line of code in my initial post
int i=100;
if (i is long) //EDIT - in my initial post I've claimed condition returns true, but it really returns false
l = i;
2)
a)
B b;
A a = new A();
if (a is B)
b = (B)a;
int i = b.l;
class A { public int l = 100; }
class B:A { }
The above code always causes the compile-time error “Use of unassigned variable”. If the condition a is B evaluates to false, then b won’t be assigned a value, but if the condition is true, then it will. So if the compiler allowed such code, it would have no way of knowing whether the use of b after the if statement is valid (because it doesn’t know whether a is B evaluates to true or false). But why should it need to know that? Why couldn’t the runtime handle this instead?
b) But if we’re dealing with non-reference types instead, the compiler doesn’t complain, even though the code is identical. Why?
int i = 100;
long l;
if (i is long)
l = i;
thank you

This has nothing to do with the is operator. The compiler sees that there are two possible paths, only one of which will assign a value to b.
When dealing with value types, the compiler knows that l gets implicitly initialized to the value 0.

The real difference is that in the int case, you are talking about the definite assignment of a field (l). Fields are always definitely assigned (even without the =100). In the B case, you are talking about the definite assignment of the local variable (b); local variables do not start as definitely assigned.
That's all it is.
int i=100;
if (i is long) //returns true, indicating that conversion is possible
1: I don't think this returns true at all; for me it shows an IDE warning about never being true. Looking in reflector, the compiler completely removes this branch. I guess the compiler is obliged to at least compile on the grounds that it could (in theory) box and test. But it already knows the answer, so it snips it.
2: I still get the "unassigned variable" compiler error; due to "definite assignment"

The compiler behaves correctly - why should it compile without errors if there is a use of an unassigned variable? You cannot work with b.l if b is unassigned as the compiler checks that there is a code path that does not instantiate b which is why it throws an error ...
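One way to satisfy the definite-assignment check is to assign b on every path, for example with the as operator. The following is only a minimal sketch: the classes mirror A and B from the question, while the class name DefiniteAssignmentDemo and the choice to construct a B instance are assumptions made for illustration.

```csharp
using System;

class A { public int l = 100; }
class B : A { }

class DefiniteAssignmentDemo
{
    static void Main()
    {
        A a = new B();

        // "as" always assigns b: either the B instance or null,
        // so the compiler can prove b is definitely assigned.
        B b = a as B;

        if (b != null)
            Console.WriteLine(b.l);   // prints 100
    }
}
```

With A a = new A(); instead, b would simply be null and the if body would be skipped, still without a compile-time error.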

In your code, class B derives from A. This means:
a is B // evaluates to false
b is A // evaluates to true
This means that the body of the if block won't be entered, and b will not be assigned.
Stephen Cleary also has a point. I don't know how sophisticated the compiler is when evaluating if values are assigned.

Okay, the MSDN says on is:
The is operator is used to check whether the run-time type of an object is compatible with a given type.
An is expression evaluates to true if both of the following conditions are met:
expression is not null.
expression can be cast to type. That is, a cast expression of the form (type)(expression) will complete without throwing an exception.
That would fit pretty well with 1, but 2 is another topic and correct (think about it).
However, the following code writes 0 to the output:
int i = 1;
long l = 0;
if (i is long) {
l = i;
}
Console.WriteLine(l);
Therefore it seems that the note in the is MSDN documentation is correct as well:
Note that the is operator only considers reference conversions, boxing conversions, and unboxing conversions. Other conversions, such as user-defined conversions, are not considered by the is operator.
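The difference between the conversions that is considers and the ones it ignores can be seen with a boxed value. This is a small sketch, not taken from the documentation; the class name IsOperatorDemo is made up for illustration.

```csharp
using System;

class IsOperatorDemo
{
    static void Main()
    {
        int i = 100;
        object boxed = i;   // boxing conversion

        // Unboxing to the exact runtime type is considered.
        Console.WriteLine(boxed is int);          // True

        // The implicit numeric conversion int -> long is not considered.
        Console.WriteLine(boxed is long);         // False

        // Reference conversion to an interface implemented by Int32 is considered.
        Console.WriteLine(boxed is IComparable);  // True
    }
}
```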


Roslyn failed to compile code

After I have migrated my project from VS2013 to VS2015 the project no longer builds. A compilation error occurs in the following LINQ statement:
static void Main(string[] args)
{
decimal a, b;
IEnumerable<dynamic> array = new string[] { "10", "20", "30" };
var result = (from v in array
where decimal.TryParse(v, out a) && decimal.TryParse("15", out b) && a <= b // Error here
orderby decimal.Parse(v)
select v).ToArray();
}
The compiler returns an error:
Error CS0165 Use of unassigned local variable 'b'
What causes this issue? Is it possible to fix it through a compiler setting?
What causes this issue?
Looks like a compiler bug to me. At least, it did. Although the decimal.TryParse(v, out a) and decimal.TryParse(v, out b) expressions are evaluated dynamically, I expected the compiler to still understand that by the time it reaches a <= b, both a and b are definitely assigned. Even with the weirdnesses you can come up with in dynamic typing, I'd expect to only ever evaluate a <= b after evaluating both of the TryParse calls.
However, it turns out that through operator and conversion trickery, it's entirely feasible to have an expression A && B && C which evaluates A and C but not B - if you're cunning enough. See the Roslyn bug report for Neal Gafter's ingenious example.
Making that work with dynamic is even harder - the semantics involved when the operands are dynamic are harder to describe, because in order to perform overload resolution, you need to evaluate operands to find out what types are involved, which can be counter-intuitive. However, again Neal has come up with an example which shows that the compiler error is required... this isn't a bug, it's a bug fix. Huge amounts of kudos to Neal for proving it.
Is it possible to fix it through compiler settings?
No, but there are alternatives which avoid the error.
Firstly, you could stop it from being dynamic - if you know that you'll only ever use strings, then you could use IEnumerable<string> or give the range variable v a type of string (i.e. from string v in array). That would be my preferred option.
If you really need to keep it dynamic, just give b a value to start with:
decimal a, b = 0m;
This won't do any harm - we know that actually your dynamic evaluation won't do anything crazy, so you'll still end up assigning a value to b before you use it, making the initial value irrelevant.
Additionally, it seems that adding parentheses works too:
where decimal.TryParse(v, out a) && (decimal.TryParse("15", out b) && a <= b)
That changes the point at which various pieces of overload resolution are triggered, and happens to make the compiler happy.
There is one issue still remaining - the spec's rules on definite assignment with the && operator need to be clarified to state that they only apply when the && operator is being used in its "regular" implementation with two bool operands. I'll try to make sure this is fixed for the next ECMA standard.
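As a sketch of the first alternative (giving the range variable a static type), the query from the question could be written as follows. Only the from string v clause and the final Console.WriteLine are additions; everything else is the code from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        decimal a, b;
        IEnumerable<dynamic> array = new string[] { "10", "20", "30" };

        // "from string v" makes v statically typed, so the && chain is
        // ordinary bool logic and the usual definite-assignment rules apply.
        var result = (from string v in array
                      where decimal.TryParse(v, out a) && decimal.TryParse("15", out b) && a <= b
                      orderby decimal.Parse(v)
                      select v).ToArray();

        Console.WriteLine(string.Join(",", result));   // prints 10
    }
}
```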
This does appear to be a bug, or at the least a regression, in the Roslyn compiler. The following bug has been filed to track it:
https://github.com/dotnet/roslyn/issues/4509
In the meantime, Jon's excellent answer has a couple of workarounds.
Since I got schooled so hard in the bug report, I'm going to try to explain this myself.
Imagine T is some user-defined type with an implicit cast to bool that alternates between false and true, starting with false. As far as the compiler knows, the dynamic first argument to the first && might evaluate to that type, so it has to be pessimistic.
If, then, it let the code compile, this could happen:
When the dynamic binder evaluates the first &&, it does the following:
Evaluate the first argument
It's a T - implicitly cast it to bool.
Oh, it's false, so we don't need to evaluate the second argument.
Make the result of the && evaluate as the first argument. (No, not false, for some reason.)
When the dynamic binder evaluates the second &&, it does the following:
Evaluate the first argument.
It's a T - implicitly cast it to bool.
Oh, it's true, so evaluate the second argument.
... Oh crap, b isn't assigned.
In spec terms, in short, there are special "definite assignment" rules that let us say not only whether a variable is "definitely assigned" or "not definitely assigned", but also whether it is "definitely assigned after false expression" or "definitely assigned after true expression".
These exist so that when dealing with && and || (and ! and ?? and ?:) the compiler can examine whether variables may be assigned in particular branches of a complex boolean expression.
However, these only work while the expressions' types remain boolean. When part of the expression is dynamic (or a non-boolean static type) we can no longer reliably say that the expression is true or false - the next time we cast it to bool to decide which branch to take, it may have changed its mind.
Update: this has now been resolved and documented:
The definite assignment rules implemented by previous compilers for dynamic expressions allowed some cases of code that could result in variables being read that are not definitely assigned. See https://github.com/dotnet/roslyn/issues/4509 for one report of this.
...
Because of this possibility the compiler must not allow this program to be compiled if val has no initial value. Previous versions of the compiler (prior to VS2015) allowed this program to compile even if val has no initial value. Roslyn now diagnoses this attempt to read a possibly uninitialized variable.
This is not a bug. See https://github.com/dotnet/roslyn/issues/4509#issuecomment-130872713 for an example of how a dynamic expression of this form can leave such an out variable unassigned.

Do short-circuiting operators || and && exist for nullable booleans? The RuntimeBinder sometimes thinks so

I read the C# Language Specification on the Conditional logical operators || and &&, also known as the short-circuiting logical operators. To me it seemed unclear if these existed for nullable booleans, i.e. the operand type Nullable<bool> (also written bool?), so I tried it with non-dynamic typing:
bool a = true;
bool? b = null;
bool? xxxx = b || a; // compile-time error, || can't be applied to these types
That seemed to settle the question (I could not understand the specification clearly, but assuming the implementation of the Visual C# compiler was correct, now I knew).
However, I wanted to try with dynamic binding as well. So I tried this instead:
static class Program
{
static dynamic A
{
get
{
Console.WriteLine("'A' evaluated");
return true;
}
}
static dynamic B
{
get
{
Console.WriteLine("'B' evaluated");
return null;
}
}
static void Main()
{
dynamic x = A | B;
Console.WriteLine((object)x);
dynamic y = A & B;
Console.WriteLine((object)y);
dynamic xx = A || B;
Console.WriteLine((object)xx);
dynamic yy = A && B;
Console.WriteLine((object)yy);
}
}
The surprising result is that this runs without exception.
Well, x and y are not surprising, their declarations lead to both properties being retrieved, and the resulting values are as expected, x is true and y is null.
But the evaluation for xx of A || B led to no binding-time exception, and only the property A was read, not B. Why does this happen? As you can tell, we could change the B getter to return a crazy object, like "Hello world", and xx would still evaluate to true without binding problems...
Evaluating A && B (for yy) also leads to no binding-time error. And here both properties are retrieved, of course. Why is this allowed by the run-time binder? If the returned object from B is changed to a "bad" object (like a string), a binding exception does occur.
Is this correct behavior? (How can you infer that from the spec?)
If you try B as first operand, both B || A and B && A give runtime binder exception (B | A and B & A work fine as everything is normal with non-short-circuiting operators | and &).
(Tried with C# compiler of Visual Studio 2013, and runtime version .NET 4.5.2.)
First of all, thanks for pointing out that the spec isn't clear on the non-dynamic nullable-bool case. I will fix that in a future version. The compiler's behavior is the intended behavior; && and || are not supposed to work on nullable bools.
The dynamic binder does not seem to implement this restriction, though. Instead, it binds the component operations separately: the &/| and the ?:. Thus it's able to muddle through if the first operand happens to be true or false (which are boolean values and thus allowed as the first operand of ?:), but if you give null as the first operand (e.g. if you try B && A in the example above), you do get a runtime binding exception.
If you think about it, you can see why we implemented dynamic && and || this way instead of as one big dynamic operation: dynamic operations are bound at runtime after their operands are evaluated, so that the binding can be based on the runtime types of the results of those evaluations. But such eager evaluation defeats the purpose of short-circuiting operators! So instead, the generated code for dynamic && and || breaks the evaluation up into pieces and will proceed as follows:
Evaluate the left operand (let's call the result x)
Try to turn it into a bool via implicit conversion, or the true or false operators (fail if unable)
Use x as the condition in a ?: operation
In the true branch, use x as a result
In the false branch, now evaluate the second operand (let's call the result y)
Try to bind the & or | operator based on the runtime type of x and y (fail if unable)
Apply the selected operator
This is the behavior that lets through certain "illegal" combinations of operands: the ?: operator successfully treats the first operand as a non-nullable boolean, the & or | operator successfully treats it as a nullable boolean, and the two never coordinate to check that they agree.
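The steps above can be imitated with static types to see why the combination slips through. This is only an analogue, not the code the compiler generates for dynamic: the helper names Or and And are hypothetical, the first operand is fixed to bool (standing in for the ?: condition), and the second is a bool? supplied lazily through a delegate.

```csharp
using System;

static class ShortCircuitSketch
{
    // x || y: test x as a plain bool, and only if it is false
    // evaluate y and apply the lifted | operator on bool?.
    static bool? Or(bool x, Func<bool?> y)
    {
        return x ? x : (x | y());
    }

    // x && y: the mirror image, using the lifted & operator.
    static bool? And(bool x, Func<bool?> y)
    {
        return !x ? x : (x & y());
    }

    static void Main()
    {
        bool? xx = Or(true, () => { Console.WriteLine("y evaluated"); return null; });
        Console.WriteLine(xx);            // True; "y evaluated" is never printed

        bool? yy = And(true, () => null);
        Console.WriteLine(yy.HasValue);   // False, because true & null is null
    }
}
```

In this analogue a null first operand is ruled out by the parameter type; in the dynamic case the same situation surfaces as the runtime binding exception described above.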
So it's not that dynamic && and || work on nullables. It's just that they happen to be implemented in a way that is a little bit too lenient, compared with the static case. This should probably be considered a bug, but we will never fix it, since that would be a breaking change. Also it would hardly help anyone to tighten the behavior.
Hopefully this explains what happens and why! This is an intriguing area, and I often find myself baffled by the consequences of the decisions we made when we implemented dynamic. This question was delicious - thanks for bringing it up!
Mads
Is this correct behavior?
Yes, I'm pretty sure it is.
How can you infer that from the spec?
Section 7.12 of C# Specification Version 5.0, has information regarding the conditional operators && and || and how dynamic binding relates to them. The relevant section:
If an operand of a conditional logical operator has the compile-time type dynamic, then the expression is dynamically bound (§7.2.2). In this case the compile-time type of the expression is dynamic, and the resolution described below will take place at run-time using the run-time type of those operands that have the compile-time type dynamic.
This is the key point that answers your question, I think. What is the resolution that happens at run-time? Section 7.12.2, User-Defined conditional logical operators explains:
The operation x && y is evaluated as T.false(x) ? x : T.&(x, y), where T.false(x) is an invocation of the operator false declared in T, and T.&(x, y) is an invocation of the selected operator &
The operation x || y is evaluated as T.true(x) ? x : T.|(x, y), where T.true(x) is an invocation of the operator true declared in T, and T.|(x, y) is an invocation of the selected operator |.
In both cases, the first operand x will be converted to a bool using the false or true operators. Then the appropriate logical operator is called. With this in mind, we have enough information to answer the rest of your questions.
But the evaluation for xx of A || B lead to no binding-time exception, and only the property A was read, not B. Why does this happen?
For the || operator, we know it follows true(A) ? A : |(A, B). We short circuit, so we won't get a binding time exception. Even if A was false, we would still not get a runtime binding exception, because of the specified resolution steps. If A is false, we then do the | operator, which can successfully handle null values, per Section 7.11.4.
Evaluating A && B (for yy) also leads to no binding-time error. And here both properties are retrieved, of course. Why is this allowed by the run-time binder? If the returned object from B is changed to a "bad" object (like a string), a binding exception does occur.
For similar reasons, this one also works. && is evaluated as false(x) ? x : &(x, y). A can be successfully converted to a bool, so there is no issue there. Because B is null, the & operator is lifted (Section 7.3.7) from the one that takes a bool to one that takes the bool? parameters, and thus there is no runtime exception.
For both conditional operators, if B is anything other than a bool (or a null dynamic), runtime binding fails because it can't find an overload that takes a bool and a non-bool as parameters. However, this only happens if A fails to satisfy the first conditional for the operator (true for ||, false for &&). The reason this happens is because dynamic binding is quite lazy. It won't try to bind the logical operator unless A is false and it has to go down that path to evaluate the logical operator. Once A fails to satisfy the first condition for the operator, it will fail with the binding exception.
If you try B as first operand, both B || A and B && A give runtime binder exception.
Hopefully, by now, you already know why this happens (or I did a bad job explaining). The first step in resolving this conditional operator is to take the first operand, B, and use one of the bool conversion operators (false(B) or true(B)) before handling the logical operation. Of course, B, being null cannot be converted to either true or false, and so the runtime binding exception happens.
The Nullable type does not define Conditional logical operators || and &&.
I suggest the following code:
bool a = true;
bool? b = null;
bool? xxxxOR = (b.HasValue == true) ? (b.Value || a) : a;
bool? xxxxAND = (b.HasValue == true) ? (b.Value && a) : false;

Warning CS0219 on unused local variable depends on Nullable<> syntax?

Consider this code:
static void Main()
{
int? a = new int?(12); // warning CS0219: The variable 'a' is assigned but its value is never used
int? b = 12; // warning CS0219: The variable 'b' is assigned but its value is never used
int? c = (int?)12; // (no warning for 'c'?)
}
The three variables a, b and c are really equivalent. In the first one, we call the public instance constructor on Nullable<> explicitly. In the second case, we utilize the implicit conversion from T to T?. And in the third case we write that conversion explicitly.
My question is, why will the Visual C# 5.0 compiler (from VS2013) not emit a warning for c the same way it does for the first two variables?
The IL code produced is the same in all three cases, both with Debug (no optimizations) and with Release (optimizations).
Not sure if this warning is covered by the Language Specification. Otherwise, it is "valid" for the C# compiler to be inconsistent like this, but I wanted to know what the reason is.
PS! If one prefers the var keyword a lot, it is actually plausible to write var c = (int?)12; where the cast syntax is needed to make var work as intended.
PPS! I am aware that no warning is raised in cases like int? neverUsed = MethodCallThatMightHaveSideEffects();, see another thread.

Unexpected operator is/as behavior when casting an int[] to object

Could someone please explain why this is happening?
var y = new int[]{1,2};
Console.WriteLine(y is uint[]); // false
Console.WriteLine(((object)y) is uint[]); // true
In C# you can't cast an int[] to a uint[], so the first test fails because it is compiled to the constant false.
However, the int[] to uint[] cast is allowed by the CLR. The second check cannot be decided by the compiler and therefore must be evaluated at runtime. As you've dodged the compiler checks, the CLR allows it.
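A short sketch of the runtime side of this, going through object so that the compiler cannot fold the check to a constant. The class name ArrayCastDemo is made up for illustration, and the results noted in the comments are the behavior described in this answer.

```csharp
using System;

class ArrayCastDemo
{
    static void Main()
    {
        int[] y = { 1, 2 };
        object o = y;   // hides the static type from the compiler

        // Evaluated by the CLR at runtime, which treats int[] and uint[] as compatible.
        Console.WriteLine(o is uint[]);   // True

        // The cast goes through the same runtime check, so it succeeds as well.
        uint[] u = (uint[])o;
        Console.WriteLine(u[0]);          // 1
    }
}
```

Note that u and y refer to the same array object; no elements are copied or converted.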

Operator '=' chaining in C# - surely this test should pass?

I was just writing a property setter and had a brain-wave about why we don't have to return the result of a set when a property might be involved in operator = chaining, i.e:
var a = (b.c = d);
(I've added the brackets for clarity - but it makes no difference in practice)
I started thinking - where does the C# compiler derive the value that is assigned to a in the above example?
Logic says that it should be from the result of the (b.c = d) operation but since that's implemented with a void set_blah(value) method it can't be.
So the only other options are:
Re-read b.c after the assignment and use that value
Re-use d
Edit (since answered and comments from Eric) - there's a third option, which is what C# does: use the value written to b.c after any conversions have taken place
Now, to my mind, the correct reading of the above line of code is
set a to the result of setting b.c to d
I think that's a reasonable reading of the code - so I thought I'd test whether that is indeed what happens with a slightly contrived test - but ask yourself if you think it should pass or fail:
public class TestClass
{
private bool _invertedBoolean;
public bool InvertedBoolean
{
get
{
return _invertedBoolean;
}
set
{
//don't ask me why you would with a boolean,
//but consider rounding on currency values, or
//properties which clone their input value instead
//of taking the reference.
_invertedBoolean = !value;
}
}
}
[TestMethod]
public void ExampleTest()
{
var t = new TestClass();
bool result;
result = (t.InvertedBoolean = true);
Assert.IsFalse(result);
}
This test fails.
Closer examination of the IL that is generated for the code shows that the true value is loaded on to the stack, cloned with a dup command and then both are popped off in two successive assignments.
This technique works perfectly for fields, but to me seems terribly naive for properties where each is actually a method call where the actual final property value is not guaranteed to be the input value.
Now I know many people hate nested assignments etc etc, but the fact is the language lets you do them and so they should work as expected.
Perhaps I'm being really thick but to me this suggests an incorrect implementation of this pattern by the compiler (.Net 4 btw). But then is my expectation/reading of the code incorrect?
The result of an assignment x = {expr} is defined as the value evaluated from {expr}.
§14.14.1 Simple assignment (ECMA334 v4)
...
The result of a simple assignment expression is the value assigned to
the left operand. The result has the same type as the left operand,
and is always classified as a value.
...
And note that the value assigned is the value already evaluated from d. Hence the implementation here is:
var tmp = (TypeOfC)d;
b.c = tmp;
a = tmp;
although I would also expect with optimisations enabled it will use the dup instruction rather than a local variable.
I find it interesting that your expectation is that the crazy assignment -- that is, assigning two different values because one of them is an extremely weird property with unusual behaviour -- is the desirable state of affairs.
As you've deduced, we do everything in our power to avoid that state. That is a good thing. When you say "x = y = z" then if at all possible we should guarantee that x and y end up assigned the same value -- that of z -- even if y is some crazy thing that doesn't hold the value you give it. "x = y = z" should logically be like "y = z, x = z", except that z is only evaluated once. y doesn't come into the matter at all when assigning to x; why should it?
Also, of course when doing "x = y = z" we cannot consistently "reuse" y because y might be a write only property. What if there is no getter to read the value from?
Also, I note that you say "this works for fields" -- not if the field is volatile it doesn't. You have no guarantee whatsoever that the value you assigned is the value that the field takes on if it is a volatile field. You have no guarantee that the value you read from a volatile field in the past is the value of the field now.
For more thoughts on this subject, see my article:
http://blogs.msdn.com/b/ericlippert/archive/2010/02/11/chaining-simple-assignments-is-not-so-simple.aspx
The assignment operator is documented to return the result of evaluating its second operand (in this case, d). It doesn't matter that it also assigns this value to its first operand, and that this assignment is done by calling a method that returns void.
The spec says:
14.14.1 Simple assignment
The = operator is called the simple assignment operator. In a simple assignment, the right operand shall
be an expression of a type that is implicitly convertible to the type
of the left operand. The operation assigns the value of the right
operand to the variable, property, or indexer element given by the
left operand. The result of a simple assignment expression is the
value assigned to the left operand. The result has the same type as
the left operand, and is always classified as a value.
So actually what happens is:
d is evaluated (let's call the value produced val)
the result is assigned to b.c
the assignment operator's result is val
a is assigned the value val
the second assignment operator's result is also val (but since the whole expression ends here, it goes unused)
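The steps above can be observed with a property whose setter stores something other than the value it is given. This is a small sketch with made-up names (Clamped, Value, ChainingDemo), using clamping instead of the question's inverted boolean.

```csharp
using System;

class Clamped
{
    private int _value;

    public int Value
    {
        get { return _value; }
        // Stores at most 10, whatever value is passed in.
        set { _value = Math.Min(value, 10); }
    }
}

class ChainingDemo
{
    static void Main()
    {
        var c = new Clamped();

        // 42 is evaluated once; that value is passed to the setter and is
        // also the result of the inner assignment. The getter is never called here.
        int a = c.Value = 42;

        Console.WriteLine(a);         // prints 42
        Console.WriteLine(c.Value);   // prints 10
    }
}
```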
