I have come across a piece of code that I can't wrap my head around:
public void Connect()
{
if (!(!string.IsNullOrEmpty(_vpnConnectionName) & !string.IsNullOrEmpty(_ipToPing)))
{
return;
}
GetConnected();
}
I've researched the single ampersand on SO and elsewhere, and found it is a Bitwise AND operand. However the explanations all revolve around integers.
I can understand the applicability of bitwise operations on integers, but how are they applicable to booleans?
It is a logical operator on bools that evaluates both sides regardless of the value of the left side.
For bool, & isn't a bitwise AND operator - it's the non-short-circuiting logical AND operator - it always evaluates both arguments - as opposed to the "usual" && operator that only evaluates its second argument if its first argument is true.
(Realised that it may not be obvious, that the first & above is linked to the actual documentation)
Related
Context:
I'm learning C# and have been messing about on the Pex for fun site. The site challenges you to re-implement a secret algorithm, by typing code into the site and examining how the inputs and outputs differ between your implementation and the secret implementation.
Problem:
Anyway, I got stuck on a basic code duel called XAndY.
From the name it seemed obvious that the answer was just:
public static bool Puzzle(bool x, bool y)
{
return x && y;
}
However, this was incorrect and Pex informed me that the following inputs produced a different result than the secret implementation:
Input:
x:true y:true (0x02)
Output:
my implementation: true (0x02)
secret implementation: false
Mismatch Your puzzle method produced the wrong result.
Code:
Puzzle(true, PexSafeHelpers.ByteToBoolean((byte)2));
After a lot of confusion trying to compare different types of true, I realised that the implementation that Pex was looking for was actually just using a bitwise AND:
return x & y;
Questions:
I thought that for both semantic and short-circuiting reasons you should use logical && for comparing boolean values, but regardless:
Does this mean that x & y and x && y definitively do not have the same outputs for all possible bool arguments? (or could it be something buggy in Pex?)
Does this mean you can differentiate between different values of bool true in C#? If so, how?
The puzzle is exploiting what, in my opinion, is a bug in the C# compiler. (The bug affects VB.NET as well.)
In the C# 5.0 specification, §4.1.8 says that "The possible values of type bool are true and false", and §7.11.3 says that operator &(bool x, bool y) is a logical operator:
The result of x & y is true if both x and y are true. Otherwise, the result is false.
It's obviously a violation of the specification for true & true to yield false. What's going on?
At run time, a bool is represented by a 1-byte integer. The C# compiler uses 0 to represent false and 1 to represent true. To implement the & operator, the C# compiler emits a bitwise AND instruction in the generated IL. At first glance, this seems to be okay: bitwise AND operations involving 0 and 1 correspond exactly with logical AND operations involving false and true.
However, §III.1.1.2 of the CLI specification explicitly allows a bool to be represented by an integer other than 0 or 1:
A CLI Boolean type occupies 1 byte in memory. A bit pattern of all zeroes denotes a value of false. A bit pattern with any one or more bits set (analogous to a non-zero integer) denotes a value of true.
By going beyond the scope of C#, it is indeed possible—and perfectly legal—to create a bool whose value is, say, 2, thus causing & to behave unexpectedly. This is what the Pex site is doing.
Here's a demonstration:
using System;
using System.Reflection.Emit;
class Program
{
static void Main()
{
DynamicMethod method =
new DynamicMethod("ByteToBoolean", typeof(bool), new[] { typeof(byte) });
ILGenerator il = method.GetILGenerator();
il.Emit(OpCodes.Ldarg_0); // Load the byte argument...
il.Emit(OpCodes.Ret); // and "cast" it directly to bool.
var byteToBoolean =
(Func<byte, bool>)method.CreateDelegate(typeof(Func<byte, bool>));
bool x = true;
bool y = byteToBoolean(2);
Console.WriteLine(x); // True
Console.WriteLine(y); // True
Console.WriteLine(x && y); // True
Console.WriteLine(x & y); // False (!) because 1 & 2 == 0
Console.WriteLine(y.Equals(false)); // False
Console.WriteLine(y.Equals(true)); // False (!) because 2 != 1
}
}
So the answers to your questions are:
Currently, it's possible for x & y and x && y to have different values. However, this behavior violates the C# specification.
Currently, you can use Boolean.Equals (as shown above) to differentiate between true values. However, this behavior violates the CLI specification of Boolean.Equals.
I noticed that this issue has been raised to the Roslyn compiler group. Further discussion.
It had the following resolution:
This is effectively by design and you can find the document detailing
that here:
https://github.com/dotnet/roslyn/blob/master/docs/compilers/Boolean%20Representation.md
You can see more details and past conversations about this here:
#24652
The noted document states:
Representation of Boolean Values
The C# and VB compilers represent true (True) and false (False) bool
(Boolean) values with the single byte values 1 and 0, respectively,
and assume that any boolean values that they are working with are
restricted to being represented by these two underlying values. The
ECMA 335 CLI specification permits a "true" boolean value to be
represented by any nonzero value. If you use boolean values that have
an underlying representation other than 0 or 1, you can get unexpected
results. This can occur in unsafe code in C#, or by interoperating
with a language that permits other values. To avoid these unexpected
results, it is the programmer's responsibility to normalize such
incoming values.
(All italics are my own emphasis)
& in C# is not a bitwise operator, assuming that the input values are Boolean values. It is overloaded. There are two entirely separate implementations of the operator. A non-short circuiting logical boolean operator if the inputs are booleans, and a bitwise AND if the values are non-boolean values.
In the code that you have shown, then input is a boolean variable. It's not a numeric value, it's not an expression that resolves to a boolean value (which may have side effects), or anything else.
When the input is two boolean variables there is never going to be any different in the output between & and &&. The only way to have any observable difference between these two is to have a boolean expression that is more complex than just resolving a variable to its value, or some non-boolean input.
If the operands could be of some type other than bool then its pretty trivial to provide a type that has different results for either operator, such as a super mean type that overrides the true operator in a manor inconsistent with it's implicit conversion to bool:
Since the implementation to satisfy the puzzle is entirely up to you, why dont you try:
public static bool Puzzle(bool x, bool y)
{
return x & !y;
}
In VB.NET this happens:
Dim x As System.Nullable(Of Decimal) = Nothing
Dim y As System.Nullable(Of Decimal) = Nothing
y = 5
If x <> y Then
Console.WriteLine("true")
Else
Console.WriteLine("false") '' <-- I got this. Why?
End If
But in C# this happens:
decimal? x = default(decimal?);
decimal? y = default(decimal?);
y = 5;
if (x != y)
{
Debug.WriteLine("true"); // <-- I got this -- I'm with you, C# :)
}
else
{
Debug.WriteLine("false");
}
Why is there a difference?
VB.NET and C#.NET are different languages, built by different teams who have made different assumptions about usage; in this case the semantics of a NULL comparison.
My personal preference is for the VB.NET semantics, which in essence gives NULL the semantics "I don't know yet". Then the comparison of 5 to "I don't know yet". is naturally "I don't know yet"; ie NULL. This has the additional advantage of mirroring the behaviour of NULL in (most if not all) SQL databases. This is also a more standard (than C#'s) interpretation of three-valued logic, as explained here.
The C# team made different assumptions about what NULL means, resulting in the behaviour difference you show. Eric Lippert wrote a blog about the meaning of NULL in C#. Per Eric Lippert: "I also wrote about the semantics of nulls in VB / VBScript and JScript here and here".
In any environment in which NULL values are possible, it is imprtant to recognize that the Law of the Excluded Middle (ie that A or ~A is tautologically true) no longer can be relied on.
Update:
A bool (as opposed to a bool?) can only take the values TRUE and FALSE. However a language implementation of NULL must decide on how NULL propagates through expressions. In VB the expressions 5=null and 5<>null BOTH return false. In C#, of the comparable expressions 5==null and 5!=null only the second first [updated 2014-03-02 - PG] returns false. However, in ANY environment that supports null, it is incumbent on the programmer to know the truth tables and null-propagation used by that language.
Update
Eric Lippert's blog articles (mentioned in his comments below) on semantics are now at:
Sep. 30, 2003 - A Whole Lot of Nothing
Oct. 1, 2003 - A Little More on Nothing
Because x <> y returns Nothing instead of true. It is simply not defined since x is not defined. (similar to SQL null).
Note: VB.NET Nothing <> C# null.
You also have to compare the value of a Nullable(Of Decimal) only if it has a value.
So the VB.NET above compares similar to this(which looks less incorrect):
If x.HasValue AndAlso y.HasValue AndAlso x <> y Then
Console.WriteLine("true")
Else
Console.WriteLine("false")
End If
The VB.NET language specification:
7.1.1 Nullable Value Types
... A nullable value type can contain the same values as the non-nullable
version of the type as well as the null value. Thus, for a nullable
value type, assigning Nothing to a variable of the type sets the value
of the variable to the null value, not the zero value of the value
type.
For example:
Dim x As Integer = Nothing
Dim y As Integer? = Nothing
Console.WriteLine(x) ' Prints zero '
Console.WriteLine(y) ' Prints nothing (because the value of y is the null value) '
Look at the generated CIL (I've converted both to C#):
C#:
private static void Main(string[] args)
{
decimal? x = null;
decimal? y = null;
y = 5M;
decimal? CS$0$0000 = x;
decimal? CS$0$0001 = y;
if ((CS$0$0000.GetValueOrDefault() != CS$0$0001.GetValueOrDefault()) ||
(CS$0$0000.HasValue != CS$0$0001.HasValue))
{
Console.WriteLine("true");
}
else
{
Console.WriteLine("false");
}
}
Visual Basic:
[STAThread]
public static void Main()
{
decimal? x = null;
decimal? y = null;
y = 5M;
bool? VB$LW$t_struct$S3 = new bool?(decimal.Compare(x.GetValueOrDefault(), y.GetValueOrDefault()) != 0);
bool? VB$LW$t_struct$S1 = (x.HasValue & y.HasValue) ? VB$LW$t_struct$S3 : null;
if (VB$LW$t_struct$S1.GetValueOrDefault())
{
Console.WriteLine("true");
}
else
{
Console.WriteLine("false");
}
}
You'll see that the comparison in Visual Basic returns Nullable<bool> (not bool, false or true!). And undefined converted to bool is false.
Nothing compared to whatever is always Nothing, not false in Visual Basic (it is the same as in SQL).
The problem that's observed here is a special case of a more general problem, which is that the number of different definitions of equality that may be useful in at least some circumstances exceeds the number of commonly-available means to express them. This problem is in some cases made worse by an unfortunate belief that it is confusing to have different means of testing equality yield different results, and such confusion might be avoided by having the different forms of equality yield the same results whenever possible.
In reality, the fundamental cause of confusion is a misguided belief that the different forms of equality and inequality testing should be expected to yield the same result, notwithstanding the fact that different semantics are useful in different circumstances. For example, from an arithmetic standpoint, it's useful to be able to have Decimal which differ only in the number of trailing zeroes compare as equal. Likewise for double values like positive zero and negative zero. On the other hand, from a caching or interning standpoint, such semantics can be deadly. Suppose, for example, one had a Dictionary<Decimal, String> such that myDict[someDecimal] should equal someDecimal.ToString(). Such an object would seem reasonable if one had many Decimal values that one wanted to convert to string and expected there to be many duplicates. Unfortunately, if used such caching to convert 12.3 m and 12.40 m, followed by 12.30 m and 12.4 m, the latter values would yield "12.3", and "12.40" instead of "12.30" and "12.4".
Returning to the matter at hand, there is more than one sensible way of comparing nullable objects for equality. C# takes the standpoint that its == operator should mirror the behavior of Equals. VB.NET takes the standpoint that its behavior should mirror that of some other languages, since anyone who wants the Equals behavior could use Equals. In some sense, the right solution would be to have a three-way "if" construct, and require that if the conditional expression returns a three-valued result, code must specify what should happen in the null case. Since that is not an option with languages as they are, the next best alternative is to simply learn how different languages work and recognize that they are not the same.
Incidentally, Visual Basic's "Is" operator, which is lacking in C, can be used to test for whether a nullable object is, in fact, null. While one might reasonably question whether an if test should accept a Boolean?, having the normal comparison operators return Boolean? rather than Boolean when invoked on nullable types is a useful feature. Incidentally, in VB.NET, if one attempts to use the equality operator rather than Is, one will get a warning that the result of the comparison will always be Nothing, and one should use Is if one wants to test if something is null.
May be
this
post well help you:
If I remember correctly, 'Nothing' in VB means "the default value". For a value type, that's the default value, for a reference type, that would be null. Thus, assigning nothing to a struct, is no problem at all.
This is a definite weirdness of VB.
In VB, if you want to compare two nullable types, you should use Nullable.Equals().
In your example, it should be:
Dim x As System.Nullable(Of Decimal) = Nothing
Dim y As System.Nullable(Of Decimal) = Nothing
y = 5
If Not Nullable.Equals(x, y) Then
Console.WriteLine("true")
Else
Console.WriteLine("false")
End If
Your VB code is simply incorrect - if you change the "x <> y" to "x = y" you will still have "false" as the result. The most common way of expression this for nullable instances is "Not x.Equals(y)", and this will yield the same behavior as "x != y" in C#.
In order to evaluate a multiplication you have to evaluate the first term, then the second term and finally multiply the two values.
Given that every number multiplied by 0 is 0, if the evaluation of the first term returns 0 I would expect that the entire multiplication is evaluated to 0 without evaluating the second term.
However if you try this code:
var x = 0 * ComplexOperation();
The function ComplexOperation is called despite the fact that we know that x is 0.
The optimized behavior would be also consistent with the Boolean Operator '&&' that evaluates the second term only if the first one is evaluated as true. (The '&' operator evaluates both terms in any case)
I tested this behavior in C# but I guess it is the same for almost all languages.
Firstly, for floating-point, your assertion isn't even true! Consider that 0 * inf is not 0, and 0 * nan is not 0.
But more generally, if you're talking about optimizations, then I guess the compiler is free to not evaluate ComplexOperation if it can prove there are no side-effects.
However, I think you're really talking about short-circuit semantics (i.e. a language feature, not a compiler feature). If so, then the real justification is that C# is copying the semantics of earlier languages (originally C) to maintain consistency.
C# is not functional, so functions can have side effects. For example, you can print something from inside ComlpexOperation or change global static variables. So, whether it is called is defined by * contract.
You found yourself an example of different contracts with & and &&.
The language defines which operators have short-circuit semantics and which do not. Your ComplexOperation function may have side effects, those side effects may be deliberate, and the compiler is not free to assume that they should not occur just because the result of the function is effectively not used.
I will also add this would be obfuscated language design. There would be oodles of SO questions to the effect of...
//why is foo only called 9 times?????????
for(int i = 0; i < 10; i++) {
print((i-5)*foo());
}
Why allow short-circuiting booleans and not short-circuiting 0*? Well, firstly I will say that mixing short-circuit boolean with side-effects is a common source of bugs in code - if used well among maintainers who understand it as an obvious pattern then it may be okay, but it's very hard for me to imagine programmers becoming at all used to a hole in the integers at 0.
Assume myObj is null. Is it safe to write this?
if(myObj != null && myObj.SomeString != null)
I know some languages won't execute the second expression because the && evaluates to false before the second part is executed.
Yes. In C# && and || are short-circuiting and thus evaluates the right side only if the left side doesn't already determine the result. The operators & and | on the other hand don't short-circuit and always evaluate both sides.
The spec says:
The && and || operators are called the conditional logical operators. They are also called the “shortcircuiting” logical operators.
...
The operation x && y corresponds to the operation x & y, except that y is evaluated only if x is true
...
The operation x && y is evaluated as (bool)x ? (bool)y : false. In other words, x is first evaluated and converted to type bool. Then, if x is true, y is evaluated and converted to type bool, and this becomes the result of the operation. Otherwise, the result of the operation is false.
(C# Language Specification Version 4.0 - 7.12 Conditional logical operators)
One interesting property of && and || is that they are short circuiting even if they don't operate on bools, but types where the user overloaded the operators & or | together with the true and false operator.
The operation x && y is evaluated as T.false((T)x) ? (T)x : T.&((T)x, y), where
T.false((T)x) is an invocation of the operator false declared in T, and T.&((T)x, y) is an invocation of the selected operator &. In addition, the value (T)x shall only be evaluated once.
In other words, x is first evaluated and converted to type T and operator false is invoked on the result to determine if x is definitely false.
Then, if x is definitely false, the result of the operation is the value previously computed for x converted to type T.
Otherwise, y is evaluated, and the selected operator & is invoked on the value previously computed for x converted to type T and the value computed for y to produce the result of the operation.
(C# Language Specification Version 4.0 - 7.12.2 User-defined conditional logical operators)
Yes, C# uses logical short-circuiting.
Note that although C# (and some other .NET languages) behave this way, it is a property of the language, not the CLR.
I know I'm late to the party, but in C# 6.0 you can do this too:
if(myObj?.SomeString != null)
Which is the same thing as above.
Also see:
What does question mark and dot operator ?. mean in C# 6.0?
Your code is safe - && and || are both short-circuited. You can use non-short-circuited operators & or |, which evaluate both ends, but I really don't see that in much production code.
sure, it's safe on C#, if the first operand is false then the second is never evaluated.
It is perfectly safe. C# is one of those languages.
In C#, && and || are short-circuited, meaning that the first condition is evaluated and the rest is ignored if the answer is determined.
In VB.NET, AndAlso and OrElse are also short-circuited.
In javaScript, && and || are short-circuited too.
I mention VB.NET to show that the ugly red-headed step-child of .net also has cool stuff too, sometimes.
I mention javaScript, because if you are doing web development then you probably might also use javaScript.
Yes, C# and most languages compute the if sentences from left to right.
VB6 by the way will compute the whole thing, and throw an exception if it's null...
an example is
if(strString != null && strString.Length > 0)
This line would cause a null exception if both sides executed.
Interesting side note. The above example is quite a bit faster than the IsNullorEmpty method.
Situation: condition check in C++ or C# with many criteria:
if (condition1 && condition2 && condition3)
{
// Do something
}
I've always believed the sequence in which these checks are performed is not guaranteed. So it is not necessarily first condition1 then condition2 and only then condition3. I learned it in my times with C++. I think I was told that or read it somewhere.
Up until know I've always written secure code to account for possible null pointers in the following situation:
if ((object != null) && (object.SomeFunc() != value))
{
// A bad way of checking (or so I thought)
}
So I was writing:
if (object != null)
{
if (object.SomeFunc() != value)
{
// A much better and safer way
}
}
Because I was not sure the not-null check will run first and only then the instance method will be called to perform the second check.
Now our greatest community minds are telling me the sequence in which these checks are performed is guaranteed to run in the left-to-right order.
I'm very surprised. Is it really so for both C++ and C# languages?
Has anybody else heard the version I heard before now?
Short Answer is left to right with short-circuit evaluation. The order is predictable.
// perfectly legal and quite a standard way to express in C++/C#
if( x != null && x.Count > 0 ) ...
Some languages evaluate everything in the condition before branching (VB6 for example).
// will fail in VB6 if x is Nothing.
If x Is Not Nothing And x.Count > 0 Then ...
Ref: MSDN C# Operators and their order or precedence.
They are defined to be evaluated from left-to-right, and to stop evaluating when one of them evaluates to false. That's true in both C++ and C#.
I don't think there is or has been any other way. That would be like the compiler deciding to run statements out of order for no reason. :) Now, some languages (like VB.NET) have different logical operators for short-circuiting and not short-circuiting. But, the order is always well defined at compile time.
Here is the operator precedence from the C# language spec. From the spec ...
Except for the assignment operators,
all binary operators are
left-associative, meaning that
operations are performed from left to
right. For example, x + y + z is
evaluated as (x + y) + z.
They have to be performed from left to right. This allows short circuit evaluation to work.
See the Wikipedia article for more information.