is if(float > int) really if(float > (float)int)? - c#

Is
if(float > int)
really just
if(float > (float)int)
I was doing some research and it seems that float-to-int and int-to-float casts are expensive.
I have a lot of float/int comparisons.
Just a quick question

Yes!
They're the same thing.
There's no instruction to directly compare a floating-point value to an integer, so the integer is first converted to float.
However:
Be careful: That does not mean that the int-to-float conversion is lossless. It still can lose some information, so this code:
(int)(float)integer == integer
doesn't always evaluate to true! (Try it with int.MaxValue to see. Ditto with double/long.)
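A minimal sketch of both points, written as top-level statements (C# 9 or later) and assuming a current .NET runtime. The value 16777217 (2^24 + 1) is picked for illustration because it is the smallest positive int that a float cannot hold exactly; the variable names are hypothetical.

```csharp
using System;

int i = 16777217;      // 2^24 + 1, not exactly representable as float
float f = 16777216f;   // 2^24, exactly representable

// The int operand is converted to float before the comparison,
// so both forms end up as the same float-vs-float comparison.
bool implicitEqual = f == i;
bool explicitEqual = f == (float)i;

// The conversion rounds i to the nearest float, so the round trip loses 1.
bool roundTrips = (int)(float)i == i;

Console.WriteLine(implicitEqual);  // True
Console.WriteLine(explicitEqual);  // True
Console.WriteLine(roundTrips);     // False
```

Note that f and i compare as equal even though the two numbers differ by one, which is the information loss described above.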

Yes. There's no >(float, int) operator - just >(int, int) and >(float, float). So the compiler calls the latter operator by converting the second operand to float. See section 7.3.6.2 of the C# spec for more details:
Binary numeric promotion occurs for the operands of the predefined +, -, *, /, %, &, |, ^, ==, !=, >, <, >= and <= binary operators. Binary numeric promotion implicitly converts both operands to a common type which, in case of the non-relational operators, also becomes the result type of the operation.
(It then lists the steps involved.)
Are you sure that the int to float conversion is taking a lot of time though? It should be pretty cheap.

Related

Why does using operators on short (System.Int16) result in System.Int32 at runtime, and sometimes not require any casting at compile time? [duplicate]

Earlier today I was trying to add two ushorts and I noticed that I had to cast the result back to ushort. I thought it might've become a uint (to prevent a possible unintended overflow?), but to my surprise it was an int (System.Int32).
Is there some clever reason for this or is it maybe because int is seen as the 'basic' integer type?
Example:
ushort a = 1;
ushort b = 2;
ushort c = a + b; // <- "Cannot implicitly convert type 'int' to 'ushort'. An explicit conversion exists (are you missing a cast?)"
uint d = a + b; // <- "Cannot implicitly convert type 'int' to 'uint'. An explicit conversion exists (are you missing a cast?)"
int e = a + b; // <- Works!
Edit: Like GregS' answer says, the C# spec says that both operands (in this example 'a' and 'b') should be converted to int. I'm interested in the underlying reason for why this is part of the spec: why doesn't the C# spec allow for operations directly on ushort values?
The simple and correct answer is "because the C# Language Specification says so".
Clearly you are not happy with that answer and want to know "why does it say so". You are looking for "credible and/or official sources", and that's going to be a bit difficult. These design decisions were made a long time ago; 13 years is a lot of dog lives in software engineering. They were made by the "old timers", as Eric Lippert calls them, who have moved on to bigger and better things and don't post answers here to provide an official source.
It can be inferred, however, at the risk of merely being credible. Any managed compiler, like C#'s, has the constraint that it needs to generate code for the .NET virtual machine, the rules for which are carefully (and quite readably) described in the CLI spec. It is the Ecma-335 spec, which you can download for free.
Turn to Partition III, chapter 3.1 and 3.2. They describe the two IL instructions available to perform an addition, add and add.ovf. Click the link to Table 2, "Binary Numeric Operations", which describes what operands are permissible for those IL instructions. Note that there are just a few types listed there. byte and short as well as all unsigned types are missing. Only int, long, IntPtr and floating point (float and double) are allowed, with additional constraints marked by an x: you can't add an int to a long, for example. These constraints are not entirely artificial; they are based on things you can do reasonably efficiently on available hardware.
Any managed compiler has to deal with this in order to generate valid IL. That isn't difficult: simply convert the ushort to a larger value type that's in the table, a conversion that's always valid. The C# compiler picks int, the next larger type that appears in the table. Or in general, convert any of the operands to the next largest value type so they both have the same type and meet the constraints in the table.
Now there's a new problem however, a problem that drives C# programmers pretty nutty. The result of the addition is of the promoted type. In your case that will be int. So adding two ushort values of, say, 0x9000 and 0x9000 has a perfectly valid int result: 0x12000. Problem is: that's a value that doesn't fit back into a ushort. The value overflowed. But it didn't overflow in the IL calculation, it only overflows when the compiler tries to cram it back into a ushort. 0x12000 is truncated to 0x2000. A bewilderingly different value that only makes some sense when you count with 2 or 16 fingers, not with 10.
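The arithmetic in that paragraph can be reproduced with a small sketch, written as top-level statements (C# 9 or later). The variable names are hypothetical, and unchecked is spelled out only to make the truncation explicit regardless of project settings.

```csharp
using System;

ushort a = 0x9000;
ushort b = 0x9000;

// Both operands are promoted to int, so the sum itself does not overflow.
int wide = a + b;

// Casting back keeps only the low 16 bits of the int result.
ushort narrow = unchecked((ushort)(a + b));

Console.WriteLine($"0x{wide:X}");    // 0x12000
Console.WriteLine($"0x{narrow:X}");  // 0x2000
```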
Notable is that the add.ovf instruction doesn't deal with this problem. It is the instruction to use to automatically generate an overflow exception. But it doesn't throw here, because the actual calculation on the converted ints didn't overflow.
This is where the real design decision comes into play. The old-timers apparently decided that simply truncating the int result to ushort was a bug factory. It certainly is. They decided that you have to acknowledge that you know that the addition can overflow and that it is okay if it happens. They made it your problem, mostly because they didn't know how to make it theirs and still generate efficient code. You have to cast. Yes, that's maddening, I'm sure you didn't want that problem either.
Quite notable is that the VB.NET designers took a different solution to the problem. They actually made it their problem and didn't pass the buck. You can add two UShorts and assign the result to a UShort without a cast. The difference is that the VB.NET compiler actually generates extra IL to check for the overflow condition. That's not cheap code; it makes every short addition about 3 times as slow. But it is the kind of difference that explains why Microsoft maintains two languages that otherwise have very similar capabilities.
Long story short: you are paying a price because you use a type that's not a very good match with modern CPU architectures. Which in itself is a Really Good Reason to use uint instead of ushort. Getting traction out of ushort is difficult; you'll need a lot of them before the memory savings outweigh the cost of manipulating them. Not just because of the limited CLI spec: an x86 core takes an extra CPU cycle to load a 16-bit value because of the operand prefix byte in the machine code. Not actually sure if that is still the case today, it used to be back when I still paid attention to counting cycles. A dog year ago.
Do note that you can feel better about these ugly and dangerous casts by letting the C# compiler generate the same code that the VB.NET compiler generates. So you get an OverflowException when the cast turned out to be unwise. Use Project > Properties > Build tab > Advanced button > tick the "Check for arithmetic overflow/underflow" checkbox. Just for the Debug build. Why this checkbox isn't turned on automatically by the project template is another very mystifying question btw, a decision that was made too long ago.
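The same check can also be requested for a single expression with the checked operator, independent of the project setting. A minimal sketch, written as top-level statements (C# 9 or later) with hypothetical variable names:

```csharp
using System;

ushort a = 0x9000;
ushort b = 0x9000;
bool overflowed = false;

try
{
    // In a checked context the narrowing conversion is range-checked at run time.
    ushort sum = checked((ushort)(a + b));
    Console.WriteLine(sum);
}
catch (OverflowException)
{
    // 0x12000 does not fit in a ushort, so the conversion throws.
    overflowed = true;
}

Console.WriteLine(overflowed);  // True
```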
ushort x = 5, y = 12;
The following assignment statement will produce a compilation error, because the arithmetic expression on the right-hand side of the assignment operator evaluates to int by default.
ushort z = x + y; // Error: conversion from int to ushort
http://msdn.microsoft.com/en-us/library/cbf1574z(v=vs.71).aspx
EDIT:
In case of arithmetic operations on ushort, the operands are converted to a type which can hold all values, so that overflow can be avoided. The candidate types are tried in the order int, uint, long and ulong.
Please see the C# Language Specification. In this document, go to section 4.1.5 Integral types (around page 80 in the Word document). There you will find:
For the binary +, –, *, /, %, &, ^, |, ==, !=, >, <, >=, and <=
operators, the operands are converted to type T, where T is the first
of int, uint, long, and ulong that can fully represent all possible
values of both operands. The operation is then performed using the
precision of type T, and the type of the result is T (or bool for the
relational operators). It is not permitted for one operand to be of
type long and the other to be of type ulong with the binary operators.
Eric Lippert has stated in a question:
Arithmetic is never done in shorts in C#. Arithmetic can be done in
ints, uints, longs and ulongs, but arithmetic is never done in shorts.
Shorts promote to int and the arithmetic is done in ints, because like
I said before, the vast majority of arithmetic calculations fit into
an int. The vast majority do not fit into a short. Short arithmetic is
possibly slower on modern hardware which is optimized for ints, and
short arithmetic does not take up any less space; it's going to be
done in ints or longs on the chip.
From the C# language spec:
7.3.6.2 Binary numeric promotions
Binary numeric promotion occurs for the operands of the predefined +, –, *, /, %, &, |, ^, ==, !=, >, <, >=, and <= binary operators. Binary numeric promotion implicitly converts both operands to a common type which, in case of the non-relational operators, also becomes the result type of the operation. Binary numeric promotion consists of applying the following rules, in the order they appear here:
· If either operand is of type decimal, the other operand is converted to type decimal, or a binding-time error occurs if the other operand is of type float or double.
· Otherwise, if either operand is of type double, the other operand is converted to type double.
· Otherwise, if either operand is of type float, the other operand is converted to type float.
· Otherwise, if either operand is of type ulong, the other operand is converted to type ulong, or a binding-time error occurs if the other operand is of type sbyte, short, int, or long.
· Otherwise, if either operand is of type long, the other operand is converted to type long.
· Otherwise, if either operand is of type uint and the other operand is of type sbyte, short, or int, both operands are converted to type long.
· Otherwise, if either operand is of type uint, the other operand is converted to type uint.
· Otherwise, both operands are converted to type int.
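The rules in this list can be observed directly by asking each result for its runtime type. A small sketch, written as top-level statements (C# 9 or later); the variable names and values are hypothetical.

```csharp
using System;

ushort us = 1;
uint u = 2;
int i = 3;
long l = 4;
float f = 5f;
double d = 6.0;

// GetType() on the boxed result reveals the promoted result type.
Console.WriteLine((us + us).GetType());  // System.Int32
Console.WriteLine((u + us).GetType());   // System.UInt32
Console.WriteLine((u + i).GetType());    // System.Int64
Console.WriteLine((l + i).GetType());    // System.Int64
Console.WriteLine((i + f).GetType());    // System.Single
Console.WriteLine((f + d).GetType());    // System.Double
```

The uint + int case is the one that tends to surprise: neither type can represent all values of the other, so both operands are widened to long.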
There is no intended reason. This is just an effect of applying the rules of overload resolution, which state that the first overload whose parameters can accept the arguments through an implicit conversion is the one that will be used.
This is stated in the C# Specification, section 7.3.6 as follows:
Numeric promotion is not a distinct mechanism, but rather an effect of applying overload resolution to the predefined operators.
It goes on to illustrate with an example:
As an example of numeric promotion, consider the predefined implementations of the binary * operator:
int operator *(int x, int y);
uint operator *(uint x, uint y);
long operator *(long x, long y);
ulong operator *(ulong x, ulong y);
float operator *(float x, float y);
double operator *(double x, double y);
decimal operator *(decimal x, decimal y);
When overload resolution rules (§7.5.3) are applied to this set of operators, the effect is to select the first of the operators for which implicit conversions exist from the operand types. For example, for the operation b * s, where b is a byte and s is a short, overload resolution selects operator *(int, int) as the best operator.
Your question is, in fact, a bit tricky. The reason why this specification is part of the language is... because they took that decision when they created the language. I know this sounds like a disappointing answer, but that's just how it is.
However, the real answer would probably involve many contextual decisions made back in 1999-2000. I am sure the team who made C# had pretty robust debates about all those language details.
...
C# is intended to be a simple, modern, general-purpose, object-oriented programming language.
Source code portability is very important, as is programmer portability, especially for those programmers already familiar with C and C++.
Support for internationalization is very important.
...
The quote above is from the Wikipedia article on C#.
All of those design goals might have influenced their decision. For instance, in the year 2000 most systems were already natively 32-bit, so they might have decided to limit the number of variable types smaller than that, since the values are converted to 32 bits anyway when performing arithmetic operations, which is generally slower.
At that point, you might ask me: if there is an implicit conversion on those types, why did they include them anyway? Well, one of their design goals, as quoted above, is portability.
Thus, if you need to write a C# wrapper around an old C or C++ program, you might need those types to store some values. In that case, those types are pretty handy.
That's a decision Java did not make. For instance, if you write a Java program that interacts with a C++ program from which you receive ushort values, well, Java only has short (which is signed), so you can't easily assign one to the other and expect correct values.
As you might guess, the next available type that could hold such a value in Java is int (32 bits, of course). You have just doubled your memory here, which might not be a big deal unless you have to instantiate an array of 100 000 elements.
In fact, we must remember that those decisions were made by looking at the past and the future in order to provide a smooth transition from one to the other.
But now I feel that I am diverging from the initial question.
So your question is a good one and hopefully I was able to bring you some answers, even though I know that's probably not what you wanted to hear.
If you'd like, you could read more about the C# spec at the links below. There is some documentation there that might be interesting for you.
Integral types
The checked and unchecked operators
Implicit Numeric Conversions Table
By the way, I believe you should probably reward habib-osu for it, since he provided a fairly good answer to the initial question with a proper link. :)
Regards

How come the compiler can't tell the result is an integer

I came across this funny behavior of the compiler:
If I have
public int GetInt()
{
    Random rnd = new Random();
    double d = rnd.NextDouble();
    int i = d % 1000;
    return i;
}
I get an error of:
Cannot implicitly convert type 'double' to 'int'. An explicit conversion exists (are you missing a cast?)
which actually makes sense as 1000 can be a double, and the result of the modulo operator might be a double as well.
But after changing the code to:
public int GetInt()
{
    Random rnd = new Random();
    double d = rnd.NextDouble();
    int i = d % (int)1000;
    return i;
}
The error persists.
As far as I can tell, the compiler has all of the information in order to determine that the output of the modulo operator will be an int, so why doesn't it compile?
If d equals 1500.72546, then the result of d % (int)1000 would be 500.72546, so implicitly casting it to an int would result in a loss of data.
This is by design. See C# language specification:
7.3.6.2 Binary numeric promotions
Binary numeric promotion occurs for the operands of the predefined +,
–, *, /, %, &, |, ^, ==, !=, >, <, >=, and <= binary operators. Binary
numeric promotion implicitly converts both operands to a common type
which, in case of the non-relational operators, also becomes the
result type of the operation. Binary numeric promotion consists of
applying the following rules, in the order they appear here:
...
Otherwise, if either operand is of type double, the other operand is converted to type double.
The compiler can tell that you want an int here, but it is being your friend.
Sure, it can automatically convert it for you and not tell you about it and in your trivial example this would be just fine. However, given that converting a double to an int means a LOSS of data, it might well have not been your intention.
If it had not been your intention and the compiler had just gone ahead and done the conversion for you, you could have ended up in a marathon debugging session, devoid of sanity, trying to figure out why a rather esoteric bug has been reported.
This way, the compiler is forcing you to say, as a programmer, "I know I will lose data, it's fine".
The compiler will assume you don't want to lose precision in the operation and implicitly uses double.
http://msdn.microsoft.com/en-us/library/0w4e0fzs.aspx
From the documentation:
http://msdn.microsoft.com/en-us/library/0w4e0fzs.aspx
The result of a modulo of a double will be a double. If you need an integer result, then you must:
int i = (int)(d % 1000);
But bear in mind that you are liable to lose data here.
As I slowly engage my brain here - your code doesn't make any sense. NextDouble() returns a value greater than or equal to 0.0 and less than 1.0, so there is a logical issue with what you are doing: the result will always be zero, e.g.:
0.15 % 1000 = 0.15
Cast 0.15 to int (always rounds towards zero) -> 0
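A short sketch of the points made so far, written as top-level statements (C# 9 or later). The value 1500.72546 is borrowed from the example in an earlier answer; the variable names are hypothetical.

```csharp
using System;

double d = 1500.72546;

// (int)1000 is converted back to double, so the result of % is a double.
Type resultType = (d % (int)1000).GetType();

// An explicit cast on the whole expression truncates toward zero.
int i = (int)(d % 1000);

// NextDouble() returns a value in [0.0, 1.0), so here the cast always yields 0.
double r = new Random().NextDouble();
int fromRandom = (int)(r % 1000);

Console.WriteLine(resultType);  // System.Double
Console.WriteLine(i);           // 500
Console.WriteLine(fromRandom);  // 0
```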
double d = rnd.NextDouble();
int i = d % (int)1000;
This code doesn't make sense. (int)1000 says that 1000 is an int, but d % (int)1000 has a double operand d, so the compiler has to convert both operands to a common type (the binary numeric promotions mentioned in another answer) to make it work.
One thing to understand is that you can't apply an operator to operands of different types, so the compiler converts implicitly for you when there is no loss of data. So (int)1000 is still converted to double by the compiler before the operation is applied, and the result is of type double, not int.

Why do C#'s binary operators always return int regardless of the format of their inputs?

If I have two bytes a and b, how come:
byte c = a & b;
produces a compiler error about converting int to byte? It does this even if I put an explicit cast in front of a and b.
Also, I know about this question, but I don't really know how it applies here. This seems like it's a question of the return type of operator &(byte operand, byte operand2), which the compiler should be able to sort out just like any other operator.
Why do C#'s bitwise operators always return int regardless of the format of their inputs?
I disagree with always. This works and the result of a & b is of type long:
long a = 0xffffffffffff;
long b = 0xffffffffffff;
long x = a & b;
The return type is not int if one or both of the arguments are long, ulong or uint.
Why do C#'s bitwise operators return int if their inputs are bytes?
The result of byte & byte is an int because there is no & operator defined on byte. (Source)
An & operator exists for int and there is also an implicit cast from byte to int so when you write byte1 & byte2 this is effectively the same as writing ((int)byte1) & ((int)byte2) and the result of this is an int.
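A minimal sketch of the promotion and of the cast that makes the assignment compile, written as top-level statements (C# 9 or later) with hypothetical values. Casting a and b individually does not help, because the & of two bytes is still an int; the cast has to wrap the whole expression.

```csharp
using System;

byte a = 0xF0;
byte b = 0x3C;

// Both operands are promoted to int, so the result of & is an int.
Type resultType = (a & b).GetType();

// An explicit cast on the whole expression is needed to store it in a byte.
byte c = (byte)(a & b);

Console.WriteLine(resultType);   // System.Int32
Console.WriteLine($"0x{c:X2}");  // 0x30
```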
This behavior is a consequence of the design of IL, the intermediate language generated by all .NET compilers. While it supports the short integer types (byte, sbyte, short, ushort), it has only a very limited number of operations on them. Load, store, convert, create array, that's all. This is not an accident; those are the kinds of operations you could execute efficiently on a 32-bit processor back when IL was designed and RISC was the future.
The binary comparison and branch operations only work on int32, int64, native int, native floating point, object and managed reference. These operands are 32-bits or 64-bits on any current CPU core, ensuring the JIT compiler can generate efficient machine code.
You can read more about it in ECMA-335, Partition I, chapter 12.1 and Partition III, chapter 1.5.
I wrote a more extensive post about this over here.
Binary operators are not defined for byte types (among others). In fact, all binary (numeric) operators act only on the following native types:
int
uint
long
ulong
float
double
decimal
If there are any other types involved, it will use one of the above.
It's all in the C# specs version 5.0 (Section 7.3.6.2):
Binary numeric promotion occurs for the operands of the predefined +, –, *, /, %, &, |, ^, ==, !=, >, <, >=, and <= binary operators. Binary numeric promotion implicitly converts both operands to a common type which, in case of the non-relational operators, also becomes the result type of the operation. Binary numeric promotion consists of applying the following rules, in the order they appear here:
If either operand is of type decimal, the other operand is converted to type decimal, or a compile-time error occurs if the other operand is of type float or double.
Otherwise, if either operand is of type double, the other operand is converted to type double.
Otherwise, if either operand is of type float, the other operand is converted to type float.
Otherwise, if either operand is of type ulong, the other operand is converted to type ulong, or a compile-time error occurs if the other operand is of type sbyte, short, int, or long.
Otherwise, if either operand is of type long, the other operand is converted to type long.
Otherwise, if either operand is of type uint and the other operand is of type sbyte, short, or int, both operands are converted to type long.
Otherwise, if either operand is of type uint, the other operand is converted to type uint.
Otherwise, both operands are converted to type int.
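The rules above can be checked with a short sketch that prints the result type of a few mixed-type additions. The variable and class names are illustrative only; the commented-out lines are the combinations the rules reject at compile time.

```csharp
using System;

class BinaryNumericPromotionDemo
{
    static void Main()
    {
        byte b = 1; short s = 2; int i = 3; uint u = 4;
        long l = 5; float f = 6f; double d = 7.0; decimal m = 8m;

        Console.WriteLine((b + s).GetType()); // System.Int32   (last rule: both become int)
        Console.WriteLine((u + u).GetType()); // System.UInt32  (uint with uint)
        Console.WriteLine((i + u).GetType()); // System.Int64   (uint with int: both become long)
        Console.WriteLine((i + l).GetType()); // System.Int64   (either operand long)
        Console.WriteLine((l + f).GetType()); // System.Single  (either operand float)
        Console.WriteLine((f + d).GetType()); // System.Double  (either operand double)
        Console.WriteLine((i + m).GetType()); // System.Decimal (either operand decimal)

        // Rejected by the rules above (uncommenting gives a compile-time error):
        // var bad1 = d + m;          // decimal with double
        // ulong ul = 9;
        // var bad2 = ul + i;         // ulong with a non-constant int
    }
}
```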
It's because & is defined on integers, not on bytes, and the compiler implicitly casts your two arguments to int.

C# XOR on two byte variables will not compile without a cast [duplicate]

Why does the following raise the compile-time error "Cannot implicitly convert type 'int' to 'byte'"?
byte a = 25;
byte b = 60;
byte c = a ^ b;
This would make sense if I were using an arithmetic operator, because the result of a + b could be larger than can be stored in a single byte.
However, applying this to the XOR operator seems pointless. XOR here is a bitwise operation that can never overflow a byte.
Using a cast around the whole expression works:
byte c = (byte)(a ^ b);
I can't give you the rationale, but I can tell you why the compiler behaves this way from the standpoint of the rules the compiler has to follow (which might not really be what you're interested in knowing).
From an old copy of the C# spec (I should probably download a newer version), emphasis added:
14.2.6.2 Binary numeric promotions
This clause is informative.
Binary numeric promotion occurs for the operands of the predefined +, -, *, /, %, &, |, ^, ==, !=, >, <, >=, and <= binary operators. Binary numeric promotion implicitly converts both operands to a common type which, in case of the non-relational operators, also becomes the result type of the operation. Binary numeric promotion consists of applying the following rules, in the order they appear here:
If either operand is of type decimal, the other operand is converted to type decimal, or a compile-time error occurs if the other operand is of type float or double.
Otherwise, if either operand is of type double, the other operand is converted to type double.
Otherwise, if either operand is of type float, the other operand is converted to type float.
Otherwise, if either operand is of type ulong, the other operand is converted to type ulong, or a compile-time error occurs if the other operand is of type sbyte, short, int, or long.
Otherwise, if either operand is of type long, the other operand is converted to type long.
Otherwise, if either operand is of type uint and the other operand is of type sbyte, short, or int, both operands are converted to type long.
Otherwise, if either operand is of type uint, the other operand is converted to type uint.
Otherwise, both operands are converted to type int.
So, basically operands smaller than an int will be converted to int for these operators (and the result will be an int for the non-relational ops).
I said that I couldn't give you a rationale; however, I will make a guess at one - I think that the designers of C# wanted to make sure that operations that might lose information if narrowed would need to have that narrowing operation made explicit by the programmer in the form of a cast. For example:
byte a = 200;
byte b = 100;
byte c = a + b; // value would be truncated
While this kind of truncation wouldn't happen when performing an xor operation between two byte operands, I think that the language designers probably didn't want to have a more complex set of rules where some operations would need explicit casts and others would not.
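As a rough illustration of the truncation being guessed at here, the sketch below uses the same values (200 and 100) with the cast made explicit. The class name and the extra variable names are made up for the example; the checked and unchecked keywords are standard C#.

```csharp
using System;

class NarrowingDemo
{
    static void Main()
    {
        byte a = 200;
        byte b = 100;

        int wide = a + b;                        // operands are promoted to int
        Console.WriteLine(wide);                 // 300

        byte narrow = unchecked((byte)(a + b));  // keeps only the low 8 bits
        Console.WriteLine(narrow);               // 44 (300 - 256)

        byte xor = (byte)(a ^ b);                // never exceeds 8 bits, but the cast is still required
        Console.WriteLine(xor);                  // 172

        try
        {
            byte overflow = checked((byte)(a + b)); // 300 does not fit in a byte
            Console.WriteLine(overflow);
        }
        catch (OverflowException)
        {
            Console.WriteLine("OverflowException");
        }
    }
}
```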
Just a small note: the above quote is 'informative', not 'normative', but it covers all the cases in an easy-to-read form. Strictly speaking (in a normative sense), the reason the ^ operator behaves this way is that the closest overload for that operator when dealing with byte operands is (from 14.10.1 "Integer logical operators"):
int operator ^(int x, int y);
Therefore, as the informative text explains, the operands are promoted to int and an int result is produced.
FWIW
byte a = 25;
byte b = 60;
a = a ^ b;
does not work. However
byte a = 25;
byte b = 60;
a ^= b;
does work.
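The difference comes from how the C# spec defines compound assignment: for predefined operators, a ^= b is evaluated as a = (byte)(a ^ b), with the narrowing cast inserted by the compiler. A short sketch (the class name and the variable c are added for illustration) showing that both forms give the same value:

```csharp
using System;

class CompoundAssignmentDemo
{
    static void Main()
    {
        byte a = 25;
        byte b = 60;

        // Compiles: the compiler inserts the cast back to byte for us.
        a ^= b;
        Console.WriteLine(a); // 37 (00011001 ^ 00111100 == 00100101)

        // The spelled-out equivalent needs the cast written by hand.
        byte c = 25;
        c = (byte)(c ^ b);
        Console.WriteLine(c == a); // True
    }
}
```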
The demigod programmer from Microsoft has an answer: Link
And maybe it's more about compiler design. The compiler is simpler if the process is generalized: it doesn't have to look at which operator is applied to the operands, so bitwise operations are lumped into the same category as arithmetic operators and are therefore subject to type widening.
Link dead, archive here:
https://web.archive.org/web/20140118171646/http://blogs.msdn.com/b/oldnewthing/archive/2004/03/10/87247.aspx
I guess it's because the XOR operator is defined for booleans and integers.
And a cast of the integer result to a byte is an information-losing conversion; hence it needs an explicit cast (a nod from the programmer).
It seems to be because, in the C# language specification, the operator is defined for int and long:
http://msdn.microsoft.com/en-us/library/aa691307%28v=VS.71%29.aspx
So, what actually happens is that the compiler casts the byte operands to int implicitly, because there is no loss of data that way. But the result (which is an int) cannot be implicitly downcast without potential loss of data. So you need to tell the compiler explicitly that you know what you are doing!
As to why the two bytes have to be converted to ints to do the XOR?
If you want to dig into it, 12.1.2 of the CLI Spec (Partition I) describes the fact that, on the evaluation stack, only int or long can exist. All shorter integral types have to be expanded during evaluation.
Unfortunately, I can't find a suitable link directly to the CLI Spec - I've got a local copy as PDF, but can't remember where I got it from.
This has more to do with the rules surrounding implicit and explicit casting in the CLI specification. An integer (int = System.Int32 = 4 bytes) is wider than a byte (1 byte, obviously!). Therefore any cast from int to byte is potentially a narrowing cast. Therefore, the compiler wants you to make this explicit.
