After seeing how double.NaN == double.NaN is always false in C#, I became curious how the equality was implemented under the hood. So I used ReSharper to decompile the Double struct, and here is what I found:
public struct Double : IComparable, IFormattable, IConvertible, IComparable<double>, IEquatable<double>
{
// stuff removed...
public const double NaN = double.NaN;
// more stuff removed...
}
This seems to indicate that the struct Double declares a constant that is defined in terms of this special lowercase double, though I'd always thought the two were completely synonymous. What's more, if I Go To Implementation on the lowercase double, ReSharper simply scrolls me to the declaration at the top of the file. Similarly, jumping to the implementation of the lowercase NaN just takes me back to the constant declaration earlier on the same line!
So I'm trying to understand this seemingly recursive definition. Is this just an artefact of the decompiler? Perhaps a limitation in ReSharper? Or is this lowercase double actually a different beast altogether, representing something at a lower level in the CLR/CTS?
Where does NaN really come from?
Beware looking at decompiled code, especially if it is for something inbuilt. The actual IL here (for .NET 4.5, at least) is:
.field public static literal float64 NaN = float64(NaN)
{
.custom instance void __DynamicallyInvokableAttribute::.ctor()
}
i.e. this is handled directly in IL via the NaN token.
However, because it is a const (literal in IL), it will get "burned into" the call site; anywhere else that uses double.NaN will also be using float64(NaN). Similarly, for example, if I do:
const int I = 2;
int i = I;
int j = 2;
both of these assignments will look identical in the final IL (they will both be ldc.i4.2).
Because of this, most decompilers will recognise the IL pattern NaN and represent it with the language's equivalent of double.NaN. But that doesn't mean that the code is itself recursive; they probably just don't have a check for "but is it double.NaN itself?". Ultimately, this is simply a special case, where float64(NaN) is a recognised value in IL.
Incidentally, reflector decompiles it as:
[__DynamicallyInvokable]
public const double NaN = (double) 1.0 / (double) 0.0;
That again doesn't mean that this is the truth :p Merely that this is something which may have the same end result.
By far the best source you can get for .NET assemblies is the actual source code that was used to build them. Beats any decompiler for accuracy, the comments can be quite useful as well. Download the Reference Source.
You'll then also see that Double.NaN isn't defined in IL as Marc assumed; it's actually in a C# source code file. The net/clr/bcl/system/double.cs source file shows the real declaration:
public const double NaN = (double)0.0 / (double)0.0;
Which takes advantage of the C# compiler evaluating constant expressions at compile time. Or to put it tongue-in-cheek, NaN is defined by the C++ compiler since that's the language that was used to write the C# compiler ;)
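You can reproduce the same constant folding in ordinary user code. A minimal sketch (the NaNDemo class and MyNaN constant are just illustrative names, not anything from the reference source):
using System;

class NaNDemo
{
    // Hypothetical user-code constant, mirroring the BCL declaration:
    // the C# compiler folds 0.0 / 0.0 into NaN at compile time.
    const double MyNaN = (double)0.0 / (double)0.0;

    static void Main()
    {
        Console.WriteLine(double.IsNaN(MyNaN));  // True
        Console.WriteLine(MyNaN == double.NaN);  // False - NaN never compares equal, even to itself
    }
}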
Related
I have already found useful answers explaining why it shouldn't be possible at all:
Why does C# limit the set of types that can be declared as const?
Why can't structs be declared as const?
The first one has a detailed answer, which I still have to re-read a couple of times until I fully get it.
The second one has a very easy and clear answer (like 'the constructor might do anything, so it had to be run and evaluated at compile time').
But both refer to C#.
However, I am using C++/CLI and have a
value class CLocation
{
public:
double x, y, z;
CLocation ( double i_x, double i_y, double i_z) : x(i_x), y(i_y), z(i_z) {}
CLocation ( double i_all) : x(i_all), y(i_all), z(i_all) {}
...
};
where I can easily create a
const CLocation c_loc (1,2,3);
which indeed is immutable, meaning 'const'.
Why?
CLocation furthermore has a function
System::Drawing::Point CLocation::ToPoint ()
{
return System::Drawing::Point ((int)x, (int)y);
}
which works well on a CLocation, but not on a const CLocation. I think this stems from the limitation in C# (known from the links above), which presumably originates in the underlying IL, so C++/CLI is affected by the same limitation.
Is this correct?
Is there a way to run this member function on a const CLocation?
You must indicate to the compiler that your function doesn't change the object by adding const after the argument list.
Your function may then be called on a const variable, but may not modify its fields.
Also be aware that some keywords (including const and struct) have different meanings in C# and C++ (and in other C-based languages).
Update
As C++/CLI doesn't allow a const modifier on a member function, you'll have to copy the variable to a non-const one to be able to call any member function (on the copy).
It says here that the possible types for an enum are byte, sbyte, short, ushort, int, uint, long, or ulong.
What if I need a float or a double to define percentage increments such as 1.5 or 2.5 for example? Am I stuck?
As said here:
http://en.csharp-online.net/.NET_Type_Design_Guidelines%E2%80%94Enum_Design
An enum is a structure with a set of static constants. The reason to
follow this guideline is because you will get some additional compiler
and reflection support if you define an enum versus manually defining
a structure with static constants.
Since an enum is a set of constants, why can't I have float constants?
Update: it is said here:
http://en.csharp-online.net/.NET_Type_Design_Guidelines%E2%80%94Enum_Design
"Did you know that the CLR supports enums with an underlying type of float or double even though most languages don't choose to expose it?"
Since I'm only using C#, is there a way to do so with some hacks?
Yes, you're stuck there. You can't use enums with floating-point types. You can use a static class with constants, however:
public static class MyFloatEnum {
public const float One = 1.0f;
public const float OneAndAHalf = 1.5f;
// etc.
}
And it will look somewhat close in IntelliSense. Alternatively, you may just want to use constants:
public const float A = 0.5f;
public const float B = 17.62f;
Although the CLR itself supports floating point enums, C# designers chose not to expose this in the language (see http://en.csharp-online.net/.NET_Type_Design_Guidelines%E2%80%94Enum_Design). You can either use constants as in John Saunders' answer, or you can define an integer enum with multiplied values and then divide them back if/when you need the value.
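For the multiplied-values approach, here is a rough sketch (the PercentIncrement name, the x10 scale factor, and the ToPercent extension method are illustrative assumptions, not anything from the question):
// Hypothetical: store the percentages scaled by 10 so they fit in an integer enum.
public enum PercentIncrement
{
    OneAndAHalf = 15, // represents 1.5
    TwoAndAHalf = 25  // represents 2.5
}

public static class PercentIncrementExtensions
{
    // Divide the scaled value back out when the real number is needed.
    public static double ToPercent(this PercentIncrement value)
    {
        return (int)value / 10.0;
    }
}
So PercentIncrement.OneAndAHalf.ToPercent() gives back 1.5, and you keep the compiler and reflection support that an enum provides.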
The use case would be definitely interesting, though. Why do you need/want this?
You will have to use a set of constants:
public const float Percentage1 = 1.5f;
public const float Percentage2 = 2.5f;
I've been browsing through the .NET Framework Reference Source, just for the fun of it, and found something I don't understand.
There is an Int32.cs file with C# code for the Int32 type, and somehow that seems strange to me. How does the C# compiler compile the code for the Int32 type?
public struct Int32: IComparable, IFormattable, IConvertible {
internal int m_value;
// ...
}
But isn't this illegal in C#? If int is only an alias for Int32, it should fail to compile with Error CS0523:
Struct member 'struct2 field' of type 'struct1' causes a cycle in the struct layout.
Is there some magic in the compiler, or am I completely off track?
isn't this illegal in C#? If "int" is only alias for "Int32" it should fail to compile with error CS0523. Is there some magic in the compiler?
Yes; the error is deliberately suppressed in the compiler. The cycle checker is skipped entirely if the type in question is a built-in type.
Normally this sort of thing is illegal:
struct S { S s; int i; }
In that case the size of S is undefined because whatever the size of S is, it must be equal to itself plus the size of an int. There is no such size.
struct S { S s; }
In that case we have no information from which to deduce the size of S.
struct Int32 { Int32 i; }
But in this case the compiler knows ahead of time that System.Int32 is four bytes because it is a very special type.
Incidentally, the details of how the C# compiler (and, for that matter, the CLR) determines when a set of struct types is cyclic are extremely interesting. I'll try to write a blog article about that at some point.
int is an alias for Int32, but the Int32 struct you are looking at is simply metadata, it is not a real object. The int m_value declaration is possibly there only to give the struct the appropriate size, because it is never actually referenced anywhere else (which is why it is allowed to be there).
So, in other words, the compiler kind of saves this from being a problem. There is a discussion on the topic in the MSDN Forums.
From the discussion, here is a quote from the chosen answer that helps to try to determine how the declaration is possible:
while it is true that the type contains an integer m_value field - the
field is never referenced. In every supporting method (CompareTo,
ToString, etc), "this" is used instead. It is possible that the
m_value fields only exist to force the structures to have the
appropriate size.
I suspect that when the compiler sees "int", it translates it into "a
reference to System.Int32 in mscorlib.dll, to be resolved later", and
since it's building mscorlib.dll, it does end up with a cyclical
reference (but not one that can ever cause problems, because m_value
is never used). If this assumption is correct, then this trick would
only work for special compiler types.
Reading further, it can be determined that the struct is simply metadata, not a real object, so it is not bound by the same recursive-definition constraints.
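To see the special case from the other side, here is a minimal sketch in ordinary user code (IntWrapper and MyInt32 are made-up names):
// Wrapping an int is fine: int here refers to the already-compiled System.Int32
// in mscorlib, so there is no cycle in this struct's layout.
struct IntWrapper
{
    public int m_value;
}

// By contrast, a struct containing a field of its own type is rejected:
//     struct MyInt32 { MyInt32 m_value; }   // error CS0523
// Only the built-in primitives get a pass from the compiler's cycle checker.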
I've just decompiled some 3rd party source to debug an issue, using DotPeek. The output code contains some unusual operators, which AFAIK aren't valid C#, so I'm wondering what they mean...
The extract looks like this (with dotPeek comments included, as they are probably relevant):
protected internal void DoReceive(ref byte[] Buffer, int MaxSize, out int Written)
{
Written = 0;
.
.
.
// ISSUE: explicit reference operation
// ISSUE: variable of a reference type
int& local = #Written;
int num = SomeMethod();
.
.
.
// ISSUE: explicit reference operation
^local = num;
}
So, three unusual operators in there... int& local = #Written; seems to be assigning a reference to a variable whose name is pointlessly prefixed with the # character?
But what is ^local = num; ???
OK, here is the equivalent snippet from ILSpy, which makes more sense. I guess the decompilation to C# didn't produce a valid equivalent?
'C#'
int& local = #Written;
byte[] numArray2 = this.FInSpool;
int num = (int) __Global.Min(numArray2 == null ? 0L : (long) numArray2.Length, (long) MaxSize);
^local = num;
IL
byte[] expr_22 = this.FInSpool;
Written = (int)__Global.Min((long)((expr_22 == null) ? 0 : expr_22.Length), (long)MaxSize);
So, I guess the 'C#' isn't quite valid? The ILSpy version would be valid C#; not sure why dotPeek produced the output it did. Perhaps I'll stick to ILSpy for this one...?
If you look at the raw IL (from ildasm, not the C# equivalent via IL Spy), that may help you see what the decompiler is trying to say. 'Out' parameters are represented using a (managed) typed-reference, which isn't explicitly exposed as a type in C#. 'Instances' of this type can normally only be passed as parameters to methods accepting typed references ('ref' or 'out' parameters.) See OpCodes.Mkrefany for more information.
What dotPeek is complaining about is that this 'out' typed reference was stored from the argument into a local variable slot, then written to later via the local slot. The '#' and '^' are placeholders used to indicate this unexpected behavior detected by the decompiler (the one the ISSUE comments describe.)
It's possible the code was compiled from C++/CLI and thus the IL looks different from the typical C# compiler output. It's also possible this is some level of minor obfuscation to confuse decompilers (though I don't think so.) I don't think this is functionally any different from loading the reference from its argument onto the operation stack directly (avoiding the use of a local variable slot), but I could be wrong.
In C# proper, putting an @ before a name is what lets you use a reserved word as a variable name (dotPeek's # prefix appears to play a similar escaping role in its pseudo-C# output). For example, if I wanted to have a variable called return I would need to do this:
public int Weird()
{
    int @return = 0;
    return @return;
}
See this SO question for more details.
Putting a ^ before the name... umm, no clue. (I'll update as I research; I couldn't find any info on what ^ means when it's not being used as XOR.)
It's clearly a decompilation issue. Because of the broad set of languages supported, a decompiler targeting any particular language may not always find an exact match, but it still tries to produce some output. The decompiler MAY try to produce somewhat equivalent output, which in this case could be:
protected internal void DoReceive(ref byte[] Buffer, int MaxSize, out int Written)
{
Written = 0;
.
int num = SomeMethod();
.
Written = num;
}
but SHOULD it really do this? In this case the decompiler actually provided you with a hint, so you can decide whether this matters for your particular case, as there MAY be some side effects.
To define constants, what is the more common and correct way? What is the cost, in terms of compilation, linking, and so on, of defining constants with #define? Is there another, less expensive way?
The best way to define any const is to write
const int m = 7;
const float pi = 3.1415926f;
const char x = 'F';
Using #define is bad C++ style. It is impossible to hide a #define inside a namespace scope.
Compare
#define pi 3.1415926
with
namespace myscope {
const float pi = 3.1415926f;
}
The second way is obviously better.
The compiler itself never sees a #define. The preprocessor expands all macros before they're passed to the compiler. One of the side effects, though, is that the values are repeated...and two identical strings are not necessarily the exact same string. If you say
#define SOME_STRING "Just an example"
it's perfectly legal for the compiler to add a copy of the string to the output file each time it sees the string. A good compiler will probably eliminate duplicate literals, but that's extra work it has to do. If you use a const instead, the compiler doesn't have to worry about that as much.
The cost is only to the preprocessor, when #defines are resolved (ignoring the additional debugging cost of dealing with a project full of #defines for constants, of course).
#define macros are processed by the pre-processor, they are not visible to the compiler. And since they are not visible to the compiler as a symbol, it is hard to debug something which involves a macro.
The preferred way of defining constants is using the const keyword along with proper type information.
const unsigned int ArraySize = 100;
Even better is
static const unsigned int ArraySize = 100;
when the constant is used only in a single file.
#define will increase compilation time, but the result can be faster in execution...
Generally, #define is used for conditional compilation...
whereas const is used for values in ordinary computation.
The choice depends on your requirements...
#define is string replacement, so if you make mistakes in the macros, they will show up as errors later on. Incorrect types and incorrect expressions are the most common ones.
For conditional compilation, pre-processor macros work best. For other constants which are to be used in computation, const works well.
CPU time isn't really the cost of using #define or macros. The "cost" as a developer is as follows:
If there is an error in your macro, the compiler will flag it where you referenced the macro, not where you defined it.
You will lose type safety and scoping for your macro.
Debugging tools will not know the value of the macro.
These things may not burn CPU cycles, but they can burn developer cycles.
For constants, declaring const variables is preferable, and for little type-independent functions, inline functions and templates are preferable.