Why aren't unsigned variables used more often? [duplicate] - c#

This question already has answers here:
Closed 14 years ago.
It seems that unsigned integers would be useful for method parameters and class members that should never be negative, but I don't see many people writing code that way. I tried it myself and found the need to cast from int to uint somewhat annoying...
Anyhow, what are your thoughts on this?
Duplicate
Why is Array Length an Int and not an UInt?

The idea that unsigned will save you from problems in methods/members that should never see negative values is somewhat flawed:
now you have to check for huge values ('overflow') in the error case
whereas with signed you could simply have checked for <= 0
use just one signed int in your methods and you are back to square "signed" :)
Use unsigned when dealing with bits. But don't bother with bit twiddling today anyway, unless you have so many bits that they fill several megabytes, or at least your small embedded memory.

Using the standard ones probably avoids the need to cast to the unsigned versions. In your own code you can probably maintain the distinction OK, but lots of other inputs and third-party libraries won't, and the casting will just drive people mad!

I can't remember exactly how C# does its implicit conversions, but in C++, widening conversions are done implicitly. Unsigned is considered wider than signed, and so this leads to unexpected problems:
int s = 5;
unsigned int u = 25;
// s > u is false, as expected

s = -1;
u = 25;
// s > u is **TRUE**! Error error!
In the example above, s is implicitly converted to unsigned, so its value becomes something like 4294967295. This has caused me problems before; I often have methods return -1 to say "no match" or something like that, and with the implicit conversion it just fails to do what I think it should.
After a while programmers learnt to almost always use signed variables, except in exceptional cases. Compilers these days also produce warnings for this, which is very helpful.
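For comparison, here is a minimal C# sketch of the same scenario (my own example, worth double-checking against the spec): there is no comparison operator taking int and uint directly, so both operands are promoted to long and the surprise above doesn't occur.

int s = -1;
uint u = 25;

// Both operands are promoted to long before comparing,
// so s keeps its value of -1 and the result is false.
Console.WriteLine(s > u);   // False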

One reason is that public methods or properties referring to unsigned types aren't CLS compliant.
You'll almost always see this attribute applied to .NET assemblies, as various wizards include it by default:
[assembly: CLSCompliant(true)]
So basically, if your assembly includes the attribute above and you try to use unsigned types in your public interface with the outside world, the compiler will complain (CLS-compliance violations are reported as warnings, or as errors if you treat warnings as errors).
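A quick sketch of what that looks like in practice (the exact wording of the diagnostic is from memory, so treat it as approximate):

[assembly: System.CLSCompliant(true)]

public class Counter
{
    // The compiler flags this public parameter (roughly:
    // "Argument type 'uint' is not CLS-compliant").
    public void Add(uint amount)
    {
    }
}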

For simplicity. Modern software involves enough casts and conversions already; there's a benefit to sticking to as few, commonly available data types as possible, to reduce complexity and ambiguity about proper interfaces.

There is no real need. Declaring something as unsigned to say a number should be positive is a poor man's attempt at validation.
In fact it would be better to just have a single number class that represented all numbers.
To validate numbers you should use some other technique, because generally the constraint isn't just "positive numbers", it's a specific range. It's usually best to use the least constrained representation for numbers; then, if you want to change the rules for allowable values, you change JUST the validation rules, NOT the type.
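For example, a minimal sketch of range validation living in a property setter rather than in the type (the class and the limits here are hypothetical):

using System;

public class Order
{
    private int _quantity;

    public int Quantity
    {
        get { return _quantity; }
        set
        {
            // The business rule is a range (1..1000 here), which no primitive
            // type can express; if the rule changes, only this check changes.
            if (value < 1 || value > 1000)
                throw new ArgumentOutOfRangeException("value");
            _quantity = value;
        }
    }
}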

Unsigned data types are carried over from the old days when memory was at a premium. So now we don't really need them for that purpose. Combine that with casting and they are a little cumbersome.

It's not advisable to use an unsigned integer because if you assign a negative value to it, all hell breaks loose. However, if you insist on doing it the right way, try using Spec#: declare it as an integer (where you would have used uint) and attach an invariant to it saying it can never be negative.

You're right in that it probably would be better to use uint for things which should never be negative. In practice, though, there's a few reasons against it:
int is the 'standard' or 'default', it has a lot of inertia behind it.
You have to use annoying/ugly explicit casts everywhere to get back to int, which (because of the first point) you're probably going to need to do a lot anyway; see the sketch after this list.
When you do cast back to int, what happens if your uint value is going to overflow it?
A lot of the time, people use ints for things that are never negative, and then use -1 or -99 or something like that for error codes or uninitialised values. It's a little lazy perhaps, but again this is something that uint is less flexible with.
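To illustrate the casting friction mentioned in the second point, a sketch of my own (the collection and values are arbitrary):

using System;
using System.Collections.Generic;

class Demo
{
    static void Fill(uint requested)
    {
        var items = new List<string>();

        // Most framework APIs speak int, so you cast on the way in...
        for (int i = 0; i < (int)requested; i++)
            items.Add(i.ToString());

        // ...and cast again on the way out (List<T>.Count is an int).
        uint count = (uint)items.Count;
        Console.WriteLine(count);
    }
}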

The biggest reason is that people are usually too lazy, or not thinking closely enough, to know where they're appropriate. Something like a size_t can never be negative, so unsigned is correct.
Casting from signed to unsigned can be fraught with peril though, because of peculiarities in how sign bits are handled by the underlying architecture.

You shouldn't need to manually cast it, I don't think.
Anyway, it's because for most applications, it doesn't matter - the range for either is big enough for most uses.

It is because they are not CLS compliant, which means your code might not run as you expect on other .NET framework implementations or even on other .NET languages (unless supported).
Also, they are not interoperable with other systems if you try to pass them across a boundary, like a web service to be consumed by Java, or a call to the Win32 API.
See this SO post on the reason why as well.

Related

Implicit conversion from char to single character string

First of all: I know how to work around this issue. I'm not searching for a solution. I am interested in the reasoning behind the design choices that led to some implicit conversions and didn't lead to others.
Today I came across a small but influential error in our code base, where an int constant was initialised with a char representation of that same number. This results in an ASCII conversion of the char to an int. Something like this:
char a = 'a';
int z = a;
Console.WriteLine(z);
// Result: 97
I was confused why C# would allow something like this. After searching around I found the following SO question with an answer by Eric Lippert himself: Implicit Type cast in C#
An excerpt:
However, we can make educated guesses as to why implicit char-to-ushort was considered a good idea. The key idea here is that the conversion from number to character is a "possibly dodgy" conversion. It's taking something that you do not KNOW is intended to be a character, and choosing to treat it as one. That seems like the sort of thing you want to call out that you are doing explicitly, rather than accidentally allowing it. But the reverse is much less dodgy. There is a long tradition in C programming of treating characters as integers -- to obtain their underlying values, or to do mathematics on them.
I can agree with the reasoning behind it, though an IDE hint would be awesome. However, I have another situation where the implicit conversion suddenly is not legal:
char a = 'a';
string z = a; // CS0029 Cannot implicitly convert type 'char' to 'string'
This conversion is, in my humble opinion, very logical. It cannot lead to data loss and the intention of the writer is also very clear. Also, after I read the rest of the answer on the char-to-int implicit conversion, I still don't see any reason why this should not be legal.
So that leads me to my actual question:
What reasons could the C# design team have had not to implement the implicit conversion from char to string, when it appears so obvious (especially compared to the char-to-int conversion)?
First off, as I always say when someone asks "why not?" question about C#: the design team doesn't have to provide a reason to not do a feature. Features cost time, effort and money, and every feature you do takes time, effort and money away from better features.
But I don't want to just reject the premise out of hand; the question might be better phrased as "what are design pros and cons of this proposed feature?"
It's an entirely reasonable feature, and there are languages which allow you to treat single characters as strings. (Tim mentioned VB in a comment, and Python also treats chars and one-character strings as interchangeable IIRC. I'm sure there are others.) However, were I pitched the feature, I'd point out a few downsides:
This is a new form of boxing conversion. Chars are cheap value types. Strings are heap-allocated reference types. Boxing conversions can cause performance problems and produce collection pressure, and so there's an argument to be made that they should be more visible in the program, not less visible.
The feature will not be perceived as "chars are convertible to one-character strings". It will be perceived by users as "chars are one-character strings", and now it is perfectly reasonable to ask lots of knock-on questions, like: can I call .Length on a char? If I can pass a char to a method that expects a string, and I can pass a string to a method that expects an IEnumerable<char>, can I pass a char to a method that expects an IEnumerable<char>? That seems... odd. I can call Select and Where on a string; can I on a char? That seems even more odd. All the proposed feature does is move your question; had it been implemented, you'd now be asking "why can't I call Select on a char?" or some such thing.
Now combine the previous two points together. If I think of chars as one-character strings, and I convert a char to an object, do I get a boxed char or a string?
We can also generalize the second point a bit further. A string is a collection of chars. If we're going to say that a char is convertible to a collection of chars, why stop with strings? Why not also say that a char can also be used as a List<char>? Why stop with char? Should we say that an int is convertible to IEnumerable<int>?
We can generalize even further: if there's an obvious conversion from char to sequence-of-chars-in-a-string, then there is also an obvious conversion from char to Task<char> -- just create a completed task that returns the char -- and to Func<char> -- just create a lambda that returns the char -- and to Lazy<char>, and to Nullable<char> -- oh, wait, we do allow a conversion to Nullable<char>. :-)
All of these problems are solvable, and some languages have solved them. That's not the issue. The issue is: all of these problems are problems that the language design team must identify, discuss and resolve. One of the fundamental problems in language design is how general should this feature be? In two minutes I've gone from "chars are convertible to single-character strings" to "any value of an underlying type is convertible to an equivalent value of a monadic type". There is an argument to be made for both features, and for various other points on the spectrum of generality. If you make your language features too specific, it becomes a mass of special cases that interact poorly with each other. If you make them too general, well, I guess you have Haskell. :-)
Suppose the design team comes to a conclusion about the feature: all of that has to be written up in the design documents and the specification, and the code, and tests have to be written, and, oh, did I mention that any time you make a change to convertibility rules, someone's overload resolution code breaks? Convertibility rules you really have to get right in the first version, because changing them later makes existing code more fragile. There are real design costs, and there are real costs to real users if you make this sort of change in version 8 instead of version 1.
Now compare these downsides -- and I'm sure there are more that I haven't listed -- to the upsides. The upsides are pretty tiny: you avoid a single call to ToString or + "" or whatever you do to convert a char to a string explicitly.
That's not even close to a good enough benefit to justify the design, implementation, testing, and backwards-compat-breaking costs.
Like I said, it's a reasonable feature, and had it been in version 1 of the language -- which did not have generics, or an installed base of billions of lines of code -- then it would have been a much easier sell. But now, there are a lot of features that have bigger bang for smaller buck.
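For reference, the explicit spellings that the proposed conversion would save you from are all one-liners (this is just my illustration of the "ToString or + """ point above):

char c = 'a';

string s1 = c.ToString();       // instance call
string s2 = char.ToString(c);   // static equivalent
string s3 = "" + c;             // concatenation
string s4 = new string(c, 1);   // string(char, count) constructor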
Char is a value type. No, it is not numeric like int or double, but it still derives from System.ValueType, so a char variable lives on the stack.
String is a reference type. So it lives on the heap. If you want to use a value type as a reference type, you need to box it first. The compiler needs to be sure that you know what you're doing via boxing. So, you can't have an implicit cast.
Because when you cast a char to an int, you aren't just changing its type, you are getting its index in the ASCII table, which is a really different thing; they are not letting you "convert a char to an int", they are giving you useful data for a possible need.
Why not allow the cast of a char to a string?
Because they actually represent very different things, and are stored in very different ways.
A char and a string aren't two ways of expressing the same thing.

Why is there no "date" shorthand of System.DateTime in C#?

Such as int, long, ushort, uint, short, etc.
Why isn't there a short hand for System.DateTime?
Many types are associated with "shorthand" keywords in C#; for example, System.Int32 can also be written int and System.String can be written string. Why isn't there a shorthand for System.DateTime?
Before I answer that question -- or rather, fail to answer it -- let's first note the types that have shorthands in C#. They are
object
string
sbyte byte short ushort int uint long ulong
char
bool
decimal double float
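As a quick illustration of what these shorthands mean (my own example, not from any of the answers):

int a = 42;                  // 'int' is an alias for System.Int32
System.Int32 b = a;          // same type, no conversion involved
string s = "hello";          // 'string' is an alias for System.String
System.DateTime d = System.DateTime.Now;   // no keyword alias exists for this one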
Let me first address some of the other answers.
Mystere Man notes correctly that several of these keywords come from C; C has int and the rather wordy unsigned int, double, and a few others. However C most notably does not have bool, long, decimal, object or string.
I think we can reasonably say that one justification for including keywords int and char and so on is so that users familiar with C and C++ can be productive quickly in C#. The aim here is for the keywords to be familiar to C programmers, but it was absolutely not the aim for these keywords to have the same semantics as they do in C. C does not specify the size of any of those, for example, and strongly encourages int to be the "natural size" of the machine. C# does specify the exact size and range of each type.
So I do not think we can reasonably say that the justification for not having a shorthand for System.DateTime is because there was none in C. There was no string or decimal in C either.
Jason notes that DateTime is not "part of the C# language" and therefore does not have a keyword. But that is thoroughly begging the question! The question is essentially then "OK, so then why is DateTime not a part of the C# language?" Answering a question in a manner which requires a question of equal difficulty to be answered is called "begging the question", and this question has been thoroughly begged.
It is instructive to consider what are the "fundamental" types. All of the types that have keywords in C# are "very special" in some way, except for decimal. That is, the underlying runtime has special behaviour built into it for object, obviously, as it is the universal base type. string could have simply been an array of char, but it is not; strings are special. (Because they can be interned, they can be constant, they can exist in metadata, and so on.) The integral and binary floating point types all have special handling built into the framework for their operations.
But System.Decimal is just another struct type; it's 128 bits of integers and a whole lot of user-defined operators. Anyone could implement their own decimal arithmetic type if they wanted to. But "blessing" System.Decimal by making it a part of the C# language means that even though its conversions are implemented as methods, we treat them as built in conversions, not as user-defined conversions.
So decimal really is an odd one. It is not a "fundamental" type of the runtime, but it is a keyword.
This brings up an interesting point though. System.IntPtr and System.UIntPtr *are* fundamental types of the runtime. They are the "pointer sized integer" types; they are what C means by int and unsigned int. Even though these types are fundamental to the .NET runtime's type system, they do not get blessed with a keyword.
Thus, we can reject the argument that only "fundamental" types get a keyword. There is a non-fundamental type that got a keyword, and a fundamental type that did not get a keyword, so there is no one-to-one relationship between fundamental types and types that got a keyword.
Tigran opines that the choices were "historical", which is correct but does not actually answer the question.
Hans Passant notes correctly that clearly specifying the size and range of an int helps make the language behaviour consistent even as the native integer size changes, and notes that DateTime has already been designed to be "future proof". Though this analysis is correct, it does not explain why decimal is made a keyword. There's no fear that the "native decimal size" of a machine is going to change in the future. Moreover, the C# language already notes that though a double will always consume 8 bytes of storage, there is no requirement that C# restrict the processing of doubles to a mere 64 bits of precision; in fact C# programs often do double arithmetic in 80 or more bits of precision.
I don't think any of these answers successfully address the question. So let's return to the question:
Many types are associated with "shorthand" keywords in C#; for example, System.Int32 can also be written int and System.String can be written string. Why isn't there a shorthand for System.DateTime?
The answer to this question is the same as the answer to every question of the form "why does C# not implement a feature I like?" The answer is: we are not required to provide a justification for not implementing a feature. Features are expensive, and, as Raymond Chen often points out, are unimplemented by default. It takes no work to leave an unimplemented feature unimplemented.
The feature suggestion is not unreasonable at all; Visual Basic in some sense treats DateTime as a special type, and C# could too if we decided that it was worthwhile doing that work. But not every reasonable feature gets implemented.
The aliases for Int32, Int64, etcetera exist in the C# language to make it future-proof: to keep it relevant when everybody has a 256-bit core in their desktop machine. That would take a whole lotta hemming and hawing to really make happen, since the amount of C# code around that implicitly assumes int is 32 bits is rife. But it's not actually that hard: I moved chunks of code I wrote from CP/M to MS-DOS to Windows 3.x to Windows NT with surprisingly little effort.
That's not an issue for DateTime. It is future-proof until the year 10,000. I trust and hope that by then the machine understands what I meant, not what I typed :)
No one other than someone from the BCL/.NET team can give a true answer, but I think it's simply a historical reason: for types like char, float... and their "extensions" (like int -> uint mentioned in the comment) there is a shorthand, for others there is not.
I could ask the same question for other .NET Framework types, because we have a shorthand for string in .NET, yet string is a reference type. So...
Hope I've explained.

Do enums have a limit of members in C#?

I was wondering if the enum structure type has a limit on its members. I have this very large list of "variables" that I need to store inside an enum or as constants in a class but I finally decided to store them inside a class, however, I'm being a little bit curious about the limit of members of an enum (if any).
So, do enums have a limit on .Net?
Yes. The number of members with distinct values is limited by the underlying type of the enum - by default this is Int32, so you can have that many different members (2^32 - I find it hard to believe you will reach that limit), but you can explicitly specify the underlying type like this:
enum Foo : byte { /* can have at most 256 members with distinct values */ }
Of course, you can have as many members as you want if they all have the same value:
enum Foo { A, B = A, C = A, /* ... */ }
In either case, there is probably some implementation-defined limit in the C# compiler, but I would expect it to be MIN(range-of-Int32, free-memory), rather than a hard limit.
Due to a limit in the PE file format, you probably can't exceed some 100,000,000 values. Maybe more, maybe less, but definitely not a problem.
From the C# Language Specification 3.0, 1.10:
An enum type's storage format and range of possible values are determined by its underlying type.
While I'm not 100% sure, I would expect the Microsoft C# compiler to only allow non-negative enum values, so if the underlying type is an Int32 (it is, by default) then I would expect about 2^31 possible values; but this is an implementation detail, as it is not specified. If you need more than that, you're probably doing something wrong.
You could theoretically use Int64 as your base type in the enum and get 2^63 possible entries. Others have given you excellent answers on this.
I think there is a second implied question of should you use an enum for something with a huge number of items. This actually directly applies to your project in many ways.
One of the biggest considerations would be long-term maintainability. Do you think the company will ever change the list of values you are using? If so, will there need to be backward compatibility with previous lists? How significant a problem could this be? In general, a larger number of members in an enum correlates with a higher probability that the list will need to be modified at some future date.
Enums are great for many things. They are clean, quick and simple to implement. They work great with IntelliSense and make the next programmer's job easier, especially if the names are clear, concise and if needed, well documented.
The problem is an enumeration also comes with drawbacks. They can be problematic if they ever need to be changed, especially if the classes using them are being persisted to storage.
In most cases enums are persisted to storage as their underlying values, not as their friendly names.
enum InsuranceClass
{
    Home,    // value = 0 (int32)
    Vehicle, // value = 1 (int32)
    Life,    // value = 2 (int32)
    Health   // value = 3 (int32)
}
In this example the value InsuranceClass.Life would get persisted as a number 2.
If another programmer makes a small change to the system and adds Pet to the enum like this;
enum InsuranceClass
{
    Home,    // value = 0 (int32)
    Vehicle, // value = 1 (int32)
    Pet,     // value = 2 (int32)
    Life,    // value = 3 (int32)
    Health   // value = 4 (int32)
}
All of the data coming out of the storage will now show the Life policies as Pet policies. This is an extremely easy mistake to make and can introduce bugs that are difficult to track down.
The second major issue with enums is that every change of the data will require you to rebuild and redeploy your program. This can cause varying degrees of pain. On a web server that may not be a big issue, but if this is an app used on 5000 desktop systems you have an entirely different cost to redeploy your minor list change.
If your list is likely to change periodically you should really consider a system that stores that list in some other form, most likely outside your code. Databases were specifically designed for this scenario, or even a simple config file could be used (not the preferred solution). Smart planning for changes can reduce or avoid the problems associated with rebuilding and redeploying your software.
This is not a suggestion to prematurely optimize your system for the possibility of change, but more a suggestion to structure the code so that a likely change in the future doesn't create a major problem. Different situations will require different decisions.
Here are my rough rules of thumb for the use of enums:
1. Use them to classify and define other data, but not as data themselves. To be clearer, I would use InsuranceClass.Life to determine how the other data in a class should be used, but I would not make the underlying value of {pseudocode} InsuranceClass.Life = $653.00 and use the value itself in calculations. Enums are not constants; doing this creates confusion.
2. Use enums when the enum list is unlikely to change. Enums are great for fundamental concepts but poor for constantly changing ideas. When you create an enumeration it is a contract with future programmers that you want to avoid breaking.
3. If you must change an enum, then have a rule everyone follows that you add to the end, not the middle. The alternative is to define specific values for each member and never change those (see the sketch after this list). The point is that you are unlikely to know how others are using your enumeration's underlying values, and changing them can cause misery for anyone else using your code. This is an order of magnitude more important for any system that persists data.
4. The corollary to #2 and #3 is to never delete a member of an enum. There are specific circles of hell for programmers who do this in a codebase used by others.
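A minimal sketch of the "pin the values explicitly" alternative from #3, using the hypothetical insurance enum from above; new members get new numbers and previously persisted data stays stable:

enum InsuranceClass
{
    Home = 0,
    Vehicle = 1,
    Life = 2,
    Health = 3,
    Pet = 4   // added later; existing persisted values keep their meaning
}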
Hopefully that expanded on the answers in a helpful way.

Constant abuse?

I have run across a bunch of code in a few C# projects that have the following constants:
const int ZERO_RECORDS = 0;
const int FIRST_ROW = 0;
const int DEFAULT_INDEX = 0;
const int STRINGS_ARE_EQUAL = 0;
Has anyone ever seen anything like this? Is there any way to rationalize using constants to represent language constructs? I.e., C#'s first index in an array is at position 0. I would think that if a developer needs to depend on a constant to tell them that the language is 0-based, there is a bigger issue at hand.
The most common usage of these constants is in handling Data Tables or within 'for' loops.
Am I out of place thinking these are a code smell? I feel that these aren't a whole lot better than:
const int ZERO = 0;
const string A = "A";
Am I out of place thinking these are a code smell? I feel that these aren't a whole lot better than:
Compare the following:
if(str1.CompareTo(str2) == STRINGS_ARE_EQUAL) ...
with
if(str1.CompareTo(str2) == ZERO) ...
if(str1.CompareTo(str2) == 0) ...
Which one makes more immediate sense?
Abuse, IMHO. "Zero" is just one of the basics.
Although STRINGS_ARE_EQUAL might be easy to read, why not just ".Equals"?
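In other words, something like this (my sketch of the suggestion above):

if (str1.Equals(str2))
{
    // strings are equal
}

// or, null-safe:
if (string.Equals(str1, str2))
{
    // strings are equal
}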
Accepted limited use of magic numbers?
That's definitely a code smell.
The intent may have been to 'add readability' to the code, however things like that actually decrease the readability of code in my opinion.
Some people consider any raw number within a program to be a 'magic number'. I have seen coding standards that basically said that you couldn't just write an integer into a program, it had to be a const int.
Am I out of place thinking these are a code smell? I feel that these aren't a whole lot better than:
const int ZERO = 0;
const int A = 'A';
Probably a bit of smell, but definitely better than ZERO=0 and A='A'. In the first case they're defining logical constants, i.e. some abstract idea (string equality) with a concrete value implementation.
In your example, you're defining literal constants -- the variables represent the values themselves. If this is the case, I would think that an enumeration is preferred since they rarely are singular values.
That is definitely bad coding.
I say constants should be used only where needed, where things could possibly change sometime later. For instance, I have a lot of "configuration" options like SESSION_TIMEOUT defined where the value should stay the same, but maybe it could be tweaked later on down the road. I do not think ZERO can ever be tweaked down the road.
Also, zero should not be counted among the magic numbers.
I'm a bit strange on that belief though, I think, because I would say something like this is going too far:
//input is FIELD_xxx where xxx is a number
input.Substring(LENGTH_OF_FIELD_NAME); // cut out the FIELD_ to give us the number
You should have a look at some of the things at thedailywtf
One2Pt20462262185th
and
Enterprise SQL
I think sometimes people blindly follow 'coding standards' which say "Don't use hardcoded values, define them as constants so that it's easier to manage the code when it needs to be updated" - which is fair enough for stuff like:
const int MAX_NUMBER_OF_ELEMENTS_I_WILL_ALLOW = 100;
But does not make sense for:
if(str1.CompareTo(str2) == STRINGS_ARE_EQUAL)
Because every time I see this code I need to search for what STRINGS_ARE_EQUAL is defined as and then check with the docs whether that is correct.
Instead if I see:
if(str1.CompareTo(str2) == 0)
I skip step 1 (search what STRINGS_ARE... is defined as) and can check specs for what value 0 means.
You would correctly feel like replacing this with Equals() and using CompareTo() in cases where you are interested in more than just one case, e.g.:
switch (bla.CompareTo(bla1))
{
    case IS_EQUAL:   /* ... */ break;
    case IS_SMALLER: /* ... */ break;
    case IS_BIGGER:  /* ... */ break;
    default: break;
}
using if/else statements if appropriate (no idea what CompareTo() returns ...)
I would still check if you defined the values correctly according to specs.
This is of course different if the spec defines something like a ComparisonClass::StringsAreEqual value or something like that (I've just made that one up); then you would not use 0 but the appropriate variable.
So it depends: when you specifically need to access the first element in an array, arr[0] is better than arr[FIRST_ELEMENT], because I will still go and check what you have defined FIRST_ELEMENT as; I will not trust you, and it might be something other than 0 - for example your 0 element is a dud and the real first element is stored at 1 - who knows.
I'd go for code smell. If these kinds of constants are necessary, put them in an enum:
enum StringEquality
{
Equal,
NotEqual
}
(However I suspect STRINGS_ARE_EQUAL is what gets returned by string.Compare, so hacking it to return an enum might be even more verbose.)
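If you did go that way, the wrapper might look something like this (purely hypothetical and, as noted, arguably more verbose than the raw comparison):

static StringEquality Compare(string a, string b)
{
    return string.Compare(a, b) == 0
        ? StringEquality.Equal
        : StringEquality.NotEqual;
}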
Edit: Also SHOUTING_CASE isn't a particularly .NET-style naming convention.
I don't know if I would call them smells, but they do seem redundant. Though DEFAULT_INDEX could actually be useful.
The point is to avoid magic numbers, and zeros aren't really magical.
Is this code something in your office or something you downloaded?
If it's in the office, I think it's a problem with management if people are randomly placing constants around. Globally, there shouldn't be any constants unless everyone has a clear idea or agreement of what those constants are used for.
In C# ideally you'd want to create a class that holds constants that are used globally by every other class. For example,
class MathConstants
{
    public const int ZERO = 0;
}
Then in later classes something like:
....
if (something == MathConstants.ZERO)
...
At least that's how I see it. This way everyone can understand what those constants are without even reading anything else. It would reduce confusion.
There are generally four reasons I can think of for using a constant:
As a substitute for a value that could reasonably change in the future (e.g., IdColumnNumber = 1).
As a label for a value that may not be easy to understand or meaningful on its own (e.g. FirstAsciiLetter = 65),
As a shorter and less error-prone way of typing a lengthy or hard to type value (e.g., LongSongTitle = "Supercalifragilisticexpialidocious")
As a memory aid for a value that is hard to remember (e.g., PI = 3.14159265)
For your particular examples, here's how I'd judge each example:
const int ZERO_RECORDS = 0;
// almost definitely a code smell
const int FIRST_ROW = 0;
// first row could be 1 or 0, so this potentially fits reason #2,
// however, doesn't make much sense for standard .NET collections
// because they are always zero-based
const int DEFAULT_INDEX = 0;
// this fits reason #2, possibly #1
const int STRINGS_ARE_EQUAL = 0;
// this very nicely fits reason #2, possibly #4
// (at least for anyone not intimately familiar with string.CompareTo())
So, I would say that, no, these are not worse than Zero = 0 or A = "A".
If the zero indicates something other than zero (in this case STRINGS_ARE_EQUAL) then that IS Magical. Creating a constant for it is both acceptable and makes the code more readable.
Creating a constant called ZERO is pointless and a waste of finger energy!
Smells a bit, but I could see cases where this would make sense, especially if you have programmers switching from language to language all the time.
For instance, MATLAB is one-indexed, so I could imagine someone getting fed up with making off-by-one mistakes whenever they switch languages, and defining DEFAULT_INDEX in both C++ and MATLAB programs to abstract the difference. Not necessarily elegant, but if that's what it takes...
Right you are to question this smell, young code warrior. However, these named constants derive from coding practices much older than the dawn of Visual Studio. They probably are redundant, but you could do worse than to understand the origin of the convention. Think NASA computers, way back when...
You might see something like this in a cross-platform situation where you would use the file with the set of constants appropriate to the platform. But probably not with these actual examples. This looks like a COBOL coder was trying to make his C# look more like the English language (no offence intended to COBOL coders).
It's all right to use constants to represent abstract values, but quite another to represent constructs in your own language.
const int FIRST_ROW = 0 doesn't make sense.
const int MINIMUM_WIDGET_COUNT = 0 makes more sense.
The presumption that you should follow a coding standard makes sense. (That is, coding standards are presumptively correct within an organization.) Slavishly following it when the presumption isn't met doesn't make sense.
So I agree with the earlier posters that some of the smelly constants probably resulted from following a coding standard ("no magic numbers") to the letter without exception. That's the problem here.

Alternatives to nullable types in C#

I am writing algorithms that work on series of numeric data, where sometimes, a value in the series needs to be null. However, because this application is performance critical, I have avoided the use of nullable types. I have perf tested the algorithms to specifically compare the performance of using nullable types vs non-nullable types, and in the best case scenario nullable types are 2x slower, but often far worse.
The data type most often used is double, and currently the chosen alternative to null is double.NaN. However I understand this is not the exact intended usage for the NaN value, so am unsure whether there are any issues with this I cannot foresee and what the best practise would be.
I am interested in finding out what the best null alternatives are for the following data types in particular: double/float, decimal, DateTime, int/long (although others are more than welcome)
Edit: I think I need to clarify my requirements about performance. Gigs of numerical data are processed through these algorithms at a time, which takes several hours. Therefore, although the difference between e.g. 10ms and 20ms is usually insignificant, in this scenario it really does make a significant impact on the time taken.
Well, if you've ruled out Nullable<T>, you are left with domain values - i.e. a magic number that you treat as null. While this isn't ideal, it isn't uncommon either - for example, a lot of the main framework code treats DateTime.MinValue the same as null. This at least moves the damage far away from common values...
Edit: to highlight that this applies only where there is no NaN.
So where there is no NaN, maybe use .MinValue - but just remember what evils happen if you accidentally use that same value meaning the same number...
Obviously for unsigned data you'll need .MaxValue (avoid zero!!!).
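Putting that together, the magic-value pattern under discussion looks roughly like this (my sketch; the variable names are invented):

// DateTime has no NaN, so MinValue stands in for "no value"
DateTime lastSeen = DateTime.MinValue;
bool hasValue = lastSeen != DateTime.MinValue;

// for unsigned data, MaxValue is the safer sentinel (zero is far too common)
uint recordId = uint.MaxValue;
bool hasRecord = recordId != uint.MaxValue;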
Personally, I'd try to use Nullable<T> as expressing my intent more safely... there may be ways to optimise your Nullable<T> code, perhaps. And also - by the time you've checked for the magic number in all the places you need to, perhaps it won't be much faster than Nullable<T>?
I somewhat disagree with Gravell on this specific edge case: a null-ed variable is considered 'not defined', it doesn't have a value. So whatever is used to signal that is OK: even magic numbers, but with magic numbers you have to take into account that a magic number will always haunt you in the future when it suddenly becomes a 'valid' value. With Double.NaN you don't have to be afraid of that: it's never going to become a valid double. Though you have to consider that NaN, in the sense of the sequence of doubles, can only be used as a marker for 'not defined'; you can't use it as an error code in the sequences as well, obviously.
So whatever is used to mark 'undefined': it has to be clear in the context of the set of values that that specific value is considered the value for 'undefined' AND that won't change in the future.
If Nullable give you too much trouble, use NaN, or whatever else, as long as you consider the consequences: the value chosen represents 'undefined' and that will stay.
I am working on a large project that uses NaN as a null value. I am not entirely comfortable with it - for similar reasons as yours: not knowing what can go wrong. We haven't encountered any real problems so far, but be aware of the following:
NaN arithmetic - While, most of the time, "NaN promotion" is a good thing, it might not always be what you expect.
Comparison - Comparison of values gets rather expensive if you want NaNs to compare equal. Now, testing floats for equality isn't simple anyway, but ordering (a < b) can get really ugly, because NaNs sometimes need to be smaller, sometimes larger than normal values.
Code Infection - I see lots of arithmetic code that requires specific handling of NaNs to be correct. So you end up with "functions that accept NaNs" and "functions that don't" for performance reasons.
Other non-finites - NaN is not the only non-finite value. Should be kept in mind...
Floating Point Exceptions are not a problem when disabled. Until someone enables them. True story: static initialization of a NaN in an ActiveX control. Doesn't sound scary, until you change the installation to use InnoSetup, which uses a Pascal/Delphi(?) core, which has FPU exceptions enabled by default. Took me a while to figure out.
So, all in all, nothing serious, though I'd prefer not to have to consider NaNs that often.
I'd use Nullable types as often as possible, unless they are (proven to be) a performance/resource constraint. One case could be large vectors/matrices with occasional NaNs, or large sets of named individual values where the default NaN behavior is correct.
Alternatively, you can use an index vector for vectors and matrices, standard "sparse matrix" implementations, or a separate bool/bit vector.
Partial answer:
Float and Double provide NaN (Not a Number). NaN is a little tricky since, per spec, NaN != NaN. If you want to know if a number is NaN, you'll need to use Double.IsNaN().
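A tiny demonstration of that quirk (my own example):

double missing = double.NaN;

Console.WriteLine(missing == double.NaN);  // False: NaN compares unequal to everything, itself included
Console.WriteLine(double.IsNaN(missing));  // True: the correct test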
See also Binary floating point and .NET.
Maybe the significant performance decrease happens when calling one of Nullable's members or properties (boxing).
Try to use a struct with the double + a boolean telling whether the value is specified or not.
One can avoid some of the performance degradation associated with Nullable<T> by defining your own structure
struct MaybeValid<T>
{
    public bool isValue;
    public T Value;
}
If desired, one may define a constructor, or a conversion operator from T to MaybeValid<T>, etc., but overuse of such things may yield sub-optimal performance. Exposed-field structs can be efficient if one avoids unnecessary data copying. Some people may frown upon the notion of exposed fields, but they can be massively more efficient than properties. If a function that will return a T would need to have a variable of type T to hold its return value, using a MaybeValid<Foo> simply increases by 4 the size of the thing to be returned. By contrast, using a Nullable<Foo> would require that the function first compute the Foo and then pass a copy of it to the constructor for the Nullable<Foo>. Further, returning a Nullable<Foo> will require that any code that wants to use the returned value must make at least one extra copy to a storage location (variable or temporary) of type Foo before it can do anything useful with it. By contrast, code can use the Value field of a variable of type MaybeValid<Foo> about as efficiently as any other variable.
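For illustration, a hypothetical helper built on the struct above (the name and scenario are mine, not from the answer):

static MaybeValid<double> TryParseReading(string text)
{
    MaybeValid<double> result;
    // TryParse fills in Value via the out parameter; its return value fills in isValue.
    result.isValue = double.TryParse(text, out result.Value);
    return result;
}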
