Possible Duplicates:
Why does .NET use int instead of uint in certain classes?
Why is Array.Length an int, and not an uint
I've always wondered why .Count isn't an unsigned integer instead of a signed one?
For example, take ListView.SelectedItems.Count. The number of elements can't be less than 0, so why is it a signed int?
If I try to test if there are elements selected, I would like to test
if (ListView.SelectedItems.Count == 0) {}
but because it's a signed integer, I have to test
if (ListView.SelectedItems.Count <= 0) {}
or is there any case when .Count could be < 0 ?
Unsigned integers are not CLS-compliant (CLS: Common Language Specification).
For more info on CLS compliant code, see this link:
http://msdn.microsoft.com/en-us/library/bhc3fa7f.aspx
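To see what CLS compliance means in practice, here is a minimal sketch (the Widget class and its members are made up for illustration): mark the assembly CLS-compliant and expose a uint publicly, and the C# compiler warns that the member is not CLS-compliant.

using System;

[assembly: CLSCompliant(true)]

public class Widget
{
    // The compiler warns that the type of 'Widget.Count' is not CLS-compliant,
    // because uint is not guaranteed to exist in every .NET language.
    public uint Count { get; set; }

    // An int member raises no such warning.
    public int SafeCount { get; set; }
}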
Maybe because the uint data type is not part of the CLS (Common Language Specification), as not all .NET languages support it.
Here is a very similar thread about arrays:
Why is Array.Length an int, and not an uint
It's not CLS compliant, largely to allow wider support from different languages.
A signed int offers ease in porting code from C or C++ that uses pointer arithmetic.
Count can be part of an expression where the overall value can be negative. In particular, count has a direct relationship to indices, where valid indices are always in the range [0, Count - 1], but negative results are used e.g. by some binary search methods (including those provided by the BCL) to reflect the position where a new item should be inserted to maintain order.
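For example, Array.BinarySearch in the BCL uses exactly this convention; a minimal sketch:

using System;

int[] sorted = { 1, 3, 5, 7 };
int index = Array.BinarySearch(sorted, 4);   // 4 is not present, so the result is negative

if (index < 0)
{
    // The bitwise complement of the result is the index where 4 would
    // have to be inserted to keep the array sorted.
    int insertAt = ~index;
    Console.WriteLine(insertAt);             // 2 (just before the 5)
}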
Let’s look at this from a practical angle.
For better or worse, signed ints are the normal sort of ints in use in .NET. It was also normal to use signed ints in C and C++. So, most variables are declared to be int rather than unsigned int unless there is a good reason otherwise.
Converting between an unsigned int and a signed int has issues and is not always safe.
On a 32-bit system it is not possible for a collection to have anywhere close to 2^32 items in it, so a signed int is big enough in all cases.
On a 64-bit system, an unsigned int does not gain you much; in most cases a signed int is still big enough, and otherwise you need to use a 64-bit int. (I expect that none of the standard collections would cope well with anywhere near 2^31 items on a 64-bit system!)
Therefore given that using an unsigned int has no clear advantage, why would you use an unsigned int?
In VB.NET, the normal looping construct (a "For/Next loop") executes the loop with values up to and including the specified maximum, unlike C, where it is easy to loop with values strictly below the upper limit. Thus it is often necessary to write a loop as e.g. "For I=0 To Array.Length-1"; if Array.Length were unsigned and zero, that would cause an obvious problem. Even in C, one benefits from being able to say "for (i=Array.Length-1; i >= 0; --i)". Sometimes I think it would be useful to have a 31-bit integer type which would support widening casts to both signed and unsigned int, but I've never heard of a language supporting such.
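A small C# sketch of the reverse-loop point (the array contents are just for illustration):

using System;

int[] array = { 10, 20, 30 };

// Because Length is a signed int, the classic reverse loop terminates:
// i eventually reaches -1, the condition fails, and the loop exits.
for (int i = array.Length - 1; i >= 0; --i)
{
    Console.WriteLine(array[i]);
}

// If i were a uint instead, "i >= 0" would always be true (decrementing
// past zero wraps around to uint.MaxValue), and the loop would never end.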
Related
I always come across code that uses int for things like .Count, etc, even in the framework classes, instead of uint.
What's the reason for this?
UInt32 is not CLS-compliant, so it might not be available in every language that targets .NET. Int32 is CLS-compliant and is therefore guaranteed to exist in all of them.
int, in C, is intended to be the natural integer size for the target processor, and is therefore held to be the fastest type for general numeric operations.
Unsigned types only behave like whole numbers if the sum or product of a signed and unsigned value will be a signed type large enough to hold either operand, and if the difference between two unsigned values is a signed value large enough to hold any result. Thus, code which makes significant use of UInt32 will frequently need to compute values as Int64. Operations on signed integer types may fail to operate like whole numbers when the operands are overly large, but they'll behave sensibly when operands are small. Operations on unpromoted arguments of unsigned types pose problems even when operands are small. Given UInt32 x; for example, the inequality x-1 < x will fail for x==0 if the result type is UInt32, and the inequality x<=0 || x-1>=0 will fail for large x values if the result type is Int32. Only if the operation is performed on type Int64 can both inequalities be upheld.
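A quick C# illustration of the first inequality (the value of x is just an example):

using System;

uint x = 0;

// In uint arithmetic, x - 1 wraps around to uint.MaxValue,
// so the whole-number inequality x - 1 < x fails when x == 0:
Console.WriteLine(x - 1 < x);          // False

// Promoting to a signed 64-bit type restores whole-number behaviour:
Console.WriteLine((long)x - 1 < x);    // True (-1 < 0)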
While it is sometimes useful to define unsigned-type behavior in ways that differ from whole-number arithmetic, values which represent things like counts should generally use types that will behave like whole numbers--something unsigned types generally don't do unless they're smaller than the basic integer type.
UInt32 isn't CLS-Compliant. http://msdn.microsoft.com/en-us/library/system.uint32.aspx
I think that over the years people have come to the conclusions that using unsigned types doesn't really offer that much benefit. The better question is what would you gain by making Count a UInt32?
Some things use int so that they can return -1 as if it were "null" or something like that. For example, a ComboBox will return -1 for its SelectedIndex if it doesn't have any item selected.
If the number is truly unsigned by its intrinsic nature then I would declare it an unsigned int. However, if I just happen to be using a number (for the time being) in the positive range then I would call it an int.
The main reasons being that:
It avoids having to do a lot of type-casting as most methods/functions are written to take an int and not an unsigned int.
It eliminates possible truncation warnings.
You invariably end up wishing you could assign a negative value to the number that you had originally thought would always be positive.
Those are just a few quick thoughts that came to mind.
I used to try to be very careful and choose the proper unsigned/signed type, and I finally realized that it doesn't really produce a positive benefit. It just creates extra work. So why make things hard by mixing and matching?
Some old libraries, and even InStr, use negative numbers to mean special cases. I believe it's either laziness or a genuine need for negative special values.
The documentation for the Random.Next() method states that it returns:
A 32-bit signed integer that is greater than or equal to 0 and less than MaxValue.
But, I took a peek at the implementation, and while I don't understand the algorithm (a quick Google search suggests that it is a subtractive generator), I can't see any way in which a value of exactly int.MaxValue is ruled out.
If, for pedantic reasons, someone wants a random number across the entire range of 32-bit integers, does Random.Next() alone suffice, or does it become necessary to do something like assemble two separate 16-bit samples?
It will always be less than int.MaxValue.
In your linked source code it explicitly handles int.MaxValue:
if (retVal == MBIG) retVal--;
MBIG is defined earlier:
private const int MBIG = int.MaxValue;
https://github.com/dotnet/runtime/blob/master/src/libraries/System.Private.CoreLib/src/System/Random.cs#L105
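If you really do need a value drawn from the full Int32 range, one common workaround (a sketch, not something Random.Next provides directly) is to fill four random bytes and reinterpret them as an int:

using System;

var rng = new Random();
byte[] buffer = new byte[4];
rng.NextBytes(buffer);                             // four uniformly random bytes
int fullRange = BitConverter.ToInt32(buffer, 0);   // anything from int.MinValue to int.MaxValue
Console.WriteLine(fullRange);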
There are already solutions to this problem for small numbers:
Here: Difference between 2 numbers
Here: C# function to find the delta of two numbers
Here: How can I find the difference between 2 values in C#?
I'll summarise the answer to them all:
Math.Abs(a - b)
The problem is that when the numbers are large this gives the wrong answer (by means of an overflow). Worse still, if (a - b) = Int32.MinValue then Math.Abs crashes with an exception (because -Int32.MinValue is Int32.MaxValue + 1, which does not fit in an Int32):
System.OverflowException occurred
HResult=0x80131516
Message=Negating the minimum value of a twos complement number is invalid.
Source=mscorlib
StackTrace:
   at System.Math.AbsHelper(Int32 value)
   at System.Math.Abs(Int32 value)
The edge-case nature of this failure leads to difficult-to-reproduce bugs.
Maybe I'm missing some well known library function, but is there any way of determining the difference safely?
As suggested by others, use BigInteger as defined in System.Numerics (you'll have to reference the System.Numerics assembly and include the namespace).
Then you can just do:
using System.Numerics;

BigInteger a = new BigInteger();
BigInteger b = new BigInteger();

// Assign values to a and b somewhere in here...

// Then just use the included BigInteger.Abs method
BigInteger result = BigInteger.Abs(a - b);
Jeremy Thompson's answer is still valid, but note that System.Numerics includes an absolute value method for BigInteger, so there shouldn't be any need for special logic. Also, Math.Abs has no overload for BigInteger, so it will give you grief if you try to pass one in.
Keep in mind there are caveats to using BigIntegers. If you have a ludicrously large number, C# will try to allocate memory for it, and you may run into out of memory exceptions. On the flip side, BigIntegers are great because the amount of memory allotted to them is dynamically changed as the number gets larger.
Check out the microsoft reference here for more info: https://msdn.microsoft.com/en-us/library/system.numerics.biginteger(v=vs.110).aspx
The question is, how do you want to hold the difference between two large numbers? If you're calculating the difference between two signed long (64-bit) integers, for example, and the difference will not fit into a signed long integer, how do you intend to store it?
long a = (1L << 62) + 1000;
long b = -(1L << 62);
long dif = a - b; // overflow: the true difference does not fit, so the high-order bits are lost
The difference between a and b is wider than 64 bits, so when it's stored into a long integer, its high-order bits are truncated, and you get a strange value for dif.
In other words, you cannot store all possible differences between signed integer values of a given width into a signed integer of the same width. (You can only store half of all of the possible values; the other half require an extra bit.)
Your options are to either use a wider type to hold the difference (which won't help you if you're already using the widest integer type, long), or to use a different arithmetic type. If you need more than 64 bits of signed precision, you'll probably need to use BigInteger.
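A minimal sketch of that last option, computing the difference of two longs in BigInteger arithmetic so it cannot overflow:

using System;
using System.Numerics;

long a = long.MaxValue;
long b = long.MinValue;

// Convert one operand first so the subtraction itself is done in BigInteger.
BigInteger difference = BigInteger.Abs((BigInteger)a - b);
Console.WriteLine(difference);   // 18446744073709551615, which no long could hold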
BigInteger was introduced in .NET 4.0.
There are some open source implementations available in lower versions of the .Net Framework, however you'd be wise to go with the standard.
If Math.Abs still gives you grief, you can implement the function yourself: if the number is negative (a - b < 0), simply negate it.
Also, have you tried using Doubles? They hold much larger values.
Here's an alternative that might be interesting to you, but it is very much within the confines of a particular int size. This example uses Int32, and uses bitwise operators to compute the difference and then its absolute value. The implementation is tolerant of your scenario where a - b equals the minimum int value: it naturally returns the minimum int value (there is not much else it can do without casting to a larger data type). I don't think this is as good an answer as using BigInteger, but it is fun to play with if nothing else:
static int diff(int a, int b)
{
    int xorResult = (a ^ b);                      // bits where a and b differ
    int diff = (a & xorResult) - (b & xorResult); // equals a - b (same wraparound behaviour)
    return (diff + (diff >> 31)) ^ (diff >> 31);  // branchless absolute value
}
Here are some cases I ran it through to play with the behavior:
Console.WriteLine(diff(13, 14)); // 1
Console.WriteLine(diff(11, 9)); // 2
Console.WriteLine(diff(5002000, 2346728)); // 2655272
Console.WriteLine(diff(int.MinValue, 0)); // Should be 2147483648, but int data type can't go that large. Actual result will be -2147483648.
In the Conversions section of Chapter 2 of C# 5.0 in a Nutshell, the author says:
...Conversions can be either implicit or explicit: implicit conversions happen automatically, and explicit conversions require a cast. In the following example, we implicitly convert an int to long type (which has twice the bitwise capacity of an int)...
This is the example:
int x = 12345; // int is a 32-bit integer
long y = x; // Implicit conversion to 64-bit integer
short z = (short)x; // Explicit conversion to 16-bit integer
Is there a relationship between bitwise capacity and bit capacity? Or, what is the author's point with respect to bitwise capacity?
I think he wants to differentiate between "bitwise capacity" and "numeric capacity".
In the example, the data types differ in bitwise capacity: int has 32 bits, long 64 and short 16. In this case, conversions to data types with higher bitwise capacity happen implicitly, while conversions to data types with lower bitwise capacity require an explicit cast.
On the other hand, there's something like "numeric capacity" where int and uint do share the same number of bits (they have the same "bitwise capacity"), but are still not fully compatible in terms of values you can store (uint has no support for negative values).
I think they mean “capacity, with respect to bits”. If they had left out the “bitwise” part, then it could easily be interpreted as “this type holds twice as many values as the other type”, which is wrong: it holds much more than twice the number of values. It holds twice the number of bits, which increases the number of values exponentially.
It is the same thing. It just means that you have twice as many bits to represent your value, which means you can store much larger numbers. Numeric capacity is therefore tied to bitwise capacity since the more bits the higher numeric capacity.
With a 64-bit data type you can represent your value using 64 binary digits, which is what gives it the larger numeric capacity.
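To make the capacity point concrete, here is a small example (the values are chosen just for illustration):

using System;

int x = 70000;        // fits comfortably in 32 bits
long y = x;           // implicit: every int value fits in a long
short z = (short)x;   // explicit: only the low 16 bits survive
Console.WriteLine(y); // 70000
Console.WriteLine(z); // 4464 (70000 mod 65536): the value no longer fits, so it is truncated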
For example, why long int has a literal modifier, but short int does not? I am referring to the following question on this site: C# compiler number literals
In general, C# seems to be a very well designed and consistent language. Probably there is a strong reason to provide literal modifiers for some types, but not for all. What is it?
Why long int has a literal modifier, but short int does not?
The question is "why does C# not have this feature?" The answer to that question is always the same. Features are unimplemented by default; C# does not have that feature because no one designed, implemented and shipped the feature to customers.
The absence of a feature does not need justification. Rather, all features must be justified by showing that their benefits outweigh their costs. As the person proposing the feature, the onus is on you to describe why you think the feature is valuable; the onus is not on me to explain why it is not.
Probably there is a strong reason to provide literal modifiers for some types, but not for all. What is it?
Now that is a more answerable question. Now the question is "what justifies the literal suffix on long, and why is that not also a justification for a similar literal suffix on short?"
Integers can be used for a variety of purposes. You can use them as arithmetic numbers. You can use them as collections of bit flags. You can use them as indices into arrays. And there are lots of more special-purpose usages. But I think it is fair to say that most of the time, integers are used as arithmetical numbers.
The vast majority of calculations performed in integers by normal programs involve numbers that are far, far smaller than the range of a 32 bit signed integer -- roughly +/- two billion. And lots of modern hardware is extremely efficient when dealing solely with 32 bit integers. It therefore makes sense to make the default representation of numbers to be signed 32 bit integers. C# is therefore designed to make calculations involving 32 bit signed integers look perfectly normal; when you say "x = x + 1" that "1" is understood to be a signed 32 bit integer, and odds are good that x is too, and the result of the sum is too.
What if the calculation is integral but does not fit into the range of a 32 bit integer? "long" 64 bit integers are a sensible next step up; they are also efficient on a lot of hardware and longs have a range that should satisfy the needs of pretty much anyone who isn't doing heavy-duty combinatorics that involve extremely large numbers. It therefore makes sense to have some way to specify clearly and concisely in source code that this literal here is to be treated as a long integer.
Interop scenarios, or scenarios in which integers are used as bit fields, often require the use of unsigned integers. Again, it makes sense to have a way to clearly and concisely specify that this literal is intended to be treated as an unsigned integer.
So, summing up, when you see "1", odds are good that the vast majority of the time the user intends it to be used as a 32 bit signed integer. The next most likely cases are that the user intends it to be a long integer or an unsigned int or unsigned long. Therefore there are concise suffixes for each of those cases.
Thus, the feature is justified.
Why is that not a justification for shorts?
Because first, in every context in which a short is legal, it is already legal to use an integer literal. "short x = 1;" is perfectly legal; the compiler realizes that the integer fits into a short and lets you use it.
Second, arithmetic is never done in shorts in C#. Arithmetic can be done in ints, uints, longs and ulongs, but arithmetic is never done in shorts. Shorts promote to int and the arithmetic is done in ints, because like I said before, the vast majority of arithmetic calculations fit into an int. The vast majority do not fit into a short. Short arithmetic is possibly slower on modern hardware which is optimized for ints, and short arithmetic does not take up any less space; it's going to be done in ints or longs on the chip.
You want a "long" suffix to tell the compiler "this arithmetic needs to be done in longs" but a "short" suffix doesn't tell the compiler "this arithmetic needs to be done in shorts" because that's simply not a feature of the C# language to begin with.
The reasons for providing a long suffix and an unsigned syntax do not apply to shorts. If you think there is a compelling benefit to the feature, state what the benefit is. Without a benefit to justify its costs, the feature will not be implemented in C#.
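A small illustration of the points above (the variable names are made up for the example):

using System;

// Integer literals default to int; suffixes select long, uint or ulong.
var i = 1;      // int
var l = 1L;     // long
var u = 1U;     // uint
var ul = 1UL;   // ulong
Console.WriteLine($"{i.GetType().Name} {l.GetType().Name} {u.GetType().Name} {ul.GetType().Name}");
// Int32 Int64 UInt32 UInt64

// No suffix is needed for short: a constant that fits is converted implicitly.
short s = 1;              // fine
// short tooBig = 40000;  // compile-time error: the constant does not fit in a short

// Arithmetic is never done in shorts; the operands are promoted to int first.
short a = 2, b = 3;
// short sum = a + b;     // compile-time error: a + b is an int
int sum = a + b;          // OK
Console.WriteLine(sum);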
According to MSDN:
short x = 32767;
In the preceding declaration, the integer literal 32767 is implicitly converted from int to short. If the integer literal does not fit into a short storage location, a compilation error will occur.
So it is a compile time feature. short does not have a suffix because it would never be needed.
The related question probably is: why do long, float and decimal have suffixes?
And a short answer would be that i + 1 and i + 1L can produce different values and are therefore of different types.
But there exists no such thing as 'short arithmetic', short values are always converted to int when used in a calculation.
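A quick illustration of that difference:

using System;

int i = int.MaxValue;
Console.WriteLine(i + 1);    // -2147483648: the addition is done in int and wraps (default unchecked context)
Console.WriteLine(i + 1L);   // 2147483648: the 1L forces the addition to be done in long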
As Eric points out in the comment, my answer below doesn't make sense. I think it's more correct to say that the inability to express a short literal in C# and the inability to express a short literal in IL share a common cause (the lack of a compelling reason for the feature.) VB.Net apparently has a short literal specifier, which is interesting (for backwards compatibility with VB syntax?) In any case, I've left the answer here as some of the information may be interesting, even if the reasoning is incorrect.
There is no short literal because there is not actually any way for a short literal to be loaded in IL, the underlying language used by the CLR. This is because all 'short' types (anything smaller than an int) are implicitly widened to an int when loaded onto the operation stack. Signed and unsigned are likewise a matter of operations and not actually 'stored' with the active number on the operation stack. The 'short' types only come into play when you want to store a number on the operation stack into a memory location, so there are IL operations to Convert to various 'short' types (though it actually still widens the number back to an int after the conversion; it just makes sure that the value will be suitable for storing into a field of the 'short' type.)
Long types have a literal specifier, on the other hand, due to the fact that they are treated differently on the operation stack. There is a separate Ldc_I8 instruction for loading constant long values. There are also Ldc_R4 (hence why you need 'f' for float) and Ldc_R8 (C# chooses this as its default if you use a decimal number without a specifier). Decimal is a special case, as it's not actually a primitive type in IL; it just has a built-in constant specifier 'm' in C# that compiles to a constructor call.
As for why there are no special short operations (and corresponding short literals), that's likely because most current CPU architectures do not operate with registers smaller than 32-bits, so there is no distinction at the CPU level worth exploiting. You could potentially save code size (in terms of bytes of IL) by allowing for 'short' load IL opcodes, but at the cost of additional complexity for the jitter; the code space saved is probably not worth it.
Since a short can be implicitly converted to int, long, float, double, or decimal, there's no need for a literal modifier.
Consider:
void method(int a) {}

void method2()
{
    short a = 4;
    method(a); // no problems
}
You may notice that char and byte also lack literal modifiers, possibly for this same reason.
From      To
sbyte     short, int, long, float, double, or decimal
byte      short, ushort, int, uint, long, ulong, float, double, or decimal
short     int, long, float, double, or decimal
ushort    int, uint, long, ulong, float, double, or decimal
int       long, float, double, or decimal
uint      long, ulong, float, double, or decimal
long      float, double, or decimal
char      ushort, int, uint, long, ulong, float, double, or decimal
float     double
ulong     float, double, or decimal
If you assign an integer literal to a short and it is larger than short.MaxValue, a compiler error will occur; otherwise the literal is implicitly converted to a short.
The time I have "worked in Short" was for values that are stored in a Database.
They are positive integer values that will rarely go over 10 to 20. (A byte or sbyte would be big enough, but I figured a little overkill would keep me from regretting my choice if the code got reused in a slightly different way.)
The field is used to let the user sort the records in a table. This table feeds a drop down or radio button list that is ordered by "time" (step one, step two, ...).
Being new to C# (and old enough to remember when counting bytes was important) I figured it would be a little more efficient. I don't do math on the values. I just sort them (and swap them between records). The only math so far has been "MaxInUse"+1 (for new records), which is the special case "++MaxInUse". That is just as well, because the lack of short arithmetic means "s = s+2" would have to be written "s = (Int16)(s+2)".
Now that I see how annoying C# makes working with the other ints, I expect to join the modern world and waste bytes, JUST to make the compiler happy.
But, shouldn't "making the compiler happy" rank about #65 in our top 10 programming goals?
Is it EVER an advantage to have the compiler complain about adding the integer "2" to ANY of the INTEGER types? It should complain about "s=123456", but that's a different case.
If anyone does have to deal with math AND shorts, I suggest you make your own literals. (How many could you need?)
short s1 = 1, s2 = 2, s123 = 123;
Then s += s2 is only a little annoying (s = s + s2 still will not compile, because the addition is done in int), and perhaps confusing for those who follow after you.
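For what it's worth, a small sketch of how that plays out (continuing the s/s2 example above):

using System;

short s = 1, s2 = 2;

// s = s + s2;        // does not compile: s + s2 is an int
s += s2;              // compiles: compound assignment inserts the narrowing cast for you
s = (short)(s + s2);  // the explicit alternative
Console.WriteLine(s); // 5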