In the Conversions section of Chapter 2 of C# 5.0 in a Nutshell, the author says:
...Conversions can be either implicit or explicit: implicit conversions happen automatically, and explicit conversions require a cast. In the following example, we implicitly convert an int to long type (which has twice the bitwise capacity of an int)...
This is the example:
int x = 12345; // int is a 32-bit integer
long y = x; // Implicit conversion to 64-bit integer
short z = (short)x; // Explicit conversion to 16-bit integer
Is there a relationship between bitwise capacity and bit capacity? Or, what is the author's point with respect to bitwise capacity?
I think he wants to differentiate between "bitwise capacity" and "numeric capacity".
In the example, the data types differ in bitwise capacity: int has 32 bits, long 64 and short 16. In this case, conversions to data types with a higher bitwise capacity happen implicitly, while conversions to data types with a lower bitwise capacity must be explicit.
On the other hand, there's something like "numeric capacity", where int and uint share the same number of bits (they have the same "bitwise capacity") but are still not fully compatible in terms of the values you can store (uint has no support for negative values).
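To illustrate that, a minimal sketch (the variable names are my own) showing that int and uint, despite both being 32 bits wide, need explicit casts in both directions:

using System;

int negative = -1;
uint big = 3000000000;          // larger than int.MaxValue, but fits in a uint

// uint u = negative;           // compile-time error: no implicit int -> uint conversion
uint u = (uint)negative;        // explicit cast; wraps to 4294967295 in the default unchecked context

// int i = big;                 // compile-time error: no implicit uint -> int conversion
int i = (int)big;               // explicit cast; wraps to a negative value

Console.WriteLine(u);           // 4294967295
Console.WriteLine(i);           // -1294967296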
I think they mean “capacity, with respect to bits”. If they had left out the “bitwise” part, then it could easily be interpreted as “this type holds twice as many values as the other type”, which is wrong: it holds much more than twice the number of values. It holds twice the number of bits, which increases the number of values exponentially.
It is the same thing. It just means that you have twice as many bits to represent your value, which means you can store much larger numbers. Numeric capacity is therefore tied to bitwise capacity, since more bits means a higher numeric capacity.
With a 64-bit data type you can represent your value using 64-bit binary numbers.
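As a quick illustration of the "exponentially more values" point above (just printing the framework's own constants):

using System;

Console.WriteLine(short.MaxValue); // 32767                 (16 bits)
Console.WriteLine(int.MaxValue);   // 2147483647            (32 bits)
Console.WriteLine(long.MaxValue);  // 9223372036854775807   (64 bits)
// Doubling the bit count squares the number of representable values (2^64 = 2^32 * 2^32),
// rather than merely doubling it.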
Related
I always come across code that uses int instead of uint for things like .Count, even in the framework classes.
What's the reason for this?
UInt32 is not CLS compliant so it might not be available in all languages that target the Common Language Specification. Int32 is CLS compliant and therefore is guaranteed to exist in all languages.
int, in C, is specifically defined to be the natural integer type of the processor, and is therefore held to be the fastest for general numeric operations.
Unsigned types only behave like whole numbers if the sum or product of a signed and unsigned value will be a signed type large enough to hold either operand, and if the difference between two unsigned values is a signed value large enough to hold any result. Thus, code which makes significant use of UInt32 will frequently need to compute values as Int64. Operations on signed integer types may fail to operate like whole numbers when the operands are overly large, but they'll behave sensibly when operands are small. Operations on unpromoted arguments of unsigned types pose problems even when operands are small.

Given UInt32 x, for example, the inequality x - 1 < x will fail for x == 0 if the result type is UInt32, and the inequality x <= 0 || x - 1 >= 0 will fail for large x values if the result type is Int32. Only if the operation is performed on type Int64 can both inequalities be upheld.
While it is sometimes useful to define unsigned-type behavior in ways that differ from whole-number arithmetic, values which represent things like counts should generally use types that will behave like whole numbers--something unsigned types generally don't do unless they're smaller than the basic integer type.
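A minimal sketch of the x - 1 < x point above (default unchecked arithmetic; the variable name is mine):

using System;

uint x = 0;

// The result type of x - 1 is uint, so the subtraction wraps instead of going negative:
Console.WriteLine(x - 1 < x);        // False: x - 1 is 4294967295, not -1

// Promoting to a wider signed type restores whole-number behaviour:
Console.WriteLine((long)x - 1 < x);  // True: -1 < 0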
UInt32 isn't CLS-Compliant. http://msdn.microsoft.com/en-us/library/system.uint32.aspx
I think that over the years people have come to the conclusions that using unsigned types doesn't really offer that much benefit. The better question is what would you gain by making Count a UInt32?
Some things use int so that they can return -1 as if it were "null" or something like that. For example, a ComboBox will return -1 for its SelectedIndex if it doesn't have any item selected.
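A small sketch of that sentinel pattern (assuming a Windows Forms project; the control name is mine):

using System;
using System.Windows.Forms;

var comboBox = new ComboBox();
comboBox.Items.Add("first item");

// Nothing has been selected yet, so SelectedIndex is the sentinel -1 rather than a valid index.
if (comboBox.SelectedIndex == -1)
{
    Console.WriteLine("No item is selected.");
}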
If the number is truly unsigned by its intrinsic nature then I would declare it an unsigned int. However, if I just happen to be using a number (for the time being) in the positive range then I would call it an int.
The main reasons being that:
It avoids having to do a lot of type-casting as most methods/functions are written to take an int and not an unsigned int.
It eliminates possible truncation warnings.
You invariably end up wishing you could assign a negative value to the number that you had originally thought would always be positive.
Those are just a few quick thoughts that came to mind.
I used to try and be very careful and choose the proper unsigned/signed type, and I finally realized that it doesn't really result in a positive benefit. It just creates extra work. So why make things hard by mixing and matching?
Some old libraries, and even InStr, use negative numbers to mean special cases. I believe it's either laziness or that negative values have special meaning.
The C# standard says:
Conversions from int, uint, long, or ulong to float and from long or
ulong to double may cause a loss of precision, but will never cause a
loss of magnitude
Can anyone explain to me what magnitude means here? And given a number, how can I calculate its magnitude (e.g., for a long or an int)?
For example:
var a = Int64.MaxValue; // a = 9223372036854775807L
var b = (float)a; // b = 9.223372037e+18
a and b have the same order of magnitude, they are very close to each other, but they are not equal...
Basically, long (Int64) can represent whole numbers exactly within a certain range. Floating point types sacrifice precision so they can represent numbers of a much larger range, and also fractions. So all the integral types in .NET will fit into the floating point types in the sense of range, but you may lose precision (digits after a certain decimal place might be incorrect). But the "scale" (order of magnitude) of the number and some of its more significant digits will be preserved.
https://en.wikipedia.org/wiki/Order_of_magnitude
It seems that it's unclear what the spec you quoted is referring to by "magnitude", and it probably should actually say "order of magnitude".
Since Dictionary.com defines "magnitude" as:
a number characteristic of a quantity and forming a basis for
comparison with similar quantities, as length.
and Wikipedia says:
The magnitude of any number is usually called its "absolute value" or
"modulus", denoted by |x|.
you may conclude that the spec is saying that it's the quantity or actual value represented by the variable. However, as others have duly pointed out here, that is not the case.
That fact is made obvious by running a simple test (as again others here have done):
long x = 8223372036854775807; // arbitrary long number
double y = x; // implicit conversion to double
long z = Convert.ToInt64(y); // convert back to int64 (a.k.a. long)
System.Diagnostics.Debug.Print(x.ToString());
System.Diagnostics.Debug.Print(z.ToString());
This produces the output:
8223372036854775807
8223372036854775808
So, from this, you can see that the specification, while vague and imprecise, does not mean the definition of "magnitude" as defined by the dictionary or Wikipedia, but more closely resembles the definition of "order of magnitude". Specifically:
Orders of magnitude are written in powers of 10
and
Orders of magnitude are used to make approximate comparisons.
and
Two numbers of the same order of magnitude have roughly the same scale.
Which comports with the C# spec in question and also with the results we've seen from tests.
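To answer the "how can I calculate its magnitude" part directly, a minimal sketch (my own helper, not something from the spec) that computes the order of magnitude of an integer as a power of 10:

using System;

static int OrderOfMagnitude(long value)
{
    if (value == 0) throw new ArgumentException("Zero has no order of magnitude.", nameof(value));
    // Floor of log10 of the absolute value; note this uses floating-point log,
    // so exact powers of 10 deserve a second look if you need them to be precise.
    return (int)Math.Floor(Math.Log10(Math.Abs((double)value)));
}

Console.WriteLine(OrderOfMagnitude(123456789L));    // 8
Console.WriteLine(OrderOfMagnitude(long.MaxValue)); // 18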
For example, why does long have a literal modifier, but short does not? I am referring to the following question on this site: C# compiler number literals
In general, C# seems to be a very well designed and consistent language. Probably there is a strong reason to provide literal modifiers for some types, but not for all. What is it?
Why does long have a literal modifier, but short does not?
The question is "why does C# not have this feature?" The answer to that question is always the same. Features are unimplemented by default; C# does not have that feature because no one designed, implemented and shipped the feature to customers.
The absence of a feature does not need justification. Rather, all features must be justified by showing that their benefits outweigh their costs. As the person proposing the feature, the onus is on you to describe why you think the feature is valuable; the onus is not on me to explain why it is not.
Probably there is a strong reason to provide literal modifiers for some types, but not for all. What is it?
Now that is a more answerable question. Now the question is "what justifies the literal suffix on long, and why is that not also a justification for a similar literal suffix on short?"
Integers can be used for a variety of purposes. You can use them as arithmetic numbers. You can use them as collections of bit flags. You can use them as indices into arrays. And there are lots of more special-purpose usages. But I think it is fair to say that most of the time, integers are used as arithmetical numbers.
The vast majority of calculations performed in integers by normal programs involve numbers that are far, far smaller than the range of a 32 bit signed integer -- roughly +/- two billion. And lots of modern hardware is extremely efficient when dealing solely with 32 bit integers. It therefore makes sense for the default representation of numbers to be signed 32 bit integers. C# is therefore designed to make calculations involving 32 bit signed integers look perfectly normal; when you say "x = x + 1" that "1" is understood to be a signed 32 bit integer, and odds are good that x is too, and the result of the sum is too.
What if the calculation is integral but does not fit into the range of a 32 bit integer? "long" 64 bit integers are a sensible next step up; they are also efficient on a lot of hardware and longs have a range that should satisfy the needs of pretty much anyone who isn't doing heavy-duty combinatorics that involve extremely large numbers. It therefore makes sense to have some way to specify clearly and concisely in source code that this literal here is to be treated as a long integer.
Interop scenarios, or scenarios in which integers are used as bit fields, often require the use of unsigned integers. Again, it makes sense to have a way to clearly and concisely specify that this literal is intended to be treated as an unsigned integer.
So, summing up, when you see "1", odds are good that the vast majority of the time the user intends it to be used as a 32 bit signed integer. The next most likely cases are that the user intends it to be a long integer or an unsigned int or unsigned long. Therefore there are concise suffixes for each of those cases.
Thus, the feature is justified.
Why is that not a justification for shorts?
Because first, in every context in which a short is legal, it is already legal to use an integer literal. "short x = 1;" is perfectly legal; the compiler realizes that the integer fits into a short and lets you use it.
Second, arithmetic is never done in shorts in C#. Arithmetic can be done in ints, uints, longs and ulongs, but arithmetic is never done in shorts. Shorts promote to int and the arithmetic is done in ints, because like I said before, the vast majority of arithmetic calculations fit into an int. The vast majority do not fit into a short. Short arithmetic is possibly slower on modern hardware which is optimized for ints, and short arithmetic does not take up any less space; it's going to be done in ints or longs on the chip.
You want a "long" suffix to tell the compiler "this arithmetic needs to be done in longs" but a "short" suffix doesn't tell the compiler "this arithmetic needs to be done in shorts" because that's simply not a feature of the C# language to begin with.
The reasons for providing a long suffix and an unsigned syntax do not apply to shorts. If you think there is a compelling benefit to the feature, state what the benefit is. Without a benefit to justify its costs, the feature will not be implemented in C#.
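To make the "arithmetic is never done in shorts" point concrete, a minimal sketch (the names are mine):

using System;

short a = 1;
short b = 2;

var sum = a + b;             // sum is an int: both operands are promoted before the addition
// short c = a + b;          // compile-time error: cannot implicitly convert int to short
short c = (short)(a + b);    // the result has to be cast back down explicitly

long big = 1L << 40;         // the L suffix exists because this arithmetic really is done in longs
Console.WriteLine($"{sum} {c} {big}");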
According to MSDN:
short x = 32767;
In the preceding declaration, the integer literal 32767 is implicitly
converted from int to short. If the integer literal does not fit into
a short storage location, a compilation error will occur.
So it is a compile-time feature. short does not have a suffix because it would never be needed.
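A quick sketch of that compile-time check (the commented-out line is the one that would not compile):

short fits = 32767;       // the int literal is implicitly converted because it fits in a short
// short tooBig = 32768;  // compile-time error: the constant cannot be converted to a short
System.Console.WriteLine(fits);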
The related question probably is: Why do long, float and decimal have suffixes?
And a short answer would be that i + 1 and i + 1L can produce different values and are therefore of different types.
But there exists no such thing as 'short arithmetic', short values are always converted to int when used in a calculation.
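A minimal sketch of the i + 1 versus i + 1L difference (default unchecked arithmetic):

using System;

int i = int.MaxValue;

Console.WriteLine(i + 1);   // -2147483648: the addition is done in int and wraps around
Console.WriteLine(i + 1L);  // 2147483648: the L suffix forces the addition to be done in long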
As Eric points out in the comment, my answer below doesn't make sense. I think it's more correct to say that the inability to express a short literal in C# and the inability to express a short literal in IL share a common cause (the lack of a compelling reason for the feature.) VB.Net apparently has a short literal specifier, which is interesting (for backwards compatibility with VB syntax?) In any case, I've left the answer here as some of the information may be interesting, even if the reasoning is incorrect.
There is no short literal because there is not actually any way for a short literal to be loaded in IL, the underlying language used by the CLR. This is because all 'short' types (anything smaller than an int) are implicitly widened to an int when loaded onto the operation stack. Signed and unsigned are likewise a matter of operations and not actually 'stored' with the active number on the operation stack. The 'short' types only come into play when you want to store a number on the operation stack into a memory location, so there are IL operations to Convert to various 'short' types (though it actually still widens the number back to an int after the conversion; it just makes sure that the value will be suitable for storing into a field of the 'short' type.)
Long types have a literal specifier, on the other hand, due to the fact that they are treated differently on the operation stack. There is a separate Ldc_I8 instruction for loading constant long values. There are also Ldc_R4 (hence why you need 'f' for float) and Ldc_R8 (C# chooses this as its default if you write a number with a decimal point and no specifier). Decimal is a special case, as it's not actually a primitive type in IL; it just has a built-in constant specifier 'm' in C# that compiles to a constructor call.
As for why there are no special short operations (and corresponding short literals), that's likely because most current CPU architectures do not operate with registers smaller than 32-bits, so there is no distinction at the CPU level worth exploiting. You could potentially save code size (in terms of bytes of IL) by allowing for 'short' load IL opcodes, but at the cost of additional complexity for the jitter; the code space saved is probably not worth it.
Since a short can be implicitly converted to int, long, float, double, or decimal, there's no need for a literal modifier.
Consider:
void method(int a) {}

void method2()
{
    short a = 4;
    method(a); // no problems
}
You may notice that char and byte are also without literal modifiers, possibly for this same reason.
From      To
sbyte     short, int, long, float, double, or decimal
byte      short, ushort, int, uint, long, ulong, float, double, or decimal
short     int, long, float, double, or decimal
ushort    int, uint, long, ulong, float, double, or decimal
int       long, float, double, or decimal
uint      long, ulong, float, double, or decimal
long      float, double, or decimal
char      ushort, int, uint, long, ulong, float, double, or decimal
float     double
ulong     float, double, or decimal
If you assign an integer literal larger than short.MaxValue to a short, a compiler error will occur; otherwise the literal will be implicitly converted to a short.
The time I have "worked in Short" was for values that are stored in a Database.
They are positive integer values that will rarely go over 10 to 20. (A byte or sbyte would be big enough, but I figured a little overkill would keep me from regretting my choice if the code got reused in a slightly different way.)
The field is used to let the user sort the records in a table. This table feeds a drop down or radio button list that is ordered by "time" (step one, step two, ...).
Being new to C# (and old enough to remember when counting bytes was important) I figured it would be a little more efficient. I don't do math on the values; I just sort them (and swap them between records). The only math so far has been "MaxInUse" + 1 (for new records), which is a special case, "++MaxInUse". That is just as well, because the lack of a short literal means "s = s + 2" would have to be "s = (Int16)(s + 2)".
Now that I see how annoying C# makes working with the other ints, I expect to join the modern world and waste bytes, JUST to make the compiler happy.
But, shouldn't "making the compiler happy" rank about #65 in our top 10 programming goals?
Is it EVER an advantage to have the compiler complain about adding the integer "2" to ANY of the INTEGER types? It should complain about "s=123456", but that's a different case.
If anyone does have to deal with math AND shorts, I suggest you make your own literals. (How many could you need?)
short s1 = 1, s2 = 2, s123 = 123;
Then s += s2 is only a little annoying (and confusing for those who follow after you); s = s + s2 would still need a cast, because the addition is done in int.
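A minimal sketch of why the compound form is the less painful option (the narrowing cast is effectively built into compound assignment):

short s = 10;
short s2 = 2;

// s = s + s2;          // compile-time error: s + s2 is an int
s = (short)(s + s2);    // works, but needs the explicit cast
s += s2;                // also works: compound assignment inserts the narrowing conversion for you
System.Console.WriteLine(s); // 14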
I was just flipping through the specification and noticed that byte is the odd one out. The others are short, ushort, int, uint, long, and ulong. Why this naming of sbyte and byte instead of byte and ubyte?
It's a matter of semantics. When you think of a byte you usually (at least I do) think of an 8-bit value from 0-255. So that's what byte is. The less common interpretation of the binary data is a signed value (sbyte) of -128 to 127.
With integers, it's more intuitive to think in terms of signed values, so that's what the basic name style represents. The u prefix then allows access to the less common unsigned semantics.
The reason a type "byte", without any other adjective, is often unsigned while a type "int", without any other adjective, is often signed, is that unsigned 8-bit values are often more practical (and thus widely used) than signed bytes, but signed integers of larger types are often more practical (and thus widely used) than unsigned integers of such types.
There is a common linguistic principle that, if a "thing" comes in two types, "usual" and "unusual", the term "thing" without an adjective means a "usual thing"; the term "unusual thing" is used to refer to the unusual type. Following that principle, since unsigned 8-bit quantities are more widely used than signed ones, the term "byte" without modifiers refers to the unsigned flavor. Conversely, since signed integers of larger sizes are more widely used than their unsigned equivalents, terms like "int" and "long" refer to the signed flavors.
As for the reason behind such usage patterns: if one is performing maths on numbers of a certain size, it generally won't matter, outside of comparisons, whether the numbers are signed or unsigned. There are times when it's convenient to regard them as signed (it's more natural, for example, to think in terms of adding -1 to a number than adding 65535), but for the most part, declaring numbers to be signed doesn't require any extra work for the compiler except when one is either performing comparisons or extending the numbers to a larger size. Indeed, if anything, signed integer math may be faster than unsigned integer math (since unsigned integer math is required to behave predictably in case of overflow, whereas signed math isn't).
By contrast, since 8-bit operands must be extended to type 'int' before performing any math upon them, the compiler must generate different code to handle signed and unsigned operands; in most cases, the signed operands will require more code than unsigned ones. Thus, in cases where it wouldn't matter whether an 8-bit value was signed or unsigned, it often makes more sense to use unsigned values. Further, numbers of larger types are often decomposed into a sequence of 8-bit values or reconstituted from such a sequence. Such operations are easier with 8-bit unsigned types than with 8-bit signed types. For these reasons, among others, unsigned 8-bit values are used much more commonly than signed 8-bit values.
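A minimal sketch of that decompose/reconstitute point, which is where the unsigned byte really pays off (the names are mine):

using System;

int value = 0x12345678;

// Splitting into unsigned bytes is just shifts and masks; there are no sign bits to worry about.
byte b0 = (byte)(value & 0xFF);
byte b1 = (byte)((value >> 8) & 0xFF);
byte b2 = (byte)((value >> 16) & 0xFF);
byte b3 = (byte)((value >> 24) & 0xFF);

// Reconstituting is equally direct, because each byte widens to a non-negative int.
int roundTrip = b0 | (b1 << 8) | (b2 << 16) | (b3 << 24);

Console.WriteLine($"{value:X8} -> {b3:X2} {b2:X2} {b1:X2} {b0:X2} -> {roundTrip:X8}");
// 12345678 -> 12 34 56 78 -> 12345678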
Note that in the C language, "char" is an odd case, since all characters within the C character set are required to translate as non-negative values (so machines which use an 8-bit char type with an EBCDIC character set are required to have "char" be unsigned), but an "int" is required to hold all values that a "char" can hold (so machines where both "char" and "int" are 16 bits are required to have "char" be signed).
Possible Duplicates:
Why does .NET use int instead of uint in certain classes?
Why is Array.Length an int, and not an uint
I've always wondered why .Count isn't an unsigned integer instead of a signed one.
For example, take ListView.SelectedItems.Count. The number of elements can't be less than 0, so why is it a signed int?
If I try to test if there are elements selected, I would like to test
if (ListView.SelectedItems.Count == 0) {}
but because it's a signed integer, I have to test
if (ListView.SelectedItems.Count <= 0) {}
Or is there any case when .Count could be < 0?
Unsigned integers are not CLS-compliant (Common Language Specification).
For more info on CLS compliant code, see this link:
http://msdn.microsoft.com/en-us/library/bhc3fa7f.aspx
Maybe because the uint data type is not part of the CLS (Common Language Specification), as not all .NET languages support it.
Here is a very similar thread about arrays:
Why is Array.Length an int, and not an uint
It's not CLS compliant, largely to allow wider support from different languages.
A signed int offers ease in porting code from C or C++ that uses pointer arithmetic.
Count can be part of an expression where the overall value can be negative. In particular, Count has a direct relationship to indices, where valid indices are always in the range [0, Count - 1], but negative results are used e.g. by some binary search methods (including those provided by the BCL) to reflect the position where a new item should be inserted to maintain order.
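For example, a minimal sketch with List<T>.BinarySearch, whose negative return value is the bitwise complement of the index at which the item would be inserted:

using System;
using System.Collections.Generic;

var sorted = new List<int> { 10, 20, 40 };

int found = sorted.BinarySearch(20);    // 1: the index of the match
int missing = sorted.BinarySearch(30);  // negative: 30 is not in the list

if (missing < 0)
{
    int insertAt = ~missing;            // 2: where 30 would go to keep the list sorted
    sorted.Insert(insertAt, 30);
}

Console.WriteLine(found);                        // 1
Console.WriteLine(string.Join(", ", sorted));    // 10, 20, 30, 40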
Let’s look at this from a practical angle.
For better or worse, signed ints are the normal sort of ints in use in .NET. It was also normal to use signed ints in C and C++. So, most variables are declared to be int rather than unsigned int unless there is a good reason otherwise.
Converting between an unsigned int and a signed int has issues and is not always safe.
On a 32-bit system it is not possible for a collection to have anywhere close to 2^32 items in it, so a signed int is big enough in all cases.
On a 64-bit system, an unsigned int does not gain you much; in most cases a signed int is still big enough, otherwise you need to use a 64-bit int. (I expect that none of the standard collections will cope well with anywhere near 2^31 items on a 64-bit system!)
Therefore given that using an unsigned int has no clear advantage, why would you use an unsigned int?
In VB.NET, the normal looping construct (a "For/Next loop") will execute the loop with values up to and including the maximum value specified, unlike C, which can easily loop with values below the upper limit. Thus, it is often necessary to specify a loop as e.g. "For I = 0 To Array.Length - 1"; if Array.Length were unsigned and zero, that could cause an obvious problem. Even in C, one benefits from being able to say "for (i = Array.Length - 1; i >= 0; --i)". Sometimes I think it would be useful to have a 31-bit integer type which would support widening casts to both signed and unsigned int, but I've never heard of a language supporting such.
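A minimal C# sketch of that pitfall (a hypothetical unsigned length; the real Array.Length is an int, which is exactly the point):

using System;

uint length = 0;                  // imagine an empty collection with an unsigned Length

Console.WriteLine(length - 1);    // 4294967295: the subtraction wraps instead of giving -1
// for (uint i = length - 1; i >= 0; --i) { }   // i >= 0 is always true for a uint, so this loop misbehaves

// With a signed length, the idiomatic reverse loop just works:
int signedLength = 0;
for (int i = signedLength - 1; i >= 0; --i)
{
    Console.WriteLine(i);         // never runs for an empty collection, as expected
}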