Or is it always guaranteed to be positive for all possible Chars?
It's guaranteed to be non-negative.
char is an unsigned 16-bit value.
From section 4.1.5 of the C# 4 spec:
The char type represents unsigned 16-bit integers with values between 0 and 65535. The set of possible values for the char type corresponds to the Unicode character set. Although char has the same representation as ushort, not all operations permitted on one type are permitted on the other.
Since the range of char is U+0000 to U+ffff, then a cast to an Int32 will always be positive.
Each 16-bit value ranges from hexadecimal 0x0000 through 0xFFFF and is
stored in a Char structure.
Char Structure - MSDN
See Microsoft's documentation
There you can see that Char is a 16-bit value in the range U+0000 to U+FFFF. If you cast it to an Int32, there can be no negative value.
char can be implicitly converted to ushort, and the ushort range is 0 to 65,535, so it's always positive.
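As a quick sanity check, here is a small sketch of my own (assuming a plain console app with using System;) that walks the entire char range; it never prints anything, because the cast result is always between 0 and 65535:
for (int v = char.MinValue; v <= char.MaxValue; v++)
{
    char c = (char)v;
    // (int)c can never be negative: char is an unsigned 16-bit value
    if ((int)c < 0)
        Console.WriteLine("negative value found: " + v);
}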
As I remember, we learned that for signed integer types (sbyte, short, int, long)
the first bit is for the sign and the remaining bits are for the value (7 of them in the case of sbyte).
I saw that sbyte range is -128 to 127 while I thought it must be -127 to 127.
I tried some code to understand how this is possible and I faced two strange things:
1- I tried the following code:
sbyte a = -128;
Console.Write(Convert.ToString(a, 2));
and the result was
1111111100000000
As if it's a two-byte variable.
2- I tried converting all numbers in the range to binary:
for(sbyte i=-128;i<=127;i++)
{
Console.WriteLine(Convert.ToString(i, 2));
if(i==127) break;
}
If I omit the if(i==127) break;, the loop goes on forever. And with the break, the code in the loop does not execute, somehow as if -128 were greater than 127.
My Conclusion:
As I thought, -128 must not fit in an 8-bit sbyte variable, and the first and second tries seem to confirm that (1111111100000000 > 01111111).
But if it does not fit, then why is the range -128 to 127?
I saw that sbyte range is -128 to 127 while I thought it must be -127 to 127.
The range is [-128, +127] indeed. The range [-127, +127] would mean that sbyte can represent only 255 different values, while 8 bits make 256 combinations.
And by the way, if -128 were not a legal value, the compiler would complain about
sbyte a = -128;
There is no overload Convert.ToString(sbyte value, int toBase); overload resolution picks Convert.ToString(short value, int toBase), so the sbyte is promoted to a 16-bit short, which is why you see 16 binary digits.
To quickly check the value, you can print sbyte as a hexadecimal number:
sbyte s = -1;
Console.WriteLine("{0:X}", s); // FF, i.e. 11111111
If I omit the if(i==127) break; the loop goes on.
Sure, sbyte.MaxValue is 127, so i<=127 is always true. When i is 127 and gets incremented, an overflow occurs and the next value is -128.
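As a sketch of my own (not from the original post), here is one way to loop over the whole sbyte range without relying on the break: use a wider loop variable so the condition can actually become false, and cast to byte when printing so Convert.ToString shows exactly the 8 bits of the value (there is no sbyte overload of Convert.ToString(value, toBase)):
for (int i = sbyte.MinValue; i <= sbyte.MaxValue; i++)
{
    sbyte s = (sbyte)i;
    // Cast to byte to print the raw 8-bit pattern, padded to 8 digits
    Console.WriteLine(Convert.ToString((byte)s, 2).PadLeft(8, '0'));
}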
First, this question has related posts:
Why Int32 maximum value is 0x7FFFFFFF?
However, I want to know why the hexadecimal value is always treated as an unsigned quantity.
See the following snippet:
byte a = 0xFF; //No error (byte is an unsigned type).
short b = 0xFFFF; //Error! (even though both types are 16 bits).
int c = 0xFFFFFFFF; //Error! (even though both types are 32 bits).
long d = 0xFFFFFFFFFFFFFFFF; //Error! (even though both types are 64 bits).
The reason for the error is that the hexadecimal values are always treated as unsigned values, regardless of what data-type they are stored as. Hence, the value is 'too large' for the data-type described.
For instance, I expected:
int c = 0xFFFFFFFF;
To store the value:
-1
And not the value:
4294967295
Simply because int is a signed type.
So, why is it that the hexadecimal values are always treated as unsigned even if the sign type can be inferred by the data-type used to store them?
How can I store these bits into these data-types without resorting to the use of ushort, uint, and ulong?
In particular, how can I achieve this for the long data-type considering I cannot use a larger signed data-type?
What's going on is that a literal is intrinsically typed. 0.1 is a double, which is why you can't say float f = 0.1. You can cast a double to a float (float f = (float)0.1), but you may lose precision. Similarly, the literal 0xFFFFFFFF is intrinsically a uint. You can cast it to an int, but that's after it has been interpreted by the compiler as a uint. The compiler doesn't use the variable to which you are assigning it to figure out its type; its type is defined by what sort of literal it is.
They are treated as unsigned numbers, as that is what the language specification says to do.
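If you want to store those exact bit patterns in the signed types anyway, a minimal sketch is to reinterpret the unsigned literal with an unchecked cast (this just suppresses the compile-time overflow check for the constant):
short b = unchecked((short)0xFFFF);            // -1
int   c = unchecked((int)0xFFFFFFFF);          // -1
long  d = unchecked((long)0xFFFFFFFFFFFFFFFF); // -1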
I have a character '¿'. If I cast it to an integer in C, the result is -62, but the same cast in C# gives 191. Can someone explain the reason?
C Code
char c = '¿';
int I = (int)c;
Result I = -62
C# Code
char c = '¿';
int I = (int)c;
Result I = 191
This is how signed/unsigned numbers are represented and converted.
It looks like your C compiler's default in this case is to use a signed byte as the underlying type for char (since you are not explicitly specifying unsigned char, the compiler's default is used; see Why is 'char' signed by default in C++?).
So 191 (0xBF) as a signed byte means a negative number (the most significant bit is 1): -65.
If you used unsigned char, the value would stay positive as you expect.
If your compiler used a wider type for char (e.g. short), 191 would stay positive irrespective of whether char is signed.
In C#, char is always unsigned; see MSDN char:
Type: char
Range: U+0000 to U+FFFF
So 191 will always convert to int as you expect.
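A small C# sketch of my own showing both readings of the same character (the sbyte line is what a C compiler with a signed 8-bit char effectively produces):
char c = '¿';                            // U+00BF
Console.WriteLine((int)c);               // 191 -- char is an unsigned 16-bit value
Console.WriteLine(unchecked((sbyte)c));  // -65 -- the low byte reinterpreted as a signed byte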
I have a very simple problem that is giving me a really big headache: I am porting a bit of code from C++ to C#, and for a very simple operation I am getting totally different results:
C++
char OutBuff[25];
int i = 0;
unsigned int SumCheck = 46840;
OutBuff[i++] = SumCheck & 0xFF; //these 2 ANDED make 248
The value written to the char array is -8
C#
char[] OutBuff = new char[25];
int i = 0;
uint SumCheck = 46840;
OutBuff[i++] = (char)(SumCheck & 0xFF); //these 2 ANDED also make 248
The value written to the char array is 248.
Interestingly they are both the same character, so this may have something to do with the format of a char array in C++ and C# - but ultimately I would be grateful if someone could give me a definitive answer.
Thanks in advance for any help.
David
It's overflow in C++, and no overflow in C#.
In C#, char is two bytes. In C++, char is one byte!
So in C#, there is no overflow, and the value is retained. In C++, there is integral overflow.
Change the data type from char to uint16_t or unsigned char (in C++) and you will see the same result. Note that unsigned char can have a value of 248 without overflow. It can have values up to 255, in fact.
Maybe you should be using byte or sbyte instead of char. (char is only for storing text characters, and the actual binary serialization for char is not the same as in C++. char lets us store characters without worrying about character byte width.)
A C# char is actually 16 bits, while a C++ char is usually 8 bits (a char is exactly 8 bits on Visual C++). So you're actually overflowing the integer in the C++ code, but the C# code does not overflow, since it holds more bits, and therefore has a bigger integer range.
Notice that 248 is outside the range of a signed char (-128 to 127). That should give you a hint that C#'s char might be bigger than 8 bits.
You probably meant to use C#'s sbyte (the closest equivalent to Visual C++'s char) if you want to preserve the behavior, although you may want to recheck the code since there's an overflow occurring in the C++ implementation.
As everyone has stated, in C# a char is 16 bits while in C++ it is usually 8 bits.
-8 and 248 in binary both (essentially) look like this:
11111000
Because a char in C++ is usually 8 bits (which is in fact your case), the result is -8. In C#, the value looks like this:
00000000 11111000
Which is 16 bits and becomes 248.
The two's complement representation of -8 is the same as the binary representation of 248 (unsigned).
So the binary representation is the same in both cases. The C++ result is interpreted as an int8, and in C# it's simply interpreted as a positive integer (int is 32-bit, and truncating to 16 bits by casting to char doesn't affect the sign in this case).
The difference between -8 and 248 is all in how you interpret the data. They are stored exactly the same (0xF8). In C++, char defaults to signed on most common compilers (including Visual C++), so 0xF8 = -8. If you change the data type to unsigned char, it will be interpreted as 248. VS also has a compiler option to make char default to unsigned.
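To make the two interpretations concrete, here is a C# sketch of my own (sumCheck mirrors the variable in the question; the other names are just for illustration):
uint sumCheck = 46840;
byte  asByte  = (byte)(sumCheck & 0xFF);             // 248 -- the raw bit pattern 0xF8
sbyte asSByte = unchecked((sbyte)(sumCheck & 0xFF)); // -8  -- the same bits read as signed, like C++'s signed char
char  asChar  = (char)(sumCheck & 0xFF);             // 248 -- char is 16-bit and unsigned, so nothing overflows
Console.WriteLine((int)asByte);
Console.WriteLine(asSByte);
Console.WriteLine((int)asChar);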
Motivation:
I would like to convert hashes (MD5/SHA1 etc) into decimal integers for the purpose of making barcodes in Code128C.
For simplicity, I prefer all the resulting (large) numbers to be positive.
I am able to convert byte[] to BigInteger in C#...
Sample from what I have so far:
byte[] data;
byte[] result;
BigInteger biResult;
result = shaM.ComputeHash(data);
biResult = new BigInteger(result);
But (rusty CS here) am I correct that a byte array can always be interpreted in two ways:
(A): as a signed number
(B): as an unsigned number
Is it possible to make an UNSIGNED BigInteger from a byte[] in C#?
Should I simply prepend a 0x00 (zero byte) to the front of the byte[]?
EDIT:
Thank you to AakashM, Jon and Adam Robinson, appending a zero byte achieved what I needed.
EDIT2:
The main thing I should have done was to read the detailed doc of the BigInteger(byte[]) constructor, then I would have seen the sections about how to restrict to positive numbers by appending the zero byte.
The remarks for the BigInteger constructor state that you can make sure any BigInteger created from a byte[] is unsigned if you append a 00 byte to the end of the array before calling the constructor.
Note: the BigInteger constructor expects the array to be in little-endian order. Keep that in mind if you expect the resulting BigInteger to have a particular value.
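A tiny illustration of that point (my own example; requires System.Numerics):
var a = new BigInteger(new byte[] { 0x01, 0x00 }); // 1   -- the first byte is the LOW-order byte
var b = new BigInteger(new byte[] { 0x00, 0x01 }); // 256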
Since .NET Core 2.1, BigInteger has a constructor with an optional parameter isUnsigned:
public BigInteger (ReadOnlySpan<byte> value, bool isUnsigned = false, bool isBigEndian = false);
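Applied to the hash scenario from the question, a sketch reusing the shaM/data names from the sample above (requires .NET Core 2.1 or later):
byte[] hash = shaM.ComputeHash(data);
// isUnsigned: true treats the bytes as an unsigned magnitude, so the result is never negative
BigInteger positive = new BigInteger(hash, isUnsigned: true);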
Examining the documentation for the relevant BigInteger constructor, we see:
The individual bytes in the value array should be in little-endian order, from lowest-order byte to highest-order byte.
[...]
The constructor expects positive values in the byte array to use sign-and-magnitude representation, and negative values to use two's complement representation. In other words, if the highest-order bit of the highest-order byte in value is set, the resulting BigInteger value is negative. Depending on the source of the byte array, this may cause a positive value to be misinterpreted as a negative value.
[...]
To prevent positive values from being misinterpreted as negative values, you can add a zero-byte value to the end of the array.
As other answers have pointed out, you should append a 00 byte to the end of the array to ensure the resulting BigInteger is positive.
According to the BigInteger Structure (System.Numerics) MSDN documentation:
To prevent the BigInteger(Byte[]) constructor from confusing the two's complement representation of a negative value with the sign and magnitude representation of a positive value, positive values in which the most significant bit of the last byte in the byte array would ordinarily be set should include an additional byte whose value is 0.
Here's code to do it:
byte[] byteArray;
// ...
var bigInteger = new BigInteger(byteArray.Concat(new byte[] { 0 }).ToArray());
But (rusty CS here) am I correct that a byte array can always be interpreted in two ways: A: as a signed number B: as an unsigned number
What's more correct is that all numbers (by virtue of being stored in the computer) are basically a series of bytes, which is what a byte array is. It's not true to say that a byte array can always be interpreted as a signed or unsigned version of a particular numeric type, as not all numeric types have signed and unsigned versions. Floating point types generally only have signed versions (there's no udouble or ufloat), and, in this particular instance, there is no unsigned version of BigInteger.
So, in other words, no, it's not possible, but since BigInteger can represent an arbitrarily large integer value, you're not losing any range by virtue of its being signed.
As to your second question, you would need to append 0x00 to the end of the array, as the BigInteger constructor parses the values in little-endian byte order.