How can I simulate a C++ union in C#?

I have a small question about structs with the LayoutKind.Explicit attribute set. I declared the struct as you can see below, with fieldTotal being 64 bits, fieldFirst the first 32 bits and fieldSecond the last 32 bits. After setting both fieldFirst and fieldSecond to Int32.MaxValue, I'd expect fieldTotal to be Int64.MaxValue, which doesn't actually happen. Why is this? I know C# does not really support C++ unions; maybe it only reads the values correctly when interoping, but when we try to set the values ourselves it simply won't handle them very well?
[StructLayout(LayoutKind.Explicit)]
struct STRUCT {
    [FieldOffset(0)]
    public Int64 fieldTotal;
    [FieldOffset(0)]
    public Int32 fieldFirst;
    [FieldOffset(32)]
    public Int32 fieldSecond;
}
STRUCT str = new STRUCT();
str.fieldFirst = Int32.MaxValue;
str.fieldSecond = Int32.MaxValue;
Console.WriteLine(str.fieldTotal); // <----- I'd expect both these values
Console.WriteLine(Int64.MaxValue); // <----- to be the same.
Console.ReadKey();

The reason is that FieldOffsetAttribute takes a number of bytes as its parameter, not a number of bits. This works as expected:
[StructLayout(LayoutKind.Explicit)]
struct STRUCT
{
    [FieldOffset(0)]
    public Int64 fieldTotal;
    [FieldOffset(0)]
    public Int32 fieldFirst;
    [FieldOffset(4)]
    public Int32 fieldSecond;
}

Looking at the hex values of Int32.MaxValue and Int64.MaxValue should provide the answer.
The key is the most significant bit. In two's complement, the most significant bit is set only for negative numbers, so the max value of Int32 is a 0 bit followed by a whole series of 1s. The order is unimportant; the point is that there is at least one 0 bit. The same is true of Int64.MaxValue.
Now consider how a union works. It essentially lays the bits of the values out next to one another. So now you have a set of 64 bits which contains two 0 bits, one from each Int32.MaxValue instance. That can never be equal to Int64.MaxValue, which contains only a single 0 bit.
Oddly enough, once the offsets are corrected, you can get the value you were expecting by setting the low half (fieldFirst) to -1, so that all 32 of its bits are 1s, while leaving the high half (fieldSecond) at Int32.MaxValue; together that is exactly the bit pattern of Int64.MaxValue.
EDIT Missed that you need to make it FieldOffset(4) as well.
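Putting the pieces together, a minimal sketch of the corrected layout (the names Union64/Total/Low/High are purely illustrative; offsets are in bytes, and the comments assume a little-endian platform such as x86/x64, where offset 0 holds the low 32 bits):

// requires: using System; using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Explicit)]
struct Union64
{
    [FieldOffset(0)] public long Total; // all 64 bits
    [FieldOffset(0)] public int Low;    // low 32 bits on little-endian
    [FieldOffset(4)] public int High;   // high 32 bits on little-endian
}

var u = new Union64();
u.Low = int.MaxValue;                        // 0x7FFFFFFF
u.High = int.MaxValue;                       // 0x7FFFFFFF
Console.WriteLine(u.Total.ToString("X"));    // 7FFFFFFF7FFFFFFF - not Int64.MaxValue

u.Low = -1;                                  // 0xFFFFFFFF: every low bit set
u.High = int.MaxValue;                       // 0x7FFFFFFF
Console.WriteLine(u.Total == long.MaxValue); // True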

Ben M provided one of the more important elements: your definition is not set up correctly.
That being said, this won't work - even in C++ with a union. The values you specified won't be (and shouldn't be) the same values, since you're using signed (not unsigned) ints. With a signed int (Int32), you're going to have a 0 bit followed by 1 bits. When you do the union, you'll end up with a 0 bit, followed by a bunch of 1 bits, then another 0 bit, then a bunch of 1 bits... The second 0 bit is what's messing you up.
If you used UInt32/UInt64, this would work properly, since the extra sign bit doesn't exist.
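A sketch of that unsigned variant (same illustrative naming and little-endian caveat as the example above):

// requires: using System; using System.Runtime.InteropServices;
[StructLayout(LayoutKind.Explicit)]
struct UnsignedUnion64
{
    [FieldOffset(0)] public ulong Total;
    [FieldOffset(0)] public uint Low;
    [FieldOffset(4)] public uint High;
}

var u = new UnsignedUnion64();
u.Low = uint.MaxValue;                        // 0xFFFFFFFF
u.High = uint.MaxValue;                       // 0xFFFFFFFF
Console.WriteLine(u.Total == ulong.MaxValue); // True: no sign bits to get in the way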

Related

Why is BigInteger in C# a struct if it has an unbounded size?

Why is BigInteger declared as a value type (struct) in C#? It seems to be very similar to the string type, which is declared as a reference type.
Both are immutable (value semantics). Both can be arbitrarily large.
The recommendation I have heard is that a struct should never be more than 16 bytes. BigInteger can get much larger than 16 bytes, and I would think this would make frequent operations extremely slow since it is always copied by value.
Copying a BigInteger does not cause the underlying data to be copied. Instead, just a reference to the data is copied.
Since BigInteger values are immutable it is safe for two or more values to share a common data buffer.
BigInteger has two instance fields:
int _sign - probably tells whether it's a positive or negative value.
uint[] _bits - this is a reference to the data buffer.
An int is 4 bytes and a reference is 8 bytes (on a 64-bit system). Therefore the size of a BigInteger is ≤ 16 bytes.
If you look at the source for BigInteger and strip it down to only the instance-level fields (the things that count toward its size), all the struct has is:
public struct BigInteger : IFormattable, IComparable, IComparable<BigInteger>, IEquatable<BigInteger>
{
    internal int _sign;
    internal uint[] _bits;
}
So you have 4 bytes for _sign and 4 or 8 bytes for the uint[] reference, depending on whether you are on a 32-bit or 64-bit system, since arrays are reference types. This gives you a total of 8 or 12 bytes, well below the 16-byte recommendation. (Note: the CLR will pad the 12-byte version to 16 to make it a multiple of 8, for optimization reasons.)
When a new BigInteger is created the _bits array will be shared between the two instances. Because the type is immutable (you can't change the value of any cell of _bits) it is safe for the two copies to share the array.
Here are the fields of a BigInteger:
// For values int.MinValue < n <= int.MaxValue, the value is stored in sign
// and _bits is null. For all other values, sign is +1 or -1 and the bits are in _bits
internal int _sign;
internal uint[] _bits;
So, one int and one uint[], which is a reference type. The type itself can't grow arbitrarily large. It'll be 8 bytes on x86 and 16 bytes on x64 (12 bytes for the fields + 4 bytes of padding).
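If you want to check the managed size of the struct yourself, one option is Unsafe.SizeOf - a quick sketch (built into .NET Core/.NET; on .NET Framework it may require the System.Runtime.CompilerServices.Unsafe package):

using System;
using System.Numerics;
using System.Runtime.CompilerServices;

class SizeCheck
{
    static void Main()
    {
        // Size of the struct itself, not of the array it references.
        Console.WriteLine(Unsafe.SizeOf<BigInteger>()); // typically 16 on a 64-bit runtime, 8 on 32-bit
    }
}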
string and arrays are the only types in the framework which have a varying size and are special-cased in the runtime.
To answer the question: there is less overhead in using a struct. Having a class wrapper over two fields would cause more indirection and more GC pressure for no good reason. Besides, a BigInteger is semantically a value.
The size of a struct matters only because the entire struct has to be copied each time you pass it around from one function to another. If it was not for the copying, nobody would care.
However, BigInteger consists of two parts:
The actual struct, which is the part that gets copied when you pass a BigInteger around, and is fairly small, and
The array of bits, which is of arbitrary length, but which is not copied each time the struct is copied.
So, when you pass a BigInteger, this is what happens:
Before copying:

[BigInteger instance 1] ---------> [array of bits]

After copying:

[BigInteger instance 1] ---------> [array of bits]
                                          ^
[BigInteger instance 2] ------------------+
Notice how there is always just one array of bits.
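The sharing can be illustrated with a hand-rolled stand-in for BigInteger (FakeBigInteger below is purely illustrative, not the real type):

struct FakeBigInteger
{
    public int Sign;
    public uint[] Bits; // a reference to a buffer, like BigInteger's _bits
}

var a = new FakeBigInteger { Sign = 1, Bits = new uint[] { 1, 2, 3 } };
var b = a;                                          // copies Sign and the reference, not the array contents
Console.WriteLine(ReferenceEquals(a.Bits, b.Bits)); // True: one array, two structs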

What is the limit of the Value Type BigInteger in C#?

As described on MSDN, BigInteger is:
An immutable type that represents an arbitrarily large integer whose
value in theory has no upper or lower bounds.
As far as I can see, BigInteger is a value type, and as far as I know a value type must have a maximum size of 16 bytes.
MSDN goes further, saying:
an OutOfMemoryException can be thrown for any operation that causes a
BigInteger value to grow too large.
and more:
Although this process is transparent to the caller, it does incur a
performance penalty. In some cases, especially when repeated
operations are performed in a loop on very large BigInteger values
How could it store such big values, as big as double.MaxValue + double.MaxValue?
I was told that it has reference type objects inside it, but all I can find in its definition in Visual Studio is value types.
What is its real limit? And even if it doesn't have one, how can it, as a value type, manage to store all that data?
As far as I can see, BigInteger is a value type, and as far as I know a value type must have a maximum size of 16 bytes.
No, that's not true. It's a conventional limit, but it's entirely feasible for a value type to take more than that. For example:
public struct Foo {
    private readonly int a, b, c, d, e; // Look ma, 20 bytes!
}
However, I strongly suspect that BigInteger actually includes a reference to a byte array:
public struct BigInteger {
    private readonly byte[] data;
    // Some other fields...
}
(Moslem Ben Dhaou's answer shows one current implementation using int and uint[], but of course the details of this are intentionally hidden.)
So the value of a BigInteger can still be small, but it can refer to a big chunk of memory - and if there isn't enough memory to allocate what's required when you perform some operation, you'll get an exception.
How could it store such big values, as big as double.MaxValue + double.MaxValue ?
Well BigInteger is for integers, so I wouldn't particularly want to use it for anything to do with double... but fundamentally the limitations are going to be around how much memory you've got and the size of array the CLR can cope with. In reality, you'd be talking about enormous numbers before actually hitting the limit for any specific number - but if you have gazillions of smaller numbers, that obviously has large memory requirements too.
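For what it's worth, the question's double.MaxValue + double.MaxValue example is easy to try, since BigInteger defines an explicit conversion from double (a quick sketch):

using System;
using System.Numerics;

class Demo
{
    static void Main()
    {
        BigInteger big = (BigInteger)double.MaxValue + (BigInteger)double.MaxValue;
        Console.WriteLine(big.ToString().Length); // around 309 decimal digits, no overflow
    }
}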
As confirmation of the answer from Jon Skeet, I looked at the source code of BigInteger. It actually contains two internal fields, as follows:
internal int _sign;
internal uint[] _bits;
_bits is used by almost all private/public methods within the class which are used to read/write the actual data.
_sign is used to keep the sign of the BigInteger.
The private methods make extensive use of binary operators and calculations. Here is a small list of constants used in the struct that may hint at the limits:
private const int knMaskHighBit = -2147483648;
private const uint kuMaskHighBit = 2147483648U;
private const int kcbitUint = 32;
private const int kcbitUlong = 64;
private const int DecimalScaleFactorMask = 16711680;
private const int DecimalSignMask = -2147483648;
PS: I should have commented on Jon Skeet's answer, but a comment is too short. To view the source code, either download it or decompile System.Numerics.dll.
TL;DR: BigInteger's max value is 2^68,685,922,272.
In .NET 4.7.2, BigInteger uses a uint array for its bits.
A uint holds 32 bits of data.
An array's max size is defined as internal const int MaxArrayLength = 0X7FEFFFFF;
0x7FEFFFFF = 2,146,435,071
Now, to calculate: the max size of the array times the capacity of each uint is 2,146,435,071 x 32 = 68,685,922,272. But that's only the count of the bits in a BigInteger.
Which means BigInteger's max value is 2^68,685,922,272, which is stupendously large.
If they ever decide to increase the array's max size, then it will also increase the max value for BigInteger.
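The arithmetic above, as a quick check (MaxArrayLength here simply mirrors the internal constant quoted above):

const long MaxArrayLength = 0x7FEFFFFF;          // 2,146,435,071 uint elements
const long BitsPerUInt = 32;
Console.WriteLine(MaxArrayLength * BitsPerUInt); // 68685922272 bits, so the max value is about 2^68,685,922,272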
I just did some quick experiments on this. The max seems to be around 2^65,000,000,000, but the practical limit is more like 2^2,146,435,071.
I get a System.OverflowException on the code below at i = 0x1F. It overflowed between E FFFF FFE2 and F 7FFF FFE1 (or somewhere between 2^64,424,509,410 and 2^66,571,993,057).
// Test 1: repeatedly shift left by the largest allowed amount
BigInteger test = 1;
for (int i = 0x00; i < 0xFF; i++)
    test <<= 0x7FFFFFFF;

// Test 2: Pow with exponents near the array-size limit
BigInteger.Pow((BigInteger)2, 0x7FEFFFF0); // OK - I think - never finished
BigInteger.Pow((BigInteger)2, 0x7FEFFFFF); // Immediate OutOfMemoryException
I should also note that while something around 2^66,571,993,057 seems to be supported, the useful limit is more like 2^2,146,435,071, because Pow and shifts don't seem to work with an exponent larger than 2,146,435,071 (for Pow()) or a shift amount of more than 2,147,483,647. Larger shifts can be done, but they take several rounds, ruining efficiency. And everything is slow at those sizes - a single shift was taking about 7 seconds, and BigInteger.Pow() took at least 5 minutes.
.Net 5, AMD Threadripper, 32GB RAM, Windows 10 x64

Is there an easy way to convert from 32bit integer to 16bit integer?

I have a 32-bit int and I want to address only the lower half of this variable. I know I can convert to a bit array and then to an Int16, but is there a more straightforward way to do it?
If you want only the lower half, you can just cast it: (Int16)my32BitInt
In general, if you're extending/truncating bit patterns like this, then you do need to be careful about signed types - unsigned types may cause fewer surprises.
As mentioned in the comments - if you've enclosed your code in a 'checked' context, or changed your compiler options so that the default is 'checked', then you can't truncate a number like this without an exception being thrown if there are any non-zero bits being discarded - in that situation you'd need to do:
(UInt16)(my32BitInt & 0xffff)
(The option of using signed types is gone in this case, because you'd have to use & 0x7fff which then preserves only 15 bits)
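A short sketch of both situations (the value 0x1F000 is arbitrary; the compiler default is an unchecked context):

int big = 0x1F000;                 // 126976: does not fit in 16 bits

short s = unchecked((short)big);   // low 16 bits are 0xF000, reinterpreted as -4096
ushort u = (ushort)(big & 0xFFFF); // low 16 bits kept as 61440; safe even in a checked context

// In a checked context, a plain cast throws instead of truncating:
// short t = checked((short)big);  // OverflowException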
Just use this function:
Convert.ToInt16()
or simply cast:
(Int16)valueAsInt
You can use an explicit conversion to Int16, like:
(Int16)2;
but be careful when you do that, because Int16 can't hold all possible Int32 values.
For example, this won't work:
(Int16)2147483683;
because Int16 can hold 32767 as its maximum value. You can use the unchecked (C# Reference) keyword in such cases.
If you force an unchecked operation, a cast should work:
int r = 0xF000001;
short trimmed = unchecked((short) r);
This will truncate the value of r to fit in a short.
If the value of r should always fit in a short, you can just do a normal cast and let an exception be thrown.
If you need a 16-bit value and you happen to know something specific, such as that the number will never be less than zero, you could use a UInt16 value. That conversion looks like:
int x = 0;
UInt16 value = (UInt16)x;
This uses all 16 bits for the positive range (0 to 65535).
Well, first, make sure you actually want to have the value signed. uint and ushort are there for a reason. Then:
ushort ret = (ushort)(val & ((1 << 16) - 1));

What is the difference between int, Int16, Int32 and Int64?

What is the difference between int, System.Int16, System.Int32 and System.Int64 other than their sizes?
Each type of integer has a different range of storage capacity
Type Capacity
Int16 -- (-32,768 to +32,767)
Int32 -- (-2,147,483,648 to +2,147,483,647)
Int64 -- (-9,223,372,036,854,775,808 to +9,223,372,036,854,775,807)
As stated by James Sutherland in his answer:
int and Int32 are indeed synonymous; int will be a little more
familiar looking, Int32 makes the 32-bitness more explicit to those
reading your code. I would be inclined to use int where I just need
'an integer', Int32 where the size is important (cryptographic code,
structures) so future maintainers will know it's safe to enlarge an
int if appropriate, but should take care changing Int32 variables
in the same way.
The resulting code will be identical: the difference is purely one of
readability or code appearance.
The only real difference here is the size. All of the int types here are signed integer values which have varying sizes
Int16: 2 bytes
Int32 and int: 4 bytes
Int64 : 8 bytes
There is one small difference between Int64 and the rest: on a 32-bit platform, assignments to an Int64 storage location are not guaranteed to be atomic. Atomicity is guaranteed for all of the other types.
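If you do need an atomic 64-bit read or update on a 32-bit platform, the Interlocked class is the usual tool. A minimal sketch (the Counter type is just illustrative):

using System;
using System.Threading;

class Counter
{
    private long _value; // a direct read could be torn on 32-bit under concurrent writes

    public void Add(long amount) => Interlocked.Add(ref _value, amount);
    public long Read() => Interlocked.Read(ref _value); // atomic even on 32-bit platforms
}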
int
It is a primitive data type defined in C#.
It maps to the FCL type System.Int32.
It is a value type and represents the System.Int32 struct.
It is signed and takes 32 bits.
It has a minimum value of -2,147,483,648 and a maximum of +2,147,483,647.
Int16
It is an FCL type.
In C#, short maps to Int16.
It is a value type and represents the System.Int16 struct.
It is signed and takes 16 bits.
It has a minimum value of -32,768 and a maximum of +32,767.
Int32
It is an FCL type.
In C#, int maps to Int32.
It is a value type and represents the System.Int32 struct.
It is signed and takes 32 bits.
It has a minimum value of -2,147,483,648 and a maximum of +2,147,483,647.
Int64
It is an FCL type.
In C#, long maps to Int64.
It is a value type and represents the System.Int64 struct.
It is signed and takes 64 bits.
It has a minimum value of -9,223,372,036,854,775,808 and a maximum of +9,223,372,036,854,775,807.
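The ranges listed above come straight from the MinValue/MaxValue constants on each type:

Console.WriteLine($"Int16: {short.MinValue} .. {short.MaxValue}"); // -32768 .. 32767
Console.WriteLine($"Int32: {int.MinValue} .. {int.MaxValue}");     // -2147483648 .. 2147483647
Console.WriteLine($"Int64: {long.MinValue} .. {long.MaxValue}");   // -9223372036854775808 .. 9223372036854775807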
According to Jeffrey Richter (one of the contributors to .NET framework development) in his book 'CLR via C#':
int is a primitive type allowed by the C# compiler, whereas Int32 is the Framework Class Library type (available across languages that abide by CLS). In fact, int translates to Int32 during compilation.
Also,
In C#, long maps to System.Int64, but in a different programming
language, long could map to Int16 or Int32. In fact, C++/CLI does
treat long as Int32.
In fact, most (.NET) languages won't even treat long as a keyword and won't
compile code that uses it.
I have seen this author, and much of the standard literature on .NET, prefer the FCL types (i.e., Int32) to the language-specific primitive types (i.e., int), mainly because of such interoperability concerns.
They tell you what size of value can be stored in an integer variable. To remember the sizes, you can think in terms of :-) 2 beers (2 bytes), 4 beers (4 bytes) or 8 beers (8 bytes).
Int16 :- 2 beers/bytes = 16 bit = 2^16 = 65536 = 65536/2 = -32768 to 32767
Int32 :- 4 beers/bytes = 32 bit = 2^32 = 4294967296 = 4294967296/2 = -2147483648 to 2147483647
Int64 :- 8 beers/bytes = 64 bit = 2^64 = 18446744073709551616 = 18446744073709551616/2 = -9223372036854775808 to 9223372036854775807
In short, you cannot store more than 32,767 in an Int16, more than 2,147,483,647 in an Int32, or more than 9,223,372,036,854,775,807 in an Int64.
To understand the above calculation, you can check out this video: int16 vs int32 vs int64.
A very important note on the 16, 32 and 64 types:
If you run this query...
Array.IndexOf(new Int16[]{1,2,3}, 1)
you are supposed to get zero (0), because you are asking: is 1 within the array of 1, 2, 3?
If you get -1 as the answer, it means 1 is not within the array of 1, 2, 3.
Well check out what I found:
All the following should give you 0 and not -1
(I've tested this in all framework versions 2.0, 3.0, 3.5, 4.0)
C#:
Array.IndexOf(new Int16[]{1,2,3}, 1) = -1 (not correct)
Array.IndexOf(new Int32[]{1,2,3}, 1) = 0 (correct)
Array.IndexOf(new Int64[]{1,2,3}, 1) = 0 (correct)
VB.NET:
Array.IndexOf(new Int16(){1,2,3}, 1) = -1 (not correct)
Array.IndexOf(new Int32(){1,2,3}, 1) = 0 (correct)
Array.IndexOf(new Int64(){1,2,3}, 1) = -1 (not correct)
So my point is, for Array.IndexOf comparisons, only trust Int32!
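A plausible explanation, offered as a sketch rather than something verified against every framework version: the literal 1 is an int, so with an Int16[] array, generic type inference for Array.IndexOf<T> fails and the non-generic Array.IndexOf(Array, object) overload runs, in which a boxed int never equals a boxed short. Casting the needle makes the generic overload apply:

short[] values = { 1, 2, 3 };

Console.WriteLine(Array.IndexOf(values, 1));        // -1: non-generic overload, boxed int compared to boxed short
Console.WriteLine(Array.IndexOf(values, (short)1)); // 0: generic Array.IndexOf<short> is used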
EDIT: This isn't quite true for C#, a tag I missed when I answered this question - if there is a more C# specific answer, please vote for that instead!
They all represent integer numbers of varying sizes.
However, there's a very very tiny difference.
int16, int32 and int64 all have a fixed size.
The size of an int depends on the architecture you are compiling for - the C spec only defines an int as larger than or equal to a short, though in practice it's the width of the processor you're targeting, which is probably 32-bit, but you should know that it might not be.
Nothing. The sole difference between the types is their size (and, hence, the range of values they can represent).
int and int32 are one and the same (32-bit integer)
int16 is a short int (2 bytes or 16 bits)
int64 is the long data type (8 bytes or 64 bits)
They are indeed synonymous; however, I found a small difference between them:
1) You cannot use Int32 when declaring an enum's underlying type:
enum Test : Int32
{ XXX = 1 } // gives you a compilation error

enum Test : int
{ XXX = 1 } // works fine
2) Int32 lives in the System namespace. If you remove 'using System;' you will get a compilation error, but not in the case of int.
The answers above are about right: int, Int16, Int32 and Int64 differ in their data-holding capacity. But here is why compilers have to deal with these - it is to solve the potential Year 2038 problem. Check out the link to learn more about it.
https://en.wikipedia.org/wiki/Year_2038_problem
Int = Int32 --> original long type
Int16 --> original int
Int64 --> new data type that became available after 64-bit systems
"int" is only available for backward compatibility. We should really be using the new int types to make our programs more precise.
---------------
One more thing I noticed along the way is that there is no class named Int analogous to Int16, Int32 and Int64. All the helpful functions like TryParse for integers come from Int32.TryParse.

Why is Count not an unsigned integer? [duplicate]

Possible Duplicates:
Why does .NET use int instead of uint in certain classes?
Why is Array.Length an int, and not an uint
I've always wondered why .Count isn't an unsigned integer instead of a signed one.
For example, take ListView.SelectedItems.Count. The number of elements can't be less than 0, so why is it a signed int?
If I try to test if there are elements selected, I would like to test
if (ListView.SelectedItems.Count == 0) {}
but because it's a signed integer, I have to test
if (ListView.SelectedItems.Count <= 0) {}
or is there any case when .Count could be < 0 ?
Unsigned integers are not CLS-compliant (Common Language Specification).
For more info on CLS compliant code, see this link:
http://msdn.microsoft.com/en-us/library/bhc3fa7f.aspx
Maybe because the uint data type is not part of the CLS (Common Language Specification), as not all .NET languages support it.
Here is very similar thread about arrays:
Why is Array.Length an int, and not an uint
It's not CLS compliant, largely to allow wider support from different languages.
A signed int offers ease in porting code from C or C++ that uses pointer arithmetic.
Count can be part of an expression where the overall value can be negative. In particular, count has a direct relationship to indices, where valid indices are always in the range [0, Count - 1], but negative results are used e.g. by some binary search methods (including those provided by the BCL) to reflect the position where a new item should be inserted to maintain order.
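For example, List<T>.BinarySearch encodes "not found" as a negative number whose bitwise complement is the insertion index (a small sketch):

// requires: using System; using System.Collections.Generic;
var list = new List<int> { 1, 3, 5, 7 };

int index = list.BinarySearch(4); // 4 is not in the list
Console.WriteLine(index);         // -3: bitwise complement of the insertion index
Console.WriteLine(~index);        // 2: where 4 would be inserted to keep the list sorted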
Let’s look at this from a practical angle.
For better or worse, signed ints are the normal sort of ints in use in .NET. It was also normal to use signed ints in C and C++. So, most variables are declared to be int rather than unsigned int unless there is a good reason otherwise.
Converting between an unsigned int and a signed int has issues and is not always safe.
On a 32-bit system it is not possible for a collection to have anywhere close to 2^32 items in it, so a signed int is big enough in all cases.
On a 64-bit system, an unsigned int does not gain you much; in most cases a signed int is still big enough, and otherwise you need to use a 64-bit int. (I expect that none of the standard collections would cope well with anywhere near 2^31 items on a 64-bit system!)
Therefore given that using an unsigned int has no clear advantage, why would you use an unsigned int?
In VB.NET, the normal looping construct (a "For/Next loop") will execute the loop with values up to and including the maximum value specified, unlike C, which can easily loop with values below the upper limit. Thus, it is often necessary to specify a loop as e.g. "For I = 0 To Array.Length - 1"; if Array.Length were unsigned and zero, that would cause an obvious problem. Even in C, one benefits from being able to say "for (i = Array.Length - 1; i >= 0; --i)". Sometimes I think it would be useful to have a 31-bit integer type which would support widening casts to both signed and unsigned int, but I've never heard of a language supporting such a thing.
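The reverse-loop pitfall alluded to above is easy to see in C# if the index is unsigned (a sketch; Array.Length is actually an int, so the cast below is only there to simulate an unsigned length):

int[] data = { 10, 20, 30 };

// Buggy: with an unsigned index, i >= 0 is always true; when i reaches 0 and is
// decremented it wraps around to uint.MaxValue, and data[i] then throws.
// for (uint i = (uint)(data.Length - 1); i >= 0; i--) { Console.WriteLine(data[i]); }

// A signed index makes the natural countdown work:
for (int i = data.Length - 1; i >= 0; i--)
{
    Console.WriteLine(data[i]);
}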
