What is the memory footprint of a Nullable<T> - c#

An int (Int32) has a memory footprint of 4 bytes. But what is the memory footprint of:
int? i = null;
and :
int? i = 3;
Is this in general or type dependent?

I'm not 100% sure, but I believe it should be 8 bytes: 4 bytes for the Int32 and (since everything has to be 4-byte aligned on a 32-bit machine) another 4 bytes for a boolean indicating whether the integer value has been specified or not.
Note, thanks to #sensorSmith, I am now aware that newer releases of .Net allow nullable values to be stored in smaller footprints (when the hardware memory design allows smaller chunks of memory to be independently allocated). On a 64 Bit machine it would still be 8 bytes (64 bits) since that is the smallest chunk of memory that can be addressed...
A Nullable<bool>, for example, needs only a single bit for the boolean and another single bit for the IsNull flag, so the total storage requirement is less than a byte and it could theoretically be stored in a single byte. However, as usual, if the smallest chunk of memory that can be allocated is 8 bytes (as on a 64-bit machine), it will still take 8 bytes of memory.

The size of Nullable<T> is definitely type dependent. The structure has two members:
boolean: for the hasValue flag
value: for the underlying value
The size of the structure will typically map out to 4 plus the size of the type parameter T.

int? a = 3;
00000038 lea ecx,[ebp-48h]
0000003b mov edx,3
00000040 call 78BFD740
00000045 nop
a = null;
00000046 lea edi,[ebp-48h]
00000049 pxor xmm0,xmm0
0000004d movq mmword ptr [edi],xmm0
It seems that first dword is for the value, and the second one is for null flag. So, 8 bytes total.
Curiously, BinaryWriter doesn't like to write nullable types. I was wondering if it could pack it tighter than 8 bytes...

The .NET (and most other languages/frameworks) default behavior is to align struct fields to a multiple of their size and structs themselves to a multiple of the size of their largest field. Reference: StructLayout
Nullable<T> has a bool flag and the T value. Since bool takes just 1 byte, the size of the largest field is the size of T; and Nullable doubles the space needed compared to a T alone. Reference:Nullable Source
Clarification: If T is itself a non-primitive struct rather than a primitive type, Nullable increases the space needed by the size of the largest primitive field within T or, recursively, within any of T's non-primitive fields. So, the size of a Nullable<Nullable<bool>> is 3, not 4.

You can check using some code similar to the one at https://www.dotnetperls.com/nullable-memory.
I got the following results:
Int32 4 bytes
Int32? 8 bytes
Int16 2 bytes
Int16? 4 bytes
Int64 8 bytes
Int64? 16 bytes
Byte 1 bytes
Byte? 2 bytes
bool 1 bytes
bool? 2 bytes
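A rough way to reproduce numbers like these without the array-allocation trick from the linked page is to ask the runtime for the managed size of each type. This is only a sketch: it assumes a recent .NET (5 or later) where System.Runtime.CompilerServices.Unsafe is available in-box, and the class and method names (NullableSizes, Of) are made up for illustration.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class NullableSizes
{
    // Size in bytes of a T value as the runtime lays it out (illustrative helper).
    public static int Of<T>() => Unsafe.SizeOf<T>();

    public static void Main()
    {
        // Each line prints the plain type's size next to its nullable counterpart.
        Console.WriteLine($"Int32 {Of<int>()} / Int32? {Of<int?>()}");
        Console.WriteLine($"Int16 {Of<short>()} / Int16? {Of<short?>()}");
        Console.WriteLine($"Int64 {Of<long>()} / Int64? {Of<long?>()}");
        Console.WriteLine($"Byte  {Of<byte>()} / Byte?  {Of<byte?>()}");
        Console.WriteLine($"bool  {Of<bool>()} / bool?  {Of<bool?>()}");
    }
}
```

For these primitive types the nullable size should come out as twice the plain size, matching the list above: the 1-byte flag is padded up to the alignment of T.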

An int? is a struct containing a boolean hasValue, and an int. Therefore, it has a footprint of 5 bytes. The same applies to all instances of a nullable<T>: size = sizeof(T)+sizeof(bool)

The nullable type is a structure that contains the regular variable and a flag for the null state.
For a nullable int that would mean that it contains five bytes of data, but it's of course padded up to complete words, so it's using eight bytes.
You can generally expect that any nullable type will be four bytes larger than the regular type, except for small types like byte and boolean.

32-bit and 64-bit machines:
int == 4 bytes
int? == 8 bytes == 4 for int + 4 for the nullable type wrapper.
The nullable type wrapper requires 4 bytes of storage. And the integer
itself requires 4 bytes for each element. This is an efficient
implementation. In an array many nullable types are stored in
contiguous memory.
Based on a personal test (.NET Framework 4.6.1, x64, Release) and from – https://www.dotnetperls.com/nullable-memory
Also, if interesting: why int on x64 equals only 4 bytes?
Note: this is valid for Nullable<int> only, the size of Nullable<T> totally depends on the type.


Why is BigInteger in C# a struct if it has an unbounded size?

Why is BigInteger declared as a ValueType (struct) in C#? It seems to be very similar to the string type which is declared as a reference type.
Both are immutable (value types). Both can be arbitrarily large.
The recommendation I have heard is that a struct should never be more than 16 Bytes. BigInteger can get much larger than 16 Bytes and I would think this would make frequent operations extremely slow since it is always copied by value.
Copying a BigInteger does not cause the underlying data to be copied. Instead, just a reference to the data is copied.
Since BigInteger values are immutable it is safe for two or more values to share a common data buffer.
BigInteger has two instance fields:
int _sign - probably tells whether its a positive or negative value.
uint[] _bits - this is a reference to the data buffer.
An int is 4 bytes and a reference is 8 bytes (on a 64-bit system). Therefore the size of a BigInteger is ≤ 16 bytes.
If you look at the source for BigInteger and strip it down to only instance-level fields (the things that would count toward its size), all the struct has is
public struct BigInteger : IFormattable, IComparable, IComparable<BigInteger>, IEquatable<BigInteger>
{
internal int _sign;
internal uint[] _bits;
}
So you have 4 bytes for _sign and 4 or 8 bytes for uint[] depending on if you are on a 32 or 64 bit system due to the fact that arrays are reference types. This gives you a total of 8 or 12 bytes, well below the 16 recommendation. (note: The CLR will pad the 12 byte version to 16 to make it a multiple of 8 for optimization reasons)
When a new BigInteger is created the _bits array will be shared between the two instances. Because the type is immutable (you can't change the value of any cell of _bits) it is safe for the two copies to share the array.
Here are the fields of a BigInteger:
// For values int.MinValue < n <= int.MaxValue, the value is stored in sign
// and _bits is null. For all other values, sign is +1 or -1 and the bits are in _bits
internal int _sign;
internal uint[] _bits;
So, one int and one uint[], which is a reference type. The type itself can't grow arbitrarily large. It'll be 8 bytes on x86 and 16 bytes on x64 (12 bytes for the fields + 4 bytes of padding).
string and arrays are the only types in the framework which have a varying size and are special-cased in the runtime.
As to answer the question: there is less overhead in using a struct. Having a class wrapper over two fields would cause more indirection and more GC pressure for no good reason. Besides, a BigInteger is semantically a value.
The size of a struct matters only because the entire struct has to be copied each time you pass it around from one function to another. If it was not for the copying, nobody would care.
However, BigInteger consists of two parts:
The actual struct, which is the part that gets copied when you pass a BigInteger around, and is fairly small, and
The array of bits, which is of arbitrary length, but which is not copied each time the struct is copied.
So, when you pass a BigInteger, this is what happens:
Before copying:
[BigInteger instance 1] ---------> [array of bits]
After copying:
[BigInteger instance 1] ---------> [array of bits]
                            |
[BigInteger instance 2] ----+
Notice how there is always just one array of bits.
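The picture above can be checked with a small stand-in struct. This is a sketch, not BigInteger itself: SmallHandle and CopyDemo are hypothetical names, and the struct only mimics the "one int plus one array reference" shape described in the answers.

```csharp
using System;

// Hypothetical stand-in with the same shape as BigInteger's instance fields.
public struct SmallHandle
{
    public int Sign;
    public uint[] Bits;
}

public static class CopyDemo
{
    // Copies the struct and reports whether both copies point at the same array.
    public static bool SharesBuffer()
    {
        var first = new SmallHandle { Sign = 1, Bits = new uint[1000] };
        SmallHandle second = first; // copies Sign and the reference, not the 1000 elements
        return object.ReferenceEquals(first.Bits, second.Bits);
    }

    public static void Main()
    {
        Console.WriteLine(SharesBuffer()); // True
    }
}
```

Assigning the struct copies only the small fixed-size part, so the cost of passing it around does not grow with the length of the array.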

Why is the minimum size of a reference type 12 bytes for a 32 bit .NET process

I was reading the Pro .Net Performance book section on reference type internals. It mentions that for a 32 bit .net process a reference type has 4 bytes of object header and 4 bytes of method table pointer. Also, says that on a 32 bit system, the objects are aligned to the nearest 4 byte multiple, which makes the minimum size of a reference type 12 bytes.
My question is, why is the minimum size 12 bytes? The object is 8 bytes and that already aligns with a 4 byte boundary.
Minimum of 12 bytes is a requirement of the garbage collection implementation.
From here: http://msdn.microsoft.com/en-us/magazine/cc163791.aspx#S9
The Base Instance Size is the size of the object as computed by the class loader, based on the field declarations in the code. As discussed previously, the current GC implementation needs an object instance of at least 12 bytes. If a class does not have any instance fields defined, it will carry an overhead of 4 bytes. The rest of the 8 bytes will be taken up by the Object Header (which may contain a syncblk number) and TypeHandle.
(TypeHandle being a handle to the method table).
So you have 8 bytes of overhead (the object header and the method table pointer). If you want any data in the object, then you need at least one more byte, and because memory is allocated to objects in 4-byte chunks, you end up with a minimum of 12 bytes.

How is memory allocated in int array

How much space does a int array take up? Or how much space (in bytes) does a int array consumes that looks something like this:
int[] SampleArray=new int[]{1,2,3,4};
Is memory allocation language specific ??
Thank you all
Since you added a lot of language tags, I will answer for C#. In C#, this depends on whether the process is 32-bit or 64-bit.
For 32-bit, each int is 4 bytes and the reference to the array object is another 4 bytes, which makes 4 * 4 + 4 = 20 bytes.
For 64-bit, each int is 4 bytes and the reference to the array object is 8 bytes, which makes 4 * 4 + 8 = 24 bytes.
From C# 5.0 in a Nutshell, page 22:
Each reference to an object requires an extra four or eight bytes,
depending on whether the .NET runtime is running on a 32- or 64-bit
platform.
In C++, how much memory new int[4]{1, 2, 3, 4} actually allocates is implementation-defined but the size of the array will be sizeof(int)*4.
The question is: is memory allocation language specific?
Yes, memory allocation is language specific; it varies from language to language.
For example:
sizeof(int)*4
In Java an int is 4 bytes, so the memory consumption is 4*4 = 16 bytes.
It depends on the language and, even more, on the operating system.
You need 4 integers. Normally an integer is 2 or 4 bytes (4 on most systems), but to be sure check sizeof(int).
(Also keep in mind that the values can be represented differently depending on the system, e.g. MSB first or LSB first, or a mix when 4 bytes are used.)
In Java int[] array is an Object which in memory represented by the header (8 bytes for x86) and int length field (4 bytes) followed by array of ints (arrayLength * 4).
approxSize = 8 + 4 + 4 * arraylength
see more here http://www.javamex.com/tutorials/memory/object_memory_usage.shtml

What is the difference between int, Int16, Int32 and Int64?

What is the difference between int, System.Int16, System.Int32 and System.Int64 other than their sizes?
Each type of integer has a different range of storage capacity
Type Capacity
Int16 -- (-32,768 to +32,767)
Int32 -- (-2,147,483,648 to +2,147,483,647)
Int64 -- (-9,223,372,036,854,775,808 to +9,223,372,036,854,775,807)
As stated by James Sutherland in his answer:
int and Int32 are indeed synonymous; int will be a little more
familiar looking, Int32 makes the 32-bitness more explicit to those
reading your code. I would be inclined to use int where I just need
'an integer', Int32 where the size is important (cryptographic code,
structures) so future maintainers will know it's safe to enlarge an
int if appropriate, but should take care changing Int32 variables
in the same way.
The resulting code will be identical: the difference is purely one of
readability or code appearance.
The only real difference here is the size. All of the int types here are signed integer values which have varying sizes
Int16: 2 bytes
Int32 and int: 4 bytes
Int64 : 8 bytes
There is one small difference between Int64 and the rest. On a 32 bit platform assignments to an Int64 storage location are not guaranteed to be atomic. It is guaranteed for all of the other types.
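The alias relationship and the sizes listed above can be checked directly in C#. A minimal sketch; the class name IntAliases is just for illustration.

```csharp
using System;

public static class IntAliases
{
    // int is the C# keyword for System.Int32, so both name the same runtime type.
    public static bool IntIsInt32() => typeof(int) == typeof(Int32);

    public static void Main()
    {
        Console.WriteLine(IntIsInt32());   // True
        Console.WriteLine(sizeof(short));  // 2
        Console.WriteLine(sizeof(int));    // 4
        Console.WriteLine(sizeof(long));   // 8
    }
}
```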
int
It is a primitive data type defined in C#.
It is mapped to Int32 of FCL type.
It is a value type and represent System.Int32 struct.
It is signed and takes 32 bits.
It has minimum -2147483648 and maximum +2147483647 value.
Int16
It is a FCL type.
In C#, short is mapped to Int16.
It is a value type and represent System.Int16 struct.
It is signed and takes 16 bits.
It has minimum -32768 and maximum +32767 value.
Int32
It is a FCL type.
In C#, int is mapped to Int32.
It is a value type and represent System.Int32 struct.
It is signed and takes 32 bits.
It has minimum -2147483648 and maximum +2147483647 value.
Int64
It is a FCL type.
In C#, long is mapped to Int64.
It is a value type and represent System.Int64 struct.
It is signed and takes 64 bits.
It has minimum –9,223,372,036,854,775,808 and maximum 9,223,372,036,854,775,807 value.
According to 'CLR via C#' by Jeffrey Richter (one of the contributors to .NET Framework development):
int is a primitive type allowed by the C# compiler, whereas Int32 is the Framework Class Library type (available across languages that abide by CLS). In fact, int translates to Int32 during compilation.
Also,
In C#, long maps to System.Int64, but in a different programming
language, long could map to Int16 or Int32. In fact, C++/CLI does
treat long as Int32.
In fact, most (.NET) languages won't even treat long as a keyword and won't
compile code that uses it.
I have seen this author, and much of the standard literature on .NET, prefer FCL types (i.e., Int32) to the language-specific primitive types (i.e., int), mainly because of such interoperability concerns.
They tell you what size of value can be stored in an integer variable. To remember the size you can think in terms of :-) 2 beers (2 bytes), 4 beers (4 bytes) or 8 beers (8 bytes).
Int16: 2 beers/bytes = 16 bits = 2^16 = 65536 values, half of them negative: -32768 to 32767
Int32: 4 beers/bytes = 32 bits = 2^32 = 4294967296 values, half of them negative: -2147483648 to 2147483647
Int64: 8 beers/bytes = 64 bits = 2^64 = 18446744073709551616 values, half of them negative: -9223372036854775808 to 9223372036854775807
In short, you cannot store a value greater than 32767 in an Int16, greater than
2147483647 in an Int32, or greater than 9223372036854775807 in an
Int64.
To understand above calculation you can check out this video int16 vs int32 vs int64
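The same calculation can be written out in code: an n-bit signed type runs from -2^(n-1) to 2^(n-1) - 1. The sketch below uses BigInteger only so that the 64-bit case does not overflow; SignedRange, Min and Max are illustrative names.

```csharp
using System;
using System.Numerics;

public static class SignedRange
{
    // Lowest value of a two's-complement signed integer with the given bit width.
    public static BigInteger Min(int bits) => -BigInteger.Pow(2, bits - 1);

    // Highest value: one less than 2^(bits-1), because zero takes one of the slots.
    public static BigInteger Max(int bits) => BigInteger.Pow(2, bits - 1) - 1;

    public static void Main()
    {
        foreach (int bits in new[] { 16, 32, 64 })
        {
            Console.WriteLine($"Int{bits}: {Min(bits)} to {Max(bits)}");
        }
    }
}
```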
A very important note on the 16, 32 and 64 types:
If you run this query...
Array.IndexOf(new Int16[]{1,2,3}, 1)
you are supposed to get zero (0), because you are asking whether 1 is within the array of 1, 2 and 3.
If you get -1 as the answer, it means 1 is not within the array of 1, 2 and 3.
Well, check out what I found:
All of the following should give you 0 and not -1
(I've tested this in all framework versions 2.0, 3.0, 3.5, 4.0)
C#:
Array.IndexOf(new Int16[]{1,2,3}, 1) = -1 (not correct)
Array.IndexOf(new Int32[]{1,2,3}, 1) = 0 (correct)
Array.IndexOf(new Int64[]{1,2,3}, 1) = 0 (correct)
VB.NET:
Array.IndexOf(new Int16(){1,2,3}, 1) = -1 (not correct)
Array.IndexOf(new Int32(){1,2,3}, 1) = 0 (correct)
Array.IndexOf(new Int64(){1,2,3}, 1) = -1 (not correct)
So my point is, for Array.IndexOf comparisons, only trust Int32!
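A likely explanation for the Int16 case in C# (my reading, not something stated in the answer above) is that the literal 1 is an int, so the call binds to the non-generic Array.IndexOf(Array, object) overload and a boxed int is compared against boxed shorts, which are never equal. Casting the search value to the element type avoids that. IndexOfDemo and its methods are illustrative names.

```csharp
using System;

public static class IndexOfDemo
{
    // Search value is an int literal; the answer above reports -1 for this call.
    public static int WithIntLiteral() => Array.IndexOf(new short[] { 1, 2, 3 }, 1);

    // Search value has the same type as the elements, so the generic overload is used.
    public static int WithShortValue() => Array.IndexOf(new short[] { 1, 2, 3 }, (short)1);

    public static void Main()
    {
        Console.WriteLine(WithIntLiteral());
        Console.WriteLine(WithShortValue()); // 0
    }
}
```

So rather than trusting only Int32, it seems safer to make sure the value passed to Array.IndexOf has exactly the array's element type.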
EDIT: This isn't quite true for C#, a tag I missed when I answered this question - if there is a more C# specific answer, please vote for that instead!
They all represent integer numbers of varying sizes.
However, there's a very very tiny difference.
int16, int32 and int64 all have a fixed size.
The size of an int depends on the architecture you are compiling for - the C spec only defines an int as larger or equal to a short though in practice it's the width of the processor you're targeting, which is probably 32bit but you should know that it might not be.
Nothing. The sole difference between the types is their size (and, hence, the range of values they can represent).
int and int32 are one and the same (32-bit integer)
int16 is short int (2 bytes or 16-bits)
int64 is the long datatype (8 bytes or 64-bits)
They are indeed synonymous. However, I found a small difference between them:
1) You cannot use Int32 while creating an enum
enum Test : Int32
{
XXX = 1 // gives you a compilation error
}
enum Test : int
{
XXX = 1 // works fine
}
2) Int32 lives in the System namespace. If you remove using System; you will get a compilation error, but not in the case of int.
The answers by the above people are about right. int, int16, int32... differs based on their data holding capacity. But here is why the compilers have to deal with these - it is to solve the potential Year 2038 problem. Check out the link to learn more about it.
https://en.wikipedia.org/wiki/Year_2038_problem
Int=Int32 --> Original long type
Int16 --> Original int
Int64 --> New data type become available after 64 bit systems
"int" is only available for backward compatibility. We should be really using new int types to make our programs more precise.
---------------
One more thing I noticed along the way is there is no class named Int similar to Int16, Int32 and Int64. All the helpful functions like TryParse for integer come from Int32.TryParse.

How can I simulate a C++ union in C#?

I have a small question about structures with the LayoutKind.Explicit attribute set. I declared the struct as you can see, with fieldTotal being 64 bits, fieldFirst being the first 32 bits and fieldSecond the last 32 bits. After setting both fieldFirst and fieldSecond to Int32.MaxValue, I'd expect fieldTotal to be Int64.MaxValue, which doesn't actually happen. Why is this? I know C# does not really support C++ unions; maybe it only reads the values correctly when interoping, but when we try to set the values ourselves it simply doesn't handle it well?
[StructLayout(LayoutKind.Explicit)]
struct STRUCT {
[FieldOffset(0)]
public Int64 fieldTotal;
[FieldOffset(0)]
public Int32 fieldFirst;
[FieldOffset(32)]
public Int32 fieldSecond;
}
STRUCT str = new STRUCT();
str.fieldFirst = Int32.MaxValue;
str.fieldSecond = Int32.MaxValue;
Console.WriteLine(str.fieldTotal); // <----- I'd expect both these values
Console.WriteLine(Int64.MaxValue); // <----- to be the same.
Console.ReadKey();
The reason is that FieldOffsetAttribute takes a number of bytes as parameter -- not number of bits. This works as expected:
[StructLayout(LayoutKind.Explicit)]
struct STRUCT
{
[FieldOffset(0)]
public Int64 fieldTotal;
[FieldOffset(0)]
public Int32 fieldFirst;
[FieldOffset(4)]
public Int32 fieldSecond;
}
Looking at the hex values of Int32.MaxValue and Int64.MaxValue should provide the answer.
The key is the most significant bit. For a signed integer, the most significant bit is set only for a negative number. So the max value of Int32 is a 0 followed by a whole series of 1s. The order is unimportant, just that there will be at least a single 0 bit. The same is true of Int64.MaxValue.
Now consider how a union should work. It will essentially lay out the bits of the values next to one another. So now you have a set of bits 64 in length which contains two 0 bit values. One for each of the Int32.MaxValue instances. This cannot ever be equal to Int64.MaxValue since it can contain only a single 0 bit.
Oddly enough you will probably get the behavior you are looking for if you set fieldSecond to Int32.MinValue.
EDIT Missed that you need to make it FieldOffset(4) as well.
Ben M provided one of the more important elements - your definition is not setup correctly.
That being said, this won't work - even in C++ with a union. The values you specified won't be (and shouldn't be) the same values, since you're using signed (not unsigned) ints. With a signed int (Int32), you're going to have a 0 bit followed by 1 bits. When you do the union, you'll end up with a 0 bit, followed by a bunch of 1 bits, then another 0 bit, then a bunch of 1 bits... The second 0 bit is what's messing you up.
If you used UInt32/UInt64, this would work properly, since the extra sign bit doesn't exist.
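Both points (byte offsets, and signed versus unsigned halves) can be seen side by side in a short sketch. The struct and method names here (SignedUnion, UnsignedUnion, UnionDemo) are made up for illustration; the layout attributes are the same ones used in the question.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Explicit)]
public struct SignedUnion
{
    [FieldOffset(0)] public long Total;
    [FieldOffset(0)] public int First;
    [FieldOffset(4)] public int Second; // offset is in bytes, so 4, not 32
}

[StructLayout(LayoutKind.Explicit)]
public struct UnsignedUnion
{
    [FieldOffset(0)] public ulong Total;
    [FieldOffset(0)] public uint First;
    [FieldOffset(4)] public uint Second;
}

public static class UnionDemo
{
    // Two signed maxima leave a zero bit at the top of each 32-bit half.
    public static long SignedTotal()
    {
        var s = new SignedUnion { First = int.MaxValue, Second = int.MaxValue };
        return s.Total;
    }

    // Two unsigned maxima set all 64 bits.
    public static ulong UnsignedTotal()
    {
        var u = new UnsignedUnion { First = uint.MaxValue, Second = uint.MaxValue };
        return u.Total;
    }

    public static void Main()
    {
        Console.WriteLine($"0x{SignedTotal():X16}");          // 0x7FFFFFFF7FFFFFFF
        Console.WriteLine(UnsignedTotal() == ulong.MaxValue); // True
    }
}
```

Because both halves hold the same bit pattern, the result does not depend on byte order here; with different values in each half it would.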
