I am working on optimizing a memory-hungry application, and I have a question about the size overhead of C# reference types.
A C# object consumes as many bytes as its fields, plus some additional administrative
overhead. I presume that this administrative overhead can differ between .NET versions and implementations.
Do you know the size (or the maximum size, if the overhead is variable) of the administrative overhead for C# objects (C# 4.0 on Windows 7 and 8)?
Does the administrative overhead differ between the 32-bit and 64-bit .NET runtimes?
Typically, there is an 8 or 16 byte overhead per object allocated by the GC: 4 bytes for the syncblk and 4 bytes for the type handle on 32-bit runtimes, and 8 bytes for each on 64-bit runtimes. For details, see the "ObjectInstance" section of Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects in MSDN Magazine.
Note that the size of the reference itself also differs: 4 bytes on 32-bit and 8 bytes on 64-bit .NET runtimes.
Also, there may be padding for types to fit on address boundaries, though this depends a lot on the type in question. This can cause "empty space" between objects as well, but is up to the runtime (mostly, though you can affect it with StructLayoutAttribute) to determine when and how data is aligned.
There is an article online with the title "The Truth About .NET Objects And Sharing Them Between AppDomains" which shows some rotor source code and some results of experimenting with objects and sharing them between app domains via a plain pointer.
http://geekswithblogs.net/akraus1/archive/2012/07/25/150301.aspx
In short, the minimum object size is:
12 bytes for all 32-bit versions of the CLR
24 bytes for all 64-bit versions of the CLR
You can test this quite easily by adding millions of objects (N) to an array. Since the pointer size is known, you can calculate the object size by subtracting the array's share from the memory growth and dividing the result by N:
var initial = GC.GetTotalMemory(true);
const int N = 10 * 1000 * 1000;
var arr = new object[N];
for (int i = 0; i < N; i++)
{
    arr[i] = new object();
}
// Subtract the N references held by the array, then average over the N objects.
var ObjSize = (GC.GetTotalMemory(false) - initial - N * IntPtr.Size) / N;
This gives an approximate value on your .NET platform.
The object size is actually defined to allow the GC to make assumptions about the minimum object size.
\sscli20\clr\src\vm\object.h
//
// The generational GC requires that every object be at least 12 bytes
// in size.
#define MIN_OBJECT_SIZE (2*sizeof(BYTE*) + sizeof(ObjHeader))
On 32-bit, for example, this means that the minimum object size is 12 bytes, which leaves a 4-byte hole. This hole is unused for an empty object, but if you add e.g. an int to your empty class, it is filled and the object size stays at 12 bytes.
There are two types of overhead for an object:
Internal data used to handle the object.
Padding between data members.
The internal data is two pointers, so in a 32-bit application that is 8 bytes, and in a 64-bit application that is 16 bytes.
Data members are padded so that each starts on an address boundary suited to its type. For example, if you have a byte and an int in the class, the byte is probably padded with three unused bytes so that the int starts on the next machine word boundary.
The layout of the classes is determined by the JIT compiler depending on the architecture of the system (and might vary between framework versions), so it's not known to the C# compiler.
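The padding described above can be observed with a small sketch. It uses Marshal.SizeOf on sequential-layout structs as a stand-in: this reports the marshalled (unmanaged) size, which for simple structs like these reflects the padding rules, but it is not a measurement of how the JIT lays out a class. The type names are hypothetical.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical types, for illustration only.
[StructLayout(LayoutKind.Sequential)]
public struct ByteThenInt
{
    public byte B;   // 1 byte, followed by 3 bytes of padding
    public int I;    // starts on a 4-byte boundary
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct ByteThenIntPacked
{
    public byte B;   // 1 byte, no padding because Pack = 1
    public int I;    // starts immediately after B
}

public static class PaddingDemo
{
    // Marshalled size with default packing.
    public static int SizeOfPadded()
    {
        return Marshal.SizeOf(typeof(ByteThenInt));
    }

    // Marshalled size with padding suppressed.
    public static int SizeOfPacked()
    {
        return Marshal.SizeOf(typeof(ByteThenIntPacked));
    }
}
```

With default packing the struct takes 8 bytes; with Pack = 1 it takes 5, at the cost of a misaligned int.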
Related
I have been looking at some SO questions related to the max size of an array of bytes (here and here) and have been playing with some arrays and getting some results I don't quite understand. My code is as follows:
byte[] myByteArr;
byte[] myByteArr2 = new byte[671084476];
for (int i = 1; i < 2; i++)
{
    myByteArr = new byte[671084476];
}
This will compile, and upon execution it will throw a 'System.OutOfMemoryException' on the initialization of myByteArr. If I change the 2 in the for loop to a 1, or comment out one of the initializations (either myByteArr2 or myByteArr), it runs fine.
Also, byte[] myByteArr = new byte[Int32.MaxValue - 56]; throws the same exception.
Why does this happen when compiled for 32-bit? Aren't they within the 2GB limit?
The limits of a 32-bit program are not per-object. It's a process limit. You cannot have more than 2GB total in use.
Not only that, but in practice, it's often difficult to get anywhere near 2GB due to address space fragmentation. .NET's managed (i.e. movable) memory helps somewhat, but doesn't eliminate this problem.
Even if you are using a 64-bit process, you may have a similar problem because in C# arrays are indexed by an int, which is defined as a 32-bit signed integer, and thus can't address past the 2GB boundary in an array of bytes. If you read the answer to the second link carefully, you'll also see that there is a 2GB per object limit. Your array of bytes presumably has some overhead, so it can't get to the full 2GB just for the raw data.
See @Habib's link in the comments for details.
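To make the arithmetic concrete, here is a minimal sketch that checks which runtime the process is on and totals the raw data size of the two arrays from the question. The class and method names are hypothetical, and array headers are ignored.

```csharp
using System;

public static class AddressSpaceCheck
{
    // True when the process runs on the 64-bit runtime (pointers are 8 bytes).
    public static bool Is64Bit()
    {
        return IntPtr.Size == 8;
    }

    // Raw data bytes for `count` byte arrays of `length` elements each.
    // The cast to long avoids overflowing a 32-bit int.
    public static long TotalBytes(int count, int length)
    {
        return (long)count * length;
    }
}
```

AddressSpaceCheck.TotalBytes(2, 671084476) is 1,342,168,952 bytes, roughly 1.25 GB. That is under 2GB in total, but each array still needs its own contiguous block of about 640 MB, which a fragmented 32-bit address space often cannot provide twice.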
I have an empty class, MyCustomType. I created an instance of it and compiled my application for the x64 platform. Then I wondered how many bytes an instance occupies. I opened .NET Memory Profiler and, according to it, an instance of my type weighs 24 bytes. I know that on the x64 platform any reference type in .NET has an overhead of 16 bytes. Clearly 16 != 24, so my question is: where do the other 8 bytes come from?
Thanks!
internal class MyCustomType
{
}
1 - There’s a "base" overhead of 8 bytes per object in x86 and 16 per object in x64… given that we can store an Int32 of "real" data in x86 and still have an object size of 12, and likewise we can store two Int32s of real data in x64 and still have an object size of 24.
2 - There’s a "minimum" size of 12 bytes and 24 bytes respectively. In other words, you can’t have a type which is just the overhead. Note how the "Empty" class takes up the same size as creating instances of Object… there’s effectively some spare room, because the CLR doesn’t like operating on an object with no data. (Note that a struct with no fields takes up space too, even for local variables.)
3 - The x86 objects are padded to 4 byte boundaries; on x64 it’s 8 bytes (just as before)
4 - By default, the CLR is happy to pack fields pretty densely – Mixed2 only took as much space as ThreeInt32. My guess is that it reorganized the in-memory representation so that the bytes all came after the ints… and that’s what a quick bit of playing around with unsafe pointers suggests too… but I’m not sufficiently comfortable with this sort of thing to say for sure. Frankly, I don’t care… so long as it all works, what we’re interested in is the overall size, not the precise layout.
http://codeblog.jonskeet.uk/2011/04/05/of-memory-and-strings/
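The first three rules above can be combined into a rough size model. This is only a sketch of those rules, not a CLR API: the method name is hypothetical, and it ignores any padding between individual fields.

```csharp
using System;

public static class ObjectSizeModel
{
    // Estimates the heap size of an object whose fields occupy `fieldBytes`.
    public static int Estimate(int fieldBytes, bool is64Bit)
    {
        int overhead = is64Bit ? 16 : 8;   // sync block + type handle
        int minimum  = is64Bit ? 24 : 12;  // smallest object the CLR creates
        int align    = is64Bit ? 8 : 4;    // objects are padded to this boundary

        int raw = overhead + fieldBytes;
        int padded = (raw + align - 1) / align * align; // round up to boundary
        return Math.Max(minimum, padded);
    }
}
```

Under this model an empty class on x64 gives Estimate(0, true) == 24, which matches the 24 bytes reported by the profiler: 16 bytes of overhead plus 8 bytes of unused space required by the minimum size.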
The main question is about the maximum number of items that can be in a collection such as List. I was looking for answers on here but I don't understand the reasoning.
Assume we are working with a List<int> with sizeof(int) = 4 bytes... Everyone seems to be sure that for x64 you can have a maximum 268,435,456 int and for x86 a maximum of 134,217,728 int. Links:
List size limitation in C#
Where is the maximum capacity of a C# Collection<T> defined?
What's the max items in a List<T>?
However, when I tested this myself I see that it's not the case for x86. Can anyone point me to where I may be wrong?
//// Test engine set to `x86` for `default processor architecture`
[TestMethod]
public void TestMemory()
{
    var x = new List<int>();
    try
    {
        for (long y = 0; y < long.MaxValue; y++)
            x.Add(0);
    }
    catch (Exception)
    {
        System.Diagnostics.Debug.WriteLine("Actual capacity (int): " + x.Count);
        System.Diagnostics.Debug.WriteLine("Size of objects: " + System.Runtime.InteropServices.Marshal.SizeOf(x.First().GetType())); //// This gives us "4"
    }
}
For x64: 268435456 (expected)
For x86: 67108864 (2 times less than expected)
Why do people say that a List containing 134217728 int is exactly 512MB of memory... when you have 134217728 * sizeof(int) * 8 = 4,294,967,296 = 4GB... which is way more than the 2GB limit per process.
Whereas 67108864 * sizeof(int) * 8 = 2,147,483,648 = 2GB... which makes sense.
I am using .NET 4.5 on a 64 bit machine running windows 7 8GB RAM. Running my tests in x64 and x86.
EDIT: When I set capacity directly to List<int>(134217728) I get a System.OutOfMemoryException.
EDIT2: Error in my calculations: multiplying by 8 is wrong, indeed MB =/= Mbits. I was computing Mbits. Still 67108864 ints would only be 256MB... which is way smaller than expected.
The underlying storage for a List<T> class is a T[] array. A hard requirement for an array is that the process must be able to allocate a contiguous chunk of memory to store the array.
That's a problem in a 32-bit process. Virtual memory is used for code and data, you allocate from the holes that are left between them. And while a 32-bit process will have 2 gigabytes of memory, you'll never get anywhere near a hole that's close to that size. The biggest hole in the address space you can get, right after you started the program, is around 500 or 600 megabytes. Give or take, it depends a lot on what DLLs get loaded into the process. Not just the CLR, the jitter and the native images of the framework assemblies but also the kind that have nothing to do with managed code. Like anti-malware and the raft of "helpful" utilities that worm themselves into every process like Dropbox and shell extensions. A poorly based one can cut a nice big hole in two small ones.
These holes will also get smaller as the program has been allocating and releasing memory for a while. This is a general problem called address space fragmentation. A long-running process can fail on a 90 MB allocation, even though there is lots of unused memory lying around.
You can use SysInternals' VMMap utility to get more insight. A copy of Russinovich's book Windows Internals is typically necessary as well to make sense of what you see.
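The numbers in the question line up with this explanation once the sizes are worked out. The sketch below assumes the List<T> backing array doubles in capacity when it is full and that the old array stays alive while its contents are copied; the class and method names are hypothetical.

```csharp
public static class ListGrowthModel
{
    // Raw element bytes for a backing array of the given capacity.
    public static long ArrayBytes(long capacity, int elementSize)
    {
        return capacity * elementSize;
    }

    // Bytes that must be allocated at the same time while growing from
    // `capacity` to twice that: the old array plus the new one.
    public static long PeakBytesWhileDoubling(long capacity, int elementSize)
    {
        return ArrayBytes(capacity, elementSize)
             + ArrayBytes(capacity * 2, elementSize);
    }
}
```

For int elements, 67108864 items occupy 256 MB, and growing past that needs a new contiguous 512 MB block while the 256 MB block still exists, 768 MB in total. A failure at exactly 67108864 items is therefore consistent with no free 512 MB hole being available in that process.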
This might also help: I was able to replicate the 67108864 limit by creating a test project with the provided code.
In console, WinForms and WPF applications, I was able to reach the 134217728 limit.
In ASP.NET I got a limit of 33554432.
So, since you mentioned [TestMethod] in one of your comments, the test host seems to be the issue.
While you can have MaxValue items, in practice you will run out of memory before then.
Running as x86, the most RAM you can address, even on an x64 box, is 4GB; more likely 2GB or 3GB is the maximum on an x86 version of Windows.
The available memory is most likely much smaller, as the array can only be as large as the biggest contiguous free block.
I am a tinkerer—no doubt about that. For this reason (and very little beyond that), I recently did a little experiment to confirm my suspicion that writing to a struct is not an atomic operation, which means that a so-called "immutable" value type which attempts to enforce certain constraints could hypothetically fail at its goal.
I wrote a blog post about this using the following type as an illustration:
struct SolidStruct
{
    public SolidStruct(int value)
    {
        X = Y = Z = value;
    }

    public readonly int X;
    public readonly int Y;
    public readonly int Z;
}
While the above looks like a type for which it could never be true that X != Y or Y != Z, in fact this can happen if a value is "mid-assignment" at the same time it is copied to another location by a separate thread.
OK, big deal. A curiosity and little more. But then I had this hunch: my 64-bit CPU should actually be able to copy 64 bits atomically, right? So what if I got rid of Z and just stuck with X and Y? That's only 64 bits; it should be possible to overwrite those in one step.
Sure enough, it worked. (I realize some of you are probably furrowing your brows right now, thinking, Yeah, duh. How is this even interesting? Humor me.) Granted, I have no idea whether this is guaranteed or not given my system. I know next to nothing about registers, cache misses, etc. (I am literally just regurgitating terms I've heard without understanding their meaning); so this is all a black box to me at the moment.
The next thing I tried—again, just on a hunch—was a struct consisting of 32 bits using 2 short fields. This seemed to exhibit "atomic assignability" as well. But then I tried a 24-bit struct, using 3 byte fields: no go.
Suddenly the struct appeared to be susceptible to "mid-assignment" copies once again.
Down to 16 bits with 2 byte fields: atomic again!
Could someone explain to me why this is? I've heard of "bit packing", "cache line straddling", "alignment", etc.—but again, I don't really know what all that means, nor whether it's even relevant here. But I feel like I see a pattern, without being able to say exactly what it is; clarity would be greatly appreciated.
The pattern you're looking for is the native word size of the CPU.
Historically, the x86 family worked natively with 16-bit values (and before that, 8-bit values). For that reason, your CPU can handle these atomically: it's a single instruction to set these values.
As time progressed, the native element size increased to 32 bits, and later to 64 bits. In every case, an instruction was added to handle this specific amount of bits. However, for backwards compatibility, the old instructions were still kept around, so your 64-bit processor can work with all of the previous native sizes.
Since your struct elements are stored in contiguous memory (without padding, i.e. empty space), the runtime can exploit this knowledge to only execute that single instruction for elements of these sizes. Put simply, that creates the effect you're seeing, because the CPU can only execute one instruction at a time (although I'm not sure if true atomicity can be guaranteed on multi-core systems).
However, the native element size was never 24 bits. Consequently, there is no single instruction to write 24 bits, so multiple instructions are required for that, and you lose the atomicity.
The C# standard (ISO 23270:2006, ECMA-334) has this to say regarding atomicity:
12.5 Atomicity of variable references
Reads and writes of the following data types shall be atomic: bool, char, byte, sbyte, short, ushort,
uint, int, float, and reference types. In addition, reads and writes of enum types with an underlying type
in the previous list shall also be atomic. Reads and writes of other types, including long, ulong, double,
and decimal, as well as user-defined types, need not be atomic. (emphasis mine) Aside from the library functions designed
for that purpose, there is no guarantee of atomic read-modify-write, such as in the case of increment or
decrement.
Your example X = Y = Z = value is shorthand for 3 separate assignment operations, each of which is defined to be atomic by 12.5. The sequence of 3 operations (assign value to Z, assign Z to Y, assign Y to X) is not guaranteed to be atomic.
Since the language specification doesn't mandate atomicity, while X = Y = Z = value; might be an atomic operation, whether it is or not is dependent on a whole bunch of factors:
the whims of the compiler writers
what code generation optimizations options, if any, were selected at build time
the details of the JIT compiler responsible for turning the assembly's IL into machine language. Identical IL run under Mono, say, might exhibit different behaviour than when run under .Net 4.0 (and that might even differ from earlier versions of .Net).
the particular CPU on which the assembly is running.
One might also note that even a single machine instruction is not necessarily guaranteed to be an atomic operation; many are interruptible.
Further, visiting the CLI standard (ISO 23271:2006), we find section 12.6.6:
12.6.6 Atomic reads and writes
A conforming CLI shall guarantee that read and write access to properly
aligned memory locations no larger than the native word size (the size of type
native int) is atomic (see §12.6.2) when all the write accesses to a location are
the same size. Atomic writes shall alter no bits other than those written. Unless
explicit layout control (see Partition II (Controlling Instance Layout)) is used to
alter the default behavior, data elements no larger than the natural word size (the
size of a native int) shall be properly aligned. Object references shall be treated
as though they are stored in the native word size.
[Note: There is no guarantee
about atomic update (read-modify-write) of memory, except for methods provided for
that purpose as part of the class library (see Partition IV). (emphasis mine)
An atomic write of a “small data item” (an item no larger than the native word size)
is required to do an atomic read/modify/write on hardware that does not support direct
writes to small data items. end note]
[Note: There is no guaranteed atomic access to 8-byte data when the size of
a native int is 32 bits even though some implementations might perform atomic
operations when the data is aligned on an 8-byte boundary. end note]
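Since both standards guarantee atomic reads and writes of references, one way to avoid "mid-assignment" copies is to publish the three values through an immutable reference type instead of copying a struct. The following is a minimal sketch, assuming .NET 4.5 or later for System.Threading.Volatile; the Triple and SharedTriple names are hypothetical.

```csharp
using System.Threading;

// Hypothetical immutable holder: X, Y and Z never change after construction.
public sealed class Triple
{
    public readonly int X;
    public readonly int Y;
    public readonly int Z;

    public Triple(int value)
    {
        X = Y = Z = value;
    }
}

public static class SharedTriple
{
    private static Triple _current = new Triple(0);

    // Reads the reference in one atomic step; the object behind it is immutable.
    public static Triple Read()
    {
        return Volatile.Read(ref _current);
    }

    // Builds the new value first, then publishes it with a single reference write.
    public static void Write(int value)
    {
        Volatile.Write(ref _current, new Triple(value));
    }
}
```

A reader gets either the old object or the new one, never a mix of fields from both. The trade-off is a heap allocation per write.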
x86 CPU operations take place in 8, 16, 32, or 64 bits; manipulating other sizes requires multiple operations.
The compiler and x86 CPU are going to be careful to move only exactly as many bytes as the structure defines. There are no x86 instructions that can move 24 bits in one operation, but there are single instruction moves for 8, 16, 32, and 64 bit data.
If you add another byte field to your 24 bit struct (making it a 32 bit struct), you should see your atomicity return.
Some compilers allow you to define padding on structs to make them behave like native register sized data. If you pad your 24 bit struct, the compiler will add another byte to "round up" the size to 32 bits so that the whole structure can be moved in one atomic instruction. The downside is your structure will always occupy about 33% more space in memory.
Note that alignment of the structure in memory is also critical to atomicity. If a multibyte structure does not begin at an aligned address, it may span multiple cache lines in the CPU cache. Reading or writing this data will require multiple clock cycles and multiple read/writes even though the opcode is a single move instruction. So, even single instruction moves may not be atomic if the data is misaligned. x86 does guarantee atomicity for native sized read/writes on aligned boundaries, even in multicore systems.
It is possible to achieve memory atomicity with multi-step moves using the x86 LOCK prefix. However, this should be avoided as it can be very expensive in multicore systems (LOCK not only blocks other cores from accessing memory, it also locks the system bus for the duration of the operation, which can impact disk I/O and video operations. LOCK may also force the other cores to purge their local caches).
I'm doing some Project Euler exercises and I've run into a scenario where I want arrays which are larger than 2,147,483,647 (the upper limit of int in C#).
Sure these are large arrays, but for instance, I can't do this
// fails
bool[] BigArray = new bool[2147483648];
// also fails, cannot convert uint to int
ArrayList BigArrayList = new ArrayList(2147483648);
So, can I have bigger arrays?
EDIT:
It was for a Sieve of Atkin, you know, so I just wanted a really big one :D
Anytime you are working with an array this big, you should probably try to find a better solution to the problem. But that being said I'll still attempt to answer your question.
As mentioned in this article, there is a 2 GB limit on any object in .NET, on x86, x64 and IA64 alike.
As with 32-bit Windows operating systems, there is a 2GB limit on the size of an object you can create while running a 64-bit managed application on a 64-bit Windows operating system.
Also if you define an array too big on the stack, you will have a stack overflow. If you define the array on the heap, it will try to allocate it all in one big continuous block. It would be better to use an ArrayList which has implicit dynamic allocation on the heap. This will not allow you to get past the 2GB, but will probably allow you to get closer to it.
I think the stack size limit will be bigger only if you are using an x64 or IA64 architecture and operating system. Using x64 or IA64 you will have 64-bit allocatable memory instead of 32-bit.
If you are not able to allocate the array list all at once, you can probably allocate it in parts.
Using an array list and adding 1 object at a time on an x64 Windows 2008 machine with 6GB of RAM, the most I can get the ArrayList to is size: 134217728. So I really think you have to find a better solution to your problem that does not use as much memory. Perhaps writing to a file instead of using RAM.
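Allocating "in parts" could look like the following sketch: a wrapper that splits one logical array into fixed-size chunks, so that no single allocation has to be huge or contiguous with the others. ChunkedArray is a hypothetical helper, not a framework type, and the chunk size is left to the caller.

```csharp
using System;

// Hypothetical helper: one logical array stored as several smaller arrays.
public sealed class ChunkedArray<T>
{
    private readonly T[][] _chunks;
    private readonly int _chunkSize;

    public long Length { get; private set; }

    public ChunkedArray(long length, int chunkSize)
    {
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        Length = length;
        _chunkSize = chunkSize;

        // Number of chunks, rounding up so the last partial chunk is included.
        long chunkCount = (length + chunkSize - 1) / chunkSize;
        _chunks = new T[chunkCount][];
        for (long c = 0; c < chunkCount; c++)
        {
            long remaining = length - c * chunkSize;
            _chunks[c] = new T[Math.Min(remaining, chunkSize)];
        }
    }

    public T this[long index]
    {
        get
        {
            CheckIndex(index);
            return _chunks[index / _chunkSize][index % _chunkSize];
        }
        set
        {
            CheckIndex(index);
            _chunks[index / _chunkSize][index % _chunkSize] = value;
        }
    }

    private void CheckIndex(long index)
    {
        if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
    }
}
```

For example, new ChunkedArray<bool>(3000000000L, 1 << 26) would hold three billion flags in 64 MB pieces, provided the process has that much memory available at all, which in practice means a 64-bit process.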
The array limit is, afaik, fixed as int32 even on 64-bit. There is a cap on the maximum size of a single object. However, you could have a nice big jagged array quite easily.
Worse; because references are larger in x64, for ref-type arrays you actually get fewer elements in a single array.
See here:
I’ve received a number of queries as to why the 64-bit version of the 2.0 .Net runtime still has array maximum sizes limited to 2GB. Given that it seems to be a hot topic of late I figured a little background and a discussion of the options to get around this limitation was in order.
First some background; in the 2.0 version of the .Net runtime (CLR) we made a conscious design decision to keep the maximum object size allowed in the GC Heap at 2GB, even on the 64-bit version of the runtime. This is the same as the current 1.1 implementation of the 32-bit CLR, however you would be hard pressed to actually manage to allocate a 2GB object on the 32-bit CLR because the virtual address space is simply too fragmented to realistically find a 2GB hole. Generally people aren’t particularly concerned with creating types that would be >2GB when instantiated (or anywhere close), however since arrays are just a special kind of managed type which are created within the managed heap they also suffer from this limitation.
It should be noted that in .NET 4.5 the memory size limit is optionally removed by the gcAllowVeryLargeObjects flag, however, this doesn't change the maximum dimension size. The key point is that if you have arrays of a custom type, or multi-dimension arrays, then you can now go beyond 2GB in memory size.
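For reference, the flag is set in the application configuration file and only has an effect in 64-bit processes. A minimal app.config fragment looks like this:

```xml
<configuration>
  <runtime>
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
```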
You don't need an array that large at all.
When your method runs into resource problems, don't just look at how to expand the resources, look at the method also. :)
Here's a class that uses a 3 MB buffer to calculate primes using the sieve of Eratosthenes. The class keeps track of how far you have calculated primes, and when the range needs to be expanded it creates a buffer to test another 3 million numbers.
It keeps the found prime numbers in a list, and when the range is expanded the previous primes are used to rule out numbers in the buffer.
I did some testing, and a buffer around 3 MB is most efficient.
using System;
using System.Collections.Generic;

public class Primes {

    private const int _blockSize = 3000000;

    private List<long> _primes;  // primes found so far, in ascending order
    private long _next;          // first number that has not been sieved yet

    public Primes() {
        _primes = new List<long>() { 2, 3, 5, 7, 11, 13, 17, 19 };
        _next = 23;
    }

    // Sieves the next block; sieve[i] stands for the number _next + i.
    private void Expand() {
        bool[] sieve = new bool[_blockSize];
        // Cross out multiples of the primes that are already known.
        foreach (long prime in _primes) {
            for (long i = ((_next + prime - 1L) / prime) * prime - _next;
                i < _blockSize; i += prime) {
                sieve[i] = true;
            }
        }
        // Collect new primes and cross out their multiples inside this block.
        for (int i = 0; i < _blockSize; i++) {
            if (!sieve[i]) {
                _primes.Add(_next);
                for (long j = i + _next; j < _blockSize; j += _next) {
                    sieve[j] = true;
                }
            }
            _next++;
        }
    }

    public long this[int index] {
        get {
            if (index < 0) throw new IndexOutOfRangeException();
            while (index >= _primes.Count) {
                Expand();
            }
            return _primes[index];
        }
    }

    public bool IsPrime(long number) {
        while (_primes[_primes.Count - 1] < number) {
            Expand();
        }
        return _primes.BinarySearch(number) >= 0;
    }

}
I believe that even within a 64 bit CLR, there's a limit of 2GB (or possibly 1GB - I can't remember exactly) per object. That would prevent you from creating a larger array. The fact that Array.CreateInstance only takes Int32 arguments for sizes is suggestive too.
On a broader note, I suspect that if you need arrays that large you should really change how you're approaching the problem.
I'm very much a newbie with C# (i.e. learning it this week), so I'm not sure of the exact details of how ArrayList is implemented. However, I would guess that as you haven't defined a type for the ArrayList example, the array would be allocated as an array of object references. This might well mean that you are actually allocating 8-16 GB of memory for 2,147,483,648 references, depending on the architecture.
According to MSDN, the index for an array of bytes cannot be greater than 2147483591. Prior to .NET 4.5, this was also the memory limit for an array. In .NET 4.5 this maximum is the same, but for other types the index can be up to 2146435071.
This is the code for illustration:
static void Main(string[] args)
{
    // -----------------------------------------------
    // Pre .NET 4.5 or gcAllowVeryLargeObjects unset
    const int twoGig = 2147483591; // magic number from .NET
    var type = typeof(int);          // type to use
    var size = Marshal.SizeOf(type); // type size
    var num = twoGig / size;         // max element count
    var arr20 = Array.CreateInstance(type, num);
    var arr21 = new byte[num];

    // -----------------------------------------------
    // .NET 4.5 with x64 and gcAllowVeryLargeObjects set
    var arr451 = new byte[2147483591];
    var arr452 = Array.CreateInstance(typeof(int), 2146435071);
    var arr453 = new byte[2146435071]; // another magic number

    return;
}