Memory allocation in .NET - c#

I have an empty type, MyCustomType. I created an instance of it, compiled my application for the x64 platform, and then wondered how many bytes an instance of my type occupies. I opened a .NET memory profiler and, according to it, my type weighs 24 bytes. I know that on x64 every reference type in .NET has a 16-byte overhead, and doubtless 16 != 24. So my question is: where are the other 8 bytes?
Thanks!
internal class MyCustomType
{
}

1 - There’s a "base" overhead of 8 bytes per object in x86 and 16 per object in x64… given that we can store an Int32 of "real" data in x86 and still have an object size of 12, and likewise we can store two Int32s of real data in x64 and still have an object size of 24.
2 - There’s a "minimum" size of 12 bytes and 24 bytes respectively. In other words, you can’t have a type which is just the overhead. Note how the "Empty" class takes up the same size as creating instances of Object… there’s effectively some spare room, because the CLR doesn’t like operating on an object with no data. (Note that a struct with no fields takes up space too, even for local variables.)
3 - The x86 objects are padded to 4 byte boundaries; on x64 it’s 8 bytes (just as before)
4 - By default, the CLR is happy to pack fields pretty densely – Mixed2 only took as much space as ThreeInt32. My guess is that it reorganized the in-memory representation so that the bytes all came after the ints… and that’s what a quick bit of playing around with unsafe pointers suggests too… but I’m not sufficiently comfortable with this sort of thing to say for sure. Frankly, I don’t care… so long as it all works, what we’re interested in is the overall size, not the precise layout.
http://codeblog.jonskeet.uk/2011/04/05/of-memory-and-strings/
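If you want to sanity-check the 24-byte figure without a profiler, a rough measurement along these lines works (a sketch, not taken from the blog above; the exact numbers depend on the runtime): allocate a large batch of instances and divide the growth of the GC heap by the count.

using System;

internal class MyCustomType
{
}

internal static class SizeProbe
{
    private static void Main()
    {
        const int N = 1000000;
        var keepAlive = new MyCustomType[N];     // allocate the reference array first so it is
                                                 // already counted in the "before" measurement

        long before = GC.GetTotalMemory(true);
        for (int i = 0; i < N; i++)
        {
            keepAlive[i] = new MyCustomType();
        }
        long after = GC.GetTotalMemory(true);

        Console.WriteLine((after - before) / N); // roughly 24 on x64, roughly 12 on x86
        GC.KeepAlive(keepAlive);
    }
}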

Related

Can an array of 8 longs fit in a 64 byte cache line?

So on Intel's i7 processors, memory is written and read in 64-byte cache lines.
So if I wanted to fill a cache line up, I could use 16 longs (4 bytes each).
If I make an array of 16 longs, would that fit the entire cache line, or is there some overhead for the array?
My concern is that if an array has any overhead at all, and I use 16 longs, the total size in bytes will spill over 64.
So is it more like new long[63], or new long[62] etc?
In C#, the long data type, which is an alias for System.Int64, is 8 bytes, not 4. The int type, aka System.Int32, is 4 bytes. So 64 bytes of long values is eight elements, not sixteen.
The storage of a managed array is in fact contiguous, so in theory yes, a long[8] would fit in a 64-byte cache line exactly. But note that it would do so only if properly aligned on an address that's a multiple of 64. Since you don't have control over allocation location, that's going to be difficult to do.
So even without overhead in the array, you can't guarantee that a single 8-element array of longs would actually fit exactly in a 64-byte cache line.
Of course, an array longer than that will have sub-ranges that are aligned and so can be cached entirely. But then that's true for pretty much any data type you might have. Frankly, the thing to worry about isn't the size of your data or the length of your array, but the pattern of access. See "data locality" for advice on how to access your data in ways that help ensure efficient use of the cache.
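To put numbers on the element-size point, a quick console check confirms that eight longs account for exactly 64 bytes of payload (a sketch using Buffer.ByteLength, which counts element bytes only, not the array header):

using System;

class CacheLineCheck
{
    static void Main()
    {
        Console.WriteLine(sizeof(long));                    // 8, not 4
        Console.WriteLine(Buffer.ByteLength(new long[8]));  // 64 bytes of element data (array header not counted)
    }
}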

Maximum capacity of Collection<T> different than expected for x86

The main question is about the maximum number of items that can be in a collection such as List<T>. I was looking for answers here, but I don't understand the reasoning.
Assume we are working with a List<int> with sizeof(int) = 4 bytes... Everyone seems to be sure that for x64 you can have a maximum of 268,435,456 ints and for x86 a maximum of 134,217,728 ints. Links:
List size limitation in C#
Where is the maximum capacity of a C# Collection<T> defined?
What's the max items in a List<T>?
However, when I tested this myself I saw that it's not the case for x86. Can anyone point out where I may be wrong?
//// Test engine set to `x86` for `default processor architecture`
[TestMethod]
public void TestMemory()
{
    var x = new List<int>();
    try
    {
        for (long y = 0; y < long.MaxValue; y++)
            x.Add(0);
    }
    catch (Exception)
    {
        System.Diagnostics.Debug.WriteLine("Actual capacity (int): " + x.Count);
        System.Diagnostics.Debug.WriteLine("Size of objects: " + System.Runtime.InteropServices.Marshal.SizeOf(x.First().GetType())); //// This gives us "4"
    }
}
For x64: 268435456 (expected)
For x86: 67108864 (2 times less than expected)
Why do people say that a List containing 134,217,728 ints is exactly 512 MB of memory... when you have 134,217,728 * sizeof(int) * 8 = 4,294,967,296 = 4 GB... which is way more than the 2 GB limit per process?
Whereas 67,108,864 * sizeof(int) * 8 = 2,147,483,648 = 2 GB... which makes sense.
I am using .NET 4.5 on a 64-bit machine running Windows 7 with 8 GB RAM, and I ran my tests in both x64 and x86.
EDIT: When I set the capacity directly with new List<int>(134217728), I get a System.OutOfMemoryException.
EDIT2: There was an error in my calculations: multiplying by 8 is wrong, since MB != Mbit; I was computing megabits. Still, 67,108,864 ints would only be 256 MB... which is way smaller than expected.
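For reference, the corrected arithmetic (bytes, not bits) is easy to double-check; a quick throwaway snippet along these lines prints the element storage each count actually needs:

using System;

class CapacityMath
{
    static void Main()
    {
        Console.WriteLine(134217728L * sizeof(int) / (1024 * 1024)); // 512 MB: the "expected" x86 maximum
        Console.WriteLine(67108864L * sizeof(int) / (1024 * 1024));  // 256 MB: what the test actually reached
        Console.WriteLine(268435456L * sizeof(int) / (1024 * 1024)); // 1024 MB: the reported x64 result
    }
}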
The underlying storage for a List<T> class is a T[] array. A hard requirement for an array is that the process must be able to allocate a contiguous chunk of memory to store the array.
That's a problem in a 32-bit process. Virtual memory is used for code and data, and you allocate from the holes that are left between them. And while a 32-bit process will have 2 gigabytes of address space, you'll never get anywhere near a hole that's close to that size. The biggest hole in the address space you can get, right after you start the program, is around 500 or 600 megabytes. Give or take; it depends a lot on what DLLs get loaded into the process: not just the CLR, the jitter and the native images of the framework assemblies, but also the kind that have nothing to do with managed code, like anti-malware and the raft of "helpful" utilities that worm themselves into every process, such as Dropbox and shell extensions. A poorly based DLL can cut one nice big hole into two small ones.
These holes also get smaller as the program allocates and releases memory for a while, a general problem called address space fragmentation. A long-running process can fail on a 90 MB allocation even though there is lots of unused memory lying around.
You can use SysInternals' VMMap utility to get more insight. A copy of Russinovich's book Windows Internals is typically necessary as well to make sense of what you see.
This may also help: I was able to reproduce the 67,108,864 limit by creating a test project with the provided code.
In console, WinForms, and WPF projects I got the 134,217,728 limit.
In ASP.NET I got a 33,554,432 limit.
So the [TestMethod] you mentioned in one of your comments seems to be the issue.
While you can in theory have int.MaxValue items, in practice you will run out of memory before then.
Running as x86, the most RAM the process can use, even on an x64 box, would be 4 GB; more likely 2 GB or 3 GB is the maximum on an x86 version of Windows.
The usable amount is most likely much smaller, as you can only give the array the biggest contiguous block of free address space.

C# Object Size Overhead

I am working on optimizing a memory-consuming application, and in relation to that I have a question regarding the size overhead of C# reference types.
A C# object consumes as many bytes as its fields, plus some additional administrative overhead. I presume that the administrative overhead can differ between .NET versions and implementations.
Do you know what is the size (or maximum size if the overhead is variable) of the administrative overhead for C# objects (C# 4.0 and Windows 7 and 8 environment)?
Does the administrative overhead differ between the 32-bit and 64-bit .NET runtimes?
Typically, there is an 8 or 16 byte overhead per object allocated by the GC: 4 bytes for the syncblk and 4 bytes for the type handle on 32-bit runtimes, 8 bytes each on 64-bit runtimes. For details, see the "ObjectInstance" section of "Drill Into .NET Framework Internals to See How the CLR Creates Runtime Objects" in MSDN Magazine.
Note that the size of a reference itself also differs between the 32-bit and 64-bit .NET runtimes (4 vs. 8 bytes).
Also, there may be padding for types to fit on address boundaries, though this depends a lot on the type in question. This can cause "empty space" between objects as well, but it is up to the runtime (mostly; you can affect it with StructLayoutAttribute) to determine when and how data is aligned.
There is an article online with the title "The Truth About .NET Objects And Sharing Them Between AppDomains" which shows some rotor source code and some results of experimenting with objects and sharing them between app domains via a plain pointer.
http://geekswithblogs.net/akraus1/archive/2012/07/25/150301.aspx
12 bytes for all 32-bit versions of the CLR
24 bytes for all 64-bit versions of the CLR
You can test this quite easily by adding millions of objects (N) to an array. Since the pointer size is known, you can subtract the array's own storage and divide the remaining heap growth by N
var initial = GC.GetTotalMemory(true);   // heap size before allocating anything
const int N = 10 * 1000 * 1000;
var arr = new object[N];                 // N references, costing N * IntPtr.Size bytes
for (int i = 0; i < N; i++)
{
    arr[i] = new object();
}
// Growth minus the reference array itself, divided by N, approximates the per-object size.
var objSize = (GC.GetTotalMemory(false) - initial - N * IntPtr.Size) / N;
to get an approximate value on your .NET platform.
The minimum object size is actually defined in the runtime sources, so that the GC can make assumptions about the smallest object it will ever have to deal with.
\sscli20\clr\src\vm\object.h
//
// The generational GC requires that every object be at least 12 bytes
// in size.
#define MIN_OBJECT_SIZE (2*sizeof(BYTE*) + sizeof(ObjHeader))
For 32-bit, for example, this means the minimum object size is 2*4 + 4 = 12 bytes, which does leave a 4-byte hole. That hole is empty for an empty object, but if you add, say, an int field to your empty class, it is filled and the object size stays at 12 bytes. On 64-bit the same formula gives 2*8 + 8 = 24 bytes, which matches the value above.
There are two types of overhead for an object:
Internal data used to handle the object.
Padding between data members.
The internal data is two pointers, so in a 32-bit application that is 8 bytes, and in a 64-bit application that is 16 bytes.
Data members are padded so that they start on an even address boundary. If you for example have a byte and an int in the class, the byte is probably padded with three unused bytes so that the int starts on the next machine word boundary.
The layout of the classes is determined by the JIT compiler depending on the architecture of the system (and might vary between framework versions), so it's not known to the C# compiler.
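As a concrete illustration of the padding point, here is a sketch with made-up type names; Marshal.SizeOf reports the unmanaged, marshaled size, which for simple blittable structs like these follows the same alignment rule described above:

using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct Padded      // the byte is followed by 3 bytes of padding so the int starts on a 4-byte boundary
{
    public byte B;
    public int I;
}

[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct Packed      // Pack = 1 removes the alignment padding
{
    public byte B;
    public int I;
}

class PaddingDemo
{
    static void Main()
    {
        Console.WriteLine(Marshal.SizeOf(typeof(Padded))); // 8
        Console.WriteLine(Marshal.SizeOf(typeof(Packed))); // 5
    }
}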

Why is sizeof(bool) == sizeof(byte) in C#? [duplicate]

Possible Duplicate:
What is the binary representation of a boolean value in c#
According to the MSDN documentation, the sizeof keyword is "used to obtain the size in bytes for an unmanaged type" and primitives are considered unmanaged types. If I check the sizeof(bool), the result is 1.
It seems to me that a Boolean value should only require a single bit of memory. Am I mistaken? Does using a Boolean value actually require a full byte of memory? Why?
It uses a whole byte of memory for performance reasons.
If it only used a single bit, what would you do with the other 7 bits? Few variables are booleans, and other variables may need far more than a single bit (4-byte integers, for example), so the spare bits would only be useful for packing other booleans.
Also, many larger types need to start at appropriate byte boundaries for performance reasons. For example, a CPU may not allow you to easily reference a 4-byte value starting at an arbitrary address (i.e. the address may need to be divisible by 4).
If a bool used a single bit of memory, with the other 7 bits holding other booleans, using it would be more complicated: a bit is not directly addressable, so you would need to read the whole byte and then extract the bit before testing whether it is 1 or 0. That means more instructions, and hence slower performance.
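As a rough sketch of what that extraction would look like if eight booleans were packed into one byte (the flags variable and bit positions here are hypothetical):

using System;

class PackedBoolSketch
{
    static void Main()
    {
        // Hypothetical: eight booleans packed into one byte.
        byte flags = 0x05;

        // Reading the boolean stored in bit 2 takes a shift and a mask before the test,
        // instead of a single byte load.
        bool third = ((flags >> 2) & 1) != 0;
        Console.WriteLine(third);                 // True

        // Writing it back is a read-modify-write on the shared byte.
        flags = (byte)(flags & ~(1 << 2));        // clear bit 2
        flags = (byte)(flags | (1 << 2));         // set bit 2 again
        Console.WriteLine(flags);                 // 5
    }
}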
If you have many booleans, and you want them to only use a single bit of memory EACH, you should use a BitArray. These are containers for single bits. They act like arrays of booleans.
A byte is the smallest amount of addressable memory. The .NET team have chosen to use a byte to store a bool to simplify the implementation.
If you want to store a large number of bits more compactly you can look at BitArray.
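For illustration, a minimal BitArray usage sketch (the size in the comment is approximate payload, ignoring object overhead):

using System;
using System.Collections;

class BitArrayDemo
{
    static void Main()
    {
        // 1024 flags stored in roughly 128 bytes of payload instead of 1024 one-byte bools.
        var flags = new BitArray(1024);

        flags[3] = true;            // the indexer reads and writes individual bits
        flags.Set(10, true);
        Console.WriteLine(flags[3] && flags.Get(10));   // True
        Console.WriteLine(flags[4]);                    // False (bits default to false)
    }
}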
Yes, it requires a full byte of memory, because that is the smallest addressable unit of memory.
It would of course be possible to come up with a scheme where several bools are packed into the same byte, thus saving space. In most cases, though, the overhead of such a solution would cost more than it gains.
If you have a lot of bits to store, a specialised bit vector (such as the BitArray that Mark Byers mentions) can save precious space.
Think of it this way: sizeof reports sizes in whole bytes, so it cannot say "1 bit". It would either have to round down to 0 (which makes no sense) or report 1, because storage is allocated in bytes, not bits.
Whether the runtime internally manages it as a bit or as a byte, I don't know.
In C++ you can append :1 to a member name (a bit-field) to say it should be just 1 bit wide.

Should I Make These Vectors Classes or Structs in C#

I am creating a geometry library in C# and I will need the following immutable types:
Vector2f (2 floats - 8 bytes)
Vector2d (2 doubles - 16 bytes)
Vector3f (3 floats - 12 bytes)
Vector3d (3 doubles - 24 bytes)
Vector4f (4 floats - 16 bytes)
Vector4d (4 doubles - 32 bytes)
I am trying to determine whether to make them structs or classes. MSDN suggests only using a struct if the size is going to be no greater than 16 bytes, but that guidance seems to be from 2005. Is 16 bytes still the suggested maximum size?
I am sure that using structs for the float vectors would be more efficient than using a class, but what should I do about the double vectors? Should I make them structs also to be consistent, or should I make them classes?
Updated:
Looks like everyone agrees they should be structs. Thanks for the great answers.
Microsoft's XNA Framework uses structures for its Vector2/3/4 data types. These contain fields of type float. I don't see anything wrong with doing the same; using fields of type double shouldn't make a big difference.
I would generally make types like these structs. Structs are more likely to be placed on the stack, and if you use arrays of them you can get much nicer performance than if, say, you were to use a List<> of objects. As long as you stay away from operations that cause your vector types to be boxed, the struct is generally going to be the higher-performance way to go.
Immutable structs are a good choice. The size recommendation on MSDN is the weakest argument against it; 32 bytes is not really big.
The main argument would be that vectors are used like simple (numerical) types, and an implementation as a value type is most appropriate.
A good parallel is the pair of new Complex and BigInteger structs in .NET 4.
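For what it's worth, a minimal sketch of one of these immutable value types might look like this (the member names and operators are purely illustrative, not a prescribed design):

using System;

public struct Vector2f : IEquatable<Vector2f>
{
    public readonly float X;
    public readonly float Y;

    public Vector2f(float x, float y)
    {
        X = x;
        Y = y;
    }

    // Operations return new values; the fields never change after construction.
    public static Vector2f operator +(Vector2f a, Vector2f b)
    {
        return new Vector2f(a.X + b.X, a.Y + b.Y);
    }

    public bool Equals(Vector2f other)
    {
        return X == other.X && Y == other.Y;
    }
}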
Structs, definitely. Why? Well, it's partly a feeling, I guess: a vector just feels and behaves like a value.
Rico Mariani here gives some reasons why value types were chosen for certain types for an API (I don't think it's XNA he's talking about).
And while efficiency is a factor here, I don't think it's about garbage collection, but more about data density, like Rico says. Say you also have a Vertex type that contains two Vector3s: one for the normal and one for the world co-ordinates. If you made those types classes, then an array with 100 Vertex elements would consist of:
100 * 8 bytes (8 bytes being the per-object overhead of a class in memory on 32-bit: 4 bytes for the type handle and 4 bytes for the sync block)
100 * 4 bytes (for the pointers in the array to the Vertex elements)
200 * 4 bytes (for the pointers from each Vertex to the two Vector3 elements)
200 * 8 bytes (for the 8 byte overhead that you pay for making Vector3 a class)
200 * 12 bytes (for the actual payload of 3 float per Vector3)
for a total of 6000 bytes (on a 32-bit system).
As a value type, it's simply 200 * 12 bytes = 2400 bytes. So much more efficient space-wise not to mention a lower level of indirection when reading the array.
But taking up a lot of space doesn't necessarily make it slow; using a value type incorrectly can be slower than making it a class, as I have found out. You definitely want to pass them by ref as much as possible and avoid copying them, but this doesn't hold for all operations, so measure. I think I remember that calculating the dot product was slower when passing by ref, perhaps because it somehow prevented inlining by making the IL larger. But don't take my word for it; just measure.
In my opinion I would go with structs for this; since they would be allocated on the stack most of the time, this reduces the pressure on the GC, so your application should run more smoothly.
One tip: methods that operate on the structs should probably take their arguments as ref and out parameters; this reduces the amount of copying that takes place when passing and returning structs. A sketch of that pattern follows the next answer.
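Here is a sketch of that ref/out pattern, using a hypothetical Vector3f as a minimal stand-in for the question's types:

public struct Vector3f
{
    public readonly float X, Y, Z;

    public Vector3f(float x, float y, float z)
    {
        X = x; Y = y; Z = z;
    }

    // Copying overload: both arguments and the return value are 12-byte copies.
    public static Vector3f Add(Vector3f a, Vector3f b)
    {
        return new Vector3f(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
    }

    // ref/out overload: no struct copies on the way in, the result is written in place.
    public static void Add(ref Vector3f a, ref Vector3f b, out Vector3f result)
    {
        result = new Vector3f(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
    }
}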
We recently switched some maths types (vec3, vec4, matrices etc.) over to use structs instead of classes and, as well as being slightly quicker in benchmarks, it also reduced the GC pressure (we had hundreds of megabytes of Vec3 allocations over a relatively short period of time, discovered via the CLR profiler).
It's interesting to note that the XNA implementation of vectors and matrices uses structs. You can always pass by reference where appropriate if you're concerned about performance, too.
I assume you're worried about perf?
If they're immutable, they should be behaviorally identical. Just go with one, and if it turns out that there's a perf issue, switch it later on.
If you're targeting .NET CF (the Compact Framework) at all, you should probably use structs, as its GC isn't as efficient as the full .NET implementation's.
Structs are (usually) stored on the stack, which might make them more efficient, but probably not enough to make a noticeable difference.
http://www.c-sharpcorner.com/UploadFile/rmcochran/csharp_memory01122006130034PM/csharp_memory.aspx?ArticleID=9adb0e3c-b3f6-40b5-98b5-413b6d348b91
