Why does this memory management trick work? - C#

This refers to the Unity documentation, section "Large heap with slow but infrequent garbage collection":
var tmp = new System.Object[1024];
// Make the allocations in smaller blocks so they are not treated as large
// objects, which get special handling from the allocator.
for (int i = 0; i < 1024; i++)
    tmp[i] = new byte[1024];
// Release the reference.
tmp = null;
The trick is to pre-allocate some memory chunks at the program start.
Why does this trick work?
Are the chunks somehow "registered" (or "bound") to the application when they are pre-allocated, so that even though tmp is freed once Start() finishes, the OS still treats these chunks as "registered" to the application?
And since the chunks are "registered" to the application, the application's heap has already grown to a certain size, so the next time it needs a chunk of memory the OS simply hands it out from this application's heap?
Is my explanation correct? Either way, could someone please explain this in more detail? Thanks.

It's not really a trick. It's the way that parts of Unity3D handle memory.
In Unity3D you have objects that are handled by Mono and will be garbage collected, and objects that are handled by Unity, that will not be garbage collected. Strings, ints etc are cleaned up by Mono automatically and we do not have to worry about this. Texture(2D)s etc are not, and we have to dispose of these objects manually.
When a request for memory is made, the first thing that happens is that the memory manager scans the memory the application has already obtained from the OS for a chunk large enough to store the data you are requesting. If a match is found, that memory is used. If a match is not found, the application requests additional memory from the OS in order to store your data. When this data is no longer in use it is garbage collected, but the application still retains that memory. In essence, it sets a flag on the memory to say it is 'usable' or re-allocatable. This reduces the number of memory requests made to the OS by never returning memory to it.
This means two things:
1) Your application's memory will only continue to grow, and will not return memory to the OS. On mobile devices this is dangerous, as if you use too much memory your application will be terminated.
2) Your application may actually be allocated far more memory than it needs, due to fragmentation. You may have 10MB of available memory in your application's memory pool, but none of those chunks is large enough to house the data you need to store. Therefore, it is possible that the application will request more memory from the OS because there is no single piece of contiguous memory available that can be used.
Because you're creating a large object, and therefore requesting memory, when you later set that object to null you signal to the garbage collector that the memory is no longer needed by the application. It is then quicker to reallocate that retained memory to other objects than to request additional memory from the OS. This is why, in theory, this particular method is fast and results in fewer performance spikes, as the garbage collector is invoked less often - especially as this is a large, contiguous memory allocation.

Why does this trick work?
This trick works because the application won't return memory to the OS unless the OS memory manager is running low and explicitly asks applications to give memory back, in which case they will free up as much as possible. There is an assumption that once the memory is allocated, it will be needed again. If it is already allocated, there is no reason to return it to the OS unless the OS really needs it.
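For context, the snippet from the question is typically placed in a MonoBehaviour's Start() method, as the question implies. Below is a minimal sketch of that placement; the class name is illustrative.

using UnityEngine;

public class HeapPreallocator : MonoBehaviour   // illustrative name
{
    void Start()
    {
        // Pre-allocate ~1 MB in 1 KB blocks so the managed heap is expanded
        // early; each block stays well under the large-object threshold.
        var tmp = new System.Object[1024];
        for (int i = 0; i < 1024; i++)
            tmp[i] = new byte[1024];

        // Drop the reference: the arrays become garbage, but the heap memory
        // the runtime obtained from the OS is retained for later allocations.
        tmp = null;
    }
}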


Data Protection in .Net

I am getting this question from our clients: they say that if we copy-paste data or store it in a variable, there is a chance it can be stolen, because a hacker could read the data from RAM and use it before the GC disposes of it.
We generally don't dispose of string objects; they are stored in heap memory and are collected by the GC when it reclaims memory.
This is what I understand about the GC:
The memory that is used by allocated objects on the managed heap surpasses an acceptable threshold. This threshold is continuously adjusted as the process runs. The GC.Collect method is called. In almost all cases, you do not have to call this method, because the garbage collector runs continuously.
Is it possible for a hacker to get into RAM and read the data from it before the GC flushes it? If yes, how can we overcome that?
If the hacker can read memory in your process, the unpredictable lifetime of objects due to GC is the least of your problems. Any language is vulnerable to this kind of issue, as computers effectively manipulate all data in memory (whether it's in a GC-able heap or elsewhere - C and assembly language need to store the data in memory too).
Technologies exist (like Intel SGX) that try to overcome this issue, but they too have had exploits. Fundamentally, no software-only solution can stop bad folks once they can read your memory.
I agree with the comments regarding the futility of trying to safeguard data in memory if an attacker already has the ability to read process memory entirely.
That said, many attackers will be attacking via exploits that allow imperfect access to subsections of system memory, meaning use of SecureString is still of practical utility.
I recommend reading this thread for a discussion of the applications and limitations: When would I need a SecureString in .NET?
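For reference, a minimal sketch of using SecureString; the plaintext literal here is purely illustrative (in real code the characters would ideally never exist as an ordinary string at all):

using System;
using System.Security;

class SecureStringDemo
{
    static void Main()
    {
        // Build the secret character by character so no full plaintext copy
        // has to sit on the managed heap waiting for the GC.
        using (var secret = new SecureString())
        {
            foreach (char c in "p@ssw0rd")   // illustrative only
                secret.AppendChar(c);
            secret.MakeReadOnly();

            Console.WriteLine("Secret length: " + secret.Length);
        }   // Dispose() zeroes the protected buffer deterministically
    }
}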

Who is responsible for C# memory allocation?

What part of the .NET Framework is responsible for allocating memory? Is it the GC?
It is the CLR, but in close cooperation with the GC. And since the GC is part of the CLR, it's not such a clear division.
Allocation takes place at the start of the free section of the Heap, it is a very simple and fast operation. Allocation on the Large Object Heap (LOH) is slightly more complicated.
Do visit http://www.codeproject.com/Articles/38069/Memory-Management-in-NET
Allocation of Memory
"Generally .NET is hosted using Host process, during debugging .NET
creates a process using VSHost.exe which gives the programmer the
basic debugging facilities of the IDE and also direct managed memory
management of the CLR. After deploying your application, the CLR
creates the process in the name of its executable and allocates memory
directly through Managed Heaps.
When CLR is loaded, generally two managed heaps are allocated; one is
for small objects and other for Large Objects. We generally call it as
SOH (Small Object Heap) and LOH (Large Object Heap). Now when any
process requests for memory, it transfers the request to CLR, it then
assigns memory from these Managed Heaps based on their size.
Generally, SOH is assigned for the memory request when size of the
memory is less than 83 KBs( 85,000 bytes). If it is greater than this,
it allocates memory from LOH. On more and more requests of memory .NET
commits memory in smaller chunks."
Reading further into these paragraphs: it is the CLR, with the help of Windows (32-bit or 64-bit), that "allocates" the memory.
The "de-allocation" is managed by the GC.
"The relationships between the Object and the process associated with
that object are maintained through a Graph. When garbage collection is
triggered it deems every object in the graph as garbage and traverses
recursively to all the associated paths of the graph associated with
the object looking for reachable objects. Every time the Garbage
collector reaches an object, it marks the object as reachable. Now
after finishing this task, garbage collector knows which objects are
reachable and which aren’t. The unreachable objects are treated as
Garbage to the garbage collector."
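A small illustration of the SOH/LOH split described in the quoted paragraphs above, assuming the usual 85,000-byte threshold (GC.GetGeneration reports large-object-heap objects as generation 2):

using System;

class HeapDemo
{
    static void Main()
    {
        byte[] small = new byte[1024];        // well under 85,000 bytes -> small object heap
        byte[] large = new byte[100 * 1024];  // over the threshold      -> large object heap

        Console.WriteLine(GC.GetGeneration(small)); // typically 0 (freshly allocated in Gen0)
        Console.WriteLine(GC.GetGeneration(large)); // 2 (the LOH is collected with Gen2)
    }
}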
Despite the name, many kinds of modern "garbage collectors" don't actually collect garbage as their primary operation. Instead, they often identify everything in an area of memory that isn't garbage and move it somewhere that is known not to contain anything of value. Unless the area contained an object that was "pinned" and couldn't be moved (in which case things are more complicated) the system will then know that the area of memory from which things were moved contains nothing of value.
In many such collectors, once the last reference to an object has disappeared, no bytes of memory that had been associated with that object will ever again be examined before they are blindly overwritten with new data. If the GC expects that the old region of memory will next be used to hold new objects, it will likely zero out all the bytes in one go, rather than doing so piecemeal to satisfy allocations; but if the GC expects that it will be used as a destination for objects copied from elsewhere, it may not bother. While objects are guaranteed to remain in memory as long as any reference exists, once the last reference to an object has ceased to exist there may be no way of knowing whether every byte of memory that had been allocated to that object has actually been overwritten.
While .NET does sometimes have to take affirmative action when certain objects (e.g. those whose type overrides Finalize) are found to have been abandoned, in general I think it's best to think of the "GC" as being not a subsystem that "collects" garbage, but rather as a garbage-collected memory pool's manager, that needs to at all times be kept informed of everything that isn't garbage. While the manager's duties include the performance of GC cycles, they go far beyond that, and I don't think it's useful to separate GC cycles from the other duties.

Why does .NET reserve so much memory for my application?

When I run my application, a profiler shows that it uses about 80 MB of memory (total committed bytes performance counter). But when I look at the size of the allocated memory, it is over 400 MB!
So my question is, why is .NET reserving so much memory for my application? Is this normal?
You should read "Memory Mystery". I had similar questions a while ago and stopped asking myself after reading it.
I read other sources too, but I can't find them now; try the keywords "unreasonable allocation of memory windows OS". In a nutshell, the OS gives your app more than it requires, depending on the physically available memory resources.
For example, if you run your app on two machines with different amounts of RAM, it is practically guaranteed that the two machines will end up with different memory allocations.
As you no doubt know, there is a massive difference between actual memory used and allocated. An application's allocated memory doesn't mean that it's actually being used anywhere; all it really means is that the OS has 'marked' a zone of virtual memory (which is exactly that - virtual) ready for use by the application.
The memory isn't necessarily being used or starving other processes - it just could if the app starts to fill it.
This allocated number, also, will likely scale based on the overall memory ecosystem of the machine. If there's plenty of room when an app starts up, then it'll likely grab a larger allocation than if there's less.
That principle is the same as the one which says it's good practise to create a List<T>, say, with a reasonable initial capacity that'll mean a decent number of items can be added before resizing needs to take place. The OS takes the same approach with memory usage.
"Reserving" memory is by no means the same as "allocated" ram. Read the posts Steve and Krishna linked to.
The part your client needs to look at is Private Bytes. But even that isn't exactly a hard number as parts of your app may be swapped to the virtual disk.
In short, unless your Private Bytes section is pretty well out of control OR you have leaks (ie: undisposed unmanaged resources) you (and your client) should ignore this and let the OS manage what is allocated, what's in physical ram and what's swapped out to disk.
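If you want to compare these numbers yourself, here is a minimal sketch using System.Diagnostics; the exact values and their interpretation vary by OS and runtime version:

using System;
using System.Diagnostics;

class MemoryReport
{
    static void Main()
    {
        Process p = Process.GetCurrentProcess();

        // Virtual size: address space reserved for the process (can be huge).
        Console.WriteLine("Virtual size:  {0:N0} bytes", p.VirtualMemorySize64);
        // Private bytes: memory committed to this process alone.
        Console.WriteLine("Private bytes: {0:N0} bytes", p.PrivateMemorySize64);
        // Working set: the part currently resident in physical RAM.
        Console.WriteLine("Working set:   {0:N0} bytes", p.WorkingSet64);
        // The managed heap as the GC currently sees it.
        Console.WriteLine("GC heap:       {0:N0} bytes", GC.GetTotalMemory(false));
    }
}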
It's fairly common for software to issue one large memory request to the underlying operating system, then internally manage its own use of the allocated memory block. So common, in fact, that Windows' (and other operating systems') memory manager explicitly supports the concept, called "uncommitted memory" -- memory that the process has requested but hasn't made use of yet. That memory really doesn't exist as far as bits taking up space on your DRAM chips until the process actually makes use of it. The preallocation of memory effectively costs nothing.
Applications do this for many reasons, though it's primarily done for performance. An application with knowledge of its own memory usage patterns can optimize its allocator for that pattern. It also helps with address locality, since successive memory requests made to the OS won't always end up 'next' to each other in memory, which can hurt the performance of the CPU cache and could even preclude some optimizations.
.NET in particular allocates space for the managed heap ahead of time, for both of the reasons listed above. In most cases, allocating memory on the managed heap merely involves incrementing a top-of-heap pointer, which is incredibly fast --- and also not possible with the standard memory allocator (which has a more general design to perform acceptably in a fragmented heap, whereas the CLR's GC uses memory compaction to sharply limit the fragmentation of the managed heap), and also not possible if the managed heap itself is fragmented across the process address space due to multiple allocations at different points in time.

C#: managing large memory buffers

I am maintaining a video application written in C#.
I need as much control as possible over memory allocation/deallocation
for large memory buffers (hundreds of megabytes).
As it is written, when pixel data needs to be freed, the pixel buffer
is set to null. Is there a better way of freeing up memory?
Is there a large cost to garbage collecting large objects?
Thanks!
Don't throw big buffers like that away; you are lucky to have them. Video gives lots of opportunity for re-use. Don't lose a buffer until you are sure you won't need it anymore. At that point it doesn't matter when it gets collected.
The cost of garbage collecting large objects is very high, from what I remember. From what I have read they automatically become generation 2 on allocation (they are allocated on the large object heap), and since they are large they force frequent generation 2 collections.
So I'd rather implement manual pooling for the bitmap arrays, or even use unmanaged memory. Have a pool class and return the array to it in the Dispose of your pixels/bitmap class, as sketched below.
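A minimal sketch of such a pool; the names are illustrative, and a production version would likely add size buckets and a cap on how many buffers it retains:

using System.Collections.Concurrent;

// Hands out fixed-size pixel buffers and takes them back for re-use, so the
// large arrays are allocated once instead of repeatedly churning the LOH.
class PixelBufferPool
{
    private readonly int _bufferSize;
    private readonly ConcurrentBag<byte[]> _free = new ConcurrentBag<byte[]>();

    public PixelBufferPool(int bufferSize)
    {
        _bufferSize = bufferSize;
    }

    public byte[] Rent()
    {
        byte[] buffer;
        return _free.TryTake(out buffer) ? buffer : new byte[_bufferSize];
    }

    public void Return(byte[] buffer)
    {
        // Only keep buffers of the expected size; anything else is left to the GC.
        if (buffer != null && buffer.Length == _bufferSize)
            _free.Add(buffer);
    }
}

// Usage: the bitmap/frame wrapper calls pool.Return(pixels) from its Dispose()
// instead of simply setting the reference to null.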
With memory blocks that large ("hundreds of megabytes") it should be relatively easy to know precisely who uses them and where (you can fit only 10-20 such blocks in memory anyway). Since you plan to use such amounts of memory, you need to budget memory usage carefully - e.g. a simple copy of a whole buffer takes non-trivial time.
When you are done with a particular block you can force a GC yourself. It sounds like a reasonable use of the GC.Collect API - you are done using a huge portion of all the memory available.
You may also consider switching to allocating smaller (64 KB) blocks and linking them together, if that works for your application. This aligns better with garbage collection and may give your application more flexibility.

Understanding Memory Performance Counters

[Update - Sep 30, 2010]
Since I have studied a lot on this and related topics, I'll write down whatever tips I gathered from my experience and from the suggestions provided in the answers here:
1) Use a memory profiler (try CLR Profiler to start with) and find the routines which consume the most memory, then fine-tune them: reuse big arrays and try to keep references to objects to a minimum.
2) If possible, allocate small objects (less than 85 KB for .NET 2.0) and use memory pools where you can, to avoid high CPU usage by the garbage collector.
3) If you increase references to objects, you're responsible for de-referencing them the same number of times. You'll have peace of mind and the code will probably work better.
4) If nothing works and you are still clueless, use the elimination method (comment out/skip code) to find out what is consuming the most memory.
Using memory performance counters inside your code might also help you.
Hope these help!
[Original question]
Hi!
I'm working in C#, and my issue is an out-of-memory exception.
I read an excellent article on LOH here ->
http://www.simple-talk.com/dotnet/.net-framework/the-dangers-of-the-large-object-heap/
Awesome read!
And,
http://dotnetdebug.net/2005/06/30/perfmon-your-debugging-buddy/
My issue:
I am facing an out-of-memory issue in an enterprise-level desktop application. I have tried to read up on memory profiling and performance counters (I also tried WinDbg a little!), but I am still clueless about the basics.
I tried CLR profiler to analyze the memory usage. It was helpful in:
Showing me who allocated huge chunks of memory
What data type used maximum memory
But both CLR Profiler and the performance counters (since they share the same data) failed to explain:
The numbers that are collected after each run of the app - how do I tell whether there is any improvement?
How do I compare the performance data after each run - is a lower/higher value of a particular counter good or bad?
What I need:
I am looking for the tips on:
How to free (yes, right) managed data type objects (like arrays and big strings) - but not by making GC.Collect calls, if possible. I have to handle byte arrays of around 500 KB (an unavoidable size :-( ) every now and then.
If fragmentation occurs, how to compact memory - as it seems the .NET GC is not really doing that effectively and is causing OOM.
Also, what exactly is the 85 KB limit for the LOH? Is this the size of the object or the overall size of the array? This is not very clear to me.
What memory counters can tell if code changes are actually reducing the chances of OOM?
Tips I already know
1. Set managed objects to null - mark them as garbage - so that the garbage collector can collect them. This is strange - after setting a string[] object to null, the # Bytes in all Heaps shot up!
2. Avoid creating objects/arrays > 85 KB - this is not in my control. So, there could be lots of LOH allocations.
3. Memory leak indicators:
# bytes in all Heaps increasing
Gen 2 Heap Size increasing
# GC handles increasing
# of Pinned Objects increasing
# total committed Bytes increasing
# total reserved Bytes increasing
Large Object Heap increasing
My situation:
I have got a 4 GB, 32-bit machine with Win 2K3 Server SP2 on it.
I understand that an application can use <= 2 GB of physical RAM
Increasing the Virtual Memory (pagefile) size has no effect in this scenario.
As it is an OOM issue, I am focusing only on memory-related counters.
Please advise! I really need some help, as I'm stuck because of the lack of good documentation!
Nayan, here are the answers to your questions, and a couple of additional advices.
You cannot free them; you can only make them easier for the GC to collect. It seems you already know the way: the key is reducing the number of references to the object.
Fragmentation is one more thing which you cannot control. But there are several factors which can influence this:
LOH external fragmentation is less dangerous than Gen2 external fragmentation, 'cause LOH is not compacted. The free slots of LOH can be reused instead.
If the 500 KB byte arrays you are referring to are used as I/O buffers (e.g. passed to some socket-based API or to unmanaged code), there is a high chance that they will get pinned. A pinned object cannot be moved by the GC, and pinned objects are one of the most frequent reasons for heap fragmentation.
85K is a limit on the object size. But remember, a System.Array instance is an object too, so all your 500 KB byte[] arrays are in the LOH.
All the counters in your post can give a hint about changes in memory consumption, but in your case I would select BIAH (Bytes in all Heaps) and LOH size as the primary indicators. BIAH shows the total size of all managed heaps (Gen1 + Gen2 + LOH, to be precise; no Gen0 - but who cares about Gen0, right? :) ), and the LOH is the heap where all the large byte[] arrays are placed.
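For reference, these counters can also be read from code via System.Diagnostics (Windows performance counters; the instance-name lookup below is simplified and assumes a single process with this name):

using System;
using System.Diagnostics;

class ClrMemoryCounters
{
    static void Main()
    {
        string instance = Process.GetCurrentProcess().ProcessName;

        using (var biah = new PerformanceCounter(".NET CLR Memory", "# Bytes in all Heaps", instance))
        using (var loh = new PerformanceCounter(".NET CLR Memory", "Large Object Heap size", instance))
        using (var gcTime = new PerformanceCounter(".NET CLR Memory", "% Time in GC", instance))
        {
            Console.WriteLine("# Bytes in all Heaps:   {0:N0}", biah.NextValue());
            Console.WriteLine("Large Object Heap size: {0:N0}", loh.NextValue());
            Console.WriteLine("% Time in GC:           {0:N1}", gcTime.NextValue());
        }
    }
}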
Advices:
Something that already has been proposed: pre-allocate and pool your buffers.
A different approach which can be effective if you can use any collection instead of a contiguous array of bytes (this is not the case if the buffers are used for I/O): implement a custom collection which internally is composed of many smaller arrays, something similar to std::deque from the C++ STL. Since each individual array is smaller than 85 KB, the whole collection won't end up in the LOH (see the sketch below). The advantage of this approach is the following: the LOH is only collected when a full GC happens. If the byte[] arrays in your application are not long-lived and (if they were smaller) would get collected in Gen0 or Gen1, this makes memory management much easier for the GC, since a Gen2 collection is much more heavyweight.
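A rough sketch of such a chunked buffer; it is illustrative only, with the chunk size kept below the 85,000-byte threshold so the individual pieces stay out of the LOH:

using System;
using System.Collections.Generic;

// Stores a logical byte sequence as many small arrays instead of one big one,
// so no single allocation ever reaches the large object heap.
class ChunkedBuffer
{
    private const int ChunkSize = 64 * 1024;                  // < 85,000 bytes
    private readonly List<byte[]> _chunks = new List<byte[]>();

    public long Length { get; private set; }

    // Appends bytes, growing the buffer one small chunk at a time.
    public void Append(byte[] data, int offset, int count)
    {
        while (count > 0)
        {
            int posInChunk = (int)(Length % ChunkSize);
            if (posInChunk == 0)
                _chunks.Add(new byte[ChunkSize]);

            int toCopy = Math.Min(count, ChunkSize - posInChunk);
            Buffer.BlockCopy(data, offset, _chunks[_chunks.Count - 1], posInChunk, toCopy);

            Length += toCopy;
            offset += toCopy;
            count -= toCopy;
        }
    }

    // Random access across chunk boundaries.
    public byte this[long index]
    {
        get { return _chunks[(int)(index / ChunkSize)][(int)(index % ChunkSize)]; }
    }
}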
A piece of advice on the testing and monitoring approach: in my experience, GC behavior, memory footprint and other memory-related metrics need to be monitored for quite a long time to get valid and stable data. So each time you change something in the code, run a long enough test while monitoring the memory performance counters to see the impact of the change.
I would also recommend taking a look at the % Time in GC counter, as it can be a good indicator of the effectiveness of memory management. The larger this value is, the more time your application spends on GC routines instead of processing requests from users or doing other 'useful' operations. I cannot say which absolute values of this counter indicate an issue, but I can share my experience for your reference: for the application I am working on, we usually treat % Time in GC higher than 20% as an issue.
Also, it would be useful if you shared some values of memory-related perf counters of your application: Private bytes and Working set of the process, BIAH, Total committed bytes, LOH size, Gen0, Gen1, Gen2 size, # of Gen0, Gen1, Gen2 collections, % Time in GC. This would help better understand your issue.
You could try pooling and managing the large objects yourself. For example, if you often need <500k arrays and the number of arrays alive at once is well understood, you could avoid deallocating them ever--that way if you only need, say, 10 of them at a time, you could suffer a fixed 5mb memory overhead instead of troublesome long-term fragmentation.
As for your three questions:
1. It is just not possible. Only the garbage collector decides when to finalize managed objects and release their memory. That's part of what makes them managed objects.
2. This is possible if you manage your own heap in unsafe code and bypass the large object heap entirely. You will end up doing a lot of work and suffering a lot of inconvenience if you go down this road. I doubt that it's worth it for you.
3. It's the size of the object, not the number of elements in the array.
Remember, fragmentation only happens when objects are freed, not when they're allocated. If fragmentation is indeed your problem, reusing the large objects will help. Focus on creating less garbage (especially large garbage) over the lifetime of the app instead of trying to deal with the nuts and bolts of the gc implementation directly.
Another indicator is watching Private Bytes vs. Bytes in all Heaps. If Private Bytes increases faster than Bytes in all Heaps, you have an unmanaged memory leak. If Bytes in all Heaps increases faster than Private Bytes, it is a managed leak.
To correct something that @Alexey Nedilko said:
"LOH external fragmentation is less dangerous than Gen2 external fragmentation, 'cause LOH is not compacted. The free slots of LOH can be reused instead."
is absolutely incorrect. Gen2 is compacted, which means there is never free space after a collection. The LOH is NOT compacted (as he correctly mentions) and yes, free slots are reused. BUT if the free space is not contiguous enough to fit the requested allocation, then the segment size is increased - and can continue to grow and grow. So you can end up with gaps in the LOH that are never filled. This is a common cause of OOMs, and I've seen it in many memory dumps I've analyzed.
Though there are now methods in the GC API (as of .NET 4.5.1) that can be called to programmatically compact the LOH, I strongly recommend avoiding this if app performance is a concern. It is extremely expensive to perform this operation at runtime and it can hurt your app's performance significantly. The default implementation of the GC was designed to be performant, which is why this step was omitted in the first place. IMO, if you find that you have to call this because of LOH fragmentation, you are doing something wrong in your app - and it can be improved with pooling techniques, splitting arrays, and other memory allocation tricks instead. If this app is an offline app or some batch process where performance isn't a big deal, maybe it's not so bad, but I'd use it sparingly at best.
Good visual examples of how this can happen are given in The Dangers of the Large Object Heap and in Large Object Heap Uncovered by Maoni (GC team lead on the CLR).
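For completeness, the API referred to above looks like this on .NET 4.5.1 and later; per the warning, it should be used sparingly:

using System;
using System.Runtime;

class LohCompactionDemo
{
    static void Main()
    {
        // Ask the next blocking full collection to also compact the large object heap.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce;
        GC.Collect();   // expensive: full blocking GC plus LOH compaction

        // The setting automatically resets to Default after that collection.
        Console.WriteLine(GCSettings.LargeObjectHeapCompactionMode);
    }
}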
