.Net garbage collection mark phase and huge linked lists - c#

If there is a linked list with 4M+ nodes, does the mark phase need to traverse the entire list each time to build the object graph? Are there any optimizations applied in this case? At first sight it doesn't look efficient. Is there a way to verify whether the GC traverses the entire list or not?
TIA.

Yes, it will need to traverse the whole object graph. I can't think how there could be any optimizations, to be honest... but it doesn't need to do very much on each node. Most of the time will probably be spent waiting on memory, I suspect, as obviously it'll burn through the cache. Of course, by the time the linked list ends up in gen2 (and if you're allocating millions of nodes, most of it will be in gen2 pretty quickly), it will only need to do that very rarely.
If this is the most reasonable data structure for your app, I would use it for the moment, but keep track of the performance hit of garbage collection using Performance Monitor etc. If it turns out to be a problem, you can consider alternative strategies.

What Jon said.
Also, once an object ends up in Gen2, an optimisation that's available (on Windows, but IIRC not on other platforms) is that the GC can ask the kernel to be notified when a given page of memory is written to. Where a page remains unchanged between GC events, some of the marking work need not be repeated.

There is one very important optimization being made. The .NET GC is generational, and data in gen2 is only rarely traversed.
With large data structures (such as huge linked lists), most of your data will quickly end up in gen2, where the GC will only rarely access it.
Also, the GC only traverses live data during collections; dead data is collected "for free". So when your list becomes unreachable (or when most of its nodes, but not all, do), the GC will be able to collect millions of nodes essentially for free.
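To address the last part of the question, you can at least confirm where the list ends up and how rarely gen2 gets collected. Here is a minimal sketch, assuming a hypothetical Node type; GC.GetGeneration and GC.CollectionCount are the real APIs:

using System;

class Node
{
    public Node Next;
    public int Value;
}

class Program
{
    static void Main()
    {
        // Build a large linked list, as in the question.
        Node head = new Node();
        Node current = head;
        for (int i = 0; i < 4000000; i++)
            current = current.Next = new Node { Value = i };

        // After surviving two collections the list should be in gen2.
        GC.Collect();
        GC.Collect();
        Console.WriteLine("head is in gen {0}", GC.GetGeneration(head));

        // GC.CollectionCount(n) reports how many gen-n collections have occurred;
        // watch it over time to see how rarely gen2 (a full mark) actually happens.
        Console.WriteLine("gen0={0} gen1={1} gen2={2}",
            GC.CollectionCount(0), GC.CollectionCount(1), GC.CollectionCount(2));
    }
}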

Related

What are approaches to optimize the mark phase of a non-generational GC?

I am running on Unity's Boehm–Demers–Weiser garbage collector, which is a non-generational GC.
I have a large tree of managed objects in memory (~100k objects, ~200MiB allocation).
These objects are essentially a cache and never go out of scope, so they never actually get swept by the GC.
However, because Boehm is non-generational, this stale cache never gets promoted to higher generations. This causes the mark phase to take a large amount of processing time, as it has to traverse this whole cache on every collection, causing noticeable lag spikes.
This is "by-design", as the Unity documentation puts it:
Crucially, Unity’s garbage collection – which uses the Boehm GC algorithm – is non-generational and non-compacting. “Non-generational” means that the GC must sweep through the entire heap when performing a collection pass, and its performance therefore degrades as the heap expands.
I am well aware of approaches to reduce recurring garbage allocation, however I cannot find any information on how to optimize a large, stale, baseline allocation in a non-generational GC.
More specifically:
Is there any way to mark a root pointer (e.g. static field) as ignored from GC entirely?
Are there some data structure patterns that are faster to traverse in the mark phase?
Conversely, are there known data structure patterns that hinder the mark phase speed?
These questions are just some of my hypotheses to solve this, but I'm open to all suggestions.
One could approximate generational behavior by separating program startup, which initializes the static data structures, from steady-state operation. All pointers into the startup memory region could be ignored, while pointers out of it should not exist, since nothing allocated after the switchpoint (which would be under GC control) existed yet when the startup data was created.
One could even GC the startup region once before switching to a new region. Essentially you would end up with a limited form of region-based, non-moving collector where references between regions only happen in one direction.
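Unity doesn't expose region control like that, but a related mitigation available from C# is to make the cached data pointer-free: the mark phase only has to scan memory that can contain references, and arrays of structs without reference fields are typically allocated as pointer-free memory under Boehm/Mono (worth verifying on your Unity version). A sketch of the idea, with hypothetical type names, replacing object references with integer indices:

// A tree node as a managed object: every node is a separate object the
// mark phase must visit, and every field is a reference it must trace.
class NodeObject
{
    public NodeObject FirstChild;
    public NodeObject NextSibling;
    public int Payload;
}

// The same tree flattened into one array of reference-free structs.
struct NodeStruct
{
    public int FirstChild;   // index into the array, -1 for none
    public int NextSibling;  // index into the array, -1 for none
    public int Payload;
}

class FlatTree
{
    // A single array object; its elements contain no references, so the
    // mark phase does not descend into the ~100k nodes individually.
    private readonly NodeStruct[] _nodes;
    private int _count;

    public FlatTree(int capacity) { _nodes = new NodeStruct[capacity]; }

    public int AddNode(int payload)
    {
        _nodes[_count] = new NodeStruct { FirstChild = -1, NextSibling = -1, Payload = payload };
        return _count++;
    }

    public NodeStruct GetNode(int index) { return _nodes[index]; }
    public void SetNode(int index, NodeStruct node) { _nodes[index] = node; }
}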

Counting total objects queued for garbage collection

I wanted to add a small debug UI to my OpenGL game, which will be updated frequently with various debugging options/output displays. One thing I wanted was a constant counter that shows active objects in each generation of the garbage collector. I don't want names or anything, just a total count; something that I can eyeball when I do certain things within the game.
My problem, however, is that I can't seem to find a way to count the total objects currently alive in the various generations.
I even considered keeping a global static field, which would be incremented within every constructor and decremented within class finalizers. This would require hand-coding said functionality into every class though, and would not solve the problem of a "per-generation total".
Do you know how I could go about doing this?
(Question title:) "Counting total objects queued for garbage collection"
(From the question's body:) "My problem, however, is that I can't seem to find a way to count the total objects currently alive in the various generations."
Remark: your question's title and body ask for opposite things. In the title you're asking for the number of objects that can no longer be reached via any GC root, while in the body you're asking for "live" objects, i.e. those that can still be reached from some GC root.
Let me start by saying that there might not be any way to do this, basically because objects in .NET are not reference-counted, so they cannot be immediately marked as "no longer needed" when the last reference to them disappears or goes out of scope. I believe .NET's mark-and-compact garbage collector only discovers which objects are alive and which can be reclaimed during an actual garbage collection (during the "mark" phase). You however seem to want this information in advance, i.e. before a GC occurs.
That being said, here are perhaps your best options:
Perhaps your best bet in .NET's managed Framework Class Library is performance counters. But it doesn't look like there are any suitable counters available: there are performance counters giving you the number of allocated bytes in the various GC generations, but AFAIK no counters for the number of live/dead objects.
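For what it's worth, here is a minimal sketch of reading the byte-size counters that do exist; the ".NET CLR Memory" category and these counter names are real on .NET Framework/Windows, and the instance name must match your process:

using System;
using System.Diagnostics;

class GcCounters
{
    static void Main()
    {
        string instance = Process.GetCurrentProcess().ProcessName;

        // These counters report heap sizes in bytes, not object counts.
        using (var gen0 = new PerformanceCounter(".NET CLR Memory", "Gen 0 heap size", instance))
        using (var gen1 = new PerformanceCounter(".NET CLR Memory", "Gen 1 heap size", instance))
        using (var gen2 = new PerformanceCounter(".NET CLR Memory", "Gen 2 heap size", instance))
        {
            Console.WriteLine("gen0={0} gen1={1} gen2={2}",
                gen0.NextValue(), gen1.NextValue(), gen2.NextValue());
        }
    }
}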
You might also want to take a look at the CLR's (i.e. the runtime's) unmanaged, COM-based Debugging API. Given that you have retrieved an ICorDebugProcess5 interface, these methods might be of interest:
ICorDebugProcess5::EnumerateGCReferences method:
"Gets an enumerator for all objects that are to be garbage-collected in a process."
See also this answer to a similar question on SO.
Note that this is about objects that are to be garbage-collected, not about live objects.
ICorDebugProcess5::GetGCHeapInformation method:
"Provides general information about the garbage collection heap, including whether it is currently enumerable."
If it turns out that the managed heap is enumerable, you could use…
ICorDebugProcess5::EnumerateHeap method:
"Gets an enumerator for the objects on the managed heap."
The objects returned by this enumerator are of this type:
COR_HEAPOBJECT structure:
"Provides information about an object on the managed heap."
You might not be actually interested in these details, but just in the number of objects returned by the enumerator.
(I haven't used this API myself, perhaps there exists a better and more efficient way.)
In Sept 2015, Microsoft published a managed library called clrmd aka Microsoft.Diagnostics.Runtime on GitHub. It is based on the same foundation as the unmanaged debugging API mentioned above. The project includes documentation about enumerating objects in the GC heap.
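As a rough illustration, counting heap objects per generation with clrmd might look like the following. This is a sketch against the clrmd 1.x API (AttachToProcess, EnumerateObjectAddresses and GetGeneration exist there, but check the current API before relying on this):

using System;
using Microsoft.Diagnostics.Runtime;

class HeapCounter
{
    static void Main(string[] args)
    {
        int pid = int.Parse(args[0]);
        // Passive attach: inspect the target process without debugging it.
        using (DataTarget target = DataTarget.AttachToProcess(pid, 5000, AttachFlag.Passive))
        {
            ClrRuntime runtime = target.ClrVersions[0].CreateRuntime();
            ClrHeap heap = runtime.Heap;
            int[] counts = new int[4]; // gen0, gen1, gen2, large object heap

            foreach (ulong obj in heap.EnumerateObjectAddresses())
                counts[heap.GetGeneration(obj)]++; // 3 == large object heap

            Console.WriteLine("gen0={0} gen1={1} gen2={2} LOH={3}",
                counts[0], counts[1], counts[2], counts[3]);
        }
    }
}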
By the way, there is an extremely informative book out there by Ben Watson, "Writing High-Performance .NET Code", which includes solid tips on how to make .NET memory allocation and GC more efficient.
Garbage Collector doesn't have to collect objects.
"... that fact will be discovered when the garbage collector runs the collector for whatever generation the object was in. (If it runs at all, which it might not. There is no guarantee that the GC runs.)" (Eric Lippert)
If the application is performing normally and memory consumption is not increasing, the GC may let it run without interruption. That means the numbers will differ from run to run.
If I were you, I wouldn't spend time on getting per-generation information, but just look at the size of used memory.
A simple but not very accurate way is to get it from the GC:
// Determine the best available approximation of the number
// of bytes currently allocated in managed memory.
Console.WriteLine("Total Memory: {0}", GC.GetTotalMemory(false));
If you see that used memory increases and decreases often, then you can use existing profilers to figure out where you are allocating too much, or even where the memory leak is.

Optimizing collection of long lived objects

Background:
I have a service whose purpose in life is to provide objects to requestors - it basically gets complicated data from a database and transforms it once (a bit like a view over data) to produce a simplified record. This then services requests from other services by providing up to 100k records (depending on the nature of the request) on demand.
The idea is that the complicated transformation is done once and is cached by the service - it works out quicker than letting the database work it out each time a view is accessed and for my purposes works just fine. (I believe this is called SSOS by some)
The way data is being cached is in a list of objects which are property bags for standard .Net types. These objects have no references to anything else.
Periodically a record will change, and the cache must be updated which means that the original record must be located, thrown away and replaced.
Now, a record in the cache will have been in there for a long time and so will have been promoted to Gen 2; pretty much all collections of these objects will happen in Gen 2, as they are hanging around for ages (on purpose).
So my understanding of Gen2 collections is that they are slow, and if the collections are mainly working on Gen2 then the GC is going to do this more often.
I would like to be able to de-reference an object in the list in a way that doesn't end up triggering a full Gen2 collection... I was thinking that maybe there is a way of marking it as Gen0 and then de-referencing it before replacing it - but I don't think that is possible.
I am constrained to using .Net 4 for this and the application is a service which serves data to up to 100 clients who request full lists or changes to the list over a period of time.
Question: Can anyone suggest a way to de-reference long lived objects in a GC friendly way or perhaps another way to approach this problem?
There is no simple answer to this. If you have lots of long-lived objects, then full collections really can hurt, as I discussed here. Since a picture tells a thousand words:
Those vertical spikes are where garbage collection happens and slaughters the response times.
The way we reduced the impact of this was: don't have a gazillion long-lived objects. What we did was change the classes to structs, which meant that the only object was the array that contained them. We were fortunate here in that the data was simple and didn't involve strings, which would of course themselves be objects. We also did some crazy fixed-size buffer work to reduce things that were previously collections, and changed what were references into indices (into the array). If you do have to use string data, perhaps try to ensure you don't have 20,000 different string instances with the same value - some kind of manual interner (a Dictionary<string,string> would suffice) can be really useful there.
Note that this needn't impact your public API, since you can always create the old class data from the struct storage - the difference is that this class will only exist briefly as a DTO - so will be collected cheaply in the next gen-0 sweep.
YMMV, but this worked well enough for us.
The problem is: you need to be really careful when working with structs; I strongly advise making them immutable.
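A minimal sketch of the struct-storage-plus-interner approach described above; the record layout and names are invented for illustration, while the Dictionary<string,string> interner is the one suggested in the answer:

using System.Collections.Generic;

// One big array of structs: from the GC's point of view there is a single
// object to mark (the array), not millions of record objects.
struct CachedRecord
{
    public int Id;
    public long Value;
    public int RelatedIndex; // index into the array instead of an object reference
    public string Name;      // interned below, so equal values share one instance
}

class RecordStore
{
    private readonly CachedRecord[] _records;
    private readonly Dictionary<string, string> _interner = new Dictionary<string, string>();

    public RecordStore(int capacity) { _records = new CachedRecord[capacity]; }

    // Manual interner: ensures 20,000 equal strings become one string object.
    public string Intern(string s)
    {
        string existing;
        if (_interner.TryGetValue(s, out existing))
            return existing;
        _interner.Add(s, s);
        return s;
    }

    public void Set(int index, int id, long value, int relatedIndex, string name)
    {
        _records[index] = new CachedRecord
        {
            Id = id, Value = value, RelatedIndex = relatedIndex, Name = Intern(name)
        };
    }
}

Note that a string field still gives the GC one reference per record to trace; the interner only reduces how many distinct string objects exist, which is why avoiding strings entirely helps most.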

Best time to cull WeakReferences in a collection in .NET

I have a collection (I'm writing a Weak Dictionary) and I need to cull the dead WeakReferences periodically. What I've usually seen is checks in the Add and Remove methods that say, "After X modifications to the collection, it's time to cull." This will be acceptable for me, but it seems like there should be a better way.
I would really like to know when the GC ran and run my cleanup code immediately after. After all, the GC is probably the best mechanism for determining when is a good time to clean up dead references. I found Garbage Collection Notifications, but it doesn't look like this is what I want. I don't want to spawn a separate thread just to monitor the GC. Ideally, my collection would implement IWantToRunCodeDuringGC or subscribe to a System.GC.Collected event. But the .NET framework probably can't trust user code to run during a GC...
Or maybe there's another approach I'm overlooking.
EDIT: I don't think it matters if my code runs after, before, or during the GC.
I guess you want to create a weak dictionary as a way of “lazy” caching of some data.
You need to consider the fact that GC will occur very often, and your weak references will be dead most of the time if no other objects reference their targets. A gen0 collection can be triggered after a surprisingly small amount of allocation (on the order of hundreds of kilobytes), so this happens pretty often.
You will probably be better off implementing your cache as a dictionary with a maximum number of elements. You can then use a least-recently-used or time-based algorithm for pushing elements out of the collection. Usually such an approach has better performance and memory consumption than using weak references.
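A minimal sketch of such a bounded LRU cache, using a Dictionary plus a LinkedList to track recency (names are illustrative, and Add assumes the key is not already present):

using System.Collections.Generic;

class LruCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _map;
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order; // front = most recent

    public LruCache(int capacity)
    {
        _capacity = capacity;
        _map = new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>(capacity);
        _order = new LinkedList<KeyValuePair<TKey, TValue>>();
    }

    public bool TryGet(TKey key, out TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (_map.TryGetValue(key, out node))
        {
            _order.Remove(node);     // move to the front: it was just used
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }

    public void Add(TKey key, TValue value)
    {
        if (_map.Count >= _capacity) // evict the least recently used entry
        {
            _map.Remove(_order.Last.Value.Key);
            _order.RemoveLast();
        }
        var node = new LinkedListNode<KeyValuePair<TKey, TValue>>(
            new KeyValuePair<TKey, TValue>(key, value));
        _order.AddFirst(node);
        _map[key] = node;
    }
}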
Can't you just use an object with no references to it, and run some code in its finalizer? The finalizer is called when the GC collects the object.
EDIT: Then of course, you wouldn't know when the GC was done... Hmm.
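That idea can be fleshed out into the well-known resurrecting-finalizer pattern for detecting collections. A sketch, with an invented class name; note the callback runs on the finalizer thread, some time after the GC rather than during it, which the question's edit says is acceptable:

using System;

class GcWatcher
{
    public static event Action GarbageCollected; // raised on the finalizer thread after each GC

    static GcWatcher() { new GcWatcher(); }      // plant the first unreferenced sentinel
    private GcWatcher() { }

    ~GcWatcher()
    {
        // Don't resurrect during process shutdown or AppDomain unload.
        if (Environment.HasShutdownStarted ||
            AppDomain.CurrentDomain.IsFinalizingForUnload())
            return;

        Action handler = GarbageCollected;
        if (handler != null) handler();

        new GcWatcher(); // plant a fresh sentinel for the next collection
    }
}

The culling code would subscribe to GarbageCollected and schedule a sweep, preferably on another thread, since blocking the finalizer thread is a bad idea.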
From the standpoint of avoiding memory leaks, I would suggest that when something is added to the dictionary, you check how many garbage collections have been performed. If the number of items that have been added between the last time the dictionary was swept and the time the last collection occurred exceeds a reasonable fraction of the size of the dictionary (say, 10%, or some minimum number of items, whichever is less), that would be a sign that the dictionary should be swept. Note that this approach will limit the number of dead items in the dictionary to a certain fraction of the dictionary size while offering reasonable performance regardless of dictionary size.
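A simplified sketch of that heuristic, using GC.CollectionCount to detect that a collection has happened since the last sweep; the threshold logic is an approximation of the fraction-of-size rule described above:

using System;
using System.Collections.Generic;
using System.Linq;

class WeakDictionary<TKey, TValue> where TValue : class
{
    private readonly Dictionary<TKey, WeakReference> _entries = new Dictionary<TKey, WeakReference>();
    private int _addsSinceSweep;
    private int _gcCountAtSweep;

    public void Add(TKey key, TValue value)
    {
        _entries[key] = new WeakReference(value);
        _addsSinceSweep++;

        // Sweep only if a collection has run since the last sweep AND enough
        // items have been added for dead entries to be worth removing.
        bool gcHasRun = GC.CollectionCount(0) != _gcCountAtSweep;
        int threshold = Math.Max(1, Math.Min(_entries.Count / 10, 1000));
        if (gcHasRun && _addsSinceSweep >= threshold)
            Sweep();
    }

    private void Sweep()
    {
        foreach (TKey key in _entries.Where(e => !e.Value.IsAlive)
                                     .Select(e => e.Key).ToList())
            _entries.Remove(key);
        _addsSinceSweep = 0;
        _gcCountAtSweep = GC.CollectionCount(0);
    }
}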

Does WeakReference make a good cache?

I have a cache that uses WeakReferences to the cached objects to make them automatically removed from the cache in case of memory pressure. My problem is that the cached objects are collected very soon after they have been stored in the cache. The cache runs in a 64-bit application, and even though more than 4 GiB of memory is still available, all the cached objects are collected (they are usually in the Gen2 heap by that point). No garbage collections are induced manually, as Process Explorer shows.
What methods can I apply to make the objects live a little longer?
Using WeakReferences as the primary means of referencing cached objects is not really a great idea, because, as Josh said, you're at the mercy of any future behavioral changes to WeakReference and the GC.
However, if your cache needs any kind of resurrection capability, using WeakReferences for items that are pending purge is useful. When an item meets the eviction criteria, rather than immediately evicting it, you change its reference to a weak reference. If anything requests it before it is GC'ed, you restore its strong reference, and the object can live again. I have found this useful for some caches with hard-to-predict hit-rate patterns, where "resurrections" are frequent enough to be beneficial.
If you have predictable hit-rate patterns, then I would forgo the WeakReference option and perform explicit evictions.
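A minimal sketch of that eviction-with-resurrection pattern (class and member names invented, single-threaded for brevity):

using System;
using System.Collections.Generic;

class ResurrectingCache<TKey, TValue> where TValue : class
{
    private readonly Dictionary<TKey, TValue> _strong = new Dictionary<TKey, TValue>();
    private readonly Dictionary<TKey, WeakReference> _evicted = new Dictionary<TKey, WeakReference>();

    public void Add(TKey key, TValue value) { _strong[key] = value; }

    // Called when an item meets the eviction criteria: demote it to a weak
    // reference instead of dropping it outright.
    public void Evict(TKey key)
    {
        TValue value;
        if (_strong.TryGetValue(key, out value))
        {
            _strong.Remove(key);
            _evicted[key] = new WeakReference(value);
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        if (_strong.TryGetValue(key, out value))
            return true;

        WeakReference wr;
        if (_evicted.TryGetValue(key, out wr))
        {
            value = wr.Target as TValue;
            _evicted.Remove(key);
            if (value != null)
            {
                _strong[key] = value;  // resurrected: restore the strong reference
                return true;
            }
        }
        return false;                  // truly gone; the caller must rebuild it
    }
}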
There is one situation where a WeakReference-based cache may be good: when the usefulness of an item in the cache is predicated upon the existence of a reference to it. In such a situation, a weak interning cache may be useful. For example, suppose an application will deserialize many large immutable objects, many of which are expected to be duplicates, and will have to perform many comparisons between them.
If X and Y are references to some immutable class type, testing X.Equals(Y) will be very fast if both variables point to the same instance, but may be very slow if they point to distinct instances that happen to be equal. If a deserialized object happens to match another object to which a reference already exists, fetching from the dictionary a reference to that latter object (requiring one slow comparison) may expedite future comparisons. On the other hand, if it matched an item in the dictionary but the dictionary held the only reference to that item, there would be little advantage to using the dictionary's object instead of simply keeping the object that was read in; probably not enough advantage to justify the cost of the comparison. For an interning cache, having WeakReferences get invalidated as soon as possible once no other references to an object exist is a good thing.
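A sketch of such a weak interning cache, assuming an immutable reference type with value-based Equals and GetHashCode. The bucket scheme is needed because a plain Dictionary<T, WeakReference> would hold its keys strongly and defeat the purpose:

using System;
using System.Collections.Generic;

class WeakInterner<T> where T : class
{
    // Buckets keyed by hash code; the canonical instances are held only
    // weakly, so the interner itself keeps nothing alive.
    private readonly Dictionary<int, List<WeakReference>> _buckets =
        new Dictionary<int, List<WeakReference>>();

    public T Intern(T candidate)
    {
        int hash = candidate.GetHashCode();
        List<WeakReference> bucket;
        if (!_buckets.TryGetValue(hash, out bucket))
            _buckets[hash] = bucket = new List<WeakReference>();

        bucket.RemoveAll(w => !w.IsAlive); // drop dead entries as we go

        foreach (WeakReference w in bucket)
        {
            T existing = w.Target as T;
            if (existing != null && existing.Equals(candidate))
                return existing;           // one slow comparison, then reuse
        }
        bucket.Add(new WeakReference(candidate));
        return candidate;                  // the candidate becomes canonical
    }
}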
In .NET, a WeakReference is not considered a reference from the GC's standpoint at all, so any object that has only weak references to it will be collected in the next GC run (for the appropriate generation).
That makes weak references completely inappropriate for caching, as your experience shows.
You need a "real" cache component, and the most important thing about caching is to get one where the eviction policy (that is, the rules about when to drop an object from the cache) are a good match for you application's usage pattern.
No, WeakReference is not good for that because the behavior of the garbage collector can and will change over time and your cache should not be relying on today's behavior. Also many factors outside of your control could affect memory pressure.
There are many implementations of a cache for .NET. You could probably find a dozen on CodePlex. I guess what you need to add is something that looks at the application's current working set and uses that as a trigger for purging.
One more note about why your objects are being collected so frequently. The GC is very aggressive at cleaning up Gen0 objects. If your objects are very short-lived (up until the only reference to it is a weak reference) then the GC is doing what it's designed to do by cleaning up as quickly as it can.
I believe the problem you are having is that the garbage collector removes weakly referenced objects not just in response to memory pressure; it will sometimes collect quite aggressively simply because the runtime thinks some objects may have become unreachable.
You may be better off using, e.g., System.Runtime.Caching.MemoryCache, which can be configured with a memory limit and custom eviction policies for its items.
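A minimal usage sketch; MemoryCache, CacheItemPolicy and the Set/Get calls are the real .NET 4 APIs, while the key and value here are placeholders:

using System;
using System.Runtime.Caching;

class MemoryCacheExample
{
    static void Main()
    {
        // MemoryCache.Default can also be capped via the
        // <system.runtime.caching> cacheMemoryLimitMegabytes config setting.
        ObjectCache cache = MemoryCache.Default;

        var policy = new CacheItemPolicy
        {
            // Evict if not accessed for 10 minutes, instead of whenever
            // the GC happens to run.
            SlidingExpiration = TimeSpan.FromMinutes(10)
        };

        cache.Set("some-key", new byte[1024], policy);

        var value = (byte[])cache.Get("some-key");
        Console.WriteLine(value != null ? "hit" : "miss");
    }
}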
The answer really depends on the usage characteristics of the cache you are trying to build. I have successfully used a WeakReference-based caching strategy to improve performance in many of my projects where the cached objects are expected to be used in short bursts of multiple reads. As others pointed out, weakly referenced objects are pretty much garbage from the GC's point of view and will be collected on the next GC cycle; it has nothing to do with memory utilization.
If, however, you need a cache that survives such brutality from the GC, you need to use or mimic the functionality provided by the System.Runtime.Caching namespace. Keep in mind that you'd need an additional thread that cleans up the cache when memory usage crosses your thresholds.
A bit late, but here's a relevant use case:
I need to cache two types of objects: large (deserialised) data files that take 10 minutes to load and cost 15 GB of RAM each, and smaller (dynamically compiled) objects that contain internal references to those data files (the smaller objects are also cached because they take ~10s to generate). These caches are hidden within the factories that supply the objects (the former component having no knowledge of the latter), and have different eviction policies.
When my "data file" cache evicts an object, it replaces it with a weak reference, so if that object is still available when next requested, we can resurrect it (and renew its cache timeout). In this way we avoid losing (or accidentally duplicating) any object before it is truly defunct (i.e. not used anywhere else). Note that neither cache needs to be aware of the other, and that no other client objects need to be aware that there are any caches at all (e.g. we avoid needing keepalives, callbacks, registration, retrieve-and-return scopes, etc. - things get a lot simpler).
So although using WeakReference by itself (instead of a cache) is a terrible idea (because modern GCs are typically tuned to the size of the L2 CPU cache, and regular code will burn through this many times per minute), it's very useful as a way to hide your caches from the rest of your code.
