Best time to cull WeakReferences in a collection in .NET - c#

I have a collection (I'm writing a Weak Dictionary) and I need to cull the dead WeakReferences periodically. What I've usually seen is checks in the Add and Remove methods that say, "After X modifications to the collection, it's time to cull." This will be acceptable for me, but it seems like there should be a better way.
I would really like to know when the GC ran and run my cleanup code immediately after. After all, the GC is probably the best mechanism for determining when is a good time to clean up dead references. I found Garbage Collection Notifications, but it doesn't look like this is what I want. I don't want to spawn a separate thread just to monitor the GC. Ideally, my collection would implement IWantToRunCodeDuringGC or subscribe to a System.GC.Collected event. But the .NET framework probably can't trust user code to run during a GC...
Or maybe there's another approach I'm overlooking.
EDIT: I don't think it matters if my code runs after, before, or during the GC.

I guess you want to create a weak dictionary as a way of “lazy” caching of some data.
You need to consider the fact that GC will occur very often and your weak references will be dead most of the time if no other objects will reference them. GC occurs approximately after every 256KB of memory is allocated. It is pretty often.
You probably will be better by implementing your cache as a dictionary with a maximum number of elements. Then you can use least-recently used algorithm or time-based algorithm for pushing elements out of the collection. Usually such approach has better performance and memory consumption than using weak references.

Can't you just use an object with no references, then call some code in the finalizer? The finalizer is called when the GC is collecting the object.
EDIT: Then of course, you wouldn't know when the GC was done.. Hmm.

From the standpoint of avoiding memory leaks, I would suggest that when something is added to a dictionary, you check whether the number of garbage-collections that have been performed. If the number of items that have been added between the last time the dictionary was checked and the time the last collection occurred exceeds a reasonable fraction of the size of the dictionary (say, 10%, or some minimum number of items, whichever is less), that would be a sign that the dictionary should be swept. Note that this approach will limit the number of excessive items in the dictionary to a certain fraction of the dictionary size while offering reasonable performance regardless of dictionary size.

Related

.NET - Reducing GC contention on array creation

I have some code that deal with a lot of copying of arrays. Basically my class is a collection that uses arrays as backing fields, and since I don't want to run the risk of anyone modifying an existing collection, most operations involves creating copies of the collection before modifying it, hence also copying the backing arrays.
I have noticed that the copying can be slow sometimes, within acceptable limits but I am worried that it might be a problem when the application is scaled up and starts using more data.
Some performance analysis testing suggests that while barely consuming CPU resources at all, my array copy code spends a lot of time blocked. There are few contentions, but a lot of time blocked. Since the testing application is single threaded, I assume there is some GC contention magic going on. I'm not confident enough in how the GC works in these scenarios, so I'm asking here.
My question - is there a way to create new arrays that reduces the strain on the GC? Or is there some other way I can speed this up (simplified for testing and readability purposes):
public MyCollection(MyCollection copyFrom)
{
_items = new KeyValuePair<T, double>[copyFrom._items.Length]; //this line is reported to have a lot of contention time
Array.Copy(copyFrom._items, _items, copyFrom._items.Length);
_numItems = copyFrom._numItems;
}
Not so sure what's going on here, but contention is a threading problem, not an array copying problem. And yes, a concurrency analyzer is liable to point at a new statement since memory allocation requires acquiring a lock that protects the heap.
That lock is held for a very short time when allocations come from the gen #0 heap. So having threads fighting over the lock and losing a great deal of time being locked out is a very unlikely mishap. It is not so fast when the allocation comes from the Large Object Heap. Happens when the allocation is 85,000 bytes or more. But then a thread would of course be pretty busy with copying the array elements as well.
Do watch out for what the tool tells you, a very large number of total contentions does not automatically mean you have a problem. It only gets ugly when threads end up getting blocked for a substantial amount of time. If that is a real problem then you next need to look at how much time is spent on garbage collection. There is a basic perf counter for that, you can see it in Perfmon.exe. Category ".NET CLR Memory", counter "% Time in GC", instance = yourapp. Which is liable to be high, considering the amount of copying you do. A knob you can tweak if that is the real problem is to enable server GC.
There's a concept of persistent immutable data structure. This is one of the possible solutions that basically let's you create immutable objects, while still modifying them, in a memory efficient way.
For example,
Roslyn has a SyntaxTree object, that is immutable. You can modify the immutable object, and get back modified immutable object. Note that the "modified immutable object" has possibly no memory allocations, because it can build on the "first immutable object".
The same concept is also used in Visual Studio text editor itself. The TextBuffer is immutable object, but each time you press a keyboard button, new immutable TextBuffer is created, however, they do not allocate memory(as that would be slow).
Also, if it's true that you're facing the issues with LOH, it can help sometimes when you allocate the big memory block yourself, and use that as "reusable" memory pool, thus avoiding GC completely. It's worth considering.
No. You can wait for the new runtime in 2015 though that will use SIMD instructions for the Array.Copy operation. This will be quite a lot faster. The current implementation is very sub-optimal.
At the end, the trick is in avoiding memory operations - which sometime just is not possible.

.Net garbage collection mark phase and huge linked lists

If there is a linked list with 4M+ nodes, does the mark phase needs to traverse the entire list each time to build the graph? Are there any optimizations applied in this case? In the plain sight it doesn't look efficient. Is there a way to verify if GC traverses the entire list or not?
TIA.
Yes, it will need to traverse the whole object graph. I can't think how there could be any optimizations, to be honest... but it doesn't need to do very much on each node. Most of the time will probably be spent waiting on memory, I suspect, as obviously it'll burn through the cache. Of course, by the time the linked list ends up in gen2 (and if you're allocating millions of nodes, most of it will be in gen2 pretty quickly), it will only need to do that very rarely.
If this is the most reasonable data structure for your app, I would use it for the moment, but keep track of the performance hit of garbage collection using Performance Monitor etc. If it turns out to be a problem, you can consider alternative strategies.
What Jon said.
Also, once an object ends up in Gen2, an optimisation that's available (on Windows but not other platforms IIRC) is that the GC can register with the kernel for notifications to a given page of memory. In cases where a page remains unchanged between GC events, some work need not be repeated.
There is one very important optimization being made. The .NET GC is generational, and data in gen2 is only rarely traversed.
With large data structures (such as huge linked lists), most of your data will quickly end up in gen2, where the GC will only rarely access it.
Also, the GC only traverses live data during collections, dead data is collected "for free". So when your list becomes unreachable (or if most of its nodes, but not all, do), then the GC will be able to collect millions of nodes basically for free.

Does WeakReference make a good cache?

i have a cache that uses WeakReferences to the cached objects to make them automatically removed from the cache in case of memory pressure. My problem is that the cached objects are collected very soon after they have been stored in the cache. The cache runs in a 64-Bit application and in spite of the case that more than 4gig of memory are still available, all the cached objects are collected (they usually are stored in the G2-heap at that moment). There are no garbage collection induced manually as the process explorer shows.
What methods can i apply to make the objects live a litte longer?
Using WeakReferences as the primary means of referencing cached objects is not really a great idea, because as Josh said, your at the mercy of any future behavioral changes to WeakReference and the GC.
However, if your cache needs any kind of resurrection capability, use of WeakReferences for items that are pending purge is useful. When an item meets eviction criteria, rather than immediately evicting it, you change its reference to a weak reference. If anything requests it before it is GC'ed, you restore its strong reference, and the object can live again. I have found this useful for some caches that have hard to predict hit rate patterns with frequent enough "resurrections" to be beneficial.
If you have predictable hit rate patterns, then I would forgoe the WeakReference option and perform explicit evictions.
There is one situation where a WeakReference-based cache may be good: when the usefulness of an item in the class is predicated upon the existence of a reference to it. In such a situation, a weak interning cache may be useful. For example, if one had an application which would deserialize many large immutable objects, many of which were expected to be duplicates, and would have to perform many comparisons between them. If X and Y are references to some immutable class type, testing X.Equals(Y) will be very fast if both variables point to the same instance, but may be very slow if they point to distinct instances that happen to be equal. If a deserialized object happens to match another object to which a reference already exists, fetching a from the dictionary a reference to that latter object (requiring one slow comparison) may expedite future comparisons. On the other hand, if it matched an item in the dictionary but the dictionary was the only reference to that item, there would be little advantage to using the dictionary object instead of simply keeping the object that was read in; probably not enough advantage to justify the cost of the comparison. For an interning cache, having WeakReferences get invalidated as soon as possible once no other references exist to an object would be a good thing.
In .net, a WeakReference is not considered a reference from the GC standpoint at all, so any object that only has weak references will be collected in the next GC run (for the appropriate generation).
That makes weak reference completely inappropriate for caching - as your experience shows.
You need a "real" cache component, and the most important thing about caching is to get one where the eviction policy (that is, the rules about when to drop an object from the cache) are a good match for you application's usage pattern.
No, WeakReference is not good for that because the behavior of the garbage collector can and will change over time and your cache should not be relying on today's behavior. Also many factors outside of your control could affect memory pressure.
There are many implementations of a cache for .NET. You could find probably a dozen on CodePlex. I guess what you need to add to it is something that looks at the application's current working set to use that as a trigger for purging.
One more note about why your objects are being collected so frequently. The GC is very aggressive at cleaning up Gen0 objects. If your objects are very short-lived (up until the only reference to it is a weak reference) then the GC is doing what it's designed to do by cleaning up as quickly as it can.
I believe the problem you are having is that the Garbage Collector removes weakly referenced objects in response not only in response to memory pressure - instead it will do collection quite aggressively sometimes just because the runtime system thinks some objects may likely have become unreachable.
You may be better off using e.g. System.Runtime.Caching.MemoryCache which can be configured with a memory limit, or custom eviction policies for the items.
The answer actually depends on usage characteristics of the cache you are trying to build. I have successfully used WeakReference based caching strategy for improving performance in many of my projects where the cached objects are expected to be used in short bursts of multiple reads. As others pointed out, the weak references are pretty much garbage from GC's point of view and will be collected whenever the next GC cycle is run. It's nothing to do with the memory utilization.
If, however, you need a cache that survives such brutality from GC, you need to use or mimic the functionality provided by System.Runtime.Caching namespace. Keep in mind that you'd need an additional thread that cleans up the cache when the memory usage is crossing your thresholds.
A bit late, but here's a relevant use case:
I need to cache two types of objects: large (deserialised) data files that take 10 minutes to load and cost 15G of ram each, and smaller (dynamically compiled) objects that contain internal references to those data files (the smaller objects are also cached because they take ~10s to generate). These caches are hidden within the factories that supply the objects (the former component having no knowledge of the latter), and have different eviction policies.
When my `data file' cache evicts an object, it replaces it by a weak reference, so if that object is still available when next requested, we can resurrect it (and renew its cache timeout). In this way we avoid losing (or accidentally duplicating) any object before it is truly defunct (i.e. not used anywhere else). Notice that neither cache is required to be aware of the other, and that no other client objects need to be aware that there are any caches at all (eg: we avoid needing 'keepalives', callbacks, registration, retrieve-and-return scopes, etc - things get a lot simpler).
So although using WeakReference by itself (instead of a cache) is a terrible idea (because modern GCs are typically tuned to the size of the L2 CPU cache, and regular code will burn through this many times per minute), it's very useful as a way to hide your caches from the rest of your code.

Variation in execution time

I've been profiling a method using the stopwatch class, which is sub-millisecond accurate. The method runs thousands of times, on multiple threads.
I've discovered that most calls (90%+) take 0.1ms, which is acceptable. Occasionally, however, I find that the method takes several orders of magnitude longer, so that the average time for the call is actually more like 3-4ms.
What could be causing this?
The method itself is run from a delegate, and is essentially an event handler.
There are not many possible execution paths, and I've not yet discovered a path that would be conspicuously complicated.
I'm suspecting garbage collection, but I don't know how to detect whether it has occurred.
Finally, I am also considering whether the logging method itself is causing the problem. (The logger is basically a call to a static class + event listener that writes to the console.)
Just because Stopwatch has a high accuracy doesn't mean that other things can't get in the way - like the OS interrupting that thread to do something else. Garbage collection is another possibility. Writing to the console could easily cause delays like that.
Are you actually interested in individual call times, or is it overall performance which is important? It's generally more useful to run a method thousands of times and look at the total time - that's much more indicative of overall performance than individual calls which can be affected by any number of things on the computer.
As I commented, you really should at least describe what your method does, if you're not willing to post some code (which would be best).
That said, one way you can tell if garbage collection has occurred (from Windows):
Run perfmon (Start->Run->perfmon)
Right-click on the graph; select "Add Counters..."
Under "Performance object", select ".NET CLR Memory"
From there you can select # Gen 0, 1, and 2 collections and click "Add"
Now on the graph you will see a graph of all .NET CLR garbage collections
Just keep this graph open while you run your application
EDIT: If you want to know if a collection occurred during a specific execution, why not do this?
int initialGen0Collections = GC.CollectionCount(0);
int initialGen1Collections = GC.CollectionCount(1);
int initialGen2Collections = GC.CollectionCount(2);
// run your method
if (GC.CollectionCount(0) > initialGen0Collections)
// gen 0 collection occurred
if (GC.CollectionCount(1) > initialGen1Collections)
// gen 1 collection occurred
if (GC.CollectionCount(2) > initialGen2Collections)
// gen 2 collection occurred
SECOND EDIT: A couple of points on how to reduce garbage collections within your method:
You mentioned in a comment that your method adds the object passed in to "a big collection." Depending on the type you use for said big collection, it may be possible to reduce garbage collections. For instance, if you use a List<T>, then there are two possibilities:
a. If you know in advance how many objects you'll be processing, you should set the list's capacity upon construction:
List<T> bigCollection = new List<T>(numObjects);
b. If you don't know how many objects you'll be processing, consider using something like a LinkedList<T> instead of a List<T>. The reason for this is that a List<T> automatically resizes itself whenever a new item is added beyond its current capacity; this results in a leftover array that (eventually) will need to be garbage collected. A LinkedList<T> does not use an array internally (it uses LinkedListNode<T> objects), so it will not result in this garbage collection.
If you are creating objects within your method (i.e., somewhere in your method you have one or more lines like Thing myThing = new Thing();), consider using a resource pool to eliminate the need for constantly constructing objects and thereby allocating more heap memory. If you need to know more about resource pooling, check out the Wikipedia article on Object Pools and the MSDN documentation on the ConcurrentBag<T> class, which includes a sample implementation of an ObjectPool<T>.
That can depend on many things and you really have to figure out which one you are delaing with.
I'm not terribly familiar with what triggers garbage collection and what thread it runs on, but that sounds like a possibility.
My first thought around this is with paging. If this is the first time the method runs and the application needs to page in some code to run the method, it would be waiting on that. Or, it could be the data that you're using within the method that triggered a cache miss and now you have to wait for that.
Maybe you're doing an allocation and the allocator did some extra reshuffling in order to get you the allocation you requested.
Not sure how thread time is calculated with Stopwatch, but a context switch might be what you're seeing.
Or...it could be something completely different...
Basically, it could be one of several things and you really have to look at the code itself to see what is causing your occasional slow-down.
It could well be GC. If you use a profiler application such as Redgate's ANTS profiler you can profile % time in GC along side your application's performance to see what's going on.
In addition, you can use the CLRProfiler...
https://github.com/MicrosoftArchive/clrprofiler
Finally, Windows Performance Monitor will show the % time in GC for a given running applicaiton too.
These tools will help you get a holistic view of what's going on in your app as well as the OS in general.
I'm sure you know this stuff already but microbenchmarking such as this is sometimes useful for determining how fast one line of code might be compared to another than you might write, but you generally want to profile your application under typical load too.
Knowing that a given line of code is 10 times faster than another is useful, but if that line of code is easier to read and not part of a tight loop then the 10x performance hit might not be a problem.
What you need is a performance profile to tell you exactly what causes a slow down. Here is a quick list And of course here is the ANTS profiler.
Without knowing what your operation is doing, it sounds like it could be the garbage collection. However that might not be the only reason. If you are reading or writing to the disc it is possible your application might have to wait while something else is using the disk.
Timing issues may occur if you have a multi-threaded application and another thread could be taking some processor time that is only running 10 % of the time. This is why a profiler would help.
If you're only running the code "thousands" of times on a pretty quick function, the occasional longer time could easily be due to transient events on the system (maybe Windows decided it was time to cache something).
That being said, I would suggest the following:
Run the function many many more times, and take an average.
In the code that uses the function, determine if the function in question actually is a bottleneck. Use a profiler for this.
It can be dependent on your OS, environment, page reads, CPU ticks per second and so on.
The most realistic way is to run an execution path several thousand times and take the average.
However, if that logging class is only called occasionally and it logs to disk, that is quite likely to be a slow-down factor if it has to seek on the drive first.
A read of http://en.wikipedia.org/wiki/Profiling_%28computer_programming%29 may give you an insight into more techniques for determining slowdowns in your applications, while a list of profiling tools that you may find useful is at:
http://en.wikipedia.org/wiki/Visual_Studio_Team_System_Profiler
specifically http://en.wikipedia.org/wiki/Visual_Studio_Team_System_Profiler if you're doing c# stuff.
Hope that helps!

Garbage Collection on one object, C#

I need to dispose of an object so it can release everything it owns, but it doesn't implement the IDisposable so I can't use it in a using block. How can I make the garbage collector collect it?
You can force a collection with GC.Collect(). Be very careful using this, since a full collection can take some time. The best-practice is to just let the GC determine when the best time to collect is.
Does the object contain unmanaged resources but does not implement IDisposable? If so, it's a bug.
If it doesn't, it shouldn't matter if it gets released right away, the garbage collector should do the right thing.
If it "owns" anything other than memory, you need to fix the object to use IDisposable. If it's not an object you control this is something worth picking a different vendor over, because it speaks to the core of how well your vendor really understands .Net.
If it does just own memory, even a lot of it, all you have to do is make sure the object goes out of scope. Don't call GC.Collect() — it's one of those things that if you have to ask, you shouldn't do it.
You can't perform garbage collection on a single object. You could request a garbage collection by calling GC.Collect() but this will effect all objects subject to cleanup. It is also highly discouraged as it can have a negative effect on the performance of later collections.
Also, calling Dispose on an object does not clean up it's memory. It only allows the object to remove references to unmanaged resources. For example, calling Dispose on a StreamWriter closes the stream and releases the Windows file handle. The memory for the object on the managed heap does not get reclaimed until a subsequent garbage collection.
Chris Sells also discussed this on .NET Rocks. I think it was during his first appearance but the subject might have been revisited in later interviews.
http://www.dotnetrocks.com/default.aspx?showNum=10
This article by Francesco Balena is also a good reference:
When and How to Use Dispose and Finalize in C#
http://www.devx.com/dotnet/Article/33167/0/page/1
Garbage collection in .NET is non deterministic, meaning you can't really control when it happens. You can suggest, but that doesn't mean it will listen.
Tells us a little bit more about the object and why you want to do this. We can make some suggestions based off of that. Code always helps. And depending on the object, there might be a Close method or something similar. Maybe the useage is to call that. If there is no Close or Dispose type of method, you probably don't want to rely on that object, as you will probably get memory leaks if in fact it does contain resourses which will need to be released.
If the object goes out of scope and it have no external references it will be collected rather fast (likely on the next collection).
BEWARE: of f ra gm enta tion in many cases, GC.Collect() or some IDisposal is not very helpful, especially for large objects (LOH is for objects ~80kb+, performs no compaction and is subject to high levels of fragmentation for many common use cases) which will then lead to out of memory (OOM) issues even with potentially hundreds of MB free. As time marches on, things get bigger, though perhaps not this size (80 something kb) for LOH relegated objects, high degrees of parallelism exasperates this issue due simply due to more objects in less time (and likely varying in size) being instantiated/released.
Array’s are the usual suspects for this problem (it’s also often hard to identify due to non-specific exceptions and assertions from the runtime, something like “high % of large object heap fragmentation” would be swell), the prognosis for code suffering from this problem is to implement an aggressive re-use strategy.
A class in Systems.Collections.Concurrent.ObjectPool from the parallel extensions beta1 samples helps (unfortunately there is not a simple ubiquitous pattern which I have seen, like maybe some attached property/extension methods?), it is simple enough to drop in or re-implement for most projects, you assign a generator Func<> and use Get/Put helper methods to re-use your previous object’s and forgo usual garbage collection. It is usually sufficient to focus on array’s and not the individual array elements.
It would be nice if .NET 4 updated all of the .ToArray() methods everywhere to include .ToArray(T target).
Getting the hang of using SOS/windbg (.loadby sos mscoreei for CLRv4) to analyze this class of issue can help. Thinking about it, the current garbage collection system is more like garbage re-cycling (using the same physical memory again), ObjectPool is analogous to garbage re-using. If anybody remembers the 3 R’s, reducing your memory use is a good idea too, for performance sakes ;)

Categories