Hello and thank you very much for your help!
Does anybody have a good idea for finding unreferenced objects of a specific class before garbage collection (preferably as soon as possible)?
In my case, I need to create a lot of small objects of a specific class for temporary use only. The problem is that I don't know when an object is no longer needed. I would like to collect the objects of that class which are no longer referenced (as soon as possible), before garbage collection, so that I can recycle them instead of creating new ones. I think that would make the code much faster.
Kind Regards,
David
First off, before you do this you should do extensive profiling to determine that you really, truly do have a performance problem caused by collection pressure. The garbage collector is highly tuned and works quite well most of the time; situations where you need to pool objects for performance reasons are rare.
I actually am in that scenario; we have determined through extensive testing that there are certain objects we use all the time on a temporary basis, ("builders" of other objects, essentially) and that the cost of collection pressure caused by re-allocating them frequently is measurable and high.
What we do is we have a pool class which maintains an array of "blank" objects. When you need a new object, the pool checks the array and returns an object that is in the array if we have one, nulling out the array entry. If we don't have one then it creates a new object. When the temporary user is done with the object, it passes it back to the pool, which "blanks" it and sticks it back in the array. (Growing the array if necessary.)
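To make that concrete, here is a minimal, non-thread-safe sketch of the shape of such a pool. "Builder" stands in for whatever small temporary class is being recycled; the names and the Reset method are placeholders, not our actual code.
using System;

// Placeholder for the small temporary class being pooled.
public sealed class Builder
{
    public void Reset() { /* clear any per-use state here */ }
}

public sealed class BuilderPool
{
    private Builder[] items = new Builder[16];

    public Builder Fetch()
    {
        for (int i = 0; i < items.Length; i++)
        {
            if (items[i] != null)
            {
                Builder b = items[i];
                items[i] = null;          // null out the entry so it can't be handed out twice
                return b;
            }
        }
        return new Builder();             // nothing available: allocate a fresh one
    }

    public void Return(Builder b)
    {
        b.Reset();                        // "blank" the object before it goes back in
        for (int i = 0; i < items.Length; i++)
        {
            if (items[i] == null)
            {
                items[i] = b;
                return;
            }
        }
        int oldLength = items.Length;
        Array.Resize(ref items, oldLength * 2);   // grow the array if necessary
        items[oldLength] = b;
    }
}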
If a user forgets to put the object back into the pool, or cannot do so because an exception was thrown before the "back in the pool" call, who cares? All we've done in that case is perhaps slightly de-optimized a future allocation. The cost is that you need to remember to put the object back in the pool when you're done with it.
There's no way to "hook" the garbage collector to put stuff back in the pool automatically that I know of.
You can't directly control garbage collection, but you could create a manager class that is responsible for creating, holding the references and disposing of these objects. As long as the manager class is in scope, its objects will not be garbage collected.
Related
I have an app that seems to accumulate a lot of memory.
One of the suspects is below, and I'm just trying to wrap my head around what it is actually doing. Or, more specifically, how is it cleaned up?
private static readonly ConcurrentDictionary<string, AsyncLocal<object>> State;
Problem context:
The idea is to simulate what OperationContext in WCF would do - provide static access to information about the current call. I am doing this inside a Service Fabric remoting service.
Can someone help me understand the nature of this in terms of what happens to the AsyncLocal<object> once the async call ends? I see it hanging around in memory but can't tell if it is a memory leak, or if the GC just hasn't reclaimed it yet.
I know the static dictionary stays around, but do the values also, or do I need to be manually clearing those before my current service invocation completes to ensure no memory leak here?
*Edit - Here is some more info as requested by Pavel.
Posting relevant code below, but the whole picture is here.
Github where the general idea came from. They are trying to make headers work in ServiceFabric/.net core like they used to in old WCF.
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders
The RemotingContext object is here:
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders/blob/master/src/ServiceFabric.Remoting.CustomHeaders/RemotingContext.cs
Its use can be seen here (line 52, among others):
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders/blob/master/src/ServiceFabric.Remoting.CustomHeaders/ReliableServices/ExtendedServiceRemotingMessageDispatcher.cs
Here is a code snippet:
public override async Task<IServiceRemotingResponseMessage> HandleRequestResponseAsync(
    IServiceRemotingRequestContext requestContext,
    IServiceRemotingRequestMessage requestMessage)
{
    var header = requestMessage.GetHeader();
    RemotingContext.FromRemotingMessageHeader(header);
    // Some other code where the actual service is invoked (and where the
    // RemotingContext values can be referenced for the current call).
    return null;
}
The Garbage Collector is a strange beast, so it's worth getting to know about its behaviour. (Note, this is a simplistic view of the GC)
First of all, if just one reference exists to an object, that object is not considered garbage, so it will never be collected.
As your Dictionary is static, it is always going to exist, so anything contained within it will always have a reference to it. If there are no other references to the contained object and you remove it from the Dictionary, it will become unreachable and therefore garbage and will be collected. Making sure there are no references to your objects is the way to ensure they will be collected. It's very easy to forget about some reference you made somewhere, which keeps the object alive.
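For the dictionary in the question, that means clearing the per-call value (and, if it is no longer needed, the entry itself) when the invocation finishes. A rough sketch of that idea, using invented names (RemotingState, Set/Get/Clear) rather than the library's actual API:
using System.Collections.Concurrent;
using System.Threading;

public static class RemotingState
{
    // The static dictionary from the question: it roots everything it contains.
    private static readonly ConcurrentDictionary<string, AsyncLocal<object>> State =
        new ConcurrentDictionary<string, AsyncLocal<object>>();

    public static void Set(string key, object value) =>
        State.GetOrAdd(key, _ => new AsyncLocal<object>()).Value = value;

    public static object Get(string key) =>
        State.TryGetValue(key, out var local) ? local.Value : null;

    // Call this when the current invocation is finished, so the dictionary does
    // not keep the AsyncLocal wrappers (and their values) reachable forever.
    public static void Clear(string key)
    {
        if (State.TryGetValue(key, out var local))
            local.Value = null;          // drop the per-call value
        State.TryRemove(key, out _);     // and the entry itself, if nothing else needs it
    }
}
A dispatcher could then call Clear in a finally block around the actual service invocation, so the values never outlive the call.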
Secondly, the Garbage Collector doesn't run continuously. It runs when memory resources are getting low and it needs to release some memory. This is so that it doesn't hog the CPU to the detriment of the main applications. This means that objects can still be in memory for some time before the next Garbage Collection. This can make memory usage seem high at times, even when it isn't.
Thirdly, the Garbage Collector has "Generations". Generation 0 objects are the newest and most short-lived objects. Generation 0 objects are collected most often, but only when memory is needed.
Generation 1 contains short-lived objects that are on their way to becoming long-lived objects. They are collected less often than Generation 0 objects. In fact, Generation 1 collection only happens if a Generation 0 collection did not release enough memory.
Generation 2 objects are long-lived. These are typically things like static objects which exist for the lifetime of the application. Similarly, these are collected when the Generation 1 collection doesn't release enough memory.
Finally, there is the Large Object heap. Objects that consume a lot of memory take time to move around (part of the garbage collection process involves defragmenting the memory after collection has taken place), so they tend to remain uncollected unless collection didn't release enough memory. Some people refer to this as Generation 3, but they are actually collected in Generation 2 when necessary.
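If you want to see these generations in action, the GC class has a few diagnostic helpers. A small demo (the exact numbers can vary by runtime and GC configuration, and the explicit GC.Collect() is for illustration only):
using System;

class GenerationDemo
{
    static void Main()
    {
        var temp = new byte[1024];                 // small, freshly allocated object
        Console.WriteLine(GC.GetGeneration(temp)); // 0: brand new

        GC.Collect();                              // demo only; don't do this in real code
        Console.WriteLine(GC.GetGeneration(temp)); // 1: it survived, so it was promoted

        var big = new byte[100000];                // ~100 KB: allocated on the Large Object Heap
        Console.WriteLine(GC.GetGeneration(big));  // reported as generation 2

        // How many collections of each generation have run so far.
        Console.WriteLine($"Gen0: {GC.CollectionCount(0)}, Gen1: {GC.CollectionCount(1)}, Gen2: {GC.CollectionCount(2)}");
    }
}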
I'm in a situation where I have to create thousands of objects at once, and the cost of instantiating them and garbage collecting them is hurting the application's performance; the impact of the garbage collector running hurts even more because this is on older hardware, so I'm mostly trying to prevent the creation of garbage.
I believe a memory pool would solve my issue, but I'm not sure how the pool would know when a resource was freed up for re-use. The tricky part is that receivers of objects from the pool end up passing them around throughout the program, and it would be very difficult to know when they could be manually freed. I'd like something like a WeakReference, where I could know when nobody was using an object any more. But my understanding is that if I use a WeakReference in the memory pool, the objects would eventually get garbage collected out of the pool itself, and I need them to remain pretty much forever so they keep getting recycled. Sometimes the program can go for a while without needing the objects, so I imagine the garbage collector would collect them before the next time they were needed, and that would trigger another performance hit as another thousand of these objects were created.
Is there a way I can make sure these objects are never collected, but know when there are no references to them aside from the memory pool itself? Do I need to implement reference counting for these objects somehow?
I've been googling for a couple of hours now and have seen no implementation of a Memory Pool that doesn't require the user to let the memory pool know when they're done with it. I find it hard to believe there is no way to do this in C#.
Is there a way I can make sure these objects are never collected, but know when there are no references to them aside from the memory pool itself?
Usually an object pool only holds references to available objects (you can check the ObjectPool implementation in Roslyn). With that in mind, you can use the finalizer to resurrect the object and return it to the pool when it is unreachable.
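A rough sketch of what that resurrection pattern can look like (invented names, no error handling; as other answers here point out, this is not recommended unless you understand finalization semantics):
using System;
using System.Collections.Concurrent;

public sealed class PooledThing
{
    // The pool only holds instances that are currently available.
    private static readonly ConcurrentBag<PooledThing> Pool = new ConcurrentBag<PooledThing>();

    public static PooledThing Get() =>
        Pool.TryTake(out var item) ? item : new PooledThing();

    // The finalizer runs once the object has become unreachable. Instead of letting
    // it die, we "resurrect" it by putting it back in the pool, and re-register for
    // finalization so the same thing can happen the next time it becomes unreachable.
    ~PooledThing()
    {
        Reset();
        Pool.Add(this);
        GC.ReRegisterForFinalize(this);
    }

    private void Reset() { /* clear per-use state */ }
}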
However, I don't think it will improve performance. The whole pool will very soon reach generation 2, so the unreachable objects will need a full garbage collection before they are returned to the pool. Depending on the memory usage pattern of your program, that might not happen very often. You can call GC.Collect() and GC.WaitForPendingFinalizers(), of course, but that will also hurt performance. You can try it out and check whether it helps.
Another problem is the design - your objects are coupled to the pool.
I'd rather try to return the objects to the pool explicitly. Remember that not all objects have to be returned. If there are no more available objects, the pool can create new ones. The ones that were not returned will just be garbage collected. Check if there are some code paths where you are sure the objects are not needed anymore. If you can't find any, try to refactor your code.
Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap).
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack. I also know that C# does its own Garbage Collection (i.e. it determines when an instantiated class is no longer in use and reclaims the memory).
I have 2 questions:
Is my understanding of Garbage Collection correct?
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it.
I ask because I have a method in a For loop. Every time I go through a loop, I create a new instance of my Class. In my head I visualise all of these classes lying around in a heap, not doing anything but taking up memory and I want to get rid of them as quickly as I can to keep things neat and tidy!
Am I understanding this correctly or am I missing something?
Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap
I don't think you understand about the stack and the heap. If values live on the stack then where does an array of integers live? Integers are values. Are you telling me that an array of integers keeps its integers on the stack? When you return an array of integers from a method, say, with ten thousand integers in it, are you telling me that those ten thousand integers are copied onto the stack?
Values live on the stack when they live on the stack, and live on the heap when they live on the heap. The idea that the type of a thing has to do with the lifetime of its storage is nonsense. Storage locations that are short lived go on the stack; storage locations that are long lived go on the heap, and that is independent of their type. A long-lived int has to go on the heap, same as a long-lived instance of a class.
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack.
Why does the reference have to go on the stack? Again, the lifetime of the storage of the reference has nothing to do with its type. If the storage of the reference is long-lived then the reference goes on the heap.
I also know that C# does its own Garbage Collection (i.e. it determines when an instantiated class is no longer in use and reclaims the memory).
The C# language does not do so; the CLR does so.
Is my understanding of Garbage Collection correct?
You seem to believe a lot of lies about the stack and the heap, so odds are good no, it's not.
Can I do my own?
Not in C#, no.
I ask because I have a method in a For loop. Every time I go through a loop, I create a new instance of my Class. In my head I visualise all of these classes lying around in a heap, not doing anything but taking up memory and I want to get rid of them as quickly as I can to keep things neat and tidy!
The whole point of garbage collection is to free you from worrying about tidying up. That's why it's called "automatic garbage collection". It tidies for you.
If you are worried that your loops are creating collection pressure, and you wish to avoid collection pressure for performance reasons then I advise that you pursue a pooling strategy. It would be wise to start with an explicit pooling strategy; that is:
while (whatever)
{
    Frob f = FrobPool.FetchFromPool();
    f.Blah();
    FrobPool.ReturnToPool(f);
}
rather than attempting to do automatic pooling using a resurrecting finalizer. I advise against both finalizers and object resurrection in general unless you are an expert on finalization semantics.
The pool of course allocates a new Frob if there is not one in the pool. If there is one in the pool, then it hands it out and removes it from the pool until it is put back in. (If you forget to put a Frob back in the pool, the GC will get to it eventually.) By pursuing a pooling strategy you cause the GC to eventually move all the Frobs to the generation 2 heap, instead of creating lots of collection pressure in the generation 0 heap. The collection pressure then disappears because no new Frobs are allocated. If something else is producing collection pressure, the Frobs are all safely in the gen 2 heap where they are rarely visited.
This of course is the exact opposite of the strategy you described; the whole point of the pooling strategy is to cause objects to hang around forever. Objects hanging around forever is a good thing if you're going to use them.
Of course, do not make these sorts of changes before you know via profiling that you have a performance problem due to collection pressure! It is rare to have such a problem on the desktop CLR; it is rather more common on the compact CLR.
More generally, if you are the kind of person who feels uncomfortable having a memory manager clean up for you on its schedule, then C# is not the right language for you. Consider C instead.
values live on the Stack, references on the Heap
This is an implementation detail. There is nothing to stop a .NET implementation from storing both on the stack.
I also know that C# does it's own Garbage Collection
C# has nothing to do with this. This is a service provided by the CLR. VB.NET, F#, etc all still have garbage collection.
The CLR will remove an object from memory if it has no strong roots. For example, when your class instance goes out of scope in your for loop. There will be a few lying around, but they will get collected eventually, either by garbage collection or the program terminating.
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it?
You can use GC.Collect to force a collection. You should not do it, because it is an expensive operation - more expensive than letting a few objects occupy memory a little longer than absolutely needed. The GC is incredibly good at what it does on its own. You will also force short-lived objects to be promoted to generations they wouldn't normally reach.
First off, see Eric's seminal post, The truth about value types.
Secondly, on garbage collection: the collector knows far more about your running program than you do; don't try to second-guess it unless you're in the incredibly unlikely situation of having a memory leak.
So to your second question, no don't try to "help" the GC.
I'll find a post to this effect on the GC and update this answer.
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it.
Yes you can with GC.Collect but you shouldn't. The GC is optimized for variables that are short lived, ones in a method, and variables that are long lived, ones that generally stick around for the life time of the application.
Variables that are in-between aren't as common and aren't really optimum for the GC.
By forcing a GC.Collect you're more likely to cause variables in scope to be forced into that in-between state, which is the opposite of what you are trying to accomplish.
Also from the MSDN article Writing High-Performance Managed Applications: A Primer:
The GC is self-tuning and will adjust itself according to application memory requirements. In most cases programmatically invoking a GC will hinder that tuning. "Helping" the GC by calling GC.Collect will more than likely not improve your application's performance.
Your understanding of Garbage Collection is good enough. Essentially, an unreferenced instance is deemed as being out-of-scope and no longer needed. Having determined this, the collector will remove an unreferenced object at some future point.
There's no way to force the Garbage Collector to collect just a specific instance. You can ask it to do its normal "collect everything possible" operation with GC.Collect(), but you shouldn't; the garbage collector is efficient and effective if you just leave it to its own devices.
In particular it excels at collecting objects which have a short lifespan, just like those that are created as temporary objects. You shouldn't have to worry about creating loads of objects in a loop, unless they have a long lifespan that prevents immediate collection.
Please see this related question with regard to the Stack and Heap.
In your specific scenario, agreed, if you new up objects in a for-loop then you're going to have sub-optimal performance. Are the objects stored (or otherwise used) within the loop, or are they discarded? If the latter, can you optimize this by newing up one object outside the loop and re-using it?
With regard to can you implement your own GC, there is no explicit delete keyword in C#, you have to leave it to the CLR. You can however give it hints such as when to collect, or what to ignore during collection, however I'd leave that unless absolutely necessary.
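For completeness, the kind of hints meant here are things like GC.AddMemoryPressure (which influences when the GC decides to collect) and GC.SuppressFinalize (which tells it a finalizer no longer needs to run). A small, hypothetical example of a wrapper around a large unmanaged buffer:
using System;

sealed class NativeBufferWrapper : IDisposable
{
    private const long UnmanagedBytes = 50 * 1024 * 1024;

    public NativeBufferWrapper()
    {
        // Hint: this small managed object keeps ~50 MB of unmanaged memory alive,
        // so the GC should weigh it accordingly when scheduling collections.
        GC.AddMemoryPressure(UnmanagedBytes);
    }

    public void Dispose()
    {
        GC.RemoveMemoryPressure(UnmanagedBytes);
        GC.SuppressFinalize(this);   // hint: the finalizer no longer needs to run
    }

    ~NativeBufferWrapper()
    {
        GC.RemoveMemoryPressure(UnmanagedBytes);
    }
}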
Best regards,
Read the following article by Microsoft to get a good grounding in Garbage Collection in C#. I'm sure it'll help anyone who needs information regarding this matter.
Memory Management and Garbage Collection in the .NET Framework
If you are interested in the performance of some areas of your code when writing C#, you can write unsafe code. You will get some extra performance, and within a fixed block the objects you pin will not be moved by the garbage collector.
Garbage collection is basically reference tracking. I can't think of any good reason why you would want to change it. Are you having some sort of problem where you find that memory isn't being freed? Or maybe you are looking for the dispose pattern
Edit:
Replaced "reference counting" with "reference tracking" to not be confused with the Increment/Decrement Counter on object Reference/Dereference (eg from Python).
I thought it was pretty common to refer to tracing the object graph as "counting", as in this answer:
Why no Reference Counting + Garbage Collection in C#?
But I will not pick up the glove of (the) Eric Lippert :)
I have a cache that uses WeakReferences to the cached objects to make them automatically removed from the cache in case of memory pressure. My problem is that the cached objects are collected very soon after they have been stored in the cache. The cache runs in a 64-bit application, and even though more than 4 GB of memory is still available, all the cached objects are collected (they are usually stored in the G2 heap at that moment). There are no manually induced garbage collections, as Process Explorer shows.
What methods can I apply to make the objects live a little longer?
Using WeakReferences as the primary means of referencing cached objects is not really a great idea, because, as Josh said, you're at the mercy of any future behavioral changes to WeakReference and the GC.
However, if your cache needs any kind of resurrection capability, use of WeakReferences for items that are pending purge is useful. When an item meets eviction criteria, rather than immediately evicting it, you change its reference to a weak reference. If anything requests it before it is GC'ed, you restore its strong reference, and the object can live again. I have found this useful for some caches that have hard to predict hit rate patterns with frequent enough "resurrections" to be beneficial.
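A simplified, non-thread-safe sketch of that idea, with invented names:
using System;
using System.Collections.Generic;

public sealed class ResurrectingCache<TKey, TValue> where TValue : class
{
    private readonly Dictionary<TKey, TValue> live = new Dictionary<TKey, TValue>();
    private readonly Dictionary<TKey, WeakReference<TValue>> evicted =
        new Dictionary<TKey, WeakReference<TValue>>();

    public void Add(TKey key, TValue value) => live[key] = value;

    // Instead of dropping the item outright, demote it to a weak reference.
    public void Evict(TKey key)
    {
        if (live.TryGetValue(key, out var value))
        {
            live.Remove(key);
            evicted[key] = new WeakReference<TValue>(value);
        }
    }

    public bool TryGet(TKey key, out TValue value)
    {
        if (live.TryGetValue(key, out value))
            return true;

        // Not strongly cached any more: has the GC taken it yet?
        if (evicted.TryGetValue(key, out var weak) && weak.TryGetTarget(out value))
        {
            live[key] = value;       // resurrect: restore the strong reference
            evicted.Remove(key);
            return true;
        }

        evicted.Remove(key);         // dead (or absent) entry; tidy up
        value = null;
        return false;
    }
}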
If you have predictable hit rate patterns, then I would forgo the WeakReference option and perform explicit evictions.
There is one situation where a WeakReference-based cache may be good: when the usefulness of an item in the cache is predicated upon the existence of a reference to it. In such a situation, a weak interning cache may be useful. For example, suppose one had an application which would deserialize many large immutable objects, many of which were expected to be duplicates, and would have to perform many comparisons between them. If X and Y are references to some immutable class type, testing X.Equals(Y) will be very fast if both variables point to the same instance, but may be very slow if they point to distinct instances that happen to be equal. If a deserialized object happens to match another object to which a reference already exists, fetching from the dictionary a reference to that latter object (requiring one slow comparison) may expedite future comparisons.
On the other hand, if it matched an item in the dictionary but the dictionary was the only reference to that item, there would be little advantage to using the dictionary object instead of simply keeping the object that was read in; probably not enough advantage to justify the cost of the comparison. For an interning cache, having WeakReferences get invalidated as soon as possible once no other references exist to an object would be a good thing.
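A sketch of such a weak interning cache (invented name, not thread-safe, and assuming the type has meaningful Equals/GetHashCode overrides):
using System;
using System.Collections.Generic;

public sealed class WeakInterner<T> where T : class
{
    private readonly Dictionary<int, List<WeakReference<T>>> buckets =
        new Dictionary<int, List<WeakReference<T>>>();

    // Returns an already-known instance equal to the candidate if one is still
    // alive; otherwise remembers the candidate weakly and returns it. The cache
    // itself never keeps an object alive.
    public T Intern(T candidate)
    {
        int hash = candidate.GetHashCode();
        if (!buckets.TryGetValue(hash, out var bucket))
            buckets[hash] = bucket = new List<WeakReference<T>>();

        for (int i = bucket.Count - 1; i >= 0; i--)
        {
            if (bucket[i].TryGetTarget(out var existing))
            {
                if (existing.Equals(candidate))
                    return existing;     // reuse: future comparisons are reference-fast
            }
            else
            {
                bucket.RemoveAt(i);      // the weakly referenced object died; prune it
            }
        }

        bucket.Add(new WeakReference<T>(candidate));
        return candidate;
    }
}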
In .net, a WeakReference is not considered a reference from the GC standpoint at all, so any object that only has weak references will be collected in the next GC run (for the appropriate generation).
That makes weak references completely inappropriate for caching - as your experience shows.
You need a "real" cache component, and the most important thing about caching is to get one where the eviction policy (that is, the rules about when to drop an object from the cache) is a good match for your application's usage pattern.
No, WeakReference is not good for that because the behavior of the garbage collector can and will change over time and your cache should not be relying on today's behavior. Also many factors outside of your control could affect memory pressure.
There are many implementations of a cache for .NET. You could find probably a dozen on CodePlex. I guess what you need to add to it is something that looks at the application's current working set to use that as a trigger for purging.
One more note about why your objects are being collected so frequently. The GC is very aggressive at cleaning up Gen0 objects. If your objects are very short-lived (up until the only reference to it is a weak reference) then the GC is doing what it's designed to do by cleaning up as quickly as it can.
I believe the problem you are having is that the Garbage Collector removes weakly referenced objects not only in response to memory pressure - it will sometimes collect quite aggressively just because the runtime thinks some objects may have become unreachable.
You may be better off using e.g. System.Runtime.Caching.MemoryCache which can be configured with a memory limit, or custom eviction policies for the items.
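For example, with System.Runtime.Caching (the limits and expirations below are just illustrative values):
using System;
using System.Collections.Specialized;
using System.Runtime.Caching;

static class CacheSetup
{
    // Cap the cache at 200 MB or 25% of physical memory, checked every 2 minutes.
    static readonly MemoryCache Cache = new MemoryCache("MyCache",
        new NameValueCollection
        {
            { "CacheMemoryLimitMegabytes", "200" },
            { "PhysicalMemoryLimitPercentage", "25" },
            { "PollingInterval", "00:02:00" }
        });

    public static void Store(string key, object value)
    {
        // Items expire 10 minutes after they were last touched, independently of the GC.
        Cache.Set(key, value, new CacheItemPolicy { SlidingExpiration = TimeSpan.FromMinutes(10) });
    }

    public static object Load(string key) => Cache.Get(key);
}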
The answer actually depends on the usage characteristics of the cache you are trying to build. I have successfully used a WeakReference-based caching strategy to improve performance in many of my projects where the cached objects are expected to be used in short bursts of multiple reads. As others pointed out, weakly referenced objects are pretty much garbage from the GC's point of view and will be collected whenever the next GC cycle runs for the appropriate generation. It has nothing to do with memory utilization.
If, however, you need a cache that survives such brutality from GC, you need to use or mimic the functionality provided by System.Runtime.Caching namespace. Keep in mind that you'd need an additional thread that cleans up the cache when the memory usage is crossing your thresholds.
A bit late, but here's a relevant use case:
I need to cache two types of objects: large (deserialised) data files that take 10 minutes to load and cost 15 GB of RAM each, and smaller (dynamically compiled) objects that contain internal references to those data files (the smaller objects are also cached because they take ~10s to generate). These caches are hidden within the factories that supply the objects (the former component having no knowledge of the latter), and have different eviction policies.
When my "data file" cache evicts an object, it replaces it with a weak reference, so if that object is still available when next requested, we can resurrect it (and renew its cache timeout). In this way we avoid losing (or accidentally duplicating) any object before it is truly defunct (i.e. not used anywhere else). Notice that neither cache is required to be aware of the other, and that no other client objects need to be aware that there are any caches at all (e.g. we avoid needing "keepalives", callbacks, registration, retrieve-and-return scopes, etc. - things get a lot simpler).
So although using WeakReference by itself (instead of a cache) is a terrible idea (because modern GCs are typically tuned to the size of the L2 CPU cache, and regular code will burn through this many times per minute), it's very useful as a way to hide your caches from the rest of your code.
I need to dispose of an object so it can release everything it owns, but it doesn't implement the IDisposable so I can't use it in a using block. How can I make the garbage collector collect it?
You can force a collection with GC.Collect(). Be very careful using this, since a full collection can take some time. The best-practice is to just let the GC determine when the best time to collect is.
Does the object contain unmanaged resources but does not implement IDisposable? If so, it's a bug.
If it doesn't, it shouldn't matter if it gets released right away, the garbage collector should do the right thing.
If it "owns" anything other than memory, you need to fix the object to use IDisposable. If it's not an object you control this is something worth picking a different vendor over, because it speaks to the core of how well your vendor really understands .Net.
If it does just own memory, even a lot of it, all you have to do is make sure the object goes out of scope. Don't call GC.Collect() — it's one of those things that if you have to ask, you shouldn't do it.
You can't perform garbage collection on a single object. You could request a garbage collection by calling GC.Collect(), but this will affect all objects subject to cleanup. It is also highly discouraged, as it can have a negative effect on the performance of later collections.
Also, calling Dispose on an object does not clean up its memory. It only allows the object to release its unmanaged resources. For example, calling Dispose on a StreamWriter closes the stream and releases the Windows file handle. The memory for the object on the managed heap does not get reclaimed until a subsequent garbage collection.
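For reference, this is the usual shape of the dispose pattern being suggested above (the class and handle field are invented for illustration):
using System;

public class NativeHandleOwner : IDisposable
{
    private IntPtr handle;         // e.g. a Windows file handle
    private bool disposed;

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // the finalizer safety net is no longer needed
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;
        if (disposing)
        {
            // release other managed IDisposable fields here
        }
        // release the unmanaged resource here, e.g. CloseHandle(handle)
        handle = IntPtr.Zero;
        disposed = true;
    }

    // Safety net if Dispose is never called; note that the managed memory itself
    // is still reclaimed only by a later garbage collection.
    ~NativeHandleOwner()
    {
        Dispose(false);
    }
}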
Chris Sells also discussed this on .NET Rocks. I think it was during his first appearance but the subject might have been revisited in later interviews.
http://www.dotnetrocks.com/default.aspx?showNum=10
This article by Francesco Balena is also a good reference:
When and How to Use Dispose and Finalize in C#
http://www.devx.com/dotnet/Article/33167/0/page/1
Garbage collection in .NET is non deterministic, meaning you can't really control when it happens. You can suggest, but that doesn't mean it will listen.
Tell us a little bit more about the object and why you want to do this; we can make some suggestions based off of that. Code always helps. Depending on the object, there might be a Close method or something similar; maybe the usage is to call that. If there is no Close or Dispose type of method, you probably don't want to rely on that object, as you will probably get memory leaks if it does in fact contain resources which need to be released.
If the object goes out of scope and has no external references, it will be collected rather quickly (likely on the next collection).
BEWARE of fragmentation: in many cases, GC.Collect() or Dispose is not very helpful, especially for large objects (the LOH is for objects of roughly 80 KB and up, performs no compaction, and is subject to high levels of fragmentation for many common use cases), which then leads to out-of-memory (OOM) issues even with potentially hundreds of MB free. As time marches on, things get bigger, though perhaps not the ~80 KB threshold for LOH-relegated objects; high degrees of parallelism exacerbate this issue, simply because more objects (likely varying in size) are instantiated and released in less time.
Arrays are the usual suspects for this problem (it's also often hard to identify due to non-specific exceptions and assertions from the runtime; something like "high % of large object heap fragmentation" would be swell); the prognosis for code suffering from this problem is to implement an aggressive re-use strategy.
The ObjectPool class in System.Collections.Concurrent from the parallel extensions beta 1 samples helps (unfortunately there is not a simple ubiquitous pattern which I have seen, like maybe some attached property/extension methods?). It is simple enough to drop in or re-implement for most projects: you assign a generator Func<> and use Get/Put helper methods to re-use your previous objects and forgo the usual garbage collection. It is usually sufficient to focus on arrays and not the individual array elements.
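A stripped-down stand-in for that kind of pool (the real sample is more sophisticated) looks roughly like this; reusing one large buffer instead of repeatedly allocating it avoids churning the LOH:
using System;
using System.Collections.Concurrent;

public sealed class SimpleObjectPool<T>
{
    private readonly ConcurrentBag<T> items = new ConcurrentBag<T>();
    private readonly Func<T> generator;

    public SimpleObjectPool(Func<T> generator) => this.generator = generator;

    public T Get() => items.TryTake(out var item) ? item : generator();

    public void Put(T item) => items.Add(item);
}

// Usage: re-use one big array instead of allocating a new one on the LOH each time.
// var pool = new SimpleObjectPool<byte[]>(() => new byte[1 << 20]);
// var buffer = pool.Get();
// ... use buffer ...
// pool.Put(buffer);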
It would be nice if .NET 4 updated all of the .ToArray() methods everywhere to include .ToArray(T target).
Getting the hang of using SOS/windbg (.loadby sos clr for CLR v4) to analyze this class of issue can help. Thinking about it, the current garbage collection system is more like garbage recycling (using the same physical memory again); ObjectPool is analogous to garbage re-using. If anybody remembers the 3 R's, reducing your memory use is a good idea too, for performance's sake ;)