I need to create thousands of objects at once, and the cost of instantiating and garbage collecting them is hurting the application's performance. The garbage collector running hurts even more because this is on older hardware, so I'm mostly trying to prevent the creation of garbage. I believe a memory pool would solve my issue, but I'm not sure how the pool would know when one of its resources was freed up for re-use. The tricky part is that receivers of objects from the pool end up passing them around throughout the program, and it would be very difficult to know when an object could be manually freed. I'd like it to work like a WeakReference, where I could know when nobody was using the object anymore. But my understanding is that if I use a WeakReference in the memory pool, the object would eventually be garbage collected from the pool itself, and I need these objects to remain pretty much forever so they keep getting recycled. Sometimes the program can go for a while without needing the objects, so I imagine the garbage collector would collect them before they were next needed, and that would trigger another performance hit as another thousand of these objects were made.
Is there a way I can make sure these objects are never collected, but know when there are no references to them aside from the memory pool itself? Do I need to implement reference counting for these objects somehow?
I've been googling for a couple of hours now and have seen no implementation of a Memory Pool that doesn't require the user to let the memory pool know when they're done with it. I find it hard to believe there is no way to do this in C#.
Is there a way I can make sure these objects are never collected, but know when there are no references to them aside from the memory pool itself?
Usually an object pool only holds references to available objects (you can check ObjectPool implementation in Roslyn). With that in mind, you can use the finalizer to resurrect the object and return it to the pool when it is unreachable.
However, I don't think it will improve performance. The whole pool will very soon reach generation 2, so the unreachable objects will need a full garbage collection to be returned to the pool. Depending on the memory usage pattern in your program, that might not happen very often. You can call GC.Collect() and GC.WaitForPendingFinalizers() of course, but that will also hurt performance. You can try it out and check if it helps.
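For experimentation only, here is a minimal sketch of the resurrecting-finalizer idea described above. The names PooledItem, Rent, Payload and AvailableCount are hypothetical and are not taken from Roslyn's ObjectPool. The sketch assumes the pool holds only available instances, so a rented instance becomes unreachable once every caller drops it.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical pooled type: the pool references only *available* instances,
// so a rented instance becomes unreachable once callers drop it.
public sealed class PooledItem
{
    private static readonly ConcurrentBag<PooledItem> Available =
        new ConcurrentBag<PooledItem>();

    public int Payload { get; set; }

    private PooledItem() { }

    public static int AvailableCount => Available.Count;

    // Hand out a recycled instance if one exists, otherwise allocate.
    public static PooledItem Rent()
    {
        return Available.TryTake(out PooledItem item) ? item : new PooledItem();
    }

    // Runs on the finalizer thread after a GC finds the instance unreachable.
    ~PooledItem()
    {
        if (Environment.HasShutdownStarted)
        {
            return; // let the object die during shutdown
        }

        Payload = 0;                     // reset state before reuse
        Available.Add(this);             // resurrection: reachable again
        GC.ReRegisterForFinalize(this);  // the finalizer must be re-armed each time
    }
}
```

An instance only comes back after a collection of the generation it lives in, followed by a finalizer pass, which is why the generation 2 concern above applies.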
Another problem is the design - your objects are coupled to the pool.
I'd rather try to return the objects to the pool explicitly. Remember that not all objects have to be returned. If there are no more available objects, the pool can create new ones. The ones that were not returned will just be garbage collected. Check if there are some code paths where you are sure the objects are not needed anymore. If you can't find any, try to refactor your code.
I have an app that seems to accumulate a lot of memory.
One of the suspects is below, and I'm just trying to wrap my head around what it is actually doing. Or, more specifically, how is it cleaned up?
private static readonly ConcurrentDictionary<string, AsyncLocal<object>> State;
Problem context:
The idea is to simulate what OperationContext in WCF would do - provide static access to information about the current call. I am doing this inside a Service Fabric remoting service.
Can someone help me understand the nature of this in terms of what happens to the AsyncLocal<object> once the async call ends? I see it hanging around in memory but can't tell if it is a memory leak, or if the GC just hasn't reclaimed it yet.
I know the static dictionary stays around, but do the values also, or do I need to be manually clearing those before my current service invocation completes to ensure no memory leak here?
*Edit - Here is some more info as requested by Pavel.
Posting relevant code below, but the whole picture is here.
Github where the general idea came from. They are trying to make headers work in ServiceFabric/.net core like they used to in old WCF.
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders
The RemotingContext object is here:
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders/blob/master/src/ServiceFabric.Remoting.CustomHeaders/RemotingContext.cs
Its use can be seen here (line 52, among others):
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders/blob/master/src/ServiceFabric.Remoting.CustomHeaders/ReliableServices/ExtendedServiceRemotingMessageDispatcher.cs
Here is a code snippet:
public override async Task<IServiceRemotingResponseMessage> HandleRequestResponseAsync(
    IServiceRemotingRequestContext requestContext,
    IServiceRemotingRequestMessage requestMessage)
{
    var header = requestMessage.GetHeader();
    RemotingContext.FromRemotingMessageHeader(header);
    // Some other code where the actual service is invoked (and where
    // RemotingContext values can be referenced for the current call).
    return null; // placeholder for the real response
}
The Garbage Collector is a strange beast, so it's worth getting to know about its behaviour. (Note, this is a simplistic view of the GC)
First of all, as long as at least one reachable reference to an object exists, that object is not considered garbage, so it will not be collected.
As your Dictionary is static, it is always going to exist, so anything contained within it will always have a reference to it. If there are no other references to the contained object and you remove it from the Dictionary, it will become unreachable and therefore garbage and will be collected. Making sure there are no references to your objects is the way to ensure they will be collected. It's very easy to forget about some reference you made somewhere, which keeps the object alive.
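A minimal sketch of what explicit cleanup of such a static dictionary could look like. CallState and its members are hypothetical stand-ins, not the RemotingContext API from the linked repository. The sketch only shows that an entry stays in the static dictionary until something removes it.

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for RemotingContext-style static state.
public static class CallState
{
    private static readonly ConcurrentDictionary<string, AsyncLocal<object>> State =
        new ConcurrentDictionary<string, AsyncLocal<object>>();

    public static int KeyCount => State.Count;

    public static void Set(string key, object value)
    {
        // One AsyncLocal per key; it stays in the dictionary until removed.
        State.GetOrAdd(key, _ => new AsyncLocal<object>()).Value = value;
    }

    public static object Get(string key)
    {
        return State.TryGetValue(key, out AsyncLocal<object> slot) ? slot.Value : null;
    }

    // Clears the value seen by the current async flow.
    public static void Clear(string key)
    {
        if (State.TryGetValue(key, out AsyncLocal<object> slot))
        {
            slot.Value = null;
        }
    }

    // Removes the entry so the dictionary no longer references the AsyncLocal.
    public static bool Remove(string key) => State.TryRemove(key, out _);
}
```

Calling Clear in a finally block at the end of each service invocation is one way to avoid depending on when the runtime happens to release the per-call value.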
Secondly, the Garbage Collector doesn't run continuously. It runs when memory resources are getting low and it needs to release some memory. This is so that it doesn't hog the CPU to the detriment of the main application. This means that objects can remain in memory for some time before the next Garbage Collection, which can make memory usage seem high at times, even when it isn't.
Thirdly, the Garbage Collector has "Generations". Generation 0 objects are the newest and most short-lived objects. Generation 0 objects are collected most often, but only when memory is needed.
Generation 1 contains short-lived objects that are on their way to becoming long-lived objects. They are collected less often than Generation 0 objects. In fact, Generation 1 collection only happens if a Generation 0 collection did not release enough memory.
Generation 2 objects are long-lived. These are typically things like static objects which exist for the lifetime of the application. Similarly, these are collected when the Generation 1 collection doesn't release enough memory.
Finally, there is the Large Object heap. Objects that consume a lot of memory take time to move around (part of the garbage collection process involves defragmenting the memory after collection has taken place), so they tend to remain uncollected unless collection didn't release enough memory. Some people refer to this as Generation 3, but they are actually collected in Generation 2 when necessary.
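The generations described above can be observed from code. A small sketch, assuming nothing beyond the System.GC class; GenerationProbe and Describe are hypothetical names, and the exact numbers reported depend on the runtime and on when collections happen to run.

```csharp
using System;

public static class GenerationProbe
{
    // Long-lived object rooted by a static field.
    private static readonly object LongLived = new object();

    public static string Describe()
    {
        var fresh = new object();       // small, newly allocated
        var large = new byte[200_000];  // big enough for the large object heap

        return
            $"max generation: {GC.MaxGeneration}, " +
            $"fresh: gen {GC.GetGeneration(fresh)}, " +
            $"static: gen {GC.GetGeneration(LongLived)}, " +
            $"large array: gen {GC.GetGeneration(large)}, " +
            $"gen 0 collections so far: {GC.CollectionCount(0)}";
    }
}
```

On typical runtimes the large array is usually reported in the highest generation, which fits the point that the large object heap is collected together with Generation 2.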
Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap).
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack. I also know that C# does its own Garbage Collection (i.e. it determines when an instantiated class is no longer in use and reclaims the memory).
I have 2 questions:
Is my understanding of Garbage Collection correct?
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it.
I ask because I have a method in a For loop. Every time I go through a loop, I create a new instance of my Class. In my head I visualise all of these classes lying around in a heap, not doing anything but taking up memory and I want to get rid of them as quickly as I can to keep things neat and tidy!
Am I understanding this correctly or am I missing something?
Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap).
I don't think you understand about the stack and the heap. If values live on the stack then where does an array of integers live? Integers are values. Are you telling me that an array of integers keeps its integers on the stack? When you return an array of integers from a method, say, with ten thousand integers in it, are you telling me that those ten thousand integers are copied onto the stack?
Values live on the stack when they live on the stack, and live on the heap when they live on the heap. The idea that the type of a thing has to do with the lifetime of its storage is nonsense. Storage locations that are short lived go on the stack; storage locations that are long lived go on the heap, and that is independent of their type. A long-lived int has to go on the heap, same as a long-lived instance of a class.
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack.
Why does the reference have to go on the stack? Again, the lifetime of the storage of the reference has nothing to do with its type. If the storage of the reference is long-lived then the reference goes on the heap.
I also know that C# does its own Garbage Collection (i.e. it determines when an instantiated class is no longer in use and reclaims the memory).
The C# language does not do so; the CLR does so.
Is my understanding of Garbage Collection correct?
You seem to believe a lot of lies about the stack and the heap, so odds are good no, it's not.
Can I do my own?
Not in C#, no.
I ask because I have a method in a For loop. Every time I go through a loop, I create a new instance of my Class. In my head I visualise all of these classes lying around in a heap, not doing anything but taking up memory and I want to get rid of them as quickly as I can to keep things neat and tidy!
The whole point of garbage collection is to free you from worrying about tidying up. That's why it's called "automatic garbage collection". It tidies for you.
If you are worried that your loops are creating collection pressure, and you wish to avoid collection pressure for performance reasons then I advise that you pursue a pooling strategy. It would be wise to start with an explicit pooling strategy; that is:
while (whatever)
{
    Frob f = FrobPool.FetchFromPool();
    f.Blah();
    FrobPool.ReturnToPool(f);
}
rather than attempting to do automatic pooling using a resurrecting finalizer. I advise against both finalizers and object resurrection in general unless you are an expert on finalization semantics.
The pool of course allocates a new Frob if there is not one in the pool. If there is one in the pool, then it hands it out and removes it from the pool until it is put back in. (If you forget to put a Frob back in the pool, the GC will get to it eventually.) By pursuing a pooling strategy you cause the GC to eventually move all the Frobs to the generation 2 heap, instead of creating lots of collection pressure in the generation 0 heap. The collection pressure then disappears because no new Frobs are allocated. If something else is producing collection pressure, the Frobs are all safely in the gen 2 heap where they are rarely visited.
This of course is the exact opposite of the strategy you described; the whole point of the pooling strategy is to cause objects to hang around forever. Objects hanging around forever is a good thing if you're going to use them.
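A minimal sketch of what such an explicit pool could look like. Frob, FrobPool, FetchFromPool, ReturnToPool and Blah come from the loop above; BlahCount, Reset, Count and the lock-protected stack are assumptions made for illustration.

```csharp
using System.Collections.Generic;

// Frob stands in for whatever expensive-to-allocate type is being pooled.
public sealed class Frob
{
    public int BlahCount { get; private set; }

    public void Blah() => BlahCount++;

    // Hypothetical reset hook so a recycled Frob starts clean.
    public void Reset() => BlahCount = 0;
}

public static class FrobPool
{
    private static readonly Stack<Frob> Pool = new Stack<Frob>();
    private static readonly object Gate = new object();

    public static int Count
    {
        get { lock (Gate) { return Pool.Count; } }
    }

    public static Frob FetchFromPool()
    {
        lock (Gate)
        {
            // Hand out a pooled Frob if there is one; otherwise allocate.
            return Pool.Count > 0 ? Pool.Pop() : new Frob();
        }
    }

    public static void ReturnToPool(Frob frob)
    {
        frob.Reset(); // wipe per-use state before the next caller sees it
        lock (Gate)
        {
            Pool.Push(frob);
        }
    }
}
```

A Frob that is never returned is simply collected like any other object, so a forgotten ReturnToPool call costs one future allocation and nothing more.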
Of course, do not make these sorts of changes before you know via profiling that you have a performance problem due to collection pressure! It is rare to have such a problem on the desktop CLR; it is rather more common on the compact CLR.
More generally, if you are the kind of person who feels uncomfortable having a memory manager clean up for you on its schedule, then C# is not the right language for you. Consider C instead.
values live on the Stack, references on the Heap
This is an implementation detail. There is nothing to stop a .NET Framework from storing both on the stack.
I also know that C# does its own Garbage Collection
C# has nothing to do with this. This is a service provided by the CLR. VB.NET, F#, etc all still have garbage collection.
The CLR will remove an object from memory if it has no strong roots. For example, when your class instance goes out of scope in your for loop. There will be a few lying around, but they will get collected eventually, either by garbage collection or the program terminating.
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it?
You can use GC.Collect to force a collection. You should not do it because it is an expensive operation. More expensive than letting a few objects occupy memory a little bit longer than they are absolutely needed. The GC is incredibly good at what it does on its own. You will also force short lived objects to promote to generations they wouldn't get normally.
First off, see Eric's seminal post, The Truth About Value Types.
Secondly on Garbage collection, the collector knows far more about your running program than you do, don't try to second guess it unless you're in the incredibly unlikely situation that you have a memory leak.
So to your second question, no don't try to "help" the GC.
I'll find a post to this effect on the GC and update this answer.
Can I do my own? If so is there any real benefit to doing this myself or should I just leave it.
Yes you can with GC.Collect but you shouldn't. The GC is optimized for variables that are short lived, ones in a method, and variables that are long lived, ones that generally stick around for the life time of the application.
Variables that are in-between aren't as common and aren't really optimum for the GC.
By forcing a GC.Collect you're more likely to push variables that are still in scope into that in-between state, which is the opposite of what you are trying to accomplish.
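The promotion effect can be observed directly. A small sketch, where PromotionDemo and Run are hypothetical names; the generation numbers you see depend on the runtime, so the code returns them without assuming specific values.

```csharp
using System;

public static class PromotionDemo
{
    // Returns the generation of a still-referenced object before and after
    // a forced collection.
    public static (int Before, int After) Run()
    {
        var survivor = new object();
        int before = GC.GetGeneration(survivor);

        GC.Collect(); // forced collection while survivor is still referenced

        int after = GC.GetGeneration(survivor);
        GC.KeepAlive(survivor); // keep the reference live across the collection
        return (before, after);
    }
}
```

If After is higher than Before, the forced collection promoted an object that might otherwise have died cheaply in generation 0.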
Also from the MSDN article Writing High-Performance Managed Applications : A Primer
The GC is self-tuning and will adjust itself according to applications
memory requirements. In most cases programmatically invoking a GC will
hinder that tuning. "Helping" the GC by calling GC.Collect will more
than likely not improve your applications performance
Your understanding of Garbage Collection is good enough. Essentially, an unreferenced instance is deemed as being out-of-scope and no longer needed. Having determined this, the collector will remove an unreferenced object at some future point.
There's no way to force the Garbage Collector to collect just a specific instance. You can ask it to do its normal "collect everything possible" operation with GC.Collect(), but you shouldn't; the garbage collector is efficient and effective if you just leave it to its own devices.
In particular it excels at collecting objects which have a short lifespan, just like those that are created as temporary objects. You shouldn't have to worry about creating loads of objects in a loop, unless they have a long lifespan that prevents immediate collection.
Please see this related question with regard to the Stack and Heap.
In your specific scenario, agreed, if you new up objects in a for-loop then you're going to have sub-optimal performance. Are the objects stored (or otherwise used) within the loop, or are they discarded? If the latter, can you optimize this by newing up one object outside the loop and re-using it?
With regard to can you implement your own GC, there is no explicit delete keyword in C#, you have to leave it to the CLR. You can however give it hints such as when to collect, or what to ignore during collection, however I'd leave that unless absolutely necessary.
Read the following article by Microsoft to get a working knowledge of Garbage Collection in C#. I'm sure it'll help anyone who needs information on this matter.
Memory Management and Garbage Collection in the .NET Framework
If you are concerned about the performance of some areas of your code when writing C#, you can write unsafe code. You may gain some performance, and inside a fixed block the garbage collector will not move the pinned object (although a collection can still occur).
Garbage collection is basically reference tracking. I can't think of any good reason why you would want to change it. Are you having some sort of problem where you find that memory isn't being freed? Or maybe you are looking for the dispose pattern?
Edit:
Replaced "reference counting" with "reference tracking" to not be confused with the Increment/Decrement Counter on object Reference/Dereference (eg from Python).
I thought it was pretty common to refer to the object graph generation as "Counting" like in this answer:
Why no Reference Counting + Garbage Collection in C#?
But I will not pick up the glove of (the) Eric Lippert :)
I'm using the new MemoryCache in .Net 4, with a max cache size limit in MB (I've tested it set between 10 and 200MB, on systems with between 1.75 and 8GB of memory). I don't set any time based expiration on the objects, as I'm using the cache simply as a high performance drive, and as long as there is space, I want it used. To my surprise, the cache refused to evict any objects, to the point that I would get System.OutOfMemoryException errors.
I fired up perfmon, wired up my application to .Net CLR Memory\#Bytes In All Heaps, .Net Memory Cache 4.0, and Process\Private Bytes -- indeed, the memory consumption was out of control, and no cache trims were being registered.
Did some googling and stackoverflowing, downloaded and attached the CLRProfiler, and wham: evictions everywhere! The memory stayed within reasonable bounds based upon the memory size limit I had set. Ran it in debug mode again, no evictions. CLRProfiler again, evictions.
I finally noticed that the profiler forced the application to run without concurrent garbage collection (also see useful SO Concurrent Garbage Collection Question). I turned it off in my app.config, and, sure enough, evictions!
This seems like at best an outrageous lack of documentation to not say: this only works with non-concurrent garbage collection -- though I imagine since it's ported from ASP.NET, they may not have had to worry about concurrent garbage collection.
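To check which GC mode a process actually ended up with, the runtime settings can be read at startup. A minimal sketch, assuming only the System.Runtime.GCSettings class; GcModeReport and Describe are hypothetical names. In app.config the switch itself is the gcConcurrent element under runtime.

```csharp
using System;
using System.Runtime;

public static class GcModeReport
{
    public static string Describe()
    {
        // The documentation describes Batch as the latency mode that disables
        // GC concurrency and Interactive as the one that enables it.
        return $"server GC: {GCSettings.IsServerGC}, " +
               $"latency mode: {GCSettings.LatencyMode}";
    }
}
```

Logging this once at startup would show whether a host is overriding the setting from the configuration file.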
So has anyone else seen this? I'd love to get some other experiences out there, and maybe some more educated insights.
Update 1
I've reproduced the issue within a single method: it seems that the cache must be written to in parallel for the cache evictions not to fire (in concurrent garbage collection mode). If there is some interest, I'll upload the test code to a public repo. I'm definitely getting toward the deep end of the CLR/GC/MemoryCache pool, and I think I forgot my floaties...
Update 2
I published test code on CodePlex to reproduce the issue. Also, possibly of interest, the original production code runs in Azure, as a Worker Role. Interestingly, changing the GC concurrency setting in the role's app.config has no effect. Possibly Azure overrides GC settings much like ASP.NET? Further, running the test code under WPF vs a Console application will produce slightly different eviction results.
You can "force" a garbage collection right after the problematic method and see if the problem reproduces executing:
System.Threading.Thread.Sleep(200);
GC.Collect();
GC.WaitForPendingFinalizers();
right at the end of the method (make sure that you release any handles to referenced objects and null them out). If this prevents the memory leakage, then yes, there may be a runtime bug.
Stop-the-world garbage collection is based on determining whether a strong live reference to an object exists at the moment the world is stopped. Concurrent garbage collection usually determines whether a strong live reference to an object has existed since some particular time in the past. My conjecture would be that many strong references to objects held in WeakReferences are being individually created and discarded. If a stop-the-world garbage collector fires between the time a particular object is created and the time it's discarded, that particular object will be kept alive, but previously-discarded objects will not. By contrast, a concurrent garbage collector may not detect that all strong references to an object have been discarded until a certain amount of time goes by without any strong references to that object being created.
I've sometimes wished that .net would offer something between a strong reference and a weak one, which would prevent an object from being wiped from memory, but would not protect it from being finalized or having weak WeakReferences to it invalidated. Such references would slightly complicate the GC process, requiring every object to have separate flags indicating whether strong and quasi-weak references exist to it, and whether it has been scanned for both strong and quasi-weak references, but such a feature could be helpful in many "weak event" scenarios.
I found this entry while searching for a similar topic and I'm focusing on your Out of Memory exception.
If you put an object in the cache then it still may be referencing other objects and therefore these objects would not be garbage collected -- hence the out of memory exception and probably a CPU being pegged out due to Gen 2 garbage collection.
Are you putting "used" objects on the cache or clones of "used" objects on the cache? If you put a clone on the cache then the "used" object that possibly references other objects could be garbage collected.
If you shut off your caching mechanism does your program still run out of memory? If it doesn't run out of memory then that would prove that the objects you would otherwise be putting on the cache are still holding references to other objects hindering garbage collection.
Forcing garbage collection is not a best practice and shouldn't have to be done. In this scenario forcing a garbage collection wouldn't dispose of referenced objects anyway.
MemoryCache definitely has some issues. It ate 160 MB of memory on my ASP.NET server, so I just changed to a simple list and added some extra logic to get the same functionality.
Hello and thank you very much for your help!
Does anybody have a good idea to find unreferenced objects of a specific class before garbage collection? (preferable as soon as possible)
In my case, I need to create a lot of small objects of a specific class for temporary use only. The problem is that I don’t know when an object is no longer needed. I would like to collect the objects of that class which are not referenced any more (as soon as possible) before garbage collection, so that I can recycle them and don’t need to create new ones. I think that would make the code much faster.
Kind Regards,
David
First off, before you do this you should do extensive profiling to determine that you really, truly do have a performance problem caused by collection pressure. The garbage collector is highly tuned and works quite well most of the time; situations where you need to pool objects for performance reasons are rare.
I actually am in that scenario; we have determined through extensive testing that there are certain objects we use all the time on a temporary basis, ("builders" of other objects, essentially) and that the cost of collection pressure caused by re-allocating them frequently is measurable and high.
What we do is we have a pool class which maintains an array of "blank" objects. When you need a new object, the pool checks the array and returns an object that is in the array if we have one, nulling out the array entry. If we don't have one then it creates a new object. When the temporary user is done with the object, it passes it back to the pool, which "blanks" it and sticks it back in the array. (Growing the array if necessary.)
If a user forgets to put the object back into the pool, or cannot do so because an exception was thrown before the "back in the pool" call, who cares? All we've done in that case is perhaps slightly de-optimized a future allocation. The cost is that you need to remember to put the object back in the pool when you're done with it.
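The array-based pool described above can be sketched roughly as follows. Builder, BuilderPool, Allocate and Free are hypothetical names, and the initial size of four slots and the doubling growth are assumptions, not details from the answer.

```csharp
using System;
using System.Text;

// Hypothetical reusable type; Clear() is the "blanking" step.
public sealed class Builder
{
    public StringBuilder Text { get; } = new StringBuilder();

    public void Clear() => Text.Clear();
}

public sealed class BuilderPool
{
    private Builder[] _slots = new Builder[4];
    private readonly object _gate = new object();

    public Builder Allocate()
    {
        lock (_gate)
        {
            for (int i = 0; i < _slots.Length; i++)
            {
                Builder candidate = _slots[i];
                if (candidate != null)
                {
                    _slots[i] = null; // null out the entry while it is on loan
                    return candidate;
                }
            }
        }

        return new Builder(); // pool empty: fall back to a fresh allocation
    }

    public void Free(Builder builder)
    {
        builder.Clear(); // blank it before it goes back
        lock (_gate)
        {
            for (int i = 0; i < _slots.Length; i++)
            {
                if (_slots[i] == null)
                {
                    _slots[i] = builder;
                    return;
                }
            }

            // No empty slot: grow the array and store the object.
            int oldLength = _slots.Length;
            Array.Resize(ref _slots, oldLength * 2);
            _slots[oldLength] = builder;
        }
    }
}
```

Calling Free from a finally block covers the exception case, although as noted a missed return only costs a later allocation.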
There's no way to "hook" the garbage collector to put stuff back in the pool automatically that I know of.
You can't directly control garbage collection, but you could create a manager class that is responsible for creating, holding the references and disposing of these objects. As long as the manager class is in scope, its objects will not be garbage collected.
From what I know the CLR does all this garbage collection for you, but is there a reason to manually call GC.Collect()?
Is it for cases when you close a file, dispose of an image resource or an unmanaged resource, that you should call GC.Collect() to quickly clean up the unused memory immediately?
Is it for cases when you close a file, dispose of an image resource or an unmanaged resource, that you should call GC.Collect() to quickly clean up the unused memory immediately?
No, not really. This is typically handled via IDisposable.
There are very few reasons to call GC.Collect() directly, and doing so often causes a lot of harm, since it interferes with the internal heuristics of the GC in .NET. There are, however, rare occurrences when it's useful.
For example, if you have a rare operation that uses a large object graph, and you know that you're going to be "idle" afterwards, you might want to call GC.Collect immediately afterwards to release the memory used by the objects. That being said, this is often still best left up to the system to handle.
For the most part, I've found the most useful scenario for GC.Collect() is for debugging. It can help you guarantee that you don't have a leak, since it allows you to force a full Gen2 collection.
http://blogs.msdn.com/b/ricom/archive/2003/12/02/40780.aspx
http://blogs.msdn.com/b/ricom/archive/2004/11/29/271829.aspx
Right from the equine's oral orifice.
I usually use GC.Collect() to bring the heap to a mostly-known state. For example, when benchmarking, you need to make sure that you start each run from a known state, and GC.Collect() helps with this.
It should not be used to dispose of unmanaged resources -- for that you should use using or manually call Dispose.
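A minimal sketch of deterministic cleanup through IDisposable and using, with no GC.Collect involved. ResourceHolder, Use, IsDisposed and ResourceDemo are hypothetical names, and the actual handle release is left as a comment.

```csharp
using System;

// Hypothetical wrapper around some unmanaged or scarce resource.
public sealed class ResourceHolder : IDisposable
{
    public bool IsDisposed { get; private set; }

    public void Use()
    {
        if (IsDisposed)
        {
            throw new ObjectDisposedException(nameof(ResourceHolder));
        }
        // ... work with the resource ...
    }

    public void Dispose()
    {
        if (IsDisposed)
        {
            return; // Dispose should be safe to call more than once
        }

        // Release the underlying handle here.
        IsDisposed = true;
    }
}

public static class ResourceDemo
{
    public static bool Run()
    {
        var holder = new ResourceHolder();
        using (holder)
        {
            holder.Use();
        } // Dispose runs here, even if Use() throws

        return holder.IsDisposed;
    }
}
```

The managed memory of the holder itself is still reclaimed by the GC on its own schedule; only the wrapped resource is released at the end of the using block.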
I can just think of two cases where GC.Collect() might be useful:
In unit tests. Call GC.Collect() before and after some test to find potential memory leaks. In this case consider using GC.WaitForPendingFinalizers(), because finalizers are executed in a separate thread. This means that classes with finalizers do not immediately release all resources after calling GC.Collect().
In long living processes like Windows Services where there is a long idle time. Call GC.Collect() before it goes idle. Reason: If a process is idle, the garbage collector will not kick in, thus unused memory will be not released during idle time.
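For the testing and benchmarking use, a helper along these lines is a common shape. HeapBaseline, Settle and MeasureRetainedBytes are hypothetical names, and the figure returned is only a rough indicator because other threads may allocate in the meantime.

```csharp
using System;

public static class HeapBaseline
{
    // Brings the managed heap to a mostly-known state.
    public static void Settle()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers(); // finalizers run on a separate thread
        GC.Collect();                  // reclaim what the finalizers released
    }

    // Rough growth in managed memory across an action.
    public static long MeasureRetainedBytes(Action action)
    {
        Settle();
        long before = GC.GetTotalMemory(forceFullCollection: false);

        action();

        Settle();
        long after = GC.GetTotalMemory(forceFullCollection: false);
        return after - before;
    }
}
```

A result that keeps growing across repeated runs of the same action is a hint that something is still holding references.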
GC.Collect() should not be called just to "quickly clean up the unused memory immediately". The memory released by a forced collection does not make up for the overall performance cost of running the garbage collector.
The only time I've ever used it in code that wasn't specifically to compare memory usage of two or more different approaches to something had the following scenario:
In a web application (and hence, long-running), there were a few very large collections that would generally be re-built at most a few times a day, often much less frequently.
Several objects within that collection would be equivalent, and hence a lot of memory could be saved by replacing references to one such object with references to the other (after building, the collection was read-only in use, and hence the aliasing involved was safe). So first the collection would be built, but then it would be reduced in size, killing many objects.
This meant that there was a sudden spike in the number of objects destroyed per second, which would then not happen again for several hours at least. As such the GC would not correctly judge the amount of collection needed, and that memory was about to be needed to build the second large collection. Hence doing a manual collection every thousand clean-up operations did have a positive effect on performance (enough to sometimes go from the application crashing on the third collection, to it being dependable).
A lot of measurement was done to make sure it was indeed beneficial in this case.
The two factors that made this beneficial were:
A lot of object deaths happened.
The event causing this was rare in the lifetime of the application.
Without both of those being true, the manual call would be needless. Without a lot of object deaths, there's no point, and if it wasn't rare in the application's lifetime, the GC would have self-tuned to cope.