Does a static readonly ConcurrentDictionary<string, AsyncLocal<object>> State field leak memory? - c#

I have an app that seems to accumulate a lot of memory.
One of the suspects is below, and I'm just trying to wrap my head around what it is actually doing. Or, more specifically, how is it cleaned up?
private static readonly ConcurrentDictionary<string, AsyncLocal<object>> State;
Problem context:
The idea is to simulate what OperationContext in WCF would do - provide static access to information about the current call. I am doing this inside a Service Fabric remoting service.
Can someone help me understand the nature of this in terms of what happens to the AsyncLocal<object> once the async call ends? I see it hanging around in memory but can't tell if it is a memory leak, or if the GC just hasn't reclaimed it yet.
I know the static dictionary stays around, but do the values also, or do I need to be manually clearing those before my current service invocation completes to ensure no memory leak here?
*Edit - Here is some more info as requested by Pavel.
Posting relevant code below, but the whole picture is here.
GitHub repo where the general idea came from. It is trying to make headers work in Service Fabric/.NET Core like they used to in old WCF.
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders
The RemotingContext object is here:
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders/blob/master/src/ServiceFabric.Remoting.CustomHeaders/RemotingContext.cs
Its use can be seen here (line 52, among others):
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders/blob/master/src/ServiceFabric.Remoting.CustomHeaders/ReliableServices/ExtendedServiceRemotingMessageDispatcher.cs
Here is a code snippet:
public override async Task<IServiceRemotingResponseMessage> HandleRequestResponseAsync(
    IServiceRemotingRequestContext requestContext,
    IServiceRemotingRequestMessage requestMessage)
{
    var header = requestMessage.GetHeader();
    RemotingContext.FromRemotingMessageHeader(header);
    // Some other code where the actual service is invoked (and where RemotingContext
    // values can be referenced for the current call).
    return null;
}

The Garbage Collector is a strange beast, so it's worth getting to know about its behaviour. (Note, this is a simplistic view of the GC)
First of all, if at least one reachable (rooted) reference to an object exists, that object is not considered garbage, so it will not be collected.
As your Dictionary is static, it is reachable for the lifetime of the application, so it keeps a reference to everything contained within it. If there are no other references to a contained object and you remove it from the Dictionary, it becomes unreachable, and therefore garbage, and will be collected. Making sure there are no remaining references to your objects is the way to ensure they will be collected. It's very easy to forget about some reference you made somewhere, which keeps the object alive.
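To make that concrete for the pattern in the question, here is a minimal sketch of the idea, assuming a wrapper along these lines (the CallState name and members are illustrative, not taken from the linked repository):

using System.Collections.Concurrent;
using System.Threading;

public static class CallState
{
    // The static dictionary lives for the lifetime of the process...
    private static readonly ConcurrentDictionary<string, AsyncLocal<object>> State =
        new ConcurrentDictionary<string, AsyncLocal<object>>();

    public static void Set(string key, object value) =>
        State.GetOrAdd(key, _ => new AsyncLocal<object>()).Value = value;

    public static object Get(string key) =>
        State.TryGetValue(key, out var slot) ? slot.Value : null;

    // ...so every AsyncLocal<object> slot it holds stays reachable until the entry
    // is removed. Dropping the entry makes the slot eligible for collection once
    // nothing else references it.
    public static void Clear(string key) => State.TryRemove(key, out _);
}

Until something like Clear runs, the static dictionary roots each AsyncLocal<object> slot, which matches what you are seeing in memory.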
Secondly, the Garbage Collector doesn't run continuously. It runs when the runtime decides memory needs to be reclaimed, typically when the generation 0 budget is exhausted by new allocations or when the system is low on memory. This is so that it doesn't hog the CPU to the detriment of the main application. It also means that objects can still sit in memory for some time before the next garbage collection, which can make memory usage look high even when much of it is already garbage waiting to be collected.
Thirdly, the Garbage Collector has "Generations". Generation 0 objects are the newest and most short-lived objects. Generation 0 objects are collected most often, but only when memory is needed.
Generation 1 contains short-lived objects that are on their way to becoming long-lived objects. They are collected less often than Generation 0 objects. In fact, a Generation 1 collection only happens if a Generation 0 collection did not release enough memory.
Generation 2 objects are long-lived. These are typically things like static objects which exist for the lifetime of the application. Similarly, these are collected when the Generation 1 collection doesn't release enough memory.
Finally, there is the Large Object Heap. Objects that consume a lot of memory (roughly 85,000 bytes or more) are expensive to move around (part of the garbage collection process involves compacting memory after a collection has taken place), so they are kept on their own heap, which is not compacted by default, and they tend to remain uncollected unless collections of the smaller generations didn't release enough memory. Some people refer to this as Generation 3, but these objects are actually collected as part of Generation 2 collections when necessary.
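If you want to watch promotion through the generations happen, here is a rough sketch (the exact numbers printed depend on the runtime and GC mode):

using System;

class GenerationsDemo
{
    static void Main()
    {
        var obj = new object();
        Console.WriteLine(GC.GetGeneration(obj)); // 0 - freshly allocated

        GC.Collect();                             // survives one collection...
        Console.WriteLine(GC.GetGeneration(obj)); // typically 1

        GC.Collect();                             // ...and another
        Console.WriteLine(GC.GetGeneration(obj)); // typically 2

        // How many collections of each generation have run so far
        Console.WriteLine($"{GC.CollectionCount(0)} / {GC.CollectionCount(1)} / {GC.CollectionCount(2)}");
    }
}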

Related


What mechanisms of the C# language are used in order to pass an instance of an object to the GC.AddMemoryPressure method?
I came across the following code sample in the CLR via C# book:
private sealed class BigNativeResource {
    private readonly Int32 m_size;

    public BigNativeResource(Int32 size) {
        m_size = size;
        // Make the GC think the object is physically bigger
        if (m_size > 0) GC.AddMemoryPressure(m_size);
        Console.WriteLine("BigNativeResource create.");
    }

    ~BigNativeResource() {
        // Make the GC think the object released more memory
        if (m_size > 0) GC.RemoveMemoryPressure(m_size);
        Console.WriteLine("BigNativeResource destroy.");
    }
}
I cannot understand how we associate an instance of an object with the pressure it adds. I do not see an object reference being passed to GC.AddMemoryPressure. Do we associate the added memory pressure (amp) with an object at all?
Also, I do not see any reason to call GC.RemoveMemoryPressure(m_size); it seems to be of no use. Let me explain. There are two possibilities: either there is an association between the pressure and the object instance, or there is no such association.
In the former case, the GC should know the m_size in order to prioritize and decide when to undertake a collection. So it definitely should remove the memory pressure by itself (otherwise, what would it mean for the GC to remove an object while taking the amp into account?).
In the latter case, it is not clear what the use of adding and removing the amp is at all. The GC can only work with roots, which are by definition instances of classes; i.e. the GC can only collect objects. So, if there is no association between objects and the amp, I see no way the amp could affect the GC (which is why I assume there is an association).
I cannot understand how we associate an instance of an object with the pressure it adds.
The instance of the object associates the pressure it adds with a reference to itself by calling AddMemoryPressure. The object already has identity with itself! The code which adds and removes the pressure knows what this is.
I do not see an object reference being passed to GC.AddMemoryPressure.
Correct. There is not necessarily an association between added pressure and any object, and regardless of whether there is or not, the GC does not need to know that information to act appropriately.
Do we associate the added memory pressure (amp) with an object at all?
The GC does not. If your code does, that's the responsibility of your code.
Also, I do not see any reason to call GC.RemoveMemoryPressure(m_size)
That's so that the GC knows that the additional pressure has gone away.
I see no way the amp could affect the GC
It affects the GC by adding pressure!
I think there is a fundamental misunderstanding of what's going on here.
Adding memory pressure is just telling the GC that there are facts about memory allocation that you know, and that the GC does not know, but are relevant to the action of the GC. There is no requirement that the added memory pressure be associated with any instance of any object or tied to the lifetime of any object.
The code you've posted is a common pattern: an object has additional memory associated with each instance, and it adds a corresponding amount of pressure upon allocation of the additional memory and removes it upon deallocation of the additional memory. But there is no requirement that additional pressure be associated with a specific object or objects. If you added a bunch of unmanaged memory allocations in your static void Main() method, you might decide to add memory pressure corresponding to it, but there is no object associated with that additional pressure.
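As a sketch of that last case - pressure that corresponds to a raw unmanaged allocation rather than to any managed object (the sizes here are arbitrary):

using System;
using System.Runtime.InteropServices;

class Program
{
    static void Main()
    {
        const int size = 100 * 1024 * 1024; // 100 MB that the GC cannot see

        // Allocate unmanaged memory...
        IntPtr buffer = Marshal.AllocHGlobal(size);
        // ...and tell the GC about it so it can schedule collections accordingly.
        GC.AddMemoryPressure(size);

        try
        {
            // work with the buffer
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
            // The unmanaged memory is gone, so remove the matching pressure.
            GC.RemoveMemoryPressure(size);
        }
    }
}

The add/remove calls simply keep the GC's bookkeeping in sync with the unmanaged allocation; nothing ties them to a particular object.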
These methods exist to let the GC know about memory usage outside the managed heap. There is no object to pass to these methods because the memory is not directly related to any particular managed object. It is the responsibility of the author of the code to notify the GC correctly about changes in memory usage.
GC.AddMemoryPressure(Int64)
… the runtime takes into account only the managed memory, and thus underestimates the urgency of scheduling garbage collection.
An extreme example: you have a 32-bit app, and the GC thinks it can easily allocate almost 2 GB of managed (C#) objects. As part of the code you use native interop to allocate 1 GB. Without the AddMemoryPressure call, the GC will still think it's free to wait until you have allocated/deallocated a lot of managed objects... but around the time you have allocated 1 GB of managed objects, the GC runs into a strange state - it should have a whole extra GB to play with, yet there is nothing left, so it has to scramble to collect memory at that point. If AddMemoryPressure had been used properly, the GC would have had a chance to adjust and collect more aggressively earlier, in the background or at points that allowed a shorter/smaller impact.
AddMemoryPressure is used to declare (emphasis here) that you have a sizeable amount of unmanaged data allocated somewhere. This method is a courtesy that the runtime gives to you.
The purpose of the method is to declare under your own responsibility that somewhere you have unmanaged data that is logically bound to some managed object instance. The garbage collector has a simple counter and tracks your request by simply adding the amount you specify to the counter.
The documentation is clear about that: when the unmanaged memory goes away, you must tell the garbage collector that it has gone away.
You need to use this method to inform the garbage collector that the unmanaged memory is there but could be freed if the associated object is disposed. Then the garbage collector is able to schedule its collections better.

C# .NET 3.5: Memory Pool, know when object is released?

I'm in a situation where I have to create thousands of objects at once, and the cost of instantiating the objects and garbage collecting them is impacting the performance of the application. The impact of the garbage collector running hurts even more since this is on older hardware, so I'm mostly trying to prevent the creation of garbage. I believe a memory pool would solve my issue, but I'm not sure how the memory pool would know when a resource in the pool was freed up for re-use. The tricky part is that the receivers of objects from the pool end up passing those objects around throughout the program, and it would be very difficult to know when one could be manually freed. I'd like it to be like a WeakReference, where I could know when nobody was using it anymore. But my understanding is that if I use a WeakReference in the memory pool, the objects would eventually get garbage collected out of the pool itself, and I need these objects to remain pretty much forever so they'll continue to get recycled. Sometimes the program can go for a while without needing the objects, so I imagine the garbage collector would collect them before the next time they were needed, and that would then trigger another performance hit as another thousand of these objects were made.
Is there a way I can make sure these objects are never collected, but know when there are no references to them aside from the memory pool itself? Do I need to implement reference counting for these objects somehow?
I've been googling for a couple of hours now and have seen no implementation of a Memory Pool that doesn't require the user to let the memory pool know when they're done with it. I find it hard to believe there is no way to do this in C#.
Is there a way I can make sure these objects are never collected, but know when there are no references to them aside from the memory pool itself?
Usually an object pool only holds references to available objects (you can check the ObjectPool implementation in Roslyn). With that in mind, you can use the finalizer to resurrect the object and return it to the pool when it becomes unreachable.
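A minimal sketch of that resurrection trick, assuming a ConcurrentBag-backed pool (the names are illustrative, and as noted below this is rarely worth doing):

using System;
using System.Collections.Concurrent;

public class PooledThing
{
    // The pool only references available instances; handed-out ones are not rooted by it.
    private static readonly ConcurrentBag<PooledThing> Pool = new ConcurrentBag<PooledThing>();

    public static PooledThing Rent() =>
        Pool.TryTake(out var item) ? item : new PooledThing();

    // When the last reference is dropped, the finalizer "resurrects" the instance
    // by putting it back into the pool instead of letting it die.
    ~PooledThing()
    {
        Pool.Add(this);
        // Re-register so the same trick works the next time it becomes unreachable.
        GC.ReRegisterForFinalize(this);
    }
}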
However, I don't think it will improve performance. The whole pool will very soon reach generation 2, so the unreachable objects will need a full garbage collection to be returned to the pool. Depending on the memory usage pattern in your program, that might not happen very often. You can of course call GC.Collect() and GC.WaitForPendingFinalizers(), but that will also hurt performance. You can try it out and check if it helps.
Another problem is the design - your objects are coupled to the pool.
I'd rather try to return the objects to the pool explicitly. Remember that not all objects have to be returned. If there are no more available objects, the pool can create new ones. The ones that were not returned will just be garbage collected. Check if there are some code paths where you are sure the objects are not needed anymore. If you can't find any, try to refactor your code.

C# - Garbage Collection

Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap).
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack. I also know that C# does its own Garbage Collection (i.e. it determines when an instantiated class is no longer in use and reclaims the memory).
I have 2 questions:
Is my understanding of Garbage Collection correct?
Can I do my own? If so, is there any real benefit to doing this myself, or should I just leave it?
I ask because I have a method with a for loop. Every time I go through the loop, I create a new instance of my class. In my head I visualise all of these instances lying around in a heap, not doing anything but taking up memory, and I want to get rid of them as quickly as I can to keep things neat and tidy!
Am I understanding this correctly or am I missing something?
Ok so I understand about the stack and the heap (values live on the Stack, references on the Heap)
I don't think you understand about the stack and the heap. If values live on the stack then where does an array of integers live? Integers are values. Are you telling me that an array of integers keeps its integers on the stack? When you return an array of integers from a method, say, with ten thousand integers in it, are you telling me that those ten thousand integers are copied onto the stack?
Values live on the stack when they live on the stack, and live on the heap when they live on the heap. The idea that the type of a thing has to do with the lifetime of its storage is nonsense. Storage locations that are short lived go on the stack; storage locations that are long lived go on the heap, and that is independent of their type. A long-lived int has to go on the heap, same as a long-lived instance of a class.
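For example, a local int that is captured by a delegate has to outlive the method that declared it, so the compiler hoists it into a heap-allocated closure class:

using System;

class Counter
{
    public static Func<int> Make()
    {
        int count = 0;        // declared as a local...
        return () => ++count; // ...but captured by the returned delegate, so the
                              // compiler stores it on the heap in a closure object.
    }
}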
When I declare a new instance of a Class, this lives on the heap, with a reference to this point in memory on the stack.
Why does the reference have to go on the stack? Again, the lifetime of the storage of the reference has nothing to do with its type. If the storage of the reference is long-lived then the reference goes on the heap.
I also know that C# does its own Garbage Collection (i.e. it determines when an instantiated class is no longer in use and reclaims the memory).
The C# language does not do so; the CLR does so.
Is my understanding of Garbage Collection correct?
You seem to believe a lot of lies about the stack and the heap, so odds are good no, it's not.
Can I do my own?
Not in C#, no.
I ask because I have a method with a for loop. Every time I go through the loop, I create a new instance of my class. In my head I visualise all of these instances lying around in a heap, not doing anything but taking up memory, and I want to get rid of them as quickly as I can to keep things neat and tidy!
The whole point of garbage collection is to free you from worrying about tidying up. That's why it's called "automatic garbage collection". It tidies for you.
If you are worried that your loops are creating collection pressure, and you wish to avoid collection pressure for performance reasons then I advise that you pursue a pooling strategy. It would be wise to start with an explicit pooling strategy; that is:
while (whatever)
{
    Frob f = FrobPool.FetchFromPool();
    f.Blah();
    FrobPool.ReturnToPool(f);
}
rather than attempting to do automatic pooling using a resurrecting finalizer. I advise against both finalizers and object resurrection in general unless you are an expert on finalization semantics.
The pool of course allocates a new Frob if there is not one in the pool. If there is one in the pool, then it hands it out and removes it from the pool until it is put back in. (If you forget to put a Frob back in the pool, the GC will get to it eventually.) By pursuing a pooling strategy you cause the GC to eventually move all the Frobs to the generation 2 heap, instead of creating lots of collection pressure in the generation 0 heap. The collection pressure then disappears because no new Frobs are allocated. If something else is producing collection pressure, the Frobs are all safely in the gen 2 heap where they are rarely visited.
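A hedged sketch of what such a FrobPool might look like (Frob and Blah are just the placeholders from the pseudocode above):

using System.Collections.Concurrent;

public class Frob
{
    public void Blah() { /* ... */ }
}

public static class FrobPool
{
    // Only idle Frobs are referenced here; handed-out ones live with their callers.
    private static readonly ConcurrentBag<Frob> Pool = new ConcurrentBag<Frob>();

    // Hand out a pooled Frob, or allocate a new one if the pool is empty.
    public static Frob FetchFromPool() => Pool.TryTake(out var f) ? f : new Frob();

    // Put it back so the next iteration reuses it instead of allocating.
    public static void ReturnToPool(Frob f) => Pool.Add(f);
}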
This of course is the exact opposite of the strategy you described; the whole point of the pooling strategy is to cause objects to hang around forever. Objects hanging around forever is a good thing if you're going to use them.
Of course, do not make these sorts of changes before you know via profiling that you have a performance problem due to collection pressure! It is rare to have such a problem on the desktop CLR; it is rather more common on the compact CLR.
More generally, if you are the kind of person who feels uncomfortable having a memory manager clean up for you on its schedule, then C# is not the right language for you. Consider C instead.
values live on the Stack, references on the Heap
This is an implementation detail. There is nothing to stop a .NET Framework implementation from storing both on the stack.
I also know that C# does its own Garbage Collection
C# has nothing to do with this. This is a service provided by the CLR. VB.NET, F#, etc all still have garbage collection.
The CLR will remove an object from memory if it has no strong roots. For example, when your class instance goes out of scope in your for loop. There will be a few lying around, but they will get collected eventually, either by garbage collection or the program terminating.
Can I do my own? If so, is there any real benefit to doing this myself, or should I just leave it?
You can use GC.Collect to force a collection. You should not do it, because it is an expensive operation - more expensive than letting a few objects occupy memory a little bit longer than they are absolutely needed. The GC is incredibly good at what it does on its own. You will also force short-lived objects to be promoted into generations they wouldn't normally reach.
First off, see Eric's seminal post, The Truth About Value Types.
Secondly, on garbage collection: the collector knows far more about your running program than you do. Don't try to second-guess it unless you're in the incredibly unlikely situation of having a memory leak.
So to your second question, no don't try to "help" the GC.
I'll find a post to this effect on the GC and update this answer.
Can I do my own? If so, is there any real benefit to doing this myself, or should I just leave it?
Yes, you can with GC.Collect, but you shouldn't. The GC is optimized for objects that are short-lived (ones in a method) and objects that are long-lived (ones that generally stick around for the lifetime of the application).
Objects that are in between aren't as common and aren't really what the GC is optimized for.
By forcing a GC.Collect you're more likely to push objects that are still in scope into that in-between state, which is the opposite of what you are trying to accomplish.
Also from the MSDN article Writing High-Performance Managed Applications: A Primer:
The GC is self-tuning and will adjust itself according to applications memory requirements. In most cases programmatically invoking a GC will hinder that tuning. "Helping" the GC by calling GC.Collect will more than likely not improve your applications performance.
Your understanding of Garbage Collection is good enough. Essentially, an instance that is no longer referenced is deemed out of scope and no longer needed. Having determined this, the collector will remove an unreferenced object at some future point.
There's no way to force the Garbage Collector to collect just a specific instance. You can ask it to do its normal "collect everything possible" operation with GC.Collect(), but you shouldn't; the garbage collector is efficient and effective if you just leave it to its own devices.
In particular it excels at collecting objects which have a short lifespan, just like those that are created as temporary objects. You shouldn't have to worry about creating loads of objects in a loop, unless they have a long lifespan that prevents immediate collection.
Please see this related question with regard to the Stack and Heap.
In your specific scenario, agreed, if you new up objects in a for-loop then you're going to have sub-optimal performance. Are the objects stored (or otherwise used) within the loop, or are they discarded? If the latter, can you optimize this by newing up one object outside the loop and re-using it?
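If the latter, a minimal sketch of that reuse idea, using a List<int> as a stand-in for whatever class is being newed up:

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // Allocate the object once, outside the loop...
        var buffer = new List<int>(capacity: 1024);

        for (int batch = 0; batch < 100; batch++)
        {
            buffer.Clear(); // ...and reset it each iteration instead of new-ing it up.
            for (int i = 0; i < 1024; i++)
                buffer.Add(batch * i);

            Console.WriteLine(buffer.Count);
        }
    }
}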
With regard to whether you can implement your own GC: there is no explicit delete keyword in C#; you have to leave it to the CLR. You can however give it hints, such as when to collect or what to ignore during collection, but I'd leave that alone unless absolutely necessary.
Read the following article by Microsoft to get a level of knowledge about Garbage Collection in C#. I'm sure it'll help anyone who needs information regarding this matter.
Memory Management and Garbage Collection in the .NET Framework
If you are interested in the performance of certain areas of your C# code, you can write unsafe code. You may gain some performance, and objects pinned in a fixed block will not be moved by the garbage collector.
Garbage collection is basically reference tracking. I can't think of any good reason why you would want to change it. Are you having some sort of problem where you find that memory isn't being freed? Or maybe you are looking for the dispose pattern?
Edit:
Replaced "reference counting" with "reference tracking" to not be confused with the Increment/Decrement Counter on object Reference/Dereference (eg from Python).
I thought it was pretty common to refer to the object graph generation as "Counting" like in this answer:
Why no Reference Counting + Garbage Collection in C#?
But I will not pick up the glove of (the) Eric Lippert :)

C# memory leak?

I have a C# application that loops through a DataTable and pushes the rows into a few locations, such as Sage and a SQL table.
While it used to work fine, I'm inexplicably now getting Out of Memory exceptions after an hour or so of running it. I've noticed in the task manager that the memory usage rises by about 1 MB every second, and keeps on going!
I was under the impression garbage collection would take care of everything, but to be safe I make sure I dispose of any objects after using them. I know without code it's hard to diagnose, but there's a lot of it and I'm looking more for general advice.
but to be safe I make sure I dispose of any objects after using them
Dispose() is not directly related to memory management or leaks.
You'll have to look for unused objects that are still 'reachable'. Use a memory-profiler to find out.
You can start with the free CLR-Profiler.
There are a couple of potential problems that spring to mind:
There is a large pool of objects that are left ineligible for garbage collection (i.e. they are still "reachable"). For example, if you add an object to a list on every iteration of a loop then the list will grow without bound, and each element in the list will remain ineligible for garbage collection as long as that list is still reachable. I'm not claiming that this is what is happening, this is just an example of how memory might be allocated and then left without being collected.
For some reason the garbage collector isn't doing a collection.
The high memory use is actually due to an unmanaged component that you are using in your application (e.g. via P/Invoke or COM interop).
Without seeing any code it's tricky to give specific advice on how to fix your problem, however reading through Investigating Memory Issues should give you some pointers on how to diagnose the memory problem yourself. In particular my first step would probably be to examine performance counters to see if the garbage collector is actually running, and to check the various heap sizes.
Note that Dispose and the IDisposable interface are unrelated to memory use - it's important to dispose of objects like database connections once you are done with them as it frees up any associated resources (e.g. handles), however disposing of objects that implement IDisposable is very unlikely to have an impact on memory use.
Garbage collection can only get rid of objects that are no longer referenced from anything else. In addition, it can only get rid of managed objects - it has no control over memory allocated by native code you may be interfacing with. These, therefore, are the two root causes of memory leaks in C# code.
The first thing to look at is perfmon. Get the counters for the private bytes and the .net heap size for the process. If the heap size remains flat (or rises and drops) but private bytes keeps increasing you've got some native code allocating memory and not releasing it.
If the heap size just keeps growing then the leak is in your managed code and you'll need a profiler like ANTS, DotTrace or even WinDbg (with SOS extension) to inspect the heap and see what objects are lying about.
The most popular "memory leak" on .Net platform is forgotten collection that repeatetly added in some infinite loop.
When you new something up for temporary use, always use the following pattern; it ensures Dispose gets called:
using (Someclass A = new Someclass())
{
    // ...something with A
}
Someclass here is a class that implements the IDisposable interface.
The GC won't save you if some unmanaged code is involved (P/Invoke, COM, etc.), or if a reference still exists somewhere.
If you find memory leaking, use WinDbg to see what is in the heap.
This article may give you some help.
http://www.codeproject.com/KB/dotnet/Memory_Leak_Detection.aspx

.Net 4 MemoryCache Leaks with Concurrent Garbage Collection

I'm using the new MemoryCache in .Net 4, with a max cache size limit in MB (I've tested it set between 10 and 200 MB, on systems with between 1.75 and 8 GB of memory). I don't set any time-based expiration on the objects, as I'm using the cache simply as a high performance drive, and as long as there is space, I want it used. To my surprise, the cache refused to evict any objects, to the point that I would get OutOfMemoryExceptions.
I fired up perfmon, wired up my application to .Net CLR Memory\#Bytes In All Heaps, .Net Memory Cache 4.0, and Process\Private Bytes -- indeed, the memory consumption was out of control, and no cache trims were being registered.
Did some googling and stackoverflowing, downloaded and attached the CLRProfiler, and wham: evictions everywhere! The memory stayed within reasonable bounds based upon the memory size limit I had set. Ran it in debug mode again, no evictions. CLRProfiler again, evictions.
I finally noticed that the profiler forced the application to run without concurrent garbage collection (also see useful SO Concurrent Garbage Collection Question). I turned it off in my app.config, and, sure enough, evictions!
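For reference, the setting being toggled is the gcConcurrent element in app.config; a sketch of the relevant fragment:

<configuration>
  <runtime>
    <!-- Disable concurrent (background) garbage collection -->
    <gcConcurrent enabled="false"/>
  </runtime>
</configuration>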
At best, this seems like an outrageous lack of documentation in not saying: this only works with non-concurrent garbage collection - though I imagine, since it's ported from ASP.NET, they may not have had to worry about concurrent garbage collection.
So has anyone else seen this? I'd love to get some other experiences out there, and maybe some more educated insights.
Update 1
I've reproduced the issue within a single method: it seems that the cache must be written to in parallel for the cache evictions not to fire (in concurrent garbage collection mode). If there is some interest, I'll upload the test code to a public repo. I'm definitely getting toward the deep end of the CLR/GC/MemoryCache pool, and I think I forgot my floaties...
Update 2
I published test code on CodePlex to reproduce the issue. Also, possibly of interest, the original production code runs in Azure, as a Worker Role. Interestingly, changing the GC concurrency setting in the role's app.config has no effect. Possibly Azure overrides GC settings much like ASP.NET? Further, running the test code under WPF vs a Console application will produce slightly different eviction results.
You can "force" a garbage collection right after the problematic method and see if the problem reproduces executing:
System.Threading.Thread.Sleep(200);
GC.Collect();
GC.WaitForPendingFinalizers();
right at the end of the method (make sure that you free any handles to reference objects and null them out). If this prevents the memory leak, then yes, there may be a runtime bug.
Stop-the-world garbage collection is based on determining whether a strong live reference to an object exists at the moment the world is stopped. Concurrent garbage collection usually determines whether a strong live reference to an object has existed since some particular time in the past. My conjecture would be that many strong references to objects held in WeakReferences are being individually created and discarded. If a stop-the-world garbage collector fires between the time a particular object is created and the time it's discarded, that particular object will be kept alive, but previously-discarded objects will not. By contrast, a concurrent garbage collector may not detect that all strong references an object have been discarded until a certain amount of time goes by without any strong references to that object being created.
I've sometimes wished that .net would offer something between a strong reference and a weak one, which would prevent an object from being wiped from memory, but would not protect it from being finalized or having WeakReferences to it invalidated. Such references would slightly complicate the GC process, requiring every object to have separate flags indicating whether strong and quasi-weak references exist to it, and whether it has been scanned for both strong and quasi-weak references, but such a feature could be helpful in many "weak event" scenarios.
I found this entry while searching for a similar topic and I'm focusing on your Out of Memory exception.
If you put an object in the cache then it still may be referencing other objects and therefore these objects would not be garbage collected -- hence the out of memory exception and probably a CPU being pegged out due to Gen 2 garbage collection.
Are you putting "used" objects on the cache or clones of "used" objects on the cache? If you put a clone on the cache then the "used" object that possible references other objects could be garbage collected.
If you shut off your caching mechanism does your program still run out of memory? If it doesn't run out of memory then that would prove that the objects you would otherwise be putting on the cache are still holding references to other objects hindering garbage collection.
Forcing garbage collection is not a best practice and shouldn't have to be done. In this scenario forcing a garbage collection wouldn't dispose of referenced objects anyway.
MemoryCache definitely has some issues. It ate 160 MB of memory on my ASP.NET server; I just changed to a simple list and added some extra logic to get the same functionality.
