I'm using the new MemoryCache in .Net 4, with a max cache size limit in MB (I've tested it set between 10 and 200MB, on systems with between 1.75 and 8GB of memory). I don't set any time based expiration on the objects, as I'm using the cache simply as a high performance drive, and as long as there is space, I want it used. To my surprise, the cache refused to evict any objects, to the point that I would get SystemOutOfMemory exceptions.
I fired up perfmon, wired up my application to .Net CLR Memory\#Bytes In All Heaps, .Net Memory Cache 4.0, and Process\Private Bytes -- indeed, the memory consumption was out of control, and no cache trims were being registered.
Did some googling and stackoverflowing, downloaded and attached the CLRProfiler, and wham: evictions everywhere! The memory stayed within reasonable bounds based upon the memory size limit I had set. Ran it in debug mode again, no evictions. CLRProfiler again, evictions.
I finally noticed that the profiler forced the application to run without concurrent garbage collection (also see useful SO Concurrent Garbage Collection Question). I turned it off in my app.config, and, sure enough, evictions!
This seems like at best an outrageous lack of documentation to not say: this only works with non-concurrent garbage collection -- though I image since its ported from ASP.NET, they may not have had to worry about concurrent garbage collection.
So has anyone else seen this? I'd love to get some other experiences out there, and maybe some more educated insights.
Update 1
I've reproduced the issue within a single method: it seems that the cache must be written to in parallel for the cache evictions not to fire (in concurrent garbage collection mode). If there is some interest, I'll upload the test code to a public repo. I'm definitely getting toward the deep end of the the CLR/GC/MemoryCache pool, and I think I forgot my floaties...
Update 2
I published test code on CodePlex to reproduce the issue. Also, possibly of interest, the original production code runs in Azure, as a Worker Role. Interesting, changing the GC concurrency setting in the role's app.config has no effect. Possibly Azure overrides GC settings much like ASP.NET? Further, running the test code under WPF vs a Console application will produce slightly different eviction results.
You can "force" a garbage collection right after the problematic method and see if the problem reproduces executing:
System.Threading.Thread.Sleep(200);
GC.Collect();
GC.WaitForPendingFinalizers();
right at the end of the method (make sure that you free any handles to reference objects and null them out). If this prevents memory leakage, and then yes, there may be a runtime bug.
Stop-the-world garbage collection is based on determining whether a strong live reference to an object exists at the moment the world is stopped. Concurrent garbage collection usually determines whether a strong live reference to an object has existed since some particular time in the past. My conjecture would be that many strong references to objects held in WeakReferences are being individually created and discarded. If a stop-the-world garbage collector fires between the time a particular object is created and the time it's discarded, that particular object will be kept alive, but previously-discarded objects will not. By contrast, a concurrent garbage collector may not detect that all strong references an object have been discarded until a certain amount of time goes by without any strong references to that object being created.
I've sometimes wished that .net would offer something between a strong reference and a weak one, which would prevent an object from being wiped from memory, but would not protect it from being finalized or having weak WeakReferences to it invalidated. Such references would slightly complicate the GC process, requiring every object to have separate flags indicating whether strong and quasi-weak references exist to it, and whether it has been scanned for both strong and quasi-weak references, but such a feature could be helpful in many "weak event" scenarios.
I found this entry while searching for a similiar topic and I'm focusing on your Out of Memory exception.
If you put an object in the cache then it still may be referencing other objects and therefore these objects would not be garbage collected -- hence the out of memory exception and probably a CPU being pegged out due to Gen 2 garbage collection.
Are you putting "used" objects on the cache or clones of "used" objects on the cache? If you put a clone on the cache then the "used" object that possible references other objects could be garbage collected.
If you shut off your caching mechanism does your program still run out of memory? If it doesn't run out of memory then that would prove that the objects you would otherwise be putting on the cache are still holding references to other objects hindering garbage collection.
Forcing garbage collection is not a best practice and shouldn't have to be done. In this scenario forcing a garbage collection wouldn't dispose of referenced objects anyway.
MemoryCache definately has some issues. It ate 160Mb of memory on my asp.net server, just changed to simple list and added some extra logic to get the same functionality.
Related
I have an app that seems to accumulate a lot of memory.
One of the suspects is below, and I'm just trying to wrap my head around what it is actually doing. Or, more specifically, how is it cleaned up?
private static readonly ConcurrentDictionary<string, AsyncLocal<object>> State;
Problem context:
The idea is to simulate what OperationContext in WCF would do - provide static access to information about the current call. I am doing this inside a Service Fabric remoting service.
Can someone help me understand the nature of this in terms of what happens to the AsyncLocal<object> once the async call ends? I see it hanging around in memory but can't tell if it is a memory leak, or ig the GC just hasn't reclaimed it yet.
I know the static dictionary stays around, but do the values also, or do I need to be manually clearing those before my current service invocation completes to ensure no memory leak here?
*Edit - Here is some more info as requested by Pavel.
Posting relevant code below, but the whole picture is here.
Github where the general idea came from. They are trying to make headers work in ServiceFabric/.net core like they used to in old WCF.
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders
The RemotingContext object is here:
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders/blob/master/src/ServiceFabric.Remoting.CustomHeaders/RemotingContext.cs
It's use can be seen here (line 52, among others):
https://github.com/Expecho/ServiceFabric-Remoting-CustomHeaders/blob/master/src/ServiceFabric.Remoting.CustomHeaders/ReliableServices/ExtendedServiceRemotingMessageDispatcher.cs
Here is a code snippet:
public override async Task HandleRequestResponseAsync(IServiceRemotingRequestContext requestContext,
IServiceRemotingRequestMessage requestMessage)
{
var header = requestMessage.GetHeader();
RemotingContext.FromRemotingMessageHeader(header);
//Some other code where the actual service is invoked (and where RemotingContext values can be references for the current call.
return null;
}
The Garbage Collector is a strange beast, so it's worth getting to know about its behaviour. (Note, this is a simplistic view of the GC)
First of all, if just one reference exists to an object, that object is not considered garbage, so it will never be collected.
As your Dictionary is static, it is always going to exist, so anything contained within it will always have a reference to it. If there are no other references to the contained object and you remove it from the Dictionary, it will become unreachable and therefore garbage and will be collected. Making sure there are no references to your objects is the way to ensure they will be collected. It's very easy to forget about some reference you made somewhere, which keeps the object alive.
Secondly, the Garbage Collector doesn't run continuously. It runs when memory resources are getting low and needs to release some memory. This is so that it doesn't hog the CPU to the detriment of the main applications. This means that objects can still be in memory for some time before the next Garbage Collection. This can make memory usage seem high at times, even when it isn't.
Thirdly, the Garbage Collector has "Generations". Generation 0 objects are the newest and most short-lived objects. Generation 0 objects are collected most often, but only when memory is needed.
Generation 1 contains short-lived objects that are on their way to becoming long-lived objects. They are collected less often that Generation 0 objects. In fact, Generation 1 collection only happens if a Generation 0 collection did not release enough memory.
Generation 2 objects are long-lived. These are typically things like static objects which exist for the lifetime of the application. Similarly, these are collected when the Generation 1 collection doesn't release enough memory.
Finally, there is the Large Object heap. Objects that consume a lot of memory take time to move around (part of the garbage collection process involves defragmenting the memory after collection has taken place), so they tend to remain uncollected unless collection didn't release enough memory. Some people refer to this as Generation 3, but they are actually collected in Generation 2 when necessary.
Today I have seen a piece of code that first seemed odd to me at first glance and made me reconsider. Here is a shortened version of the code:
if(list != null){
list.Clear();
list = null;
}
My thought was, why not replace it simply by:
list = null;
I read a bit and I understand that clearing a list will remove the reference to the objects allowing the GC to do it's thing but will not "resize". The allocated memory for this list stays the same.
On the other side, setting to null would also remove the reference to the list (and thus to its items) also allowing the GC to do it's thing.
So I have been trying to figure out a reason to do it the like the first block. One scenario I thought of is if you have two references to the list. The first block would clear the items in the list so even if the second reference remains, the GC can still clear the memory allocated for the items.
Nonetheless, I feel like there's something weird about this so I would like to know if the scenario I mentioned makes sense?
Also, are there any other scenarios where we would have to Clear() a list right before setting the reference to null?
Finally, if the scenario I mentioned made sense, wouldn't it be better off to just make sure we don't hold multiple references to this list at once and how would we do that (explicitly)?
Edit: I get the difference between Clearing and Nulling the list. I'm mostly curious to know if there is something inside the GC that would make it so that there would be a reason to Clear before Nulling.
The list.Clear() is not necessary in your scenario (where the List is private and only used within the class).
A great intro level link on reachability / live objects is http://levibotelho.com/development/how-does-the-garbage-collector-work :
How does the garbage collector identify garbage?
In Microsoft’s
implementation of the .NET framework the garbage collector determines
if an object is garbage by examining the reference type variables
pointing to it. In the context of the garbage collector, reference
type variables are known as “roots”. Examples of roots include:
A reference on the stack
A reference in a static variable
A reference in another object on the managed heap that is not eligible for garbage
collection
A reference in the form of a local variable in a method
The key bit in this context is A reference in another object on the managed heap that is not eligible for garbage collection. Thus, if the List is eligible to be collected (and the objects within the list aren't referenced elsewhere) then those objects in the List are also eligible to be collected.
In other words, the GC will realise that list and its contents are unreachable in the same pass.
So, is there an instance where list.Clear() would be useful? Yes. It might be useful if you have two references to a single List (e.g. as two fields in two different objects). One of those references may wish to clear the list in a way that the other reference is also impacted - in which list.Clear() is perfect.
This answer started as a comment for Mick, who claims that:
It depends on which version of .NET you are working with. On mobile platforms like Xamarin or mono, you may find that the garbage collector needs this kind of help in order to do its work.
That statement is begging to be fact checked. So, let us see...
.NET
.NET uses a generational mark and sweep garbage collector. You can see the abstract of the algorithm in What happens during a garbage collection
. For summary, it goes over the object graph, and if it cannot reach a object, that one can be erased.
Thus, the garbage collector will correctly identify the items of the list as collectible in the same iteration, regardless of whatever or not you clear the list. There is no need to decouple the objects beforehand.
This means that clearing the list does not help the garbage collector on the regular implementation of .NET.
Note: If there were another reference to the list, then the fact that you cleared the list would be visible.
Mono and Xamarin
Mono
As it turns out, the same is true for Mono.
Xamarin.Android
Also true for Xamarin.Android.
Xamarin.iOS
However, Xamarin.iOS requires additional considerations. In particular, MonoTouch will use wrapped Objective-C objects which are beyond the garbage collector. See Avoid strong circular references under iOS Performance. These objects require different semantics.
Xamarin.iOS will minimize the use of Objetive-C objects by keeping a cache:
C# NSObjects are also created on demand when you invoke a method or a property that returns an NSObject. At this point, the runtime will look into an object cache and determine whether a given Objective-C NSObject has already been surfaced to the managed world or not. If the object has been surfaced, the existing object will be returned, otherwise a constructor that takes an IntPtr as a parameter is invoked to construct the object.
The system keeps these objects alive even there are no references from managed code:
User-subclasses of NSObjects often contain C# state so whenever the Objective-C runtime performs a "retain" operation on one of these objects, the runtime creates a GCHandle that keeps the managed object alive, even if there are no C# visible references to the object. This simplifies bookeeping a lot, since the state will be preserved automatically for you.
Emphasis mine.
Thus, under Xamarin.iOS, if there were a chance that the list might contain wrapped Objetive-C objects, this code would help the garbage collector.
See the question How does memory management works on Xamarin.IOS, Miguel de Icaza explains in his answer that the semantics are to "retain" the object when you take a reference and "release" it when the reference is null.
On the Objetive-C side, "release" does not mean to destroy the object. Objetive-C uses a reference count garbage collector. When we "retain" the object the counter is incremented and when we "release" the counter is decreased. The system destroys the object when the counter reaches zero. See: About Memory Management.
Therefore, Objetive-C is bad at handling circular references (if A references B and B references A, their reference count is not zero, even if they cannot be reached), thus, you should avoid them in Xamarin.iOS. In fact, forgetting to decouple references will lead to leaks in Xamarin.iOS... See: Xamarin iOS memory leaks everywhere.
Others
dotGNU also uses a generational mark and sweep garbage collector.
I also had a look at CrossNet (that compiles IL to C++), it appears they attempted to implement it too. I do not know how good it is.
It depends on which version of .NET you are working with. On mobile platforms like Xamarin or mono, you may find that the garbage collector needs this kind of help in order to do its work. Whereas on desktop platforms the garbage collector implementation may be more elaborate. Each implementation of the CLI out there is going to have it's own implementation of the garbage collector and it is likely to behave differently from one implementation to another.
I can remember 10 years ago working on a Windows Mobile application which had memory issues and this sort of code was the solution. This was probably due to the mobile platform requiring a garbage collector that was more frugal with processing power than the desktop.
Decoupling objects helps simplify the analysis the garbage collector needs to do and helps avoid scenarios where the garbage collector fails to recognise a large graph of objects has actually become disconnected from all the threads in your application. Which results in memory leaks.
Anyone who believes you can't have memory leaks in .NET is an inexperienced .NET developer. On desktop platforms just ensuring Dispose is called on objects which implement them may be enough, however with other implementations you may find it is not.
List.Clear() will decouple the objects in the list from the list and each other.
EDIT: So to be clear I'm not claiming that any particular implementation currently out there is susceptible to memory leaks. And again depending on when this answer is read the robustness of the garbage collector on any implementation of the CLI currently out there could have changed since the time writing this.
Essentially I'm suggesting if you know that your code needs to be cross platform and used across many implementations of the .NET framework, especially implementations of the .NET framework for mobile devices, it could be worth investing time into decoupling objects when they are no longer required. In that case I'd start off by adding decoupling to classes that already implement Dispose, and then if needed look at implementing IDisposable on classes that don't implement IDisposable and ensuring Dispose is called on those classes.
How to tell for sure if it's needed? You need to instrument and monitor the memory usage of your application on each platform it is to be deployed on. Rather than writing lots of superfluous code, I think the best approach is to wait until your monitoring tools indicate you have memory leaks.
As mentioned in the docs:
List.Clear Method (): Count is set to 0, and references to other
objects from elements of the collection are also released.
In your 1st snippet:
if(list != null){
list.Clear();
list = null;
}
If you just set the list to null, it means that you release the reference of your list to the actual object in the memory (so the list itself is remain in the memory) and waiting for the Garbage Collector comes and release its allocated memory.
But the problem is that your list may contain elements that hold a reference to another objects, for example:
list → objectA, objectB, objectC
objectB → objectB1, objectB2
So, after setting the list to null, now list has no reference and it should be collected by Garbage Collector later, but objectB1 and objectB2 has a reference from objectB (still be in the memory) and because of that, Garbage Collector need to analyse the object reference chain. To make it less confusing, this snippet use .Clear() function to remove this confusion.
Clearing the list ensures that if the list is not garbage collected for some reason, then at the very least, the elements it contained can still be disposed of.
As stated in the comments, preventing other references to the list from existing requires careful planning, and clearing the list before nulling it doesn't incur a big enough performance hit to justify trying to avoid doing so.
I'm in a situation where I'm having to create thousands of objects at once, and the cost of instantiating the objects and garbage collecting them is impacting the performance of the application, and the impact of the garbage collector running hurts the performance more since this is on older hardware, so I'm mostly trying to prevent the creation of garbage. I believe a memory pool would solve my issue but I'm not sure how the memory pool would know when a resource in the pool was freed up for re-use. The tricky part is that receiver of objects from the pool end up passing that object around throughout the program and it would be very difficult to know when it could be manually freed up. I'd like it to be like a WeakReference, where I could know when nobody was using it anymore. But my understanding is that if I use a WeakReference in the memory pool then it would eventually get garbage collected from the pool itself and I need these objects to remain pretty much forever so they'll continue to get recycled. Sometimes the program can go for awhile without needing the objects so I imagine the garbage collector would collect them before the next time when they were needed and that would then trigger another performance hit as another thousand of these objects were made.
Is there a way I can make sure these objects are never collected, but know when there are no references to them aside from the memory pool itself? Do I need to implement reference counting for these objects somehow?
I've been googling for a couple of hours now and have seen no implementation of a Memory Pool that doesn't require the user to let the memory pool know when they're done with it. I find it hard to believe there is no way to do this in C#.
Is there a way I can make sure these objects are never collected, but know when there are no references to them aside from the memory pool itself?
Usually an object pool only holds references to available objects (you can check ObjectPool implementation in Roslyn). With that in mind, you can use the finalizer to resurrect the object and return it to the pool when it is unreachable.
However, I don't think it will improve performance. The whole pool will very soon reach generation 2, so the unreachable objects will need a full garbage collection to be returned to the pool. Depending on the memory usage pattern in your program, it might not happen very often. You can GC.Collect() and GC.WaitForPendingFinalizers() of course, but it will also hurt performance. You can try it out and check if it helps.
Another problem is the design - your objects are coupled to the pool.
I'd rather try to return the objects to the pool explicitly. Remember that not all objects have to be returned. If there are no more available objects, the pool can create new ones. The ones that were not returned will just be garbage collected. Check if there are some code paths where you are sure the objects are not needed anymore. If you can't find any, try to refactor your code.
I have an application that creates trees of nodes, then tosses them and makes new trees.
The application allocates about 20 MB upon startup. But I tried loading a large file of nodes many times, and the allocated memory went above 700 MB. I thought I would see memory being freed by the garbage collector on occasion.
The machine I'm on has 12 GB of RAM, so maybe it's just that such a "small" amount of memory being allocated doesn't matter to the GC.
I've found a lot of good info on how the GC works, and that it's best not to tell it what to do. But I wanted to verify that it's actually doing anything, and that I'm not somehow doing something wrong in the code that prevents my objects from being cleaned up.
The GC generally runs when either of the scenarios below occur:
You call GC.Collect (which you shouldn't)
Gen0's budget is exhausted
There are some other scenarios as well, but I'll skip those for now.
You didn't tell us how you measured the memory usage, but if you're looking at the memory usage of process itself (e.g. through task manager), then you may not see the numbers you expect. Remember that the .NET runtime essentially has its own memory manager that handles memory usage on behalf of you managed application. The runtime tries to be smart about it so it doesn't allocate and free memory to the OS all the time (those are expensive operations). This question may be relevant as well.
If you're concerned about memory leaks have a look at some of the answers here.
When does the .Net 3.5 garbage collector run?
I thought I would see memory being freed by the garbage collector on occasion.
Since the GC is non-deterministic, you won't be able to necessarily determine when it is going to issue a collection. Short answer: It will run when needed. Trying to analyze your code and predict or assume it should be running at a certain time usually ends up down a rabbit hole.
Answer to: do I leak objects or GC have not need to run yet?
Use memory profiler to see what objects are allocated. As basic step - force garbage collection (GC.Collect) and check out if allocated memory (GC.GetTotalMemory) seems to be reasonable.
If you want to make sure that you're not leaving any unwanted object behind you can use dotTrace memory profiler. It allows you to take two snapshots of objects in memory (taken some time apart) and compare them. You will can clearly see if any old nodes are still hanging around and what is keeping a reference to them and stops them from being collected.
You may have the managed equivalent of a memory leak. Are you maintaining stale references to these objects (i.e., do you have a List<T> or some other object which tracks these nodes)?
Are the subscribing to an event of an object that is not going out of scope? A reference to the subscribee of an event is maintained, so if you don't detach it will keep your objects alive.
You may also be forgetting to Dispose of objects that implement IDisposable. Can't say without seeing your code.
The exact behavior of the GC is implementation defined. You should design your application such that it does not matter. If you need deterministic memory management then you are using the wrong language. Use a tool (RedGate's ANTS profiler will do) to see if you are leaking references somewhere.
This comic is the best way to explain this topic :)
You can monitor the garbage collector activity in .NET 4.0 and beyond with GC.RegisterForFullGCNotification as described in this link: http://www.abhisheksur.com/2010/08/garbage-collection-notifications-in-net.html
i have a cache that uses WeakReferences to the cached objects to make them automatically removed from the cache in case of memory pressure. My problem is that the cached objects are collected very soon after they have been stored in the cache. The cache runs in a 64-Bit application and in spite of the case that more than 4gig of memory are still available, all the cached objects are collected (they usually are stored in the G2-heap at that moment). There are no garbage collection induced manually as the process explorer shows.
What methods can i apply to make the objects live a litte longer?
Using WeakReferences as the primary means of referencing cached objects is not really a great idea, because as Josh said, your at the mercy of any future behavioral changes to WeakReference and the GC.
However, if your cache needs any kind of resurrection capability, use of WeakReferences for items that are pending purge is useful. When an item meets eviction criteria, rather than immediately evicting it, you change its reference to a weak reference. If anything requests it before it is GC'ed, you restore its strong reference, and the object can live again. I have found this useful for some caches that have hard to predict hit rate patterns with frequent enough "resurrections" to be beneficial.
If you have predictable hit rate patterns, then I would forgoe the WeakReference option and perform explicit evictions.
There is one situation where a WeakReference-based cache may be good: when the usefulness of an item in the class is predicated upon the existence of a reference to it. In such a situation, a weak interning cache may be useful. For example, if one had an application which would deserialize many large immutable objects, many of which were expected to be duplicates, and would have to perform many comparisons between them. If X and Y are references to some immutable class type, testing X.Equals(Y) will be very fast if both variables point to the same instance, but may be very slow if they point to distinct instances that happen to be equal. If a deserialized object happens to match another object to which a reference already exists, fetching a from the dictionary a reference to that latter object (requiring one slow comparison) may expedite future comparisons. On the other hand, if it matched an item in the dictionary but the dictionary was the only reference to that item, there would be little advantage to using the dictionary object instead of simply keeping the object that was read in; probably not enough advantage to justify the cost of the comparison. For an interning cache, having WeakReferences get invalidated as soon as possible once no other references exist to an object would be a good thing.
In .net, a WeakReference is not considered a reference from the GC standpoint at all, so any object that only has weak references will be collected in the next GC run (for the appropriate generation).
That makes weak reference completely inappropriate for caching - as your experience shows.
You need a "real" cache component, and the most important thing about caching is to get one where the eviction policy (that is, the rules about when to drop an object from the cache) are a good match for you application's usage pattern.
No, WeakReference is not good for that because the behavior of the garbage collector can and will change over time and your cache should not be relying on today's behavior. Also many factors outside of your control could affect memory pressure.
There are many implementations of a cache for .NET. You could find probably a dozen on CodePlex. I guess what you need to add to it is something that looks at the application's current working set to use that as a trigger for purging.
One more note about why your objects are being collected so frequently. The GC is very aggressive at cleaning up Gen0 objects. If your objects are very short-lived (up until the only reference to it is a weak reference) then the GC is doing what it's designed to do by cleaning up as quickly as it can.
I believe the problem you are having is that the Garbage Collector removes weakly referenced objects in response not only in response to memory pressure - instead it will do collection quite aggressively sometimes just because the runtime system thinks some objects may likely have become unreachable.
You may be better off using e.g. System.Runtime.Caching.MemoryCache which can be configured with a memory limit, or custom eviction policies for the items.
The answer actually depends on usage characteristics of the cache you are trying to build. I have successfully used WeakReference based caching strategy for improving performance in many of my projects where the cached objects are expected to be used in short bursts of multiple reads. As others pointed out, the weak references are pretty much garbage from GC's point of view and will be collected whenever the next GC cycle is run. It's nothing to do with the memory utilization.
If, however, you need a cache that survives such brutality from GC, you need to use or mimic the functionality provided by System.Runtime.Caching namespace. Keep in mind that you'd need an additional thread that cleans up the cache when the memory usage is crossing your thresholds.
A bit late, but here's a relevant use case:
I need to cache two types of objects: large (deserialised) data files that take 10 minutes to load and cost 15G of ram each, and smaller (dynamically compiled) objects that contain internal references to those data files (the smaller objects are also cached because they take ~10s to generate). These caches are hidden within the factories that supply the objects (the former component having no knowledge of the latter), and have different eviction policies.
When my `data file' cache evicts an object, it replaces it by a weak reference, so if that object is still available when next requested, we can resurrect it (and renew its cache timeout). In this way we avoid losing (or accidentally duplicating) any object before it is truly defunct (i.e. not used anywhere else). Notice that neither cache is required to be aware of the other, and that no other client objects need to be aware that there are any caches at all (eg: we avoid needing 'keepalives', callbacks, registration, retrieve-and-return scopes, etc - things get a lot simpler).
So although using WeakReference by itself (instead of a cache) is a terrible idea (because modern GCs are typically tuned to the size of the L2 CPU cache, and regular code will burn through this many times per minute), it's very useful as a way to hide your caches from the rest of your code.