When does .NET's MemoryCache eviction occur? How can I simulate eviction in a console application? Whenever I try to add objects to the memory cache until eviction occurs, I get an OutOfMemoryException instead.
See MemoryCacheElement. It controls the default behavior if you don't pass values in the NameValueCollection config to the constructor, or if you use the default instance.
Looking through the defaults of MemoryCacheElement, the cache checks memory pressure every two minutes (though it polls more often the closer you are to the high-pressure limit). Inside the timer's callback it calculates the percentage to trim off the MemoryCache and then calls MemoryCache.Trim(Int32) with that percentage.
One thing to note in the percentage calculation, if no Gen 2 garbage collections have happened the cache does not attempt to shrink itself.
It is quite possible that your test console program used up all the memory before a Gen 2 collection could occur, or that the cache was still in its initial slow mode of checking memory pressure every two minutes and had not yet had a chance to clear items out.
If you would like to simulate an eviction, just call:
MemoryCache.Default.Trim(50);
That will evict half of the items from the default cache.
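To watch an eviction happen from a console application, a minimal sketch along the following lines may help. It assumes a reference to System.Runtime.Caching (a framework assembly on .NET Framework, a NuGet package on newer runtimes). The class name EvictionDemo, the key names, the item count and the item size are made up for illustration. It uses a private MemoryCache instance rather than MemoryCache.Default so the demo does not disturb other code.

```csharp
using System;
using System.Runtime.Caching;
using System.Threading;

public static class EvictionDemo
{
    // Fills a private cache, trims it, and returns how many removal
    // callbacks reported CacheEntryRemovedReason.Evicted.
    public static int Run(int itemCount = 1000, int trimPercent = 50)
    {
        int evicted = 0;
        using (var cache = new MemoryCache("EvictionDemo"))
        {
            var policy = new CacheItemPolicy
            {
                // Called for every removal; only count real evictions.
                RemovedCallback = args =>
                {
                    if (args.RemovedReason == CacheEntryRemovedReason.Evicted)
                        Interlocked.Increment(ref evicted);
                }
            };

            for (int i = 0; i < itemCount; i++)
                cache.Set("key" + i, new byte[1024], policy);

            // Ask the cache to drop roughly trimPercent of its entries now,
            // instead of waiting for the memory-pressure timer.
            long removed = cache.Trim(trimPercent);
            Console.WriteLine("Trim reported {0} removed, {1} left",
                removed, cache.GetCount());
        }
        return evicted;
    }

    public static void Main()
    {
        Console.WriteLine("Evicted callbacks: {0}", Run());
    }
}
```

Entries that are still in the cache when it is disposed are removed with a different reason, which is why the callback filters on Evicted.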
Related
I'm trying to use .NET System.Runtime.Caching.MemoryCache (.NET Framework 4.7.2). I'm creating my own instance and setting the memory limit with the CacheMemoryLimitMegabytes parameter.
I use quite short keys, about 50 characters on average. The cached data is just long values (DB record IDs).
I use a CacheItemPolicy with SlidingExpiration set to 15 minutes and RemovedCallback set to my method so I can log item evictions.
In my unit tests everything works fine. I set the cache memory limit to 1 MB (just for testing) and I'm able to store thousands of items before eviction starts.
But when I use MemoryCache in the application on the dev server with the cache memory limit set to 1 MB, eviction starts after adding approximately 10 items to the cache.
I tried to measure the memory used by the cache with the approach described here:
In the unit test it reports reasonable values, but with my solution on the dev server I get approx. 4.5 MB just after adding a single item to the cache. I even tried calling GC.Collect(2, GCCollectionMode.Forced) before checking ApproximateSize of _sizedRefMultiple, but I still get this value.
And because eviction relies on the values returned from ApproximateSize as well, the cache starts evicting items almost immediately. So I suspect that the issue is caused by the value reported by ApproximateSize.
Has anyone experienced similar behaviour? Do you have any tips on what to check?
When capturing a dump file and analyzing it (e.g. in WinDbg), I often get the warning that the data may not be accurate, or commands may not be accessible, because the process was in the middle of GC when the dump file was collected.
We often do memory analysis precisely because the memory usage of the process is high and memory pressure is high, which I guess forces .NET to GC frequently.
How do I avoid taking dumps during a GC? Is there a way to know when it's safe to capture the dump file?
I am no expert in this area, but I've noticed that you can use the performance counters of the .NET runtime to monitor some interesting things. One of them is the number of bytes allocated on the garbage collection heap, which is updated at the end of each collection.
The description of Allocated Bytes/second in Performance Counters in the .NET Framework states:
Displays the number of bytes per second allocated on the garbage collection heap. This counter is updated at the end of every garbage collection, not at each allocation. This counter is not an average over time; it displays the difference between the values observed in the last two samples divided by the duration of the sample interval.
According to my tests, if you set the update interval of the performance monitor to 1 second and take a good look at the indicator for Allocated Bytes/second, it seems to show a value of 0 after a collection has completed. So I would assume that you can derive from this value whether a collection is in progress or not.
I checked it by building a small application in VS 2015 which can show whether a garbage collection is in progress. The value of the indicator was nonzero whenever that was the case.
Update (Thanks Thomas)
It is possible to use ProcDump for monitoring the performance counter and create the dump in an automated fashion.
The correct way of doing so would be: procdump ProcessName -s 1 -ma -pl "\.NET CLR Memory(ProcessName)\Allocated Bytes/second" 1000 which would trigger the dump if the value drops below one thousand.
This should work as the value is only zero if there is no garbage collection going on.
If you are not operating on an English version of the operating system, you will have to find out the correct language-specific name of the performance counter (this can be done by looking at the MSDN link provided above and switching to a different language there). E.g. the German name would be "\.NET CLR-Speicher(ProcessName)\Zugeordnete Bytes/Sek.".
In my application I use a dictionary (supporting adding, removing, updating and lookup) where both keys and values are or can be made serializable (values can possibly be quite large object graphs). I came to a point when the dictionary became so large that holding it completely in memory started to occasionally trigger OutOfMemoryException (sometimes in the dictionary methods, and sometimes in other parts of code).
After an attempt to completely replace the dictionary with a database, performance dropped down to an unacceptable level.
Analysis of the dictionary usage patterns showed that usually a smaller part of values are "hot" (are accessed quite often), and the rest (a larger part) are "cold" (accessed rarely or never). It is difficult to say when a new value is added if it will be hot or cold, moreover, some values may migrate back and forth between hot and cold parts over time.
I think that I need an implementation of a dictionary that is able to flush its cold values to a disk on a low memory event, and then reload some of them on demand and keep them in memory until the next low memory event occurs when their hot/cold status will be re-assessed. Ideally, the implementation should neatly adjust the sizes of its hot and cold parts and the flush interval depending on the memory usage profile in the application to maximize overall performance. Because several instances of a dictionary exist in the application (with different key/value types), I think, they might need to coordinate their workflows.
Could you please suggest how to implement such a dictionary?
Compile for 64 bit, deploy on 64 bit, add memory. Keep it in memory.
Before you grow your own, you may alternatively look at WeakReference http://msdn.microsoft.com/en-us/library/ms404247.aspx. It would of course require you to rebuild those objects that were reclaimed, but one should hope that those which are reclaimed are not used much. It comes with the caveat that its own guidelines state to avoid using weak references as an automatic solution to memory management problems. Instead, develop an effective caching policy for handling your application's objects.
Of course you can ignore that guideline and effectively work your code to account for it.
You can implement the caching policy and upon expiry save to database, on fetch get and cache. Use a sliding expiry of course since you are concerned with keeping those most used.
Do remember, however, that most used vs. heaviest is a trade-off. Losing an object 10 times a day that takes 5 minutes to restore would annoy users much more than losing an object 10000 times that takes just 5 ms to restore.
And someone above mentioned the web cache. It does automatic memory management with callbacks as noted, depends if you want to lug that one around in your apps.
And...last but not least, look at a distributed cache. With sharding you can split that big dictionary across a few machines.
Just an idea - never did that and never used System.Runtime.Caching:
Implement a wrapper around MemoryCache which will:
Add items with an eviction callback specified. The callback will place evicted items to the database.
Fetch item from database and put back into MemoryCache if the item is absent in MemoryCache during retrieval.
If you expect a lot of requests for items missing from both the database and memory, you'll probably also need either a Bloom filter or a cache of keys for present/missing items.
I had a similar problem in the past.
The concept you are looking for is a read-through cache with an LRU (Least Recently Used) queue.
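The wrapper idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name SpillingCache is hypothetical, a ConcurrentDictionary stands in for the real database, the 15-minute sliding expiration is an arbitrary choice, and races between an eviction and a concurrent lookup are ignored.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.Caching;

// Hypothetical wrapper: hot items live in MemoryCache, items the cache
// drops are spilled to a backing store and read back on demand.
public sealed class SpillingCache<TValue> : IDisposable where TValue : class
{
    private readonly MemoryCache _hot = new MemoryCache("SpillingCache");

    // Stand-in for the database; replace with real persistence.
    private readonly ConcurrentDictionary<string, TValue> _cold =
        new ConcurrentDictionary<string, TValue>();

    public void Set(string key, TValue value)
    {
        var policy = new CacheItemPolicy
        {
            SlidingExpiration = TimeSpan.FromMinutes(15), // assumed value
            RemovedCallback = OnRemoved
        };
        TValue stale;
        _cold.TryRemove(key, out stale);   // drop any older spilled copy
        _hot.Set(key, value, policy);
    }

    public TValue Get(string key)
    {
        var value = _hot.Get(key) as TValue;
        if (value != null) return value;

        // Miss in memory: read through to the backing store and re-cache.
        if (_cold.TryRemove(key, out value))
        {
            Set(key, value);
            return value;
        }
        return null;
    }

    // Exposed so that eviction can be forced, e.g. in tests.
    public long Trim(int percent) { return _hot.Trim(percent); }

    private void OnRemoved(CacheEntryRemovedArguments args)
    {
        // Spill only what the cache dropped on its own; explicit removals,
        // overwrites and disposal are ignored.
        if (args.RemovedReason == CacheEntryRemovedReason.Evicted ||
            args.RemovedReason == CacheEntryRemovedReason.Expired)
        {
            _cold[args.CacheItem.Key] = (TValue)args.CacheItem.Value;
        }
    }

    public void Dispose() { _hot.Dispose(); }
}
```

A lookup for a key that exists in neither place still costs a round trip to the backing store, which is where the Bloom filter or key cache mentioned above would come in.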
Is it there any LRU implementation of IDictionary?
As you add things to your dictionary, keep track of which ones were used least recently, remove them from memory and persist them to disk.
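A minimal sketch of such an LRU structure, assuming a fixed item-count capacity rather than a memory-based limit. The names LruCache and onEvict are hypothetical; the onEvict callback is where evicted entries could be persisted to disk. The class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Minimal LRU map: a dictionary for lookup plus a linked list ordered
// from most recently used (front) to least recently used (back).
public sealed class LruCache<TKey, TValue>
{
    private readonly int _capacity;
    private readonly Action<TKey, TValue> _onEvict;   // e.g. persist to disk
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> _map =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();
    private readonly LinkedList<KeyValuePair<TKey, TValue>> _order =
        new LinkedList<KeyValuePair<TKey, TValue>>();

    public LruCache(int capacity, Action<TKey, TValue> onEvict = null)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException("capacity");
        _capacity = capacity;
        _onEvict = onEvict;
    }

    public int Count { get { return _map.Count; } }

    public bool TryGet(TKey key, out TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (_map.TryGetValue(key, out node))
        {
            _order.Remove(node);     // a hit moves the entry to the front
            _order.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }

    public void Set(TKey key, TValue value)
    {
        LinkedListNode<KeyValuePair<TKey, TValue>> node;
        if (_map.TryGetValue(key, out node))
            _order.Remove(node);             // replacing an existing entry
        else if (_map.Count >= _capacity)
            EvictLeastRecentlyUsed();        // make room for the new one

        _map[key] = _order.AddFirst(new KeyValuePair<TKey, TValue>(key, value));
    }

    private void EvictLeastRecentlyUsed()
    {
        var last = _order.Last;
        _order.RemoveLast();
        _map.Remove(last.Value.Key);
        if (_onEvict != null) _onEvict(last.Value.Key, last.Value.Value);
    }
}
```

On a miss, the caller would load the value from disk and call Set again, which gives the read-through behaviour.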
I have a collection (I'm writing a Weak Dictionary) and I need to cull the dead WeakReferences periodically. What I've usually seen is checks in the Add and Remove methods that say, "After X modifications to the collection, it's time to cull." This will be acceptable for me, but it seems like there should be a better way.
I would really like to know when the GC ran and run my cleanup code immediately after. After all, the GC is probably the best mechanism for determining when is a good time to clean up dead references. I found Garbage Collection Notifications, but it doesn't look like this is what I want. I don't want to spawn a separate thread just to monitor the GC. Ideally, my collection would implement IWantToRunCodeDuringGC or subscribe to a System.GC.Collected event. But the .NET framework probably can't trust user code to run during a GC...
Or maybe there's another approach I'm overlooking.
EDIT: I don't think it matters if my code runs after, before, or during the GC.
I guess you want to create a weak dictionary as a way of “lazy” caching of some data.
You need to consider the fact that GC will occur very often, and your weak references will be dead most of the time if no other objects reference their targets. GC occurs approximately after every 256KB of memory is allocated. That is pretty often.
You will probably be better off implementing your cache as a dictionary with a maximum number of elements. Then you can use a least-recently-used algorithm or a time-based algorithm for pushing elements out of the collection. Such an approach usually has better performance and memory consumption than using weak references.
Can't you just use an object with no references, then call some code in the finalizer? The finalizer is called when the GC is collecting the object.
EDIT: Then of course, you wouldn't know when the GC was done.. Hmm.
From the standpoint of avoiding memory leaks, I would suggest that when something is added to a dictionary, you check the number of garbage collections that have been performed. If a collection has occurred since the dictionary was last checked, and the number of items added between that check and the collection exceeds a reasonable fraction of the size of the dictionary (say, 10%, or some minimum number of items, whichever is less), that is a sign that the dictionary should be swept. Note that this approach limits the number of excess items in the dictionary to a certain fraction of the dictionary size while offering reasonable performance regardless of dictionary size.
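A rough sketch of that heuristic, under some simplifying assumptions: it compares GC.CollectionCount(0) against the value seen at the last sweep, and it counts all items added since the last sweep rather than only those added before the most recent collection. The class name WeakValueDictionary and the 10% threshold are illustrative, WeakReference<T> needs .NET 4.5 or later, and the class is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

// Weak-valued dictionary that sweeps dead entries on Add, but only when
// a garbage collection has run and enough items have been added.
public sealed class WeakValueDictionary<TKey, TValue> where TValue : class
{
    private readonly Dictionary<TKey, WeakReference<TValue>> _items =
        new Dictionary<TKey, WeakReference<TValue>>();
    private int _gcCountAtLastSweep = GC.CollectionCount(0);
    private int _addsSinceLastSweep;

    public int Count { get { return _items.Count; } }

    public void Add(TKey key, TValue value)
    {
        _items[key] = new WeakReference<TValue>(value);
        _addsSinceLastSweep++;
        SweepIfWorthwhile();
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        WeakReference<TValue> weak;
        if (_items.TryGetValue(key, out weak) && weak.TryGetTarget(out value))
            return true;
        value = null;
        return false;
    }

    private void SweepIfWorthwhile()
    {
        int gcCount = GC.CollectionCount(0);
        if (gcCount == _gcCountAtLastSweep) return;   // no GC since last sweep

        int threshold = Math.Max(1, _items.Count / 10);   // assumed 10% figure
        if (_addsSinceLastSweep < threshold) return;

        // Collect the keys first; a dictionary cannot be modified while
        // it is being enumerated.
        var dead = new List<TKey>();
        foreach (var pair in _items)
        {
            TValue ignored;
            if (!pair.Value.TryGetTarget(out ignored)) dead.Add(pair.Key);
        }
        foreach (var key in dead) _items.Remove(key);

        _gcCountAtLastSweep = gcCount;
        _addsSinceLastSweep = 0;
    }
}
```

No extra thread is needed: the check piggybacks on Add, and the collection counter is cheap to read.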
OVERVIEW
I am facing performance slowdown while iterating MANY times through a calculator class.
Iterations take about 3 min each at the beginning and take longer and longer as the iteration count grows (30 min+ per process). I have to stop the program and restart the execution where I left it to come back to normal conditions (3 min per process).
WHAT I DO
I have a scientific application that tests a set of parameters over a process.
For example, I have N scenarios (i.e. parameter combinations), tested over an experimentation set. This consists of a calculator class that takes the parameters as input, processes them against T possible XP conditions, and stores the output in ORM objects that are sent to the DB after each iteration. In other words, each of the N parameter combinations is passed T times through the calculator.
Parameter combination : Params Set 1, Params Set 2, ...., Params Set N
Experimental Set : XP Set 1 , XP Set 2 , ...., XP Set T
So I have NxT combinations, N and T being around 256 each, which give 65000+ iterations.
HOW I DO IT
I have a GUI to fix the parameter sets and launch Background Workers (one per parameter combination). Each Background Worker loads the first of the T XP sets, executes the current parameter set, moves to the next XP set, and so on. A report is generated by the calculator after each single iteration (i.e. after each Nx/Tx), and an event is fired to populate .NET Linq/SQL ORM objects (AgileFX) and store them in a SQL Server database.
THE PROBLEM
The process runs fine for the first 30 min and then slowly begins to drift, each iteration taking longer and longer (sounds like a memory overflow or so...)
HINT
Oddly enough, an experimenter noticed very pertinently that the processing time grows in a linear fashion : +3mn more of the precedent processing time. Which comes down to an arithmetic progression (Tn+1 = Tn + 3mn)
I have a 12-Core INTEL and 24GB RAM
A quick suggestion, could you solve your problem through Memoization, avoiding re-calculating what should have been known results?
Also, remember that the garbage collector will not be able to collect an object as long as it can still find a reference to that object somewhere!
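As an illustration of the memoization suggestion, here is a small sketch. The helper name Memoizer is hypothetical, and it assumes the calculation is a pure function of its argument, which may or may not hold for the calculator in the question.

```csharp
using System;
using System.Collections.Concurrent;

public static class Memoizer
{
    // Wraps a pure function so that results are cached per argument.
    // Under contention the function may still run more than once for the
    // same argument; only one result is kept.
    public static Func<TArg, TResult> Memoize<TArg, TResult>(Func<TArg, TResult> compute)
    {
        var results = new ConcurrentDictionary<TArg, TResult>();
        return arg => results.GetOrAdd(arg, compute);
    }
}
```

The cache inside the returned delegate holds a reference to every result for as long as the delegate itself is reachable, so it grows without bound unless it is dropped or capped. That is exactly the kind of reference the second remark above warns about.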
I think I have found one part of the problem, but it did not fix the issue completely :
Objects were sent to the ORM via delegates registered by a listener, so each calculation thread still "existed" in memory even after it had ended.
As a colleague put it: "Even if you move off, if I still have your address in my registers, for me you still live in the neighborhood."
BTW, performance wizard in VS2010 works a treat. Extremely insightful and useful for monitoring overall memory performance with precision and accuracy.
EDIT : PROBLEM SOLVED
The class responsible for firing background workers was keeping track of some data in a tracker object that kept growing on and on and never flushed, getting bigger and bigger. I've noticed this by closely tracking memory usage per object in VS 2010 Performance Wizard.
I advise having a clear view of object lifecycles and memory usage, although it can get tough when the application is big and complex.