How do I clear a System.Runtime.Caching.MemoryCache - c#

I use a System.Runtime.Caching.MemoryCache to hold items which never expire. However, at times I need the ability to clear the entire cache. How do I do that?
I asked a similar question here concerning whether I could enumerate the cache, but that is a bad idea as it needs to be synchronised during enumeration.
I've tried using .Trim(100) but that doesn't work at all.
I've tried getting a list of all the keys via Linq, but then I'm back where I started because evicting items one-by-one can easily lead to race conditions.
I thought to store all the keys, and then issue a .Remove(key) for each one, but there is an implied race condition there too, so I'd need to lock access to the list of keys, and things get messy again.
I then thought that I should be able to call .Dispose() on the entire cache, but I'm not sure if this is the best approach, due to the way it's implemented.
Using ChangeMonitors is not an option for my design, and is unnecessarily complex for such a trivial requirement.
So, how do I completely clear the cache?

I struggled with this at first. MemoryCache.Default.Trim(100) does not work (as discussed). Trim is only a best-effort operation: its argument is the percentage of entries to evict, so Trim(100) asks for everything to go, but in practice it may remove only some of the least recently used entries.
Trim returns the number of items it removed, so you can check the result. Most people expect Trim(100) to remove all items, and it does not.
This code removes all items from MemoryCache.Default for me in my xUnit tests. MemoryCache.Default is the default cache instance.
foreach (var element in MemoryCache.Default)
{
MemoryCache.Default.Remove(element.Key);
}

You should not call dispose on the Default member of the MemoryCache if you want to be able to use it anymore:
The state of the cache is set to indicate that the cache is disposed.
Any attempt to call public caching methods that change the state of
the cache, such as methods that add, remove, or retrieve cache
entries, might cause unexpected behavior. For example, if you call the
Set method after the cache is disposed, a no-op error occurs. If you
attempt to retrieve items from the cache, the Get method will always
return Nothing.
http://msdn.microsoft.com/en-us/library/system.runtime.caching.memorycache.dispose.aspx
About the Trim, it's supposed to work:
The Trim property first removes entries that have exceeded either an absolute or sliding expiration. Any callbacks that are registered
for items that are removed will be passed a removed reason of Expired.
If removing expired entries is insufficient to reach the specified trim percentage, additional entries will be removed from the cache
based on a least-recently used (LRU) algorithm until the requested
trim percentage is reached.
But two other users reported on the same page that it doesn't work, so I guess you are stuck with Remove() http://msdn.microsoft.com/en-us/library/system.runtime.caching.memorycache.trim.aspx
Update
However, I see no mention of it being a singleton or otherwise unsafe to have multiple instances, so you should be able to overwrite your reference with a new instance.
But if you need to free the memory held by the Default instance, you will have to clear it manually or destroy it permanently via Dispose (rendering it unusable).
Based on your question, you could write your own singleton-style class that returns a MemoryCache which you can dispose internally at will. That is the nature of a cache :-)
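A minimal sketch of that idea, assuming a hypothetical wrapper named ResettableCache. The class name, the cache name "resettable", and the use of Interlocked.Exchange for the swap are illustrative assumptions; only MemoryCache, ObjectCache.InfiniteAbsoluteExpiration, Volatile and Interlocked come from the framework (System.Runtime.Caching must be referenced).

```csharp
using System;
using System.Runtime.Caching;
using System.Threading;

// Hypothetical wrapper: owns a private MemoryCache and replaces it on Clear().
public sealed class ResettableCache : IDisposable
{
    private MemoryCache _cache = new MemoryCache("resettable");

    public object Get(string key)
    {
        // Read the current instance; it may be swapped by Clear() at any time.
        return Volatile.Read(ref _cache).Get(key);
    }

    public void Set(string key, object value)
    {
        // Items never expire, matching the scenario in the question.
        Volatile.Read(ref _cache).Set(key, value, ObjectCache.InfiniteAbsoluteExpiration);
    }

    public void Clear()
    {
        // Publish a fresh cache first, then dispose the one nobody can reach any more.
        MemoryCache old = Interlocked.Exchange(ref _cache, new MemoryCache("resettable"));
        old.Dispose();
    }

    public void Dispose()
    {
        _cache.Dispose();
    }
}
```

A caller that picked up the old instance just before Clear() may still touch it after disposal. Going by the documentation quoted above, that should amount to a missed read or an ignored write, which is usually tolerable for a cache.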

Here's what I made for something I was working on...
public void Flush()
{
List<string> cacheKeys = MemoryCache.Default.Select(kvp => kvp.Key).ToList();
foreach (string cacheKey in cacheKeys)
{
MemoryCache.Default.Remove(cacheKey);
}
}

I know this is an old question but the best option I've come across is to
Dispose the existing MemoryCache and create a new MemoryCache object.
https://stackoverflow.com/a/4183319/880642
The answer doesn't really provide the code to do this in a thread safe way. But this can be achieved using Interlocked.Exchange
var oldCache = Interlocked.Exchange(ref _existingCache, new MemoryCache("newCacheName"));
oldCache.Dispose();
This will swap the existing cache with a new one and allow you to safely call Dispose on the original cache. This avoids needing to enumerate the items in the cache and race conditions caused by disposing a cache while it is in use.
Edit
Here's how I use it in practice accounting for DI
public class CustomCacheProvider : ICustomCacheProvider
{
private IMemoryCache _internalCache;
private readonly ICacheFactory _cacheFactory;
public CustomCacheProvider (ICacheFactory cacheFactory)
{
_cacheFactory = cacheFactory;
_internalCache = _cacheFactory.CreateInstance();
}
public void Set(string key, object item, MemoryCacheEntryOptions policy)
{
_internalCache.Set(key, item, policy);
}
public object Get(string key)
{
return _internalCache.Get(key);
}
// other methods omitted for brevity
public void Dispose()
{
_internalCache?.Dispose();
}
public void EmptyCache()
{
var oldCache = Interlocked.Exchange(ref _internalCache, _cacheFactory.CreateInstance());
oldCache.Dispose();
}
}
The key is controlling access to the internal cache using another singleton which has the ability to create new cache instances using a factory (or manually if you prefer).

@stefan's answer explains the principle; here's how I'd do it.
One should synchronise access to the cache whilst recreating it, to avoid the race condition of client code accessing the cache after it is disposed, but before it is recreated.
To avoid this synchronisation, do this in your adapter class (which wraps the MemoryCache):
public void clearCache() {
var oldCache = TheCache;
TheCache = new MemoryCache("NewCacheName", ...);
oldCache.Dispose();
GC.Collect();
}
This way, TheCache is always in a non-disposed state, and no synchronisation is needed.

I ran into this problem too. .Dispose() did something quite different than what I expected.
Instead, I added a static field to my controller class. I did not use the default cache, to get around this behavior, but created a private one (if you want to call it that). So my implementation looked a bit like this:
public class MyController : Controller
{
static MemoryCache s_cache = new MemoryCache("myCache");
public ActionResult Index()
{
if (conditionThatInvalidatesCache)
{
s_cache = new MemoryCache("myCache");
}
String s = s_cache["key"] as String;
if (s == null)
{
//do work
//add to s_cache["key"]
}
//do whatever next
}
}

Check out this post, and specifically, the answer that Thomas F. Abraham posted.
It has a solution that enables you to clear the entire cache or a named subset.
The key thing here is:
// Cache objects are obligated to remove entry upon change notification.
base.OnChanged(null);
I've implemented this myself, and everything seems to work just fine.

Related

Is using property getters for initialization (to avoid having to call methods in specific order) bad practice?

Suppose I have a class that provides some data to my application. Data initially comes from database, and I provide it through some methods that handle the whole database thing and present the result as a usable class instead of raw query result. This class has to do some setup (not complex) to make sure any method called can use the database (e.g. connect to database, make sure it contains some critical info, etc). So, were I to put it in a method (say, method Init(), that would handle checking for database, connecting to it, verifying that it does contain the info), I would have to make sure that this method is called before any other method.
So, I usually find that instead of doing this:
public class DataProvider
{
private SqlController controller;
public void Init()
{
controller = new SqlController();
controller.Init();
controller.ConnectToDataBase();
CheckForCriticalInfoInDatabase();
}
public Data GetData()
{
// get data from database (not actually going to use raw queries like that, just an example)
var queryResult = controller.RunQuery("SELECT something FROM SOME_TABLE");
// and present it as usable class
Data usefulData = QueryResultToUsefulData(queryResult);
return usefulData;
}
...
}
and then always making sure I call Init() before GetData(), I do something like
private SqlController _controller;
private SqlController controller
{
get
{
if (_controller == null)
{
_controller = new SqlController();
_controller.Init();
_controller.ConnectToDataBase();
CheckForCriticalInfoInDatabase();
}
return _controller;
}
}
So now I can be sure that I won't use an uninitialised SqlController, and I don't have to do that same null check in every method that uses it. However, I have never noticed getters being used this way in other people's code.
Is there some pitfall I don't see? To me it looks like it's the same as lazy initialization, with the exception being that I use it not because the initialization is heavy or long, but because I don't want to check the order in which I call methods. This question points out that it's not thread-safe (not a concern in my case, plus I imagine it could be made thread-safe with some locks) and that setting the property to null will result in unintuitive behaviour (not a concern, because I don't have a setter at all and the backing field shouldn't be touched either way).
Also, if this kind of code IS bad practice, what is the proper way to ensure that my methods don't rely on the order in which they are called?
As @madreflection said in the comments on the question, use a method for anything that could be slow. Getters and setters should just be quick ways of getting and setting a value.
Connecting to a database can be slow or can fail, so you may need catch blocks that try different connection methods, and so on.
You could also run the checks in the constructor of the object. That way the object cannot be used without the initialisation having run, which saves time when tracing where an error actually occurs.
For example, if one function creates the object, does a bunch of 'stuff', and then tries to use the object without calling Init(), you get the error after all of the 'stuff' rather than where you created the object. This could lead you to think something is wrong with the way you are using the object, when in fact it has not been initialised.
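The constructor approach could look roughly like the sketch below. SqlController here is a hypothetical stand-in for the class in the question (its IsConnected property is an assumption added for illustration), and CanQuery is an invented method that stands in for GetData().

```csharp
using System;

// Hypothetical stand-in for the SqlController from the question.
public class SqlController
{
    public bool IsConnected { get; private set; }
    public void Init() { }
    public void ConnectToDataBase() { IsConnected = true; }
}

public class DataProvider
{
    private readonly SqlController _controller;

    // All setup happens here, so a DataProvider can never exist uninitialised,
    // and a failure surfaces at the point of construction.
    public DataProvider()
    {
        _controller = new SqlController();
        _controller.Init();
        _controller.ConnectToDataBase();
        if (!_controller.IsConnected)
            throw new InvalidOperationException("Database connection could not be established.");
    }

    // Stand-in for GetData(): every method can rely on _controller being ready.
    public bool CanQuery()
    {
        return _controller.IsConnected;
    }
}
```

With this shape there is no call order to remember: any code that holds a DataProvider already holds an initialised one.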

Are Private properties of a class called within a Parallel.Foreach body Thread Safe?

I am tasked with writing a system to process result files created by a different process (which I have no control over), and I am trying to modify my code to make use of Parallel.ForEach. The code works fine with a plain foreach, but I have some concerns about thread safety when using the parallel version. The base question I need answered here is "Is the way I am doing this going to guarantee thread safety?" or is this going to cause everything to go sideways on me?
I have tried to make sure all calls are to instances and have removed everything static except the initial static void Main. It is my current understanding that this will do a lot towards assuring thread safety.
I have basically the following, edited for brevity
static void Main(string[] args)
{
MyProcess process = new MyProcess();
process.DoThings();
}
And then in the actual process to do stuff I have
public class MyProcess
{
public void DoThings()
{
//Get some list of things
List<Thing> things = getThings();
Parallel.ForEach(things, item => {
//based on some criteria, take actions from MyActionClass
MyActionClass myAct = new MyActionClass(item);
string tempstring = myAct.DoOneThing();
if(somecondition)
{
myAct.DoOtherThing();
}
// ...other similar calls to myAct below here
});
}
}
And over in the MyActionClass I have something like the following:
public class MyActionClass
{
private Thing _thing;
public MyActionClass(Thing item)
{
_thing = item;
}
public string DoOneThing()
{
return _thing.GetSubThings().FirstOrDefault();
}
public void DoOtherThing()
{
_thing.property1 = "Somenewvalue";
}
}
If I can explain this any better I'll try, but I think that's the basics of my needs
EDIT:
Something else I just noticed. If I change the value of a property of the item I'm working with while inside the Parallel.ForEach (in this case, a string value that gets written to a database inside the loop), will that have any effect on the rest of the loop iterations or just the one I'm on? Would it be better to create a new instance of Thing inside the loop to store the item I'm working with in this case?
There is no shared mutable state between the actions in the Parallel.ForEach that I can see, so it should be thread-safe: each Thing is touched by at most one thread at a time.
But, as mentioned, that only means nothing shared is visible here. It doesn't mean that everything in your actual code is as clean as it seems in this excerpt.
Nor does it mean that you or a coworker will never change something so that some state becomes both shared and mutable (in Thing, for example). At that point you start getting hard-to-reproduce crashes at best, or plain wrong behaviour at worst, which can go undetected for a long time.
So, perhaps you should try to go fully immutable near threading code?
Perhaps.
Immutability is good, but it is not a silver bullet. It is not always easy to use or implement, and not every task can be reasonably expressed through immutable objects. Even then, an accidental "make it shared and mutable" change can still happen, though it is much less likely.
It should at least be considered as a possible option/alternative.
About the EDIT
If I change the value of a property of the item I'm working with while
inside the Parallel.ForEach (in this case, a string value that gets
written to a database inside the loop), will that have any effect on
the rest of the loop iterations or just the one I'm on?
If you change a property and that object is not used anywhere else, and it doesn't rely on some global mutable state (for example, a public static Int32 ChangesCount that increments with each state change), then you should be safe.
"a string value that gets written to a database inside the loop": depending on the data access technology and how you use it, you may be in trouble, because most of them are not designed for multithreaded use (EF DbContext, for example). And do not forget that dealing with concurrent access in the database itself is not always easy, though that is a bit away from our original theme.
"Would it be better to create a new instance of Thing inside the loop to store the item I'm working with in this case": if there is no risk of external concurrent changes, then it is just unnecessary work. And if there is a chance of other threads (outside the Parallel.ForEach) making changes to the objects being persisted, then you already have bigger problems than Parallel.ForEach.
Objects should always have an observably consistent state (unlike when half of the properties are set by one thread and half by another while you try to persist that who-knows-what). If they are used by many threads, they should already be thread-safe: there should be no way to put them into an inconsistent state.
And if they are to be persisted by external code, such objects should probably provide:
Either a SyncRoot property to synchronize the property-reading code.
Or some current-state snapshot DTO that is created internally by a thread-safe method like ThingSnapshot Thing.GetCurrentData() { lock() {} }.
Or something more exotic.
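The snapshot option might be sketched as follows. ThingSnapshot, the Version counter and the Update method are hypothetical names introduced for illustration; only the lock-around-copy pattern is the point.

```csharp
using System;

// Immutable copy of a Thing's state at one moment in time (hypothetical DTO).
public sealed class ThingSnapshot
{
    public ThingSnapshot(string property1, int version)
    {
        Property1 = property1;
        Version = version;
    }

    public string Property1 { get; private set; }
    public int Version { get; private set; }
}

public class Thing
{
    private readonly object _sync = new object();
    private string _property1 = "";
    private int _version;

    // Writers change all related fields under the same lock.
    public void Update(string property1)
    {
        lock (_sync)
        {
            _property1 = property1;
            _version++;
        }
    }

    // Readers copy the state under the lock, so they never see a half-applied update.
    public ThingSnapshot GetCurrentData()
    {
        lock (_sync)
        {
            return new ThingSnapshot(_property1, _version);
        }
    }
}
```

Persistence code would then write the snapshot to the database rather than reading the live object property by property.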

Managing Pool of Generated Objects

I am working on a project where individual regions of a map are either generated dynamically, or loaded from a file if it has already been generated and saved. Regions are only loaded/generated as needed, and saved and discarded when they aren't anymore.
There are several different tasks that will be using one or more regions of this map for various purposes. For instance, one of these tasks will be to draw all currently visible regions (about 9 at any given time). Another is to get information about, or even modify regions.
The problem is that these tasks may or may not be working with the same regions as other tasks.
Since these regions are rather large, and are costly to generate, it would be problematic (for these and other reasons) to use different copies for each task.
Rather, I think it would be a good idea to create and manage a pool of currently loaded regions. New tasks will first check the pool for their required region. They can then use it if it exists, or else create a new one and add it to the pool.
Provided that works, how would I manage this pool? How would I determine if a region is no longer needed by any tasks and can be safely discarded? Am I being silly and overcomplicating this?
I am using c# if that matters to anyone.
Edit:
Now that I'm more awake, would it be as simple as incrementing a counter in each region for each place it's used, then discarding it when the counter reaches 0?
Provided that works, how would I manage this pool? How would I determine if a region is no longer needed by any tasks and can be safely discarded?
A simple way of doing this can be to use weak references:
public class RegionStore
{
// I'm using int as the identifier for a region.
// Obviously this must be some type that can serve as
// an ID according to your application's logic.
private Dictionary<int, WeakReference<Region>> _store = new Dictionary<int, WeakReference<Region>>();
private const int TrimThreshold = 1000; // Profile to find good value here.
private int _addCount = 0;
public bool TryGetRegion(int id, out Region region)
{
WeakReference<Region> wr;
if(!_store.TryGetValue(id, out wr))
return false;
if(wr.TryGetTarget(out region))
return true;
// Clean up space in dictionary.
_store.Remove(id);
return false;
}
public void AddRegion(int id, Region region)
{
if(++_addCount >= TrimThreshold)
Trim();
_store[id] = new WeakReference<Region>(region);
}
public void Remove(int id)
{
_store.Remove(id);
}
private void Trim()
{
// Remove dead keys.
// Profile to test if this is really necessary.
// If you were fully implementing this, rather than delegating to Dictionary,
// you'd likely see if this helped prior to an internal resize.
_addCount = 0;
var keys = _store.Keys.ToList();
Region region;
foreach(int key in keys)
if(!_store[key].TryGetTarget(out region))
_store.Remove(key);
}
}
Now you've a store of your Region objects, but that store doesn't prevent them being garbage collected if no other references to them exist.
Certain tasks will be modifying regions. In this case I will likely raise an "update" flag in the region object, and from there update all other tasks using it.
Do note that this will be a definite potential source of bugs in the application as a whole. Mutability complicates any sort of caching. If you can move to an immutable model, it will likely simplify things, but then the use of outdated objects brings its own complications.
OK, I don't know how your app is designed, but I suggest you have a look at this.
You can also use a static variable to share your data with other tasks, but then you may want to use lock variables to prevent writing to or reading from that variable while other tasks are using it. (here)

How to perform this particular type of locking?

I have the following code:
var sequence = from row in CicApplication.DistributorBackpressure44Cache
where row.Coater == this.Coater && row.IsDistributorInUse
select new GenericValue
{
ReadTime = row.CoaterTime.Value,
Value = row.BackpressureLeft
};
this.EvaluateBackpressure(sequence, "BackpressureLeftTarget");
And DistributorBackpressure44Cache is defined as follows:
internal static List<DistributorBackpressure44> DistributorBackpressure44Cache
{
get
{
return _distributorBackpressure44;
}
}
This is part of a heavily threaded application where DistributorBackpressure44Cache could be being refreshed in one thread, and queried from, as shown above, in another thread. The variable 'sequence' above is an IEnumerable, which is passed to the method shown, and then potentially passed to the other methods, before actually being executed. My concern is this. What will happen with the above query if the DistributorBackpressure44Cache is being refreshed (cleared and repopulated) when the query is actually executed?
It wouldn't do any good to put a lock around this code because this query actually gets executed at some point later (unless I were to convert it to a list immediately).
If your design can tolerate it, you could ensure snapshot level isolation with this code and avoid locking altogether. However, you would need to do the following:
Make DistributorBackpressure44Cache return a ReadOnlyCollection<T> instead, this way it is explicit you shouldn't mutate this data.
Ensure that any mutations to _distributorBackpressure44 occur on a copy and result in an atomic assignment back to _distributorBackpressure44 when complete.
var cache = _distributorBackpressure44.ToList();
this.RefreshCache(cache); // this assumes you *need* to know
// about the structure of the old list
// a design where this is not required
// is preferred
_distributorBackpressure44 = cache; // some readers will have "old"
// views of the cache, but all readers
// from some time T (where T < Twrite)
// will use the same "snapshot"
You can convert it to a list immediately (this might be best),
or
You can put a lock in the getter for DistributorBackpressure44Cache that synchronises with the cache refresh lock. You might want to provide both a locked and an unlocked accessor; use the unlocked accessor when the result is going to be used immediately, and the locked one when the result is going to be used in a deferred execution situation.
Note that even that won't work if the cache refresh mutates the list _distributorBackpressure44; it works only if the refresh replaces the referenced list.
Without knowing more about your architecture options, you could do something like this.
List<GenericValue> sequence;
lock(CicApplication.DistributorBackpressure44Cache)
{
    // Materialise the query inside the lock; a deferred query would
    // otherwise run later, after the lock has been released.
    sequence = (from row in CicApplication.DistributorBackpressure44Cache
                where row.Coater == this.Coater && row.IsDistributorInUse
                select new GenericValue
                {
                    ReadTime = row.CoaterTime.Value,
                    Value = row.BackpressureLeft
                }).ToList();
}
this.EvaluateBackpressure(sequence, "BackpressureLeftTarget");
Then in the code where you do the clear/update you would have something like this.
lock(CicApplication.DistributorBackpressure44Cache)
{
var objCache = CicApplication.DistributorBackpressure44Cache;
objCache.Clear();
// code to add back items here
// [...]
}
It would be cleaner to have a central class (Singleton pattern maybe?) that controls everything surrounding the cache. But I don't know how feasible this is (i.e. putting the query code into another class and passing the parameters in). In lieu of something fancier, the above solution should work as long as you consistently remember to lock() each and every time you ever read/write to this object.

Partially thread-safe dictionary

I have a class that maintains a private Dictionary instance that caches some data.
The class writes to the dictionary from multiple threads using a ReaderWriterLockSlim.
I want to expose the dictionary's values outside the class.
What is a thread-safe way of doing that?
Right now, I have the following:
public ReadOnlyCollection<MyClass> Values() {
using (sync.ReadLock())
return new ReadOnlyCollection<MyClass>(cache.Values.ToArray());
}
Is there a way to do this without copying the collection many times?
I'm using .Net 3.5 (not 4.0)
I want to expose the dictionary's values outside the class.
What is a thread-safe way of doing that?
You have three choices.
1) Make a copy of the data, hand out the copy. Pros: no worries about thread safe access to the data. Cons: Client gets a copy of out-of-date data, not fresh up-to-date data. Also, copying is expensive.
2) Hand out an object that locks the underlying collection when it is read from. You'll have to write your own read-only collection that has a reference to the lock of the "parent" collection. Design both objects carefully so that deadlocks are impossible. Pros: "just works" from the client's perspective; they get up-to-date data without having to worry about locking. Cons: More work for you.
3) Punt the problem to the client. Expose the lock, and make it a requirement that clients lock all views on the data themselves before using it. Pros: No work for you. Cons: Way more work for the client, work they might not be willing or able to do. Risk of deadlocks, etc, now become the client's problem, not your problem.
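Option 2 could be sketched roughly as below. LockedValues and its ForEach method are hypothetical names, and this is only one possible shape for such a view: it shares the owner's Dictionary and ReaderWriterLockSlim and takes the read lock for the duration of each operation. It uses only types available in .NET 3.5.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical read-only view that shares the owner's dictionary and lock.
public sealed class LockedValues<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _cache;
    private readonly ReaderWriterLockSlim _sync;

    public LockedValues(Dictionary<TKey, TValue> cache, ReaderWriterLockSlim sync)
    {
        _cache = cache;
        _sync = sync;
    }

    public int Count
    {
        get
        {
            _sync.EnterReadLock();
            try { return _cache.Count; }
            finally { _sync.ExitReadLock(); }
        }
    }

    // Runs the caller's action on every value while the read lock is held.
    // The lock is always released, even if the action throws.
    public void ForEach(Action<TValue> action)
    {
        _sync.EnterReadLock();
        try
        {
            foreach (TValue value in _cache.Values)
                action(value);
        }
        finally
        {
            _sync.ExitReadLock();
        }
    }
}
```

Passing an action in, rather than handing out an enumerator, keeps the lock's lifetime inside the view. The action must not try to take the write lock on the same ReaderWriterLockSlim, and it should be short, since writers wait while it runs.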
If you want a snapshot of the current state of the dictionary, there's really nothing else you can do with this collection type. This is the same technique used by the ConcurrentDictionary<TKey, TValue>.Values property.
If you don't mind throwing an InvalidOperationException if the collection is modified while you are enumerating it, you could just return cache.Values since it's readonly (and thus can't corrupt the dictionary data).
EDIT: I personally believe the below code is technically answering your question correctly (as in, it provides a way to enumerate over the values in a collection without creating a copy). Some developers far more reputable than I strongly advise against this approach, for reasons they have explained in their edits/comments. In short: This is apparently a bad idea. Therefore I'm leaving the answer but suggesting you not use it.
Unless I'm missing something, I believe you could expose your values as an IEnumerable<MyClass> without needing to copy values by using the yield keyword:
public IEnumerable<MyClass> Values {
get {
using (sync.ReadLock()) {
foreach (MyClass value in cache.Values)
yield return value;
}
}
}
Be aware, however (and I'm guessing you already knew this), that this approach provides lazy evaluation, which means that the Values property as implemented above can not be treated as providing a snapshot.
In other words... well, take a look at this code (I am of course guessing as to some of the details of this class of yours):
var d = new ThreadSafeDictionary<string, string>();
// d is empty right now
IEnumerable<string> values = d.Values;
d.Add("someKey", "someValue");
// if values were a snapshot, this would output nothing...
// but in FACT, since it is lazily evaluated, it will now have
// what is CURRENTLY in d.Values ("someValue")
foreach (string s in values) {
Console.WriteLine(s);
}
So if it's a requirement that this Values property be equivalent to a snapshot of what is in cache at the time the property is accessed, then you're going to have to make a copy.
(begin 280Z28): The following is an example of how someone unfamiliar with the "C# way of doing things" could lock up the code. The enumerator is never disposed, so the read lock taken inside the iterator is never released:
IEnumerator<MyClass> enumerator = obj.Values.GetEnumerator();
MyClass first = null;
if (enumerator.MoveNext())
first = enumerator.Current;
(end 280Z28)
Consider another possibility: expose only the ICollection interface, so that Values() can return your own implementation. That implementation would hold only a reference to Dictionary.Values and always take the ReadLock when accessing items.
