Caching Pattern for Fetching Collections - c#

Is there a caching strategy/pattern that can be used for fetching multiple items, where only some may be cached? Does it make sense to use a cache in such scenarios?
More Detail
Where the cache would return a single (or known number of) result(s) caching's simple; if it's in the cache we return it, if not we fetch it, add it to the cache, then return it:
//using System.Runtime.Caching;
ObjectCache cache = MemoryCache.Default;
TimeSpan cacheLifetime = new TimeSpan(0, 20, 0);
SomeObjectFinder source = new SomeObjectFinder();
public SomeObject GetById(long id)
{
return GetFromCacheById(id) ?? GetFromSourceById(id);
}
protected SomeObject GetFromCacheById(long id)
{
return (SomeObject)cache.Get(id.ToString(),null);
}
protected SomeObject GetFromSourceById (long id)
{
SomeObject result = source.GetById(id);
return result == null ? null : (SomeObject)cache.AddOrGetExisting(id.ToString(), result, DateTimeOffset.UtcNow.Add(cacheLifetime), null);
}
However, where we don't know how many results to expect, if we fetched the results from the cache we wouldn't know that we'd fetched everything; only what's been cached. So following the above pattern, if none of or all of the results had been cached we'd be fine; but if half of them had been, we'd get a partial result set.
I was thinking the below may make sense; but I've not played with async/await before, and most of what I've read implies that calling async code from synchronous code is generally considered bad; so this solution's probably not a good idea.
public IEnumerable<SomeObject> GetByPartialName(string startOfName)
{
//kick off call to the DB (or whatever) to get the full result set
var getResultsTask = Task.Run<IList<SomeObject>>(async() => await GetFromSourceByPartialNameAsync(startOfName));
//whilst we leave that to run in the background, start getting and returning the results from the cache
var cacheResults = GetFromCacheByPartialName(startOfName);
foreach (var result in cacheResults)
{
yield return result;
}
//once all cached values are returned, wait for the async task to complete, remove the results we'd already returned from cache, then add the remaining results to cache and return those also
var results = getResultsTask.GetAwaiter().GetResult();
foreach (var result in results.Except(cacheResults))
{
yield return CacheAddOrGetExistingByName(result.Name, result);
}
}
protected async Task<IList<SomeObject>> GetFromSourceByPartialNameAsync(string startOfName)
{
return source.GetByPartialName(startOfName);
}
My assumption is that the answer's going to be "in this scenario either cache everything beforehand, or don't use cache"... but hopefully there's some better option.

You can use the Decorator and strategy pattern, here is the link to Steve Smith post regarding building a cached repository, and for individual items work as well, so you can use a decorator using redis and if not found it goes to the database and then save it to database, after that querying the data will be really fast.
Here is some sample code:
public class CachedAlbumRepository : IAlbumRepository
{
private readonly IAlbumRepository _albumRepository;
public CachedAlbumRepository(IAlbumRepository albumRepository)
{
_albumRepository = albumRepository;
}
private static readonly object CacheLockObject = new object();
public IEnumerable<Album> GetTopSellingAlbums(int count)
{
Debug.Print("CachedAlbumRepository:GetTopSellingAlbums");
string cacheKey = "TopSellingAlbums-" + count;
var result = HttpRuntime.Cache[cacheKey] as List<Album>;
if (result == null)
{
lock (CacheLockObject)
{
result = HttpRuntime.Cache[cacheKey] as List<Album>;
if (result == null)
{
result = _albumRepository.GetTopSellingAlbums(count).ToList();
HttpRuntime.Cache.Insert(cacheKey, result, null,
DateTime.Now.AddSeconds(60), TimeSpan.Zero);
}
}
}
return result;
}
}

Related

Synchronize threads on per-item base

While this question is about the MemoryCache class, I can imagine the same need with a Dictionary or ConcurrentDictionary.GetOrAdd where the valueFactory-lambda is also a lengthy operation.
In essence I want to synchronize/lock threads on a per-item base. I know MemoryCache is thread safe, but still, checking if an item exists and add the item when it doesn't exist, still needs to be synchronized.
Consider this sample code:
public class MyCache
{
private static readonly MemoryCache cache = new MemoryCache(Guid.NewGuid().ToString());
public object Get(string id)
{
var cacheItem = cache.GetCachedItem(id);
if (cacheItem != null) return cacheItem.Value;
var item = this.CreateItem(id);
cache.Add(id, item, new CacheItemPolicy
{
SlidingExpiration = TimeSpan.FromMinutes(20)
});
return item;
}
private object CreateItem(string id)
{
// Lengthy operation, f.e. querying database or even external API
return whateverCreatedObject;
}
}
As you can see, we need to synchronize cache.GetCachedItem and cache.Add. But since CreateItem is a lengthy operation (hence the MemoryCache), I don't want to lock all threads as this code would do:
public object Get(string id)
{
lock (cache)
{
var item = cache.GetCachedItem(id);
if (item != null) return item.Value;
cache.Add(id, this.CreateItem(id), new CacheItemPolicy
{
SlidingExpiration = TimeSpan.FromMinutes(20)
});
}
}
Also, having no lock is not an options, as then we could have multiple threads calling CreateItem for the same id.
What I could do is create a unique named Semaphore per id, so locking happens on per-item basis. But this will be a system-resource killer, as we do not want to register +100K named semaphores on our system.
I'm sure I'm not the first that needs this kind of synchronization, but I didn't find any question/answer that fits this scenario.
My question is if someone can come up with a different, resource friendly approach for this problem?
Update
I've found this NamedReaderWriterLocker class that looks promising at first but is dangerous to use as two threads can potentially get a different ReaderWriterLockSlim instance for the same name when both threads get into the ConcurrentDictionary's valueFactory at the same time. Maybe I can use this implementation with some additional lock inside the GetLock method.
Since your key is a string, you could lock on string.Intern(id).
MSDN documentation: System.String.Intern
i.e.
lock (string.Intern(id))
{
var item = cache.GetCachedItem(id);
if (item != null)
{
return item.Value;
}
cache.Add(id, this.CreateItem(id), new CacheItemPolicy
{
SlidingExpiration = TimeSpan.FromMinutes(20)
});
return /* some value, this line was absent in the original code. */;
}

Which one of these async calls is not like the other?

I suspect I have a deadlock issue, but it's an odd one that I can't rationalize. I have an API that needs to verify a few things in order to process the call. As part of the business logic, I might have to make more of those same calls as well. In this case, if a particular piece of data associated with an entity is not found, we attempt to use a backup (if one is configured), which requires checking other entities. Eventually, the code will hang.
Let's just dive into the code (comments highlight the calls in question).
API Controller:
public async Task<HttpResponseMessage> Get(int entityID, string content, bool? useBackUp = true)
{
//Some look-ups here, no issues at all
//This works, but it's this method that has an issue later in the process.
SystemEntity entityObj =
await BusinessLayer.GetSystemEntityAsync(SystemEntityID);
if (entityObj == null)
{
return new HttpResponseMessage
{
StatusCode = System.Net.HttpStatusCode.BadRequest,
Content = new StringContent("Entity is unavailable.")
};
}
string text = BusinessLayer.GetContentTextAsync(entityID
new List<string> {contentName}, useBackUp).Result.FirstOrDefault().Value;
if (text == null)
{
return new HttpResponseMessage {StatusCode = System.Net.HttpStatusCode.NoContent};
}
return new HttpResponseMessage
{
StatusCode = System.Net.HttpStatusCode.OK,
Content = new StringContent(text)
};
}
Business Layer:
public async Task<Dictionary<string, string>> GetContentTextAsync(int systemEntityID, List<string> contentNames, bool useBackUp)
{
Dictionary<string, string> records = new Dictionary<string, string>();
//We iterate for caching purposes
foreach (string name in contentNames)
{
string nameCopy = name;
string record = Cache.GetData(
string.Format("{0}_{1}_{2}", CONTENT, systemEntityID, name), () =>
DataLayer.GetCotnent(systemEntityID, nameCopy));
if (record == null && useBackUp)
{
List<int> entityIDs = new List<int> {systemEntityID};
int currentEntityID = systemEntityID;
//Here's that method again. This call seems to work.
SystemEntity currentEntity = await GetSystemEntityAsync(systemEntityID);
if (currentEntity != null && currentEntity.BackUpID.HasValue)
{
currentEntityID = (int) currentEntity.BackUpID;
}
while (!entityIDs.Contains(currentEntityID))
{
int id = currentEntityID;
record = Cache.GetData(
string.Format("{0}_{1}_{2}", CONTENT, systemEntityID, name), () =>
DataLayer.GetCotnent(id, nameCopy));
if (record != null) break;
entityIDs.Add(currentEntityID);
//This call seems to cause the deadlock
currentEntity = await GetSystemEntityAsync(currentEntityID);
if (currentEntity != null && currentEntity.BackUpID.HasValue)
{
currentEntityID = (int) currentEntity.UseBackupID;
}
}
}
if (record != null)
{
records.Add(name, record);
}
}
return records;
}
public async Task<SystemEntity> GetSystemEntityAsync(int systemEntityID)
{
SystemEntity systemEntity = await DataLayer.GetSystemEntity(
scc => scc.SystemEntityID == systemEntityID);
return systemEntity;
}
Data Layer:
public async Task<SystemEntity> GetSystemEntity(Expression<Func<SystemEntity, bool>> whereExpression)
{
using (EntityContext dbContext = createDbInstance())
{
//This is the last line that the debugger in VS 2013 brings me to. Stepping into this returns to whatever called the API method, waiting endlessly.
return await
dbContext.SystemEntity.Include(sc => sc.OtherEntity).Where(whereExpression).FirstOrDefaultAsync();
}
}
To recap: I call GetSystemEntityAsync three times. The first two times, it completes successfully. The third time, it hangs. If I comment out the first two calls so they don't run at all, the third one still hangs. If I remove the await and use just a normal FirstOrDefault in the return statement of the data layer method, then everything completes just fine.
Note: I have to keep the GetSystemEntityAsync method asynchronous. I cannot alter it to be synchronous.
What are the possible sources of the deadlock I'm encountering? I'm out of ideas on how to solve it.
Which one of these async calls is not like the other?
This one, I suspect:
string text = BusinessLayer.GetContentTextAsync(entityID
new List<string> {contentName}, useBackUp).Result.FirstOrDefault().Value;
Try changing it to this:
string text = (await BusinessLayer.GetContentTextAsync(entityID
new List<string> {contentName}, useBackUp)).FirstOrDefault().Value;
The possible source of the deadlock is described by Stephen Cleary in his "Don't Block on Async Code" blog post.

Locking pattern for proper use of .NET MemoryCache

I assume this code has concurrency issues:
const string CacheKey = "CacheKey";
static string GetCachedData()
{
string expensiveString =null;
if (MemoryCache.Default.Contains(CacheKey))
{
expensiveString = MemoryCache.Default[CacheKey] as string;
}
else
{
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
expensiveString = SomeHeavyAndExpensiveCalculation();
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
}
return expensiveString;
}
The reason for the concurrency issue is that multiple threads can get a null key and then attempt to insert data into cache.
What would be the shortest and cleanest way to make this code concurrency proof? I like to follow a good pattern across my cache related code. A link to an online article would be a great help.
UPDATE:
I came up with this code based on #Scott Chamberlain's answer. Can anyone find any performance or concurrency issue with this?
If this works, it would save many line of code and errors.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Runtime.Caching;
namespace CachePoc
{
class Program
{
static object everoneUseThisLockObject4CacheXYZ = new object();
const string CacheXYZ = "CacheXYZ";
static object everoneUseThisLockObject4CacheABC = new object();
const string CacheABC = "CacheABC";
static void Main(string[] args)
{
string xyzData = MemoryCacheHelper.GetCachedData<string>(CacheXYZ, everoneUseThisLockObject4CacheXYZ, 20, SomeHeavyAndExpensiveXYZCalculation);
string abcData = MemoryCacheHelper.GetCachedData<string>(CacheABC, everoneUseThisLockObject4CacheXYZ, 20, SomeHeavyAndExpensiveXYZCalculation);
}
private static string SomeHeavyAndExpensiveXYZCalculation() {return "Expensive";}
private static string SomeHeavyAndExpensiveABCCalculation() {return "Expensive";}
public static class MemoryCacheHelper
{
public static T GetCachedData<T>(string cacheKey, object cacheLock, int cacheTimePolicyMinutes, Func<T> GetData)
where T : class
{
//Returns null if the string does not exist, prevents a race condition where the cache invalidates between the contains check and the retreival.
T cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
{
return cachedData;
}
lock (cacheLock)
{
//Check to see if anyone wrote to the cache while we where waiting our turn to write the new value.
cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
{
return cachedData;
}
//The value still did not exist so we now write it in to the cache.
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(cacheTimePolicyMinutes))
};
cachedData = GetData();
MemoryCache.Default.Set(cacheKey, cachedData, cip);
return cachedData;
}
}
}
}
}
This is my 2nd iteration of the code. Because MemoryCache is thread safe you don't need to lock on the initial read, you can just read and if the cache returns null then do the lock check to see if you need to create the string. It greatly simplifies the code.
const string CacheKey = "CacheKey";
static readonly object cacheLock = new object();
private static string GetCachedData()
{
//Returns null if the string does not exist, prevents a race condition where the cache invalidates between the contains check and the retreival.
var cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
lock (cacheLock)
{
//Check to see if anyone wrote to the cache while we where waiting our turn to write the new value.
cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
//The value still did not exist so we now write it in to the cache.
var expensiveString = SomeHeavyAndExpensiveCalculation();
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
return expensiveString;
}
}
EDIT: The below code is unnecessary but I wanted to leave it to show the original method. It may be useful to future visitors who are using a different collection that has thread safe reads but non-thread safe writes (almost all of classes under the System.Collections namespace is like that).
Here is how I would do it using ReaderWriterLockSlim to protect access. You need to do a kind of "Double Checked Locking" to see if anyone else created the cached item while we where waiting to to take the lock.
const string CacheKey = "CacheKey";
static readonly ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();
static string GetCachedData()
{
//First we do a read lock to see if it already exists, this allows multiple readers at the same time.
cacheLock.EnterReadLock();
try
{
//Returns null if the string does not exist, prevents a race condition where the cache invalidates between the contains check and the retreival.
var cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
}
finally
{
cacheLock.ExitReadLock();
}
//Only one UpgradeableReadLock can exist at one time, but it can co-exist with many ReadLocks
cacheLock.EnterUpgradeableReadLock();
try
{
//We need to check again to see if the string was created while we where waiting to enter the EnterUpgradeableReadLock
var cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
//The entry still does not exist so we need to create it and enter the write lock
var expensiveString = SomeHeavyAndExpensiveCalculation();
cacheLock.EnterWriteLock(); //This will block till all the Readers flush.
try
{
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
return expensiveString;
}
finally
{
cacheLock.ExitWriteLock();
}
}
finally
{
cacheLock.ExitUpgradeableReadLock();
}
}
There is an open source library [disclaimer: that I wrote]: LazyCache that IMO covers your requirement with two lines of code:
IAppCache cache = new CachingService();
var cachedResults = cache.GetOrAdd("CacheKey",
() => SomeHeavyAndExpensiveCalculation());
It has built in locking by default so the cacheable method will only execute once per cache miss, and it uses a lambda so you can do "get or add" in one go. It defaults to 20 minutes sliding expiration.
There's even a NuGet package ;)
I've solved this issue by making use of the AddOrGetExisting method on the MemoryCache and the use of Lazy initialization.
Essentially, my code looks something like this:
static string GetCachedData(string key, DateTimeOffset offset)
{
Lazy<String> lazyObject = new Lazy<String>(() => SomeHeavyAndExpensiveCalculationThatReturnsAString());
var returnedLazyObject = MemoryCache.Default.AddOrGetExisting(key, lazyObject, offset);
if (returnedLazyObject == null)
return lazyObject.Value;
return ((Lazy<String>) returnedLazyObject).Value;
}
Worst case scenario here is that you create the same Lazy object twice. But that is pretty trivial. The use of AddOrGetExisting guarantees that you'll only ever get one instance of the Lazy object, and so you're also guaranteed to only call the expensive initialization method once.
I assume this code has concurrency issues:
Actually, it's quite possibly fine, though with a possible improvement.
Now, in general the pattern where we have multiple threads setting a shared value on first use, to not lock on the value being obtained and set can be:
Disastrous - other code will assume only one instance exists.
Disastrous - the code that obtains the instance is not can only tolerate one (or perhaps a certain small number) concurrent operations.
Disastrous - the means of storage is not thread-safe (e.g. have two threads adding to a dictionary and you can get all sorts of nasty errors).
Sub-optimal - the overall performance is worse than if locking had ensured only one thread did the work of obtaining the value.
Optimal - the cost of having multiple threads do redundant work is less than the cost of preventing it, especially since that can only happen during a relatively brief period.
However, considering here that MemoryCache may evict entries then:
If it's disastrous to have more than one instance then MemoryCache is the wrong approach.
If you must prevent simultaneous creation, you should do so at the point of creation.
MemoryCache is thread-safe in terms of access to that object, so that is not a concern here.
Both of these possibilities have to be thought about of course, though the only time having two instances of the same string existing can be a problem is if you're doing very particular optimisations that don't apply here*.
So, we're left with the possibilities:
It is cheaper to avoid the cost of duplicate calls to SomeHeavyAndExpensiveCalculation().
It is cheaper not to avoid the cost of duplicate calls to SomeHeavyAndExpensiveCalculation().
And working that out can be difficult (indeed, the sort of thing where it's worth profiling rather than assuming you can work it out). It's worth considering here though that most obvious ways of locking on insert will prevent all additions to the cache, including those that are unrelated.
This means that if we had 50 threads trying to set 50 different values, then we'll have to make all 50 threads wait on each other, even though they weren't even going to do the same calculation.
As such, you're probably better off with the code you have, than with code that avoids the race-condition, and if the race-condition is a problem, you quite likely either need to handle that somewhere else, or need a different caching strategy than one that expels old entries†.
The one thing I would change is I'd replace the call to Set() with one to AddOrGetExisting(). From the above it should be clear that it probably isn't necessary, but it would allow the newly obtained item to be collected, reducing overall memory use and allowing a higher ratio of low generation to high generation collections.
So yeah, you could use double-locking to prevent concurrency, but either the concurrency isn't actually a problem, or your storing the values in the wrong way, or double-locking on the store would not be the best way to solve it.
*If you know only one each of a set of strings exists, you can optimise equality comparisons, which is about the only time having two copies of a string can be incorrect rather than just sub-optimal, but you'd want to be doing very different types of caching for that to make sense. E.g. the sort XmlReader does internally.
†Quite likely either one that stores indefinitely, or one that makes use of weak references so it will only expel entries if there are no existing uses.
Somewhat dated question, but maybe still useful: you may take a look at FusionCache ⚡🦥, which I recently released.
The feature you are looking for is described here, and you can use it like this:
const string CacheKey = "CacheKey";
static string GetCachedData()
{
return fusionCache.GetOrSet(
CacheKey,
_ => SomeHeavyAndExpensiveCalculation(),
TimeSpan.FromMinutes(20)
);
}
You may also find some of the other features interesting like fail-safe, advanced timeouts with background factory completion and support for an optional, distributed 2nd level cache.
If you will give it a chance please let me know what you think.
/shameless-plug
It is difficult to choose which one is better; lock or ReaderWriterLockSlim. You need real world statistics of read and write numbers and ratios etc.
But if you believe using "lock" is the correct way. Then here is a different solution for different needs. I also include the Allan Xu's solution in the code. Because both can be needed for different needs.
Here are the requirements, driving me to this solution:
You don't want to or cannot supply the 'GetData' function for some reason. Perhaps the 'GetData' function is located in some other class with a heavy constructor and you do not want to even create an instance till ensuring it is unescapable.
You need to access the same cached data from different locations/tiers of the application. And those different locations don't have access to same locker object.
You don't have a constant cache key. For example; need of caching some data with the sessionId cache key.
Code:
using System;
using System.Runtime.Caching;
using System.Collections.Concurrent;
using System.Collections.Generic;
namespace CachePoc
{
class Program
{
static object everoneUseThisLockObject4CacheXYZ = new object();
const string CacheXYZ = "CacheXYZ";
static object everoneUseThisLockObject4CacheABC = new object();
const string CacheABC = "CacheABC";
static void Main(string[] args)
{
//Allan Xu's usage
string xyzData = MemoryCacheHelper.GetCachedDataOrAdd<string>(CacheXYZ, everoneUseThisLockObject4CacheXYZ, 20, SomeHeavyAndExpensiveXYZCalculation);
string abcData = MemoryCacheHelper.GetCachedDataOrAdd<string>(CacheABC, everoneUseThisLockObject4CacheXYZ, 20, SomeHeavyAndExpensiveXYZCalculation);
//My usage
string sessionId = System.Web.HttpContext.Current.Session["CurrentUser.SessionId"].ToString();
string yvz = MemoryCacheHelper.GetCachedData<string>(sessionId);
if (string.IsNullOrWhiteSpace(yvz))
{
object locker = MemoryCacheHelper.GetLocker(sessionId);
lock (locker)
{
yvz = MemoryCacheHelper.GetCachedData<string>(sessionId);
if (string.IsNullOrWhiteSpace(yvz))
{
DatabaseRepositoryWithHeavyConstructorOverHead dbRepo = new DatabaseRepositoryWithHeavyConstructorOverHead();
yvz = dbRepo.GetDataExpensiveDataForSession(sessionId);
MemoryCacheHelper.AddDataToCache(sessionId, yvz, 5);
}
}
}
}
private static string SomeHeavyAndExpensiveXYZCalculation() { return "Expensive"; }
private static string SomeHeavyAndExpensiveABCCalculation() { return "Expensive"; }
public static class MemoryCacheHelper
{
//Allan Xu's solution
public static T GetCachedDataOrAdd<T>(string cacheKey, object cacheLock, int minutesToExpire, Func<T> GetData) where T : class
{
//Returns null if the string does not exist, prevents a race condition where the cache invalidates between the contains check and the retreival.
T cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
return cachedData;
lock (cacheLock)
{
//Check to see if anyone wrote to the cache while we where waiting our turn to write the new value.
cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
return cachedData;
cachedData = GetData();
MemoryCache.Default.Set(cacheKey, cachedData, DateTime.Now.AddMinutes(minutesToExpire));
return cachedData;
}
}
#region "My Solution"
readonly static ConcurrentDictionary<string, object> Lockers = new ConcurrentDictionary<string, object>();
public static object GetLocker(string cacheKey)
{
CleanupLockers();
return Lockers.GetOrAdd(cacheKey, item => (cacheKey, new object()));
}
public static T GetCachedData<T>(string cacheKey) where T : class
{
CleanupLockers();
T cachedData = MemoryCache.Default.Get(cacheKey) as T;
return cachedData;
}
public static void AddDataToCache(string cacheKey, object value, int cacheTimePolicyMinutes)
{
CleanupLockers();
MemoryCache.Default.Add(cacheKey, value, DateTimeOffset.Now.AddMinutes(cacheTimePolicyMinutes));
}
static DateTimeOffset lastCleanUpTime = DateTimeOffset.MinValue;
static void CleanupLockers()
{
if (DateTimeOffset.Now.Subtract(lastCleanUpTime).TotalMinutes > 1)
{
lock (Lockers)//maybe a better locker is needed?
{
try//bypass exceptions
{
List<string> lockersToRemove = new List<string>();
foreach (var locker in Lockers)
{
if (!MemoryCache.Default.Contains(locker.Key))
lockersToRemove.Add(locker.Key);
}
object dummy;
foreach (string lockerKey in lockersToRemove)
Lockers.TryRemove(lockerKey, out dummy);
lastCleanUpTime = DateTimeOffset.Now;
}
catch (Exception)
{ }
}
}
}
#endregion
}
}
class DatabaseRepositoryWithHeavyConstructorOverHead
{
internal string GetDataExpensiveDataForSession(string sessionId)
{
return "Expensive data from database";
}
}
}
To avoid the global lock, you can use SingletonCache to implement one lock per key, without exploding memory usage (the lock objects are removed when no longer referenced, and acquire/release is thread safe guaranteeing that only 1 instance is ever in use via compare and swap).
Using it looks like this:
SingletonCache<string, object> keyLocks = new SingletonCache<string, object>();
const string CacheKey = "CacheKey";
static string GetCachedData()
{
string expensiveString =null;
if (MemoryCache.Default.Contains(CacheKey))
{
return MemoryCache.Default[CacheKey] as string;
}
// double checked lock
using (var lifetime = keyLocks.Acquire(url))
{
lock (lifetime.Value)
{
if (MemoryCache.Default.Contains(CacheKey))
{
return MemoryCache.Default[CacheKey] as string;
}
cacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
expensiveString = SomeHeavyAndExpensiveCalculation();
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
return expensiveString;
}
}
}
Code is here on GitHub: https://github.com/bitfaster/BitFaster.Caching
Install-Package BitFaster.Caching
There is also an LRU implementation that is lighter weight than MemoryCache, and has several advantages - faster concurrent reads and writes, bounded size, no background thread, internal perf counters etc. (disclaimer, I wrote it).
Console example of MemoryCache, "How to save/get simple class objects"
Output after launching and pressing Any key except Esc :
Saving to cache!
Getting from cache!
Some1
Some2
class Some
{
public String text { get; set; }
public Some(String text)
{
this.text = text;
}
public override string ToString()
{
return text;
}
}
public static MemoryCache cache = new MemoryCache("cache");
public static string cache_name = "mycache";
static void Main(string[] args)
{
Some some1 = new Some("some1");
Some some2 = new Some("some2");
List<Some> list = new List<Some>();
list.Add(some1);
list.Add(some2);
do {
if (cache.Contains(cache_name))
{
Console.WriteLine("Getting from cache!");
List<Some> list_c = cache.Get(cache_name) as List<Some>;
foreach (Some s in list_c) Console.WriteLine(s);
}
else
{
Console.WriteLine("Saving to cache!");
cache.Set(cache_name, list, DateTime.Now.AddMinutes(10));
}
} while (Console.ReadKey(true).Key != ConsoleKey.Escape);
}
public interface ILazyCacheProvider : IAppCache
{
/// <summary>
/// Get data loaded - after allways throw cached result (even when data is older then needed) but very fast!
/// </summary>
/// <param name="key"></param>
/// <param name="getData"></param>
/// <param name="slidingExpiration"></param>
/// <typeparam name="T"></typeparam>
/// <returns></returns>
T GetOrAddPermanent<T>(string key, Func<T> getData, TimeSpan slidingExpiration);
}
/// <summary>
/// Initialize LazyCache in runtime
/// </summary>
public class LazzyCacheProvider: CachingService, ILazyCacheProvider
{
private readonly Logger _logger = LogManager.GetLogger("MemCashe");
private readonly Hashtable _hash = new Hashtable();
private readonly List<string> _reloader = new List<string>();
private readonly ConcurrentDictionary<string, DateTime> _lastLoad = new ConcurrentDictionary<string, DateTime>();
T ILazyCacheProvider.GetOrAddPermanent<T>(string dataKey, Func<T> getData, TimeSpan slidingExpiration)
{
var currentPrincipal = Thread.CurrentPrincipal;
if (!ObjectCache.Contains(dataKey) && !_hash.Contains(dataKey))
{
_hash[dataKey] = null;
_logger.Debug($"{dataKey} - first start");
_lastLoad[dataKey] = DateTime.Now;
_hash[dataKey] = ((object)GetOrAdd(dataKey, getData, slidingExpiration)).CloneObject();
_lastLoad[dataKey] = DateTime.Now;
_logger.Debug($"{dataKey} - first");
}
else
{
if ((!ObjectCache.Contains(dataKey) || _lastLoad[dataKey].AddMinutes(slidingExpiration.Minutes) < DateTime.Now) && _hash[dataKey] != null)
Task.Run(() =>
{
if (_reloader.Contains(dataKey)) return;
lock (_reloader)
{
if (ObjectCache.Contains(dataKey))
{
if(_lastLoad[dataKey].AddMinutes(slidingExpiration.Minutes) > DateTime.Now)
return;
_lastLoad[dataKey] = DateTime.Now;
Remove(dataKey);
}
_reloader.Add(dataKey);
Thread.CurrentPrincipal = currentPrincipal;
_logger.Debug($"{dataKey} - reload start");
_hash[dataKey] = ((object)GetOrAdd(dataKey, getData, slidingExpiration)).CloneObject();
_logger.Debug($"{dataKey} - reload");
_reloader.Remove(dataKey);
}
});
}
if (_hash[dataKey] != null) return (T) (_hash[dataKey]);
_logger.Debug($"{dataKey} - dummy start");
var data = GetOrAdd(dataKey, getData, slidingExpiration);
_logger.Debug($"{dataKey} - dummy");
return (T)((object)data).CloneObject();
}
}
Its a bit late, however...
Full implementation:
[HttpGet]
public async Task<HttpResponseMessage> GetPageFromUriOrBody(RequestQuery requestQuery)
{
log(nameof(GetPageFromUriOrBody), nameof(requestQuery));
var responseResult = await _requestQueryCache.GetOrCreate(
nameof(GetPageFromUriOrBody)
, requestQuery
, (x) => getPageContent(x).Result);
return Request.CreateResponse(System.Net.HttpStatusCode.Accepted, responseResult);
}
static MemoryCacheWithPolicy<RequestQuery, string> _requestQueryCache = new MemoryCacheWithPolicy<RequestQuery, string>();
Here is getPageContent signature:
async Task<string> getPageContent(RequestQuery requestQuery);
And here is the MemoryCacheWithPolicy implementation:
public class MemoryCacheWithPolicy<TParameter, TResult>
{
static ILogger _nlogger = new AppLogger().Logger;
private MemoryCache _cache = new MemoryCache(new MemoryCacheOptions()
{
//Size limit amount: this is actually a memory size limit value!
SizeLimit = 1024
});
/// <summary>
/// Gets or creates a new memory cache record for a main data
/// along with parameter data that is assocciated with main main.
/// </summary>
/// <param name="key">Main data cache memory key.</param>
/// <param name="param">Parameter model that assocciated to main model (request result).</param>
/// <param name="createCacheData">A delegate to create a new main data to cache.</param>
/// <returns></returns>
public async Task<TResult> GetOrCreate(object key, TParameter param, Func<TParameter, TResult> createCacheData)
{
// this key is used for param cache memory.
var paramKey = key + nameof(param);
if (!_cache.TryGetValue(key, out TResult cacheEntry))
{
// key is not in the cache, create data through the delegate.
cacheEntry = createCacheData(param);
createMemoryCache(key, cacheEntry, paramKey, param);
_nlogger.Warn(" cache is created.");
}
else
{
// data is chached so far..., check if param model is same (or changed)?
if(!_cache.TryGetValue(paramKey, out TParameter cacheParam))
{
//exception: this case should not happened!
}
if (!cacheParam.Equals(param))
{
// request param is changed, create data through the delegate.
cacheEntry = createCacheData(param);
createMemoryCache(key, cacheEntry, paramKey, param);
_nlogger.Warn(" cache is re-created (param model has been changed).");
}
else
{
_nlogger.Trace(" cache is used.");
}
}
return await Task.FromResult<TResult>(cacheEntry);
}
MemoryCacheEntryOptions createMemoryCacheEntryOptions(TimeSpan slidingOffset, TimeSpan relativeOffset)
{
// Cache data within [slidingOffset] seconds,
// request new result after [relativeOffset] seconds.
return new MemoryCacheEntryOptions()
// Size amount: this is actually an entry count per
// key limit value! not an actual memory size value!
.SetSize(1)
// Priority on removing when reaching size limit (memory pressure)
.SetPriority(CacheItemPriority.High)
// Keep in cache for this amount of time, reset it if accessed.
.SetSlidingExpiration(slidingOffset)
// Remove from cache after this time, regardless of sliding expiration
.SetAbsoluteExpiration(relativeOffset);
//
}
void createMemoryCache(object key, TResult cacheEntry, object paramKey, TParameter param)
{
// Cache data within 2 seconds,
// request new result after 5 seconds.
var cacheEntryOptions = createMemoryCacheEntryOptions(
TimeSpan.FromSeconds(2)
, TimeSpan.FromSeconds(5));
// Save data in cache.
_cache.Set(key, cacheEntry, cacheEntryOptions);
// Save param in cache.
_cache.Set(paramKey, param, cacheEntryOptions);
}
void checkCacheEntry<T>(object key, string name)
{
_cache.TryGetValue(key, out T value);
_nlogger.Fatal("Key: {0}, Name: {1}, Value: {2}", key, name, value);
}
}
nlogger is just nLog object to trace MemoryCacheWithPolicy behavior.
I re-create the memory cache if request object (RequestQuery requestQuery) is changed through the delegate (Func<TParameter, TResult> createCacheData) or re-create when sliding or absolute time reached their limit. Note that everything is async too ;)

Better Caching For .NET 3.5 site

I'm trying to optimize old application and right now I'm trying to decrease SQL requests to minimum.
I've created simple caching mechanism, but it requires 3 methods:
public static List<Models.CarModel> GetCarModels()
{
return GetCarModels(false);
}
public static List<Models.CarModel> GetCarModels(bool reload)
{
const string my_key = "GetCarModels";
object list = HttpContext.Current.Cache[my_key] as List<Models.CarModel>;
if ((reload) || (list == null))
{
list = GetCarModels_P();
HttpContext.Current.Cache.Insert(my_key, list, null, DateTime.Now.AddHours(1), TimeSpan.Zero);
}
return (List<Models.CarModel>)list;
}
private static List<Models.CarModel> GetCarModels_P()
{
var tmp_list = new List<Models.CarModel>();
using (var conn = new SqlConnection())
{
conn.ConnectionString = ConfigurationManager.ConnectionStrings["HelpDesk"].ToString();
using (var cmd = new SqlCommand(#"SELECT_CAR_MODELS", conn))
{
cmd.CommandType = CommandType.StoredProcedure;
cmd.CommandTimeout = 360;
conn.Open();
using (SqlDataReader sdr = cmd.ExecuteReader())
{
while (sdr.Read())
{
var carModel = new Models.CarModel()
{
Id = Convert.ToInt32(sdr["Id"]),
Name= Convert.ToString(sdr["CarName"])
};
tmp_list.Add(carModel );
}
}
conn.Close();
}
}
return tmp_list;
}
This works fine, sql query results are cached for 1 hour, but for every old method I must create 3 new, so for 10 old methods I must write 30 new.
I would like to reduce number of code that I must write, there probably is a way to create generic method to read/write to cache.
GetCarModels_P will probably not change, but my 2 public methods can be optimised (I quess).
You can use generics and a lambda to simplify your change:
public static List<T> GetViaCache<T>(bool reload, string key, Func<List<T>> loader)
{
object list = HttpContext.Current.Cache[key] as List<T>;
if ((reload) || (list == null))
{
list = loader();
HttpContext.Current.Cache.Insert(key, list, null, DateTime.Now.AddHours(1), TimeSpan.Zero);
}
return list;
}
This simplifies the loading code as
public static List<Models.CarModel> GetCarModels(bool reload)
{
return GetViaCache<Models.CarModel>(reload, "GetCarModels", GetCarModels_P);
}
of course you will still need to generate a unique key for the cache.
You could implement the pattern in a helper class, something like this:
internal class Cached<T> where T: class
{
private readonly Func<T> _loadFunc;
private readonly string _key;
private readonly TimeSpan _expiration;
public Cached(Func<T> loadFunc, string key, TimeSpan expiration)
{
_loadFunc = loadFunc;
_key = key;
_expiration = expiration;
}
public T GetValue(bool reload)
{
T value = (T)HttpContext.Current.Cache[_key];
if (reload || value == null)
{
value = _loadFunc();
HttpContext.Current.Cache.Insert(_key, value, null,
DateTime.Now + _expiration, TimeSpan.Zero);
}
return value;
}
}
You'd use it like so:
private static Cached<List<Models.CarModel>> _carModels =
new Cached<List<Models.CarModel>>(
GetCarModels_P, "GetCarModels", TimeSpan.FromHours(1));
public static List<Models.CarModel> GetCarModels()
{
return _carModels.GetValue(false);
}
You can also do it with only a generic method of course (like one of the other answers posted while I was writing this). It depends on the situation which I would use. If you have a similar call that forces a reload, my method is a bit easier because you don't have to repeat yourself. However, on the downside, my implementation somewhat suggests that the Cached instance itself is retaining the data personally, while it's actually the HttpContext cache doing so.
You just wrote two very good reasons why you should use an ORM layer like NHibernate or Entity Framework. They take away the pain to write incredible amounts of boilerplate SQL-calling functions. They support caching, I know for a fact that it's easy as a pie with NH, not sure about EF.
No need to throw out any of your existing code, ORM's can coexist with your existing database code, or even use the exact same connections. You can slowly migrate your existing functions one by one, from barehanded SQL to the ORM of your choice. If you want to spend time optimizing an old app, I would say this is the best way to spend that time.

How to deal with costly building operations using MemoryCache?

On an ASP.NET MVC project we have several instances of data that requires good amount of resources and time to build. We want to cache them.
MemoryCache provides certain level of thread-safety but not enough to avoid running multiple instances of building code in parallel. Here is an example:
var data = cache["key"];
if(data == null)
{
data = buildDataUsingGoodAmountOfResources();
cache["key"] = data;
}
As you can see on a busy website hundreds of threads could go inside the if statement simultaneously until the data is built and make the building operation even slower, unnecessarily consuming the server resources.
There is an atomic AddOrGetExisting implementation in MemoryCache but it incorrectly requires "value to set" instead of "code to retrieve the value to set" which I think renders the given method almost completely useless.
We have been using our own ad-hoc scaffolding around MemoryCache to get it right however it requires explicit locks. It's cumbersome to use per-entry lock objects and we usually get away by sharing lock objects which is far from ideal. That made me think that reasons to avoid such convention could be intentional.
So I have two questions:
Is it a better practice not to lock building code? (That could have been proven more responsive for one, I wonder)
What's the right way to achieve per-entry locking for MemoryCache for such a lock? The strong urge to use key string as the lock object is dismissed at ".NET locking 101".
We solved this issue by combining Lazy<T> with AddOrGetExisting to avoid a need for a lock object completely. Here is a sample code (which uses infinite expiration):
public T GetFromCache<T>(string key, Func<T> valueFactory)
{
var newValue = new Lazy<T>(valueFactory);
// the line belows returns existing item or adds the new value if it doesn't exist
var value = (Lazy<T>)cache.AddOrGetExisting(key, newValue, MemoryCache.InfiniteExpiration);
return (value ?? newValue).Value; // Lazy<T> handles the locking itself
}
That's not complete. There are gotchas like "exception caching" so you have to decide about what you want to do in case your valueFactory throws exception. One of the advantages, though, is the ability to cache null values too.
For the conditional add requirement, I always use ConcurrentDictionary, which has an overloaded GetOrAdd method which accepts a delegate to fire if the object needs to be built.
ConcurrentDictionary<string, object> _cache = new
ConcurrenctDictionary<string, object>();
public void GetOrAdd(string key)
{
return _cache.GetOrAdd(key, (k) => {
//here 'k' is actually the same as 'key'
return buildDataUsingGoodAmountOfResources();
});
}
In reality I almost always use static concurrent dictionaries. I used to have 'normal' dictionaries protected by a ReaderWriterLockSlim instance, but as soon as I switched to .Net 4 (it's only available from that onwards) I started converting any of those that I came across.
ConcurrentDictionary's performance is admirable to say the least :)
Update Naive implementation with expiration semantics based on age only. Also should ensure that individual items are only created once - as per #usr's suggestion. Update again - as #usr has suggested - simply using a Lazy<T> would be a lot simpler - you can just forward the creation delegate to that when adding it to the concurrent dictionary. I'be changed the code, as actually my dictionary of locks wouldn't have worked anyway. But I really should have thought of that myself (past midnight here in the UK though and I'm beat. Any sympathy? No of course not. Being a developer, I have enough caffeine coursing through my veins to wake the dead).
I do recommend implementing the IRegisteredObject interface with this, though, and then registering it with the HostingEnvironment.RegisterObject method - doing that would provide a cleaner way to shut down the poller thread when the application pool shuts-down/recycles.
public class ConcurrentCache : IDisposable
{
private readonly ConcurrentDictionary<string, Tuple<DateTime?, Lazy<object>>> _cache =
new ConcurrentDictionary<string, Tuple<DateTime?, Lazy<object>>>();
private readonly Thread ExpireThread = new Thread(ExpireMonitor);
public ConcurrentCache(){
ExpireThread.Start();
}
public void Dispose()
{
//yeah, nasty, but this is a 'naive' implementation :)
ExpireThread.Abort();
}
public void ExpireMonitor()
{
while(true)
{
Thread.Sleep(1000);
DateTime expireTime = DateTime.Now;
var toExpire = _cache.Where(kvp => kvp.First != null &&
kvp.Item1.Value < expireTime).Select(kvp => kvp.Key).ToArray();
Tuple<string, Lazy<object>> removed;
object removedLock;
foreach(var key in toExpire)
{
_cache.TryRemove(key, out removed);
}
}
}
public object CacheOrAdd(string key, Func<string, object> factory,
TimeSpan? expiry)
{
return _cache.GetOrAdd(key, (k) => {
//get or create a new object instance to use
//as the lock for the user code
//here 'k' is actually the same as 'key'
return Tuple.Create(
expiry.HasValue ? DateTime.Now + expiry.Value : (DateTime?)null,
new Lazy<object>(() => factory(k)));
}).Item2.Value;
}
}
Taking the top answer into C# 7, here's my implementation that allows storage from any source type T to any return type TResult.
/// <summary>
/// Creates a GetOrRefreshCache function with encapsulated MemoryCache.
/// </summary>
/// <typeparam name="T">The type of inbound objects to cache.</typeparam>
/// <typeparam name="TResult">How the objects will be serialized to cache and returned.</typeparam>
/// <param name="cacheName">The name of the cache.</param>
/// <param name="valueFactory">The factory for storing values.</param>
/// <param name="keyFactory">An optional factory to choose cache keys.</param>
/// <returns>A function to get or refresh from cache.</returns>
public static Func<T, TResult> GetOrRefreshCacheFactory<T, TResult>(string cacheName, Func<T, TResult> valueFactory, Func<T, string> keyFactory = null) {
var getKey = keyFactory ?? (obj => obj.GetHashCode().ToString());
var cache = new MemoryCache(cacheName);
// Thread-safe lazy cache
TResult getOrRefreshCache(T obj) {
var key = getKey(obj);
var newValue = new Lazy<TResult>(() => valueFactory(obj));
var value = (Lazy<TResult>) cache.AddOrGetExisting(key, newValue, ObjectCache.InfiniteAbsoluteExpiration);
return (value ?? newValue).Value;
}
return getOrRefreshCache;
}
Usage
/// <summary>
/// Get a JSON object from cache or serialize it if it doesn't exist yet.
/// </summary>
private static readonly Func<object, string> GetJson =
GetOrRefreshCacheFactory<object, string>("json-cache", JsonConvert.SerializeObject);
var json = GetJson(new { foo = "bar", yes = true });
Here is simple solution as MemoryCache extension method.
public static class MemoryCacheExtensions
{
public static T LazyAddOrGetExitingItem<T>(this MemoryCache memoryCache, string key, Func<T> getItemFunc, DateTimeOffset absoluteExpiration)
{
var item = new Lazy<T>(
() => getItemFunc(),
LazyThreadSafetyMode.PublicationOnly // Do not cache lazy exceptions
);
var cachedValue = memoryCache.AddOrGetExisting(key, item, absoluteExpiration) as Lazy<T>;
return (cachedValue != null) ? cachedValue.Value : item.Value;
}
}
And test for it as usage description.
[TestMethod]
[TestCategory("MemoryCacheExtensionsTests"), TestCategory("UnitTests")]
public void MemoryCacheExtensions_LazyAddOrGetExitingItem_Test()
{
const int expectedValue = 42;
const int cacheRecordLifetimeInSeconds = 42;
var key = "lazyMemoryCacheKey";
var absoluteExpiration = DateTimeOffset.Now.AddSeconds(cacheRecordLifetimeInSeconds);
var lazyMemoryCache = MemoryCache.Default;
#region Cache warm up
var actualValue = lazyMemoryCache.LazyAddOrGetExitingItem(key, () => expectedValue, absoluteExpiration);
Assert.AreEqual(expectedValue, actualValue);
#endregion
#region Get value from cache
actualValue = lazyMemoryCache.LazyAddOrGetExitingItem(key, () => expectedValue, absoluteExpiration);
Assert.AreEqual(expectedValue, actualValue);
#endregion
}
Sedat's solution of combining Lazy with AddOrGetExisting is inspiring. I must point out that this solution has a performance issue, which seems very important for a solution for caching.
If you look at the code of AddOrGetExisting(), you will find that AddOrGetExisting() is not a lock-free method. Comparing to the lock-free Get() method, it wastes the one of the advantage of MemoryCache.
I would like to recommend to follow solution, using Get() first and then use AddOrGetExisting() to avoid creating object multiple times.
public T GetFromCache<T>(string key, Func<T> valueFactory)
{
T value = (T)cache.Get(key);
if (value != null)
{
return value;
}
var newValue = new Lazy<T>(valueFactory);
// the line belows returns existing item or adds the new value if it doesn't exist
var oldValue = (Lazy<T>)cache.AddOrGetExisting(key, newValue, MemoryCache.InfiniteExpiration);
return (oldValue ?? newValue).Value; // Lazy<T> handles the locking itself
}
Here is a design that follows what you seem to have in mind. The first lock only happens for a short time. The final call to data.Value also locks (underneath), but clients will only block if two of them are requesting the same item at the same time.
public DataType GetData()
{
lock(_privateLockingField)
{
Lazy<DataType> data = cache["key"] as Lazy<DataType>;
if(data == null)
{
data = new Lazy<DataType>(() => buildDataUsingGoodAmountOfResources();
cache["key"] = data;
}
}
return data.Value;
}

Categories