How to implement Caching with data size limit? - c#

I have multiple threads asking for data that has to be loaded over the network.
In order to have less network traffic and faster responses, I'd like to cache data that is requested often. I also want to limit the cache's data size.
My class looks something like this:
public class DataProvider
{
private ConcurrentDictionary<string, byte[]> dataCache;
private int dataCacheSize;
private int maxDataCacheSize;
private object dataCacheSizeLockObj = new object();
public DataProvider(int maxCacheSize)
{
maxDataCacheSize = maxCacheSize;
dataCache = new ConcurrentDictionary<string,byte[]>();
}
public byte[] GetData(string key)
{
byte[] retVal;
if (dataCache.ContainsKey(key))
{
retVal = dataCache[key];
}
else
{
retVal = ... // get data from somewhere else
if (dataCacheSize + retVal.Length <= maxDataCacheSize)
{
lock (dataCacheSizeLockObj)
{
dataCacheSize += retVal.Length;
}
dataCache[key] = retVal;
}
}
return retVal;
}
}
My problem is: how do I make sure that dataCacheSize always has the correct value? If two threads request the same uncached data at the same time, they will both write their data to the cache. That is no problem in itself, because the data is the same and the second thread will just overwrite the cached data with identical data. But how do I know whether the entry was overwritten, so that I avoid counting its size twice?
It could also happen that two threads add data at the same time, resulting in a dataCache size larger than allowed...
Is there an elegant way to accomplish this task without adding complex locking mechanisms?

Instead of trying to "roll your own" caching, take a look at System.Runtime.Caching.MemoryCache. See comment above.
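As a rough illustration of that suggestion, the sketch below creates a MemoryCache with a memory cap passed through its configuration collection. The helper names, the cache name "dataCache", the five-second polling interval and the 20-minute expiration are assumptions made for the example. As far as I know, the limit is checked periodically at the polling interval rather than on every insert, so the cache may exceed it briefly; if you need a hard cap, this alone may not be enough.

```csharp
using System;
using System.Collections.Specialized;
using System.Runtime.Caching;

public static class LimitedCacheExample
{
    // Hypothetical helper: builds a cache that tries to stay under limitMegabytes.
    public static MemoryCache Create(int limitMegabytes)
    {
        var config = new NameValueCollection
        {
            { "cacheMemoryLimitMegabytes", limitMegabytes.ToString() },
            { "pollingInterval", "00:00:05" } // how often the limit is checked
        };
        return new MemoryCache("dataCache", config);
    }

    // Hypothetical helper: returns the cached bytes, loading them on a miss.
    public static byte[] GetData(MemoryCache cache, string key, Func<string, byte[]> load)
    {
        var cached = cache.Get(key) as byte[];
        if (cached != null) return cached;

        byte[] data = load(key);
        // AddOrGetExisting returns the existing entry, or null if ours was added.
        var existing = cache.AddOrGetExisting(key, data, DateTimeOffset.Now.AddMinutes(20)) as byte[];
        return existing ?? data;
    }
}
```

Two threads that miss at the same time may both call load, but only one result is stored and both callers end up with the stored instance.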

Since you update dataCacheSize inside a lock, you can re-check the limit there to make sure it stays correct:
if (dataCacheSize + retVal.Length <= maxDataCacheSize)
{
lock (dataCacheSizeLockObj)
{
if (dataCacheSize + retVal.Length > maxDataCacheSize)
{
return retVal;
}
dataCacheSize += retVal.Length;
}
byte[] oldVal = dataCache.GetOrAdd(key, retVal);
if (oldVal != retVal)
{
// retVal wasn't actually added
lock (dataCacheSizeLockObj)
{
dataCacheSize -= retVal.Length;
}
}
}
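A variation on the same idea, offered only as a sketch and not as the code from the answer above: if the limit check and the insert both happen inside one lock, and the insert uses TryAdd, the counter is only increased when the entry really was added, so no later correction is needed. The class and member names here are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class BoundedByteCache
{
    private readonly ConcurrentDictionary<string, byte[]> _items =
        new ConcurrentDictionary<string, byte[]>();
    private readonly object _sizeLock = new object();
    private readonly long _maxBytes;
    private long _currentBytes;

    public BoundedByteCache(long maxBytes)
    {
        _maxBytes = maxBytes;
    }

    public long CurrentBytes
    {
        get { lock (_sizeLock) { return _currentBytes; } }
    }

    public byte[] GetOrLoad(string key, Func<string, byte[]> load)
    {
        byte[] cached;
        if (_items.TryGetValue(key, out cached)) return cached;

        // Load outside the lock so slow network calls do not block other keys.
        byte[] data = load(key);

        lock (_sizeLock)
        {
            // TryAdd fails if another thread cached this key first, so the
            // size is counted once per key and never exceeds the limit.
            if (_currentBytes + data.Length <= _maxBytes && _items.TryAdd(key, data))
            {
                _currentBytes += data.Length;
            }
        }

        // Prefer the cached instance (ours or another thread's); fall back to
        // our uncached copy when the entry did not fit.
        return _items.TryGetValue(key, out cached) ? cached : data;
    }
}
```

Reads stay lock-free; the lock is only taken on a miss, and only for the short check-and-add step. This sketch never evicts, so once the cache is full, new keys are simply served uncached.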

Related

Locking pattern for proper use of .NET MemoryCache

I assume this code has concurrency issues:
const string CacheKey = "CacheKey";
static string GetCachedData()
{
string expensiveString =null;
if (MemoryCache.Default.Contains(CacheKey))
{
expensiveString = MemoryCache.Default[CacheKey] as string;
}
else
{
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
expensiveString = SomeHeavyAndExpensiveCalculation();
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
}
return expensiveString;
}
The reason for the concurrency issue is that multiple threads can each find the key missing and then all attempt to insert data into the cache.
What would be the shortest and cleanest way to make this code concurrency-proof? I'd like to follow a good pattern across my cache-related code. A link to an online article would be a great help.
UPDATE:
I came up with this code based on @Scott Chamberlain's answer. Can anyone find any performance or concurrency issue with this?
If this works, it would save many lines of code and errors.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Runtime.Caching;
namespace CachePoc
{
class Program
{
static object everoneUseThisLockObject4CacheXYZ = new object();
const string CacheXYZ = "CacheXYZ";
static object everoneUseThisLockObject4CacheABC = new object();
const string CacheABC = "CacheABC";
static void Main(string[] args)
{
string xyzData = MemoryCacheHelper.GetCachedData<string>(CacheXYZ, everoneUseThisLockObject4CacheXYZ, 20, SomeHeavyAndExpensiveXYZCalculation);
string abcData = MemoryCacheHelper.GetCachedData<string>(CacheABC, everoneUseThisLockObject4CacheABC, 20, SomeHeavyAndExpensiveABCCalculation);
}
private static string SomeHeavyAndExpensiveXYZCalculation() {return "Expensive";}
private static string SomeHeavyAndExpensiveABCCalculation() {return "Expensive";}
public static class MemoryCacheHelper
{
public static T GetCachedData<T>(string cacheKey, object cacheLock, int cacheTimePolicyMinutes, Func<T> GetData)
where T : class
{
//Returns null if the string does not exist; this prevents a race condition where the cache invalidates between the Contains check and the retrieval.
T cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
{
return cachedData;
}
lock (cacheLock)
{
//Check to see if anyone wrote to the cache while we were waiting for our turn to write the new value.
cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
{
return cachedData;
}
//The value still did not exist, so we now write it into the cache.
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(cacheTimePolicyMinutes))
};
cachedData = GetData();
MemoryCache.Default.Set(cacheKey, cachedData, cip);
return cachedData;
}
}
}
}
}
This is my 2nd iteration of the code. Because MemoryCache is thread safe, you don't need to lock on the initial read: you can just read, and if the cache returns null, take the lock and check again to see whether you need to create the string. It greatly simplifies the code.
const string CacheKey = "CacheKey";
static readonly object cacheLock = new object();
private static string GetCachedData()
{
//Returns null if the string does not exist; this prevents a race condition where the cache invalidates between the Contains check and the retrieval.
var cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
lock (cacheLock)
{
//Check to see if anyone wrote to the cache while we were waiting for our turn to write the new value.
cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
//The value still did not exist, so we now write it into the cache.
var expensiveString = SomeHeavyAndExpensiveCalculation();
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
return expensiveString;
}
}
EDIT: The code below is unnecessary, but I wanted to leave it to show the original method. It may be useful to future visitors who are using a different collection that has thread-safe reads but non-thread-safe writes (almost all of the classes under the System.Collections namespace are like that).
Here is how I would do it using ReaderWriterLockSlim to protect access. You need to do a kind of "double-checked locking" to see if anyone else created the cached item while we were waiting to take the lock.
const string CacheKey = "CacheKey";
static readonly ReaderWriterLockSlim cacheLock = new ReaderWriterLockSlim();
static string GetCachedData()
{
//First we do a read lock to see if it already exists, this allows multiple readers at the same time.
cacheLock.EnterReadLock();
try
{
//Returns null if the string does not exist; this prevents a race condition where the cache invalidates between the Contains check and the retrieval.
var cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
}
finally
{
cacheLock.ExitReadLock();
}
//Only one UpgradeableReadLock can exist at one time, but it can co-exist with many ReadLocks
cacheLock.EnterUpgradeableReadLock();
try
{
//We need to check again to see if the string was created while we were waiting to enter the upgradeable read lock
var cachedString = MemoryCache.Default.Get(CacheKey, null) as string;
if (cachedString != null)
{
return cachedString;
}
//The entry still does not exist so we need to create it and enter the write lock
var expensiveString = SomeHeavyAndExpensiveCalculation();
cacheLock.EnterWriteLock(); //This will block till all the Readers flush.
try
{
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
return expensiveString;
}
finally
{
cacheLock.ExitWriteLock();
}
}
finally
{
cacheLock.ExitUpgradeableReadLock();
}
}
There is an open source library [disclaimer: that I wrote]: LazyCache that IMO covers your requirement with two lines of code:
IAppCache cache = new CachingService();
var cachedResults = cache.GetOrAdd("CacheKey",
() => SomeHeavyAndExpensiveCalculation());
It has built in locking by default so the cacheable method will only execute once per cache miss, and it uses a lambda so you can do "get or add" in one go. It defaults to 20 minutes sliding expiration.
There's even a NuGet package ;)
I've solved this issue by making use of the AddOrGetExisting method on the MemoryCache and the use of Lazy initialization.
Essentially, my code looks something like this:
static string GetCachedData(string key, DateTimeOffset offset)
{
Lazy<String> lazyObject = new Lazy<String>(() => SomeHeavyAndExpensiveCalculationThatReturnsAString());
var returnedLazyObject = MemoryCache.Default.AddOrGetExisting(key, lazyObject, offset);
if (returnedLazyObject == null)
return lazyObject.Value;
return ((Lazy<String>) returnedLazyObject).Value;
}
Worst case scenario here is that you create the same Lazy object twice. But that is pretty trivial. The use of AddOrGetExisting guarantees that you'll only ever get one instance of the Lazy object, and so you're also guaranteed to only call the expensive initialization method once.
I assume this code has concurrency issues:
Actually, it's quite possibly fine, though with a possible improvement.
Now, in general, when multiple threads set a shared value on first use, not locking around obtaining and setting that value can be:
Disastrous - other code will assume only one instance exists.
Disastrous - the code that obtains the instance can only tolerate one (or perhaps a certain small number of) concurrent operations.
Disastrous - the means of storage is not thread-safe (e.g. have two threads adding to a dictionary and you can get all sorts of nasty errors).
Sub-optimal - the overall performance is worse than if locking had ensured only one thread did the work of obtaining the value.
Optimal - the cost of having multiple threads do redundant work is less than the cost of preventing it, especially since that can only happen during a relatively brief period.
However, considering here that MemoryCache may evict entries then:
If it's disastrous to have more than one instance then MemoryCache is the wrong approach.
If you must prevent simultaneous creation, you should do so at the point of creation.
MemoryCache is thread-safe in terms of access to that object, so that is not a concern here.
Both of these possibilities have to be thought about of course, though the only time having two instances of the same string existing can be a problem is if you're doing very particular optimisations that don't apply here*.
So, we're left with the possibilities:
It is cheaper to avoid the cost of duplicate calls to SomeHeavyAndExpensiveCalculation().
It is cheaper not to avoid the cost of duplicate calls to SomeHeavyAndExpensiveCalculation().
And working that out can be difficult (indeed, the sort of thing where it's worth profiling rather than assuming you can work it out). It's worth considering here though that most obvious ways of locking on insert will prevent all additions to the cache, including those that are unrelated.
This means that if we had 50 threads trying to set 50 different values, then we'll have to make all 50 threads wait on each other, even though they weren't even going to do the same calculation.
As such, you're probably better off with the code you have, than with code that avoids the race-condition, and if the race-condition is a problem, you quite likely either need to handle that somewhere else, or need a different caching strategy than one that expels old entries†.
The one thing I would change is I'd replace the call to Set() with one to AddOrGetExisting(). From the above it should be clear that it probably isn't necessary, but it would allow the newly obtained item to be collected, reducing overall memory use and allowing a higher ratio of low generation to high generation collections.
So yeah, you could use double-checked locking to prevent the concurrent creation, but either the concurrency isn't actually a problem, or you're storing the values in the wrong way, or double-checked locking on the store would not be the best way to solve it.
*If you know only one each of a set of strings exists, you can optimise equality comparisons, which is about the only time having two copies of a string can be incorrect rather than just sub-optimal, but you'd want to be doing very different types of caching for that to make sense. E.g. the sort XmlReader does internally.
†Quite likely either one that stores indefinitely, or one that makes use of weak references so it will only expel entries if there are no existing uses.
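To illustrate the point above about a single lock making 50 unrelated threads wait on each other, here is a minimal sketch of a per-key alternative. It deliberately uses a plain ConcurrentDictionary rather than MemoryCache, so it never evicts anything; PerKeyCache is a hypothetical name, and this is not the code from the answer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class PerKeyCache<T>
{
    private readonly ConcurrentDictionary<string, Lazy<T>> _entries =
        new ConcurrentDictionary<string, Lazy<T>>();

    public T GetOrAdd(string key, Func<T> create)
    {
        // Under contention GetOrAdd may construct several Lazy wrappers, but
        // only one is stored, and ExecutionAndPublication makes that one run
        // its factory at most once. Threads working on other keys never wait.
        Lazy<T> lazy = _entries.GetOrAdd(
            key,
            _ => new Lazy<T>(create, LazyThreadSafetyMode.ExecutionAndPublication));
        return lazy.Value;
    }
}
```

Only callers asking for the same key block each other, and only while that key's value is being created. One caveat: in this mode a Lazy also remembers an exception thrown by its factory, so a failed creation stays failed until the entry is removed.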
Somewhat dated question, but maybe still useful: you may take a look at FusionCache ⚡🦥, which I recently released.
The feature you are looking for is described here, and you can use it like this:
const string CacheKey = "CacheKey";
static string GetCachedData()
{
return fusionCache.GetOrSet(
CacheKey,
_ => SomeHeavyAndExpensiveCalculation(),
TimeSpan.FromMinutes(20)
);
}
You may also find some of the other features interesting like fail-safe, advanced timeouts with background factory completion and support for an optional, distributed 2nd level cache.
If you give it a chance, please let me know what you think.
/shameless-plug
It is difficult to say which is better, lock or ReaderWriterLockSlim. You need real-world statistics on read and write counts, ratios, and so on.
But if you believe using "lock" is the correct way, then here is a different solution for different needs. I have also included Allan Xu's solution in the code, because either can be useful depending on the situation.
Here are the requirements that drove me to this solution:
You don't want to, or cannot, supply the 'GetData' function for some reason. Perhaps the 'GetData' function is located in some other class with a heavy constructor, and you do not want to create an instance until you are sure it is unavoidable.
You need to access the same cached data from different locations/tiers of the application, and those locations don't have access to the same locker object.
You don't have a constant cache key. For example, you need to cache some data using the sessionId as the cache key.
Code:
using System;
using System.Runtime.Caching;
using System.Collections.Concurrent;
using System.Collections.Generic;
namespace CachePoc
{
class Program
{
static object everoneUseThisLockObject4CacheXYZ = new object();
const string CacheXYZ = "CacheXYZ";
static object everoneUseThisLockObject4CacheABC = new object();
const string CacheABC = "CacheABC";
static void Main(string[] args)
{
//Allan Xu's usage
string xyzData = MemoryCacheHelper.GetCachedDataOrAdd<string>(CacheXYZ, everoneUseThisLockObject4CacheXYZ, 20, SomeHeavyAndExpensiveXYZCalculation);
string abcData = MemoryCacheHelper.GetCachedDataOrAdd<string>(CacheABC, everoneUseThisLockObject4CacheABC, 20, SomeHeavyAndExpensiveABCCalculation);
//My usage
string sessionId = System.Web.HttpContext.Current.Session["CurrentUser.SessionId"].ToString();
string yvz = MemoryCacheHelper.GetCachedData<string>(sessionId);
if (string.IsNullOrWhiteSpace(yvz))
{
object locker = MemoryCacheHelper.GetLocker(sessionId);
lock (locker)
{
yvz = MemoryCacheHelper.GetCachedData<string>(sessionId);
if (string.IsNullOrWhiteSpace(yvz))
{
DatabaseRepositoryWithHeavyConstructorOverHead dbRepo = new DatabaseRepositoryWithHeavyConstructorOverHead();
yvz = dbRepo.GetDataExpensiveDataForSession(sessionId);
MemoryCacheHelper.AddDataToCache(sessionId, yvz, 5);
}
}
}
}
private static string SomeHeavyAndExpensiveXYZCalculation() { return "Expensive"; }
private static string SomeHeavyAndExpensiveABCCalculation() { return "Expensive"; }
public static class MemoryCacheHelper
{
//Allan Xu's solution
public static T GetCachedDataOrAdd<T>(string cacheKey, object cacheLock, int minutesToExpire, Func<T> GetData) where T : class
{
//Returns null if the string does not exist; this prevents a race condition where the cache invalidates between the Contains check and the retrieval.
T cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
return cachedData;
lock (cacheLock)
{
//Check to see if anyone wrote to the cache while we were waiting for our turn to write the new value.
cachedData = MemoryCache.Default.Get(cacheKey, null) as T;
if (cachedData != null)
return cachedData;
cachedData = GetData();
MemoryCache.Default.Set(cacheKey, cachedData, DateTime.Now.AddMinutes(minutesToExpire));
return cachedData;
}
}
#region "My Solution"
readonly static ConcurrentDictionary<string, object> Lockers = new ConcurrentDictionary<string, object>();
public static object GetLocker(string cacheKey)
{
CleanupLockers();
return Lockers.GetOrAdd(cacheKey, _ => new object());
}
public static T GetCachedData<T>(string cacheKey) where T : class
{
CleanupLockers();
T cachedData = MemoryCache.Default.Get(cacheKey) as T;
return cachedData;
}
public static void AddDataToCache(string cacheKey, object value, int cacheTimePolicyMinutes)
{
CleanupLockers();
MemoryCache.Default.Add(cacheKey, value, DateTimeOffset.Now.AddMinutes(cacheTimePolicyMinutes));
}
static DateTimeOffset lastCleanUpTime = DateTimeOffset.MinValue;
static void CleanupLockers()
{
if (DateTimeOffset.Now.Subtract(lastCleanUpTime).TotalMinutes > 1)
{
lock (Lockers)//maybe a better locker is needed?
{
try//bypass exceptions
{
List<string> lockersToRemove = new List<string>();
foreach (var locker in Lockers)
{
if (!MemoryCache.Default.Contains(locker.Key))
lockersToRemove.Add(locker.Key);
}
object dummy;
foreach (string lockerKey in lockersToRemove)
Lockers.TryRemove(lockerKey, out dummy);
lastCleanUpTime = DateTimeOffset.Now;
}
catch (Exception)
{ }
}
}
}
#endregion
}
}
class DatabaseRepositoryWithHeavyConstructorOverHead
{
internal string GetDataExpensiveDataForSession(string sessionId)
{
return "Expensive data from database";
}
}
}
To avoid the global lock, you can use SingletonCache to implement one lock per key, without exploding memory usage (the lock objects are removed when no longer referenced, and acquire/release is thread safe guaranteeing that only 1 instance is ever in use via compare and swap).
Using it looks like this:
static readonly SingletonCache<string, object> keyLocks = new SingletonCache<string, object>();
const string CacheKey = "CacheKey";
static string GetCachedData()
{
string expensiveString =null;
if (MemoryCache.Default.Contains(CacheKey))
{
return MemoryCache.Default[CacheKey] as string;
}
// double checked lock
using (var lifetime = keyLocks.Acquire(CacheKey))
{
lock (lifetime.Value)
{
if (MemoryCache.Default.Contains(CacheKey))
{
return MemoryCache.Default[CacheKey] as string;
}
CacheItemPolicy cip = new CacheItemPolicy()
{
AbsoluteExpiration = new DateTimeOffset(DateTime.Now.AddMinutes(20))
};
expensiveString = SomeHeavyAndExpensiveCalculation();
MemoryCache.Default.Set(CacheKey, expensiveString, cip);
return expensiveString;
}
}
}
Code is here on GitHub: https://github.com/bitfaster/BitFaster.Caching
Install-Package BitFaster.Caching
There is also an LRU implementation that is lighter weight than MemoryCache, and has several advantages - faster concurrent reads and writes, bounded size, no background thread, internal perf counters etc. (disclaimer, I wrote it).
Console example of MemoryCache, "How to save/get simple class objects"
Output after launching and pressing any key except Esc:
Saving to cache!
Getting from cache!
some1
some2
class Some
{
public String text { get; set; }
public Some(String text)
{
this.text = text;
}
public override string ToString()
{
return text;
}
}
public static MemoryCache cache = new MemoryCache("cache");
public static string cache_name = "mycache";
static void Main(string[] args)
{
Some some1 = new Some("some1");
Some some2 = new Some("some2");
List<Some> list = new List<Some>();
list.Add(some1);
list.Add(some2);
do {
if (cache.Contains(cache_name))
{
Console.WriteLine("Getting from cache!");
List<Some> list_c = cache.Get(cache_name) as List<Some>;
foreach (Some s in list_c) Console.WriteLine(s);
}
else
{
Console.WriteLine("Saving to cache!");
cache.Set(cache_name, list, DateTime.Now.AddMinutes(10));
}
} while (Console.ReadKey(true).Key != ConsoleKey.Escape);
}
public interface ILazyCacheProvider : IAppCache
{
/// <summary>
/// Gets the data. After the first load, always returns the cached result immediately, even when the data is older than required.
/// </summary>
/// <param name="key"></param>
/// <param name="getData"></param>
/// <param name="slidingExpiration"></param>
/// <typeparam name="T"></typeparam>
/// <returns></returns>
T GetOrAddPermanent<T>(string key, Func<T> getData, TimeSpan slidingExpiration);
}
/// <summary>
/// Initialize LazyCache in runtime
/// </summary>
public class LazzyCacheProvider: CachingService, ILazyCacheProvider
{
private readonly Logger _logger = LogManager.GetLogger("MemCashe");
private readonly Hashtable _hash = new Hashtable();
private readonly List<string> _reloader = new List<string>();
private readonly ConcurrentDictionary<string, DateTime> _lastLoad = new ConcurrentDictionary<string, DateTime>();
T ILazyCacheProvider.GetOrAddPermanent<T>(string dataKey, Func<T> getData, TimeSpan slidingExpiration)
{
var currentPrincipal = Thread.CurrentPrincipal;
if (!ObjectCache.Contains(dataKey) && !_hash.Contains(dataKey))
{
_hash[dataKey] = null;
_logger.Debug($"{dataKey} - first start");
_lastLoad[dataKey] = DateTime.Now;
_hash[dataKey] = ((object)GetOrAdd(dataKey, getData, slidingExpiration)).CloneObject();
_lastLoad[dataKey] = DateTime.Now;
_logger.Debug($"{dataKey} - first");
}
else
{
if ((!ObjectCache.Contains(dataKey) || _lastLoad[dataKey].Add(slidingExpiration) < DateTime.Now) && _hash[dataKey] != null)
Task.Run(() =>
{
if (_reloader.Contains(dataKey)) return;
lock (_reloader)
{
if (ObjectCache.Contains(dataKey))
{
if(_lastLoad[dataKey].Add(slidingExpiration) > DateTime.Now)
return;
_lastLoad[dataKey] = DateTime.Now;
Remove(dataKey);
}
_reloader.Add(dataKey);
Thread.CurrentPrincipal = currentPrincipal;
_logger.Debug($"{dataKey} - reload start");
_hash[dataKey] = ((object)GetOrAdd(dataKey, getData, slidingExpiration)).CloneObject();
_logger.Debug($"{dataKey} - reload");
_reloader.Remove(dataKey);
}
});
}
if (_hash[dataKey] != null) return (T) (_hash[dataKey]);
_logger.Debug($"{dataKey} - dummy start");
var data = GetOrAdd(dataKey, getData, slidingExpiration);
_logger.Debug($"{dataKey} - dummy");
return (T)((object)data).CloneObject();
}
}
It's a bit late, however...
Full implementation:
[HttpGet]
public async Task<HttpResponseMessage> GetPageFromUriOrBody(RequestQuery requestQuery)
{
log(nameof(GetPageFromUriOrBody), nameof(requestQuery));
var responseResult = await _requestQueryCache.GetOrCreate(
nameof(GetPageFromUriOrBody)
, requestQuery
, (x) => getPageContent(x).Result);
return Request.CreateResponse(System.Net.HttpStatusCode.Accepted, responseResult);
}
static MemoryCacheWithPolicy<RequestQuery, string> _requestQueryCache = new MemoryCacheWithPolicy<RequestQuery, string>();
Here is getPageContent signature:
async Task<string> getPageContent(RequestQuery requestQuery);
And here is the MemoryCacheWithPolicy implementation:
public class MemoryCacheWithPolicy<TParameter, TResult>
{
static ILogger _nlogger = new AppLogger().Logger;
private MemoryCache _cache = new MemoryCache(new MemoryCacheOptions()
{
//SizeLimit is unitless: it caps the sum of the sizes that entries declare via SetSize, not bytes of memory.
SizeLimit = 1024
});
/// <summary>
/// Gets or creates a memory cache record for the main data,
/// along with the parameter data that is associated with the main data.
/// </summary>
/// <param name="key">Main data cache memory key.</param>
/// <param name="param">Parameter model that is associated with the main model (request result).</param>
/// <param name="createCacheData">A delegate to create a new main data to cache.</param>
/// <returns></returns>
public async Task<TResult> GetOrCreate(object key, TParameter param, Func<TParameter, TResult> createCacheData)
{
// this key is used for param cache memory.
var paramKey = key + nameof(param);
if (!_cache.TryGetValue(key, out TResult cacheEntry))
{
// key is not in the cache, create data through the delegate.
cacheEntry = createCacheData(param);
createMemoryCache(key, cacheEntry, paramKey, param);
_nlogger.Warn(" cache is created.");
}
else
{
// data is already cached; check whether the param model is the same (or changed).
if(!_cache.TryGetValue(paramKey, out TParameter cacheParam))
{
//exception: this case should not happen!
}
if (!cacheParam.Equals(param))
{
// request param is changed, create data through the delegate.
cacheEntry = createCacheData(param);
createMemoryCache(key, cacheEntry, paramKey, param);
_nlogger.Warn(" cache is re-created (param model has been changed).");
}
else
{
_nlogger.Trace(" cache is used.");
}
}
return await Task.FromResult<TResult>(cacheEntry);
}
MemoryCacheEntryOptions createMemoryCacheEntryOptions(TimeSpan slidingOffset, TimeSpan relativeOffset)
{
// Cache data within [slidingOffset] seconds,
// request new result after [relativeOffset] seconds.
return new MemoryCacheEntryOptions()
// Size: each entry counts as 1 unit against SizeLimit, so the limit is
// effectively an entry count, not a memory size in bytes.
.SetSize(1)
// Priority on removing when reaching size limit (memory pressure)
.SetPriority(CacheItemPriority.High)
// Keep in cache for this amount of time, reset it if accessed.
.SetSlidingExpiration(slidingOffset)
// Remove from cache after this time, regardless of sliding expiration
.SetAbsoluteExpiration(relativeOffset);
//
}
void createMemoryCache(object key, TResult cacheEntry, object paramKey, TParameter param)
{
// Cache data within 2 seconds,
// request new result after 5 seconds.
var cacheEntryOptions = createMemoryCacheEntryOptions(
TimeSpan.FromSeconds(2)
, TimeSpan.FromSeconds(5));
// Save data in cache.
_cache.Set(key, cacheEntry, cacheEntryOptions);
// Save param in cache.
_cache.Set(paramKey, param, cacheEntryOptions);
}
void checkCacheEntry<T>(object key, string name)
{
_cache.TryGetValue(key, out T value);
_nlogger.Fatal("Key: {0}, Name: {1}, Value: {2}", key, name, value);
}
}
_nlogger is just an NLog object used to trace the behavior of MemoryCacheWithPolicy.
I re-create the cache entry through the delegate (Func<TParameter, TResult> createCacheData) if the request object (RequestQuery requestQuery) has changed, or when the sliding or absolute expiration is reached. Note that everything is async too ;)

Design pattern for dynamic C# object

I have a queue that processes objects in a while loop. They are added asynchronously somewhere.. like this:
myqueue.pushback(String value);
And they are processed like this:
while(true)
{
String path = queue.pop();
if(process(path))
{
Console.WriteLine("Good!");
}
else
{
queue.pushback(path);
}
}
Now, the thing is that I'd like to modify this to support a TTL-like (time to live) flag, so the file path would be added no more than n times.
How could I do this, while keeping the bool process(String path) function signature? I don't want to modify that.
I thought about holding a map, or a list, that counts how many times the process function returned false for a path, and dropping the path from the list at the n-th return of false. I wonder how this can be done more dynamically; preferably, I'd like the TTL to automatically decrement itself at each new addition to the process. I hope I am not talking trash.
Maybe using something like this
class JobData
{
public string path;
public short ttl;
public static implicit operator String(JobData jobData) {jobData.ttl--; return jobData.path;}
}
I like the idea of a JobData class, but there's already an answer demonstrating that, and the fact that you're working with file paths gives you another possible advantage. Certain characters are not valid in file paths, so you could choose one to use as a delimiter. The advantage here is that the queue type remains a string, and so you would not have to modify any of your existing asynchronous code. You can see a list of reserved path characters here:
http://en.wikipedia.org/wiki/Filename#Reserved_characters_and_words
For our purposes, I'll use the percent (%) character. Then you can modify your code as follows, and nothing else needs to change:
const int startingTTL = 100;
const string delimiter = "%";
while(true)
{
String[] path = queue.pop().Split(delimiter.ToCharArray());
int ttl = path.Length > 1 ? int.Parse(path[1]) - 1 : startingTTL;
if(process(path[0]))
{
Console.WriteLine("Good!");
}
else if (ttl > 0)
{
queue.pushback(string.Format("{0}{1}{2}", path[0], delimiter,ttl));
}
else
{
Console.WriteLine("TTL expired for path: {0}", path[0]);
}
}
Again, from a pure architecture standpoint, a class with two properties is a better design... but from a practical standpoint, YAGNI: this option means you can avoid going back and changing other asynchronous code that pushes into the queue. That code still only needs to know about the strings, and will work with this unmodified.
One more thing. I want to point out that this is a fairly tight loop, prone to running away with a CPU core. Additionally, if this is the .NET Queue type and your tight loop gets ahead of your asynchronous producers and empties the queue, you'll throw an exception, which would break out of the while(true) block. You can solve both issues with code like this:
while(true)
{
try
{
String[] path = queue.pop().Split(delimiter.ToCharArray());
int ttl = path.Length > 1 ? int.Parse(path[1]) - 1 : startingTTL;
if(process(path[0]))
{
Console.WriteLine("Good!");
}
else if (ttl > 0)
{
queue.pushback(string.Format("{0}{1}{2}", path[0], delimiter,ttl));
}
else
{
Console.WriteLine("TTL expired for path: {0}", path[0]);
}
}
catch(InvalidOperationException ex)
{
//Queue.Dequeue throws InvalidOperation if the queue is empty... sleep for a bit before trying again
Thread.Sleep(100);
}
}
If the constraint is that bool process(String path) cannot be touched/changed, then put the functionality into myqueue. You can keep its public signatures of void pushback(string path) and string pop(), but internally you can track your TTL. You can either wrap the string paths in a JobData-like class that gets added to the internal queue, or you can have a secondary Dictionary keyed by path. Perhaps even something as simple as saving the last popped path: if the subsequent push is the same path, you can assume it was a rejected/failed item. Also, in your pop method you can even discard a path that has been rejected too many times and internally fetch the next path, so the calling code is blissfully unaware of the issue.
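A minimal sketch of that idea, assuming a single consumer thread: the wrapper remembers the last popped path, treats an immediate push of the same path as a failed attempt, and keeps failure counts in a Dictionary. TtlQueue and maxFailures are hypothetical names, and the class is not thread safe as written.

```csharp
using System;
using System.Collections.Generic;

public sealed class TtlQueue
{
    private readonly Queue<string> _queue = new Queue<string>();
    private readonly Dictionary<string, int> _failures = new Dictionary<string, int>();
    private readonly int _maxFailures;
    private string _lastPopped;

    public TtlQueue(int maxFailures)
    {
        _maxFailures = maxFailures;
    }

    public int Count
    {
        get { return _queue.Count; }
    }

    public void Pushback(string path)
    {
        if (path != null && path == _lastPopped)
        {
            // Same path as the last pop: assume process() rejected it.
            _lastPopped = null;
            int failures;
            _failures.TryGetValue(path, out failures); // stays 0 if absent
            failures++;
            if (failures >= _maxFailures)
            {
                _failures.Remove(path);
                return; // TTL exhausted: drop the path silently
            }
            _failures[path] = failures;
        }
        _queue.Enqueue(path);
    }

    public string Pop()
    {
        // If the previous path was never pushed back, it succeeded:
        // forget its failure history.
        if (_lastPopped != null) _failures.Remove(_lastPopped);
        _lastPopped = _queue.Dequeue(); // throws InvalidOperationException when empty
        return _lastPopped;
    }
}
```

The existing while loop and the bool process(String path) signature stay unchanged. The heuristic does have a blind spot: a producer that happens to push the same path right after it was popped is counted as a failure.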
You could abstract/encapsulate the functionality of the "job manager". Hide the queue and implementation from the caller so you can do whatever you want without the callers caring. Something like this:
public static class JobManager
{
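private const short DEFAULT_TTL = 3; // assumed value; the original snippet leaves DEFAULT_TTL undefined
private static readonly Queue<JobData> _queue = new Queue<JobData>();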
static JobManager() { Task.Factory.StartNew(() => { StartProcessing(); }); }
public static void AddJob(string value)
{
//TODO: validate
_queue.Enqueue(new JobData(value));
}
private static void StartProcessing()
{
while (true)
{
if (_queue.Count > 0)
{
JobData data = _queue.Dequeue();
if (!process(data.Path))
{
data.TTL--;
if (data.TTL > 0)
_queue.Enqueue(data);
}
}
else
{
Thread.Sleep(1000);
}
}
}
private class JobData
{
public string Path { get; set; }
public short TTL { get; set; }
public JobData(string value)
{
this.Path = value;
this.TTL = DEFAULT_TTL;
}
}
}
Then your processing loop can handle the TTL value.
Edit - Added a simple processing loop. This code isn't thread safe, but should hopefully give you an idea.
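Since the loop above is not thread safe, here is one possible thread-safe variation, sketched with BlockingCollection so the worker blocks while the queue is empty instead of polling with Thread.Sleep. RetryingJobQueue, the process delegate parameter and startingTtl are hypothetical names; the delegate stands in for the existing bool process(String path) function.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class RetryingJobQueue : IDisposable
{
    private sealed class Job
    {
        public string Path;
        public int Ttl;
    }

    private readonly BlockingCollection<Job> _jobs = new BlockingCollection<Job>();
    private readonly CancellationTokenSource _stop = new CancellationTokenSource();
    private readonly Func<string, bool> _process;
    private readonly int _startingTtl;
    private readonly Task _worker;

    public RetryingJobQueue(Func<string, bool> process, int startingTtl)
    {
        _process = process;
        _startingTtl = startingTtl;
        _worker = Task.Run(() => Run());
    }

    // Safe to call from any thread.
    public void AddJob(string path)
    {
        _jobs.Add(new Job { Path = path, Ttl = _startingTtl });
    }

    private void Run()
    {
        try
        {
            // Blocks while the queue is empty; ends when _stop is cancelled.
            foreach (Job job in _jobs.GetConsumingEnumerable(_stop.Token))
            {
                if (_process(job.Path)) continue;
                job.Ttl--;
                if (job.Ttl > 0) _jobs.Add(job); // retry later, behind newer jobs
            }
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown path.
        }
    }

    public void Dispose()
    {
        _stop.Cancel();
        _worker.Wait();
        _jobs.Dispose();
        _stop.Dispose();
    }
}
```

With a starting TTL of n, a path that keeps failing is attempted n times and then dropped. If the process delegate throws, the worker task faults and Dispose rethrows the error wrapped in an AggregateException, so real code would probably want to catch and log inside the loop.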

C# OutOfMemory, Mapped Memory File or Temp Database

Seeking some advice, best practice etc...
Technology: C# .NET4.0, Winforms, 32 bit
I am seeking some advice on how I can best tackle large data processing in my C# Winforms application which experiences high memory usage (working set) and the occasional OutOfMemory exception.
The problem is that we perform a large amount of data processing "in-memory" when a "shopping-basket" is opened. In simplistic terms when a "shopping-basket" is loaded we perform the following calculations;
1) For each item in the "shopping-basket" retrieve its historical price going all the way back to the date the item first appeared in-stock (could be two months, two years or two decades of data). Historical price data is retrieved from text files, over the internet, or in any format which is supported by a price plugin.
2) For each item, for each day since it first appeared in-stock, calculate various metrics which build a historical profile for each item in the shopping-basket.
The result is that we can potentially perform hundreds, thousands or even millions of calculations depending upon the number of items in the "shopping-basket". If the basket contains too many items we run the risk of hitting an "OutOfMemory" exception.
A couple of caveats;
This data needs to be calculated for each item in the "shopping-basket" and the data is kept until the "shopping-basket" is closed.
Even though we perform steps 1 and 2 in a background thread, speed is important as the number of items in the "shopping-basket" can greatly affect overall calculation speed.
Memory is salvaged by the .NET garbage collector when a "shopping-basket" is closed. We have profiled our application and ensure that all references are correctly disposed and closed when a basket is closed.
After all the calculations are completed the resultant data is stored in an IDictionary. "CalculatedData" is a class whose properties are the individual metrics calculated by the above process.
Some ideas I've thought about;
Obviously my main concern is to reduce the amount of memory being used by the calculations however the volume of memory used can only be reduced if I
1) reduce the number of metrics being calculated for each day or
2) reduce the number of days used for the calculation.
Both of these options are not viable if we wish to fulfill our business requirements.
Memory Mapped Files
One idea has been to use memory mapped files which will store the data dictionary. Would this be possible/feasible and how can we put this into place?
Use a temporary database
The idea is to use a separate (not in-memory) database which can be created for the life-cycle of the application. As "shopping-baskets" are opened we can persist the calculated data to the database for repeated use, alleviating the requirement to recalculate for the same "shopping-basket".
Are there any other alternatives that we should consider? What is best practice when it comes to calculations on large data and performing them outside of RAM?
Any advice is appreciated....
The easiest solution is a database, perhaps SQLite. Memory mapped files don't automatically become dictionaries; you would have to code all the memory management yourself, and thereby fight with the .NET GC system itself for ownership of the data.
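To give a feel for what "code all the memory management yourself" means, here is a minimal sketch of a file-backed store for fixed-size records. The class name MappedDoubleStore and the day/metric layout are assumptions made for this example, not something from the question; the point is only that you compute every offset by hand.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

// Stores one double per (day, metric) pair in a memory mapped file.
public sealed class MappedDoubleStore : IDisposable
{
    private readonly string _path;
    private readonly int _metricsPerDay;
    private readonly MemoryMappedFile _file;
    private readonly MemoryMappedViewAccessor _view;

    public MappedDoubleStore(string path, int days, int metricsPerDay)
    {
        _path = path;
        _metricsPerDay = metricsPerDay;
        long bytes = (long)days * metricsPerDay * sizeof(double);
        // Creates (or overwrites) the backing file with the requested capacity.
        _file = MemoryMappedFile.CreateFromFile(path, FileMode.Create, null, bytes);
        _view = _file.CreateViewAccessor();
    }

    // The "dictionary" is nothing more than arithmetic on offsets.
    private long Offset(int day, int metric)
    {
        return ((long)day * _metricsPerDay + metric) * sizeof(double);
    }

    public void Write(int day, int metric, double value)
    {
        _view.Write(Offset(day, metric), value);
    }

    public double Read(int day, int metric)
    {
        return _view.ReadDouble(Offset(day, metric));
    }

    public void Dispose()
    {
        _view.Dispose();
        _file.Dispose();
        File.Delete(_path); // temporary store: remove the backing file
    }
}
```

Note that a mapped view still consumes virtual address space, so in a 32-bit process you would map smaller windows of the file rather than the whole thing at once. Variable-length or keyed data needs an index on top of this, which is where a database starts to look simpler.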
If you're interested in trying the memory mapped file approach, you can try it now. I wrote a small native .NET package called MemMapCache that in essence creates a key/val database backed by MemMappedFiles. It's a bit of a hacky concept, but the program MemMapCache.exe keeps all references to the memory mapped files so that if your application crashes, you don't have to worry about losing the state of your cache.
It's very simple to use and you should be able to drop it in your code without too many modifications. Here is an example using it: https://github.com/jprichardson/MemMapCache/blob/master/TestMemMapCache/MemMapCacheTest.cs
Maybe it'd be of some use to you to at least further figure out what you need to do for an actual solution.
Please let me know if you do end up using it. I'd be interested in your results.
However, long-term, I'd recommend Redis.
As an update for those stumbling upon this thread...
We ended up using SQLite as our caching solution. The SQLite database we employ exists separate to the main data store used by the application. We persist calculated data to the SQLite (diskCache) as it's required and have code controlling cache invalidation etc. This was a suitable solution for us as we were able to achieve write speeds of around 100,000 records per second.
For those interested, this is the code that controls inserts into the diskCache. Full credit for this code goes to JP Richardson (shown answering a question here) for his excellent blog post.
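using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SQLite;
using System.Text;

internal class SQLiteBulkInsert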
{
#region Class Declarations
private SQLiteCommand m_cmd;
private SQLiteTransaction m_trans;
private readonly SQLiteConnection m_dbCon;
private readonly Dictionary<string, SQLiteParameter> m_parameters = new Dictionary<string, SQLiteParameter>();
private uint m_counter;
private readonly string m_beginInsertText;
#endregion
#region Constructor
public SQLiteBulkInsert(SQLiteConnection dbConnection, string tableName)
{
m_dbCon = dbConnection;
m_tableName = tableName;
var query = new StringBuilder(255);
query.Append("INSERT INTO ["); query.Append(tableName); query.Append("] (");
m_beginInsertText = query.ToString();
}
#endregion
#region Allow Bulk Insert
private bool m_allowBulkInsert = true;
public bool AllowBulkInsert { get { return m_allowBulkInsert; } set { m_allowBulkInsert = value; } }
#endregion
#region CommandText
public string CommandText
{
get
{
if(m_parameters.Count < 1) throw new SQLiteException("You must add at least one parameter.");
var sb = new StringBuilder(255);
sb.Append(m_beginInsertText);
foreach(var param in m_parameters.Keys)
{
sb.Append('[');
sb.Append(param);
sb.Append(']');
sb.Append(", ");
}
sb.Remove(sb.Length - 2, 2);
sb.Append(") VALUES (");
foreach(var param in m_parameters.Keys)
{
sb.Append(m_paramDelim);
sb.Append(param);
sb.Append(", ");
}
sb.Remove(sb.Length - 2, 2);
sb.Append(")");
return sb.ToString();
}
}
#endregion
#region Commit Max
private uint m_commitMax = 25000;
public uint CommitMax { get { return m_commitMax; } set { m_commitMax = value; } }
#endregion
#region Table Name
private readonly string m_tableName;
public string TableName { get { return m_tableName; } }
#endregion
#region Parameter Delimiter
private const string m_paramDelim = ":";
public string ParamDelimiter { get { return m_paramDelim; } }
#endregion
#region AddParameter
public void AddParameter(string name, DbType dbType)
{
var param = new SQLiteParameter(m_paramDelim + name, dbType);
m_parameters.Add(name, param);
}
#endregion
#region Flush
public void Flush()
{
try
{
if (m_trans != null) m_trans.Commit();
}
catch (Exception ex)
{
throw new Exception("Could not commit transaction. See InnerException for more details", ex);
}
finally
{
if (m_trans != null) m_trans.Dispose();
m_trans = null;
m_counter = 0;
}
}
#endregion
#region Insert
public void Insert(object[] paramValues)
{
if (paramValues.Length != m_parameters.Count)
throw new Exception("The values array count must be equal to the count of the number of parameters.");
m_counter++;
if (m_counter == 1)
{
if (m_allowBulkInsert) m_trans = m_dbCon.BeginTransaction();
m_cmd = m_dbCon.CreateCommand();
foreach (var par in m_parameters.Values)
m_cmd.Parameters.Add(par);
m_cmd.CommandText = CommandText;
}
var i = 0;
foreach (var par in m_parameters.Values)
{
par.Value = paramValues[i];
i++;
}
m_cmd.ExecuteNonQuery();
if (m_counter == m_commitMax)
{
// Batch is full: commit it. Flush() also disposes the transaction and resets the counter,
// and it surfaces commit failures instead of silently swallowing them.
Flush();
}
}
#endregion
}

Pattern for concurrent cache sharing

OK, I was a little unsure how best to name this problem :) But assume this scenario: you're going out and fetching some webpages (with various URLs) and caching them locally. The cache part is pretty easy to solve, even with multiple threads.
However, imagine that one thread starts fetching a URL, and a couple of milliseconds later another wants to get the same URL. Is there any good pattern for making the second thread's method wait for the first one to fetch the page, insert it into the cache and return it, so you don't have to do multiple requests? With little enough overhead that it's worth doing even for requests that take about 300-700 ms? And without locking requests for other URLs?
Basically, when requests for identical URLs come in shortly after each other, I want the second request to "piggyback" on the first request.
I had some loose idea of having a dictionary where you insert an object, with the URL as the key, when you start fetching a page, and lock on it. If there's already an entry matching the key, the thread gets that object, locks on it and then tries to fetch the URL from the actual cache.
I'm a little unsure of the particulars however to make it really thread-safe, using ConcurrentDictionary might be one part of it...
Is there any common pattern and solutions for scenarios like this?
Breakdown wrong behavior:
Thread 1: Checks the cache, it doesn't exist so starts fetching the url
Thread 2: Starts fetching the same url since it still doesn't exist in the cache
Thread 1: Finishes and inserts into the cache, returns the page
Thread 2: Finishes and also inserts into the cache (or discards it), returns the page
Breakdown correct behavior:
Thread 1: Checks the cache, it doesn't exist so starts fetching the url
Thread 2: Wants the same url, but sees it's currently being fetched so waits on thread 1
Thread 1: Finishes and inserts into the cache, returns the page
Thread 2: Notices that thread 1 is finished and returns the page that thread 1 fetched
EDIT
Most solutions so far seem to misunderstand the problem and only address the caching. As I said, that isn't the problem. The problem is with the external web fetch: when a second fetch for the same URL starts before the first one has been cached, it should use the result of the first fetch rather than doing a second one.
You could use a ConcurrentDictionary<K,V> and a variant of double-checked locking:
public static string GetUrlContent(string url)
{
object value1 = _cache.GetOrAdd(url, new object());
if (value1 == null) // null check only required if content
return null; // could legitimately be a null string
var urlContent = value1 as string;
if (urlContent != null)
return urlContent; // got the content
// value1 isn't a string which means that it's an object to lock against
lock (value1)
{
object value2 = _cache[url];
// at this point value2 will *either* be the url content
// *or* the object that we already hold a lock against
if (value2 != value1)
return (string)value2; // got the content
urlContent = FetchContentFromTheWeb(url); // todo
_cache[url] = urlContent;
return urlContent;
}
}
private static readonly ConcurrentDictionary<string, object> _cache =
new ConcurrentDictionary<string, object>();
EDIT: My code is quite a bit uglier now, but uses a separate lock per URL. This allows different URLs to be fetched in parallel, while each URL is only fetched once.
public class UrlFetcher
{
static Hashtable cache = Hashtable.Synchronized(new Hashtable());
public static String GetCachedUrl(String url)
{
// exactly 1 fetcher is created per URL
InternalFetcher fetcher = (InternalFetcher)cache[url];
if( fetcher == null )
{
lock( cache.SyncRoot )
{
fetcher = (InternalFetcher)cache[url];
if( fetcher == null )
{
fetcher = new InternalFetcher(url);
cache[url] = fetcher;
}
}
}
// blocks all threads requesting the same URL
return fetcher.Contents;
}
/// <summary>Each fetcher locks on itself and is initialized with null contents.
/// The first thread to call fetcher.Contents will cause the fetch to occur, and
/// block until completion.</summary>
private class InternalFetcher
{
private String url;
private String contents;
public InternalFetcher(String url)
{
this.url = url;
this.contents = null;
}
public String Contents
{
get
{
if( contents == null )
{
lock( this ) // "this" is an instance of InternalFetcher...
{
if( contents == null )
{
contents = FetchFromWeb(url);
}
}
}
return contents;
}
}
}
}
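The same "one fetcher object per URL" idea can also be sketched with Lazy<T> stored in a ConcurrentDictionary, which removes the hand-written double-checked locking. This is an alternative formulation, not the code from the answer above; the class name CoalescingCache and the fetch delegate are assumptions for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public sealed class CoalescingCache
{
    private readonly ConcurrentDictionary<string, Lazy<string>> _entries =
        new ConcurrentDictionary<string, Lazy<string>>();
    private readonly Func<string, string> _fetch; // hypothetical: does the real web request

    public CoalescingCache(Func<string, string> fetch)
    {
        _fetch = fetch;
    }

    public string Get(string url)
    {
        // GetOrAdd may construct more than one Lazy under contention, but only one
        // is stored, and only the stored one ever runs its factory.
        Lazy<string> entry = _entries.GetOrAdd(url,
            u => new Lazy<string>(() => _fetch(u), LazyThreadSafetyMode.ExecutionAndPublication));
        try
        {
            return entry.Value; // the first caller fetches, later callers block here
        }
        catch
        {
            // Lazy caches the exception; remove this exact entry so a later call can retry.
            ((ICollection<KeyValuePair<string, Lazy<string>>>)_entries)
                .Remove(new KeyValuePair<string, Lazy<string>>(url, entry));
            throw;
        }
    }
}
```

Requests for different URLs never wait on each other, and a second request for a URL that is already in flight simply blocks on the first request's Lazy until the result is available.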
Will the Semaphore please stand up! stand up! stand up!
Use a Semaphore; you can easily synchronize your threads with it. There are two cases:
you are trying to load a page that is currently being cached
you are saving the cache to a file while a page is being loaded from it.
In both scenarios you will run into trouble.
It is just like the readers-writers problem, which is a common race-condition problem in operating systems. When a thread wants to rebuild the cache or start caching a page, no thread should read from it. If a thread is reading it, the writer should wait until the reading has finished and then replace the cache. No two threads should cache the same page into the same file. All readers can therefore read from the cache at any time, as long as no writer is writing to it.
You should read some of the semaphore samples on MSDN; it is very easy to use. The thread that wants to do something calls the semaphore; if the resource can be granted it does the work, otherwise it sleeps and waits to be woken up when the resource is ready.
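The readers-writers arrangement described above can be sketched as follows. Note that this swaps in ReaderWriterLockSlim rather than a raw Semaphore, because it implements exactly the "many readers or one writer" rule; the class name GuardedPage and its members are invented for the example.

```csharp
using System;
using System.Threading;

// One cached page guarded by a reader/writer lock.
public sealed class GuardedPage : IDisposable
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private string _content;

    // Any number of threads may read at the same time.
    public string Read()
    {
        _lock.EnterReadLock();
        try { return _content; }
        finally { _lock.ExitReadLock(); }
    }

    // A writer waits for current readers to leave and blocks new ones while it rebuilds.
    public void Replace(Func<string> rebuild)
    {
        _lock.EnterWriteLock();
        try { _content = rebuild(); }
        finally { _lock.ExitWriteLock(); }
    }

    public void Dispose()
    {
        _lock.Dispose();
    }
}
```

This protects a single cache entry from being read while it is rewritten. On its own it does not stop two threads from both deciding to rebuild the same page, so it would still need a per-URL check like the ones in the other answers.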
Disclaimer: This might be a n00bish answer. Please pardon me, if it is.
I'd recommend using some shared dictionary object with locks to keep track of the URLs that are currently being fetched or have already been fetched.
At every request, check the url against this object.
If an entry for the url is present, check the cache. (This means one of the threads has either fetched it or is currently fetching it.)
If it's available in the cache, use it; else put the current thread to sleep for a while and check back again. (If it's not in the cache, some thread is still fetching it, so wait until it's done.)
If the entry is not found in the dictionary object, add the url to it and send the request. Once it obtains a response, add it to cache.
This logic should work, however, you would need to take care of cache expiration and removal of the entry from the dictionary object.
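The steps above could look roughly like this. It is only a sketch under a few assumptions: the class name PollingFetchCache, the fetch delegate and the 10 ms polling interval are all made up for the example, and cache expiration is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class PollingFetchCache
{
    private readonly object _gate = new object();
    private readonly HashSet<string> _inFlight = new HashSet<string>();
    private readonly Dictionary<string, string> _cache = new Dictionary<string, string>();
    private readonly Func<string, string> _fetch; // hypothetical: does the real web request

    public PollingFetchCache(Func<string, string> fetch)
    {
        _fetch = fetch;
    }

    public string Get(string url)
    {
        while (true)
        {
            lock (_gate)
            {
                string hit;
                if (_cache.TryGetValue(url, out hit)) return hit;
                if (_inFlight.Add(url)) break; // nobody is fetching it yet: this thread will
            }
            Thread.Sleep(10); // another thread is fetching; check back shortly
        }

        try
        {
            string content = _fetch(url);
            lock (_gate) { _cache[url] = content; }
            return content;
        }
        finally
        {
            // Always clear the marker, so a failed fetch lets a waiting thread take over.
            lock (_gate) { _inFlight.Remove(url); }
        }
    }
}
```

Polling keeps the code simple at the cost of up to one sleep interval of extra latency per waiting request.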
My solution is to use an AtomicBoolean to control access to the database when the cache entry has timed out or does not exist.
At any given moment only one thread (I call it the read-th) can access the database; the other threads spin until the read-th returns the data and writes it into the cache.
Here is the code, implemented in Java:
public class CacheBreakDownDefender<K, R> {
/**
* false = do not write null to cache when get null value from database;
*/
private final boolean writeNullToCache;
/**
* cache different query key
*/
private final ConcurrentHashMap<K, AtomicBoolean> selectingDBTagMap = new ConcurrentHashMap<>();
public static <K, R> CacheBreakDownDefender<K, R> getInstance(Class<K> keyType, Class<R> resultType) {
return Singleton.get(keyType.getName() + resultType.getName(), () -> new CacheBreakDownDefender<>(false));
}
public static <K, R> CacheBreakDownDefender<K, R> getInstance(Class<K> keyType, Class<R> resultType, boolean writeNullToCache) {
return Singleton.get(keyType.getName() + resultType.getName(), () -> new CacheBreakDownDefender<>(writeNullToCache));
}
private CacheBreakDownDefender(boolean writeNullToCache) {
this.writeNullToCache = writeNullToCache;
}
public R readFromCache(K key, Function<K, ? extends R> getFromCache, Function<K, ? extends R> getFromDB, BiConsumer<K, R> writeCache) throws InterruptedException {
R result = getFromCache.apply(key);
if (result == null) {
final AtomicBoolean selectingDB = selectingDBTagMap.computeIfAbsent(key, x -> new AtomicBoolean(false));
if (selectingDB.compareAndSet(false, true)) {
try {
result = getFromDB.apply(key);
if (result != null || writeNullToCache) {
writeCache.accept(key, result);
}
} finally {
selectingDB.getAndSet(false);
selectingDBTagMap.remove(key);
}
} else {
while (selectingDB.get()) {
TimeUnit.MILLISECONDS.sleep(0L);
//do nothing...
}
return getFromCache.apply(key);
}
}
return result;
}
public static void main(String[] args) throws InterruptedException {
Map<String, String> map = new ConcurrentHashMap<>();
CacheBreakDownDefender<String, String> instance = CacheBreakDownDefender.getInstance(String.class, String.class, true);
for (int i = 0; i < 9; i++) {
int finalI = i;
new Thread(() -> {
String kele = null;
try {
if (finalI == 6) {
kele = instance.readFromCache("kele2", map::get, key -> "helloword2", map::put);
} else
kele = instance.readFromCache("kele", map::get, key -> "helloword", map::put);
} catch (InterruptedException e) {
Thread.currentThread().interrupt();
}
log.info("result= {}", kele); // 'log' is assumed to be an SLF4J-style logger field; it is not declared in this snippet
}).start();
}
TimeUnit.SECONDS.sleep(2L);
}
}
This is not exactly for concurrent caches but for all caches:
"A cache with a bad policy is another name for a memory leak" (Raymond Chen)

Caching in C#/.Net

I wanted to ask you what is the best approach to implement a cache in C#? Is there a possibility by using given .NET classes or something like that? Perhaps something like a dictionary that will remove some entries if it gets too large, but whose entries won't be removed by the garbage collector?
If you are using .NET 4 or later, you can use the MemoryCache class.
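A minimal sketch of what that can look like, assuming a reference to System.Runtime.Caching (part of the .NET Framework since 4.0, a separate package on newer runtimes). The helper name GetOrLoad, the load delegate and the ten-minute expiration are choices made for this example only.

```csharp
using System;
using System.Runtime.Caching;

public static class MemoryCacheExample
{
    // Returns the cached string for key, or loads and caches it for ten minutes.
    public static string GetOrLoad(string key, Func<string> load)
    {
        ObjectCache cache = MemoryCache.Default;

        var cached = cache.Get(key) as string;
        if (cached != null) return cached;

        string value = load();
        var policy = new CacheItemPolicy
        {
            AbsoluteExpiration = DateTimeOffset.UtcNow.AddMinutes(10)
        };
        cache.Set(key, value, policy);
        return value;
    }
}
```

Entries are held by strong references, so the garbage collector will not take them away on its own; they leave the cache when they expire or when the cache trims itself under memory pressure. Two threads that miss at the same time may both call load here.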
If you're using ASP.NET, you could use the Cache class (System.Web.Caching).
Here is a good helper class: c-cache-helper-class
If you mean caching in a windows form app, it depends on what you're trying to do, and where you're trying to cache the data.
We've implemented a cache behind a Webservice for certain methods (using the System.Web.Caching object).
However, you might also want to look at the Caching Application Block. (See here) that is part of the Enterprise Library for .NET Framework 2.0.
MemoryCache in the framework is a good place to start, but you might also like to consider the open source library LazyCache because it has a simpler API than MemoryCache and has built-in locking as well as some other developer-friendly features. It is also available on NuGet.
To give you an example:
// Create our cache service using the defaults (Dependency injection ready).
// Uses MemoryCache.Default as default so cache is shared between instances
IAppCache cache = new CachingService();
// Declare (but don't execute) a func/delegate whose result we want to cache
Func<ComplexObjects> complexObjectFactory = () => methodThatTakesTimeOrResources();
// Get our ComplexObjects from the cache, or build them in the factory func
// and cache the results for next time under the given key
ComplexObjects cachedResults = cache.GetOrAdd("uniqueKey", complexObjectFactory);
I recently wrote this article about getting started with caching in dot net that you may find useful.
(Disclaimer: I am the author of LazyCache)
The cache classes supplied with .NET are handy, but have a major problem: they cannot hold a very large number of objects (tens of millions or more) for a long time without killing your GC. They work great if you cache a few thousand objects, but the moment you move into millions and keep them around until they are promoted to Gen 2, the GC pauses eventually start to be noticeable when your system reaches the low-memory threshold and the GC needs to sweep all generations.
The practicality is this: if you need to store a few hundred thousand instances, use the MS cache. It does not matter if your objects have 2 fields or 25 fields; it's about the number of references.
On the other hand there are cases when large RAMs, which are common these days, need to be utilized, i.e. 64 GB.
For that we have created a 100% managed memory manager and cache that sits on top of it.
Our solution can easily store 300,000,000 objects in-memory in-process without taxing the GC at all. This is because we store the data in large (250 MB) byte[] segments.
Here is the code: NFX Pile (Apache 2.0)
And video:
NFX Pile Cache - Youtube
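To illustrate why large byte[] segments are cheap for the GC, here is a toy sketch of the general idea. It is not how NFX Pile is implemented (its source is behind the link above); the ByteSlab and Handle names are invented, and it only handles strings in a single append-only segment.

```csharp
using System.Text;

// Stores many strings inside one byte[]; the GC sees a single object, not millions.
public sealed class ByteSlab
{
    // A value-type "pointer" into the segment: no object reference for the GC to trace.
    public struct Handle
    {
        public readonly int Offset;
        public readonly int Length;

        public Handle(int offset, int length)
        {
            Offset = offset;
            Length = length;
        }
    }

    private readonly byte[] _segment;
    private int _used;

    public ByteSlab(int sizeBytes)
    {
        _segment = new byte[sizeBytes];
    }

    // Serializes the string into the segment; returns false when the segment is full.
    public bool TryPut(string value, out Handle handle)
    {
        int length = Encoding.UTF8.GetByteCount(value);
        if (length > _segment.Length - _used)
        {
            handle = default(Handle);
            return false;
        }
        Encoding.UTF8.GetBytes(value, 0, value.Length, _segment, _used);
        handle = new Handle(_used, length);
        _used += length;
        return true;
    }

    // Deserializes on every read: CPU time is traded for fewer live references.
    public string Get(Handle handle)
    {
        return Encoding.UTF8.GetString(_segment, handle.Offset, handle.Length);
    }
}
```

A real implementation also has to deal with freeing and reusing space, multiple segments, arbitrary object serialization and thread safety, which is where most of the work lies.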
You can use the ObjectCache.
See http://msdn.microsoft.com/en-us/library/system.runtime.caching.objectcache.aspx
For Local Stores
.NET MemoryCache
NCache Express
AppFabric Caching
...
As mentioned in other answers, the default choice using the .NET Framework is MemoryCache and the various related implementations in Microsoft NuGet packages (e.g. Microsoft.Extensions.Caching.MemoryCache). All of these caches bound size in terms of memory used, and attempt to estimate memory used by tracking how total physical memory is increasing relative to the number of cached objects. A background thread then periodically 'trims' entries.
MemoryCache etc. share some limitations:
Keys are strings, so if the key type is not natively string, you will be forced to constantly allocate strings on the heap. This can really add up in a server application when items are 'hot'.
Has poor 'scan resistance' - e.g. if some automated process is rapidly looping through all the items that exist, the cache size can grow too fast for the background thread to keep up. This can result in memory pressure, page faults, induced GC or when running under IIS, recycling the process due to exceeding the private bytes limit.
Does not scale well with concurrent writes.
Contains perf counters that cannot be disabled (which incur overhead).
Your workload will determine the degree to which these things are problematic. An alternative approach to caching is to bound the number of objects in the cache (rather than estimating memory used). A cache replacement policy then determines which object to discard when the cache is full.
Below is the source code for a simple cache with least recently used eviction policy:
public sealed class ClassicLru<K, V>
{
private readonly int capacity;
private readonly ConcurrentDictionary<K, LinkedListNode<LruItem>> dictionary;
private readonly LinkedList<LruItem> linkedList = new LinkedList<LruItem>();
private long requestHitCount;
private long requestTotalCount;
public ClassicLru(int capacity)
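: this(Environment.ProcessorCount, capacity, EqualityComparer<K>.Default) // the original used Defaults.ConcurrencyLevel, a helper not shown here; Environment.ProcessorCount is a stand-in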
{
}
public ClassicLru(int concurrencyLevel, int capacity, IEqualityComparer<K> comparer)
{
if (capacity < 3)
{
throw new ArgumentOutOfRangeException(nameof(capacity), "Capacity must be greater than or equal to 3.");
}
if (comparer == null)
{
throw new ArgumentNullException(nameof(comparer));
}
this.capacity = capacity;
this.dictionary = new ConcurrentDictionary<K, LinkedListNode<LruItem>>(concurrencyLevel, this.capacity + 1, comparer);
}
public int Count => this.linkedList.Count;
public double HitRatio => (double)requestHitCount / (double)requestTotalCount;
///<inheritdoc/>
public bool TryGet(K key, out V value)
{
Interlocked.Increment(ref requestTotalCount);
LinkedListNode<LruItem> node;
if (dictionary.TryGetValue(key, out node))
{
LockAndMoveToEnd(node);
Interlocked.Increment(ref requestHitCount);
value = node.Value.Value;
return true;
}
value = default(V);
return false;
}
public V GetOrAdd(K key, Func<K, V> valueFactory)
{
if (this.TryGet(key, out var value))
{
return value;
}
var node = new LinkedListNode<LruItem>(new LruItem(key, valueFactory(key)));
if (this.dictionary.TryAdd(key, node))
{
LinkedListNode<LruItem> first = null;
lock (this.linkedList)
{
if (linkedList.Count >= capacity)
{
first = linkedList.First;
linkedList.RemoveFirst();
}
linkedList.AddLast(node);
}
// Remove from the dictionary outside the lock. This means that the dictionary at this moment
// contains an item that is not in the linked list. If another thread fetches this item,
// LockAndMoveToEnd will ignore it, since it is detached. This means we potentially 'lose' an
// item just as it was about to move to the back of the LRU list and be preserved. The next request
// for the same key will be a miss. Dictionary and list are eventually consistent.
// However, all operations inside the lock are extremely fast, so contention is minimized.
if (first != null)
{
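// TryRemove can fail if another thread already removed the key; only dispose what we actually removed.
if (dictionary.TryRemove(first.Value.Key, out var removed) && removed.Value.Value is IDisposable d)
{
d.Dispose();
}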
}
return node.Value.Value;
}
return this.GetOrAdd(key, valueFactory);
}
public bool TryRemove(K key)
{
if (dictionary.TryRemove(key, out var node))
{
// If the node has already been removed from the list, ignore.
// E.g. thread A reads x from the dictionary. Thread B adds a new item, removes x from
// the List & Dictionary. Now thread A will try to move x to the end of the list.
if (node.List != null)
{
lock (this.linkedList)
{
if (node.List != null)
{
linkedList.Remove(node);
}
}
}
if (node.Value.Value is IDisposable d)
{
d.Dispose();
}
return true;
}
return false;
}
// Thread A reads x from the dictionary. Thread B adds a new item. Thread A moves x to the end. Thread B now removes the new first Node (removal is atomic on both data structures).
private void LockAndMoveToEnd(LinkedListNode<LruItem> node)
{
// If the node has already been removed from the list, ignore.
// E.g. thread A reads x from the dictionary. Thread B adds a new item, removes x from
// the List & Dictionary. Now thread A will try to move x to the end of the list.
if (node.List == null)
{
return;
}
lock (this.linkedList)
{
if (node.List == null)
{
return;
}
linkedList.Remove(node);
linkedList.AddLast(node);
}
}
private class LruItem
{
public LruItem(K k, V v)
{
Key = k;
Value = v;
}
public K Key { get; }
public V Value { get; }
}
}
This is just to illustrate a thread safe cache - it probably has bugs and can be a bottleneck under heavy concurrent workloads (e.g. in a web server).
A thoroughly tested, production ready, scalable concurrent implementation is a bit beyond a Stack Overflow post. To solve this in my projects, I implemented a thread safe pseudo LRU (think concurrent dictionary, but with constrained size). Performance is very close to a raw ConcurrentDictionary, ~10x faster than MemoryCache, ~10x better concurrent throughput than ClassicLru above, and better hit rate. A detailed performance analysis is provided in the GitHub link below.
Usage looks like this:
int capacity = 666;
var lru = new ConcurrentLru<int, SomeItem>(capacity);
var value = lru.GetOrAdd(1, (k) => new SomeItem(k));
GitHub: https://github.com/bitfaster/BitFaster.Caching
Install-Package BitFaster.Caching
Your question needs more clarification. C# is a language, not a framework. You have to specify the framework in which you want to implement the caching. If we assume that you want to implement it in ASP.NET, it still depends completely on what you want from the cache. You can decide between an in-process cache (which keeps the data inside the heap of your application) and an out-of-process cache (in which case you store the data in memory other than your heap, for example on an Amazon ElastiCache server). There is also another decision to make, between client-side caching and server-side caching. Usually a solution needs different caching approaches for different data, because you have to decide which approach you need based on four factors (accessibility, persistency, size, cost).
I wrote this some time ago and it seems to work well. It allows you to differentiate different cache stores by using different Types: ApplicationCaching<MyCacheType1>, ApplicationCaching<MyCacheType2>....
You can decide to allow some stores to persist after execution and others to expire.
You will need a reference to the Newtonsoft.Json serializer (or use an alternative one) and of course all objects or values types to be cached must be serializable.
Use MaxItemCount to set a limit to the number of items in any one store.
A separate Zipper class (see code below) uses System.IO.Compression. This minimises the size of the store and helps speed up loading times.
public static class ApplicationCaching<K>
{
//====================================================================================================================
public static event EventHandler InitialAccess = (s, e) => { };
//=============================================================================================
static Dictionary<string, byte[]> _StoredValues;
static Dictionary<string, DateTime> _ExpirationTimes = new Dictionary<string, DateTime>();
//=============================================================================================
public static int MaxItemCount { get; set; } = 0;
private static void OnInitialAccess()
{
//-----------------------------------------------------------------------------------------
_StoredValues = new Dictionary<string, byte[]>();
//-----------------------------------------------------------------------------------------
InitialAccess?.Invoke(null, EventArgs.Empty);
//-----------------------------------------------------------------------------------------
}
public static void AddToCache<T>(string key, T value, DateTime expirationTime)
{
try
{
//-----------------------------------------------------------------------------------------
if (_StoredValues is null) OnInitialAccess();
//-----------------------------------------------------------------------------------------
string strValue = JsonConvert.SerializeObject(value);
byte[] zippedValue = Zipper.Zip(strValue);
//-----------------------------------------------------------------------------------------
_StoredValues.Remove(key);
_StoredValues.Add(key, zippedValue);
//-----------------------------------------------------------------------------------------
_ExpirationTimes.Remove(key);
_ExpirationTimes.Add(key, expirationTime);
//-----------------------------------------------------------------------------------------
}
catch (Exception)
{
throw; // rethrow while preserving the original stack trace ("throw ex" would reset it)
}
}
//=============================================================================================
public static T GetFromCache<T>(string key, T defaultValue = default)
{
try
{
//-----------------------------------------------------------------------------------------
if (_StoredValues is null) OnInitialAccess();
//-----------------------------------------------------------------------------------------
if (_StoredValues.ContainsKey(key))
{
//------------------------------------------------------------------------------------------
if (_ExpirationTimes[key] <= DateTime.Now)
{
//------------------------------------------------------------------------------------------
_StoredValues.Remove(key);
_ExpirationTimes.Remove(key);
//------------------------------------------------------------------------------------------
return defaultValue;
//------------------------------------------------------------------------------------------
}
//------------------------------------------------------------------------------------------
byte[] zippedValue = _StoredValues[key];
//------------------------------------------------------------------------------------------
string strValue = Zipper.Unzip(zippedValue);
T value = JsonConvert.DeserializeObject<T>(strValue);
//------------------------------------------------------------------------------------------
return value;
//------------------------------------------------------------------------------------------
}
else
{
return defaultValue;
}
//---------------------------------------------------------------------------------------------
}
catch (Exception)
{
throw; // rethrow while preserving the original stack trace ("throw ex" would reset it)
}
}
//=============================================================================================
public static string ConvertCacheToString()
{
//-----------------------------------------------------------------------------------------
if (_StoredValues is null || _ExpirationTimes is null) return "";
//-----------------------------------------------------------------------------------------
List<string> storage = new List<string>();
//-----------------------------------------------------------------------------------------
string strStoredObject = JsonConvert.SerializeObject(_StoredValues);
string strExpirationTimes = JsonConvert.SerializeObject(_ExpirationTimes);
//-----------------------------------------------------------------------------------------
storage.AddRange(new string[] { strStoredObject, strExpirationTimes});
//-----------------------------------------------------------------------------------------
string strStorage = JsonConvert.SerializeObject(storage);
//-----------------------------------------------------------------------------------------
return strStorage;
//-----------------------------------------------------------------------------------------
}
//=============================================================================================
public static void InializeCacheFromString(string strCache)
{
try
{
//-----------------------------------------------------------------------------------------
List<string> storage = JsonConvert.DeserializeObject<List<string>>(strCache);
//-----------------------------------------------------------------------------------------
if (storage != null && storage.Count == 2)
{
//-----------------------------------------------------------------------------------------
_StoredValues = JsonConvert.DeserializeObject<Dictionary<string, byte[]>>(storage.First());
_ExpirationTimes = JsonConvert.DeserializeObject<Dictionary<string, DateTime>>(storage.Last());
//-----------------------------------------------------------------------------------------
if (_ExpirationTimes != null && _StoredValues != null)
{
//-----------------------------------------------------------------------------------------
// Snapshot the expired keys first: removing entries while indexing into the dictionary skips items.
List<string> expiredKeys = _ExpirationTimes.Where(o => o.Value < DateTime.Now).Select(o => o.Key).ToList();
//-----------------------------------------------------------------------------------------
foreach (string key in expiredKeys)
{
ClearItem(key);
}
//-----------------------------------------------------------------------------------------
if (MaxItemCount > 0 && _StoredValues.Count > MaxItemCount)
{
// Materialize the keys with ToList() so that removals do not re-run the query in the middle of the loop.
List<string> countedOutKeys = _ExpirationTimes.OrderByDescending(o => o.Value).Skip(MaxItemCount).Select(o => o.Key).ToList();
foreach (string key in countedOutKeys)
{
ClearItem(key);
}
}
//-----------------------------------------------------------------------------------------
return;
//-----------------------------------------------------------------------------------------
}
//-----------------------------------------------------------------------------------------
}
//-----------------------------------------------------------------------------------------
_StoredValues = new Dictionary<string, byte[]>();
_ExpirationTimes = new Dictionary<string, DateTime>();
//-----------------------------------------------------------------------------------------
}
catch (Exception)
{
throw;
}
}
//=============================================================================================
public static void ClearItem(string key)
{
//-----------------------------------------------------------------------------------------
if (_StoredValues.ContainsKey(key))
{
_StoredValues.Remove(key);
}
//-----------------------------------------------------------------------------------------
if (_ExpirationTimes.ContainsKey(key))
_ExpirationTimes.Remove(key);
//-----------------------------------------------------------------------------------------
}
//=============================================================================================
}
You can easily start using the cache on the fly with something like...
//------------------------------------------------------------------------------------------------------------------------------
string key = "MyUniqueKeyForThisItem";
//------------------------------------------------------------------------------------------------------------------------------
MyType obj = ApplicationCaching<MyCacheType>.GetFromCache<MyType>(key);
//------------------------------------------------------------------------------------------------------------------------------
if (obj == default)
{
obj = new MyType(...);
ApplicationCaching<MyCacheType>.AddToCache(key, obj, DateTime.Now.AddHours(1));
}
Note that the types actually stored in the cache can be the same as, or different from, the cache type. The cache type is used ONLY to distinguish one cache store from another.
You can then let the cache persist after the application exits by saving it to the default settings (Properties.Settings.Default):
string bulkCache = ApplicationCaching<MyCacheType>.ConvertCacheToString();
//--------------------------------------------------------------------------------------------------------
if (bulkCache != "")
{
Properties.Settings.Default.*MyType*DataCachingStore = bulkCache;
}
//--------------------------------------------------------------------------------------------------------
try
{
Properties.Settings.Default.Save();
}
catch (IsolatedStorageException)
{
//handle Isolated Storage exceptions here
}
Handle the InitialAccess event to reinitialize the cache when the app restarts:
private static void ApplicationCaching_InitialAccess(object sender, EventArgs e)
{
//-----------------------------------------------------------------------------------------
string storedCache = Properties.Settings.Default.*MyType*DataCachingStore;
ApplicationCaching<MyCacheType>.InializeCacheFromString(storedCache);
//-----------------------------------------------------------------------------------------
}
Finally, here is the Zipper class:
// Needs: using System.IO; using System.IO.Compression; using System.Text;
public class Zipper
{
public static void CopyTo(Stream src, Stream dest)
{
byte[] bytes = new byte[4096];
int cnt;
while ((cnt = src.Read(bytes, 0, bytes.Length)) != 0)
{
dest.Write(bytes, 0, cnt);
}
}
public static byte[] Zip(string str)
{
var bytes = Encoding.UTF8.GetBytes(str);
using (var msi = new MemoryStream(bytes))
using (var mso = new MemoryStream())
{
using (var gs = new GZipStream(mso, CompressionMode.Compress))
{
CopyTo(msi, gs);
}
return mso.ToArray();
}
}
public static string Unzip(byte[] bytes)
{
using (var msi = new MemoryStream(bytes))
using (var mso = new MemoryStream())
{
using (var gs = new GZipStream(msi, CompressionMode.Decompress))
{
CopyTo(gs, mso);
}
return Encoding.UTF8.GetString(mso.ToArray());
}
}
}
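A quick way to sanity-check compression helpers like these is a round trip: zip a string, unzip it, and compare. The sketch below is self-contained, so it uses a hypothetical stand-in class named GzipText rather than the Zipper class itself. It assumes .NET Framework 4 or later, where the built-in Stream.CopyTo can replace the hand-written copy loop.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical stand-in for the Zipper class above, using built-in Stream.CopyTo.
public static class GzipText
{
    public static byte[] Zip(string str)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(str);
        using (var output = new MemoryStream())
        {
            // The GZipStream must be disposed before reading the result,
            // otherwise the final compressed block is not flushed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(bytes, 0, bytes.Length);
            }
            return output.ToArray();
        }
    }

    public static string Unzip(byte[] bytes)
    {
        using (var input = new MemoryStream(bytes))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}

public static class Program
{
    public static void Main()
    {
        string json = "{\"name\":\"example\",\"values\":[1,2,3]}";
        byte[] zipped = GzipText.Zip(json);
        string restored = GzipText.Unzip(zipped);
        // True when the round trip returns the original text unchanged.
        Console.WriteLine(restored == json);
    }
}
```

For very short strings the gzip header can make the compressed form larger than the input, so compression mainly pays off for larger serialized objects.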
If you want to cache something in ASP.NET, I would look at the Cache class. For example:
Hashtable menuTable = new Hashtable();
menuTable.Add("Home","default.aspx");
Cache["menu"] = menuTable;
Then, to retrieve it again:
Hashtable menuTable = (Hashtable)Cache["menu"];
Memory cache implementation for .NET Core:
public class CachePocRepository : ICachedEmployeeRepository
{
private readonly IEmployeeRepository _employeeRepository;
private readonly IMemoryCache _memoryCache;
public CachePocRepository(
IEmployeeRepository employeeRepository,
IMemoryCache memoryCache)
{
_employeeRepository = employeeRepository;
_memoryCache = memoryCache;
}
public async Task<Employee> GetEmployeeDetailsId(string employeeId)
{
_memoryCache.TryGetValue(employeeId, out Employee employee);
if (employee != null)
{
return employee;
}
employee = await _employeeRepository.GetEmployeeDetailsId(employeeId);
_memoryCache.Set(employeeId,
employee,
new MemoryCacheEntryOptions()
{
AbsoluteExpiration = DateTimeOffset.UtcNow.AddDays(7),
});
return employee;
}
}
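Since the question asks for a limit on the cached data size, it may help to know that the same library has a size limit option. The sketch below assumes the Microsoft.Extensions.Caching.Memory NuGet package. The class name SizeLimitedCacheDemo and the helper Store are hypothetical names for illustration. The cache does not measure entries itself: once SizeLimit is set, every entry has to declare a Size, and the unit is whatever the application chooses (bytes here).

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;

public static class SizeLimitedCacheDemo
{
    // Hypothetical helper: caches a byte[] and charges its length against the limit.
    public static void Store(IMemoryCache cache, string key, byte[] data)
    {
        var options = new MemoryCacheEntryOptions
        {
            // Required when SizeLimit is set; here one unit equals one byte.
            Size = data.Length,
            AbsoluteExpirationRelativeToNow = TimeSpan.FromHours(1)
        };
        cache.Set(key, data, options);
    }

    public static void Main()
    {
        // Limit the total declared size of all entries to 1024 units.
        using (var cache = new MemoryCache(new MemoryCacheOptions { SizeLimit = 1024 }))
        {
            Store(cache, "small", new byte[400]);
            bool smallCached = cache.TryGetValue("small", out byte[] small);
            Console.WriteLine("small cached: " + smallCached);

            // This entry alone is larger than the limit, so the cache is
            // expected to decline it rather than grow past SizeLimit.
            Store(cache, "large", new byte[2000]);
            bool largeCached = cache.TryGetValue("large", out byte[] large);
            Console.WriteLine("large cached: " + largeCached);
        }
    }
}
```

The size accounting is done inside the cache, so the calling code needs no lock or counter of its own. The trade-off is that an entry that does not fit is simply not cached, so callers should always be prepared to reload the data.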
You could use a Hashtable.
It has very fast lookups, each key maps to exactly one value (Add throws on a duplicate key), and your data will not be garbage collected while the table still references it.
