Thread-safe fixed-size circular buffer with sequence ids - C#

I need a queue with these capabilities:
fixed-size (i.e. circular buffer)
queue items have ids (like a primary key), which are sequential
thread-safe (used from multiple ASP.NET Core requests)
To avoid locking, I tried a ConcurrentQueue but found race conditions. So I'm trying a custom approach.
public interface IQueueItem
{
long Id { get; set; }
}
public class CircularBuffer<T> : LinkedList<T> where T : class, IQueueItem
{
public CircularBuffer(int capacity) => _capacity = capacity;
private readonly int _capacity;
private long _counter = 0;
private readonly object _lock = new();
public void Enqueue(T item)
{
lock (_lock) { // works but feels "heavy"
_counter++;
item.Id = _counter;
if (Count == _capacity) RemoveFirst();
AddLast(item);
}
}
}
And to test:
public class Item : IQueueItem
{
public long Id { get; set; }
//...
}
public class Program
{
public static void Main()
{
var q = new CircularBuffer<Item>(10);
Parallel.For(0, 15, i => q.Enqueue(new Item()));
Console.WriteLine(string.Join(", ", q.Select(x => x.Id)));
}
}
Which gives correct output (ordered even though items were enqueued by competing threads, and fixed-size with the oldest items dequeued):
6, 7, 8, 9, 10, 11, 12, 13, 14, 15
In reality, I have web requests that read (i.e. enumerate) that queue.
The problem: if one thread is enumerating the queue while another thread is adding to it, I will get errors. (I could use a ToList() before the read, but for a large queue that would suck up all the server's memory, as this could be done many times a second by multiple requests.) How can I deal with that scenario? I used a linked list, but I'm flexible about using any structure.
(Also, that seems to be a really heavy lock section; is there a more performant way?)
UPDATE
As asked in comments below: I expect the queue to have from a few hundred to a few tens of thousands of items, but the items themselves are small (just a few primitive data types). I expect an enqueue every second. Reads from web requests are less frequent, let's say a few times per minute (but they can occur concurrently with the server writing to the queue).

Based on the metrics that you provided in the question, you have plenty of options. The anticipated usage of the CircularBuffer<T> is not really that heavy. Wrapping a lock-protected Queue<T> should work pretty well. The cost of copying the contents of the queue into an array on each enumeration (copying 10,000 elements a few times per second) is unlikely to be noticeable. Modern machines can do such things in the blink of an eye. You'd have to enumerate the collection thousands of times per second for this to start (slightly) becoming an issue.
For the sake of variety I'll propose a different structure as internal storage: the ImmutableQueue<T> class. Its big plus is that it can be enumerated freely by multiple threads concurrently. You don't have to worry about concurrent mutations, because this collection is immutable. Nobody can change it after it has been created, ever.
The way that you update this collection is by creating a new collection and discarding the previous one. Its Enqueue and Dequeue methods don't mutate the existing collection; instead they return a new collection with the desired mutation applied. This sounds extremely inefficient, but actually it's not. The new collection reuses most of the internal parts of the existing one. Of course it's much more expensive than mutating a Queue<T>, probably around 10 times more expensive, but the hope is that you'll get even more back in return from how cheap and contention-free it is to enumerate.
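To illustrate those non-mutating semantics, here is a tiny sketch (separate from the buffer implementation below):
var q1 = ImmutableQueue<int>.Empty;
var q2 = q1.Enqueue(1);    // q1 is still empty; q2 is [1]
var q3 = q2.Enqueue(2);    // q2 is still [1]; q3 is [1, 2]
var q4 = q3.Dequeue();     // q3 is untouched; q4 is [2]
Console.WriteLine(q1.IsEmpty);            // True
Console.WriteLine(string.Join(", ", q3)); // 1, 2
And here is the circular buffer built on top of ImmutableQueue<T>: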
public class ConcurrentCircularBuffer<T> : IEnumerable<T> where T : IQueueItem
{
private readonly object _locker = new();
private readonly int _capacity;
private ImmutableQueue<T> _queue = ImmutableQueue<T>.Empty;
private int _count = 0;
private long _lastId = 0;
public ConcurrentCircularBuffer(int capacity) => _capacity = capacity;
public void Enqueue(T item)
{
lock (_locker)
{
item.Id = ++_lastId;
_queue = _queue.Enqueue(item);
if (_count < _capacity)
_count++;
else
_queue = _queue.Dequeue();
}
}
public IEnumerator<T> GetEnumerator()
{
var enumerator = Volatile.Read(ref _queue).GetEnumerator();
while (enumerator.MoveNext())
yield return enumerator.Current;
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
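A quick usage sketch (assuming the QueueItem class shown just below): readers can enumerate freely while a writer keeps enqueueing, and nobody throws a "collection was modified" error.
var buffer = new ConcurrentCircularBuffer<QueueItem>(1000);
var writer = Task.Run(() =>
{
    for (int i = 0; i < 100_000; i++) buffer.Enqueue(new QueueItem());
});
var reader = Task.Run(() =>
{
    while (!writer.IsCompleted)
    {
        long last = 0;
        foreach (var item in buffer) last = item.Id; // walks the snapshot that was current when the enumeration started
    }
});
Task.WaitAll(writer, reader);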
The class implementing the IQueueItem interface should look like this:
public class QueueItem : IQueueItem
{
private long _id;
public long Id
{
get => Volatile.Read(ref _id);
set => Volatile.Write(ref _id, value);
}
}
Otherwise it might be possible for a thread to see an IQueueItem instance with an uninitialized Id. For an explanation you can read this article by Igor Ostrovsky. I am not 100% sure that it's possible, but neither can I guarantee that it's impossible. Even with the Volatile in place, it still looks fragile to me to delegate the responsibility of initializing the Id to an external component.
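If delegating the Id initialization still bothers you, one alternative (my own sketch, not part of the answer above) is to let the buffer pair each item with its id in an immutable tuple envelope, so an id can never be observed uninitialized:
using System.Collections;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Threading;

public class ImmutableEnvelopeBuffer<T> : IEnumerable<(long Id, T Item)>
{
    private readonly object _locker = new();
    private readonly int _capacity;
    private ImmutableQueue<(long Id, T Item)> _queue = ImmutableQueue<(long Id, T Item)>.Empty;
    private int _count = 0;
    private long _lastId = 0;

    public ImmutableEnvelopeBuffer(int capacity) => _capacity = capacity;

    public void Enqueue(T item)
    {
        lock (_locker)
        {
            // The (Id, Item) tuple is created fully initialized, so no reader
            // can ever observe an item without its id.
            _queue = _queue.Enqueue((++_lastId, item));
            if (_count < _capacity) _count++;
            else _queue = _queue.Dequeue();
        }
    }

    public IEnumerator<(long Id, T Item)> GetEnumerator()
        => ((IEnumerable<(long Id, T Item)>)Volatile.Read(ref _queue)).GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
The trade-off is that consumers now receive (Id, Item) pairs instead of objects implementing IQueueItem.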

Since ConcurrentQueue is out in this question, you can try a fixed array.
IQueueItem[] items = new IQueueItem[SIZE];
long id = 0;
Enqueue is simple.
void Enqueue(IQueueItem item)
{
    long id2 = Interlocked.Increment(ref id);
    item.Id = id2;
    // Derive the slot from the id, so each id maps to exactly one slot.
    items[(id2 - 1) % SIZE] = item;
}
To output the data, you just need to copy the array to a new one and then sort it (this can of course be optimized):
var arr = new IQueueItem[SIZE];
Array.Copy(items, arr, SIZE);
return arr.Where(a => a != null).OrderBy(a => a.Id);
There may be gaps in the array because of concurrent insertions, so you can take the sequence only up to the first gap:
var e = arr.Where(a => a != null).OrderBy(a => a.Id);
var firstId = e.First().Id;
return e.TakeWhile((a, index) => a.Id - index == firstId);
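Putting these snippets together, a self-contained version of this approach could look like the following (the InterlockedRingBuffer name and the Snapshot method are my own consolidation of the fragments above):
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public class InterlockedRingBuffer<T> where T : class, IQueueItem
{
    private readonly T[] _items;
    private long _id = 0;

    public InterlockedRingBuffer(int size) => _items = new T[size];

    public void Enqueue(T item)
    {
        long id = Interlocked.Increment(ref _id);
        item.Id = id;
        _items[(id - 1) % _items.Length] = item;
    }

    public IEnumerable<T> Snapshot()
    {
        // Copy first, so concurrent writers can't disturb the sort below.
        var copy = new T[_items.Length];
        Array.Copy(_items, copy, _items.Length);
        var ordered = copy.Where(a => a != null).OrderBy(a => a.Id).ToList();
        if (ordered.Count == 0) return ordered;
        // Concurrent insertions may leave gaps; keep only the contiguous prefix.
        long firstId = ordered[0].Id;
        return ordered.TakeWhile((a, index) => a.Id - index == firstId);
    }
}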

Here is another implementation, using a Queue<T> with locking.
public interface IQueueItem
{
long Id { get; set; }
}
public class CircularBuffer<T> : IEnumerable<T> where T : class, IQueueItem
{
private readonly int _capacity;
private readonly Queue<T> _queue;
private long _lastId = 0;
private readonly object _lock = new();
public CircularBuffer(int capacity) {
_capacity = capacity;
_queue = new Queue<T>(capacity);
}
public void Enqueue(T item)
{
lock (_lock) {
if (_queue.Count == _capacity)
_queue.Dequeue();
item.Id = ++_lastId;
_queue.Enqueue(item);
}
}
public IEnumerator<T> GetEnumerator()
{
lock (_lock) {
var copy = _queue.ToArray();
return ((IEnumerable<T>)copy).GetEnumerator();
}
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
And to test:
public class Item : IQueueItem
{
private long _id;
public long Id
{
get => Volatile.Read(ref _id);
set => Volatile.Write(ref _id, value);
}
}
public class Program
{
public static void Main()
{
var q = new CircularBuffer<Item>(10);
Parallel.For(0, 15, i => q.Enqueue(new Item()));
Console.WriteLine(string.Join(", ", q.Select(x => x.Id)));
}
}
Result:
6, 7, 8, 9, 10, 11, 12, 13, 14, 15

Related

BlockingCollection where the consumers are also producers

I have a bunch of requests to process, and during the processing of those requests more "sub-requests" can be generated and added to the same blocking collection; the consumers are therefore also producers.
It's hard to know when to exit the consuming loop: clearly no thread can call BlockingCollection.CompleteAdding, as the other threads may still add something to the collection. You also cannot exit the consuming loop just because the BlockingCollection is empty, as another thread may have just taken the final remaining request and be about to start generating more - the Count of the BlockingCollection will then increase from zero again.
My only idea on this so far is to use a Barrier - when all threads reach the Barrier, there can't be anything left in the BlockingCollection and no thread can be generating new requests. Here is my code - is this an acceptable approach? (And please note: this is a highly contrived block of code modelling a much more complex situation: no programmer really writes code that processes random strings 😊)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Collections.Concurrent;
using System.Threading;
namespace Barrier1
{
class Program
{
private static readonly Random random = new Random();
private static void Main()
{
var bc = new BlockingCollection<string>();
AddRandomStringsToBc(bc, 1000, true);
int nTasks = 4;
var barrier = new Barrier(nTasks);
Action a = () => DoSomething(bc, barrier);
var actions = Enumerable.Range(0, nTasks).Select(x => a).ToArray();
Parallel.Invoke(actions);
}
private static IEnumerable<char> GetC(bool includeA)
{
var startChar = includeA ? 'A' : 'B';
var add = includeA ? 24 : 25;
while (true)
{
yield return (char)(startChar + random.Next(add));
}
}
private static void DoSomething(BlockingCollection<string> bc, Barrier barrier)
{
while (true)
{
if (bc.TryTake(out var str))
{
Console.WriteLine(str);
if (str[0] == 'A')
{
Console.WriteLine("Adding more strings...");
AddRandomStringsToBc(bc, 100);
}
}
else
{
// Can't exit the loop here just because there is nothing in the collection.
// A different thread may be just about to call AddRandomStringsToBc:
if (barrier.SignalAndWait(100))
{
break;
}
}
}
}
private static void AddRandomStringsToBc(BlockingCollection<string> bc, int n, bool startWithA = false, bool sleep = false)
{
var collection = Enumerable.Range(0, n).Select(x => string.Join("", GetC(startWithA).Take(5)));
foreach (var c in collection)
{
bc.Add(c);
}
}
}
}
Here is a collection similar to the BlockingCollection<T>, with the difference that it completes automatically instead of relying on manually calling the CompleteAdding method. The condition for the automatic completion is that the collection is empty, and all the consumers are in a waiting state.
The implementation is based on your clever idea of using a Barrier as a mechanism for checking the auto-complete condition. It's not perfect because it relies on polling, which takes place when the collection becomes empty while some consumers are still active. On the other hand it allows you to exploit all the existing functionality of the BlockingCollection<T> class, instead of rewriting it from scratch:
/// <summary>
/// A blocking collection that completes automatically when it's empty, and all
/// consuming enumerables are in a waiting state.
/// </summary>
public class AutoCompleteBlockingCollection<T> : IEnumerable<T>, IDisposable
{
private readonly BlockingCollection<T> _queue;
private readonly Barrier _barrier;
private volatile bool _autoCompleteStarted;
private volatile int _intervalMilliseconds = 500;
public AutoCompleteBlockingCollection(int boundedCapacity = -1)
{
_queue = boundedCapacity == -1 ? new() : new(boundedCapacity);
_barrier = new(0, _ => _queue.CompleteAdding());
}
public int Count => _queue.Count;
public int BoundedCapacity => _queue.BoundedCapacity;
public bool IsAddingCompleted => _queue.IsAddingCompleted;
public bool IsCompleted => _queue.IsCompleted;
/// <summary>
/// Begin observing the condition for automatic completion.
/// </summary>
public void BeginObservingAutoComplete() => _autoCompleteStarted = true;
/// <summary>
/// Gets or sets how frequently to check for the auto-complete condition.
/// </summary>
public TimeSpan CheckAutoCompleteInterval
{
get { return TimeSpan.FromMilliseconds(_intervalMilliseconds); }
set
{
int milliseconds = checked((int)value.TotalMilliseconds);
if (milliseconds < 0) throw new ArgumentOutOfRangeException();
_intervalMilliseconds = milliseconds;
}
}
public void Add(T item, CancellationToken cancellationToken = default)
=> _queue.Add(item, cancellationToken);
public bool TryAdd(T item) => _queue.TryAdd(item);
public IEnumerable<T> GetConsumingEnumerable(
CancellationToken cancellationToken = default)
{
_barrier.AddParticipant();
try
{
while (true)
{
if (!_autoCompleteStarted)
{
if (_queue.TryTake(out var item, _intervalMilliseconds,
cancellationToken))
yield return item;
}
else
{
if (_queue.TryTake(out var item, 0, cancellationToken))
yield return item;
else if (_barrier.SignalAndWait(_intervalMilliseconds,
cancellationToken))
break;
}
}
}
finally { _barrier.RemoveParticipant(); }
}
IEnumerator<T> IEnumerable<T>.GetEnumerator()
=> ((IEnumerable<T>)_queue).GetEnumerator();
IEnumerator IEnumerable.GetEnumerator()
=> ((IEnumerable<T>)_queue).GetEnumerator();
public void Dispose() { _barrier.Dispose(); _queue.Dispose(); }
}
The BeginObservingAutoComplete method should be called after adding the initial items to the collection. Before calling this method, the auto-complete condition is not checked.
The CheckAutoCompleteInterval is 500 milliseconds by default, and it can be configured at any time.
The Take and TryTake methods are missing on purpose. The collection is intended to be consumed via the GetConsumingEnumerable method. This way the collection keeps track of the currently subscribed consumers, in order to know when to auto-complete. Consumers can be added and removed at any time. A consumer can be removed by exiting the foreach loop, either by break/return etc, or by an exception.
Usage example:
private static void Main()
{
var bc = new AutoCompleteBlockingCollection<string>();
AddRandomStringsToBc(bc, 1000, true);
bc.BeginObservingAutoComplete();
Action action = () => DoSomething(bc);
var actions = Enumerable.Repeat(action, 4).ToArray();
Parallel.Invoke(actions);
}
private static void DoSomething(AutoCompleteBlockingCollection<string> bc)
{
foreach (var str in bc.GetConsumingEnumerable())
{
Console.WriteLine(str);
if (str[0] == 'A')
{
Console.WriteLine("Adding more strings...");
AddRandomStringsToBc(bc, 100);
}
}
}
The collection is thread-safe, with the exception of the Dispose method.

Performance of ConcurrentBag, many reads, rare modifications

I'm trying to build a model where there will be multiple reads of an entire collection and rare additions and modifications to it.
I thought I might use the ConcurrentBag in .NET as I've read the documentation and it's supposed to be good for concurrent reads and writes.
The code would look like this:
public class Cache
{
ConcurrentBag<string> cache = new ConcurrentBag<string>();
// this method gets called frequently
public IEnumerable<string> GetAllEntries()
{
return cache.ToList();
}
// this method gets rarely called
public void Add(string newEntry)
{
// add to concurrentBag
}
public void Remove(string entryToRemove)
{
// remove from concurrent bag
}
}
However, I've decompiled the ConcurrentBag class, and in its GetEnumerator a lock is always taken, which means any call to GetAllEntries will lock the entire collection and it will not perform.
To get around this, I'm thinking of coding it in the following manner instead, using a list:
public class Cache
{
private object guard = new object();
IList<string> cache = new List<string>();
// this method gets called frequently
public IEnumerable<string> GetAllEntries()
{
var currentCache = cache;
return currentCache;
}
// this method gets rarely called
public void Add(string newEntry)
{
lock (guard)
{
cache.Add(newEntry);
}
}
public void Remove(string entryToRemove)
{
lock (guard)
{
cache.Remove(entryToRemove);
}
}
}
Since the Add and Remove are rarely called I don't care too much about locking the access to the list there. On Get I might get a stale version of the list, but again I don't care, it will be fine for the next request.
Is the second implementation a good way to go?
EDIT
I've run a quick performance test and the results are the following:
Setup: populated the in-memory collection with 10000 strings.
Action: called GetAllEntries concurrently 50000 times.
Result:
00:00:35.2393871 to finish operation using ConcurrentBag (first implementation)
00:00:00.0036959 to finish operation using normal list (second implementation)
Code below:
class Program
{
static void Main(string[] args)
{
// warmup caches and stopwatch
var cacheWitBag = new CacheWithBag();
var cacheWitList = new CacheWithList();
cacheWitBag.Add("abc");
cacheWitBag.GetAllEntries();
cacheWitList.Add("abc");
cacheWitList.GetAllEntries();
var sw = new Stopwatch();
// warm up the stopwatch as well
sw.Start();
// initialize caches (rare writes so no real reason to measure here)
for (int i = 0; i < 50000; i++)
{
cacheWitBag.Add(Guid.NewGuid().ToString());
cacheWitList.Add(Guid.NewGuid().ToString());
}
sw.Stop();
// measure
var program = new Program();
sw.Start();
program.Run(cacheWitBag).Wait();
sw.Stop();
Console.WriteLine(sw.Elapsed);
sw.Restart();
program.Run2(cacheWitList).Wait();
sw.Stop();
Console.WriteLine(sw.Elapsed);
}
public async Task Run(CacheWithBag cache1)
{
List<Task> tasks = new List<Task>();
for (int i = 0; i < 10000; i++)
{
tasks.Add(Task.Run(() => cache1.GetAllEntries()));
}
await Task.WhenAll(tasks);
}
public async Task Run2(CacheWithList cache)
{
List<Task> tasks = new List<Task>();
for (int i = 0; i < 10000; i++)
{
tasks.Add(Task.Run(() => cache.GetAllEntries()));
}
await Task.WhenAll(tasks);
}
public class CacheWithBag
{
ConcurrentBag<string> cache = new ConcurrentBag<string>();
// this method gets called frequently
public IEnumerable<string> GetAllEntries()
{
return cache.ToList();
}
// this method gets rarely called
public void Add(string newEntry)
{
cache.Add(newEntry);
}
}
public class CacheWithList
{
private object guard = new object();
IList<string> cache = new List<string>();
// this method gets called frequently
public IEnumerable<string> GetAllEntries()
{
var currentCache = cache;
return currentCache;
}
// this method gets rarely called
public void Add(string newEntry)
{
lock (guard)
{
cache.Add(newEntry);
}
}
public void Remove(string entryToRemove)
{
lock (guard)
{
cache.Remove(entryToRemove);
}
}
}
}
}
To improve on InBetween's solution:
class Cache
{
ImmutableHashSet<string> cache = ImmutableHashSet.Create<string>();
public IEnumerable<string> GetAllEntries()
{
return cache;
}
public void Add(string newEntry)
{
ImmutableInterlocked.Update(ref cache, (set,item) => set.Add(item), newEntry);
}
public void Remove(string entryToRemove)
{
ImmutableInterlocked.Update(ref cache, (set,item) => set.Remove(item), entryToRemove);
}
}
This performs only atomic operations (no locking) and uses the .NET Immutable types.
In your current scenario, where Add and Remove are rarely called, I'd consider the following approach:
public class Cache
{
private object guard = new object();
private SomeImmutableCollection<string> cache = new SomeImmutableCollection<string>();
// this method gets called frequently
public IEnumerable<string> GetAllEntries()
{
return cache;
}
// this method gets rarely called
public void Add(string newEntry)
{
lock (guard)
{
cache = cache.Add(newEntry);
}
}
public void Remove(string entryToRemove)
{
lock (guard)
{
cache = cache.Remove(entryToRemove);
}
}
}
The fundamental change here is that cache is now an immutable collection, which means it can't change, ever. So concurrency problems with the collection itself simply disappear: something that can't change is inherently thread-safe.
Also, depending on how rare calls to Add and Remove are, you can even consider removing the lock in both of them, because all it's doing now is avoiding a race between Add and Remove and a potential loss of a cache update. If that scenario is very, very improbable you could get away with it. That said, I very much doubt the few nanoseconds an uncontended lock takes are a relevant factor here ;)
SomeImmutableCollection can be any of the collections found in System.Collections.Immutable that better suit your needs.
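For instance, with ImmutableList<T> as one concrete choice (a sketch of the same pattern):
using System.Collections.Generic;
using System.Collections.Immutable;

public class Cache
{
    private readonly object guard = new object();
    private ImmutableList<string> cache = ImmutableList<string>.Empty;

    // Readers get whatever immutable snapshot is current; no lock needed.
    public IEnumerable<string> GetAllEntries() => cache;

    public void Add(string newEntry)
    {
        lock (guard) { cache = cache.Add(newEntry); }
    }

    public void Remove(string entryToRemove)
    {
        lock (guard) { cache = cache.Remove(entryToRemove); }
    }
}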
Instead of a lock on a guard object to protect a simple container, you should consider ReaderWriterLockSlim, which is optimized and very performant for the read/write scenario: multiple readers are allowed at the same time, but only one writer is allowed, and it blocks other readers/writers. It is very useful in your scenario, where you read a lot but write only rarely.
Please note you can be a reader and then, for some reason, decide to become a writer (upgrade the slim lock) within your "reading" code.
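Here is a sketch of that idea applied to the cache above; readers copy under the read lock, so the enumeration they hand out can never observe a mutation in progress. (EnterUpgradeableReadLock is the API for the reader-turned-writer scenario just mentioned.)
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public class Cache
{
    private readonly ReaderWriterLockSlim _rwLock = new ReaderWriterLockSlim();
    private readonly List<string> _cache = new List<string>();

    public IEnumerable<string> GetAllEntries()
    {
        _rwLock.EnterReadLock(); // many readers may hold this concurrently
        try { return _cache.ToList(); } // snapshot, so enumeration after release is safe
        finally { _rwLock.ExitReadLock(); }
    }

    public void Add(string newEntry)
    {
        _rwLock.EnterWriteLock(); // exclusive: blocks all readers and writers
        try { _cache.Add(newEntry); }
        finally { _rwLock.ExitWriteLock(); }
    }

    public void Remove(string entryToRemove)
    {
        _rwLock.EnterWriteLock();
        try { _cache.Remove(entryToRemove); }
        finally { _rwLock.ExitWriteLock(); }
    }
}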

Consumer with timeout and under specific condition

The BlockingCollection<T> class provides an easy way to implement the producer/consumer pattern, but unfortunately doesn't have a feature I need. It allows me to set a timeout while waiting to consume an element, but does not provide a way to restrict which item is removed from the collection.
How can I implement a class similar to BlockingCollection<T>, but which allows me to specify the condition under which items should be taken?
For example, I need to take a Bar item only when its Amount equals a specific value:
public class Bar
{
public Int32 Amount { get; set; }
}
public class Program
{
public static void Main()
{
ToDoCollection<Bar> ToDoCollection = new ToDoCollection<Bar>();
int timeout = 10000;
// this doesn't work, that's why I'm asking for your help
Bar value = ToDoCollection.TryTake().Where(p => p.Amount != 5);
// Here, I need to wait for 10s trying to take an item from the blocking collection:
// an item that satisfies a specific condition, e.g. Bar.Amount greater than zero
}
}
If I understand correctly, you want a collection that has the following behavior:
Allows a thread to attempt to retrieve an item matching a specific condition, and will block the thread until such time as that item is present.
Allows a thread to specify a timeout for the operation described in point #1.
The existing BlockingCollection class apparently has nothing built in that addresses this.
You can implement your own collection type, adding whatever specific features you need. For example:
class BlockingPredicateCollection<T>
{
private readonly object _lock = new object();
private readonly List<T> _list = new List<T>();
public void Add(T t)
{
lock (_lock)
{
_list.Add(t);
// Wake any waiting threads, so they can check if the element they
// are waiting for is now present.
Monitor.PulseAll(_lock);
}
}
public bool TryTake(out T t, Predicate<T> predicate, TimeSpan timeout)
{
Stopwatch sw = Stopwatch.StartNew();
lock (_lock)
{
int index;
while ((index = _list.FindIndex(predicate)) < 0)
{
TimeSpan elapsed = sw.Elapsed;
if (elapsed > timeout ||
!Monitor.Wait(_lock, timeout - elapsed))
{
t = default(T);
return false;
}
}
t = _list[index];
_list.RemoveAt(index);
return true;
}
}
}
Then, for example:
BlockingPredicateCollection<Bar> toDoCollection = new BlockingPredicateCollection<Bar>();
int timeout = 10000;
Bar value;
if (toDoCollection.TryTake(out value,
p => p.Amount != 5, TimeSpan.FromMilliseconds(timeout)))
{
// do something with "value"
}

How to dynamically lock strings but remove the lock objects from memory

I have the following situation:
I have a lot of threads in my project, and each thread processes one "key" at a time.
Two threads cannot process the same "key" at the same time, but my project processes A LOOOOOT OF KEYS, so I can't keep all the "keys" in memory. I only need to record in memory that a thread is currently processing a "key", so that if another thread tries to process the same "key", it waits in a lock statement.
Now I have the following structure:
public class Lock
{
private static object _lockObj = new object();
private static List<object> _lockListValues = new List<object>();
public static void Execute(object value, Action action)
{
lock (_lockObj)
{
if (!_lockListValues.Contains(value))
_lockListValues.Add(value);
}
lock (_lockListValues.First(x => x.Equals(value)))
{
action.Invoke();
}
}
}
It is working fine; the problem is that the keys aren't being removed from memory. The biggest trouble is the multithreading, because a "key" can be processed at any time.
How could I solve this without a global lock independent of the keys?
Sorry, but no, this is not the way it should be done.
First, you speak about keys, but you store the keys as type object in a List, and then you search the list with LINQ to find them.
That is exactly what a dictionary is for.
Second, regarding the object model: it is usually best to implement the locking around some data inside a class of its own, to make it nice and clean, like:
using System.Collections.Concurrent;
public class LockedObject<T>
{
public readonly T data;
public readonly int id;
private readonly object obj = new object();
public LockedObject(int id, T data)
{
this.id = id;
this.data = data;
}
//Usually, if you have Action related to some data,
//it is better to receive
//that data as parameter
public void InvokeAction(Action<T> action)
{
lock(obj)
{
action(data);
}
}
}
//Now it is a concurrently safe object applying some action
//concurrently on given data, no matter how it is stored.
//But still, this is the best idea:
ConcurrentDictionary<int, LockedObject<T>> dict =
new ConcurrentDictionary<int, LockedObject<T>>();
//You can insert, read, remove all object's concurrently.
But the best thing is yet to come! :) You can make it lock-free, and very easily!
EDIT1:
ConcurrentInvoke is a dictionary-like collection for concurrency-safe invocation of an action over data. There can be only one action at a time on a given key.
using System;
using System.Threading;
using System.Collections.Concurrent;
public class ConcurrentInvoke<TKey, TValue>
{
//we hate lock() :)
private class Data<TData>
{
public readonly TData data;
private int flag;
private Data(TData data)
{
this.data = data;
}
public static bool Contains<TTKey>(ConcurrentDictionary<TTKey, Data<TData>> dict, TTKey key)
{
return dict.ContainsKey(key);
}
public static bool TryAdd<TTKey>(ConcurrentDictionary<TTKey, Data<TData>> dict, TTKey key, TData data)
{
return dict.TryAdd(key, new Data<TData>(data));
}
// Cannot remove if:
// - the key does not exist,
// - a remove of the key is already in progress, or
// - an invoke action on the key is in progress
public static bool TryRemove<TTKey>(ConcurrentDictionary<TTKey, Data<TData>> dict, TTKey key, Action<TTKey, TData> action_removed = null)
{
Data<TData> data = null;
if (!dict.TryGetValue(key, out data)) return false;
var access = Interlocked.CompareExchange(ref data.flag, 1, 0) == 0;
if (!access) return false;
Data<TData> data2 = null;
var removed = dict.TryRemove(key, out data2);
Interlocked.Exchange(ref data.flag, 0);
if (removed && action_removed != null) action_removed(key, data2.data);
return removed;
}
// Cannot invoke if:
// - the key does not exist,
// - a remove of the key is already in progress, or
// - an invoke action on the key is in progress
public static bool TryInvokeAction<TTKey>(ConcurrentDictionary<TTKey, Data<TData>> dict, TTKey key, Action<TTKey, TData> invoke_action = null)
{
Data<TData> data = null;
if (invoke_action == null || !dict.TryGetValue(key, out data)) return false;
var access = Interlocked.CompareExchange(ref data.flag, 1, 0) == 0;
if (!access) return false;
invoke_action(key, data.data);
Interlocked.Exchange(ref data.flag, 0);
return true;
}
}
private
readonly
ConcurrentDictionary<TKey, Data<TValue>> dict =
new ConcurrentDictionary<TKey, Data<TValue>>()
;
public bool Contains(TKey key)
{
return Data<TValue>.Contains(dict, key);
}
public bool TryAdd(TKey key, TValue value)
{
return Data<TValue>.TryAdd(dict, key, value);
}
public bool TryRemove(TKey key, Action<TKey, TValue> removed = null)
{
return Data<TValue>.TryRemove(dict, key, removed);
}
public bool TryInvokeAction(TKey key, Action<TKey, TValue> invoke)
{
return Data<TValue>.TryInvokeAction(dict, key, invoke);
}
}
ConcurrentInvoke<int, string> concurrent_invoke = new ConcurrentInvoke<int, string>();
concurrent_invoke.TryAdd(1, "string 1");
concurrent_invoke.TryAdd(2, "string 2");
concurrent_invoke.TryAdd(3, "string 3");
concurrent_invoke.TryRemove(1);
concurrent_invoke.TryInvokeAction(3, (key, value) =>
{
Console.WriteLine("InvokingAction[key: {0}, vale: {1}", key, value);
});
I modified a KeyedLock class that I posted in another question, to use internally the Monitor class instead of SemaphoreSlims. I expected that using a specialized mechanism for synchronous locking would offer better performance, but I can't actually see any difference. I am posting it anyway because it has the added convenience feature of releasing the lock automatically with the using statement. This feature adds no significant overhead in the case of synchronous locking, so there is no reason to omit it.
Another reason that justifies this separate implementation is that the Monitor has different semantics than the SemaphoreSlim. The Monitor is reentrant while the SemaphoreSlim is not: a single thread is allowed to enter the Monitor multiple times before finally exiting an equal number of times. This is not possible with a SemaphoreSlim: if a thread makes an attempt to Wait a second time on a SemaphoreSlim(1, 1), most likely it will deadlock.
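A quick illustration of the difference (a sketch):
var monitorLock = new object();
lock (monitorLock)
{
    lock (monitorLock) { } // OK: the Monitor is reentrant for the owning thread
}

var semaphore = new SemaphoreSlim(1, 1);
semaphore.Wait();
// semaphore.Wait(); // would block forever: the SemaphoreSlim has no thread affinity
semaphore.Release();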
The KeyedMonitor class internally stores only the locking objects that are currently in use, plus a small pool of locking objects that have been released and can be reused. This pool significantly reduces memory allocations under heavy usage, at the cost of some added synchronization overhead.
public class KeyedMonitor<TKey>
{
private readonly Dictionary<TKey, (object, int)> _perKey;
private readonly Stack<object> _pool;
private readonly int _poolCapacity;
public KeyedMonitor(IEqualityComparer<TKey> keyComparer = null,
int poolCapacity = 10)
{
_perKey = new Dictionary<TKey, (object, int)>(keyComparer);
_pool = new Stack<object>(poolCapacity);
_poolCapacity = poolCapacity;
}
public ExitToken Enter(TKey key)
{
var locker = GetLocker(key);
Monitor.Enter(locker);
return new ExitToken(this, key);
}
// Abort-safe API
public void Enter(TKey key, ref bool lockTaken)
{
try { }
finally // Abort-safe block
{
var locker = GetLocker(key);
try { Monitor.Enter(locker, ref lockTaken); }
finally { if (!lockTaken) ReleaseLocker(key, withMonitorExit: false); }
}
}
public bool TryEnter(TKey key, int millisecondsTimeout)
{
var locker = GetLocker(key);
bool acquired = false;
try { acquired = Monitor.TryEnter(locker, millisecondsTimeout); }
finally { if (!acquired) ReleaseLocker(key, withMonitorExit: false); }
return acquired;
}
public void Exit(TKey key) => ReleaseLocker(key, withMonitorExit: true);
private object GetLocker(TKey key)
{
object locker;
lock (_perKey)
{
if (_perKey.TryGetValue(key, out var entry))
{
int counter;
(locker, counter) = entry;
counter++;
_perKey[key] = (locker, counter);
}
else
{
lock (_pool) locker = _pool.Count > 0 ? _pool.Pop() : null;
if (locker == null) locker = new object();
_perKey[key] = (locker, 1);
}
}
return locker;
}
private void ReleaseLocker(TKey key, bool withMonitorExit)
{
object locker; int counter;
lock (_perKey)
{
if (_perKey.TryGetValue(key, out var entry))
{
(locker, counter) = entry;
// It is important to allow a possible SynchronizationLockException
// to be surfaced before modifying the internal state of the class.
// That's why the Monitor.Exit should be called here.
// Exiting the Monitor while holding the inner lock should be safe.
if (withMonitorExit) Monitor.Exit(locker);
counter--;
if (counter == 0)
_perKey.Remove(key);
else
_perKey[key] = (locker, counter);
}
else
{
throw new InvalidOperationException("Key not found.");
}
}
if (counter == 0)
lock (_pool) if (_pool.Count < _poolCapacity) _pool.Push(locker);
}
public readonly struct ExitToken : IDisposable
{
private readonly KeyedMonitor<TKey> _parent;
private readonly TKey _key;
public ExitToken(KeyedMonitor<TKey> parent, TKey key)
{
_parent = parent; _key = key;
}
public void Dispose() => _parent?.Exit(_key);
}
}
Usage example:
var locker = new KeyedMonitor<string>();
using (locker.Enter("Hello"))
{
DoSomething(); // with the "Hello" resource
}
Although the KeyedMonitor class is thread-safe, it is not as robust as using the lock statement directly, because it offers no resilience in case of a ThreadAbortException. An aborted thread could leave the class in a corrupted internal state. I don't consider this to be a big issue, since the Thread.Abort method has become obsolete in the current version of the .NET platform (.NET 5).
For an explanation about why the IDisposable ExitToken struct is not boxed by the using statement, you can look here: If my struct implements IDisposable will it be boxed when used in a using statement? If this was not the case, the ExitToken feature would add significant overhead.
Caution: please don't store anywhere the ExitToken value returned by the KeyedMonitor.Enter method. There is no protection against misuse of this struct (like disposing it multiple times). The intended usage of this method is shown in the example above.
Update: I added an Enter overload that allows taking the lock with thread-abort resilience, albeit with an inconvenient syntax:
bool lockTaken = false;
try
{
locker.Enter("Hello", ref lockTaken);
DoSomething();
}
finally
{
if (lockTaken) locker.Exit("Hello");
}
As with the underlying Monitor class, the lockTaken is always true after a successful invocation of the Enter method. The lockTaken can be false only if the Enter throws an exception.

Cache class and modified collections

I've written a generic Cache class designed to return an in-memory object and only evaluate src (an IQueryable, or a function returning IQueryable) occasionally. It's used in a couple of places in my app where fetching a lot of data via Entity Framework is expensive.
It is called like this:
//class level
private static CachedData<Foo> fooCache= new CachedData<Foo>();
//method level
var results = fooCache.GetData("foos", fooRepo.Include("Bars"));
Although it appeared to work OK in testing, running on a busy web server I'm seeing "Collection was modified; enumeration operation may not execute" errors in the code that consumes the results.
This must be because one thread is overwriting the results object inside the lock, while another is using them, outside the lock.
I'm guessing my only solution is to return a copy of the results to each consumer rather than the original, and that I cannot allow the copy to occur while inside the Fetch lock, but that multiple copies could occur simultaneously.
Can anyone suggest a better way, or help with the locking strategy please?
public class CachedData<T> where T:class
{
private static Dictionary<string, IEnumerable<T>> DataCache { get; set; }
public static Dictionary<string, DateTime> Expire { get; set; }
public int TTL { get; set; }
private object lo = new object();
public CachedData()
{
TTL = 600;
Expire = new Dictionary<string, DateTime>();
DataCache = new Dictionary<string, IEnumerable<T>>();
}
public IEnumerable<T> GetData(string key, Func<IQueryable<T>> src)
{
var bc = brandKey(key);
if (!DataCache.ContainsKey(bc)) Fetch(bc, src);
if (DateTime.Now > Expire[bc]) Fetch(bc, src);
return DataCache[bc];
}
public IEnumerable<T> GetData(string key, IQueryable<T> src)
{
var bc = brandKey(key);
if ((!DataCache.ContainsKey(bc)) || (DateTime.Now > Expire[bc])) Fetch(bc, src);
return DataCache[bc];
}
private void Fetch(string key, IQueryable<T> src )
{
lock (lo)
{
if ((!DataCache.ContainsKey(key)) || (DateTime.Now > Expire[key])) ExecuteFetch(key, src);
}
}
private void Fetch(string key, Func<IQueryable<T>> src)
{
lock (lo)
{
if ((!DataCache.ContainsKey(key)) || (DateTime.Now > Expire[key])) ExecuteFetch(key, src());
}
}
private void ExecuteFetch(string key, IQueryable<T> src)
{
if (!DataCache.ContainsKey(key)) DataCache.Add(key, src.ToList());
else DataCache[key] = src.ToList();
if (!Expire.ContainsKey(key)) Expire.Add(key, DateTime.Now.AddSeconds(TTL));
else Expire[key] = DateTime.Now.AddSeconds(TTL);
}
private string brandKey(string key, int? brandid = null)
{
return string.Format("{0}/{1}", brandid ?? Config.BrandID, key);
}
}
I usually use a ConcurrentDictionary<TKey, Lazy<TValue>>. That gives you a lock per key, which makes the strategy of holding the lock while fetching viable. It also avoids cache stampedes, because it guarantees that only one evaluation per key will ever happen. Lazy<T> automates the locking entirely.
Regarding your expiration logic: you could set up a timer that cleans the dictionary (or rewrites it entirely) every X seconds.
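Here is a sketch of that approach adapted to your class (the whole-cache-reset expiration is my own simplification of the timer idea). Because callers always receive a fully materialized list that is never mutated afterwards, the "Collection was modified" errors disappear:
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public class CachedData<T> where T : class
{
    private readonly ConcurrentDictionary<string, Lazy<IReadOnlyList<T>>> _cache =
        new ConcurrentDictionary<string, Lazy<IReadOnlyList<T>>>();
    private readonly Timer _expirationTimer;

    public CachedData(int ttlSeconds = 600)
    {
        // Crude expiration, as suggested above: periodically throw the whole
        // cache away; entries repopulate on the next access.
        _expirationTimer = new Timer(_ => _cache.Clear(), null,
            TimeSpan.FromSeconds(ttlSeconds), TimeSpan.FromSeconds(ttlSeconds));
    }

    public IReadOnlyList<T> GetData(string key, Func<IQueryable<T>> src)
    {
        // Lazy<T> (thread-safe by default) guarantees that src is evaluated
        // at most once per key, even under concurrent first requests.
        return _cache.GetOrAdd(key,
            _ => new Lazy<IReadOnlyList<T>>(() => src().ToList())).Value;
    }
}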
