Lock entire Dictionary or part of it before copying (SyncRoot) - C#

I have 2 dictionaries in a scoped service within a Blazor Server application that I use to manage state for multi-tenancy. It has come to my attention that there may be concurrency issues with users modifying the dictionaries on different threads.
public Dictionary<string, GlobalDataModel> Global { get; } = new();
public Dictionary<string, Dictionary<long, LocalDataModel>> Local { get; } = new();
I think I'm leaning towards leaving these as normal dictionaries and locking a sync root when modifying or iterating.
If I were to add an item to a collection within the containing class, would it be:
if (Local.ContainsKey(tenantId) == false)
{
    lock (syncRoot)
    {
        Local.Add(tenantId, new Dictionary<long, LocalDataModel>());
    }
}
or:
lock (syncRoot)
{
    if (Local.ContainsKey(tenantId) == false)
    {
        Local.Add(tenantId, new Dictionary<long, LocalDataModel>());
    }
}
Then if I were to copy different parts of the service collection to lists from an external class so it can be safely iterated would I just lock at the Service level, the Dictionary level, or the DataModel level, or is it dependent?
Say I wanted a resources collection within a specific project. Would it be:
[Inject] private IDataAdaptor DataAdaptor { get; set; }
var resources;
lock (DataAdaptor)
{
    resources = DataAdaptor.Local[TenantId][ProjectId].Resources;
}
or:
lock (DataAdaptor.Local[TenantId][ProjectId].Resources)
{
    resources = DataAdaptor.Local[TenantId][ProjectId].Resources;
}
And likewise for different parts of the collection, e.g. Tenants, Projects, etc.
I assume I have to lock on the object because syncRoot isn't accessible outside the containing class, or do I create a SyncRoot object in the class where I'm copying and lock on that?
Obviously multithreading is a new concept to me.

The Dictionary<TKey,TValue> is thread-safe only when used by multiple readers and zero writers. As soon as you add a writer into the mix, in other words if the dictionary is not frozen and can be mutated, then it is thread-safe for nothing. Every access to the dictionary, even the slightest touch like reading its Count, must be synchronized; otherwise its behavior is undefined.
Synchronizing a dictionary can be quite inefficient, especially if you have to enumerate it and do something with each of its entries. In this case you have to either create a copy of the dictionary (.ToArray()) and enumerate the copy, or lock for the whole duration of the iteration, blocking all other threads that might want to do trivial reads or writes. Alternative collections that are better suited for multithreaded environments are the ConcurrentDictionary<TKey,TValue> and the ImmutableDictionary<TKey,TValue>.
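As an illustration of the snapshot approach described above, here is a minimal sketch (the type and member names are placeholders, not from the question):

```csharp
using System.Collections.Generic;
using System.Linq;

public class TenantState
{
    private readonly object _syncRoot = new object();
    private readonly Dictionary<string, int> _counts = new();

    public void Increment(string tenantId)
    {
        // Every write goes through the same lock.
        lock (_syncRoot)
        {
            _counts.TryGetValue(tenantId, out var n);
            _counts[tenantId] = n + 1;
        }
    }

    public KeyValuePair<string, int>[] Snapshot()
    {
        // Copy under the lock; callers can then enumerate the copy
        // without holding the lock for the duration of the iteration.
        lock (_syncRoot)
        {
            return _counts.ToArray();
        }
    }
}
```

The trade-off is an allocation per snapshot in exchange for keeping the lock hold time short.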
In case you want to educate yourself systematically about thread-safety, here is an online resource: Threading in C# - Thread Safety by Joseph Albahari.

Related

How lock work for static collection

The lock keyword is used when you want to ensure that a region of code is executed by at most one thread at a time in a multithreaded environment; the rest of the threads will wait for the region to be released.
I have a collection IList<Student> student=new List<Student>() that is being used in multiple classes.
In some places objects are getting added to the list, and in some places objects are getting deleted. This causes some inconsistent behavior.
Is it true that when I lock the collection in class x in a multithreading environment, the collection will be locked for all classes and all threads in different classes will wait for the lock?
class StaticClass
{
    public static IList<Student> student = new List<Student>();
}

class ClassA
{
    // add an item to the collection
}

class ClassB
{
    // delete an item from the collection
}

class ClassC
{
    // lock the collection here
    lock (StaticClass.student)
    {
        foreach (ConnectionManager con in ConnectionManager.GetAllStudents())
        {
            con.Send(offlinePresence);
        }
    }
}
When I have locked the collection in ClassC, will other threads for ClassA and ClassB wait? Until the foreach loop finishes, is nobody allowed to add or delete items in the collection, because the collection has been locked?
Take a look at System.Collections.Generic.SynchronizedCollection<T> from the System.ServiceModel.dll assembly. All of the locking stuff is built in.
As Fabio said, ConcurrentBag also works, but a separate list context is created for each thread accessing it. When you try to remove items, it works like a queue. Once you've run out of items in your own thread's list, it will then "steal" items from other threads' lists (in a thread-safe way).
For your task, I'm guessing the SynchronizedCollection would be better.
When your code acquires a lock on an object, it does not block access to that object from other threads. The only thing that happens to this object is that other threads cannot acquire a lock to the same object as long as the lock is not released.
It is common practice to reserve a separate plain object as the target of a lock.
In your example other threads can still access the List<Student> object as long as code does not explicitly lock it first.
Locking can cause serious performance issues (threads waiting for each other) and in many cases does not need to be implemented explicitly. Just take a look at the concurrent collection classes of the .NET Framework.
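To make the scenario above safe, every class must take the same lock before touching the list. A sketch of that convention, using an assumed dedicated lock object (the class and member names are illustrative):

```csharp
using System.Collections.Generic;
using System.Linq;

public class Student { }

public static class StudentStore
{
    // A dedicated plain object reserved as the lock target.
    public static readonly object StudentLock = new object();
    public static readonly IList<Student> Students = new List<Student>();
}

public class Writer
{
    public void Add(Student s)
    {
        // Same lock object as every other accessor.
        lock (StudentStore.StudentLock)
        {
            StudentStore.Students.Add(s);
        }
    }
}

public class Reader
{
    public int CountViaSnapshot()
    {
        Student[] snapshot;
        lock (StudentStore.StudentLock)
        {
            snapshot = StudentStore.Students.ToArray();
        }
        // Iterate the snapshot outside the lock so writers aren't blocked.
        return snapshot.Length;
    }
}
```

The lock only works if it is taken consistently; any code path that skips it can still corrupt the list.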

Adding a new item in dictionary from multiple threads

I have a problem adding a new item to a static dictionary while using it from multiple threads. Any ideas where I'm going wrong?
Initializing the dictionary:
public static class Server
{
public static volatile Dictionary<int, List<SomeClass>> Values;
}
Trying to add an item:
Server.Values.Add(someInt, new List<SomeClass> { elements});
As explained by Jon Skeet, you are using an object which is not guaranteed to be thread-safe.
Try using ConcurrentDictionary, which is designed for concurrency scenarios with many threads:
public static class Server
{
    public static ConcurrentDictionary<int, List<SomeClass>> Values =
        new ConcurrentDictionary<int, List<SomeClass>>();
}
Here is how to use it:
bool added = Server.Values.TryAdd(someInt, new List<SomeClass> { elements});
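Since the values here are lists, a related pattern worth noting is GetOrAdd, which atomically returns the existing value or creates a new one, avoiding the check-then-add race. A sketch, with string standing in for SomeClass:

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class Server
{
    public static readonly ConcurrentDictionary<int, List<string>> Values = new();

    public static List<string> GetListFor(int key)
    {
        // Atomically fetches the existing list, or creates and adds one.
        return Values.GetOrAdd(key, _ => new List<string>());
    }
}
```

Note that the List<string> value itself is still not thread-safe; mutating it from multiple threads needs its own synchronization.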
In general, when working with resources that are shared between multiple threads, you need to use a synchronization mechanism, like lock() to make your code thread safe. Create a common object to use as the lock:
private readonly object _lock = new object();
Then you surround any code which accesses your shared resource, like this:
lock (_lock)
{
    // perform operations on shared resource here.
}
It's important to note that you should have a different lock for every shared resource rather than one lock used for all resources. If you use your lock object with multiple resources, your code could be very inefficient. If one thread grabs the lock so it can use resource A, then other threads will have to wait for the lock to be released even if they want to access resource B which has nothing to do with the resource A. Therefore, it's better to have one lock object per resource and to name your lock objects so you know which resources they should be used with.
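The one-lock-per-resource advice above might be sketched like this (the names are illustrative, not from the question):

```csharp
using System.Collections.Generic;

public class SharedState
{
    // One dedicated lock object per shared resource, named accordingly.
    private readonly object _resourceALock = new object();
    private readonly object _resourceBLock = new object();
    private readonly List<int> _resourceA = new();
    private readonly List<int> _resourceB = new();

    public void AddToA(int value)
    {
        lock (_resourceALock) { _resourceA.Add(value); }
    }

    public void AddToB(int value)
    {
        // Independent lock: callers touching B never wait on A's lock.
        lock (_resourceBLock) { _resourceB.Add(value); }
    }
}
```

With a single shared lock instead, a long operation on resource A would needlessly stall every caller of resource B.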
An alternative to this (as BRAHIM Kamel's answer shows) is to use a replacement, if available, for your shared resource which already has thread synchronization baked in, like ConcurrentDictionary. Though this may not be feasible in your case.

Returning a dictionary in c# in a multi-threaded environment

I have declared a dictionary of dictionaries:
Dictionary<String, Dictionary<String, String>> values;
I have a getter to get a dictionary at a specific index:
public Dictionary<String, String> get(String idx)
{
    lock (_lock)
    {
        return values[idx];
    }
}
As you can see I am working in a multi-threaded environment.
My question is do I need to return a copy of my dictionary in order to be thread safe like this:
public Dictionary<String, String> get(String idx)
{
    lock (_lock)
    {
        return new Dictionary<string, string>(values[idx]);
    }
}
If I don't, will the class that calls the getter receive a copy (so if I remove this dictionary from my Dictionary<String, Dictionary<String, String>>, will it still work)?
Cheers,
Thierry.
Dictionary<> is not thread-safe, but ConcurrentDictionary<> is.
The class calling the getter receives a reference, which means the inner dictionary will still be alive if you remove it from the values dictionary: the GC does not collect it as long as a reference to it exists somewhere; you just can't retrieve it through the getter anymore.
Basically, that means you have two possibilities when using Dictionary<>:
return a copy: bad idea, because if you change the config you then have two different configurations "alive" in your app
lock the instance: this would make it thread-safe, but then use ConcurrentDictionary<> as it does exactly that for you
If you really need to return the dictionary itself, then you're going to need to do one of the following: have rules for how the threads lock on it (brittle; what if there's a case that doesn't?); use it in a read-only way (a dictionary is thread-safe for read-only access, but note that this assumes the code private to the class isn't writing to it any more either); use a thread-safe dictionary (ConcurrentDictionary uses striped locking, my ThreadSafeDictionary uses lock-free approaches, and there are different scenarios where one beats the other); or make a copy as you suggest.
Alternatively, if you expose a method to retrieve or set the ultimate string found by the two keys, to enumerate the second-level keys found by a key, and so on, then not only can you control the locking in one place, but you also get a cleaner interface and free the implementation from the interface (e.g. you could move from lock-based code to a thread-safe dictionary, or the other way around, without affecting the calling code).
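A sketch of that encapsulated approach, with all locking controlled in one place (the class and member names are assumptions for illustration):

```csharp
using System.Collections.Generic;

public class ModuleSettings
{
    private readonly object _lock = new object();
    private readonly Dictionary<string, Dictionary<string, string>> _values = new();

    public bool TryGetValue(string module, string key, out string value)
    {
        lock (_lock)
        {
            value = null;
            return _values.TryGetValue(module, out var inner)
                && inner.TryGetValue(key, out value);
        }
    }

    public void SetValue(string module, string key, string value)
    {
        lock (_lock)
        {
            if (!_values.TryGetValue(module, out var inner))
            {
                inner = new Dictionary<string, string>();
                _values[module] = inner;
            }
            inner[key] = value;
        }
    }
}
```

Because the inner dictionaries never escape the class, callers cannot bypass the lock.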
If you don't return a copy, the caller will be able to change the dictionary, but that by itself is not a thread-safety issue.
There is, however, a thread-safety issue, because you don't expose any lock to synchronize writes and reads. For instance, a writer thread can add/remove a value while a reader thread is working on the same instance.
Yes, you have to return a copy.

Thread Synchronization in .NET

In my app I have a List of objects. I'm going to have a process (thread) running every few minutes that will update the values in this list. I'll have other processes (other threads) that will just read this data, and they may attempt to do so at the same time.
When the list is being updated, I don't want any other process to be able to read the data. However, I don't want the read-only processes to block each other when no updating is occurring. Finally, if a process is reading the data, the process that updates the data must wait until the process reading the data is finished.
What sort of locking should I implement to achieve this?
This is what you are looking for.
ReaderWriterLockSlim is a class that handles the scenario you asked for.
You have 2 pairs of functions at your disposal:
EnterWriteLock and ExitWriteLock
EnterReadLock and ExitReadLock
The first pair waits until all other locks, both read and write, are released, so it gives you exclusive access just like lock() would.
The second pair is compatible with itself: multiple read locks can be held at any given time.
Because there's no syntactic sugar like with the lock statement, make sure you never forget to exit the lock, whether due to an exception or anything else. So use it in a form like this:
try
{
    rwLock.EnterWriteLock(); // or EnterReadLock
    // Your code here, which can possibly throw an exception.
}
finally
{
    rwLock.ExitWriteLock(); // or ExitReadLock
}
You don't make it clear whether the updates to the list will involve modification of existing objects, or adding/removing new ones - the answers in each case are different.
To handle modification of existing items in the list, each object should handle its own locking.
To allow modification of the list while others are iterating it, don't give people direct access to the list - force them to work with a read-only snapshot of the list, like this:
public class Example
{
    private readonly object padLock = new object();
    private readonly List<X> MasterList = new List<X>();

    public IEnumerable<X> GetReadOnlySnapshot()
    {
        lock (padLock)
        {
            // Copy under the lock so readers get a stable snapshot.
            return new ReadOnlyCollection<X>(new List<X>(MasterList));
        }
    }
}
Taking a copy of the master list under the lock and wrapping it in a ReadOnlyCollection<X> ensures that readers iterate over fixed content, without blocking modifications made by writers.
You could use ReaderWriterLockSlim. It would satisfy your requirements precisely. However, it is likely to be slower than just using a plain old lock. The reason is that RWLS is ~2x slower than lock, and accessing a List is so fast that the savings are not enough to overcome the additional overhead of the RWLS. Test both ways, but it is likely ReaderWriterLockSlim will be slower in your case. Reader-writer locks do better in scenarios where readers significantly outnumber writers and where the guarded operations are long and drawn out.
However, let me present another option for you. One common pattern for dealing with this type of problem is to use two separate lists. One serves as the official copy, which accepts updates, and the other serves as the read-only copy. After you update the official copy you clone it and swap out the reference for the read-only copy. This is elegant in that the readers require no blocking whatsoever. The reason readers do not require any blocking synchronization is that we treat the read-only copy as if it were immutable. Here is how it can be done.
public class Example
{
    private readonly List<object> m_Official;
    private volatile List<object> m_Readonly;

    public Example()
    {
        m_Official = new List<object>();
        m_Readonly = m_Official;
    }

    public void Update()
    {
        lock (m_Official)
        {
            // Modify the official copy here.
            m_Official.Add(...);
            m_Official.Remove(...);
            // Now clone the official copy.
            var clone = new List<object>(m_Official);
            // And finally swap out the read-only copy reference.
            m_Readonly = clone;
        }
    }

    public object Read(int index)
    {
        // It is safe to access the read-only copy here because it is immutable.
        // m_Readonly must be marked as volatile for this to work correctly.
        return m_Readonly[index];
    }
}
The code above would not satisfy your requirements precisely because readers never block...ever. Which means they will still be taking place while writers are updating the official list. But, in a lot of scenarios this winds up being acceptable.

How to implement a thread-safe cache mechanism when working with collections?

Scenario:
I have a bunch of Child objects, all related to a given Parent.
I'm working on an ASP.NET MVC 3 web application (i.e. multi-threaded)
One of the pages is a "search" page where I need to grab a given sub-set of children and "do stuff" to them in memory (calculations, ordering, enumeration)
Instead of getting each child in a separate call, I do one call to the database to get all children for a given parent, cache the result, and "do stuff" with the result.
The problem is that the "do stuff" involves LINQ operations (enumeration, adding/removing items from the collection), which when implemented using List<T> is not thread-safe.
I have read about ConcurrentBag<T> and ConcurrentDictionary<TKey, TValue>, but I'm not sure if I should use one of these or implement the synchronization/locking myself.
I'm on .NET 4, so I'm utilizing ObjectCache as a singleton instance of MemoryCache.Default. I have a service layer which works with the cache, and the services accept an instance of ObjectCache, which is done via constructor DI. This way all the services share the same ObjectCache instance.
The main thread-safety issue is that I need to loop through the current "cached" collection, and if the child I'm working with is already there, I need to remove it and add the one I'm working with back in; this enumeration is what causes the issues.
Any recommendations?
Yes, to implement a cache efficiently, it needs a fast lookup mechanism and so List<T> is the wrong data structure out-of-the-box. A Dictionary<TKey, TValue> is an ideal data structure for a cache because it provides a way to replace:
var value = instance.GetValueExpensive(key);
with:
var value = instance.GetValueCached(key);
by using cached values in a dictionary and using the dictionary to do the heavy lifting for lookup. The caller is none the wiser.
But, if the callers could be calling from multiple threads then .NET4 provides ConcurrentDictionary<TKey, TValue> that works perfectly in this situation. But what does the dictionary cache? It seems like in your situation the dictionary key is the child and the dictionary values are the database results for that child.
OK, so now we have a thread-safe and efficient cache of database results keyed by child. What data structure should we use for the database results?
You haven't said what those results look like but since you use LINQ we know that they are at least IEnumerable<T> and maybe even List<T>. So we're back to the same problem, right? Because List<T> is not thread-safe, we can't use it for the dictionary value. Or can we?
A cache must be read-only from the point-of-view of the caller. You say with LINQ you "do stuff" like add/remove, but that makes no sense for a cached value. It only makes sense to do stuff in the implementation of the cache itself such as replacing a stale entry with new results.
The dictionary value, because it is read-only, can be a List<T> with no ill effects, even though it will be accessed from multiple threads. You can use List<T>.AsReadOnly to improve your confidence and add some compile-time safety checking.
But the important point is that List<T> is only not thread-safe if it is mutable. Since by definition a method implemented using a cache must return the same value if called multiple times (until the value is invalidated by the cache itself), the client cannot modify the returned value and so the List<T> must be frozen, effectively immutable.
If the client desperately needs to modify the cached database results and the value is a List<T>, then the only safe way to do so is to:
Make a copy
Make changes to the copy
Ask the cache to update the value
In summary, use a thread-safe dictionary for the top-level cache and an ordinary list for the cached value, being careful never to modify the contents of the list after inserting it into the cache.
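Putting that summary together, a minimal sketch (the type names, Child, and the load delegate are assumptions, not from the question):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public class Child { }

public class ChildCache
{
    private readonly ConcurrentDictionary<int, ReadOnlyCollection<Child>> _cache = new();

    // Loads from the database on a miss; the cached value is a frozen snapshot.
    public ReadOnlyCollection<Child> GetChildren(
        int parentId, Func<int, IEnumerable<Child>> load)
    {
        return _cache.GetOrAdd(parentId, id => load(id).ToList().AsReadOnly());
    }

    // To "modify" a cached entry: build a new list and swap in the new value.
    public void Replace(int parentId, IEnumerable<Child> children)
    {
        _cache[parentId] = children.ToList().AsReadOnly();
    }
}
```

Readers never see a half-updated list because entries are only ever replaced wholesale, never mutated in place.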
Try modifying RefreshFooCache like so:
public ReadOnlyCollection<Foo> RefreshFooCache(Foo parentFoo)
{
    List<Foo> parentFooIncludingBelow;
    try
    {
        // Prevent other writers but allow read operations to proceed.
        Lock.EnterUpgradeableReadLock();
        // Recheck the cache: it may already have been updated by another
        // thread before we gained the exclusive lock.
        if (_cache.Get(parentFoo.Id.ToString()) != null)
        {
            return parentFoo.FindFoosUnderneath(uniqueUri).AsReadOnly();
        }
        // Get the parent and everything below.
        parentFooIncludingBelow = _repo.FindFoosIncludingBelow(parentFoo.UniqueUri).ToList();
        // Now prevent both other writers and other readers.
        Lock.EnterWriteLock();
        // Replace the cache entry.
        _cache.Remove(parentFoo.Id.ToString());
        _cache.Add(parentFoo.Id.ToString(), parentFooIncludingBelow);
    }
    finally
    {
        if (Lock.IsWriteLockHeld)
        {
            Lock.ExitWriteLock();
        }
        Lock.ExitUpgradeableReadLock();
    }
    return parentFooIncludingBelow.AsReadOnly();
}
EDIT - updated to use ReadOnlyCollection instead of ConcurrentDictionary
Here's what I currently have implemented. I ran some load tests and didn't see any errors happen. What do you guys think of this implementation?
public class FooService
{
    private static ReaderWriterLockSlim _lock;
    private static readonly object SyncLock = new object();

    private static ReaderWriterLockSlim Lock
    {
        get
        {
            if (_lock == null)
            {
                lock (SyncLock)
                {
                    if (_lock == null)
                        _lock = new ReaderWriterLockSlim();
                }
            }
            return _lock;
        }
    }

    public ReadOnlyCollection<Foo> RefreshFooCache(Foo parentFoo)
    {
        // Get the parent and everything below.
        var parentFooIncludingBelow = repo.FindFoosIncludingBelow(parentFoo.UniqueUri).ToList().AsReadOnly();
        try
        {
            Lock.EnterWriteLock();
            // Replace the cache entry.
            _cache.Remove(parentFoo.Id.ToString());
            _cache.Add(parentFoo.Id.ToString(), parentFooIncludingBelow);
        }
        finally
        {
            Lock.ExitWriteLock();
        }
        return parentFooIncludingBelow;
    }

    public ReadOnlyCollection<Foo> FindFoo(string uniqueUri)
    {
        var parentIdForFoo = uniqueUri.GetParentId();
        ReadOnlyCollection<Foo> results = null;
        try
        {
            Lock.EnterReadLock();
            var cachedFoo = _cache.Get(parentIdForFoo);
            if (cachedFoo != null)
                results = cachedFoo.FindFoosUnderneath(uniqueUri).ToList().AsReadOnly();
        }
        finally
        {
            Lock.ExitReadLock();
        }
        if (results == null)
            results = RefreshFooCache(parentFoo).FindFoosUnderneath(uniqueUri).ToList().AsReadOnly();
        return results;
    }
}
Summary:
Each controller (i.e. each HTTP request) is given a new instance of FooService.
FooService has a static lock, so all the HTTP requests share the same instance.
I've used ReaderWriterLockSlim to ensure multiple threads can read, but only one thread can write. I'm hoping this should avoid "read" deadlocks. If two threads happen to need to "write" at the same time, then of course one will have to wait. But the goal here is to serve the reads quickly.
The goal is to be able to find a "Foo" from the cache as long as something above it has been retrieved.
Any tips/suggestions/problems with this approach? I'm not sure if I'm locking the right areas. But because I need to run LINQ on the dictionary, I figured I need a read lock.
