I'm wondering if there are any downsides to locking over a collection such as a List<T>, HashSet<T>, or a Dictionary<TKey, TValue> rather than a simple object.
Note: in the following examples, that is the only place where the locks occur, it's not being locked from multiple places, but the static method may be called from multiple threads. Also, the _dict is never accessed outside of the GetSomething method.
My current code looks like this:
private static readonly Dictionary<string, string> _dict = new Dictionary<string, string>();
public static string GetSomething(string key)
{
string result;
if (!_dict.TryGetValue(key, out result))
{
lock (_dict)
{
if (!_dict.TryGetValue(key, out result))
{
_dict[key] = result = CalculateSomethingExpensive(key);
}
}
}
return result;
}
Another developer is telling me that locking on a collection will cause issues, but I'm skeptical. Would my code be more efficient if I do it this way?
private static readonly Dictionary<string, string> _dict = new Dictionary<string, string>();
private static readonly object _syncRoot = new object();
public static string GetSomething(string key)
{
string result;
if (!_dict.TryGetValue(key, out result))
{
lock (_syncRoot)
{
if (!_dict.TryGetValue(key, out result))
{
_dict[key] = result = CalculateSomethingExpensive(key);
}
}
}
return result;
}
If you expose your collections to the outside world, then yes, this can be a problem. The usual recommendation is to lock on something that you exclusively own and that can never be locked unexpectedly by code outside your influence. That's why it's generally better to lock on something that you'd never even consider exposing (i.e. a specific lock object created for that purpose). That way, when your memory fails you, you'll probably never get unexpected results.
To answer your question more directly: adding another object into the mix is never going to be more efficient, but sacrificing what is generally regarded as good coding practice for some perceived but unmeasured efficiency would be premature optimisation. I favour best practice until it's demonstrably causing a bottleneck.
In this case, I would lock on the collection; the purpose for the lock relates directly to the collection and not to any other object, so there is a degree of self-annotation in using it as the lock object.
There are changes I would make though.
There's nothing I can find in the documentation to say that TryGetValue is threadsafe and won't throw an exception (or worse) if you call it while the dictionary is in an invalid state because it is half-way through adding a new value. Because it's not atomic, the double-read pattern you use here (to avoid the time spent obtaining a lock) is not safe. That will have to be changed to:
private static readonly Dictionary<string, string> _dict = new Dictionary<string, string>();
public static string GetSomething(string key)
{
string result;
lock (_dict)
{
if (!_dict.TryGetValue(key, out result))
{
_dict[key] = result = CalculateSomethingExpensive(key);
}
}
return result;
}
If it is likely to involve more successful reads than unsuccessful (that hence require writes), use of a ReaderWriterLockSlim would give better concurrency on those reads.
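A minimal sketch of the ReaderWriterLockSlim variant, assuming reads greatly outnumber writes. The class name SomethingCache and the stand-in body of CalculateSomethingExpensive are hypothetical; only the locking structure is the point.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class SomethingCache
{
    private static readonly Dictionary<string, string> _dict = new Dictionary<string, string>();
    private static readonly ReaderWriterLockSlim _rwLock = new ReaderWriterLockSlim();

    public static string GetSomething(string key)
    {
        string result;

        // Many threads may hold the read lock at the same time.
        _rwLock.EnterReadLock();
        try
        {
            if (_dict.TryGetValue(key, out result))
            {
                return result;
            }
        }
        finally
        {
            _rwLock.ExitReadLock();
        }

        // Only one thread may hold the write lock, and no readers run meanwhile.
        _rwLock.EnterWriteLock();
        try
        {
            // Re-check: another thread may have added the key between the two locks.
            if (!_dict.TryGetValue(key, out result))
            {
                _dict[key] = result = CalculateSomethingExpensive(key);
            }
            return result;
        }
        finally
        {
            _rwLock.ExitWriteLock();
        }
    }

    // Hypothetical stand-in for the expensive calculation in the question.
    private static string CalculateSomethingExpensive(string key)
    {
        return key.ToUpperInvariant();
    }
}
```

Unlike the unlocked first read in the question, every read here happens under the read lock, so it can never observe the dictionary halfway through a write.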
Edit: I just noticed that your question was not about preference generally, but about efficiency. Really, the efficiency difference of using 4 bytes more memory in the entire system (since it's static) is absolutely zero. The decision isn't about efficiency at all; since both are of equal technical merit (in this case), it is about whether you find that locking on the collection or on a separate object better expresses your intent to another developer (including you in the future).
No. As long as the variable is not accessible from anywhere else, and you can guarantee that the lock is only used here, there is no downside. In fact, the documentation for Monitor.Enter (which is what a lock in C# uses) does exactly this.
However, as a general rule, I still recommend using a private object for locking. This is safer in general, and will protect you if you ever expose this object to any other code, as you will not open up the possibility of your object being locked on from other code.
To directly answer your question: no.
It makes no difference which object you lock on. .NET cares only about its reference, which works exactly like a pointer. Think about locking in .NET as a big synchronized hash table where the key is the object reference and the value is a boolean saying whether you can enter the monitor or not. If two threads lock on different objects (a != b), they can enter the lock's monitor concurrently, even if a.Equals(b) (this is very important). But if they lock on a and b, and a == b, only one of them at a time will be in the monitor.
As long as dict is not accessed outside your scope, there is no performance impact. If dict is visible elsewhere, other code may take a lock on it even when that is not required (imagine a careless colleague who locks on the first object they find in the code).
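A small sketch of the reference-identity point above. The class and method names are hypothetical; it uses two distinct string instances with equal contents to show that Equals plays no part in which monitor is taken.

```csharp
using System;
using System.Threading;

public static class LockIdentityDemo
{
    // Returns true if a second thread can take the monitor of 'candidate'
    // while the calling thread holds the monitor of 'held'.
    public static bool CanEnterWhileHeld(object held, object candidate)
    {
        bool entered = false;
        lock (held)
        {
            var worker = new Thread(() =>
            {
                // TryEnter without a timeout returns immediately instead of blocking.
                if (Monitor.TryEnter(candidate))
                {
                    entered = true;
                    Monitor.Exit(candidate);
                }
            });
            worker.Start();
            worker.Join();
        }
        return entered;
    }

    // Two separately constructed strings: equal contents, different references.
    public static object MakeKey()
    {
        return new string('x', 3);
    }
}
```

With a = MakeKey() and b = MakeKey(), a.Equals(b) is true but CanEnterWhileHeld(a, b) also returns true, because the two references have separate monitors. CanEnterWhileHeld(a, a) returns false.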
Hope to have been of help.
I would recommend using the ICollection.SyncRoot object for locking rather than your own object:
private static readonly Dictionary<String, String> _dict = new Dictionary<String, String>();
private static readonly Object _syncRoot = ((ICollection)_dict).SyncRoot;
I have a configuration repository class that roughly looks like this:
public class ConfigurationRepository // pseudo c#
{
    private IDictionary<string, string> _cache = new Dictionary<string, string>();
    private ConfigurationStore _configStore;
    private CancellationToken _cancellationToken;

    public ConfigurationRepository(ConfigurationStore configStore, CancellationToken cancellationToken)
    {
        _configStore = configStore;
        _cancellationToken = cancellationToken;
        LiveCacheReload();
    }

    private void LiveCacheReload()
    {
        Task.Run(() =>
        {
            while (!_cancellationToken.IsCancellationRequested)
            {
                try
                {
                    // Build the new dictionary fully, then swap the reference.
                    _cache = new Dictionary<string, string>(_configStore.GetAllItems(), StringComparer.OrdinalIgnoreCase);
                }
                catch { } // ignore
                // some exponential back-off code here
            }
        });
    }

    ... get methods ...
}
... where _cache is only ever accessed in a read-only manner through _cache.ContainsKey(key), _cache.Keys, and _cache[key].
This class is accessed from multiple threads. Is it ok to hot swap this Dictionary without synchronization when it is only ever read-accessed? ConfigurationProvider from Microsoft.Extensions.Configuration looks to be implemented in the same way.
It depends. If you have code which does something like:
if (_cache.ContainsKey(key))
{
var x = _cache[key];
}
that's obviously unsafe, because _cache could be re-assigned between the first and second reads.
If the consumer code only ever accesses _cache once (and creates a local copy if it needs to do multiple accesses), it's safe in the sense that you shouldn't get a crash. However you need to carefully audit every place where _cache is accessed to make sure that the code doesn't make any assumptions about _cache.
However, there's no memory barrier around reading or writing _cache, which means that a thread reading _cache may read a value which is old: the compiler, JIT and even CPU are allowed to return a value which was read some time ago. For example, in a tight loop which reads _cache on every iteration, the JIT may re-arrange instructions so that _cache is read once just before the loop, and then never re-read inside the loop. Likewise a CPU cache local to one processor core may contain an out-of-date value for _cache, and the CPU is under no obligation to update this if another core writes a different value through a different cache.
To avoid this, you need a memory barrier, and the safest way to introduce one is through a lock.
So, don't be clever and try and avoid locks. It's fraught: lock-free code is really hard to write correctly, but it's very very easy to write something which appears to work, and then causes a subtle error in very particular circumstances which is impossible to track down. It's just not worth the risk.
For an eye-opening read, try Eric Lippert's post Can I skip the lock when reading an integer? (and the follow-up article linked at the bottom).
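A minimal sketch of the lock-based approach recommended above, assuming the whole dictionary is replaced on reload and never mutated afterwards. The names ConfigurationCache, _gate, Replace and TryGet are hypothetical and not taken from the question.

```csharp
using System;
using System.Collections.Generic;

public sealed class ConfigurationCache
{
    private readonly object _gate = new object();
    private Dictionary<string, string> _cache =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    // Called by the reload loop: build the new dictionary first, then swap it in under the lock.
    public void Replace(IDictionary<string, string> items)
    {
        var fresh = new Dictionary<string, string>(items, StringComparer.OrdinalIgnoreCase);
        lock (_gate)
        {
            _cache = fresh;
        }
    }

    // Readers take one snapshot of the reference under the lock and use only that snapshot,
    // so a swap between ContainsKey-style checks and the indexer cannot happen.
    public bool TryGet(string key, out string value)
    {
        Dictionary<string, string> snapshot;
        lock (_gate)
        {
            snapshot = _cache;
        }
        return snapshot.TryGetValue(key, out value);
    }
}
```

The lock is held only for the reference read or write, so contention stays low while the memory-barrier concern goes away.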
I am wondering which of the following pieces of code is best:
private static volatile OrderedDictionary _instance;
private static readonly Object SyncLock = new Object();
private static OrderedDictionary Instance
{
get { return _instance ?? (_instance = new OrderedDictionary()); }
}
public static Mea Add(Double pre, Double rec)
{
lock (SyncLock)
{
...
}
}
Or is it OK (and, IMO, better) to just use the following?
private static volatile OrderedDictionary _instance;
private static OrderedDictionary Instance
{
get { return _instance ?? (_instance = new OrderedDictionary()); }
}
public static Mea Add(Double pre, Double rec)
{
lock (Instance)
{
...
}
}
Based on Mike Strobel's answer I have made the following changes:
public static class Meas
{
private static readonly OrderedDictionary Instance = new OrderedDictionary();
private static readonly Object SyncLock = new Object();
public static Mea Add(Double pre, Double rec)
{
lock (SyncLock)
{
Instance.Add(pre, rec);
...
}
}
}
Mike Strobel's advice is good advice. To sum up:
Lock only objects that are specifically intended to be locks.
Those lock objects should be private readonly fields that are initialized in their declarations.
Do not try to roll your own threadsafe lazy initialization. Use the Lazy<T> type; it was designed by experts who know what they are doing.
Lock all accesses to the protected variable.
Violate these sensible guidelines when both of the following two conditions are true: (1) you have an empirically demonstrated customer-impacting performance problem and solid proof that going with a more complex low-lock thread safety system is the only reasonable solution to the problem, and (2) you are a leading expert on the implications of processor optimizations on low-lock code. For example, if you are Grant Morrison or Joe Duffy.
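If lazy initialization is genuinely needed, the guidelines above could be sketched as follows. The class name LazyMeas is hypothetical, and because the Mea return type from the question is not shown, this version returns the dictionary's item count instead.

```csharp
using System;
using System.Collections.Specialized;

public static class LazyMeas
{
    // Lazy<T> with a value factory is thread-safe by default:
    // the factory runs at most once, even under contention.
    private static readonly Lazy<OrderedDictionary> Instance =
        new Lazy<OrderedDictionary>(() => new OrderedDictionary());

    // A dedicated lock object: private, readonly, initialized in its declaration.
    private static readonly object SyncLock = new object();

    public static int Add(double pre, double rec)
    {
        // Every access to the protected dictionary goes through the same lock.
        lock (SyncLock)
        {
            Instance.Value.Add(pre, rec);
            return Instance.Value.Count;
        }
    }
}
```

Lazy<T> handles only the one-time construction; the lock is still required because OrderedDictionary itself is not safe for concurrent writes.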
The two pieces of code are not equivalent. The former ensures that all threads will always use the same lock object. The latter locks a lazily-initialized object, and there is absolutely nothing preventing multiple instantiations of your _instance dictionary, resulting in contents being lost.
What is the purpose of the lock? Does it serve a purpose other than to guarantee single initialization of the dictionary? Ignoring that it fails to accomplish this in the second example, if that is its sole intended purpose, then you may consider simply using the Lazy<T> class or a double-checked locking pattern.
But since this is a static member (and does not appear to capture outer generic parameters), it will presumably only be instantiated once per AppDomain. In that case, just mark it as readonly and initialize it in the declaration. You're probably not saving much by initializing it lazily.
Since you are concerned with best practices: you should never use the lock construct on a mutable value; this goes for both static and instance fields, as well as locals. It is especially bad practice to lock on a volatile field, as the presence of that keyword indicates that you expect the underlying value to change. If you're going to lock on a field, it should almost always be a readonly field. It's also considered bad practice to lock on a method result; that applies to properties too, as properties are effectively a pair of specially-named accessor methods.
If you do not expose the Instance to other classes, the second approach is okay (but not equivalent). It's best practice to keep the lock object private to the class that uses it as a lock object. You may run into issues only if other classes can also lock on this object.
(For completeness, and regarding @Scott Chamberlain's comment:)
This assumes that the class of Instance does not itself use lock (this), which is bad practice in its own right.
Nevertheless, the property could cause problems. The null-coalescing operator is compiled to a null check plus an assignment, so you could run into race conditions. You might want to read more about this. Also consider removing the lazy initialization altogether, if possible.
I have declared a dictionary of dictionaries:
Dictionary<String, Dictionary<String, String>> values;
I have a getter to get a dictionary at a specific index:
public Dictionary<String,String> get(String idx)
{
    lock (_lock)
    {
        // use the parameter as the key
        return values[idx];
    }
}
As you can see I am working in a multi-threaded environment.
My question is do I need to return a copy of my dictionary in order to be thread safe like this:
public Dictionary<String,String> get(String idx)
{
    lock (_lock)
    {
        // copy the inner dictionary while the lock is held
        return new Dictionary<string, string>(values[idx]);
    }
}
If I don't, will the class that calls the getter receive a copy (so if I remove this dictionary from my Dictionary<String, Dictionary<String, String>>, will it still work)?
Cheers,
Thierry.
Dictionary<> is not thread-safe, but ConcurrentDictionary<> is.
The class calling the getter receives a reference. That means the inner dictionary will still be there if you remove it from the values dictionary, because the GC does not collect it as long as a reference to it exists somewhere; you just can't obtain it through the getter anymore.
Basically, that means you have two possibilities when using Dictionary<>:
return a copy: a bad idea, because if you change the config you have two different configurations "alive" in your app
lock the instance: this would make it thread-safe, but then you may as well use ConcurrentDictionary<>, as it does exactly that for you
If you really need to return the dictionary itself, then you're going to need to either have rules for how the threads lock on it (brittle, what if there's a case that doesn't?), use it in a read-only way (dictionary is thread-safe for read-only, but note that this assumes that the code private to the class isn't writing to it any more either), use a thread-safe dictionary (ConcurrentDictionary uses striped locking, my ThreadSafeDictionary uses lock-free approaches, and there are different scenarios where one beats the other), or make a copy as you suggest.
Alternatively though, if you expose a method to retrieve or set the ultimate string that is found by the two keys, to enumerate the second-level keys found by a key, and so on, then not only can you control the locking done in one place, but it's got other advantages in cleanness of interface and in freeing implementation from interface (e.g. you could move from lock-based to use of a thread-safe dictionary, or the other way around, without affecting the calling code).
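The alternative described above (exposing operations rather than the inner dictionary) could look roughly like this. The class name ModuleValues and its method names are hypothetical; the sketch assumes a single private lock guards both levels of the nested dictionary.

```csharp
using System.Collections.Generic;

public sealed class ModuleValues
{
    private readonly object _lock = new object();
    private readonly Dictionary<string, Dictionary<string, string>> _values =
        new Dictionary<string, Dictionary<string, string>>();

    // Sets the string found by the two keys, creating the inner dictionary if needed.
    public void Set(string module, string key, string value)
    {
        lock (_lock)
        {
            Dictionary<string, string> inner;
            if (!_values.TryGetValue(module, out inner))
            {
                inner = new Dictionary<string, string>();
                _values[module] = inner;
            }
            inner[key] = value;
        }
    }

    // Retrieves the string found by the two keys.
    public bool TryGet(string module, string key, out string value)
    {
        lock (_lock)
        {
            Dictionary<string, string> inner;
            if (_values.TryGetValue(module, out inner))
            {
                return inner.TryGetValue(key, out value);
            }
            value = null;
            return false;
        }
    }

    // Returns a copy of the second-level keys, so callers never enumerate shared state.
    public List<string> GetKeys(string module)
    {
        lock (_lock)
        {
            Dictionary<string, string> inner;
            return _values.TryGetValue(module, out inner)
                ? new List<string>(inner.Keys)
                : new List<string>();
        }
    }

    public bool RemoveModule(string module)
    {
        lock (_lock)
        {
            return _values.Remove(module);
        }
    }
}
```

Because no inner dictionary ever leaves the class, all locking lives in one place, and the storage could later be swapped for a thread-safe dictionary without touching callers.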
If you don't return a copy, the caller will be able to change the dictionary but this is not a thread safety issue.
There is also a thread safety issue because you don't expose any lock to synchronize writes and reads. For instance, your writer thread can add or remove a value while the reader thread is working on the same instance.
Yes, you have to return a copy.
I have been learning about locking on threads and I have not found an explanation for why creating a typical System.Object, locking it and carrying out whatever actions are required during the lock provides the thread safety?
Example
object obj = new object();
lock (obj) {
//code here
}
At first I thought that it was just being used as a placeholder in examples and was meant to be swapped out with the Type you are dealing with. But the examples I find, such as the one Dennis Phillips points out, don't appear to use anything other than an actual instance of Object.
So taking an example of needing to update a private dictionary, what does locking an instance of System.Object do to provide thread safety as opposed to actually locking the dictionary (I know locking the dictionary in this case could cause synchronization issues)?
What if the dictionary was public?
//what if this was public?
private Dictionary<string, string> someDict = new Dictionary<string, string>();
var obj = new Object();
lock (obj) {
//do something with the dictionary
}
The lock itself provides no safety whatsoever for the Dictionary<TKey, TValue> type. What a lock does is essentially this:
For every use of lock(objInstance), only one thread at a time will ever be in the body of the lock statement for a given object (objInstance).
If every use of a given Dictionary<TKey, TValue> instance occurs inside a lock, and every one of those locks uses the same object, then you know that only one thread at a time is ever accessing / modifying the dictionary. This is critical to preventing multiple threads from reading and writing to it at the same time and corrupting its internal state.
There is one giant problem with this approach though: You have to make sure every use of the dictionary occurs inside a lock and it uses the same object. If you forget even one then you've created a potential race condition, there will be no compiler warnings and likely the bug will remain undiscovered for some time.
In the second sample you showed, you're using a local object instance (var indicates a method local) as the lock for an object field. This is almost certainly the wrong thing to do. The local lives only for the lifetime of the method call. Hence two calls to the method will lock on different locals, and all calls will be able to enter the lock simultaneously.
It used to be common practice to lock on the shared data itself:
private Dictionary<string, string> someDict = new Dictionary<string, string>();
lock (someDict )
{
//do something with the dictionary
}
But the (somewhat theoretical) objection is that other code, outside of your control, could also lock on someDict and then you might have a deadlock.
So it is recommended to use a (very) private object, declared in 1-to-1 correspondence with the data, as a stand-in for the lock. As long as all code that accesses the dictionary locks on obj, thread safety is guaranteed.
// the following 2 lines belong together!!
private Dictionary<string, string> someDict = new Dictionary<string, string>();
private object obj = new Object();
// multiple code segments like this
lock (obj)
{
//do something with the dictionary
}
So the purpose of obj is to act as a proxy for the dictionary, and since its Type doesn't matter we use the simplest type, System.Object.
What if the dictionary was public?
Then all bets are off: any code could access the Dictionary, and code outside the containing class is not even able to lock on the guard object. And before you start looking for fixes, that simply is not a sustainable pattern. Use a ConcurrentDictionary or keep a normal one private.
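A minimal sketch of the ConcurrentDictionary option, for the case where the dictionary really must be public. The class name SharedSettings and the "default-for-" value are hypothetical placeholders.

```csharp
using System.Collections.Concurrent;

public static class SharedSettings
{
    // Safe to expose: each individual operation on a ConcurrentDictionary is thread-safe.
    public static readonly ConcurrentDictionary<string, string> Values =
        new ConcurrentDictionary<string, string>();

    public static string GetOrCreate(string key)
    {
        // The value factory may run more than once under contention,
        // but only one result is ever stored and returned to all callers.
        return Values.GetOrAdd(key, k => "default-for-" + k);
    }
}
```

Callers can use Values.TryAdd, Values.TryRemove and Values.AddOrUpdate without any external lock. Sequences of several calls are still not atomic as a group, so check-then-act logic should go through a single method such as GetOrAdd.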
The object used for locking does not need to have any relation to the objects that are modified while the lock is held. It could be anything, but it should be private and not a string: public objects could be locked by external code, and strings could end up shared by two locks by mistake.
As far as I understand it, the use of a generic object is simply to have something to lock (as an internally lockable object). To better explain this: say you have two methods within a class that both access the Dictionary but may be running on different threads. To prevent both methods from modifying the Dictionary at the same time (and potentially corrupting it), you can lock some object to control the flow. This is better illustrated by the following example:
private readonly object mLock = new object();
public void FirstMethod()
{
while (/* Running some operations */)
{
// Get the lock
lock (mLock)
{
// Add to the dictionary
mSomeDictionary.Add("Key", "Value");
}
}
}
public void SecondMethod()
{
while (/* Running some operation */)
{
// Get the lock
lock (mLock)
{
// Remove from dictionary
mSomeDictionary.Remove("Key");
}
}
}
The use of the lock(...) statement in both methods on the same object prevents the two methods from accessing the resource at the same time.
The important rules for the object you lock on are:
It must be an object visible only to the code that needs to lock on it. This avoids other code also locking on it.
This rules out strings that could be interned, and Type objects.
This rules out this in most cases; the exceptions are too few and offer too little benefit to be worth exploiting, so just don't use this.
Note also that some code internal to the framework locks on Types and this, so while "it's okay as long as nobody else does it" is true, it's already too late.
It must be static to protect static operations; it may be instance to protect instance operations (including those internal to an instance that is held in a static).
You don't want to lock on a value-type. If you really wanted to you could lock on a particular boxing of it, but I can't think of anything that this would gain beyond proving that it's technically possible - it's still going to lead to the code being less clear as to just what locks on what.
You don't want to lock on a field that you may change during the lock being held, as you'll no longer have the lock on what you appear to have the lock on (it's just about plausible that there's a practical use for the effect of this, but there's going to be an impedance between what the code appears to do at first read and what it really does, which is never good).
The same object must be used to lock on all operations that may conflict with each other.
While you can have correctness with overly-broad locks, you can get better performance with finer ones. E.g. if you had a lock that was protecting 6 operations, and realised that 2 of those operations couldn't interfere with the other 4, so you changed to having 2 lock objects, then you can gain by having better concurrency (or crash-and-burn if you were wrong in that analysis!)
The first point rules out locking on anything that is either visible or which could be made visible (e.g. a private instance that is returned by a protected or public member should be considered public as far as this analysis goes, anything captured by a delegate could end up elsewhere, and so on).
The last two points can mean that there's no obvious "type you are dealing with" as you put it, because locks don't protect objects, they protect operations done on objects, and you may either have more than one object affected, or the same object affected by more than one group of operations that must be locked.
Hence it can be good practice to have an object that exists purely to lock on. Since it's doing nothing else, it can't get mixed up with other semantics or written over when you don't expect. And since it does nothing else it may as well be the lightest reference type that exists in .NET; System.Object.
Personally, I do prefer to lock on an object related to an operation when it does clearly fit the bill of the "type you are dealing with", and none of the other concerns apply, as it seems to me to be quite self-documenting, but to others the risk of doing it wrong outweighs that benefit.
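To illustrate the point about finer-grained locks, here is a hedged sketch with two independent groups of operations, each guarded by its own dedicated lock object. The class Registry and its members are hypothetical, and the sketch assumes the two groups really never touch each other's state.

```csharp
using System.Collections.Generic;

public sealed class Registry
{
    // One lock object per group of operations that can conflict with each other.
    private readonly object _usersLock = new object();
    private readonly object _statsLock = new object();

    private readonly Dictionary<string, string> _users = new Dictionary<string, string>();
    private long _requestCount;

    // Group 1: everything that touches _users locks on _usersLock.
    public void AddUser(string id, string name)
    {
        lock (_usersLock) { _users[id] = name; }
    }

    public bool TryGetUser(string id, out string name)
    {
        lock (_usersLock) { return _users.TryGetValue(id, out name); }
    }

    // Group 2: everything that touches _requestCount locks on _statsLock,
    // so counting a request never waits for a user lookup.
    public void CountRequest()
    {
        lock (_statsLock) { _requestCount++; }
    }

    public long RequestCount
    {
        get { lock (_statsLock) { return _requestCount; } }
    }
}
```

If an operation ever needs both pieces of state, this split stops being safe without further care (for example, a fixed lock order), which is the "crash-and-burn" risk mentioned above.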
I have a generic dictionary in a multithreaded application; to implement a lock I created a property.
static object myLock=new object();
Dictionary<key,SomeRef> dict=new Dictionary<key,SomeRef>();
public Dictionary<key,SomeRef> MyDict{
get{
lock(myLock){
return dict;
}
}
}
Now if I write CODE#1
MyDict.TryGetValue
or CODE#2
var result=MyDict.Values;
foreach(var item in result){
//read value into some other variable
}
So while I'm running code 1 or 2, if some other thread at the same time tries to perform a write operation on the dictionary (such as clearing it or adding a new item), will this solution (using a property) be thread safe?
If not, is there any other way to do this?
By write operation I mean something like taking a reference to the dict through the property, checking whether a key exists and, if not, creating the key and assigning a value (which is why I'm not using a setter on the property).
No, this will not be threadsafe.
The lock will only lock around getting the reference to your internal (dict) instance of the dictionary. It will not lock when the user tries to add to the dictionary, or read from the dictionary.
If you need to provide threadsafe access, I would recommend keeping the dictionary private, and make your own methods for getting/setting/adding values to/from the dictionary. This way, you can put the locks in place to protect at the granularity you need.
This will look something like this:
public bool TryGetValue(key thekey, out SomeRef result)
{
    // the dictionary is only ever touched while holding myLock
    lock (myLock) { return this.dict.TryGetValue(thekey, out result); }
}
public void Add(key thekey, SomeRef value)
{
    lock (myLock) { this.dict.Add(thekey, value); }
}
// etc for each method you need to implement...
The idea here is that your clients use your class directly, and your class handles the synchronization. If you expect them to iterate over the values (such as your foreach statement), you can decide whether to copy the values into a List and return that, or provide an enumerator directly (IEnumerator<SomeRef> GetValues()), etc.
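For the iteration case, the "copy the values into a List" option could be sketched like this. SynchronizedDictionary and GetValuesSnapshot are hypothetical names, and the sketch assumes callers are happy to iterate over a point-in-time copy.

```csharp
using System.Collections.Generic;

public sealed class SynchronizedDictionary<TKey, TValue>
{
    private readonly object _lock = new object();
    private readonly Dictionary<TKey, TValue> _dict = new Dictionary<TKey, TValue>();

    public void Add(TKey key, TValue value)
    {
        lock (_lock) { _dict.Add(key, value); }
    }

    // Copies the values while holding the lock; the caller can then run
    // foreach over the returned list without any further synchronization.
    public List<TValue> GetValuesSnapshot()
    {
        lock (_lock) { return new List<TValue>(_dict.Values); }
    }
}
```

The snapshot does not reflect later writes, which is usually the desired behaviour for a foreach: the loop can never fail because another thread modified the dictionary mid-enumeration.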
No, this will not be safe, as the only code that's locked is the retrieval code. What you need to do is
lock(MyDict)
{
if(MyDict.TryGetValue()...
}
and
lock(MyDict)
{
foreach(var item in MyDict.Values) ...
}
The basic idea is to enclose your working code within the lock() block.
The implementation is not guaranteed to be thread safe as it is. In order to be thread safe, concurrent reads/writes must all be protected by the lock. By handing out a reference to your internal dictionary, you're making it very hard to control who accesses the resource, and thus you have no guarantee that the caller will use the same lock.
A good approach is to make sure that whatever resources you're trying to synchronize access to are completely encapsulated in your type. That will make it much easier to understand and reason about the thread safety of the type.
Thread Safe Dictionary in .NET with ReaderWriterLockSlim
This is a method that uses ReaderWriterLockSlim and deterministic finalization to hold and release locks.
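The linked implementation is not reproduced here. As a rough sketch of the general idea only, the following wraps ReaderWriterLockSlim in disposable scopes so that a using block releases the lock deterministically. All names (RwDictionary, Releaser, ReadScope, WriteScope) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class RwDictionary<TKey, TValue>
{
    private readonly ReaderWriterLockSlim _rw = new ReaderWriterLockSlim();
    private readonly Dictionary<TKey, TValue> _dict = new Dictionary<TKey, TValue>();

    // Runs the given release action exactly when the using block ends.
    private sealed class Releaser : IDisposable
    {
        private readonly Action _release;
        public Releaser(Action release) { _release = release; }
        public void Dispose() { _release(); }
    }

    private IDisposable ReadScope()
    {
        _rw.EnterReadLock();
        return new Releaser(_rw.ExitReadLock);
    }

    private IDisposable WriteScope()
    {
        _rw.EnterWriteLock();
        return new Releaser(_rw.ExitWriteLock);
    }

    // Multiple readers may run concurrently.
    public bool TryGetValue(TKey key, out TValue value)
    {
        using (ReadScope()) { return _dict.TryGetValue(key, out value); }
    }

    // Writers get exclusive access.
    public void Set(TKey key, TValue value)
    {
        using (WriteScope()) { _dict[key] = value; }
    }
}
```

This version allocates a small Releaser object per lock acquisition, which keeps the sketch short; a struct-based scope would avoid that at the cost of a little more code.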