Thread-Safe lazy instantiating using MEF

// Member Variable
private static readonly object _syncLock = new object();
// Now inside a static method
foreach (var lazyObject in plugins)
{
    if ((string)lazyObject.Metadata["key"] = "something")
    {
        lock (_syncLock)
        {
            // It seems the `IsValueCreated` is not up-to-date
            if (!lazyObject.IsValueCreated)
                lazyObject.Value.DoSomething();
        }
        return lazyObject.Value;
    }
}
Here I need synchronized access per loop iteration. Many threads iterate this loop and, based on the key they are looking for, a lazy instance is created and returned.
lazyObject should not be created more than once. Although the Lazy class exists for exactly this purpose, and despite the lock, under heavy threading more than one instance is created (I track this with an Interlocked.Increment on a volatile static int and log it somewhere). The problem is that I don't have access to the definition of Lazy; MEF decides how the Lazy class creates objects. I should note that the CompositionContainer has a thread-safe option in its constructor, which is already used.
My questions:
1) Why doesn't the lock work?
2) Should I use an array of locks instead of one lock to improve performance?

Is the default constructor of T in your Lazy complex? MEF uses LazyThreadSafetyMode.PublicationOnly, which means each thread accessing the uninitialised Lazy will call new() on T until the first one completes the initialisation. That value is then returned to all threads currently accessing .Value, and their own new() instances are discarded. If your constructor is complex (perhaps doing too much?), you should redefine it to do minimal construction work and move configuration to another method.
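To see the difference between the two thread-safety modes, here is a minimal sketch using a plain Lazy<T> rather than MEF. Widget, LazyModeDemo and CountConstructions are hypothetical names invented for this illustration, and the Thread.Sleep only simulates a slow constructor.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class Widget
{
    public static int Created;          // counts constructor calls (illustration only)

    public Widget()
    {
        Interlocked.Increment(ref Created);
        Thread.Sleep(50);               // simulate a slow constructor to widen the race window
    }
}

public static class LazyModeDemo
{
    // Returns how many Widget instances were constructed while
    // several threads read lazy.Value at the same time.
    public static int CountConstructions(LazyThreadSafetyMode mode)
    {
        Widget.Created = 0;
        var lazy = new Lazy<Widget>(() => new Widget(), mode);
        Parallel.For(0, 8, _ => { var w = lazy.Value; });
        return Widget.Created;
    }

    public static void Main()
    {
        // PublicationOnly: several constructors may run; only one result is published.
        Console.WriteLine(CountConstructions(LazyThreadSafetyMode.PublicationOnly));
        // ExecutionAndPublication: the factory runs exactly once.
        Console.WriteLine(CountConstructions(LazyThreadSafetyMode.ExecutionAndPublication));
    }
}
```

With PublicationOnly the first number can be greater than one, which matches the extra instances the question reports. All threads still observe the same published Value.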
You need to think about the method as a whole. Should you consider:
public IPlugin GetPlugin(string key)
{
    mutex.WaitOne();
    try
    {
        var plugin = plugins
            .Where(l => (string)l.Metadata["key"] == key)
            .Select(l => l.Value)
            .FirstOrDefault();
        return plugin;
    }
    finally
    {
        mutex.ReleaseMutex();
    }
}
You also need to consider that if plugins is not read-only then you need to synchronise access to that instance too, otherwise it may be modified on another thread, causing your code to fall over.

There is a specific constructor of Lazy<T, TMetadata> for such scenarios, where you define a LazyThreadSafetyMode when constructing a Lazy instance... Otherwise, the lock might not work for many different reasons, e.g. if this is not the only place where the Value property of this Lazy<T> instance is ever accessed.
By the way, you have a typo in the if statement: = should be ==.

Related

What magic does locking an instance of System.Object provide, compared with locking an instance of a specific type?

I have been learning about locking on threads and I have not found an explanation for why creating a typical System.Object, locking it and carrying out whatever actions are required during the lock provides the thread safety?
Example
object obj = new object();
lock (obj) {
    //code here
}
At first I thought that it was just being used as a placeholder in examples and was meant to be swapped out with the type you are dealing with. But the examples I find, such as the one Dennis Phillips points out, don't appear to do anything other than lock an actual instance of Object.
So taking an example of needing to update a private dictionary, what does locking an instance of System.Object do to provide thread safety, as opposed to actually locking the dictionary (I know locking the dictionary in this case could cause synchronization issues)?
What if the dictionary was public?
//what if this was public?
private Dictionary<string, string> someDict = new Dictionary<string, string>();
var obj = new Object();
lock (obj) {
    //do something with the dictionary
}

The lock itself provides no safety whatsoever for the Dictionary<TKey, TValue> type. What a lock does is essentially this:
For every use of lock(objInstance), only one thread at a time will ever be in the body of the lock statement for a given object (objInstance).
If every use of a given Dictionary<TKey, TValue> instance occurs inside a lock, and every one of those locks uses the same object, then you know that only one thread at a time is ever accessing or modifying the dictionary. This is critical to preventing multiple threads from reading and writing to it at the same time and corrupting its internal state.
There is one giant problem with this approach though: you have to make sure every use of the dictionary occurs inside a lock and that every lock uses the same object. If you forget even one, you've created a potential race condition; there will be no compiler warnings, and the bug will likely remain undiscovered for some time.
In the second sample you showed, you're using a local object instance (var indicates a method local) as the lock parameter for an object field. This is almost certainly the wrong thing to do. The local lives only for the duration of the method call. Hence two calls to the method will lock on different locals, and all callers will be able to enter the lock simultaneously.
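The point about local lock objects can be sketched as follows. Counter and its members are hypothetical names for this illustration; the only difference between the two methods is whether the lock object is shared.

```csharp
using System;
using System.Threading.Tasks;

public class Counter
{
    private readonly object _sync = new object();   // one lock object shared by every call
    private int _value;

    public void IncrementBroken()
    {
        var local = new object();    // a fresh object per call, so no two calls ever contend
        lock (local) { _value++; }
    }

    public void IncrementSafe()
    {
        lock (_sync) { _value++; }   // every call contends on the same object
    }

    public int Value
    {
        get { lock (_sync) { return _value; } }
    }
}

public static class CounterDemo
{
    public static void Main()
    {
        var safe = new Counter();
        Parallel.For(0, 100000, _ => safe.IncrementSafe());
        Console.WriteLine(safe.Value);      // all 100000 increments are counted

        var broken = new Counter();
        Parallel.For(0, 100000, _ => broken.IncrementBroken());
        Console.WriteLine(broken.Value);    // may be lower, because increments can be lost
    }
}
```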

It used to be common practice to lock on the shared data itself:
private Dictionary<string, string> someDict = new Dictionary<string, string>();
lock (someDict)
{
    //do something with the dictionary
}
But the (somewhat theoretical) objection is that other code, outside of your control, could also lock on someDict and then you might have a deadlock.
So it is recommended to use a (very) private object, declared in 1-to-1 correspondence with the data, as a stand-in for the lock. As long as all code that accesses the dictionary locks on obj, thread safety is guaranteed.
// the following 2 lines belong together!!
private Dictionary<string, string> someDict = new Dictionary<string, string>();
private object obj = new Object();
// multiple code segments like this
lock (obj)
{
    //do something with the dictionary
}
So the purpose of obj is to act as a proxy for the dictionary, and since its Type doesn't matter we use the simplest type, System.Object.
What if the dictionary was public?
Then all bets are off: any code could access the Dictionary, and code outside the containing class is not even able to lock on the guard object. And before you start looking for fixes, that simply is not a sustainable pattern. Use a ConcurrentDictionary or keep a normal one private.
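As a rough illustration of the ConcurrentDictionary suggestion, here is a minimal sketch. Registry and its members are hypothetical names; the dictionary stays private and only single, atomic operations are exposed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public class Registry
{
    // Private and thread-safe: single operations need no external lock object.
    private readonly ConcurrentDictionary<string, string> _items =
        new ConcurrentDictionary<string, string>();

    public void Set(string key, string value)
    {
        _items[key] = value;    // atomic add-or-replace
    }

    public bool TryGet(string key, out string value)
    {
        return _items.TryGetValue(key, out value);
    }

    public int Count
    {
        get { return _items.Count; }
    }
}

public static class RegistryDemo
{
    public static void Main()
    {
        var registry = new Registry();
        // 1000 distinct keys written from many threads at once.
        Parallel.For(0, 1000, i => registry.Set("key" + i, "value" + i));
        Console.WriteLine(registry.Count);
    }
}
```

Note that only individual calls are atomic. A sequence such as "check, then update" still needs either a lock or one of the combined methods like GetOrAdd or AddOrUpdate.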

The object used for locking has no inherent relation to the objects that are modified while the lock is held. It could be anything, but it should be private and should not be a string, as public objects could be locked by external code and the same string could end up being used by two unrelated locks by mistake.

So far as I understand it, the use of a generic object is simply to have something to lock on (an internally lockable object). To better explain this: say you have two methods within a class, both of which access the Dictionary but may be running on different threads. To prevent both methods from modifying the Dictionary at the same time (and potentially corrupting it), you can lock on some object to control the flow. This is better illustrated by the following example:
private readonly object mLock = new object();
private readonly Dictionary<string, string> mSomeDictionary = new Dictionary<string, string>();
public void FirstMethod()
{
    while (/* Running some operations */)
    {
        // Get the lock
        lock (mLock)
        {
            // Add to the dictionary
            mSomeDictionary.Add("Key", "Value");
        }
    }
}
public void SecondMethod()
{
    while (/* Running some operation */)
    {
        // Get the lock
        lock (mLock)
        {
            // Remove from dictionary
            mSomeDictionary.Remove("Key");
        }
    }
}
The use of the lock(...) statement in both methods on the same object prevents the two methods from accessing the resource at the same time.

The important rules for the object you lock on are:
It must be an object visible only to the code that needs to lock on it. This avoids other code also locking on it.
This rules out strings that could be interned, and Type objects.
This rules out this in most cases; the exceptions are too few and offer too little benefit to be worth exploiting, so just don't use this.
Note also that some code internal to the framework locks on Types and on this, so while "it's okay as long as nobody else does it" is true, it's already too late.
It must be static to protect static operations; it may be an instance field to protect instance operations (including those internal to an instance that is held in a static).
You don't want to lock on a value type. If you really wanted to, you could lock on a particular boxing of it, but I can't think of anything that this would gain beyond proving that it's technically possible. It's still going to make the code less clear as to just what locks on what.
You don't want to lock on a field that you may change during the lock being held, as you'll no longer have the lock on what you appear to have the lock on (it's just about plausible that there's a practical use for the effect of this, but there's going to be an impedance between what the code appears to do at first read and what it really does, which is never good).
The same object must be used to lock on all operations that may conflict with each other.
While you can have correctness with overly broad locks, you can get better performance with finer ones. E.g. if you had a lock that was protecting 6 operations, and realised that 2 of those operations couldn't interfere with the other 4, so you changed to having 2 lock objects, then you can gain by having better concurrency (or crash and burn if you were wrong in that analysis!).
The first point rules out locking on anything that is either visible or which could be made visible (e.g. a private instance that is returned by a protected or public member should be considered public as far as this analysis goes, anything captured by a delegate could end up elsewhere, and so on).
The last two points can mean that there's no obvious "type you are dealing with", as you put it, because locks don't protect objects; they protect operations done on objects, and you may have either more than one object affected, or the same object affected by more than one group of operations that must be locked.
Hence it can be good practice to have an object that exists purely to lock on. Since it's doing nothing else, it can't get mixed up with other semantics or written over when you don't expect. And since it does nothing else it may as well be the lightest reference type that exists in .NET; System.Object.
Personally, I do prefer to lock on an object related to an operation when it does clearly fit the bill of the "type you are dealing with", and none of the other concerns apply, as it seems to me to be quite self-documenting, but to others the risk of doing it wrong out-weighs that benefit.
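A few of the rules above can be sketched in code: a static lock object for static state, a per-instance lock object for instance state, and lock objects that are private, readonly and never exposed. Session and its members are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;

public class Session
{
    // Static state is guarded by a static lock object.
    private static readonly object StaticSync = new object();
    private static int _liveSessions;

    // Instance state is guarded by a per-instance lock object.
    // readonly ensures the field cannot be reassigned while a lock is held.
    private readonly object _sync = new object();
    private readonly List<string> _messages = new List<string>();

    public Session()
    {
        lock (StaticSync) { _liveSessions++; }
    }

    public static int LiveSessions
    {
        get { lock (StaticSync) { return _liveSessions; } }
    }

    public void AddMessage(string message)
    {
        lock (_sync) { _messages.Add(message); }
    }

    public string[] Snapshot()
    {
        // Return a copy so callers never touch the guarded list directly.
        lock (_sync) { return _messages.ToArray(); }
    }
}
```

Two Session instances never block each other in AddMessage, because each has its own _sync, while the shared counter is still serialised through StaticSync.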

How to implement a thread-safe cache mechanism when working with collections?

Scenario:
I have a bunch of Child objects, all related to a given Parent.
I'm working on an ASP.NET MVC 3 Web Application (i.e. multi-threaded).
One of the pages is a "search" page where I need to grab a given subset of children and "do stuff" to them in memory (calculations, ordering, enumeration).
Instead of getting each child in a separate call, I do one call to the database to get all children for a given parent, cache the result and "do stuff" on the result.
The problem is that the "do stuff" involves LINQ operations (enumeration, adding/removing items from the collection), which is not thread-safe when implemented using List<T>.
I have read about ConcurrentBag<T> and ConcurrentDictionary<TKey, TValue> but I'm not sure if I should use one of these, or implement the synchronization/locking myself.
I'm on .NET 4, so I'm utilizing ObjectCache as a singleton instance of MemoryCache.Default. I have a service layer which works with the cache, and the services accept an instance of ObjectCache, which is done via constructor DI. This way all the services share the same ObjectCache instance.
The main thread-safety issue is that I need to loop through the current "cached" collection, and if the child I'm working with is already there, I need to remove it and add the one I'm working with back in. This enumeration is what causes the issues.
Any recommendations?

Yes, to implement a cache efficiently, it needs a fast lookup mechanism and so List<T> is the wrong data structure out-of-the-box. A Dictionary<TKey, TValue> is an ideal data structure for a cache because it provides a way to replace:
var value = instance.GetValueExpensive(key);
with:
var value = instance.GetValueCached(key);
by using cached values in a dictionary and using the dictionary to do the heavy lifting for lookup. The caller is none the wiser.
But, if the callers could be calling from multiple threads then .NET4 provides ConcurrentDictionary<TKey, TValue> that works perfectly in this situation. But what does the dictionary cache? It seems like in your situation the dictionary key is the child and the dictionary values are the database results for that child.
OK, so now we have a thread-safe and efficient cache of database results keyed by child. What data structure should we use for the database results?
You haven't said what those results look like but since you use LINQ we know that they are at least IEnumerable<T> and maybe even List<T>. So we're back to the same problem, right? Because List<T> is not thread-safe, we can't use it for the dictionary value. Or can we?
A cache must be read-only from the point-of-view of the caller. You say with LINQ you "do stuff" like add/remove, but that makes no sense for a cached value. It only makes sense to do stuff in the implementation of the cache itself such as replacing a stale entry with new results.
The dictionary value, because it is read-only, can be a List<T> with no ill effects, even though it will be accessed from multiple threads. You can use List<T>.AsReadOnly to improve your confidence and add some compile-time safety checking.
But the important point is that List<T> is only not thread-safe if it is mutable. Since by definition a method implemented using a cache must return the same value if called multiple times (until the value is invalidated by the cache itself), the client cannot modify the returned value and so the List<T> must be frozen, effectively immutable.
If the client desperately needs to modify the cached database results and the value is a List<T>, then the only safe way to do so is to:
Make a copy
Make changes to the copy
Ask the cache to update the value
In summary, use a thread-safe dictionary for the top-level cache and an ordinary list for the cached value, being careful never to modify the contents of the latter after inserting it into the cache.
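The copy, change, then replace steps can be sketched as below. ChildCache, the int parent key and the string children are all hypothetical stand-ins for the asker's types; the load delegate represents the database call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public class ChildCache
{
    // Top level: thread-safe dictionary. Values: read-only snapshots that are never mutated.
    private readonly ConcurrentDictionary<int, ReadOnlyCollection<string>> _byParent =
        new ConcurrentDictionary<int, ReadOnlyCollection<string>>();

    public ReadOnlyCollection<string> Get(int parentId, Func<int, IEnumerable<string>> load)
    {
        // Note: under contention the factory may run more than once; only one result is kept.
        return _byParent.GetOrAdd(parentId, id => load(id).ToList().AsReadOnly());
    }

    public void ReplaceChild(int parentId, string oldChild, string newChild)
    {
        _byParent.AddOrUpdate(
            parentId,
            id => new List<string> { newChild }.AsReadOnly(),
            (id, current) =>
            {
                var copy = current.Where(c => c != oldChild).ToList();  // 1. make a copy
                copy.Add(newChild);                                     // 2. change the copy
                return copy.AsReadOnly();                               // 3. publish the new value
            });
    }
}
```

Readers that already hold the old snapshot keep enumerating it safely, since it is never modified in place.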

Try modifying RefreshFooCache like so:
public ReadOnlyCollection<Foo> RefreshFooCache(Foo parentFoo)
{
    List<Foo> parentFooIncludingBelow;
    // Prevent other writers but allow read operations to proceed
    Lock.EnterUpgradeableReadLock();
    try
    {
        // Recheck your cache: it may have already been updated by a previous thread before we gained the lock
        if (_cache.Get(parentFoo.Id.ToString()) != null)
        {
            return parentFoo.FindFoosUnderneath(parentFoo.UniqueUri).AsReadOnly();
        }
        // Get the parent and everything below.
        parentFooIncludingBelow = _repo.FindFoosIncludingBelow(parentFoo.UniqueUri).ToList();
        // Now prevent both other writers and other readers
        Lock.EnterWriteLock();
        try
        {
            // Remove the cache
            _cache.Remove(parentFoo.Id.ToString());
            // Add the cache.
            _cache.Add(parentFoo.Id.ToString(), parentFooIncludingBelow);
        }
        finally
        {
            Lock.ExitWriteLock();
        }
    }
    finally
    {
        Lock.ExitUpgradeableReadLock();
    }
    return parentFooIncludingBelow.AsReadOnly();
}

EDIT - updated to use ReadOnlyCollection instead of ConcurrentDictionary
Here's what I currently have implemented. I ran some load tests and didn't see any errors. What do you think of this implementation?
public class FooService
{
    private static ReaderWriterLockSlim _lock;
    private static readonly object SyncLock = new object();
    private static ReaderWriterLockSlim Lock
    {
        get
        {
            if (_lock == null)
            {
                lock (SyncLock)
                {
                    if (_lock == null)
                        _lock = new ReaderWriterLockSlim();
                }
            }
            return _lock;
        }
    }
    public ReadOnlyCollection<Foo> RefreshFooCache(Foo parentFoo)
    {
        // Get the parent and everything below.
        var parentFooIncludingBelow = repo.FindFoosIncludingBelow(parentFoo.UniqueUri).ToList().AsReadOnly();
        Lock.EnterWriteLock();
        try
        {
            // Remove the cache
            _cache.Remove(parentFoo.Id.ToString());
            // Add the cache.
            _cache.Add(parentFoo.Id.ToString(), parentFooIncludingBelow);
        }
        finally
        {
            Lock.ExitWriteLock();
        }
        return parentFooIncludingBelow;
    }
    public ReadOnlyCollection<Foo> FindFoo(string uniqueUri)
    {
        var parentIdForFoo = uniqueUri.GetParentId();
        ReadOnlyCollection<Foo> results = null;
        Lock.EnterReadLock();
        try
        {
            var cachedFoo = _cache.Get(parentIdForFoo);
            if (cachedFoo != null)
                results = cachedFoo.FindFoosUnderneath(uniqueUri).ToList().AsReadOnly();
        }
        finally
        {
            Lock.ExitReadLock();
        }
        if (results == null)
        {
            // parentFoo is not defined in the post; presumably it is resolved from parentIdForFoo.
            results = RefreshFooCache(parentFoo).FindFoosUnderneath(uniqueUri).ToList().AsReadOnly();
        }
        return results;
    }
}
Summary:
Each Controller (i.e. each HTTP request) is given a new instance of FooService.
FooService has a static lock, so all the HTTP requests share the same instance.
I've used ReaderWriterLockSlim to ensure multiple threads can read, but only one thread can write. I'm hoping this should avoid "read" deadlocks. If two threads happen to need to "write" at the same time, then of course one will have to wait. But the goal here is to serve the reads quickly.
The goal is to be able to find a "Foo" from the cache as long as something above it has been retrieved.
Any tips/suggestions/problems with this approach? I'm not sure if I'm locking the right areas. But because I need to run LINQ on the dictionary, I figured I need a read lock.
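For comparison, here is a stripped-down sketch of the same read-mostly idea. ReadMostlyCache and its string keys and values are hypothetical. It differs from the code above in two ways: the lock is created by a static readonly initializer instead of hand-written double-checked locking, and each lock is entered before the try block so the finally never releases a lock that was not acquired.

```csharp
using System.Collections.Generic;
using System.Threading;

public class ReadMostlyCache
{
    // The CLR runs static initializers once, in a thread-safe way.
    private static readonly ReaderWriterLockSlim Lock = new ReaderWriterLockSlim();
    private static readonly Dictionary<string, string> Items = new Dictionary<string, string>();

    public bool TryGet(string key, out string value)
    {
        Lock.EnterReadLock();           // many readers may hold this at once
        try
        {
            return Items.TryGetValue(key, out value);
        }
        finally
        {
            Lock.ExitReadLock();
        }
    }

    public void Set(string key, string value)
    {
        Lock.EnterWriteLock();          // exclusive: waits for readers and other writers
        try
        {
            Items[key] = value;         // the indexer replaces an existing entry, so no Remove is needed
        }
        finally
        {
            Lock.ExitWriteLock();
        }
    }
}
```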

locking the object inside a property, c#

public ArrayList InputBuffer
{
    get { lock (this.in_buffer) { return this.in_buffer; } }
}
Is this.in_buffer locked during a call to InputBuffer.Clear?
Or does the property simply lock the in_buffer object while it's getting the reference to it, after which the lock exits and that reference is then used to call Clear?

No, the property locks the reference while it's getting that reference. Pretty pointless, to be honest... this is more common:
private readonly object mutex = new object();
private Foo foo = ...;
public Foo Foo
{
    get
    {
        lock (mutex)
        {
            return foo;
        }
    }
}
That lock would only cover the property access itself, and wouldn't provide any protection for operations performed with the Foo. However, it's not the same as not having the lock at all, because so long as the variable is only written while holding the same lock, it ensures that any time you read the Foo property, you're accessing the most recent value of the property... without the lock, there's no memory barrier and you could get a "stale" result.
This is pretty weak, but worth knowing about.
Personally I try to make very few types thread-safe, and those tend to have more appropriate operations... but if you wanted to write code which did modify and read properties from multiple threads, this is one way of doing so. Using volatile can help too, but the semantics of it are hideously subtle.

The object is locked inside the braces of the lock call, and then it is unlocked.
In this case the only code in the lock call is return this.in_buffer;.
So in this case, the in_buffer is not locked during a call to InputBuffer.Clear.
One solution to your problem, using extension methods, is as follows.
static class EMClass
{
    private static readonly object _bufLock = new object();
    public static void LockedClear(this ArrayList a)
    {
        lock (_bufLock)
        {
            a.Clear();
        }
    }
}
Now when you do:
a.LockedClear();
The Clear call will be done in a lock.
You must ensure that the buffer is only ever accessed while holding _bufLock.

In addition to what others have said about the scope of the lock, remember that you aren't locking the object, you are only locking based on the object instance named.
Common practice is to have a separate lock mutex as Jon Skeet exemplifies.
If you must guarantee synchronized execution while the collection is being cleared, expose a method that clears the collection, have clients call that, and don't expose your underlying implementation details. (Which is good practice anyway - look up encapsulation.)
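The encapsulation advice might look like the sketch below. InputChannel and its members are hypothetical names; the buffer itself is never handed out, so every access goes through a method that takes the lock.

```csharp
using System.Collections;

public class InputChannel
{
    private readonly object _sync = new object();
    private readonly ArrayList _buffer = new ArrayList();   // never exposed to callers

    public void Add(object item)
    {
        lock (_sync) { _buffer.Add(item); }
    }

    public void Clear()
    {
        // The lock is held for the whole Clear call, not just while fetching a reference.
        lock (_sync) { _buffer.Clear(); }
    }

    public object[] ToArray()
    {
        // Callers get a copy, so they can enumerate it without holding the lock.
        lock (_sync) { return _buffer.ToArray(); }
    }
}
```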

Static methods updating a Dictionary<T,U> in ASP.NET - is it safe to lock() on the dictionary itself?

I have a class that maintains a static dictionary of cached lookup results from my domain controller - users' given names and e-mails.
My code looks something like:
private static Dictionary<string, string> emailCache = new Dictionary<string, string>();
protected string GetUserEmail(string accountName)
{
    if (emailCache.ContainsKey(accountName))
    {
        return(emailCache[accountName]);
    }
    lock(/* something */)
    {
        if (emailCache.ContainsKey(accountName))
        {
            return(emailCache[accountName]);
        }
        var email = GetEmailFromActiveDirectory(accountName);
        emailCache.Add(accountName, email);
        return(email);
    }
}
Is the lock required? I assume so since multiple requests could be performing lookups simultaneously and end up trying to insert the same key into the same static dictionary.
If the lock is required, do I need to create a dedicated static object instance to use as the lock token, or is it safe to use the actual dictionary instance as the lock token?

The standard collections in .NET are not thread-safe, so the lock is indeed required. As an alternative to Dictionary, you could use the concurrent dictionary introduced in .NET 4.0:
http://msdn.microsoft.com/en-us/library/dd287191.aspx
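A minimal sketch of that alternative, assuming .NET 4.0 or later. UserDirectory is a hypothetical class, and the lookup delegate is a stand-in for the Active Directory query in the question.

```csharp
using System;
using System.Collections.Concurrent;

public class UserDirectory
{
    private static readonly ConcurrentDictionary<string, string> EmailCache =
        new ConcurrentDictionary<string, string>();

    // Hypothetical stand-in for the directory query (GetEmailFromActiveDirectory in the question).
    private readonly Func<string, string> _lookup;

    public UserDirectory(Func<string, string> lookup)
    {
        _lookup = lookup;
    }

    public string GetUserEmail(string accountName)
    {
        // Returns the cached value, or calls the lookup and caches its result.
        // No explicit lock is needed; the dictionary never ends up with duplicate keys.
        return EmailCache.GetOrAdd(accountName, _lookup);
    }
}
```

One caveat: if two threads miss on the same key at the same moment, GetOrAdd may invoke the lookup on both, and only one result is stored. If the lookup must run at most once per key, an explicit lock (or a dictionary of Lazy<string> values) is still needed.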

Yes, the lock is required as long as code on other threads can/will access the static object.
Yes, it's safe to lock on the dictionary itself, as long as it's not accessible via a public getter. Otherwise the caller might use the object for locking too, and that might result in deadlocks. So I would recommend using a separate object to lock on if your dictionary is in any way public.

The lock is indeed required.
By using lock, you ensure that only one thread can access the critical section at one time, so an additional static object is not needed.
You can lock on the dictionary object itself, but I would simply use a dedicated object, e.g. object syncLock = new object();, as my lock.

The MSDN documentation specifies that you should never use the lock() statement on a public object that can be read or modified outside your own code.
I would use a dedicated object instance rather than the object you are modifying, especially if this dictionary has accessors that allow external code to reach it.
I might be wrong here; I haven't written a line of C# in a year.

Since the dictionary is private, you should be safe to lock on it. The danger with locking (that I'm aware of) is that other code that you're not considering now could also lock on the object and potentially lead to a deadlock. With a private dictionary, this isn't an issue.
Frankly, I think you could eliminate the lock by just changing your code to not call the dictionary's Add method, and instead use the indexer's set accessor. Then I don't believe you need the lock at all.
UPDATE: The following is a block of code from the private Insert method on Dictionary, which is called by both the Item setter and the Add method. Note that when called from the item setter, the "add" variable is set to false and when called from the Add method, the "add" variable is set to true:
if (add)
{
    ThrowHelper.ThrowArgumentException(ExceptionResource.Argument_AddingDuplicate);
}
So it seems to me that if you're not concerned about overwriting values in your dictionary (which you wouldn't be in this case) then using the property setter without locking should be sufficient.

As far as I have seen, an additional object is normally used as a mutex:
private static object mutex = new object();
protected string GetUserEmail(string accountName)
{
    lock (mutex)
    {
        // access the dictionary
    }
}

Difference between lock(locker) and lock(variable_which_I_am_using)

I'm using C# & .NET 3.5. What is the difference between OptionA and OptionB?
class MyClass
{
    private object m_Locker = new object();
    private Dictionary<string, object> m_Hash = new Dictionary<string, object>();
    public void OptionA()
    {
        lock (m_Locker)
        {
            // Do something with the dictionary
        }
    }
    public void OptionB()
    {
        lock (m_Hash)
        {
            // Do something with the dictionary
        }
    }
}
I'm starting to dabble in threading (primarily for creating a cache for a multi-threaded app, NOT using the HttpCache class, since it's not attached to a web site), and I see the OptionA syntax in a lot of the examples online, but I don't understand what reason, if any, there is to prefer it over OptionB.

Option B uses the object to be protected to create a critical section. In some cases, this more clearly communicates the intent. If used consistently, it guarantees only one critical section for the protected object will be active at a time:
lock (m_Hash)
{
    // Across all threads, I can be in one and only one of these two blocks
    // Do something with the dictionary
}
lock (m_Hash)
{
    // Across all threads, I can be in one and only one of these two blocks
    // Do something with the dictionary
}
Option A is less restrictive. It uses a secondary object to create a critical section for the object to be protected. If multiple secondary objects are used, it's possible to have more than one critical section for the protected object active at a time.
private object m_LockerA = new object();
private object m_LockerB = new object();
lock (m_LockerA)
{
    // It's possible this block is active in one thread
    // while the block below is active in another
    // Do something with the dictionary
}
lock (m_LockerB)
{
    // It's possible this block is active in one thread
    // while the block above is active in another
    // Do something with the dictionary
}
Option A is equivalent to Option B if you use only one secondary object. As far as reading code, Option B's intent is clearer. If you're protecting more than one object, Option B isn't really an option.

It's important to understand that lock(m_Hash) does NOT prevent other code from using the hash. It only prevents other code from running that is also using m_Hash as its locking object.
One reason to use option A is because classes are likely to have private variables that you will use inside the lock statement. It is much easier to just use one object which you use to lock access to all of them instead of trying to use finer grain locks to lock access to just the members you will need. If you try to go with the finer grained method you will probably have to take multiple locks in some situations and then you need to make sure you are always taking them in the same order to avoid deadlocks.
Another reason to use option A is because it is possible that the reference to m_Hash will be accessible outside your class. Perhaps you have a public property which supplies access to it, or maybe you declare it as protected and derived classes can use it. In either case once external code has a reference to it, it is possible that the external code will use it for a lock. This also opens up the possibility of deadlocks since you have no way to control or know what order the lock will be taken in.
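The "always take multiple locks in the same order" rule can be sketched like this. Account, its Id-based ordering and Transfer are hypothetical and chosen only to illustrate the technique; any stable, total ordering of the lock objects would do.

```csharp
using System.Threading;

public class Account
{
    private static int _nextId;
    private readonly object _sync = new object();

    public int Id { get; private set; }
    public decimal Balance { get; private set; }

    public Account(decimal balance)
    {
        Id = Interlocked.Increment(ref _nextId);   // unique, stable ordering key
        Balance = balance;
    }

    public static void Transfer(Account from, Account to, decimal amount)
    {
        // Always lock the account with the lower Id first, whichever direction
        // the money flows. Two opposite transfers then request the locks in the
        // same order and cannot deadlock each other.
        Account first = from.Id < to.Id ? from : to;
        Account second = from.Id < to.Id ? to : from;

        lock (first._sync)
        {
            lock (second._sync)
            {
                from.Balance -= amount;
                to.Balance += amount;
            }
        }
    }
}
```

Because the lock objects are private, no outside code can take them in a different order, which is the other half of the argument above.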

Actually, it is not a good idea to lock on an object if you are also using its members.
Jeffrey Richter wrote in his book "CLR via C#" that there is no guarantee that the class of the object you are using for synchronization does not use lock(this) in its own implementation. (Interestingly, that was the way of synchronizing recommended by Microsoft for some time, until they decided it was a mistake.) So it is always a good idea to use a separate, dedicated object for synchronization. As you can see, OptionB gives you no guarantee of deadlock safety.
So, OptionA is much safer than OptionB.

It's not what you're "locking" that matters; it's the code contained between the lock { ... } braces that's important, and that you're preventing from being executed.
If one thread takes out a lock() on any object, it prevents other threads from obtaining a lock on the same object, and hence prevents the second thread from executing the code between the braces.
So that's why most people just create a junk object to lock on, it prevents other threads from obtaining a lock on that same junk object.

I think the scope of the variable you "pass" in will determine the scope of the lock.
i.e. An instance variable will be in respect of the instance of the class whereas a static variable will be for the whole AppDomain.
Looking at the implementation of the collections (using Reflector), the pattern seems to follow that an instance variable called SyncRoot is declared and used for all locking operations in respect of the instance of the collection.

Well, it depends on what you want to lock (i.e. make thread-safe).
Normally I would choose OptionB to provide thread-safe access to m_Hash ONLY. I would use OptionA for locking a value type, which can't be used with lock directly, or when I have a group of objects that need locking together but I don't want to lock the whole instance by using lock(this).

Locking the object that you're using is simply a matter of convenience. An external lock object can make things simpler, and is also needed if the shared resource is private, like with a collection (in which case you use the ICollection.SyncRoot object).

OptionA is the way to go here, as long as all your code that accesses m_Hash uses m_Locker to lock on it.
Now imagine this case: you lock on the object itself, and one of the methods you call on that object contains a lock(this) code segment. In that case you are set up for an unrecoverable deadlock.
