Why Locking On a Public Object is a Bad Idea - c#

Ok, I've used locks quite a bit, but I've never had this scenario before. I have two different classes that contain code used to modify the same MSAccess database:
public class DatabaseNinja
{
public void UseSQLKatana()
{
//Code to execute queries against db.TableAwesome
}
}
public class DatabasePirate
{
public void UseSQLCutlass()
{
//Code to execute queries against db.TableAwesome
}
}
This is a problem, because transactions to the database cannot be executed in parallel, and these methods (UseSQLKatana and UseSQLCutlass) are called by different threads.
In my research, I see that it is bad practice to use a public object as a lock object so how do I lock these methods so that they don't run in tandem? Is the answer simply to have these methods in the same class? (That is actually not so simple in my real code)

Well, first off, you could create a third class:
internal class ImplementationDetail
{
private static readonly object lockme = new object();
public static void DoDatabaseQuery(whatever)
{
lock(lockme)
ReallyDoQuery(whatever);
}
}
and now UseSQLKatana and UseSQLCutlass call ImplementationDetail.DoDatabaseQuery.
Second, you could decide to not worry about it, and lock an object that is visible to both types. The primary reason to avoid that is because it becomes difficult to reason about who is locking the object, and difficult to protect against hostile partially trusted code locking the object maliciously. If you don't care about either downside then you don't have to blindly follow the guideline.
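As a rough sketch of the first option above, the third class can own a private lock and be the only code that runs queries. The names DatabaseGate, Run and GateDemo are hypothetical, and passing the query in as an Action is an assumption made so the example runs without a real database.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: the only place allowed to touch the database.
internal static class DatabaseGate
{
    // Private, so no outside code can ever lock on it.
    private static readonly object Gate = new object();

    // Runs the supplied query while holding the private lock.
    public static void Run(Action query)
    {
        lock (Gate)
        {
            query();
        }
    }
}

public class DatabaseNinja
{
    public void UseSQLKatana(Action query) { DatabaseGate.Run(query); }
}

public class DatabasePirate
{
    public void UseSQLCutlass(Action query) { DatabaseGate.Run(query); }
}

public static class GateDemo
{
    // Returns the highest number of queries observed running at once.
    public static int MaxConcurrentQueries()
    {
        int active = 0, maxActive = 0;
        // Stand-in for a real query. It needs no Interlocked calls
        // because it only ever runs inside the gate.
        Action fakeQuery = () =>
        {
            active++;
            maxActive = Math.Max(maxActive, active);
            active--;
        };
        var ninja = new DatabaseNinja();
        var pirate = new DatabasePirate();
        Parallel.For(0, 1000, i =>
        {
            if (i % 2 == 0) ninja.UseSQLKatana(fakeQuery);
            else pirate.UseSQLCutlass(fakeQuery);
        });
        return maxActive;
    }

    public static void Main()
    {
        Console.WriteLine(MaxConcurrentQueries()); // prints 1
    }
}
```

Because the lock object never leaves DatabaseGate, it stays easy to reason about who can hold it.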

The reason it's bad practice to lock on a public object is that you can never be sure who ELSE is locking on that object. Although unlikely, someone else someday can decide that they want to grab your lock object, and do some process that ends up calling your code, where you lock onto that same lock object, and now you have an impossible deadlock to figure out. (It's the same issue for using 'this').
A better way to do this would be to use a public Mutex object. These are much more heavyweight, but it's much easier to debug the issue.

Use a Mutex.
You can create a mutex in the main class and call its WaitOne method at the beginning of each method; when the other method is called, it will then wait for the first one to finish.
Ah, remember to release mutex exiting from those methods...
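A minimal sketch of that Mutex approach might look like the following. MutexDemo, DbMutex and the counter are hypothetical stand-ins for the real classes and database call; the try/finally is there so the mutex is released even if the protected code throws.

```csharp
using System;
using System.Threading;

public static class MutexDemo
{
    // Unnamed mutex: shared by all threads in this process only.
    private static readonly Mutex DbMutex = new Mutex();
    private static int counter;

    public static void DoWork()
    {
        DbMutex.WaitOne();
        try
        {
            counter++; // stands in for the database call
        }
        finally
        {
            // Always released, even if the body throws.
            DbMutex.ReleaseMutex();
        }
    }

    // Starts the given number of threads, each calling DoWork repeatedly.
    public static int RunThreads(int threads, int iterations)
    {
        counter = 0;
        var workers = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            workers[t] = new Thread(() =>
            {
                for (int i = 0; i < iterations; i++) DoWork();
            });
            workers[t].Start();
        }
        foreach (var w in workers) w.Join();
        return counter;
    }

    public static void Main()
    {
        Console.WriteLine(RunThreads(4, 1000)); // prints 4000
    }
}
```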

I see two differing questions here:
Why is it a bad idea to lock on a public object?
The concern is not that a lock restricts access to the object's members (it does not; a lock only blocks other code that tries to lock on the same object). The concern is that any other code that can see the object may also lock on it, unaware of your locking, and so cause unexpected contention or deadlocks.
For this reason, use a dedicated object instance to lock onto.
How do I lock these methods so that they don't run in tandem?
You could consider the Mutex class; creating a 'global' mutex will allow your classes to operate on the basis of knowing the state of the lock throughout the application. Or, you could use a shared ReaderWriterLockSlim instance, but I wouldn't really recommend the cross-class sharing of it.

You can use a public LOCK object as a lock object. You'll just have to specify that the object you're creating is a Lock object solely used for locking the Ninja and Pirate class.

Related

Is this a valid way to make a custom type thread safe? And general threading questions

I have a few general questions when dealing with threads. I have been looking around but haven't really seen any answers to my questions
When dealing with multiple variables in a class you want to be thread safe, are you supposed to have one "lock object" for every variable you want to lock in the class? Like this?
static readonly object lockForVarA = new object();
private float varA;
static readonly object lockForVarB = new object();
private float varB;
Also is this a valid way to handle thread safing a custom type?
public class SomeClass
{
public SomeClass()
{
//Do some kind of work IE load an assembly
}
}
public class SomeOtherClass : BaseClassFiringFromRandomThread
{
static readonly object someClassLock = new object();
SomeClass someClass;
public override void Init()//this is fired from any available thread, can be fired multiple times and even at the same time
{
lock(someClassLock)
{
if(someClass == null)
someClass = new SomeClass();
}
}
}
This code is in the constructor of a class that can be called from any thread at any time
When dealing with multiple variables in a class you want to be thread safe, are you supposed to have one "lock object" for every variable you want to lock in the class?
There are two rules:
Be "fine grained". Have as many locks as possible, one for each variable. Access the variable under its lock every time you use it. Lock as little code as possible to ensure scalability. If you forget to lock a variable, you'll cause a race condition, and if you get the lock ordering wrong, you'll cause a deadlock, so make sure you get it perfect.
Be "coarse-grained". Have just one lock, and put all the critical sections under that lock. Having many locks decreases contention but increases the chance of deadlocks and other errors, so have as few locks as possible, with as much code as possible in each. Of course, this also increases the risk of deadlocks since now there is lots more code inside the locks that can have inversions, and it decreases scalability.
As you have no doubt noticed, the standard advice is completely contradictory. That's because locks are terrible.
My advice: if you don't share variables across threads then you don't need to have any locks at all.
Also is this a valid way to handle thread safing a custom type?
The code looks reasonable so far, but if your intention is to lazy-load some logic then do not write your own threading logic. Just use Lazy<T> and make it do the work. It was written by experts.
Always use the highest-level tool designed by experts that is available to you. Rolling your own threading primitives is a recipe for disaster.
Whatever you do, do not take the advice in the other answer that says you must use double-checked locking. There are no circumstances in which you must use double-checked locking. Single-checked locking is safer, easier, and more likely to be correct. Only use double-checked locking when (1) you have overwhelming empirical evidence that contention is the cause of a measurable, user-impacting performance problem that will be fixed by going low-lock, and (2) you can explain what rules in the C# memory model make double-checked locking safe.
If you can't do (1) then you have no reason to do double checked locking, and if you can't do (2), you can't do it with any confidence of safety.
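For illustration, a Lazy<T> version of the question's Init might be sketched as below. The base class is left out, and the constructor counter and LazyDemo are hypothetical additions used only to show that the factory runs once per Lazy<T> instance.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class SomeClass
{
    // Hypothetical counter, only here to observe constructor calls.
    public static int ConstructorCalls;

    public SomeClass()
    {
        Interlocked.Increment(ref ConstructorCalls);
    }
}

public class SomeOtherClass
{
    // With this constructor the mode is
    // LazyThreadSafetyMode.ExecutionAndPublication,
    // so the factory runs at most once.
    private readonly Lazy<SomeClass> someClass =
        new Lazy<SomeClass>(() => new SomeClass());

    // Safe to call from any thread, any number of times.
    public SomeClass Init()
    {
        return someClass.Value;
    }
}

public static class LazyDemo
{
    // Returns how many SomeClass instances one owner created.
    public static int CountConstructions()
    {
        int before = SomeClass.ConstructorCalls;
        var owner = new SomeOtherClass();
        Parallel.For(0, 100, i => { owner.Init(); });
        return SomeClass.ConstructorCalls - before;
    }

    public static void Main()
    {
        Console.WriteLine(CountConstructions()); // prints 1
    }
}
```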
You need to use a double-checked lock pattern. There is no need to acquire your someClassLock lock once someClass has been initialised, and locking it there will just cause unnecessary contention.
if (someClass == null)
{
lock(someClassLock)
{
if (someClass == null)
someClass = new SomeClass();
}
}
You need the inner if block because it is possible a concurrent thread may have created someClass after the first null check but before your lock was acquired.
Of course, you need to also ensure that SomeClass is written in a way that is itself threadsafe, but this will safely ensure that only one instance of someClass is created.
An alternative method is to use Lazy<T> with a suitable LazyThreadSafetyMode.

C# mutex through reference

I have a reasonably simple case of two threads interacting with the same data structure. The threads are hosted in their own responsible classes. Let's say these are class Alfons and class Belzebub:
class Alfons {
public Mutex listMutex = new Mutex();
private void ProcessListInfo()
{
listMutex.WaitOne();
//
// ... Process multi-access list stuff ...
//
listMutex.ReleaseMutex();
}
}
class Belzebub {
private Alfons mCachedAlfonsReference;
private void ProcessListInfoDifferently()
{
mCachedAlfonsReference.listMutex.WaitOne();
//
// ... Process multi-access list stuff in a different fashion ...
//
mCachedAlfonsReference.listMutex.ReleaseMutex();
}
}
My question is whether referencing a Mutex like this can create a concurrency issue OR whether it is recommended practice to do so. Is there a better way of doing this and should I, for example, cache the mutex reference rather than accessing it through a reference.
There would be no concurrency issue - the mutex is supposed to be shared. As per the Mutex MSDN docs:
This type is thread safe.
However, I'd say that the data structure itself should synchronize access coming from different threads. If the data structure doesn't support this (e.g., using SyncRoot), encapsulate it and add that feature.
Out of curiosity: which data structure are you using? You might consider using one of the System.Collections.Concurrent collections for lock-free/fine-grained locking solutions. Also, wouldn't using the lock keyword be simpler and less error-prone for your scenario?
Generally, since locking can be tricky and deadlocks will stop all fun, I try to reduce the code that is concerned with the mutex rather than passing it around. Otherwise it can be a headache to figure out which paths lead to a lock.
It may be better to encapsulate the resource and thread critical operations in a class and then:
lock (this)
{
}
Or see if there is a thread-safe version as suggested by dcastro.
Besides this, be very careful that there is no return (throw, etc.) between WaitOne() and ReleaseMutex(); otherwise other threads will be locked out indefinitely. Using lock, or a finally block containing the ReleaseMutex call, is safer in this respect. As dcastro pointed out in the comments, it could be another library that raises the exception.
Finally, I am assuming that it is the same resource that is being protected in ProcessListInfo() and ProcessListInfoDifferently(). If these are two different resources that are being protected, then you have extended the likelihood of unnecessary thread contention.
I don't see how caching the mutex reference would make any difference, either way you are still accessing the same object through references, and if you don't do that then it defeats the point of a mutex.

locking the object inside a property, c#

public ArrayList InputBuffer
{
get { lock (this.in_buffer) { return this.in_buffer; } }
}
is this.in_buffer locked during a call to InputBuffer.Clear?
or does the property simply lock the in_buffer object while it's getting the reference to it; the lock exits, and then that reference is used to Clear?
No, the property locks the reference while it's getting that reference. Pretty pointless, to be honest... this is more common:
private readonly object mutex = new object();
private Foo foo = ...;
public Foo Foo
{
get
{
lock(mutex)
{
return foo;
}
}
}
That lock would only cover the property access itself, and wouldn't provide any protection for operations performed with the Foo. However, it's not the same as not having the lock at all, because so long as the variable is only written while holding the same lock, it ensures that any time you read the Foo property, you're accessing the most recent value of the property... without the lock, there's no memory barrier and you could get a "stale" result.
This is pretty weak, but worth knowing about.
Personally I try to make very few types thread-safe, and those tend to have more appropriate operations... but if you wanted to write code which did modify and read properties from multiple threads, this is one way of doing so. Using volatile can help too, but the semantics of it are hideously subtle.
The object is locked inside the braces of the lock call, and then it is unlocked.
In this case the only code in the lock call is return this.in_buffer;.
So in this case, the in_buffer is not locked during a call to InputBuffer.Clear.
One solution to your problem, using extension methods, is as follows.
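// Extension methods must be declared in a static, non-nested class.
public static class EMClass
{
// The lock object must be static and initialized to be usable here.
private static readonly object _bufLock = new object();
public static void LockedClear(this ArrayList a)
{
lock (_bufLock)
{
a.Clear();
}
}
}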
Now when you do:
a.LockedClear();
The Clear call will be done in a lock.
You must ensure that the buffer is only accessed while holding _bufLock.
In addition to what others have said about the scope of the lock, remember that you aren't locking the object itself; you are only acquiring a lock that is keyed on the named object instance.
Common practice is to have a separate lock mutex as Jon Skeet exemplifies.
If you must guarantee synchronized execution while the collection is being cleared, expose a method that clears the collection, have clients call that, and don't expose your underlying implementation details. (Which is good practice anyway - look up encapsulation.)
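One way to read that advice is sketched below: the class keeps the ArrayList and its lock private and exposes only the operations it wants synchronized. InputBufferOwner and BufferDemo are hypothetical names, and the choice of Add, Clear and Count is an assumption about which operations clients need.

```csharp
using System;
using System.Collections;
using System.Threading.Tasks;

public class InputBufferOwner
{
    private readonly object bufferLock = new object();
    private readonly ArrayList in_buffer = new ArrayList();

    // Every access to in_buffer goes through bufferLock.
    public void Add(object item)
    {
        lock (bufferLock) { in_buffer.Add(item); }
    }

    public void Clear()
    {
        lock (bufferLock) { in_buffer.Clear(); }
    }

    public int Count
    {
        get { lock (bufferLock) { return in_buffer.Count; } }
    }
}

public static class BufferDemo
{
    // Adds items from many threads and returns the final count.
    public static int FillConcurrently(int items)
    {
        var owner = new InputBufferOwner();
        Parallel.For(0, items, i => { owner.Add(i); });
        return owner.Count;
    }

    public static void Main()
    {
        Console.WriteLine(FillConcurrently(10000)); // prints 10000
    }
}
```

Since the ArrayList reference never escapes, no caller can reach it without going through the lock.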

Using the same lock for multiple methods

I haven't had any issues using the same lock for multiple methods so far, but I'm wondering if the following code might actually have issues (performance?) that I'm not aware of:
private static readonly object lockObj = new object();
public int GetValue1(int index)
{
lock(lockObj)
{
// Collection 1 read and/or write
}
}
public int GetValue2(int index)
{
lock(lockObj)
{
// Collection 2 read and/or write
}
}
public int GetValue3(int index)
{
lock(lockObj)
{
// Collection 3 read and/or write
}
}
The 3 methods and the collections are not related in any way.
In addition, will it be a problem if this lockObj is also used by a singleton (in Instance property) ?
Edit: To clarify my question on using the same lock object in a Singleton class:
private static readonly object SyncObject = new object();
public static MySingleton Instance
{
get
{
lock (SyncObject)
{
if (_instance == null)
{
_instance = new MySingleton();
}
}
return _instance;
}
}
public int MyMethod()
{
lock (SyncObject)
{
// Read or write
}
}
Will this cause issues?
If the methods are unrelated as you state, then use a different lock for each one; otherwise it's inefficient (since there's no reason for different methods to lock on the same object, as they could safely execute concurrently).
Also, it seems that these are instance methods locking on a static object -- was that intended? I have a feeling that's a bug; instance methods should (usually) only lock on instance fields.
Regarding the Singleton design pattern:
While locking can be safe for those, better practice is doing a delayed initialization of a field like this:
private static object sharedInstance;
public static object SharedInstance
{
get
{
if (sharedInstance == null)
Interlocked.CompareExchange(ref sharedInstance, new object(), null);
return sharedInstance;
}
}
This way it's a little bit faster (both because interlocked methods are faster, and because the initialization is delayed), but still thread-safe.
By using the same object to lock on in all of those methods, you are serializing all access to code in all of the threads.
That is... code running GetValue1() will block other code in a different thread from running GetValue2() until it's done. If you add even more code that locks on the same object instance, you'll end up with effectively a single-threaded application at some point.
Shared lock locks other non-related calls
If you use the same lock then locking in one method unnecessarily locks others as well. If they're not related at all then this is a problem, since they have to wait for each other, which they shouldn't.
Bottleneck
This may pose a bottleneck when these methods are frequently called. With separate locks they would run independently, but with a shared lock they must wait for the lock to be released more often than required (actually three times more often).
To create a thread-safe singleton, use this technique.
You don't need a lock.
In general, each lock should be used as little as possible.
The more methods lock on the same thing, the more likely you are to end up waiting for it when you don't really need to.
Good question. There are pros and cons of making locks more fine grained vs more coarse grained, with one extreme being a separate lock for each piece of data and the other extreme being one lock for the entire program. As other posts point out, the disadvantage of reusing the same locks is in general you may get less concurrency (though it depends on the case, you may not get less concurrency).
However, the disadvantage of using more locks is in general you make deadlock more likely. There are more ways to get deadlocks the more locks you have involved. For example, acquiring two locks at the same time in separate threads but in the opposite order is a potential deadlock which wouldn't happen if only one lock were involved. Of course sometimes you may fix a deadlock by breaking one lock into two, but usually fewer locks means fewer deadlocks. There's also added code complexity of having more locks.
In general these two factors need to be balanced. It's common to use one lock per class for convenience if it doesn't cause any concurrency issues. In fact, doing so is a design pattern called a monitor.
I would say the best practice is to favor fewer locks for code simplicity's sake and make additional locks if there's a good reason (such as concurrency, or a case where it's more simple or fixes a deadlock).
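When more than one lock really is needed, the opposite-order deadlock described above is usually avoided by always taking the locks in one fixed order. The following is only a sketch of that idea; TwoLockCounter and its members are hypothetical.

```csharp
using System;
using System.Threading;

public class TwoLockCounter
{
    private readonly object lockA = new object();
    private readonly object lockB = new object();
    private int a, b;

    // Both methods take lockA first, then lockB. If one of them
    // took lockB first, two threads could each hold one lock and
    // wait forever for the other.
    public void MoveAToB()
    {
        lock (lockA) { lock (lockB) { a--; b++; } }
    }

    public void MoveBToA()
    {
        lock (lockA) { lock (lockB) { b--; a++; } }
    }

    public int A
    {
        get { lock (lockA) { return a; } }
    }
}

public static class LockOrderDemo
{
    // Runs both operations concurrently and returns the final value of A.
    public static int Run(int iterations)
    {
        var counter = new TwoLockCounter();
        var t1 = new Thread(() =>
        {
            for (int i = 0; i < iterations; i++) counter.MoveAToB();
        });
        var t2 = new Thread(() =>
        {
            for (int i = 0; i < iterations; i++) counter.MoveBToA();
        });
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return counter.A;
    }

    public static void Main()
    {
        Console.WriteLine(Run(10000)); // prints 0
    }
}
```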

Difference between lock(locker) and lock(variable_which_I_am_using)

I'm using C# & .NET 3.5. What is the difference between OptionA and OptionB?
class MyClass
{
private object m_Locker = new object();
private Dictionary<string, object> m_Hash = new Dictionary<string, object>();
public void OptionA()
{
lock(m_Locker){
// Do something with the dictionary
}
}
public void OptionB()
{
lock(m_Hash){
// Do something with the dictionary
}
}
}
I'm starting to dabble in threading (primarily for creating a cache for a multi-threaded app, NOT using the HttpCache class, since it's not attached to a web site), and I see the OptionA syntax in a lot of the examples online, but I don't understand what reason, if any, there is for doing that over OptionB.
Option B uses the object to be protected to create a critical section. In some cases, this more clearly communicates the intent. If used consistently, it guarantees only one critical section for the protected object will be active at a time:
lock (m_Hash)
{
// Across all threads, I can be in one and only one of these two blocks
// Do something with the dictionary
}
lock (m_Hash)
{
// Across all threads, I can be in one and only one of these two blocks
// Do something with the dictionary
}
Option A is less restrictive. It uses a secondary object to create a critical section for the object to be protected. If multiple secondary objects are used, it's possible to have more than one critical section for the protected object active at a time.
private object m_LockerA = new object();
private object m_LockerB = new object();
lock (m_LockerA)
{
// It's possible this block is active in one thread
// while the block below is active in another
// Do something with the dictionary
}
lock (m_LockerB)
{
// It's possible this block is active in one thread
// while the block above is active in another
// Do something with the dictionary
}
Option A is equivalent to Option B if you use only one secondary object. As far as reading code, Option B's intent is clearer. If you're protecting more than one object, Option B isn't really an option.
It's important to understand that lock(m_Hash) does NOT prevent other code from using the hash. It only prevents other code from running that is also using m_Hash as its locking object.
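That point can be checked with a small experiment. In this sketch (LockScopeDemo and its method names are hypothetical), one thread holds lock(hash) while a second thread first reads the dictionary without locking, then tries to take the same lock with a short timeout.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class LockScopeDemo
{
    // A thread that never takes the lock is not stopped by it.
    public static bool OtherThreadCanReadWhileLocked()
    {
        var hash = new Dictionary<string, object>();
        hash["key"] = "value";
        bool readSucceeded = false;
        lock (hash)
        {
            var reader = new Thread(() =>
            {
                readSucceeded = hash.ContainsKey("key");
            });
            reader.Start();
            reader.Join();
        }
        return readSucceeded;
    }

    // A thread that asks for the same lock is kept out.
    public static bool OtherThreadCanTakeLockWhileHeld()
    {
        var hash = new Dictionary<string, object>();
        bool acquired = true;
        lock (hash)
        {
            var contender = new Thread(() =>
            {
                // Give up after 50 ms instead of blocking forever.
                acquired = Monitor.TryEnter(hash, 50);
                if (acquired) Monitor.Exit(hash);
            });
            contender.Start();
            contender.Join();
        }
        return acquired;
    }

    public static void Main()
    {
        Console.WriteLine(OtherThreadCanReadWhileLocked());   // prints True
        Console.WriteLine(OtherThreadCanTakeLockWhileHeld()); // prints False
    }
}
```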
One reason to use option A is because classes are likely to have private variables that you will use inside the lock statement. It is much easier to just use one object which you use to lock access to all of them instead of trying to use finer grain locks to lock access to just the members you will need. If you try to go with the finer grained method you will probably have to take multiple locks in some situations and then you need to make sure you are always taking them in the same order to avoid deadlocks.
Another reason to use option A is because it is possible that the reference to m_Hash will be accessible outside your class. Perhaps you have a public property which supplies access to it, or maybe you declare it as protected and derived classes can use it. In either case once external code has a reference to it, it is possible that the external code will use it for a lock. This also opens up the possibility of deadlocks since you have no way to control or know what order the lock will be taken in.
Actually, it is not good idea to lock on object if you are using its members.
Jeffrey Richter wrote in his book "CLR via C#" that there is no guarantee that the class of the object you are using for synchronization does not use lock(this) in its implementation. (Interestingly, lock(this) was a way of synchronizing recommended by Microsoft for some time, until they found that it was a mistake.) So it is always a good idea to use a special separate object for synchronization. As you can see, OptionB gives you no guarantee of deadlock safety.
So, OptionA is much safer than OptionB.
It's not what you're "locking" that matters; it's the code contained between the lock { ... } braces that's important, and that you're preventing from being executed.
If one thread takes out a lock() on any object, it prevents other threads from obtaining a lock on the same object, and hence prevents the second thread from executing the code between the braces.
So that's why most people just create a junk object to lock on, it prevents other threads from obtaining a lock on that same junk object.
I think the scope of the variable you "pass" in will determine the scope of the lock.
i.e. An instance variable will be in respect of the instance of the class whereas a static variable will be for the whole AppDomain.
Looking at the implementation of the collections (using Reflector), the pattern seems to follow that an instance variable called SyncRoot is declared and used for all locking operations in respect of the instance of the collection.
Well, it depends on what you wanted to lock(be made threadsafe).
Normally I would choose OptionB to provide threadsafe access to m_Hash ONLY. OptionA, on the other hand, I would use for locking a value type, which can't be used with lock, or when I had a group of objects that need locking together but I don't want to lock the whole instance by using lock(this).
Locking the object that you're using is simply a matter of convenience. An external lock object can make things simpler, and is also needed if the shared resource is private, like with a collection (in which case you use the ICollection.SyncRoot object).
OptionA is the way to go here as long as, in all your code, you use m_Locker to lock whenever you access m_Hash.
Now imagine this case: you lock on the object, and that object has a lock(this) code segment in one of its methods. If another thread calls that method while you hold the lock, it blocks until you release it, and if that thread holds something you are waiting for, the result is a deadlock that is very hard to diagnose.
