I'm curious about the difference below two example.
case 1) locking by readonly object
private readonly object key = new object();
private List<int> list = new List<int>;
private void foo()
{
lock(key){
list.add(1);
}
}
case2) locking by target object itself
private List<int> list = new List<int>;
private void foo()
{
lock(list){
list.add(1);
}
}
are both cases thread-safe enough? i'm wondering if garbage collector changes the address of list variable(like 0x382743 => 0x576382) at sometime so that it could fail thread-safe.
So long as List<T> does not have within its code any lock(this) statements the two functions will behave the same.
However, because you don't always know if a object locks on itself or not without looking through it's source code it is safer to just lock on a separate object.
One thing of note, classes that inherit from ICollection have a SyncRoot property which is explicitly the object you are supposed to lock on if you want to put a lock on the collection without using a seperate object.
private List<int> list = new List<int>;
private void foo()
{
lock(((ICollection)list).SyncRoot){
list.add(1);
}
}
This internally is just doing the same thing as you did and created a separate new Object() to lock on.
In both cases, foo() is thread-safe. But locking on separate readonly object (case 1) is preferred, because
you don't have to bother about whether list is not null (lock would then fail)
you don't have to bother about whether list is re-assigned (assigning new instance of List<T> to list could cause some troubles, e.g. loss of atomicity of locked block of code)
you don't have to bother about whether you pass list to third-party code (e.g. as result of some function), as it is good practice to lock only on objects you have under your exclusive control (it could lead to deadlock if another piece of could would lock on this object too).
when lock protects more objects at once (e.g. list1, list2, etc.), locking just on one of them would still work (if you would lock consistently on the same object everywhere), but would lead to less readable and harder to understand code.
Lock is guaranteed to work properly even if garbage collector moves object you are locking on.
Related
I have a question about locking in c#. Does c# lock an instance of an object, or the member.
If i have the following code:
lock(testVar)
{
testVar = testVar.Where(Item => Item.Value == 1).ToList();
//... do some more stuff
}
Does c# keep the lock, even i set testVar to a new value?
All C# objects inherit from System.Object, which itself always contains 4 bytes dedicated to be used when you use the syntactic sugar for lock. That's called a SyncBlock object.
When you create a new object using new, in your case, ToList which generated a new reference to a List<T>, you're actually overriding the old reference, which invalidates your lock. That means that now multiple threads could possibly be inside your lock. The compiler will transform your code into a try-finally block with an extra local variable, to avoid you from shooting your leg.
That is why the best practice is to define a dedicated private readonly variable which will act as a sync root object, instead of using a class member. That way, your intentions are clear to anyone reading your code.
Edit:
There is a nice article on MSDN which describes the objects structure in memory:
SyncTableEntry also stores a pointer to SyncBlock that contains useful
information, but is rarely needed by all instances of an object. This
information includes the object's lock, its hash code, any thunking
data, and its AppDomain index. For most object instances, there will
be no storage allocated for the actual SyncBlock and the syncblk
number will be zero. This will change when the execution thread hits
statements like lock(obj) or obj.GetHashCode.
It locks on the object that the expression (testVar) resolves to. This means that your code does have a thread race, because once the list is reassigned, other concurrent threads could be locking on the new instance.
A good rule of thumb: only ever lock on a readonly field. testVar clearly isn't... but it could be, especially if you use RemoveAll to change the existing list instead of creating a new one. This of course depends on all access to the list happening inside the lock.
Frankly, though, most code doesn't need to be thread-safe. If code does need to be thread safe, the supported use scenarios must be clearly understood by the implementer.
The lock expression translates to a try/finally expression using Monitor.Enter/Monitor.Exit.
Doing a simple test with some code similar to yours (with VS2015 Preview) you can see what the compiler translates the code to.
The code
var testVar = new List<int>();
lock (testVar)
{
testVar = new List<int>();
testVar.Add(1);
}
Is actually translated to this:
List<int> list2;
List<int> list = new List<int>();
bool lockTaken = false;
try
{
list2 = list;
Monitor.Enter(list2, ref lockTaken);
list = new List<int> { 1 };
}
finally
{
if (lockTaken)
{
Monitor.Exit(list2);
}
}
So you can see that the compiler has completely removed your variable testVar and replaced it with 2 variables, namely list and list2. Then the following happens:
list2 is initialized to list and now both references point to the same instance of List<int>.
The call Monitor.Enter(list2, ref lockTaken) associates the synchronization block in the List<int> object with the current thread.
The list variable is assigned to a new instance of List<int>, but list2 still points to the original instance that we locked against.
The lock is release using list2
So even though you think that you are changing the lock variable, you are actually not. Doing that however makes your code hard to read and confusing so you should use a dedicated lock variable as suggested by the other posts.
I have been learning about locking on threads and I have not found an explanation for why creating a typical System.Object, locking it and carrying out whatever actions are required during the lock provides the thread safety?
Example
object obj = new object()
lock (obj) {
//code here
}
At first I thought that it was just being used as a place holder in examples and meant to be swapped out with the Type you are dealing with. But I find examples such as Dennis Phillips points out, doesn't appear to be anything different than actually using an instance of Object.
So taking an example of needing to update a private dictionary, what does locking an instance of System.Object do to provide thread safety as opposed to actually locking the dictionary (I know locking the dictionary in this case could case synchronization issues)?
What if the dictionary was public?
//what if this was public?
private Dictionary<string, string> someDict = new Dictionary<string, string>();
var obj = new Object();
lock (obj) {
//do something with the dictionary
}
The lock itself provides no safety whatsoever for the Dictionary<TKey, TValue> type. What a lock does is essentially
For every use of lock(objInstance) only one thread will ever be in the body of the lock statement for a given object (objInstance)
If every use of a given Dictionary<TKey, TValue> instance occurs inside a lock. And every one of those lock uses the same object then you know that only one thread at a time is ever accessing / modifying the dictionary. This is critical to preventing multiple threads from reading and writing to it at the same time and corrupting its internal state.
There is one giant problem with this approach though: You have to make sure every use of the dictionary occurs inside a lock and it uses the same object. If you forget even one then you've created a potential race condition, there will be no compiler warnings and likely the bug will remain undiscovered for some time.
In the second sample you showed you're using a local object instance (var indicates a method local) as a lock parameter for an object field. This is almost certainly the wrong thing to do. The local will live only for the lifetime of the method. Hence 2 calls to the method will use lock on different locals and hence all methods will be able to simultaneously enter the lock.
It used to be common practice to lock on the shared data itself:
private Dictionary<string, string> someDict = new Dictionary<string, string>();
lock (someDict )
{
//do something with the dictionary
}
But the (somewhat theoretical) objection is that other code, outside of your control, could also lock on someDict and then you might have a deadlock.
So it is recommended to use a (very) private object, declared in 1-to-1 correspondence with the data, to use as a stand-in for the lock. As long as all code that accesses the dictionary locks on on obj the tread-safety is guaranteed.
// the following 2 lines belong together!!
private Dictionary<string, string> someDict = new Dictionary<string, string>();
private object obj = new Object();
// multiple code segments like this
lock (obj)
{
//do something with the dictionary
}
So the purpose of obj is to act as a proxy for the dictionary, and since its Type doesn't matter we use the simplest type, System.Object.
What if the dictionary was public?
Then all bets are off, any code could access the Dictionary and code outside the containing class is not even able to lock on the guard object. And before you start looking for fixes, that simply is not an sustainable pattern. Use a ConcurrentDictionary or keep a normal one private.
The object which is used for locking does not stand in relation to the objects that are modified during the lock. It could be anything, but should be private and no string, as public objects could be modified externally and strings could be used by two locks by mistake.
So far as I understand it, the use of a generic object is simply to have something to lock (as an internally lockable object). To better explain this; say you have two methods within a class, both access the Dictionary, but may be running on different threads. To prevent both methods from modifying the Dictionary at the same time (and potentially causing deadlock), you can lock some object to control the flow. This is better illustrated by the following example:
private readonly object mLock = new object();
public void FirstMethod()
{
while (/* Running some operations */)
{
// Get the lock
lock (mLock)
{
// Add to the dictionary
mSomeDictionary.Add("Key", "Value");
}
}
}
public void SecondMethod()
{
while (/* Running some operation */)
{
// Get the lock
lock (mLock)
{
// Remove from dictionary
mSomeDictionary.Remove("Key");
}
}
}
The use of the lock(...) statement in both methods on the same object prevents the two methods from accessing the resource at the same time.
The important rules for the object you lock on are:
It must be an object visible only to the code that needs to lock on it. This avoids other code also locking on it.
This rules out strings that could be interned, and Type objects.
This rules out this in most cases, and the exceptions are too few and offer little in exploiting, so just don't use this.
Note also that some cases internal to the framework lock on Types and this, so while "it's okay as long as nobody else does it" is true, but it's already too late.
It must be static to protect static static operations, it may be instance to protect instance operations (including those internal to a instance that is held in a static).
You don't want to lock on a value-type. If you really wanted too you could lock on a particular boxing of it, but I can't think of anything that this would gain beyond proving that it's technically possible - it's still going to lead to the code being less clear as to just what locks on what.
You don't want to lock on a field that you may change during the lock being held, as you'll no longer have the lock on what you appear to have the lock on (it's just about plausible that there's a practical use for the effect of this, but there's going to be an impedance between what the code appears to do at first read and what it really does, which is never good).
The same object must be used to lock on all operations that may conflict with each other.
While you can have correctness with overly-broad locks, you can get better performance with finer. E.g. if you had a lock that was protecting 6 operations, and realised that 2 of those operations couldn't interfere with the other 4, so you changed to having 2 lock objects, then you can gain by having better coherency (or crash-and-burn if you were wrong in that analysis!)
The first point rules out locking on anything that is either visible or which could be made visible (e.g. a private instance that is returned by a protected or public member should be considered public as far as this analysis goes, anything captured by a delegate could end up elsewhere, and so on).
The last two points can mean that there's no obvious "type you are dealing with" as you put it, because locks don't protect objects, the protect operations done on objects and you may either have more than one object affected, or the same object affected by more than one group of operations that must be locked.
Hence it can be good practice to have an object that exists purely to lock on. Since it's doing nothing else, it can't get mixed up with other semantics or written over when you don't expect. And since it does nothing else it may as well be the lightest reference type that exists in .NET; System.Object.
Personally, I do prefer to lock on an object related to an operation when it does clearly fit the bill of the "type you are dealing with", and none of the other concerns apply, as it seems to me to be quite self-documenting, but to others the risk of doing it wrong out-weighs that benefit.
public ArrayList InputBuffer
{
get { lock (this.in_buffer) { return this.in_buffer; } }
}
is this.in_buffer locked during a call to InputBuffer.Clear?
or does the property simply lock the in_buffer object while it's getting the reference to it; the lock exits, and then that reference is used to Clear?
No, the property locks the reference while it's getting that reference. Pretty pointless, to be honest... this is more common:
private readonly object mutex = new object();
private Foo foo = ...;
public Foo Foo
{
get
{
lock(mutex)
{
return foo;
}
}
}
That lock would only cover the property access itself, and wouldn't provide any protection for operations performed with the Foo. However, it's not the same as not having the lock at all, because so long as the variable is only written while holding the same lock, it ensures that any time you read the Foo property, you're accessing the most recent value of the property... without the lock, there's no memory barrier and you could get a "stale" result.
This is pretty weak, but worth knowing about.
Personally I try to make very few types thread-safe, and those tend to have more appropriate operations... but if you wanted to write code which did modify and read properties from multiple threads, this is one way of doing so. Using volatile can help too, but the semantics of it are hideously subtle.
The object is locked inside the braces of the lock call, and then it is unlocked.
In this case the only code in the lock call is return this.in_buffer;.
So in this case, the in_buffer is not locked during a call to InputBuffer.Clear.
One solution to your problem, using extension methods, is as follows.
private readonly object _bufLock;
class EMClass{
public static void LockedClear(this ArrayList a){
lock(_bufLock){
a.Clear();
}
}
}
Now when you do:
a.LockedClear();
The Clear call will be done in a lock.
You must ensure that the buffer is only accessed inside _bufLocks.
In addition to what others have said about the scope of the lock, remember that you aren't locking the object, you are only locking based on the object instance named.
Common practice is to have a separate lock mutex as Jon Skeet exemplifies.
If you must guarantee synchronized execution while the collection is being cleared, expose a method that clears the collection, have clients call that, and don't expose your underlying implementation details. (Which is good practice anyway - look up encapsulation.)
If I have multiple threads calling the Add method of a List object, and no readers, do I only need to lock on the List object before calling Add to be thread safe?
Usually it's best to lock on a separate (immutable) object... locking on the same object you're modifying is bad practice should be done with caution.
private readonly object sync = new object();
private List<object> list = new List<object>();
void MultiThreadedMethod(object val)
{
lock(sync)
{
list.Add(val);
}
}
In a basic case like this you will not have a problem, but if there is a possibility that your list can be changed (not the contents of the list, but the list itself), then you might have a situation where you lock on two objects when you only intend to lock on one.
Yes. But you might also consider subclassing the List and "new" over the Add method. That will allow you to encapsulate the lock. It will work great as long as nothing accesses the base List. This technique is used for simple tree structures in XNA video games.
Yes, you need to lock. Instance methods are not guaranteed to be thread safe on List<T>.
I think these have been linked before on here, but I found them to be very useful and interesting:
Thread safe collections are hard
Thread safe collection
I have a static collections which implements IList interface. This collection is used throughout the application, including adding/removing items.
Due to multithread issue, I wonder what I can do to ensure that the list is modifying one at a time, such as when 1 thread try to add an item, another thread should not do delete item at that time.
I wonder what is difference between lock(this) and lock(privateObject) ? Which one is better in my case?
Thank you.
The lock(this) will lock on the entire instance while lock(privateObject) will only lock that specific instance variable. The second one is the better choice since locking on the entire instance will prevent other threads from being able to do anything with the object.
From MSDN:
In general, avoid locking on a public
type, or instances beyond your code's
control. The common constructs lock
(this), lock (typeof (MyType)), and
lock ("myLock") violate this
guideline:
lock (this) is a problem if the
instance can be accessed publicly.
lock (typeof (MyType)) is a problem if
MyType is publicly accessible.
lock(“myLock”) is a problem since any
other code in the process using the
same string, will share the same lock.
Best practice is to define a private
object to lock on, or a private static
object variable to protect data common
to all instances.
In this particular case, the collection is static which effectively means there is a single instance but that still doesn't change how the lock(this) and lock(privateObject) would behave.
By using lock(this) even in a static collection you are still locking the entire instance. In this case, as soon as thread A acquires a lock for method Foo() all other threads will have to wait to perform any operation on the collection.
Using lock(privateObject) means that as soon as thread A acquires a lock for method Foo() all other threads can perform any other operation except Foo() without waiting. Only when another thread tries to perform method Foo() will it have to wait until thread A has completed its Foo() operation and released the lock.
The lock keyword is a little confusing. The object expression in the lock statement is really just an identification mechanism for creating critical sections. It is not the subject of the lock nor is it in any way guarenteed to be safe for multithreaded operations just because it is referenced by the statement.
So lock(this) is creating a critical section identified by the class containing the currently executing method whereas lock(privateObject) is identified by an object that is (presumably anyway) private to the class. The former is more risky because a caller of your class could inadvertantly define their own critical sections using a lock statement that uses that class instance as the lock object. That could lead to unintended threading problems including, but not limited to, deadlocks and bottlenecks.
You mentioned that you were concerned with multiple threads modifying the collection at the same time. I should point out that you should be equally concerned with threads reading the collection as well even if they are not modifying it. It is likely that you will need some of the same safe guards in place to protect the collection during reads as you would during writes.
Add a private member to the class that methods lock on.
eg.
public class MyClass : IList
{
private object syncRoot = new object();
public void Add(object value)
{
lock(this.syncRoot)
{
// Add code here
}
}
public void Remove(object value)
{
lock(this.syncRoot)
{
// Remove code here
}
}
}
This will ensure that access to the list is syncronized between threads for the adding and removing cases, while maintaining access to the list. This will still let enumerators access the list while another thread can modify it, but that then opens another issue where an enumerator will throw an exception if the collection is modified during the enumeration.