I have a static collections which implements IList interface. This collection is used throughout the application, including adding/removing items.
Due to multithread issue, I wonder what I can do to ensure that the list is modifying one at a time, such as when 1 thread try to add an item, another thread should not do delete item at that time.
I wonder what is difference between lock(this) and lock(privateObject) ? Which one is better in my case?
Thank you.
The lock(this) will lock on the entire instance while lock(privateObject) will only lock that specific instance variable. The second one is the better choice since locking on the entire instance will prevent other threads from being able to do anything with the object.
From MSDN:
In general, avoid locking on a public
type, or instances beyond your code's
control. The common constructs lock
(this), lock (typeof (MyType)), and
lock ("myLock") violate this
guideline:
lock (this) is a problem if the
instance can be accessed publicly.
lock (typeof (MyType)) is a problem if
MyType is publicly accessible.
lock(“myLock”) is a problem since any
other code in the process using the
same string, will share the same lock.
Best practice is to define a private
object to lock on, or a private static
object variable to protect data common
to all instances.
In this particular case, the collection is static which effectively means there is a single instance but that still doesn't change how the lock(this) and lock(privateObject) would behave.
By using lock(this) even in a static collection you are still locking the entire instance. In this case, as soon as thread A acquires a lock for method Foo() all other threads will have to wait to perform any operation on the collection.
Using lock(privateObject) means that as soon as thread A acquires a lock for method Foo() all other threads can perform any other operation except Foo() without waiting. Only when another thread tries to perform method Foo() will it have to wait until thread A has completed its Foo() operation and released the lock.
The lock keyword is a little confusing. The object expression in the lock statement is really just an identification mechanism for creating critical sections. It is not the subject of the lock nor is it in any way guarenteed to be safe for multithreaded operations just because it is referenced by the statement.
So lock(this) is creating a critical section identified by the class containing the currently executing method whereas lock(privateObject) is identified by an object that is (presumably anyway) private to the class. The former is more risky because a caller of your class could inadvertantly define their own critical sections using a lock statement that uses that class instance as the lock object. That could lead to unintended threading problems including, but not limited to, deadlocks and bottlenecks.
You mentioned that you were concerned with multiple threads modifying the collection at the same time. I should point out that you should be equally concerned with threads reading the collection as well even if they are not modifying it. It is likely that you will need some of the same safe guards in place to protect the collection during reads as you would during writes.
Add a private member to the class that methods lock on.
eg.
public class MyClass : IList
{
private object syncRoot = new object();
public void Add(object value)
{
lock(this.syncRoot)
{
// Add code here
}
}
public void Remove(object value)
{
lock(this.syncRoot)
{
// Remove code here
}
}
}
This will ensure that access to the list is syncronized between threads for the adding and removing cases, while maintaining access to the list. This will still let enumerators access the list while another thread can modify it, but that then opens another issue where an enumerator will throw an exception if the collection is modified during the enumeration.
Related
Assuming I have an object A containing
// ...
private List<double> someList = new List<double>();
// ...
public List<double> SomeList
{
get { lock (this) { return someList; } }
}
// ...
would it be thread safe to perform operation on the list as in the code below. Knowing that several operations could be executed simultaneously by different threads.
A.SomeList.Add(2.0);
or
A.SomeList.RemoveAt(0);
In other words, when is the lock released?
There is no thread safety here.
The lock is released as soon as the block it protects is finished, just before the property returns, so the calls to Add ad RemoveAt are not protected by the lock.
The lock you shown in the question isn't of much use.
To make list operations thread safe you need to implement your own Add/Remove/etc methods wrapping those of the list.
public void Add(double item)
{
lock(_list)
{
_list.Add(item);
}
}
Also, it's a good idea to hide the list itself from the consumers of your class, i.e. make the field private.
The lock is released when you exit the body of the lock statement. That means that your code is not thread-safe.
In other words, you can be sure that two threads won't be executing return someList on the same object at the same time. But it's certainly possible that one thread will execute Add() at the same time as another thread will execute RemoveAt(), which is what makes it non thread-safe.
The lock is released when the code inside the lock is finished executing.
Also, locking on this will only affect the current instance of the object
Ok, just for the hell of it.
There IS a way to make your object threadsafe, by using architectures that are already threadsafe.
For example, you could make your object a single threaded COM object. The COM object will be thread safe, but you'll pay with performance (the price of being lazy and not managing your own locks).
Create a COM Object in C#
...others said already, but just to formalize a problem a bit...
First, lock (this) {...} suggests a 'scope' - e.g. like using (){} - it only locks (this in this case) for variables inside. And that's a 'good thing' :) actually, as if you couldn't rely on that the whole locks/synchronization concept would be very much useless,
lock (this) { return something; } is an oxymoron of a sort - it's returning something that unlocks the very same moment it returns,
And the problems I think is the understanding of how it works. 'lock()' is not 'persisted' in a state of the object, so that you could return it etc. Take a look here how it's implemented How does lock work exactly? - answer explains it. It's more of a 'critical section' - i.e. you protect certain parts of 'code', which uses the variable - not the variable itself. Any type of synchronization requires 'synchronization objects' to hold the locks - and to be disposed of once lock is no longer needed. Take a look at this post https://stackoverflow.com/a/251668/417747, Esteban formulated that very well,
"Finally, there is the common misconception that lock(this) actually modifies the object passed as a parameter, and in some way makes it read-only or inaccessible. This is false. The object passed as a parameter to lock merely serves as a key" (this is a quote)
you either (usually) lock 'private code' of a class method, property... - to synchronize access to something you're doing inside - and access being to a private member (again usually) so that nobody else can access it w/o going through your synchronized code.
Or you make a thread-safe 'structure' - like a list - which is already 'synchronized' inside so that you can access it in a thread-safe manner. But there are no such things (or not used, almost never) as passing locks around or having one place lock the variable, while the other part of the code unlocks it etc. (in that case it's some type of EventWaitHandle that's rather used to synchronize things in between 'distant' code where one fires off on another etc.)
In your case, the choice is I think to go with the 'synchronized structure', i.e. the list that's internally handled,
Ok, I've used locks quite a bit, but I've never had this scenario before. I have two different classes that contain code used to modify the same MSAccess database:
public class DatabaseNinja
{
public void UseSQLKatana
{
//Code to execute queries against db.TableAwesome
}
}
public class DatabasePirate
{
public void UseSQLCutlass
{
//Code to execute queries against db.TableAwesome
}
}
This is a problem, because transactions to the database cannot be executed in parallel, and these methods (UseSQLKatana and UseSQLCutlass) are called by different threads.
In my research, I see that it is bad practice to use a public object as a lock object so how do I lock these methods so that they don't run in tandem? Is the answer simply to have these methods in the same class? (That is actually not so simple in my real code)
Well, first off, you could create a third class:
internal class ImplementationDetail
{
private static readonly object lockme = new object();
public static void DoDatabaseQuery(whatever)
{
lock(lockme)
ReallyDoQuery(whatever);
}
}
and now UseSQLKatana and UseSQLCutlass call ImplementationDetail.DoDatabaseQuery.
Second, you could decide to not worry about it, and lock an object that is visible to both types. The primary reason to avoid that is because it becomes difficult to reason about who is locking the object, and difficult to protect against hostile partially trusted code locking the object maliciously. If you don't care about either downside then you don't have to blindly follow the guideline.
The reason it's bad practice to lock on a public object is that you can never be sure who ELSE is locking on that object. Although unlikely, someone else someday can decide that they want to grab your lock object, and do some process that ends up calling your code, where you lock onto that same lock object, and now you have an impossible deadlock to figure out. (It's the same issue for using 'this').
A better way to do this would be to use a public Mutex object. These are much more heavyweight, but it's much easier to debug the issue.
Use a Mutex.
You can create mutex in main class and call Wait method at the beginning of each class (method); then set mutex so when the other method is called it gonna wait for first class to finish.
Ah, remember to release mutex exiting from those methods...
I see two differing questions here:
Why is it a bad idea to lock on a public object?
The idea is that locking on an object restricts access while the lock is maintained - this means none of its members can be accessed, and other sources may not be aware of the lock and attempt to utilise the instance, even trying to acquire a lock themselves, hence causing problems.
For this reason, use a dedicated object instance to lock onto.
How do I lock these methods so that they don't run in tandem?
You could consider the Mutex class; creating a 'global' mutex will allow your classes to operate on the basis of knowing the state of the lock throughout the application. Or, you could use a shared ReaderWriterLockSlim instance, but I wouldn't really recommend the cross-class sharing of it.
You can use a public LOCK object as a lock object. You'll just have to specify that the object you're creating is a Lock object solely used for locking the Ninja and Pirate class.
public ArrayList InputBuffer
{
get { lock (this.in_buffer) { return this.in_buffer; } }
}
is this.in_buffer locked during a call to InputBuffer.Clear?
or does the property simply lock the in_buffer object while it's getting the reference to it; the lock exits, and then that reference is used to Clear?
No, the property locks the reference while it's getting that reference. Pretty pointless, to be honest... this is more common:
private readonly object mutex = new object();
private Foo foo = ...;
public Foo Foo
{
get
{
lock(mutex)
{
return foo;
}
}
}
That lock would only cover the property access itself, and wouldn't provide any protection for operations performed with the Foo. However, it's not the same as not having the lock at all, because so long as the variable is only written while holding the same lock, it ensures that any time you read the Foo property, you're accessing the most recent value of the property... without the lock, there's no memory barrier and you could get a "stale" result.
This is pretty weak, but worth knowing about.
Personally I try to make very few types thread-safe, and those tend to have more appropriate operations... but if you wanted to write code which did modify and read properties from multiple threads, this is one way of doing so. Using volatile can help too, but the semantics of it are hideously subtle.
The object is locked inside the braces of the lock call, and then it is unlocked.
In this case the only code in the lock call is return this.in_buffer;.
So in this case, the in_buffer is not locked during a call to InputBuffer.Clear.
One solution to your problem, using extension methods, is as follows.
private readonly object _bufLock;
class EMClass{
public static void LockedClear(this ArrayList a){
lock(_bufLock){
a.Clear();
}
}
}
Now when you do:
a.LockedClear();
The Clear call will be done in a lock.
You must ensure that the buffer is only accessed inside _bufLocks.
In addition to what others have said about the scope of the lock, remember that you aren't locking the object, you are only locking based on the object instance named.
Common practice is to have a separate lock mutex as Jon Skeet exemplifies.
If you must guarantee synchronized execution while the collection is being cleared, expose a method that clears the collection, have clients call that, and don't expose your underlying implementation details. (Which is good practice anyway - look up encapsulation.)
I read all documentation about thread-safe types and the "lock" statement, but I am still not getting it 100%.
When exactly do I need to use the "lock" statement? How it relates to (non) thread-safe types? Thank you.
Imagine an instance of a class with a global variable in it. Imagine two threads call a method on that object at exactly the same time, and that method updates the global variable inside.
The likelihood is that value in the variable will get corrupted. Different languages and compilers/interpreters will deal with this in different ways (or not at all...) but the point is that you get "undesired" and "unpredictable" results.
Now imagine that the method obtains a "lock" on the variable before attempting to read from or write to it. The first thread to call the method will get a "lock" on the variable, the second thread to call the method will have to wait until the lock is released by the first thread. While you still have a race condition (i.e. the second thread might overwrite the value from the first) at least you have predictable results because no two threads (that are unaware of each other) can modify the value at the same time.
You use the lock statement to obtain that lock on the variable. Typically you'd define a separate object variable and use that for the lock object:
public class MyThreadSafeClass
{
private readonly object lockObject = new object();
private string mySharedString;
public void ThreadSafeMethod(string newValue)
{
lock (lockObject)
{
// Once one thread has got inside this lock statement, any others will have to wait outside for their turn...
mySharedString = newValue;
}
}
}
A type is deemed "thread-safe" if it applies the principle that no corruption will occur if shared data is accessed by multiple threads at the same time.
Beware the difference between "immutable" and "thread-safe". Thread-safe says that you have coded for the scenario and won't get corruption if two threads access shared state at the same time, whereas immutability is simply saying you return a new object rather than modifying it. Immutable objects are thread-safe, but not all thread-safe objects are immutable.
Thread safe code means code that can be accessed with many threads and still operate correctly.
In C#, this normally requires some sort of synchronization mechanism. A simple one is the lock statement (which is behind the scenes a call to Monitor.Enter). A code block that is surrounded by a lock block can only be accessed by one thread at a time.
Any use of a type that is not thread safe requires you to manage synchronization yourself.
A good resource to learn about threading in C# is the free eBook by Joe Albahari, found here.
http://en.wikipedia.org/wiki/Thread_safety
I'm using C# & .NEt 3.5. What is the difference between the OptionA and OptionB ?
class MyClass
{
private object m_Locker = new object();
private Dicionary<string, object> m_Hash = new Dictionary<string, object>();
public void OptionA()
{
lock(m_Locker){
// Do something with the dictionary
}
}
public void OptionB()
{
lock(m_Hash){
// Do something with the dictionary
}
}
}
I'm starting to dabble in threading (primarly for creating a cache for a multi-threaded app, NOT using the HttpCache class, since it's not attached to a web site), and I see the OptionA syntax in a lot of the examples I see online, but I don't understand what, if any, reason that is done over OptionB.
Option B uses the object to be protected to create a critical section. In some cases, this more clearly communicates the intent. If used consistently, it guarantees only one critical section for the protected object will be active at a time:
lock (m_Hash)
{
// Across all threads, I can be in one and only one of these two blocks
// Do something with the dictionary
}
lock (m_Hash)
{
// Across all threads, I can be in one and only one of these two blocks
// Do something with the dictionary
}
Option A is less restrictive. It uses a secondary object to create a critical section for the object to be protected. If multiple secondary objects are used, it's possible to have more than one critical section for the protected object active at a time.
private object m_LockerA = new object();
private object m_LockerB = new object();
lock (m_LockerA)
{
// It's possible this block is active in one thread
// while the block below is active in another
// Do something with the dictionary
}
lock (m_LockerB)
{
// It's possible this block is active in one thread
// while the block above is active in another
// Do something with the dictionary
}
Option A is equivalent to Option B if you use only one secondary object. As far as reading code, Option B's intent is clearer. If you're protecting more than one object, Option B isn't really an option.
It's important to understand that lock(m_Hash) does NOT prevent other code from using the hash. It only prevents other code from running that is also using m_Hash as its locking object.
One reason to use option A is because classes are likely to have private variables that you will use inside the lock statement. It is much easier to just use one object which you use to lock access to all of them instead of trying to use finer grain locks to lock access to just the members you will need. If you try to go with the finer grained method you will probably have to take multiple locks in some situations and then you need to make sure you are always taking them in the same order to avoid deadlocks.
Another reason to use option A is because it is possible that the reference to m_Hash will be accessible outside your class. Perhaps you have a public property which supplies access to it, or maybe you declare it as protected and derived classes can use it. In either case once external code has a reference to it, it is possible that the external code will use it for a lock. This also opens up the possibility of deadlocks since you have no way to control or know what order the lock will be taken in.
Actually, it is not good idea to lock on object if you are using its members.
Jeffrey Richter wrote in his book "CLR via C#" that there is no guarantee that a class of object that you are using for synchronization will not use lock(this) in its implementation (It's interesting, but it was a recommended way for synchronization by Microsoft for some time... Then, they found that it was a mistake), so it is always a good idea to use a special separate object for synchronization. So, as you can see OptionB will not give you a guarantee of deadlock - safety.
So, OptionA is much safer that OptionB.
It's not what you're "Locking", its the code that's contained between the lock { ... } thats important and that you're preventing from being executed.
If one thread takes out a lock() on any object, it prevents other threads from obtaining a lock on the same object, and hence prevents the second thread from executing the code between the braces.
So that's why most people just create a junk object to lock on, it prevents other threads from obtaining a lock on that same junk object.
I think the scope of the variable you "pass" in will determine the scope of the lock.
i.e. An instance variable will be in respect of the instance of the class whereas a static variable will be for the whole AppDomain.
Looking at the implementation of the collections (using Reflector), the pattern seems to follow that an instance variable called SyncRoot is declared and used for all locking operations in respect of the instance of the collection.
Well, it depends on what you wanted to lock(be made threadsafe).
Normally I would choose OptionB to provide threadsafe access to m_Hash ONLY. Where as OptionA, I would used for locking value type, which can't be used with the lock, or I had a group of objects that need locking concurrently, but I don't what to lock the whole instance by using lock(this)
Locking the object that you're using is simply a matter of convenience. An external lock object can make things simpler, and is also needed if the shared resource is private, like with a collection (in which case you use the ICollection.SyncRoot object).
OptionA is the way to go here as long as in all your code, when accessing the m_hash you use the m_Locker to lock on it.
Now Imagine this case. You lock on the object. And that object in one of the functions you call has a lock(this) code segment. In this case that is a sure unrecoverable deadlock