Use of volatile (Thread.VolatileRead / Thread.VolatileWrite) in C#

In a multi-threaded program running on a multi-CPU machine, do I need to access shared state (_data in the example code below) using volatile reads/writes to ensure correctness?
In other words, can heap objects be cached on the CPU?
Using the example below, and assuming multiple threads will access the GetValue and Add methods, I need ThreadA to be able to add data (using the Add method) and ThreadB to be able to see/get that added data immediately (using the GetValue method). So do I need to add volatile reads/writes to _data to ensure this? Basically, I don’t want the added data to be cached on ThreadA’s CPU.
I am not locking (enforcing exclusive thread access) because the code needs to be ultra-fast, and I am not removing any data from _data, so I don’t need to lock _data.
Thanks.
**** Update ****************************
Obviously you guys think going lock-free with this example is a bad idea. But what side effects or exceptions could I face here?
Could the Dictionary type throw an exception if 1 thread is iterating the values for read and another thread is iterating the values for update? Or would I just experience “dirty reads” (which would be fine in my case)?
**** End Update ****************************
public sealed class Data
{
    private volatile readonly Dictionary<string, double> _data = new Dictionary<string, double>();

    public double GetVaule(string key)
    {
        double value;
        if (!_data.TryGetValue(key, out value))
        {
            throw new ArgumentException(string.Format("Key {0} does not exist.", key));
        }
        return value;
    }

    public void Add(string key, double value)
    {
        _data.Add(key, value);
    }

    public void Clear()
    {
        _data.Clear();
    }
}
Thanks for the replies. Regarding the locks: the methods are pretty much constantly called by multiple threads, so my problem is with contested locks, not the actual lock operation.
So my question is about CPU caching: can heap objects (the _data instance field) be cached on a CPU? Do I need to access the _data field using volatile reads/writes?
Also, I am stuck with .NET 2.0.
Thanks for your help.

The MSDN docs for Dictionary<TKey, TValue> say that it's safe for multiple readers but they don't give the "one writer, multiple readers" guarantee that some other classes do. In short, I wouldn't do this.
You say you're avoiding locking because you need the code to be "ultra-fast" - have you tried locking to see what the overhead is? Uncontested locks are very cheap, and when the lock is contested that's when you're benefiting from the added safety. I'd certainly profile this extensively before deciding to worry about the concurrency issues of a lock-free solution. ReaderWriterLockSlim may be useful if you've actually got multiple readers, but it sounds like you've got a single reader and a single writer, at least at the moment - simple locking will be easier in this case.

I think you may be misunderstanding the use of the volatile keyword (either that or I am, and someone please feel free to correct me). The volatile keyword guarantees that get and set operations on the value of the variable itself from multiple threads will always deal with the same copy. For instance, if I have a bool that indicates a state then setting it in one thread will make the new value immediately available to the other.
However, you never change the value of your variable (in this case, a reference). All that you do is manipulate the area of memory that the reference points to. Declaring it as volatile readonly (which, if my understanding is sound, defeats the purpose of volatile by never allowing it to be set) won't have any effect on the actual data that's being manipulated (the back-end store for the Dictionary<>).
All that being said, you really need to use a lock in this case. Your danger extends beyond the prospect of "dirty reads" (meaning that what you read would have been, at some point, valid) into truly unknown territory. As Jon said, you really need proof that locking produces unacceptable performance before you try to go down the road of lockless coding. Otherwise that's the epitome of premature optimization.

The problem is that your add method:
public void Add(string key, double value)
{
    _data.Add(key, value);
}
Could cause _data to decide to completely re-organise the data it's holding - at that point a GetVaule request could fail in any possible way.
You need a lock or a different data structure / data structure implementation.
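The lock-based alternative that these answers recommend could look roughly like the sketch below. It is only an illustration that compiles against .NET 2.0: the class name LockedData and the _sync field are made up for this example, and the readonly/volatile combination from the question is dropped.

```csharp
using System;
using System.Collections.Generic;

public sealed class LockedData
{
    // Dedicated private lock object (hypothetical name); every access to _data takes it.
    private readonly object _sync = new object();
    private readonly Dictionary<string, double> _data = new Dictionary<string, double>();

    public double GetValue(string key)
    {
        lock (_sync)
        {
            double value;
            if (!_data.TryGetValue(key, out value))
            {
                throw new ArgumentException(string.Format("Key {0} does not exist.", key));
            }
            return value;
        }
    }

    public void Add(string key, double value)
    {
        // Writers take the same lock as readers, so a reader can never
        // observe the dictionary in the middle of an internal resize.
        lock (_sync)
        {
            _data.Add(key, value);
        }
    }

    public void Clear()
    {
        lock (_sync)
        {
            _data.Clear();
        }
    }
}
```

Whether the contention on _sync is actually a problem is something only profiling can show, as the answers above point out.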

I don't think volatile can be a replacement of locking if you start calling methods on it. You are guaranteeing that the thread A and thread B sees the same copy of the dictionary, but you can still access the dictionary simultaneously. You can use multi-moded locks to increase concurrency. See ReaderWriterLockSlim for example.
"Represents a lock that is used to manage access to a resource, allowing multiple threads for reading or exclusive access for writing."

The volatile keyword is not about locking. It indicates that the value of the specified field might be changed or read by a different thread, or by anything else that can run concurrently with your code. This is crucial for the compiler to know, because many optimizations involve caching the variable's value and rearranging instructions. The volatile keyword tells the compiler to be "cautious" when optimizing instructions that reference a volatile variable.
There are many ways to use a dictionary from multiple threads. The simplest is the lock keyword, which has adequate performance. If you need higher performance, you might need to implement your own dictionary for your specific task.

Volatile is not locking; it has nothing to do with synchronization. It's generally safe to do lock-free reads on read-only data. Note that even though you don't remove anything from _data, you do call _data.Add(). That is NOT read-only. So yes, this code will blow up in your face in a variety of exciting and difficult-to-predict ways.
Use locks, it's simple, it's safer. If you're a lock-free guru (you're not!), AND profiling shows a bottleneck related to contention for the lock, AND you cannot solve the contention issues via partitioning or switching to spin-locks THEN AND ONLY THEN can you investigate a solution to get lock-free reads, which WILL involve writing your own Dictionary from scratch and MAY be faster than the locking solution.
Are you starting to see how far off base you are in your thinking here? Just use a damn lock!

Related

Generic dictionary - possible locking issue?

Specific Answers Only Please! I'm decently familiar with the better(best) practices around collection locking, thread safety etc. Just want some answers / ideas around this specific scenario.
We have some legacy code of the type:
public class GodObject
{
    private readonly Dictionary<string, string> _signals;

    //bunch of methods accessing the dictionary
    private void SampleMethod1()
    {
        lock(_signals)
        {
            //critical code section 1
        }
    }

    public void SampleMethod2()
    {
        lock(_signals)
        {
            //critical code section 2
        }
    }
}
All access to the dictionary is inside such lock statements. We're getting some bugs which could be explained if the locking was not explicitly working - meaning 2 or more threads getting simultaneous access to the dictionary.
So my question is this: is there any scenario where the critical sections could be simultaneously accessed by multiple threads? To me, it should not be possible, since the reference is readonly, so it's not as though the object could be changing, and most of the issues around lock() are about deadlocks rather than synchronization not happening. But maybe I'm missing some nuance or something glaring?
This is running in a long running windows service .NET Framework 3.5.
There are three problems I can imagine occurring outside the code you posted:
1. Somebody might access the dictionary without locking on it. Using lock on an object will prevent anyone else from using lock on the same object at the same time, but it won't do anything to prevent other threads from using the object without locking on it. Note that because it would not have been overly difficult to have written Dictionary [and for that matter List] in such a way as to allow safe simultaneous use by multiple readers and one writer that only adds information, some people may assume that read methods don't need locking. Unfortunately, that assumption is false: Microsoft could have added such thread safety fairly cheaply, but didn't.
2. As Servy suggested, someone might be assuming that the collection won't change between calls to two independent methods.
3. If some code which acquires a lock assumes a collection isn't going to change while the lock is held, but then calls some outside method while holding the lock, it's possible that the outside method could change the object despite the lock being held.
Unless the object which owns the dictionary keeps all references to itself, so that no outside code ever gets a reference to the dictionary, I think the first of these problems is perhaps the most likely. The other two problems can also occur sometimes, however.
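The third problem can be reproduced on a single thread, because the monitor behind lock is reentrant: code called while the lock is held can take the same lock again and modify the collection. The sketch below is only an illustration; ReentrancyDemo, ForEachSignal and AddSignal are hypothetical names, not part of the legacy code in the question.

```csharp
using System;
using System.Collections.Generic;

public static class ReentrancyDemo
{
    private static readonly Dictionary<string, string> Signals = new Dictionary<string, string>();

    // Iterates under the lock and calls caller-supplied code for each key.
    public static void ForEachSignal(Action<string> callback)
    {
        lock (Signals)
        {
            foreach (string key in Signals.Keys)
            {
                callback(key);
            }
        }
    }

    // Takes the same lock; on the thread that already holds it this
    // succeeds immediately, because the lock is reentrant.
    public static void AddSignal(string key, string value)
    {
        lock (Signals)
        {
            Signals[key] = value;
        }
    }

    // Returns true if modifying the dictionary from inside the callback
    // invalidated the enumeration, even though every access was locked.
    public static bool Run()
    {
        AddSignal("a", "1");
        try
        {
            ForEachSignal(delegate(string key) { AddSignal(key + "-copy", "2"); });
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Here every access is inside a lock statement, yet the enumeration still fails, which is the kind of symptom that can look as if the locking "was not working".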

Should a lock variable be declared volatile?

I have the following Lock statement:
private readonly object ownerLock_ = new object();
lock (ownerLock_)
{
}
Should I use volatile keyword for my lock variable?
private readonly volatile object ownerLock_ = new object();
On MSDN I saw that it usually used for a field that is accessed without locking, so if I use Lock I don't need to use volatile?
From MSDN:
"The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access."
If you're only ever accessing the data that the lock "guards" while you own the lock, then yes - making those fields volatile is superfluous. You don't need to make the ownerLock_ variable volatile either. (You haven't currently shown any actual code within the lock statement, which makes it hard to talk about in concrete terms - but I'm assuming you'd actually be reading/modifying some data within the lock statement.)
volatile should be very rarely used in application code. If you want lock-free access to a single variable, Interlocked is almost always simpler to reason about. If you want lock-free access beyond that, I would almost always start locking. (Or try to use immutable data structures to start with.)
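As a rough illustration of the Interlocked suggestion, a counter shared between threads might be written as follows. HitCounter and its members are hypothetical names chosen for this sketch.

```csharp
using System.Threading;

public sealed class HitCounter
{
    private int _hits;

    // Atomically adds one and returns the new value; no lock statement needed.
    public int Increment()
    {
        return Interlocked.Increment(ref _hits);
    }

    // CompareExchange with an identical comparand and replacement never
    // changes the field; it just returns the current value atomically.
    public int Read()
    {
        return Interlocked.CompareExchange(ref _hits, 0, 0);
    }
}
```

This only covers a single variable. As soon as two or more pieces of state have to change together, the advice above to start with a lock applies.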
I'd only expect to see volatile within code which is trying to build higher level abstractions for threading - so within the TPL codebase, for example. It's really a tool for experts who really understand the .NET memory model thoroughly... of whom there are very few, IMO.
If something is readonly it's thread-safe, period. (Well, almost. An expert might be able to figure out how to get a NullReferenceException on your lock statement, but it wouldn't be easy.) With readonly you don't need volatile, Interlocked, or locking. It's the ideal keyword for multi-threading, and you should use it wherever you can. It works great for a lock object, where its big disadvantage (you can't change the value) doesn't matter.
Also, while the reference is immutable, the object referenced may not be. "new object()" is here, but if it was a List or something else mutable--and not thread-safe--you would want to lock the reference (and all other references to it, if any) to keep the object from changing in two threads at once.

How to ensure thread safe in the static method

Do I need to declare an static Object and use lock on it like
private static readonly Object padlock = new Object()
public static Test()
{
    lock(padlock) {
        // Blah Blah Blah
    }
}
(Your code wouldn't currently compile, by the way - Readonly should be readonly, and you need to give padlock a type.)
It depends on what you're doing in the method. If the method doesn't use any shared data, or uses it in a way which is already safe, then you're fine.
You generally only need to lock if you're accessing shared data in an otherwise non-thread-safe way. (And all access to that shared data needs to be done in a thread-safe way.)
Having said that, I should point out that "thread safe" is a pretty vague term. Eric Lippert has a great blog post about it... rather than trying to come up with a "one size fits all" approach, you should think about what you're trying to protect against, what scenarios you're anticipating etc.
Jon is right; it is really not clear what you are asking here. The way I would interpret your question is:
If I have some shared state that needs to be made thread-safe by locking it, am I required to declare a private static object as the lock object?
The answer to that question is no, you are not required to do so. However, doing so is a really good idea, so you should do so even if you are not required.
You might think, well, there are lots of objects I could use. If the object I am locking is a reference type, I could use it. Or I could use the Type object associated with the containing class.
The problem with those things as locks is it becomes difficult to track down every possible bit of code that could be using the thing as a lock. Therefore it becomes difficult to analyze the code to ensure that there are no deadlocks due to lock ordering issues. And therefore, you are much more likely to get deadlocks. Having a dedicated lock object makes it much easier; you know that every single usage of that object is for the purposes of locking, and you can then understand what is going on inside those locks.
This is particularly true if you ever have untrusted, hostile code running in your appdomain. Locking a type object requires no particular permissions; what stops hostile code from locking all the types and never unlocking them? Nothing, that's what.

locking only when modifying vs entire method

When should locks be used? Only when modifying data or when accessing it as well?
public class Test {
    static Dictionary<string, object> someList = new Dictionary<string, object>();
    static object syncLock = new object();

    public static object GetValue(string name) {
        if (someList.ContainsKey(name)) {
            return someList[name];
        } else {
            lock(syncLock) {
                object someValue = GetValueFromSomeWhere(name);
                someList.Add(name, someValue);
                return someValue; // without this return the method does not compile
            }
        }
    }
}
Should there be a lock around the entire block, or is it OK to just add it to the actual modification? My understanding is that there could still be a race condition where one call does not find the value and starts to add it, while another call right after runs into the same situation, but I'm not sure. Locking is still so confusing. I haven't run into any issues with code like the above, but I could just have been lucky so far. Any help with the above would be appreciated, as well as any good resources on how/when to lock objects.
You have to lock when reading too, or you can get unreliable data, or even an exception if a concurrent modification physically changes the target data structure.
In the case above, you need to make sure that multiple threads don't try to add the value at the same time, so you need at least a read lock while checking whether it is already present. Otherwise multiple threads could find that the value is not present (since this check is not locked), decide to add it, and then all try to add it in turn (after getting the lock).
You could use a ReaderWriterLockSlim if you have many reads and only a few writes. In the code above you would acquire the read lock to do the check and upgrade to a write lock once you decide you need to add it. In most cases, only a read lock (which allows your reader threads to still run in parallel) would be needed.
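A minimal sketch of that check-then-upgrade pattern, assuming .NET 3.5 or later where ReaderWriterLockSlim is available. UpgradeableCache, GetOrAdd, Contains and CreateValue are hypothetical names; CreateValue merely stands in for GetValueFromSomeWhere from the question.

```csharp
using System.Collections.Generic;
using System.Threading;

public class UpgradeableCache
{
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();
    private readonly ReaderWriterLockSlim _rw = new ReaderWriterLockSlim();

    public object GetOrAdd(string name)
    {
        // Upgradeable mode: the check and the possible add happen under one
        // lock, so no other thread can add the same key in between.
        _rw.EnterUpgradeableReadLock();
        try
        {
            object value;
            if (_items.TryGetValue(name, out value))
            {
                return value;
            }

            // Upgrade to exclusive access only when a write is really needed.
            _rw.EnterWriteLock();
            try
            {
                value = CreateValue(name);
                _items.Add(name, value);
                return value;
            }
            finally
            {
                _rw.ExitWriteLock();
            }
        }
        finally
        {
            _rw.ExitUpgradeableReadLock();
        }
    }

    // Pure readers use the plain read lock and can run in parallel.
    public bool Contains(string name)
    {
        _rw.EnterReadLock();
        try { return _items.ContainsKey(name); }
        finally { _rw.ExitReadLock(); }
    }

    // Hypothetical stand-in for the expensive lookup in the question.
    private static object CreateValue(string name)
    {
        return new object();
    }
}
```

One caveat with this design: only one thread at a time can hold the lock in upgradeable mode, so concurrent GetOrAdd calls serialize against each other, while plain readers such as Contains can still overlap with the check.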
There is a summary of the available .Net 4 locking primitives here. Definitely you should understand this before you get too deep into multithreaded code. Picking the correct locking mechanism can make a huge performance difference.
You are correct that you have been lucky so far - that's a frequent feature of concurrency bugs. They are often hard to reproduce without targeted load testing, meaning correct design (and exhaustive testing, of course) is vital to avoid embarrassing and confusing production bugs.
Lock the whole block before you check for the existence of name. Otherwise, in theory, another thread could add it between the check, and your code that adds it.
Actually locking just when you perform the Add really doesn't do anything at all. All that would do is prevent another thread from adding something simultaneously. But since that other thread would have already decided it was going to do the add, it would just try to do it anyway as soon as the lock was released.
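Locking the whole check-and-add block, as suggested here, might look like the following sketch. CachedLookup is a hypothetical name, and GetValueFromSomeWhere is a placeholder body, since the question never shows the real one.

```csharp
using System.Collections.Generic;

public class CachedLookup
{
    private static readonly Dictionary<string, object> someList = new Dictionary<string, object>();
    private static readonly object syncLock = new object();

    public static object GetValue(string name)
    {
        // The existence check and the add happen under the same lock, so no
        // other thread can add the key between the two steps.
        lock (syncLock)
        {
            object someValue;
            if (!someList.TryGetValue(name, out someValue))
            {
                someValue = GetValueFromSomeWhere(name);
                someList.Add(name, someValue);
            }
            return someValue;
        }
    }

    // Placeholder (assumption): stands in for whatever the real lookup does.
    private static object GetValueFromSomeWhere(string name)
    {
        return name.ToUpperInvariant();
    }
}
```

The trade-off is that a slow GetValueFromSomeWhere call now blocks every other caller for its full duration, which is the cost of guaranteeing that each key is computed and added only once.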
If a resource is only ever read by multiple threads and never modified, you do not need any locks.
If a resource can be accessed by multiple threads and can be modified, then all accesses/modifications need to be synchronized. In your example, if GetValueFromSomeWhere takes a long time to return, it is possible for a second call to be made with the same value in name, but the value has not been stored in the Dictionary.
Use ReaderWriterLock, or ReaderWriterLockSlim if it is available in your framework version (it was introduced in .NET 3.5).
You acquire the reader lock for reads (which allows concurrent reads) and upgrade it to the writer lock when something needs to be written (which allows only one writer at a time and blocks all readers, as well as other writer threads, until it is done).
Make sure to release your locks with the pattern to avoid deadlocking:
void Write(object[] args)
{
    // Block until the writer lock has been acquired.
    this.ReaderWriterLock.AcquireWriterLock(Timeout.Infinite);
    try
    {
        this.myData.Write(args);
    }
    finally
    {
        // Always release the lock, even if the write throws.
        this.ReaderWriterLock.ReleaseWriterLock();
    }
}

Is a lock necessary in this situation?

Is it necessary to protect access to a single variable of a reference type in a multi-threaded application? I currently lock that variable like this:
private readonly object _lock = new object();
private MyType _value;
public MyType Value
{
get { lock (_lock) return _value; }
set { lock (_lock) _value = value; }
}
But I'm wondering if this is really necessary? Isn't assignment of a value to a field atomic? Can anything go wrong if I don't lock in this case?
P.S.: MyType is an immutable class: all the fields are set in the constructor and don't change. To change something, a new instance is created and assigned to the variable above.
Being atomic is rarely enough.
I generally want to get the latest value for a variable, rather than potentially see a stale one - so some sort of memory barrier is required, both for reading and writing. A lock is a simple way to get this right, at the cost of potentially losing some performance due to contention.
I used to believe that making the variable volatile would be enough in this situation. I'm no longer convinced this is the case. Basically I now try to avoid writing lock-free code when shared data is involved, unless I'm able to use building blocks written by people who really understand these things (e.g. Joe Duffy).
There is the volatile keyword for this. Whether it's safe without it depends on the scenario. But the compiler can do funny stuff, such as reorganize order of operation. So even read/write to one field may be unsafe.
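For reference, the volatile variant mentioned here would amount to something like the sketch below. The body of MyType is invented for illustration, since the question only says that it is immutable, and Holder is a hypothetical name. Whether this is sufficient for a given usage is exactly what the previous answer has doubts about.

```csharp
public sealed class MyType
{
    // Immutable: the only field is set in the constructor (illustrative field).
    private readonly int _number;

    public MyType(int number)
    {
        _number = number;
    }

    public int Number
    {
        get { return _number; }
    }
}

public sealed class Holder
{
    // volatile gives reads acquire semantics and writes release semantics,
    // so the compiler and JIT may not cache the reference in a register.
    private volatile MyType _value;

    public MyType Value
    {
        get { return _value; }
        set { _value = value; }
    }
}
```

This only protects the single reference swap. A read-modify-write such as "read Value, build a new MyType from it, assign it back" is still a race without a lock or Interlocked.CompareExchange.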
It can be an issue. It's not just the assignment itself you have to be concerned with. Due to caching, concurrent threads might see an old version of the object if you don't lock. So whether a lock is necessary will depend on precisely how you use it, and you don't show that.
Here's a free, sample chapter of "Concurrent Programming in Windows" which explains this issue in detail.
It all depends on whether the property will be accessed by multiple threads. Some variable assignments are atomic operations, and in that case there is no need to use a lock. Sorry for my poor English.
In your case, with an immutable type, I think a lock is not necessary.
