I have multiple threads accessing a single Int32 variable with "++" or "--" operation.
Do we need to lock before accessing it as below?
lock (idleAgentsLock)
{
    idleAgents--; // i.e. perform the decrement inside the lock
}
Or what consequences will there be if I don't do the locking?
It is not "atomic", not in the multi-threaded sense. However your lock protects the operation, theoretically at least. When in doubt you can of course use Interlocked.Decrement instead.
No, ++/-- is not an atomic operation; however, reads and writes of integers and other primitive types are atomic operations.
See these links:
Is accessing a variable in C# an atomic operation?
Is the ++ operator thread safe?
What operations are atomic in C#?
Also this blog post may be of some interest to you.
I would suggest Interlocked.Increment and Interlocked.Decrement from the Interlocked class for an atomic implementation of the ++/-- operators.
It depends on the machine architecture. In general, no, it is not atomic: the compiler may generate a load instruction, increment/decrement the value, and then store it. So other threads may indeed read the value between those operations.
Most CPU instruction sets have a special atomic test-and-set instruction for this purpose. Assuming you don't want to embed assembly instructions in your C# code, the next best approach is to use mutual exclusion, similar to what you've shown. The implementation of that mechanism ultimately uses an instruction that is atomic to implement the mutex (or whatever it uses).
In short: yes, you should ensure mutual exclusion.
Beyond the scope of this answer, there are other techniques for managing shared data that may be appropriate or not depending on the domain logic of your situation.
An unprotected increment/decrement is not thread-safe, and thus not atomic between threads. (Although it might be "atomic" with respect to the actual IL/ML transform¹.)
This LINQPad example code shows unpredictable results:
void Main()
{
    int nWorkers = 10;
    int nLoad = 200000;
    int counter = nWorkers * nLoad;
    List<Thread> threads = new List<Thread>();
    for (var i = 0; i < nWorkers; i++) {
        var th = new Thread((_) => {
            for (var j = 0; j < nLoad; j++) {
                counter--; // bad
            }
        });
        th.Start();
        threads.Add(th);
    }
    foreach (var w in threads) {
        w.Join();
    }
    counter.Dump();
}
Note that the visibility between threads is of importance. Synchronization guarantees this visibility in addition to atomicity.
This code is easily fixed, at least in the limited context presented. Switch out the decrement and observe the results:
counter--; // bad
Interlocked.Decrement(ref counter); // good
lock (threads) { counter--; } // good
¹ Even when using a volatile variable, the results are still unpredictable. This seems to indicate that (at least here, when I just ran it) it is also not an atomic operator, as read/op/write of competing threads were interleaved. To see that the behavior is still incorrect when visibility issues are removed (are they?), add
class x {
    public static volatile int counter;
}
and modify the above code to use x.counter instead of the local counter variable.
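For concreteness, the worker's inner loop then becomes:

for (var j = 0; j < nLoad; j++) {
    x.counter--; // still loses updates: volatile does not make read-modify-write atomic
}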
Joe Albahari has a great series on multithreading that's a must read and should be known by heart for anyone doing C# multithreading.
In part 4 however he mentions the problems with volatile:
Notice that applying volatile doesn’t prevent a write followed by a read from being swapped, and this can create brainteasers. Joe Duffy illustrates the problem well with the following example: if Test1 and Test2 run simultaneously on different threads, it’s possible for a and b to both end up with a value of 0 (despite the use of volatile on both x and y)
Followed by a note that the MSDN documentation is incorrect:
The MSDN documentation states that use of the volatile keyword ensures that the most up-to-date value is present in the field at all times. This is incorrect, since as we’ve seen, a write followed by a read can be reordered.
I've checked the MSDN documentation, which was last changed in 2015 but still lists:
The volatile keyword indicates that a field might be modified by multiple threads that are executing at the same time. Fields that are declared volatile are not subject to compiler optimizations that assume access by a single thread. This ensures that the most up-to-date value is present in the field at all times.
Right now I still avoid volatile in favor of the more verbose lock-based pattern below, to prevent threads from using stale data:
private int foo;
private object fooLock = new object();
public int Foo {
    get { lock (fooLock) return foo; }
    set { lock (fooLock) foo = value; }
}
As the parts about multithreading were written in 2011, is the argument still valid today? Should volatile still be avoided at all costs in favor of locks or full memory fences, to avoid introducing very hard-to-reproduce bugs that, as mentioned, can even depend on the CPU vendor the code runs on?
Volatile in its current implementation is not broken, despite popular blog posts claiming such a thing. It is, however, badly specified, and the idea of using a modifier on a field to specify memory ordering is not that great (compare volatile in Java/C# to C++'s atomic specification, which had enough time to learn from the earlier mistakes). The MSDN article, on the other hand, was clearly written by someone who has no business talking about concurrency and is completely bogus; the only sane option is to completely ignore it.
Volatile guarantees acquire/release semantics when accessing the field and can only be applied to types that allow atomic reads and writes. Not more, not less. This is enough to be useful to implement many lock-free algorithms efficiently such as non-blocking hashmaps.
One very simple sample is using a volatile variable to publish data. Thanks to the volatile on x, the assertion in the following snippet cannot fire:
private int a;
private volatile bool x;

public void Publish()
{
    a = 1;
    x = true;
}

public void Read()
{
    if (x)
    {
        // if we observe x == true, we will always see the preceding write to a
        Debug.Assert(a == 1);
    }
}
Volatile is not easy to use, and in most situations you are much better off going with some higher-level concept, but when performance is important or you're implementing some low-level data structures, volatile can be exceedingly useful.
As I read the MSDN documentation, I believe it is saying that if you see volatile on a variable, you do not have to worry about compiler optimizations screwing up the value by reordering the operations. It doesn't say that you are protected from errors caused by your own code executing operations on separate threads in the wrong order (although, admittedly, the passage is not clear on this).
volatile is a very limited guarantee. It means that the variable isn't subject to compiler optimizations that assume access from a single thread. This means that if you write to the variable from one thread and then read it from another, the reading thread will see the latest value. Without volatile, on a multiprocessor machine, the compiler may make assumptions about single-threaded access, for example by keeping the value in a register, which prevents other processors from seeing the latest value.
As the code example you've mentioned shows, it doesn't protect you from having the operations in the two methods reordered relative to each other. In effect, volatile makes each individual access to a volatile variable atomic. It doesn't make any guarantees as to the atomicity of groups of such accesses.
If you just want to ensure that your property has an up-to-date single value, you should be able to just use volatile.
The problem comes in if you try to perform multiple parallel operations as if they were atomic. If you have to force several operations to be atomic together, you need to lock the whole operation. Consider the example again, but using locks:
class DoLocksReallySaveYouHere
{
    int x, y;
    object xlock = new object(), ylock = new object();

    void Test1() // Executed on one thread
    {
        lock (xlock) { x = 1; }
        lock (ylock) { int a = y; }
        ...
    }

    void Test2() // Executed on another thread
    {
        lock (ylock) { y = 1; }
        lock (xlock) { int b = x; }
        ...
    }
}
The locks may cause some synchronization, which may prevent both a and b from having the value 0 (I have not tested this). However, since x and y are locked independently, either a or b can still non-deterministically end up with a value of 0.
So in the case of wrapping the modification of a single variable, you should be safe using volatile, and would not really be any safer using lock. If you need to atomically perform multiple operations, you need to use a lock around the entire atomic block, otherwise scheduling will still cause non-deterministic behavior.
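For contrast, here is a sketch of the whole-block locking described above (the class and lock names are illustrative). Because one lock covers both the write and the read, whichever method enters second is guaranteed to see the other's write, so a and b can no longer both end up 0:

class OneLockDoesSaveYouHere
{
    int x, y;
    readonly object bothLock = new object(); // a single lock guards both fields

    void Test1() // Executed on one thread
    {
        lock (bothLock) { x = 1; int a = y; }
    }

    void Test2() // Executed on another thread
    {
        lock (bothLock) { y = 1; int b = x; }
    }
}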
Here are some useful disassemblies for volatile in C#: https://sharplab.io/#gist:625b1181356b543157780baf860c9173
On x86 it is just about:
using memory instead of registers
preventing compiler optimizations like in the case with the endless loop
I use volatile when I just want to tell compiler that a field might be updated from many different threads and I do not need additional features provided by interlocked operations.
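For instance, a minimal sketch of that kind of usage (names are illustrative) is a stop flag that one thread writes and another reads; no read-modify-write is involved, so volatile's visibility guarantee is all that is needed:

class Worker
{
    private volatile bool _stopRequested; // written by one thread, read by another

    public void RequestStop() { _stopRequested = true; }

    public void Run()
    {
        while (!_stopRequested)
        {
            // do work; the volatile read ensures we eventually observe RequestStop()
        }
    }
}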
I have a property with a backing field which I want to make thread-safe (get and set).
The get and set methods have no logic except setting and returning the value.
I think there are two ways to encapsulate the logic in the property itself (volatile and lock).
Is my understanding of the two correct, or have I made any mistakes?
Below are my examples:
public class ThreadSafeClass
{
    // 1. Volatile Example:
    private volatile int _processState_1;

    public int ProcessState_1
    {
        get { return _processState_1; }
        set { _processState_1 = value; }
    }

    // 2. Locking Example:
    private readonly object _processState_2Lock = new object();
    private int _processState_2;

    public int ProcessState_2
    {
        get
        {
            lock (_processState_2Lock)
            {
                return _processState_2;
            }
        }
        set
        {
            lock (_processState_2Lock)
            {
                _processState_2 = value;
            }
        }
    }
}
For more information, see the great site by J. Albahari:
Synchronization constructs can be divided into four categories:
Simple blocking methods:
These wait for another thread to finish or for a period of time to elapse. Sleep, Join, and Task.Wait are simple blocking methods.
Locking constructs:
These limit the number of threads that can perform some activity or execute a section of code at a time. Exclusive locking constructs are most common — these allow just one thread in at a time, and allow competing threads to access common data without interfering with each other. The standard exclusive locking constructs are lock (Monitor.Enter/Monitor.Exit), Mutex, and SpinLock. The nonexclusive locking constructs are Semaphore, SemaphoreSlim, and the reader/writer locks.
Signaling constructs:
These allow a thread to pause until receiving a notification from another, avoiding the need for inefficient polling. There are two commonly used signaling devices: event wait handles and Monitor’s Wait/Pulse methods. Framework 4.0 introduces the CountdownEvent and Barrier classes.
Non-blocking synchronization constructs:
These protect access to a common field by calling upon processor primitives. The CLR and C# provide the following nonblocking constructs: Thread.MemoryBarrier, Thread.VolatileRead, Thread.VolatileWrite, the volatile keyword, and the Interlocked class.
The volatile keyword:
The volatile keyword instructs the compiler to generate an acquire-fence on every read from that field, and a release-fence on every write to that field. An acquire-fence prevents other reads/writes from being moved before the fence; a release-fence prevents other reads/writes from being moved after the fence. These “half-fences” are faster than full fences because they give the run-time and hardware more scope for optimization.
As it happens, Intel’s X86 and X64 processors always apply acquire-fences to reads and release-fences to writes — whether or not you use the volatile keyword — so this keyword has no effect on the hardware if you’re using these processors. However, volatile does have an effect on optimizations performed by the compiler and the CLR — as well as on 64-bit AMD and (to a greater extent) Itanium processors. This means that you cannot be more relaxed by virtue of your clients running a particular type of CPU.
The effect of applying volatile to fields can be summarized as follows:
First instruction   Second instruction   Can they be swapped?
Read                Read                 No
Read                Write                No
Write               Write                No (the CLR ensures that write-write operations are never swapped, even without the volatile keyword)
Write               Read                 Yes!
Notice that applying volatile doesn’t prevent a write followed by a read from being swapped, and this can create brainteasers. Joe Duffy illustrates the problem well with the following example: if Test1 and Test2 run simultaneously on different threads, it’s possible for a and b to both end up with a value of 0 (despite the use of volatile on both x and y):
class IfYouThinkYouUnderstandVolatile
{
    volatile int x, y;

    void Test1() // Executed on one thread
    {
        x = 1;     // Volatile write (release-fence)
        int a = y; // Volatile read (acquire-fence)
        ...
    }

    void Test2() // Executed on another thread
    {
        y = 1;     // Volatile write (release-fence)
        int b = x; // Volatile read (acquire-fence)
        ...
    }
}
The MSDN documentation states that use of the volatile keyword ensures that the most up-to-date value is present in the field at all times. This is incorrect, since as we’ve seen, a write followed by a read can be reordered.
This presents a strong case for avoiding volatile: even if you understand the subtlety in this example, will other developers working on your code also understand it? A full fence between each of the two assignments in Test1 and Test2 (or a lock) solves the problem.
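A sketch of the full-fence fix applied to the example (the class name is illustrative); with the fences in place, a and b can no longer both end up 0:

class FixedWithFullFences
{
    volatile int x, y;

    void Test1() // Executed on one thread
    {
        x = 1;
        Thread.MemoryBarrier(); // full fence: the write cannot be swapped with the following read
        int a = y;
    }

    void Test2() // Executed on another thread
    {
        y = 1;
        Thread.MemoryBarrier();
        int b = x;
    }
}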
The volatile keyword is not supported with pass-by-reference arguments or captured local variables: in these cases you must use the VolatileRead and VolatileWrite methods.
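For example, here is a sketch of the method-based equivalents on a deliberately non-volatile field (on .NET 4.5+, Volatile.Read and Volatile.Write serve the same purpose). Since these take the location by ref, they also work where the volatile keyword cannot be applied:

private int flag; // deliberately not marked volatile

public void SetFlag()
{
    Thread.VolatileWrite(ref flag, 1);
}

public bool IsFlagSet()
{
    return Thread.VolatileRead(ref flag) == 1;
}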
How come if I have a statement like this:
private int sharedValue = 0;

public void SomeMethodOne()
{
    lock (this) { sharedValue++; }
}

public void SomeMethodTwo()
{
    lock (this) { sharedValue--; }
}
So for a thread to get into a lock, it must first check whether another thread is holding it. If no other thread is, it can enter, and it has to write something to memory to mark its ownership; this surely cannot be atomic, as it requires both a read and a write.
So how come it's impossible for one thread to be reading the lock while the other is writing its ownership to it?
To simplify: why can't two threads both get into a lock at the same time?
It looks like you are basically asking how the lock works. How can the lock maintain internal state in an atomic manner without already having the lock built? It seems like a chicken-and-egg problem at first, does it not?
The magic all happens because of a compare-and-swap (CAS) operation. The CAS operation is a hardware-level instruction that does two important things:
It generates a memory barrier so that instruction reordering is constrained.
It compares the contents of a memory address with another value and if they are equal then the original value is replaced with a new value. It does all of this in an atomic manner.
At the most fundamental level this is how the trick is accomplished. It is not that all other threads are blocked from reading while another is writing. That is totally the wrong way to think about it. What actually happens is that all threads are acting as writers simultaneously. The strategy is more optimistic than it is pessimistic. Every thread is trying to acquire the lock by performing this special kind of write called a CAS. You actually have access to a CAS operation in .NET via the Interlocked.CompareExchange (ICX) method. Every synchronization primitive can be built from this single operation.
If I were going to write a Monitor-like class (that is what the lock keyword uses behind the scenes) from scratch entirely in C#, I could do it using the Interlocked.CompareExchange method. Here is an overly simplified implementation. Please keep in mind that this is most certainly not how the .NET Framework does it.¹ The reason I present the code below is to show you how it could be done in pure C# code, without the need for CLR magic behind the scenes, and because it might get you thinking about how Microsoft could have implemented it.
using System.Threading;

public class SimpleMonitor
{
    private int m_LockState = 0; // 0 = free, 1 = held

    public void Enter()
    {
        int iterations = 0;
        while (!TryEnter())
        {
            if (iterations < 10) Thread.SpinWait(4 << iterations);
            else if (iterations % 20 == 0) Thread.Sleep(1);
            else if (iterations % 5 == 0) Thread.Sleep(0);
            else Thread.Yield();
            iterations++;
        }
    }

    public void Exit()
    {
        if (!TryExit())
        {
            throw new SynchronizationLockException();
        }
    }

    public bool TryEnter()
    {
        // Atomically: if m_LockState == 0 then set it to 1; returns the old value.
        return Interlocked.CompareExchange(ref m_LockState, 1, 0) == 0;
    }

    public bool TryExit()
    {
        // Atomically: if m_LockState == 1 then set it to 0; returns the old value.
        return Interlocked.CompareExchange(ref m_LockState, 0, 1) == 1;
    }
}
This implementation demonstrates a couple of important things.
It shows how the ICX operation is used to atomically read and write the lock state.
It shows how the waiting might occur.
Notice how I used Thread.SpinWait, Thread.Sleep(0), Thread.Sleep(1) and Thread.Yield while the lock is waiting to be acquired. The waiting strategy is overly simplified, but it does approximate a real life algorithm implemented in the BCL already. I intentionally kept the code simple in the Enter method above to make it easier to spot the crucial bits. This is not how I would have normally implemented this, but I am hoping it does drive home the salient points.
Also note that my SimpleMonitor above has a lot of problems. Here are just a few:
It does not handle nested locking.
It does not provide Wait or Pulse methods like the real Monitor class. They are really hard to do right.
¹ The CLR will actually use a special block of memory that exists on each reference type. This block of memory is referred to as the "sync block". The Monitor will manipulate bits in this block of memory to acquire and release the lock. This action may require a kernel event object. You can read more about it on Joe Duffy's blog.
lock in C# uses the Monitor class behind the scenes to do the actual locking.
You can read more about Monitor in here: http://msdn.microsoft.com/en-us/library/system.threading.monitor.aspx. The Enter method of the Monitor ensures that only one thread can enter the critical section at the time:
Acquires a lock for an object. This action also marks the beginning of a critical section. No other thread can enter the critical section unless it is executing the instructions in the critical section using a different locked object.
BTW, you should avoid locking on this (lock(this)). You should use a private variable on a class (static or non-static) to protect the critical section. You can read more in the same link provided above but the reason is:
When selecting an object on which to synchronize, you should lock only on private or internal objects. Locking on external objects might result in deadlocks, because unrelated code could choose the same objects to lock on for different purposes.
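Applied to the snippet from the question, a sketch of the recommended pattern looks like this:

private int sharedValue = 0;
private readonly object syncRoot = new object(); // private: no outside code can lock on it

public void SomeMethodOne()
{
    lock (syncRoot) { sharedValue++; }
}

public void SomeMethodTwo()
{
    lock (syncRoot) { sharedValue--; }
}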
I have a thread that spins until an int changed by another thread is a certain value.
int cur = this.m_cur;
while (cur > this.Max)
{
    // spin until cur is <= max
    cur = this.m_cur;
}
Does this.m_cur need to be declared volatile for this to work? Is it possible that this will spin forever due to compiler optimization?
Yes, that's a hard requirement. The just-in-time compiler is allowed to store the value of m_cur in a processor register without refreshing it from memory. The x86 jitter in fact does, the x64 jitter doesn't (at least the last time I looked at it).
The volatile keyword is required to suppress this optimization.
Volatile means something entirely different on Itanium cores, a processor with a weak memory model. Unfortunately that's what made it into the MSDN Library and the C# language specification. What it is going to mean on an ARM core remains to be seen.
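Concretely, the fix is just a matter of the field declaration (assuming m_cur is an int field):

// Forces the JIT to reload the field from memory on every read,
// so the spin loop cannot get stuck on a stale register copy.
private volatile int m_cur;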
The blog below has some fascinating detail on the memory model in C#. In short, it seems safer to use the volatile keyword.
http://igoro.com/archive/volatile-keyword-in-c-memory-model-explained/
From the blog below
class Test
{
    private bool _loop = true;

    public static void Main()
    {
        Test test1 = new Test();

        // Set _loop to false on another thread
        new Thread(() => { test1._loop = false; }).Start();

        // Poll the _loop field until it is set to false
        while (test1._loop == true) ;

        // The loop above will never terminate!
    }
}
There are two possible ways to get the while loop to terminate: use a lock to protect all accesses (reads and writes) to the _loop field, or mark the _loop field as volatile. There are two reasons why a read of a non-volatile field may observe a stale value: compiler optimizations and processor optimizations.
It depends on how m_cur is being modified. If it's using a normal assignment statement such as m_cur--;, then it does need to be volatile. However, if it's being modified using one of the Interlocked operations, then it doesn't because Interlocked's methods automatically insert a memory barrier to ensure that all threads get the memo.
In general, using Interlocked to modify atomic values that are shared across threads is the preferable option. Not only does it take care of the memory barrier for you, but it also tends to be a bit faster than other synchronization options.
That said, as others have said, polling loops are enormously wasteful. It would be better to pause the thread that needs to wait, and let whoever is modifying m_cur take charge of waking it up when the time comes. Monitor.Wait()/Monitor.Pulse() and AutoResetEvent might both be well-suited to the task, depending on your specific needs.
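A sketch of the Wait/Pulse approach, using the fields from the question (the lock object and method names are illustrative):

private readonly object m_curLock = new object();

public void WaitUntilNotAboveMax()
{
    lock (m_curLock)
    {
        while (this.m_cur > this.Max)
        {
            Monitor.Wait(m_curLock); // releases the lock and blocks until pulsed
        }
    }
}

public void SetCur(int value)
{
    lock (m_curLock)
    {
        this.m_cur = value;
        Monitor.PulseAll(m_curLock); // wake waiters so they can re-check the condition
    }
}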
Suppose I have a variable "counter", and there are several threads accessing and setting the value of "counter" by using Interlocked, i.e.:
int value = Interlocked.Increment(ref counter);
and
int value = Interlocked.Decrement(ref counter);
Can I assume that, the change made by Interlocked will be visible in all threads?
If not, what should I do to make all threads synchronize the variable?
EDIT: someone suggested me to use volatile. But when I set the "counter" as volatile, there is compiler warning "reference to volatile field will not be treated as volatile".
When I read online help, it said, "A volatile field should not normally be passed using a ref or out parameter".
InterlockedIncrement/Decrement on x86 CPUs (x86's lock add/dec) automatically create a memory barrier, which gives visibility to all threads (i.e., all threads see the updates in order, as in sequential memory consistency). A memory barrier forces all pending memory loads/stores to complete. volatile is not related to this question, although C# and Java (and some C/C++ compilers) use volatile to enforce a memory barrier; the interlocked operation already has a memory barrier supplied by the CPU.
Please also take a look at my other answer on Stack Overflow.
Note that I have assumed that C#'s InterlockedIncrement/Decrement are intrinsic mappings to x86's lock add/dec.
Can I assume that, the change made by Interlocked will be visible in all threads?
This depends on how you read the value. If you "just" read it, then no, this won't always be visible in other threads unless you mark it as volatile. That causes an annoying warning though.
As an alternative (and much preferred IMO), read it using another Interlocked instruction. This will always see the updated value on all threads:
int readvalue = Interlocked.CompareExchange(ref counter, 0, 0);
which returns the value read, and if it was 0 swaps it with 0.
Motivation: the warning hints that something isn't right; combining the two techniques (volatile & interlocked) wasn't the intended way to do this.
Update: it seems that another approach to reliable 32-bit reads without using "volatile" is by using Thread.VolatileRead as suggested in this answer. There is also some evidence that I am completely wrong about using Interlocked for 32-bit reads, for example this Connect issue, though I wonder if the distinction is a bit pedantic in nature.
What I really mean is: don't use this answer as your only source; I'm having my doubts about this.
Actually, they aren't. If you want to safely modify counter, then you are doing the correct thing. But if you want to read counter directly you need to declare it as volatile. Otherwise, the compiler has no reason to believe that counter will change because the Interlocked operations are in code that it might not see.
Interlocked ensures that only one thread at a time can update the value. To ensure that other threads can read the correct value (and not a cached one), mark it as volatile.
public volatile int Counter;
No; using Interlocked only at the write site does not ensure that reads of the variable elsewhere in the code are actually fresh; a program that does not correctly read from the field as well might not be thread-safe, even under a "strong memory model". This applies to any form of assigning to a field shared between threads.
Here is an example of code that will never terminate due to the JIT. (It was modified from Memory Barriers in .NET to be a runnable LINQPad program updated for the question).
// Run this as a LINQPad program in "Release Mode".
// ~ It will never terminate on .NET 4.5.2 / x64. ~
// The program will terminate in "Debug Mode" and may terminate
// in other CLR runtimes and architecture targets.
class X {
    // Adding {volatile} would 'fix the problem', as it prevents the JIT
    // optimization that results in the non-terminating code.
    public int terminate = 0;
    public int y;

    public void Run() {
        var r = new ManualResetEvent(false);
        var t = new Thread(() => {
            int x = 0;
            r.Set();
            // Using Volatile.Read or otherwise establishing
            // an Acquire Barrier would disable the 'bad' optimization.
            while (terminate == 0) { x = x * 2; }
            y = x;
        });
        t.Start();
        r.WaitOne();
        Interlocked.Increment(ref terminate);
        t.Join();
        Console.WriteLine("Done: " + y);
    }
}

void Main()
{
    new X().Run();
}
The explanation from Memory Barriers in .NET:
This time it is JIT, not the hardware. It’s clear that JIT has cached the value of the variable terminate [in the EAX register and the] program is now stuck in the loop highlighted above ..
Either using a lock or adding a Thread.MemoryBarrier inside the while loop will fix the problem. Or you can even use Volatile.Read [or a volatile field]. The purpose of the memory barrier here is only to suppress JIT optimizations. Now that we have seen how software and hardware can reorder memory operations, it’s time to discuss memory barriers ..
That is, an additional barrier construct is required on the read side to prevent issues with compilation and JIT reordering/optimizations: this is a different issue from memory coherency!
Adding volatile here would prevent the JIT optimization, and thus 'fix the problem', even though it results in a warning. This program can also be corrected through the use of Volatile.Read or one of the various other operations that cause a barrier: these barriers are as much a part of CLR/JIT program correctness as the underlying hardware memory fences.
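For reference, a minimal sketch of that read-side fix in the loop above:

// Volatile.Read establishes an acquire barrier, which suppresses the JIT's
// register-caching optimization, so the loop observes the increment and exits.
while (Volatile.Read(ref terminate) == 0) { x = x * 2; }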