As I read deeper and deeper into the meaning of the volatile keyword, I keep saying to myself "this is way into implementation, this should not be a part of a high level programming language".
I mean, the fact that CPUs cache data should be interesting to the JIT compiler, not to the C# programmer.
A conceivable alternative might be an attribute (say, VolatileAttribute).
What do you think?
I think you got side-tracked. All the tech stuff about caching etc. is part of an attempt to explain it in low-level terms. The functional description for volatile would be "I might be shared". Given that by default nothing can be shared between threads, this is not altogether strange. And I think it is fundamental enough to warrant a keyword over an attribute, but I suppose it was largely influenced by historic decisions (C++).
One way to replace/optimize it is with VolatileRead() and VolatileWrite() calls. But that's even more 'implementation'.
Well, I certainly agree, it is pretty horrible that such an implementation detail is exposed. It is, however, the exact same kind of detail that's exposed by the lock keyword. We are still very far from having that bug generator completely removed from our code.
The hardware guys have a lot of work to do. The volatile keyword matters a lot on CPU cores with a weak memory model. The marketplace hasn't been kind to them; the Alpha and the Itanium haven't done well. I'm not exactly sure why, but I suspect that the difficulty of writing solid threaded code for these cores has a lot to do with it. Getting it wrong is quite a nightmare to debug. The verbiage in the MSDN Library documentation for volatile applies to these kinds of processors; it is otherwise quite inappropriate for x86/x64 cores and makes it sound as though the keyword is doing far more than it really does. On those cores, volatile merely prevents variable values from being stored in CPU registers.
Unfortunately, volatile still matters on x86 cores in very select circumstances. I haven't yet found any evidence that it matters on x64 cores. As far as I can tell, and backed up by the source code in SSCLI20, the Opcodes.Volatile instruction is a no-op for the x64 jitter, changing neither the compiler state nor emitting any machine code. That's heading in the right direction.
Generic advice is that wherever you're contemplating volatile, using lock or one of the synchronization classes should be your first consideration. Avoiding them to try to optimize your code is a micro-optimization, defeated by the amount of sleep you'll lose when your program is exhibiting thread race problems.
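For what it's worth, the one textbook use that keeps coming up is a stop flag polled by a worker loop. A minimal sketch (my own illustration, not from any of the referenced docs):

    using System.Threading;

    class Worker
    {
        // Without volatile, the jitter could legally hoist the read of _stop
        // out of the loop and spin forever on a stale value.
        private volatile bool _stop;

        public void Run()
        {
            while (!_stop)
            {
                Thread.Sleep(10); // stand-in for a unit of work
            }
        }

        public void Stop() => _stop = true;
    }

Even here, a CancellationToken or a lock-based design is usually the saner choice, per the advice above.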
Using an attribute would be acceptable if it were the other way around, that is, if the compiler assumed that all variables are volatile unless explicitly marked with an attribute saying they are safe. That would be incredibly detrimental to performance.
Hence it's assumed that, since having a variable's value changed outside the view of the compiler is an aberration, the compiler may assume that it is not happening.
However, that could happen in a program, so the language itself must have a way of expressing it.
Also, you seem confused about "implementation details". The term refers to things the compiler does behind your back. That is not the case here. Your code is modifying a variable outside the view of the compiler. Since it's in your code, it will always be true. Hence the language must be able to indicate that.
volatile in C# emits the correct barriers, or fences, which matter to the programmer doing multi-threaded work. Bear in mind that the compiler, the runtime, and the processor can all reorder reads/writes to some degree (each has its own rules). Though CLR 2.0 has a stronger memory model than the ECMA CLI specification requires, the CLR memory model is still not the strictest memory model, so you still have a need for volatile in C#.
As for an attribute: I don't think you can use attributes inside a method body, so the keyword is necessary.
IIRC, in C++ volatile was mainly about memory-mapped I/O rather than caches: if you read the same port twice you can get different answers. Still, I would agree with your assessment that this is more cleanly expressed in C# as an attribute.
On the other hand, most actual uses of volatile in C# can better be understood as a thread lock anyway, so the choice of volatile may be a little unfortunate.
Edit: Just to add: two links to show that in C/C++ `volatile` is explicitly not for multithreading.
Related
I recently had an interview with a software company who asked me the following question:
Can you describe to me what adding volatile in front of variables does? Can you explain to me why it's important?
Most of my programming knowledge comes from C, yet the job position is for C# (I thought I might add this bit of info if necessary for the question specifically)
I answered by saying it just lets the compiler know that the variable can be used across processes or threads, and that it should not use optimizations on that variable, as optimizing it can degrade behavior. In a nutshell, it's a warning to the compiler.
According to the interviewer, however, it's the other way around, and the volatile keyword warns the OS, not the compiler.
I was a bit befuddled by this, so I did some research and actually found conflicting answers! Some sources say it's for the compiler, and others for the OS.
Which is it? Does it differ by language?
I answered by saying it just lets the compiler know that the variable can be used across processes or threads, and that it should not use optimizations on that variable, as optimizing it can degrade behavior. In a nutshell, it's a warning to the compiler.
This is going in the right direction for C# but misses some important aspects.
First off, delete "processes" entirely. Variables are not shared across processes in C#.
Second, don't concentrate on optimizations. Instead concentrate on permissible semantics. A compiler is not required to generate optimal code; a compiler is required to generate specification-compliant code. A re-ordering need not be for performance reasons and need not be faster / smaller / whatever. A volatile declaration adds an additional restriction on the permissible semantics of a multithreaded program.
Third, don't think of it as a warning to the compiler. It's a directive to the compiler: to generate code that is guaranteed to be compliant with the specification for volatile variables. How the compiler does so is up to it.
The actual answer to the question
Can you describe to me what adding volatile in front of variables does?
is: A C# compiler and runtime environment have great latitude to re-order variable reads and writes for any reason they see fit. They are restricted to only those re-orderings which preserve the meaning of programs on a single thread. So in "x = y; a = b;" the read of b could be moved to before the read of y; that's legal because the outcome is unchanged. (This is not the only restriction on re-ordering, but it is in some sense the most fundamental one.) However, re-orderings are permitted to be noticeable on multiple threads; it is possible that another thread observes that b is read before y. This can cause problems.
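To make the hazard concrete, here is a hedged sketch of the classic publication problem (names and numbers are mine, not from the specification):

    class Publication
    {
        private int _data;
        private bool _ready; // marking this volatile restores the intended ordering

        // Thread 1
        public void Publish()
        {
            _data = 42;
            _ready = true; // without volatile, a reader may observe this store
                           // before it observes the store to _data
        }

        // Thread 2
        public void Consume()
        {
            if (_ready)
            {
                int x = _data; // may legally read 0 instead of 42
            }
        }
    }

Whether the reordering actually happens depends on the jitter and the processor; the point is that the specification permits it unless _ready is volatile.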
The C# compiler and runtime have additional restrictions on how volatile reads and writes may be re-ordered with respect to each other, and furthermore how they may be ordered with respect to other events such as threads starting and stopping, locks, exceptions being thrown, and so on.
Consult the C# specification for a detailed list of the restrictions on observed orderings of reads, writes and other effects.
Note in particular that even with volatile variables, there is not required to be a consistent total ordering of all variable accesses as seen from all threads. And specifically, the notion that volatile "reads the latest value of the variable" is simply false; that phrasing suggests that there is such a thing as "the latest value", which implies a total consistent ordering.
If that sounds confusing, it is. Don't write multithreaded programs that share data across threads. If you must, use the highest level abstractions at your disposal. Hardly anyone should be writing code that uses volatile; use the TPL and let it manage your threads.
Now let's consider your answer in the context of C.
The question is ill-posed with respect to C. In C# volatile is a modifier on member variable declarations; in C, it's part of a type. So saying "before a variable" is ambiguous; where before the variable? There's a difference between a volatile int * x and an int * volatile x. (Can you see the difference?)
But more importantly: the C specification does not guarantee that volatile will have any particular behaviour with respect to threads. If your C compiler does, that's an extension of the language by your compiler vendor. Volatile in C is guaranteed to have certain behaviour with respect to memory mapped IO, long jumps, and signals, and that's all; if you rely on it to have certain behaviour with respect to threads then you are writing non-portable code.
According to the interviewer: it's the other way around, and the volatile keyword warns the OS, not the compiler.
That's nonsense from start to finish. Interviewers should not ask questions that they don't understand the answers to.
To be fairly honest, the question posed by the interviewer is kinda foggy as it is.
It really depends on what he/she meant by "OS". Are they talking about the "upfront OS", the pure software side of things, or are they misconstruing the "OS" as the hardware-software relationship, i.e. the RTE and MMM? (I've seen both assumptions and comparisons in some of my own personal interviews.) I think it should be noted that these two are quite distinctly different! If he/she is talking about the former, then NO, volatile does not "inform" the OS. If they are talking about the latter, then yes (this is a loose yes). At this point you are in the realm of the differences between the languages. As Cody Gray mentioned, C# is a managed language, and so the latter definition of the OS does indeed "get notified" of the variable and the precautions to take.
But in any case or definition of OS, the compiler does specially manage and deal with the volatile field, regardless of language. Otherwise, why have the keyword in the first place?
In my personal opinion, for whatever that's worth, I think you answered correctly, albeit in a way that, judging by the comments, can get complicated and hectic by nature.
Is there any difference regarding performance of private, protected, public and internal methods in C# class? I'm interested if one consumes more processor time or RAM.
I'm not aware of any performance difference for normal invocation; it's possible that more restricted access will take a little more work when accessing via dynamic invocation or reflection as the caller may need to be validated more carefully. In the normal JIT-compiled case the access can be validated by the CLR just once and then taken for granted. I guess it's possible that the JIT compilation (and IL verification) itself could be slightly slower for more restrictive access - but I find it hard to believe it would be significant.
This should absolutely not be a factor in determining which accessibility to use, even if somehow there is some tiny performance difference I'm unaware of. If you believe you may be able to achieve a performance benefit by making the accessibility something other than the "natural" one for your design, you should definitely benchmark the before/after case - I suspect you'll be hard-pressed to find a real-world situation where the difference is reliably measurable.
The same sort of advice goes for all kinds of micro-optimization: it's almost never a good idea anyway, and should definitely only be undertaken with careful measurement.
There will be no measurable difference in performance between private, protected, public or internal methods. If you are focused on optimization, you could possibly try making your bottleneck piece of code more "procedural" than object-oriented; that might yield a small improvement.
There are a lot of articles and discussions explaining why it is good to build thread-safe classes. It is said that if multiple threads access e.g. a field at the same time, there can only be some bad consequences. So, what is the point of keeping non thread-safe code? I'm focusing mostly on .NET, but I believe the main reasons are not language-dependent.
E.g. .NET static fields are not thread-safe. What would be the result if they were thread-safe by default? (without a need to perform "manual" locking). What are the benefits of using (actually defaulting to) non-thread-safety?
One thing that comes to my mind is performance (more of a guess, though). It's rather intuitive that, when a function or field doesn't need to be thread-safe, it shouldn't be. However, the question is: what for? Is thread-safety just an additional amount of code you always need to implement? In what scenarios can I be 100% sure that e.g. a field won't be used by two threads at once?
Writing thread-safe code:
Requires more skilled developers
Is harder and consumes more coding efforts
Is harder to test and debug
Usually has bigger performance cost
But! Thread-safe code is not always needed. If you can be sure that some piece of code will be accessed by only one thread, the list above becomes a huge and unnecessary overhead. It is like renting a van for a trip to the neighboring city when there are only two of you and not much luggage.
Thread safety comes with costs: you need to lock fields that might cause problems if accessed simultaneously.
In applications that make no use of threads, but need high performance where every CPU cycle counts, there is no reason to have thread-safe classes.
So, what is the point of keeping non thread-safe code?
Cost. Like you assumed, there usually is a penalty in performance.
Also, writing thread-safe code is more difficult and time consuming.
Thread safety is not a "yes" or "no" proposition. The meaning of "thread safety" depends upon context; does it mean "concurrent-read safe, concurrent write unsafe"? Does it mean that the application just might return stale data instead of crashing? There are many things that it can mean.
The main reason not to make a class "thread safe" is the cost. If the type won't be accessed by multiple threads, there's no advantage to putting in the work and increasing the maintenance cost.
Writing thread-safe code is painfully difficult at times. For example, simple lazy loading requires two checks for '== null' and a lock (see the sketch below). It's really easy to screw up.
[EDIT]
I didn't mean to suggest that threaded lazy loading was particularly difficult; it's the "Oh, and I didn't remember to lock that first!" moments, which come fast and hard once you think you're done with the locking, that are really the challenge.
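For the record, a minimal sketch of the double-checked locking pattern I had in mind (Expensive is a hypothetical class; note the field must be volatile for this to be correct):

    class LazyHolder
    {
        private static readonly object Sync = new object();
        private static volatile Expensive _instance; // volatile is essential here

        public static Expensive Instance
        {
            get
            {
                if (_instance == null)         // first check, without the lock
                {
                    lock (Sync)
                    {
                        if (_instance == null) // second check, under the lock
                            _instance = new Expensive();
                    }
                }
                return _instance;
            }
        }
    }

    class Expensive { /* costly to construct */ }

In practice Lazy<T> packages this up for you and is much harder to get wrong.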
There are situations where "thread-safe" doesn't make sense. This consideration is in addition to the higher developer skill and increased time (development, testing, and runtime all take hits).
For example, List<T> is a commonly-used non-thread-safe class. If we were to create a thread-safe equivalent, how would we implement GetEnumerator? Hint: there is no good solution.
Turn this question on its head.
In the early days of programming there was no Thread-Safe code because there was no concept of threads. A program started, then proceeded step by step to the end. Events? What's that? Threads? Huh?
As hardware became more powerful, concepts of what types of problems could be solved with software became more imaginative and developers more ambitious, the software infrastructure became more sophisticated. It also became much more top-heavy. And here we are today, with a sophisticated, powerful, and in some cases unnecessarily top-heavy software ecosystem which includes threads and "thread-safety".
I realize the question is aimed more at application developers than, say, firmware developers, but looking at the whole forest does offer insights into how that one tree evolved.
So, what is the point of keeping non thread-safe code?
By allowing for code that isn't thread safe you're leaving it up to the programmer to decide what the correct level of isolation is.
As others have mentioned this allows for complexity reduction and improved performance.
Rico Mariani wrote two articles, "Putting your synchronization at the correct level" and "Putting your synchronization at the correct level -- solution", that have a nice example of this in action.
In the articles he has a method called DoWork(), in which he calls Read on other classes twice, Write twice, and then LogToSteam.
Read, Write, and LogToSteam all shared a lock and were thread safe. This is good, except that because DoWork was also made thread safe, all the synchronizing work inside each Read, Write, and LogToSteam call was a complete waste of time.
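A hedged reconstruction of the shape of that example (my paraphrase, not Rico's actual code):

    class Service
    {
        private readonly object _lock = new object();

        // Each low-level operation is individually thread safe...
        public int Read() { lock (_lock) { return 0; /* ... */ } }
        public void Write(int v) { lock (_lock) { /* ... */ } }
        public void LogToSteam(string s) { lock (_lock) { /* ... */ } }

        // ...but DoWork holds the lock for the whole sequence anyway, so the
        // five inner acquisitions below are pure overhead (Monitor is reentrant,
        // so this still runs; it just wastes work).
        public void DoWork()
        {
            lock (_lock)
            {
                Write(Read());
                Write(Read());
                LogToSteam("done");
            }
        }
    }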
This is all related to the nature of imperative programming: its side effects create the need for this.
However, if you had a development platform where applications could be expressed as pure functions with no dependencies or side effects, then it would be possible to create applications where the threading was managed without developer intervention.
So, what is the point of keeping non thread-safe code?
The rule of thumb is to avoid locking as much as possible. The ideal code is re-entrant and thread safe without any locking. But that would be utopia.
Coming back to reality, a good programmer tries his level best to use sectional locking as opposed to locking the entire context. An example would be locking a few lines of code at a time in various routines rather than locking everything in a function.
Also, one has to refactor the code to come up with a design that minimizes the locking, if not gets rid of it entirely.
E.g. consider a foobar() function that gets new data on each call and uses a switch/case on the type of the data to change a node in a tree. The locking can be mostly (if not completely) avoided, as each case statement would touch a different node in the tree; a sketch follows below. This may be a rather specific example, but I think it illustrates my point.
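A minimal sketch of that per-node idea, assuming each node carries its own lock object (names are mine):

    class Node
    {
        public readonly object Gate = new object(); // one lock per node
        public int Value;
    }

    class Tree
    {
        // Lock only the node being touched, not the whole tree.
        public void Update(Node target, int newValue)
        {
            lock (target.Gate)
            {
                target.Value = newValue;
            }
        }
    }

This only stays correct as long as no two cases really do touch the same node; that invariant is on the programmer, not the compiler.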
A question like mine has been asked, but mine is a bit different. The question is, "Why is the volatile keyword not allowed in C# on types System.Double and System.Int64, etc.?"
On first blush, I answered my colleague, "Well, on a 32-bit machine, those types take at least two ticks to even enter the processor, and the .Net framework has the intention of abstracting away processor-specific details like that." To which he responds, "It's not abstracting anything if it's preventing you from using a feature because of a processor-specific problem!"
He's implying that a processor-specific detail should not show up to a person using a framework that "abstracts" details like that away from the programmer. So, the framework (or C#) should abstract away those and do what it needs to do to offer the same guarantees for System.Double, etc. (whether that's a Semaphore, memory barrier, or whatever). I argued that the framework shouldn't add the overhead of a Semaphore on volatile, because the programmer isn't expecting such overhead with such a keyword, because a Semaphore isn't necessary for the 32-bit types. The greater overhead for the 64-bit types might come as a surprise, so, better for the .Net framework to just not allow it, and make you do your own Semaphore on larger types if the overhead is acceptable.
That led to our investigating what the volatile keyword is all about. (see this page). That page states, in the notes:
In C#, using the volatile modifier on a field guarantees that all access to that field uses VolatileRead or VolatileWrite.
Hmmm.....VolatileRead and VolatileWrite both support our 64-bit types!! My question, then, is,
"Why is the volatile keyword not allowed in C# on types System.Double and System.Int64, etc.?"
He's implying that a processor-specific detail should not show up to a person using a framework that "abstracts" details like that away from the programmer.
If you are using low-lock techniques like volatile fields, explicit memory barriers, and the like, then you are entirely in the world of processor-specific details. You need to understand at a deep level precisely what the processor is and is not allowed to do as far as reordering, consistency, and so on, in order to write correct, portable, robust programs that use low-lock techniques.
The point of this feature is to say "I am abandoning the convenient abstractions guaranteed by single-threaded programming and embracing the performance gains possible by having a deep implementation-specific knowledge of my processor." You should expect fewer abstractions at your disposal when you start using low-lock techniques, not more.
You're going "down to the metal" for a reason, presumably; the price you pay is having to deal with the quirks of said metal.
Yes. The reason is that you can't even read a double or a long in one operation. I agree that it is a poor abstraction. I have a feeling the reason was that reading them atomically requires effort and would have been too smart for the compiler, so they let you choose the best solution yourself: locking, Interlocked, etc.
The interesting thing is that they can actually be read atomically on 32-bit hardware using MMX registers; this is what the Java JIT compiler does. And they can be read atomically on a 64-bit machine. So I think it is a serious flaw in the design.
Not really an answer to your question, but...
I'm pretty sure that the MSDN documentation you've referenced is incorrect when it states that "using the volatile modifier on a field guarantees that all access to that field uses VolatileRead or VolatileWrite".
Directly reading or writing to a volatile field only generates a half-fence (an acquire-fence when reading and a release-fence when writing).
The VolatileRead and VolatileWrite methods use MemoryBarrier internally, which generates a full-fence.
Joe Duffy knows a thing or two about concurrent programming; this is what he has to say about volatile:
(As an aside, many people wonder about the difference between loads and stores of variables marked as volatile and calls to Thread.VolatileRead and Thread.VolatileWrite. The difference is that the former APIs are implemented stronger than the jitted code: they achieve acquire/release semantics by emitting full fences on the right side. The APIs are more expensive to call too, but at least allow you to decide on a callsite-by-callsite basis which individual loads and stores need the MM guarantees.)
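In code terms, the contrast Duffy describes looks roughly like this (a sketch; the exact fences emitted depend on the jitter and the processor):

    using System.Threading;

    class Fences
    {
        private volatile int _flag; // reads get acquire, writes get release semantics
        private int _plain;

        public int ReadHalfFence()
        {
            return _flag; // half-fence only
        }

        public int ReadFullFence()
        {
            return Thread.VolatileRead(ref _plain); // emits a full MemoryBarrier internally
        }
    }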
It's a simple explanation of legacy. If you read this article - http://msdn.microsoft.com/en-au/magazine/cc163715.aspx - you'll find that the only implementation of the .NET Framework 1.x runtime was on x86 machines, so it made sense for Microsoft to implement it against the x86 memory model. x64 and IA64 were added later, so the base memory model was always that of x86.
Could it have been implemented for x86? I'm actually not sure it can be fully implemented - a ref of a double returned from native code could be aligned to 4 bytes instead of 8. In which case, all your guarantees of atomic reads/writes no longer hold true.
Starting from .NET Framework 4.5, it is possible to perform a volatile read or write on long or double variables by using the Volatile.Read and Volatile.Write methods. Although it's not documented, these methods perform atomic reads and writes on long/double variables, as is evident from their implementation:
private struct VolatileIntPtr { public volatile IntPtr Value; }
[Intrinsic]
[NonVersionable]
public static long Read(ref long location) =>
#if TARGET_64BIT
(long)Unsafe.As<long, VolatileIntPtr>(ref location).Value;
#else
// On 32-bit machines, we use Interlocked, since an ordinary volatile read would not be atomic.
Interlocked.CompareExchange(ref location, 0, 0);
#endif
Using these two methods is not as convenient as the volatile keyword, though. Attention is required not to forget to wrap every read/write access of the field in Volatile.Read or Volatile.Write respectively.
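For example, a long field would be handled roughly like this (a sketch assuming .NET Framework 4.5 or later):

    using System.Threading;

    class Counter
    {
        private long _value; // cannot be declared volatile

        // Every access must go through Volatile.Read/Volatile.Write by hand.
        public void Set(long v) => Volatile.Write(ref _value, v);
        public long Get() => Volatile.Read(ref _value);
    }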
I'm confused. Answers to my previous question seem to confirm my assumptions. But as stated here, volatile is not enough to ensure atomicity in .NET. Either operations like incrementation and assignment in MSIL are not translated directly to a single native opcode, or many CPUs can simultaneously read and write to the same RAM location.
To clarify:
I want to know if writes and reads are atomic on multiple CPUs?
I understand what volatile is about. But is it enough? Do I need to use interlocked operations if I want to get the latest value written by another CPU?
Herb Sutter recently wrote an article on volatile and what it really means (how it affects ordering of memory access and atomicity) in the native C++, .NET, and Java environments. It's a pretty good read:
volatile vs. volatile
volatile in .NET does make access to the variable atomic.
The problem is, that's often not enough. What if you need to read the variable, and if it is 0 (indicating that the resource is free), you set it to 1 (indicating that it's locked, and other threads should stay away from it).
Reading the 0 is atomic. Writing the 1 is atomic. But between those two operations, anything might happen. You might read a 0, and then, before you can write the 1, another thread jumps in, reads the 0, and writes a 1.
However, volatile in .NET does guarantee atomicity of accesses to the variable. It just doesn't guarantee thread safety for operations relying on multiple accesses to it. (Disclaimer: volatile in C/C++ does not even guarantee this. Just so you know. It is much weaker, and occasionally a source of bugs because people assume it guarantees atomicity. :))
So you need to use locks as well, to group together multiple operations as one thread-safe chunk. (Or, for simple operations, the Interlocked operations in .NET may do the trick)
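To illustrate, the read-then-write race above collapses into one atomic step with Interlocked.CompareExchange (a minimal sketch, names mine):

    using System.Threading;

    class SpinFlag
    {
        private int _locked; // 0 = free, 1 = taken

        public bool TryAcquire()
        {
            // Atomically: if _locked is 0, set it to 1; the call returns the
            // old value, so seeing 0 means we won the race.
            return Interlocked.CompareExchange(ref _locked, 1, 0) == 0;
        }

        public void Release() => Interlocked.Exchange(ref _locked, 0);
    }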
I might be jumping the gun here but it sounds to me as though you're confusing two issues here.
One is atomicity, which in my mind means that a single operation (that may require multiple steps) should not come in conflict with another such single operation.
The other is volatility, when is this value expected to change, and why.
Take the first. If your two-step operation requires you to read the current value, modify it, and write it back, you're most certainly going to want a lock, unless this whole operation can be translated into a single CPU instruction that can work on a single cache-line of data.
However, the second issue is what other threads will see, even when you're doing the locking thing.
A volatile field in .NET is a field that the compiler knows can change at arbitrary times. In a single-threaded world, the change of a variable is something that happens at some point in a sequential stream of instructions, so the compiler knows when it has added code that changes it, or at least when it has called out to the outside world, which may or may not have changed it, so that once the call returns, the value might not be the same as it was before.
This knowledge allows the compiler to lift the value from the field into a register once, before a loop or similar block of code, and never re-read the value from the field for that particular code.
With multi-threading however, that might give you some problems. One thread might have adjusted the value, and another thread, due to optimization, won't be reading this value for some time, because it knows it hasn't changed.
So when you flag a field as volatile, you're basically telling the compiler that it shouldn't assume it has the current value of the field at any point; instead it should grab a fresh snapshot every time it needs the value.
Locks solve multiple-step operations, volatility handles how the compiler caches the field value in a register, and together they will solve more problems.
Also note that if a field contains something that cannot be read in a single cpu-instruction, you're most likely going to want to lock read-access to it as well.
For instance, if you're on a 32-bit cpu and writing a 64-bit value, that write-operation will require two steps to complete, and if another thread on another cpu manages to read the 64-bit value before step 2 has completed, it will get half of the previous value and half of the new, nicely mixed together, which can be even worse than getting an outdated one.
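If you do need a 64-bit value on a 32-bit CPU without taking a full lock, Interlocked can at least make the individual reads and writes atomic; a hedged sketch:

    using System.Threading;

    class Wide
    {
        private long _value;

        // Interlocked.Read performs a single atomic 64-bit read
        // even on 32-bit platforms.
        public long SafeRead() => Interlocked.Read(ref _value);

        public void SafeWrite(long v) => Interlocked.Exchange(ref _value, v);
    }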
Edit: To answer the comment that volatile guarantees the atomicity of the read/write operation: that is true, in a way, because the volatile keyword cannot be applied to fields larger than 32 bits, in effect making the field readable/writable in a single CPU instruction on both 32- and 64-bit CPUs. And yes, it will prevent the value from being kept in a register as much as possible.
So part of the comment is wrong: volatile cannot be applied to 64-bit values.
Note also that volatile has some semantics regarding reordering of reads/writes.
For relevant information, see the MSDN documentation or the C# specification, found here, section 10.5.3.
On a hardware level, multiple CPUs can never write simultaneously to the same atomic RAM location. The size of an atomic read/write operation depends on the CPU architecture, but is typically 1, 2 or 4 bytes on a 32-bit architecture. However, if you try reading the result back, there is always a chance that another CPU has made a write to the same RAM location in between. On a low level, spin-locks are typically used to synchronize access to shared memory. In a high-level language, such mechanisms may be called e.g. critical regions.
The volatile type just makes sure the variable is written back to memory immediately when it is changed (even if the value is to be used again in the same function). A compiler will usually keep a value in an internal register for as long as possible if the value is to be reused later in the same function, and it is stored back to RAM when all modifications are finished or when the function returns. Volatile types are mostly useful when writing to hardware registers, or when you want to be sure a value is stored back to RAM in e.g. a multithreaded system.
Your question doesn't entirely make sense, because volatile specifies how the read happens, not the atomicity of multi-step processes. My car doesn't mow my lawn either, but I try not to hold that against it. :)
The problem comes in with register-based cached copies of your variables' values.
When reading a value, the CPU will first see if it's in a register (fast) before checking main memory (slower).
Volatile tells the compiler to push the value out to main memory as soon as possible, and not to trust the cached register value. It's only useful in certain cases.
If you're looking for single-opcode writes, you'll need to use Interlocked.Increment and its related methods. But they're fairly limited in what they can do in a single safe instruction.
The safest and most reliable bet is to lock() (if you can't do an Interlocked.*).
Edit: Writes and reads are atomic if they're in a lock or an Interlocked.* statement. Volatile alone is not enough under the terms of your question.
Volatile is a compiler keyword that tells the compiler what to do. It does not necessarily translate into the (essentially) bus operations that are required for atomicity. That is usually left up to the operating system.
Edit: to clarify, volatile is never enough if you want to guarantee atomicity. Or rather, it's up to the compiler to make it enough or not.