In many MSDN documents, this is written under the Thread Safety heading;
"Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe."
For example, here.
Can someone please explain it in a rather simple way?
Thank you :)
Eric Lippert has an excellent blog post about this. Basically it's somewhat meaningless on its own.
Personally I don't trust MSDN too much on this front, when I see that boiler-plate. It doesn't always mean what it says. For example, it says the same thing about Encoding - despite the fact that we all use encodings from multiple threads all over the place.
Unless I have any reason to believe otherwise (which I do with Encoding) I assume that I can call any static member from any thread with no corruption of global state. If I want to use instance members of the same object from different threads, I assume that's okay if I ensure - via locking - that only one thread will use the object at a time. (That's not always the case, of course. Some objects have thread affinity and actively dislike being used from multiple threads, even with locking in place. UI controls are the obvious example.)
Of course, it becomes tricky if objects are being shared unobviously - if I have two objects which each share a reference to a third, then I may end up using the first two objects independently from different threads, with all the proper locking - but still end up corrupting the third object.
If a type does advertise itself to be thread safe, I'd hope that it would give some details about it. It's easy if it's immutable - you can just use instances however you like without worrying about them. It's partially or wholly "thread-safe" types which are mutable where the details matter greatly.
You may access any public static member of that class from multiple threads at the same time, and not disrupt the state of the class. If multiple threads attempt to access the object using instance methods (those methods not marked "static") at the same time, the object may become corrupted.
A class is "thread-safe" if attempts to access the same instance of the class from multiple threads at the same time do not cause problems.
An object being "thread safe" means that if two threads are using it at (or very near, on single-CPU systems) the exact same time, there's no chance of it being corrupted by said access. That's usually achieved by acquiring and releasing locks, which can cause bottlenecks, so "thread safe" can also mean "slow" if it's done when it doesn't need to be.
Public static members are pretty much expected to be shared between threads (Note, VB even calls it "Shared"), so public statics are generally made in such a way that they can be used safely.
Instance members aren't usually thread-safe, because in the general case it'd slow things down. If you have an object you want to share between threads, therefore, you'll need to do your own synchronization/locking.
To understand this, consider the following example.
In the MSDN description of the .NET HashSet class, there is a section about thread safety. For the HashSet class, MSDN says “Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.”
Of course we all know the concept of race conditions and deadlocks, but what does Microsoft want to say in simple English?
If two threads add two values to an “instance” of a HashSet, there are some situations where we can get its count as one. Of course, in this situation the HashSet object is corrupted, since we now have two objects in the HashSet, yet its count shows only one. However, the public static members of HashSet will never face such corruption, even if two threads call them concurrently.
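As a rough sketch of what "do your own locking" looks like for the HashSet case above: every Add goes through the same lock, so the set's internal state and its Count stay in step. The names HashSetDemo, AddConcurrently and gate are made up for this illustration.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class HashSetDemo
{
    // Adds the values 0..count-1 to one shared HashSet from many threads,
    // taking the same lock around every Add, and returns the final Count.
    public static int AddConcurrently(int count)
    {
        var set = new HashSet<int>();
        var gate = new object(); // hypothetical name for the lock object

        Parallel.For(0, count, i =>
        {
            lock (gate) // only one thread at a time may touch the set
            {
                set.Add(i);
            }
        });

        // Parallel.For has returned, so no other thread is using the set now.
        return set.Count;
    }
}
```

Without the lock, the instance member Add is exactly what the MSDN boilerplate declines to guarantee.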
Related
Suppose I want to use a non thread-safe class from the .Net Framework (the documentation states that it is not thread-safe). Sometimes I change the value of Property X from one thread, and sometimes from another thread, but I never access it from two threads at the same time. And sometimes I call Method Y from one thread, and sometimes from another thread, but never at the same time.
Does this mean that I use the class in a thread-safe way, and the fact that the documentation states that it's not thread-safe is no longer relevant to my situation?
If the answer is No: Can I do everything related to a specific object in the same thread - i.e, creating it and calling its members always in the same thread (but not the GUI thread)? If so, how do I do that? (If relevant, it's a WPF app).
No, it is not thread safe. As a general rule, you should never write multithreaded code without some kind of synchronization. In your first example, even if you somehow manage to ensure that modifying and reading never happen at the same time, there is still the problem of cached values and instruction reordering.
For example, the CPU may cache a value in a register; you update it on one thread and read it from another. If the second thread has it cached, it doesn't go to RAM to fetch it and doesn't see the updated value.
Take a look at this great post for more information on the problems with writing lock-free multithreaded code. It has a great explanation of how the CPU, the compiler, and the CLI bytecode compiler can reorder instructions.
Suppose I want to use a non thread-safe class from the .Net Framework (the documentation states that it is not thread-safe).
"Thread-safe" has a number of different meanings. Most objects fall into one of three categories:
Thread-affine. These objects can only be accessed from a single thread, never from another thread. Most UI components fall into this category.
Thread-safe. These objects can be accessed from any thread at any time. Most synchronization objects (including concurrent collections) fall into this category.
One-at-a-time. These objects can be accessed from one thread at a time. This is the "default" category, with most .NET types falling into this category.
Sometimes I change the value of Property X from one thread, and sometimes from another thread, but I never access it from two threads at the same time. And sometimes I call Method Y from one thread, and sometimes from another thread, but never at the same time.
As another answerer noted, you have to take into consideration instruction reordering and cached reads. In other words, it's not sufficient to just do these at different times; you'll need to implement proper barriers to ensure it is guaranteed to work correctly.
The easiest way to do this is to protect all access of the object with a lock statement. If all reads, writes, and method calls are all within the same lock, then this would work (assuming the object does have a one-at-a-time kind of threading model and not thread-affine).
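A minimal sketch of that idea, assuming a one-at-a-time object: a small wrapper that routes every property read, property write and method call through one private lock. Widget is a made-up stand-in for the framework class in the question, and SynchronizedWidget and _gate are hypothetical names.

```csharp
// Hypothetical stand-in for a documented "not thread-safe" framework class.
public class Widget
{
    public int X { get; set; }
    public void Y() { X++; }
}

// Every access to the wrapped Widget goes through the same private lock,
// which also provides the memory barriers mentioned above.
public class SynchronizedWidget
{
    private readonly object _gate = new object();
    private readonly Widget _inner = new Widget();

    public int X
    {
        get { lock (_gate) { return _inner.X; } }
        set { lock (_gate) { _inner.X = value; } }
    }

    public void Y()
    {
        lock (_gate) { _inner.Y(); }
    }
}
```

This only helps if nothing else holds a reference to the inner object and uses it without the lock.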
Suppose I want to use a non thread-safe class from the .Net Framework (the documentation states that it is not thread-safe). Sometimes I change the value of Property X from one thread, and sometimes from another thread, but I never access it from two threads at the same time. And sometimes I call Method Y from one thread, and sometimes from another thread, but never at the same time.
All classes are non-thread-safe by default, except a few, like the concurrent collections, that are designed specifically for thread safety. So for any other class that you choose, if you access it from multiple threads or in a non-atomic manner, whether for reads or writes, it's imperative to introduce thread safety around changes to the object's state. This only applies to objects whose state can be modified in a multithreaded environment. Methods as such are just functional implementation; they are not themselves state that can be modified, and they only need thread safety in order to maintain the object's state.
Does this mean that I use the class in a thread-safe way, and the fact that the documentation states that it's not thread-safe is no longer relevant to my situation? If the answer is No: Can I do everything related to a class in the same thread (but not the GUI thread)? If so, how do I do that? (If relevant, it's a WPF app).
For a UI application, consider introducing async-await for IO-based operations, like file reads and database reads, and use the TPL for compute-bound operations. The benefit of async-await is that:
It doesn't block the UI thread at all and keeps the UI completely responsive. In fact, after the await, UI controls can be updated directly with no cross-thread concern, since only one thread is involved.
The TPL also keeps compute-bound operations from blocking the UI, but it takes its threads from the thread pool, and those can't be used for UI updates because of the cross-thread concern.
And last: there are classes in which one method starts an operation, and another one ends it. For example, using the SpeechRecognitionEngine class you can start a speech recognition session with RecognizeAsync (this method was before the TPL library so it does not return a Task), and then cancel the recognition session with RecognizeAsyncCancel. What if I call RecognizeAsync from one thread and RecognizeAsyncCancel from another one? (It works, but is it "safe"? Will it fail on some conditions which I'm not aware of?)
As you have mentioned an Async method, this might be an older implementation based on APM, which needs an AsyncCallback to coordinate, something along the lines of BeginXX and EndXX. If that's the case, then not much is required to coordinate, as they use the AsyncCallback to execute a callback delegate. In fact, as mentioned earlier, there's no extra thread involved here, whether it's the old version or the new async-await. Regarding task cancellation, a CancellationTokenSource can be used with async-await; a separate cancellation task is not required. Coordination between multiple threads can be done via AutoResetEvent / ManualResetEvent.
If the calls mentioned above are synchronous, then wrap them in a Task so that they can be called from an async method, as follows:
await Task.Run(() => RecognizeAsync());
Though it's something of an anti-pattern, it can be useful for making the whole call chain async.
Edits (to answer OP questions)
Thanks for your detailed answer, but I didn't understand some of it. At the first point you are saying that "it's imperative to introduce thread safety", but how?
Thread safety is introduced using synchronization constructs like lock, Mutex, Semaphore, Monitor, and Interlocked; all of them serve the purpose of protecting an object from corruption and race conditions. I don't see any such steps.
Are the steps I have taken, as described in my post, enough?
I don't see any thread safety steps in your post; please highlight which steps you are talking about.
At the second point I'm asking how to use an object in the same thread all the time (whenever I use it). Async-Await has nothing to do with this, AFAIK.
Async-await is the only concurrency mechanism that doesn't involve any extra thread besides the calling thread, so it can ensure everything always runs on the same thread, since it uses IO completion ports (hardware-based concurrency). If you use the Task Parallel Library instead, there's no way for you to ensure that the same given thread is always used, as that's a very high-level abstraction.
Check one of my recent detailed answers on threading here; it may help by covering some more detailed aspects.
It is not thread-safe, as the technical risk exists, but your policy is designed to cope with the problem and work around the risk. So, if things stand as you described, then you are not having a thread-safe environment, however, you are safe. For now.
As an API designer, does it make sense to perform lock checks to ensure that the object state is not invalidated by the caller?
Consider a Grid3D data structure which must resize itself each time its Width, Height, or Depth is changed. If a caller is modifying the Grid3D from multiple threads, the grid could be resizing while a new resize attempt is made, and this would invalidate the object state or throw an exception.
This can be overcome by using locks to provide mutual exclusion to the resize function, and it could happen either within the API (that is, the class definition of Grid3D) or it can happen in the application where Grid3D is used.
If it is correct to lock inside the Grid3D class definition, it stands to reason that thread synchronization should be considered for all API development. Yet many API-level classes (certainly many examples online, including on StackOverflow) do not consider synchronization.
So, where is the correct place to perform locking? Under what conditions should the API-level be concerned with locking?
It's your design decision. You can create a thread-safe class, or you can delegate the task of using it in a thread-safe manner to whoever uses it.
Usually libraries do not provide thread-safe instances of classes when they are not intended to be used in a multithreaded environment. If the main usage of your class is in a multithreaded environment, you should handle the thread safety.
You can see this quote a million times in MSDN:
Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
This means that instances of the class are not meant to be used in a multithreaded environment by default, but you can also see classes that support built-in synchronization by default and are ready to be used in a multithreaded environment.
As I see it, Grid3D is meant to be a UI component, and usually UI components are built to be used only by the thread that created them.
It depends on whether the API is supposed to be thread-safe or not. If it is, then API layer should be concerned with locking, otherwise it should not.
As an example have a look at System.Threading.Thread and System.Array classes. The Thread is thread-safe and it is documented so (see Thread Safety section), while Array is not thread-safe and it is noted in documentation that its “…Any instance members are not guaranteed to be thread safe”.
A friend asked me which would be better, ThreadStatic or ThreadLocal. Checking the docs, I told him ThreadLocal looks more convenient and has been available since .NET 4.0, but I don't understand why you would use either of them over creating an object instance for a thread. Their purpose is to store "thread-local data", so you can call methods less clumsily and avoid locking in some instances. When I wanted such thread-local data I always created something like:
class ThreadHandler
{
    SomeClass A;
    public ThreadHandler(SomeClass A)
    {
        this.A = A;
    }
    public void Worker()
    {
    }
}
If I just want a fire-and-forget thread, it would be new Thread(new ThreadHandler(new SomeClass()).Worker).Start(). If I want to track threads, the thread can be added to a collection; if I want to track data, the ThreadHandler can be added to a collection; if I want to handle both, I can make a Thread property on ThreadHandler and put the ThreadHandler in a collection. If I want the thread pool, it's QueueUserWorkItem instead of new Thread(). It's short and simple if the scope is simple, but easily extensible if the scope gets wider.
When I try to google why to use ThreadLocal over an object instance, all my searches end up with explanations of how ThreadLocal is much better than ThreadStatic, which in my eyes looks like people explaining that they had this clumsy screwdriver, but now the toolbox has a heavy monkey wrench which is much more convenient for hammering nails. Whilst the toolbox had a hammer to begin with.
I understand I'm missing something, because if ThreadStatic/ThreadLocal had no advantage they just wouldn't exist. Can somebody please point out at least one significant advantage of ThreadLocal over creating an object instance for a thread?
UPD: Looks like a duplicate of this; I think the "java" keyword was throwing me off when I was googling. So there's at least one advantage: ThreadLocal is more natural to use with the Task Parallel Library.
I don't get advantage of ThreadLocal over creating an instance of object for a thread.
You're right, when you have control over the threads being created, and how they're used, it's very handy to just wrap the whole thread in a helper class, and have it get 'thread local' data from there.
The problem is that, especially in institutionally large projects, you don't always have this kind of control. You may start up a thread, and call some code, and that one thread may wind its way through calls in millions of lines of code scattered between 10 projects owned by 3 internal teams and one external contractor team. Good luck plumbing some of those parameters everywhere.
Thread-local storage lets those guys interact without requiring that they have explicit references to the object that represents that thread's context.
A related problem I had was associating data to some thread and every child thread created by that thread (since my large projects create their own threads, and so thread-local doesn't work anymore), see this question I had: Is there any programmable data that is automatically inherited by children Thread objects?
At the end of the day, it's often lazy programming, but sometimes you find situations where you just need it.
ThreadLocal<T> works like a Dictionary<Thread, T>. The problem with a dictionary is that instances belonging to killed or dead threads stay around forever - they don't get garbage collected, because they are referenced by the dictionary. Using ThreadLocal will ensure that, when a thread dies, the instances referenced by that thread are eligible for GC.
Plus, it's a much nicer interface than having to manually deal with a Dictionary<Thread, T>. It Just Works.
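For what it's worth, here is a small sketch of that interface in use. The value factory runs lazily, once per thread that touches Value, so code deep in a call chain can reach "its" instance without any parameter plumbing. ThreadLocalDemo and its members are names invented for the example.

```csharp
using System.Text;
using System.Threading;

public static class ThreadLocalDemo
{
    // Each thread that reads Buffer.Value gets its own lazily created StringBuilder.
    private static readonly ThreadLocal<StringBuilder> Buffer =
        new ThreadLocal<StringBuilder>(() => new StringBuilder());

    // Reads Buffer.Value on two threads and reports whether they saw different instances.
    public static bool ThreadsGetSeparateInstances()
    {
        object first = null, second = null;
        var t1 = new Thread(() => first = Buffer.Value);
        var t2 = new Thread(() => second = Buffer.Value);
        t1.Start(); t2.Start();
        t1.Join(); t2.Join(); // Join also makes the writes above visible here
        return first != null && second != null && !ReferenceEquals(first, second);
    }
}
```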
ThreadLocal has two benefits over the ThreadStatic attribute approach: you can avoid defining a class field, and it has a built-in lazy-loading feature. Your manual collection approach requires a locking mechanism; if you look at ThreadLocal's source code, you'll see it's optimized for this specific case.
ThreadLocal can be beneficial when objects of type T would otherwise be created and garbage collected frequently. And it's thread safe.
I read about lock, but understood nothing at all.
My question is: why do we lock on an otherwise unused object, how does this make something thread-safe, and how does it help in multithreading? Isn't there another way to write thread-safe code?
public class test {
private object Lock { get; set; }
...
lock (this.Lock) { ... }
...
}
Sorry if my question is very stupid, but I don't understand it, although I've used it many times.
Accessing a piece of data from one thread while another thread is modifying it is called a "data race condition" (or just a "data race") and can lead to corruption of data. (*)
Locks are simply a mechanism for avoiding data races. If two (or more) concurrent threads lock the same lock object, then they are no longer concurrent and can no longer cause data races, for the duration of the lock. Essentially, we are serializing the access to shared data.
The trick is to keep your locks as "wide" as you must to avoid data races, yet as "narrow" as you can to gain performance through concurrent execution. This is a fine balance that can easily go out of whack in either direction, which is why multi-threaded programming is hard.
Some guidelines:
As long as all threads are just reading the data and none will ever modify it, a lock is unnecessary.
Conversely, if at least one thread might at some point modify the data, then all concurrent code paths accessing that same data must be properly serialized through locks, even those that only read the data.
Using a lock in one code path but not the other will leave the data wide open to race conditions.
Also, using one lock object in one code path, but a different lock object in another (concurrent) code path does not serialize these code paths and leaves you wide open to data races.
On the other hand, if two concurrent code paths access different data, they can use different lock objects. But, whenever there is more than one lock object, watch out for deadlocks. A deadlock is often also a "code race condition" (and a heisenbug, see below).
The lock object does not need to be (and usually isn't) the same thing as the data you are trying to protect. Unfortunately, there is no language facility that lets you "declare" which data is protected by which lock object, so you'll have to very carefully document your "locking convention" both for other people that might maintain your code, and for yourself (since even after a short time you will forget some of the nooks and crannies of your locking convention).
It's usually a good idea to protect the lock object from the outside world as much as you can. After all, you are using it for the very sensitive task of locking and you don't want it locked by external actors in unforeseen ways. That's why using this or a public field as a lock object is usually a bad idea.
The lock keyword is simply a more convenient syntax for Monitor.Enter and Monitor.Exit.
The lock object can be any object in .NET, but value types will be boxed anew in each call to Monitor.Enter, which means threads will not share the same lock object, leaving the data unprotected. Therefore, only use reference types as lock objects.
For inter-process communication you can use a global mutex, which can be created by passing a non-empty name to Mutex Constructor. Global mutexes provide essentially the same functionality as regular "local" locking, except they can be shared between separate processes.
There are synchronization mechanisms other than locks, such as semaphores, condition variables, message queues or atomic operations. Be careful when mixing different synchronization mechanisms.
Locks also behave as memory barriers, which is increasingly important on modern multi-core, multi-cache CPUs. This is part of the reason why you need locks on reading the data and not just writing.
(*) It is called "race" because concurrent threads are "racing" towards performing an operation on the shared data and whoever wins that race determines the outcome of the operation. So the outcome depends on timing of the execution, which is essentially random on modern preemptive multitasking OSes. Worse yet, timing is easily modified by a simple act of observing the program execution through tools such as debugger, which makes them "heisenbugs" (i.e. the phenomenon being observed is changed by the mere act of observation).
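To make a few of the guidelines above concrete, here is a minimal sketch, not the only way to do it: a private readonly lock object that is separate from the data, a comment documenting the locking convention, and readers taking the same lock as writers. Inventory, _sync and _items are invented names.

```csharp
using System.Collections.Generic;

public class Inventory
{
    // Private and readonly: nothing outside this class can lock on it.
    private readonly object _sync = new object();

    // Locking convention: _items is only touched while holding _sync.
    private readonly List<string> _items = new List<string>();

    public void Add(string item)
    {
        lock (_sync) { _items.Add(item); }
    }

    // Readers take the same lock as writers; a read-only path without
    // the lock would reopen the data race.
    public int Count
    {
        get { lock (_sync) { return _items.Count; } }
    }
}
```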
A lock object is like a door into a single room that only one guest at a time can enter.
The room can be your data, the guest can be your function.
define data (room)
add door (lock object)
invite guests (functions)
using the lock statement, close/open the door to allow only one guest at a time into the room.
Why do we need this? If you simultaneously write data to a file (just an example; there can be thousands of others), you will need to synchronize your functions' access (close/open the door for guests) to the file, so that each function appends to the end of the file (assuming that is the requirement of this example).
This is naturally not the only way to synchronize threads; there are more out there:
Monitors
Wait handles
...
Check out the link for complete information and description of each of them
Thread Synchronization
Yes, there is indeed another way:
using System.Runtime.CompilerServices;

class Test
{
    private object Lock { get; set; }

    [MethodImpl(MethodImplOptions.Synchronized)]
    public void Foo()
    {
        // Now this instance is locked
    }
}
While it looks more "natural", it's not used often, because of the fact that the object is locking on itself this way, so other code could not risk locking on this object -- it could cause a deadlock.
Because of this, you usually create a (lazy-initialized) private field referring to an object, and use that object as a lock instead. This will guarantee that no one else can lock against the same object as you.
A little more detail on what's happening beneath the hood:
When you "lock on an object", you're not locking on the object itself. Rather, you're using the object as a guaranteed-to-be-unique-address-in-memory throughout your program. When you "lock", the runtime takes the object's address, uses it to look up the actual lock inside another table (which is hidden from you), and uses that object as the "lock" (also known as a "critical section").
So really, for you, an object is just a proxy/symbol -- it isn't doing anything by itself; it's just acting as a unique indicator that will never clash with another valid object in the same program.
When different threads access the same variable/resource at the same time, they may overwrite each other's changes and you can get unexpected results. A lock makes sure only one thread can access the variable at a time; the remaining threads queue for access to the variable/resource until the lock is released.
Suppose we have a balance variable for an account.
Two different threads read its value, which is 100.
Suppose the first thread adds 50 to it (100 + 50) and saves it, so the balance is 150.
The second thread has already read 100 in the meantime. Suppose it subtracts 50 (100 - 50) and saves 50. The point to note is that the first thread has already made the balance 150, so the second thread should have computed 150 - 50; this could cause serious problems.
So a lock makes sure that when one thread wants to change some resource's state, it locks it and releases the lock only after committing the change.
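Here is the balance scenario as a small sketch. The Account class and its member names are invented for illustration; the point is that the read, the arithmetic and the write all happen inside one lock, so no thread can act on a stale 100.

```csharp
public class Account
{
    private readonly object _sync = new object();
    private decimal _balance;

    public Account(decimal opening) { _balance = opening; }

    // Read-modify-write is done as one unit while holding the lock.
    public void Deposit(decimal amount)
    {
        lock (_sync) { _balance += amount; }
    }

    public void Withdraw(decimal amount)
    {
        lock (_sync) { _balance -= amount; }
    }

    public decimal Balance
    {
        get { lock (_sync) { return _balance; } }
    }
}
```

With an opening balance of 100, a Deposit(50) and a Withdraw(50) running on two threads leave the balance at 100 regardless of which thread gets the lock first.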
The lock statement introduces the concept of mutual exclusion. Only one thread can acquire a lock on a given object at any one time. This prevents threads from accessing shared data structures concurrently and thereby corrupting them.
If other threads already hold a lock, the lock statement will block until it is able to acquire an exclusive lock on its argument before allowing its block to execute.
Note that the only thing lock does is control entry to the block of code. Access to members of the class is completely unrelated to the lock. It is up to the class itself to ensure that accesses that must be synchronized are coordinated by the use of lock or other synchronization primitives. Also note that access to some or all members may not have to be synchronized. For instance, if you want to maintain a counter, you could use the Interlocked class without locking.
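The counter case mentioned above could look something like this (HitCounter is a made-up name). Interlocked.Increment performs the read-add-write as one atomic operation, so no lock statement is needed for this particular piece of state.

```csharp
using System.Threading;

public class HitCounter
{
    private int _count;

    // Atomic increment; safe to call from any number of threads.
    public void Increment()
    {
        Interlocked.Increment(ref _count);
    }

    // Volatile.Read returns an up-to-date value rather than a stale cached one.
    public int Value
    {
        get { return Volatile.Read(ref _count); }
    }
}
```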
An alternative to locking is lock-free data structures, which behave correctly in the presence of multiple threads. Operations on lock-free data structures must be designed very carefully, usually with the assistance of lock-free primitives such as compare-and-swap (CAS).
The general theme of such techniques is to try to perform operations on data structures atomically and detect when operations fail due to concurrent actions by other threads, followed by retries. This works well on a lightly loaded system where failures are unlikely, but can produce runaway behaviour as the failure rate climbs and retries become a dominant load. This problem can be ameliorated by backing off the retry rate, effectively throttling the load.
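A tiny illustration of that fail-and-retry theme, using Interlocked.CompareExchange as the CAS primitive. MaxTracker is a hypothetical class that keeps the largest value seen so far; it is a sketch of the pattern, without any back-off.

```csharp
using System.Threading;

public class MaxTracker
{
    private int _max = int.MinValue;

    // Raises the stored maximum to 'candidate' if it is larger, without a lock.
    public void Observe(int candidate)
    {
        while (true)
        {
            int seen = Volatile.Read(ref _max);
            if (candidate <= seen) return; // nothing to do

            // Write only if nobody changed _max since we read it.
            if (Interlocked.CompareExchange(ref _max, candidate, seen) == seen) return;

            // Another thread got in first: loop and retry with the new value.
        }
    }

    public int Value
    {
        get { return Volatile.Read(ref _max); }
    }
}
```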
A more sophisticated alternative is software transactional memory. Unlike CAS, STM generalizes the concept of fail-and-retry to arbitrarily complex memory operations. In simple terms, you start a transaction, perform all your operations, and finally commit. The system detects if the operations cannot succeed due to conflicting operations performed by other threads that beat the current thread to the punch. In such cases, STM can either fail outright, requiring the application to take corrective action, or, in more sophisticated implementations, it can automatically go back to the start of the transaction and try again.
Your confusion is pretty typical for those just getting familiar with the lock keyword in C#. You are right, the object used in the lock statement is really nothing more than a token that defines a critical section. That object, in no way, has any protection from multithreaded access itself.
The way this works is that the CLR reserves a 4 byte (32-bit systems) section in the object header (type handle) called the sync block. The sync block is nothing more than an index into an array that stores the actual critical section information. When you use the lock keyword the CLR will modify this sync block value accordingly.
There are advantages and disadvantages to this scheme. The advantage is that it made for a fairly elegant solution to defining critical sections. One obvious disadvantage is that each object instance contains the sync block and most instances never use it so it would seem to be a waste of space in most cases. Another disadvantage is that boxed value types can be used which is almost always wrong and certainly leads to confusion.
I remember way back when .NET was first released that there was a lot of chatter over whether the lock keyword was good or bad for the language. The general consensus (at least as I remember it) was that it was bad because the using keyword could have been easily used instead. In fact, a solution that used the using keyword actually would have made more sense because it could have been done without the need for the sync block. The C# design team even went on record to say that had they been given a second chance the lock keyword never would have made it into the language.[1]
[1] The only reference I could find for this is on Jon Skeet's website here.
While reading Joe Albahari's excellent book "Threading in C#" I came across the following ambiguous sentence:
A thread-safe type does not necessarily make the program using it thread-safe, and often the work involved in the latter makes the former redundant.
(You can find the sentence on this page; just search for "indeterminacy" to quickly jump to the appropriate section.)
I am looking to use a ConcurrentDictionary to implement certain thread-safe data structures. Is the paragraph telling me that ConcurrentDictionary does not guarantee thread-safe writes to my data structure? Can someone please provide a counter-example that shows a thread-safe type actually failing to provide thread safety?
Thanks in advance for your help.
At the simplest, a thread safe list or dictionary is a good example; having each individual operation thread safe isn't always enough - for example, "check if the list is empty; if it is, add an item" - even if all thread-safe, you can't do:
if(list.Count == 0) list.Add(foo);
as it could change between the two. You need to synchronize the test and the change.
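One way to synchronize the test and the change, sketched with invented names (OnceOnlyList, AddIfEmpty): put both inside the same lock so no other thread can run between them.

```csharp
using System.Collections.Generic;

public class OnceOnlyList<T>
{
    private readonly object _sync = new object();
    private readonly List<T> _list = new List<T>();

    // The check and the add happen under one lock, as a single step.
    public bool AddIfEmpty(T item)
    {
        lock (_sync)
        {
            if (_list.Count != 0) return false;
            _list.Add(item);
            return true;
        }
    }

    public int Count
    {
        get { lock (_sync) { return _list.Count; } }
    }
}
```

However many threads call AddIfEmpty at once, only one of them finds the list empty.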
My understanding of the warning is that just because you are using thread safe variables does not mean that your program is thread safe.
As an example, consider a class that has two variables that can be modified from two threads. Just because these variables are individually thread safe doesn't guarantee atomicity of modifications to the class. If there are two threads modifying these variables, it is possible that one variable will end up with the value set by one thread, while the other gets set by another thread. This can easily break the internal consistency of the class.
Was doing some searching a while back to fix a problem I had with some threading and came across this page:
http://www.albahari.com/threading/part2.aspx#_Thread_Safety
Particularly the section on "Locking around thread-safe objects"
From the page:
Sometimes you also need to lock around accessing thread-safe objects. To illustrate, imagine that the Framework’s List class was, indeed, thread-safe, and we want to add an item to a list:
if (!_list.Contains (newItem)) _list.Add (newItem);
Whether or not the list was thread-safe, this statement is certainly not!
I think what he means is that just using ConcurrentDictionary instead of Dictionary everywhere isn't going to make the program thread-safe. So, if you have a non-thread-safe program, a search and replace isn't going to help; likewise, adding SynchronizedAttribute everywhere isn't going to work like a magic fairy dust. This is particularly true regarding collections, where iteration is always a problem[1].
On the other hand, if you restructure the non-thread-safe program into a more thread-safe design, then you often don't need thread-safe data structures. One popular approach is to redefine the program in terms of "actors" that send "messages" to each other - aside from a single producer/consumer-style message queue, each actor can stand alone and does not need to use thread-safe data structures internally.
[1] The first release of BCL collections included some "thread-safe" collections that just plain were not thread-safe during iterations. The Concurrent collections are thread-safe during iteration, but iterate concurrently with other threads' modifications. Other collection libraries allow "snapshots" which can then be iterated, ignoring modifications from other threads.
It's a bit of a vague statement, but consider for example, a class has two members, each of which is thread-safe, but that must both be updated in an atomic manner.
In dealing with that situation, you're likely to make that entire operation atomic, and thus thread-safe, rendering the thread-safe access to the individual members irrelevant.
It doesn't mean that your ConcurrentDictionary is going to behave in an unsafe way.
My concise explanation is this. There are many forms of thread safety and code that satisfies one form does not automatically satisfy all the others.
Roy,
I guess you're "over-reading" a too-concise sentence... I interpret that sentence as meaning two things:
"Just using threadsafe data-structures doesn't mean your program handles multithreading properly... any more than the presence of threadsafe data-structures inherently makes your program multithreaded"; and he then goes on to say
"Unless you're prepared to put in 'the hard yards' involved (it often requires a very precise understanding of quite complex scenarios) to make your WHOLE program handle threading properly, using a threadsafe data-structure is basically a waste of clock-ticks."
Ergo: Multi-threading is pretty hard, using appropriate out-of-the-box data structures is an important part of any solution, but it's certainly NOT the whole solution... and unless you're prepared to think it through (i.e. do your synchronization homework) you're just kidding yourself that a data structure will somehow magically "fix" your program.
I know that sounds "a bit harsh" but my perception is that a lot of noobs are really disappointed when they discover that programming (still, in this enlightened age of animated icons and GUI painters) requires Deep Thought. Who'd've thunk it?!?!
Cheers. Keith.
Is the paragraph telling me that ConcurrentDictionary does not guarantee thread-safe writes to my data structure?
No, that is not what Joe Albahari means. ConcurrentDictionary will always maintain a consistent state through simultaneous writes from multiple threads. Another thread will never see the data structure in an inconsistent state.
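As a hedged illustration of leaning on that guarantee: when an update depends on the current value, a single call such as AddOrUpdate keeps the read and the write together, whereas a separate TryGetValue followed by an indexer write would leave a gap between two individually safe operations. WordCounter and its members are invented names.

```csharp
using System.Collections.Concurrent;

public class WordCounter
{
    private readonly ConcurrentDictionary<string, int> _counts =
        new ConcurrentDictionary<string, int>();

    // Inserts 1 for a new word, or replaces the stored count with count + 1.
    // The dictionary retries internally if another thread changed the entry.
    public void Record(string word)
    {
        _counts.AddOrUpdate(word, 1, (key, current) => current + 1);
    }

    public int CountOf(string word)
    {
        int n;
        return _counts.TryGetValue(word, out n) ? n : 0;
    }
}
```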
Can someone please provide a counter-example that shows a thread-safe type actually failing to provide thread safety?
However, a series of reads and writes from a thread-safe type may still fail in a multithreaded environment.
void ExecutedByMultipleThreads(ConcurrentQueue<object> queue)
{
    object value;
    if (!queue.IsEmpty)
    {
        queue.TryDequeue(out value);
        Console.WriteLine(value.GetHashCode());
    }
}
So clearly ConcurrentQueue is a thread-safe type, but this program can still fail with a NullReferenceException if another thread dequeues the last item between the IsEmpty and TryDequeue calls. The data structure itself still provides its thread-safety guarantee by remaining in a consistent state, but the program is not thread-safe, because the assumptions it makes about thread safety in general are not correct. In this case it's the program that is incorrect, not the data structure.
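One possible correction, sketched with hypothetical names (QueueConsumer, TryProcess): drop the separate IsEmpty check and act only on the result of TryDequeue, which tests and removes in a single atomic step.

```csharp
using System.Collections.Concurrent;

public static class QueueConsumer
{
    // Returns true (and the item's hash code) only if this thread actually got an item.
    public static bool TryProcess(ConcurrentQueue<object> queue, out int hash)
    {
        object value;
        if (queue.TryDequeue(out value))
        {
            hash = value.GetHashCode();
            return true;
        }

        // The queue was empty, or another thread took the last item first.
        hash = 0;
        return false;
    }
}
```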