What "thread safe" really means, in practical terms (C#)

Please bear with my newbie questions.
I am trying to convert PDF to PNG using Ghostscript, with ASP.NET and C#. However, I also read that Ghostscript is not thread safe. So my questions are:
1. What exactly does "Ghostscript is not thread safe" mean in practical terms? What impact does it have if I use it in a live ASP.NET (aspx) web application with many concurrent users accessing it at the same time?
2. I also read on another site that the major feature of Ghostscript ver. 8.63 is multithreaded rendering. Does this mean the thread safety issue is now resolved? Is Ghostscript thread safe now?
3. I am also evaluating PDF2Image from PDFTron, which is supposed to be thread safe. But the per-CPU license doesn't come cheap. Is it worth paying the extra money for "thread safe" vs "not safe"?

A precise technical definition that everyone agrees on is difficult to come up with.
Informally, "thread safe" simply means "is reasonably well-behaved when called from multiple threads". The object will not crash or produce crazy results when called from multiple threads.
The question you actually need to get answered if you intend to do multi-threaded programming involving a particular object is "what is the threading model expected by the object?"
There are a bunch of different threading models. For example, the "free threaded" model is "do whatever you want from any thread; the object will deal with it." That's the easiest model for you to deal with, and the hardest for the object provider to provide.
On the other end of the spectrum is the "single threaded" model -- all instances of all objects must be accessed from a single thread, period.
And then there's a bunch of stuff in the middle. The "apartment threaded" model is "you can create two instances on two different threads, but whatever thread you use to create an instance is the thread you must always use to call methods on that instance".
The "rental threaded" model is "you can call one instance on two different threads, but you are responsible for ensuring that no two threads are ever doing so at the same time".
And so on. Find out what threading model your object expects before you attempt to write threading code against it.
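To make the "apartment threaded" model concrete, here is a minimal sketch of an object that enforces it on itself. ThreadAffineCounter is a made-up type for illustration, not something from Ghostscript or the framework; it simply remembers which thread created it and refuses calls from any other thread.

```csharp
using System;
using System.Threading;

// Hypothetical example type: enforces the "apartment threaded" rule by
// remembering the thread that created the instance.
public sealed class ThreadAffineCounter
{
    private readonly int ownerThreadId = Environment.CurrentManagedThreadId;
    private int count;

    public int Increment()
    {
        VerifyAccess();
        return ++count; // safe without a lock: only the owner thread gets here
    }

    private void VerifyAccess()
    {
        if (Environment.CurrentManagedThreadId != ownerThreadId)
            throw new InvalidOperationException(
                "This instance may only be used on the thread that created it.");
    }
}
```

A "rental threaded" object would skip the check and leave it to the caller to make sure no two threads are inside it at the same time.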

Given that a collection, for instance, is not thread safe:
var myDic = new Dictionary<string, string>();
In a multithreaded environment, this can fail:
string s = null;
if (!myDic.TryGetValue("keyName", out s)) {
s = new string('#', 10);
myDic.Add("keyName", s);
}
While one thread is adding the KeyValuePair to the dictionary myDic, another one may call TryGetValue(). A Dictionary can't safely be read and written at the same time, so an exception may occur or the dictionary's internal state may be corrupted.
However, on the other hand, if you try this:
// Other threads will wait here until the variable myDic gets unlocked from the preceding thread that has locked it.
lock (myDic) {
string s = null;
if (!myDic.TryGetValue("keyName", out s)) {
s = new string('#', 10);
myDic.Add("keyName", s);
}
} // The first thread that locked the myDic variable will now release the lock so that other threads will be able to work with the variable.
Now the second thread trying to get the same "keyName" key will not have to add it to the dictionary, as the first thread already added it.
So in short, threadsafe means that an object supports being used by multiple threads at the same time, or will lock the threads appropriately for you, without you having to worry about threadsafety.
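For the "will lock the threads appropriately for you" case, a sketch using ConcurrentDictionary from System.Collections.Concurrent does the same get-or-add without an explicit lock. The KeyCache class name is only for illustration.

```csharp
using System.Collections.Concurrent;

public static class KeyCache
{
    private static readonly ConcurrentDictionary<string, string> Values =
        new ConcurrentDictionary<string, string>();

    // Returns the stored value for the key, creating it on first use.
    // Under contention the factory delegate may run more than once,
    // but only one result is ever stored and returned to all callers.
    public static string Get(string key)
    {
        return Values.GetOrAdd(key, _ => new string('#', 10));
    }
}
```

Calling KeyCache.Get("keyName") from any number of threads is safe; the collection does its own synchronization internally.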
2. I don't think Ghostscript is now thread safe. It uses multiple threads internally to perform its tasks, which gives better performance, that's all.
3. Depending on your budget and your requirements, it may be worth it. But if you build a wrapper around the library, you could lock() only where it is needed. And if you do not use multithreading yourself, it is definitely not worth paying for thread safety: a thread-safe library only means that if YOUR application uses multithreading, you will not suffer the consequences of a library not being thread safe.

I am a Ghostscript developer, and won't repeat the general theory about thread safety. We have been working on getting GS to be thread safe so that multiple 'instances' can be created using gsapi_new_instance from within a single process, but we have not yet completed this to our satisfaction (which includes our QA testing of this). The graphics library is, however, thread safe and the multi-threaded rendering relies on this to allow us to spawn multiple threads to render bands from a display list in parallel. The multi-threaded rendering has been subjected to a lot of QA testing and is used by many commercial licensees to improve performance on multi-core CPUs.
You can bet we will announce when we finally support multiple instances of GS. Most people that want to use current GS from applications that need multiple instances spawn separate processes for each instance so that GS doesn't need to be thread safe. The GS can run a job as determined by the argument list options or I/O can be piped to/from the process to provide data and collect output.
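As a rough sketch of the process-per-job approach from C#: the switches below (-dSAFER, -dBATCH, -dNOPAUSE, -sDEVICE=png16m, -r, -sOutputFile) are standard Ghostscript options, but the executable path, the output pattern and the GhostscriptProcess class itself are assumptions for illustration. Check them against your own installation.

```csharp
using System;
using System.Diagnostics;

public static class GhostscriptProcess
{
    // Builds the command line for a PDF-to-PNG conversion.
    public static string BuildArguments(string pdfPath, string outputPattern, int dpi)
    {
        return string.Format(
            "-dSAFER -dBATCH -dNOPAUSE -sDEVICE=png16m -r{0} \"-sOutputFile={1}\" \"{2}\"",
            dpi, outputPattern, pdfPath);
    }

    // Runs one conversion in its own process and returns the exit code.
    // exePath is an assumption, e.g. "gswin64c.exe" on Windows or "gs" on Linux.
    public static int Convert(string exePath, string pdfPath, string outputPattern, int dpi)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = BuildArguments(pdfPath, outputPattern, dpi),
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

Each request gets its own Ghostscript process, so nothing inside GS is shared between threads. In a busy web application you would probably also want to cap how many of these processes run at once.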

1) It means if you share the same Ghostscript objects or fields among multiple threads, it can crash or produce corrupt output. For example:
private GhostScript someGSObject = new GhostScript();
...
// Uh oh, 2 threads using shared memory. This can crash!
thread1.Use(someGSObject);
thread2.Use(someGSObject);
2) I don't think so - multithreaded rendering suggests GS is internally using multiple threads to render. It doesn't address the problem of GS being unsafe for use from multiple threads.
3) Is there a question in there?
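To make GhostScript thread safe, make sure only 1 thread at a time is accessing it. You can do this by having every thread take the same lock before it touches the shared object:
// One lock object, shared by every thread that uses someGSObject
private readonly object someObject = new object();
...
lock(someObject)
{
thread1.Use(someGSObject);
}
// Blocks here until the first lock block has been left
lock(someObject)
{
thread2.Use(someGSObject);
}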

If you are using Ghostscript from a shell object (i.e. running a command line to process the file), you will not be caught by threading problems, because every running instance will be in a different process on the server. Where you need to be careful is when you have a DLL that you are using from C# to process the PDF; that code would need to be synchronized to keep two threads from executing the same code at the same time.
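For the in-process DLL case, one way to do that synchronization is to funnel every call through a single static lock. This is only a sketch: SerializedRenderer is a made-up name, and the renderPage delegate stands in for whatever non-thread-safe library call you actually make.

```csharp
using System;

public static class SerializedRenderer
{
    // One process-wide lock. The library is assumed to keep global state,
    // so even calls on separate objects must not overlap.
    private static readonly object Gate = new object();

    // renderPage stands in for the real, non-thread-safe library call.
    public static TResult Run<TResult>(Func<TResult> renderPage)
    {
        lock (Gate)
        {
            return renderPage();
        }
    }
}
```

The price is throughput: in an ASP.NET application, concurrent requests will queue behind the lock and conversions run one at a time.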

Thread safe basically means that a piece of code will function correctly even when accessed by multiple threads at the same time. Multiple problems can occur if you use non-thread-safe code in a threaded application. A common problem is deadlocking. However, there are more nefarious problems, such as race conditions, which can be worse because threading issues are notoriously difficult to debug.
No. Multithreaded rendering just means that GS will be able to render faster because it is using threads to render (in theory, anyway - not always true in practice).
That really depends on what you want to use your renderer for. If you are going to be accessing your application with multiple threads, then, yes, you'll need to worry about it being thread safe. Otherwise, it's not a big deal.

In general it is an ambiguous term.
Thread safety could be at the conceptual level, where you have correct synchronization of your shared data. This is usually what is meant by library writers.
Sometimes it means that concurrency is defined at the language level, i.e. the memory model of the language supports concurrency. This is tricky, because if the language gives no guarantees for the essential primitives, a library writer cannot produce correct concurrent libraries on top of it. This concerns compiler writers more than library users. C# is thread-safe in that sense.
I know I didn't answer your question directly, but hope that helps.

Related

C# Threading - Using a class in a thread-safe way vs. implementing it as thread-safe

Suppose I want to use a non thread-safe class from the .Net Framework (the documentation states that it is not thread-safe). Sometimes I change the value of Property X from one thread, and sometimes from another thread, but I never access it from two threads at the same time. And sometimes I call Method Y from one thread, and sometimes from another thread, but never at the same time.
Does this mean that I use the class in a thread-safe way, and that the fact that the documentation states it's not thread-safe is no longer relevant to my situation?
If the answer is No: Can I do everything related to a specific object in the same thread - i.e, creating it and calling its members always in the same thread (but not the GUI thread)? If so, how do I do that? (If relevant, it's a WPF app).
No, it is not thread safe. As a general rule, you should never write multi-threaded code without some kind of synchronization. In your first example, even if you somehow manage to ensure that modifying and reading never happen at the same time, there is still the problem of cached values and instruction reordering.
Just as an example: the CPU caches a value in a register, you update it on one thread and read it from another. If the second thread has it cached, it doesn't go to RAM to fetch it and doesn't see the updated value.
Take a look at this great post for more info on the problems with writing lock-free multi-threaded code (link). It has a great explanation of how the CPU, the compiler and the CLI bytecode compiler can reorder instructions.
"Suppose I want to use a non thread-safe class from the .Net Framework (the documentation states that it is not thread-safe)."
"Thread-safe" has a number of different meanings. Most objects fall into one of three categories:
Thread-affine. These objects can only be accessed from a single thread, never from another thread. Most UI components fall into this category.
Thread-safe. These objects can be accessed from any thread at any time. Most synchronization objects (including concurrent collections) fall into this category.
One-at-a-time. These objects can be accessed from one thread at a time. This is the "default" category, with most .NET types falling into this category.
"Sometimes I change the value of Property X from one thread, and sometimes from another thread, but I never access it from two threads at the same time. And sometimes I call Method Y from one thread, and sometimes from another thread, but never at the same time."
As another answerer noted, you have to take into consideration instruction reordering and cached reads. In other words, it's not sufficient to just do these at different times; you'll need to implement proper barriers to ensure it is guaranteed to work correctly.
The easiest way to do this is to protect all access of the object with a lock statement. If all reads, writes, and method calls are all within the same lock, then this would work (assuming the object does have a one-at-a-time kind of threading model and not thread-affine).
"Suppose I want to use a non thread-safe class from the .Net Framework (the documentation states that it is not thread-safe). Sometimes I change the value of Property X from one thread, and sometimes from another thread, but I never access it from two threads at the same time. And sometimes I call Method Y from one thread, and sometimes from another thread, but never at the same time."
All classes are non-thread-safe by default, except a few, such as the concurrent collections, that are designed specifically for thread safety. So for any other class that you access from multiple threads, or in a non-atomic manner, whether for reads or writes, it's imperative to introduce thread safety when changing the state of an object. This only applies to objects whose state can be modified in a multi-threaded environment. Methods as such are just functional implementation; they are not themselves state that can be modified, they only need synchronization to protect the object's state.
"Does this mean that I use the class in a thread-safe way, and that the fact that the documentation states it's not thread-safe is no longer relevant to my situation? If the answer is No: Can I do everything related to a class in the same thread (but not the GUI thread)? If so, how do I do that? (If relevant, it's a WPF app)."
For a UI application, consider using async/await for I/O-bound operations, like file reads and database reads, and the TPL for compute-bound operations. The benefits are:
Async/await doesn't block the UI thread at all and keeps the UI completely responsive. In fact, after an await, UI controls can be updated directly with no cross-thread concern, since only one thread is involved.
TPL concurrency also keeps compute-bound operations off the UI thread, but it takes its threads from the thread pool, and those can't be used to update the UI because of the cross-thread concern.
"And last: there are classes in which one method starts an operation, and another one ends it. For example, using the SpeechRecognitionEngine class you can start a speech recognition session with RecognizeAsync (this method predates the TPL, so it does not return a Task), and then cancel the recognition session with RecognizeAsyncCancel. What if I call RecognizeAsync from one thread and RecognizeAsyncCancel from another one? (It works, but is it "safe"? Will it fail on some conditions which I'm not aware of?)"
As you have mentioned the Async method: this might be an older implementation, based on APM, which uses an AsyncCallback to coordinate, something along the lines of BeginXX / EndXX. If that's the case, then not much is required to coordinate, as they use AsyncCallback to execute a callback delegate. In fact, as mentioned earlier, there's no extra thread involved here, whether it's the old version or the new async/await. Regarding task cancellation, CancellationTokenSource can be used with async/await; a separate cancellation task is not required. Coordination between multiple threads can be done via AutoResetEvent / ManualResetEvent.
If the calls mentioned above are synchronous, you can wrap them in a Task and call them from an async method as follows:
await Task.Run(() => RecognizeAsync());
Though it's a sort of anti-pattern, it can be useful in making the whole call chain async.
Edits (to answer OP questions)
"Thanks for your detailed answer, but I didn't understand some of it. At the first point you are saying that "it's imperative to introduce thread safety", but how?"
Thread safety is introduced using synchronization constructs like lock, mutex, semaphore, monitor and Interlocked. All of them serve the purpose of protecting an object from corruption and race conditions.
"Are the steps I have taken, as described in my post, enough?"
I don't see any thread safety steps in your post; please highlight which steps you are talking about.
"At the second point I'm asking how to use an object in the same thread all the time (whenever I use it). Async-Await has nothing to do with this, AFAIK."
Async/await is the only concurrency mechanism that doesn't involve any extra thread besides the calling thread, since it uses I/O completion ports (hardware-based concurrency), so it can ensure that everything runs on the same thread (in a WPF app, the code after an await resumes on the UI thread by default). If you use the Task Parallel Library instead, there's no way for you to ensure that the same given thread is always used, as that's a very high-level abstraction.
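If the object really has to live on one specific non-GUI thread, another option is a dedicated worker thread that pulls delegates off a queue. The sketch below is an assumption about how one might do it, not a framework facility: SingleThreadWorker is a made-up helper, and you would create the thread-bound object inside a delegate passed to Run so that it is constructed on the worker thread too.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs every submitted delegate on one dedicated thread.
public sealed class SingleThreadWorker : IDisposable
{
    private readonly BlockingCollection<Action> queue = new BlockingCollection<Action>();
    private readonly Thread thread;

    public SingleThreadWorker()
    {
        thread = new Thread(() =>
        {
            // Blocks until work arrives; the loop ends after CompleteAdding.
            foreach (var work in queue.GetConsumingEnumerable())
                work();
        });
        thread.IsBackground = true;
        thread.Start();
    }

    // Queues func for the worker thread and returns a Task for its result.
    public Task<T> Run<T>(Func<T> func)
    {
        var tcs = new TaskCompletionSource<T>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        queue.Add(() =>
        {
            try { tcs.SetResult(func()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        return tcs.Task;
    }

    // Must not be called from the worker thread itself, or Join would never return.
    public void Dispose()
    {
        queue.CompleteAdding();
        thread.Join();
        queue.Dispose();
    }
}
```

From WPF code you could then write something like `var result = await worker.Run(() => engine.DoSomething());`, where engine and DoSomething are placeholders for your own object; the call runs on the worker thread and the code after the await continues on the UI thread.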
Check one of my recent detailed answers on threading here; it may help by providing some more detail.
It is not thread-safe, as the technical risk exists, but your policy is designed to cope with the problem and work around the risk. So, if things stand as you described, then you are not having a thread-safe environment, however, you are safe. For now.

Thread synchronization vs. Task synchronization vs. ConcurrentDictionary (no sync needed): which to choose?

If our program uses threads to access, let's say, a shared collection, then we should ensure thread safety with a Mutex, Monitor, Semaphore, etc.
If we are not using threads directly but Tasks, and multiple tasks try to access a common shared collection, then we should likewise ensure safety by some method.
But if we use a ready-made thread-safe collection like ConcurrentDictionary, then explicit locking is not required, as it is already handled at the framework level.
So basically I want to know which approach should be used when working with a shared resource in a concurrent consumer environment.
They're all great solutions for different problems. If you can tell us precisely what you're trying to do, what resources are shared, what kinds of accesses are required, then we can tell you which is probably right for your solution.
Overall, unless you've got very specific performance requirements, go with the easiest solution. That is, the ConcurrentDictionary. Since the synchronization logic is built-in, you can be almost certain that nobody will mess up. 'Manual' task and thread synchronization can be pretty tricky at times.
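As an illustration of the "easiest solution" route, here is a small sketch where many parallel workers update one shared ConcurrentDictionary with no lock in sight. The WordCounter class and the word-counting scenario are made up for the example.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class WordCounter
{
    // Counts occurrences from many parallel workers without explicit locks.
    public static ConcurrentDictionary<string, int> Count(string[] words)
    {
        var counts = new ConcurrentDictionary<string, int>();
        Parallel.ForEach(words, word =>
        {
            // Inserts 1 for a new key, or replaces the current value with
            // current + 1; the dictionary retries internally on conflicts.
            counts.AddOrUpdate(word, 1, (key, current) => current + 1);
        });
        return counts;
    }
}
```

The same code with a plain Dictionary would need a lock around every access.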

Multi-threading concept and lock in c#

I have read about lock, but did not understand it at all.
My question is: why do we lock an otherwise unused object, and how does this make something thread-safe or help with multi-threading? Isn't there another way to write thread-safe code?
public class test {
private object Lock { get; set; }
...
lock (this.Lock) { ... }
...
}
Sorry if my question is very stupid, but I don't understand it, although I've used it many times.
Accessing a piece of data from one thread while another thread is modifying it is called a "data race condition" (or just "data race") and can lead to corruption of data. (*)
Locks are simply a mechanism for avoiding data races. If two (or more) concurrent threads lock the same lock object, then they are no longer concurrent and can no longer cause data races, for the duration of the lock. Essentially, we are serializing the access to shared data.
The trick is to keep your locks as "wide" as you must to avoid data races, yet as "narrow" as you can to gain performance through concurrent execution. This is a fine balance that can easily go out of whack in either direction, which is why multi-threaded programming is hard.
Some guidelines:
As long as all threads are just reading the data and none will ever modify it, a lock is unnecessary.
Conversely, if at least one thread might at some point modify the data, then all concurrent code paths accessing that same data must be properly serialized through locks, even those that only read the data.
Using a lock in one code path but not the other will leave the data wide open to race conditions.
Also, using one lock object in one code path, but a different lock object in another (concurrent) code path does not serialize these code paths and leaves you wide open to data races.
On the other hand, if two concurrent code paths access different data, they can use different lock objects. But, whenever there is more than one lock object, watch out for deadlocks. A deadlock is often also a "code race condition" (and a heisenbug, see below).
The lock object does not need to be (and usually isn't) the same thing as the data you are trying to protect. Unfortunately, there is no language facility that lets you "declare" which data is protected by which lock object, so you'll have to very carefully document your "locking convention" both for other people that might maintain your code, and for yourself (since even after a short time you will forget some of the nooks and crannies of your locking convention).
It's usually a good idea to protect the lock object from the outside world as much as you can. After all, you are using it for the very sensitive task of locking and you don't want it locked by external actors in unforeseen ways. That's why using this or a public field as a lock object is usually a bad idea.
The lock keyword is simply a more convenient syntax for Monitor.Enter and Monitor.Exit.
The lock object can be any object in .NET, but value objects will be boxed in the call to Monitor.Enter, which means threads will not share the same lock object, leaving the data unprotected. Therefore, only use reference types as lock objects.
For inter-process communication you can use a global mutex, which can be created by passing a non-empty name to Mutex Constructor. Global mutexes provide essentially the same functionality as regular "local" locking, except they can be shared between separate processes.
There are synchronization mechanisms other than locks, such as semaphores, condition variables, message queues or atomic operations. Be careful when mixing different synchronization mechanisms.
Locks also behave as memory barriers, which is increasingly important on modern multi-core, multi-cache CPUs. This is part of the reason why you need locks on reading the data and not just writing.
(*) It is called "race" because concurrent threads are "racing" towards performing an operation on the shared data and whoever wins that race determines the outcome of the operation. So the outcome depends on timing of the execution, which is essentially random on modern preemptive multitasking OSes. Worse yet, timing is easily modified by a simple act of observing the program execution through tools such as debugger, which makes them "heisenbugs" (i.e. the phenomenon being observed is changed by the mere act of observation).
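Putting several of the guidelines above together, a minimal sketch might look like this. ScoreBoard is a made-up class; the points to notice are the private read-only reference-type lock object, the documented locking convention, and the fact that the reader takes the same lock as the writer.

```csharp
using System.Collections.Generic;

public sealed class ScoreBoard
{
    // Private, read-only, reference-type lock object: nothing outside can lock on it.
    private readonly object sync = new object();

    // Locking convention: 'scores' is only touched while holding 'sync'.
    private readonly Dictionary<string, int> scores = new Dictionary<string, int>();

    public void Add(string player, int points)
    {
        lock (sync)
        {
            int current;
            scores.TryGetValue(player, out current); // current stays 0 if missing
            scores[player] = current + points;
        }
    }

    // Readers take the same lock, because writers may be active.
    public int Get(string player)
    {
        lock (sync)
        {
            int current;
            return scores.TryGetValue(player, out current) ? current : 0;
        }
    }
}
```

Both methods lock for only as long as they touch the dictionary, which keeps the lock "narrow" in the sense described above.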
A lock object is like a door into a single room that only one guest at a time can enter.
The room can be your data, the guest can be your function.
define data (room)
add door (lock object)
invite guests (functions)
using the lock instruction, close/open the door to allow only one guest at a time to enter the room.
Why do we need this? If you simultaneously write data to a file (just an example, there can be thousands of others), you will need to synchronize your functions' access to the file (close/open the door for guests), so that each function appends to the end of the file (assuming that is the requirement of this example).
This is naturally not the only way to sync threads, there are more out there:
Monitors
Wait handles
...
Check out the link for complete information and a description of each of them:
Thread Synchronization
Yes, there is indeed another way:
using System.Runtime.CompilerServices;
class Test
{
private object Lock { get; set; }
[MethodImpl(MethodImplOptions.Synchronized)]
public void Foo()
{
// Now this instance is locked
}
}
While it looks more "natural", it's not used often, because of the fact that the object is locking on itself this way, so other code could not risk locking on this object -- it could cause a deadlock.
Because of this, you usually create a (lazy-initialized) private field referring to an object, and use that object as a lock instead. This will guarantee that no one else can lock against the same object as you.
A little more detail on what's happening beneath the hood:
When you "lock on an object", you're not locking on the object itself. Rather, you're using the object as a guaranteed-to-be-unique-address-in-memory throughout your program. When you "lock", the runtime takes the object's address, uses it to look up the actual lock inside another table (which is hidden from you), and uses that object as the "lock" (also known as a "critical section").
So really, for you, an object is just a proxy/symbol -- it isn't doing anything by itself; it's just acting as a unique indicator that will never clash with another valid object in the same program.
When different threads access the same variable/resource at the same time, they may overwrite each other's changes and you can get unexpected results. A lock makes sure only one thread can access the variable at a time; the remaining threads queue for access until the lock is released.
Suppose we have a balance variable of an account.
Two different threads read its value, which is 100.
Suppose the first thread adds 50 to it (100 + 50) and saves it, so the balance is now 150.
The second thread has already read 100 in the meantime. Suppose it subtracts 50 (100 - 50) and saves 50. The point to note is that the first thread had already made the balance 150, so the second thread should have computed 150 - 50 = 100. The first thread's update is lost, which can cause serious problems.
So the lock makes sure that when one thread wants to change the state of some resource, it locks it and releases it only after committing the change.
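The balance scenario can be sketched as follows. Account is a made-up class; the point is that the read, the arithmetic and the write all happen inside one lock block, so no thread can slip in between them.

```csharp
public sealed class Account
{
    private readonly object sync = new object();
    private decimal balance;

    public Account(decimal openingBalance)
    {
        balance = openingBalance;
    }

    // Positive amounts deposit, negative amounts withdraw.
    // The read, the arithmetic and the write happen as one unit under the lock.
    public void Apply(decimal amount)
    {
        lock (sync)
        {
            balance = balance + amount;
        }
    }

    public decimal Balance
    {
        get { lock (sync) { return balance; } }
    }
}
```

Without the lock, one thread adding 50 and another subtracting 50 could both start from 100, and one of the two updates would be lost.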
The lock statement introduces the concept of mutual exclusion. Only one thread can acquire a lock on a given object at any one time. This prevents threads from accessing shared data structures concurrently, thus corrupting them.
If other threads already hold a lock, the lock statement will block until it is able to acquire an exclusive lock on its argument before allowing its block to execute.
Note that the only thing lock does is control entry to the block of code. Access to members of the class is completely unrelated to the lock. It is up to the class itself to ensure that accesses that must be synchronized are coordinated by the use of lock or other synchronization primitives. Also note that access to some or all members may not have to be synchronized. For instance, if you want to maintain a counter, you could use the Interlocked class without locking.
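For the counter case mentioned above, a sketch with the Interlocked class could look like this (HitCounter is a made-up name):

```csharp
using System.Threading;

public sealed class HitCounter
{
    private int hits;

    // Atomic read-modify-write; no lock statement needed.
    // Returns the value after the increment.
    public int Increment()
    {
        return Interlocked.Increment(ref hits);
    }

    // Volatile.Read makes sure we do not return a stale cached value.
    public int Current
    {
        get { return Volatile.Read(ref hits); }
    }
}
```

This only works because the whole update is a single operation on a single variable. As soon as two fields have to change together, you are back to needing a lock.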
An alternative to locking is lock-free data structures, which behave correctly in the presence of multiple threads. Operations on lock-free data structures must be designed very carefully, usually with the assistance of lock-free primitives such as compare-and-swap (CAS).
The general theme of such techniques is to try to perform operations on data structures atomically and detect when operations fail due to concurrent actions by other threads, followed by retries. This works well on a lightly loaded system where failures are unlikely, but can produce runaway behaviour as the failure rate climbs and retries become a dominant load. This problem can be ameliorated by backing off the retry rate, effectively throttling the load.
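The try, detect failure, retry pattern described above can be sketched with Interlocked.CompareExchange, which is .NET's compare-and-swap. AtomicMax is a made-up example that keeps a running maximum without any lock.

```csharp
using System.Threading;

public static class AtomicMax
{
    // Raises 'target' to 'candidate' if candidate is larger, without locks.
    // Returns the value observed or stored by this call.
    public static int Update(ref int target, int candidate)
    {
        while (true)
        {
            int seen = Volatile.Read(ref target);
            if (candidate <= seen)
                return seen; // nothing to do, the stored value is already larger

            // Write only if nobody changed target since we read it.
            if (Interlocked.CompareExchange(ref target, candidate, seen) == seen)
                return candidate;

            // Another thread won the race: loop and retry with the fresh value.
        }
    }
}
```

Under light contention the loop almost always succeeds on the first pass; under heavy contention the retries are exactly the extra load the paragraph above warns about.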
A more sophisticated alternative is software transactional memory. Unlike CAS, STM generalizes the concept of fail-and-retry to arbitrarily complex memory operations. In simple terms, you start a transaction, perform all your operations, and finally commit. The system detects if the operations cannot succeed due to conflicting operations performed by other threads that beat the current thread to the punch. In such cases, STM can either fail outright, requiring the application to take corrective action, or, in more sophisticated implementations, it can automatically go back to the start of the transaction and try again.
Your confusion is pretty typical for those just getting familiar with the lock keyword in C#. You are right, the object used in the lock statement is really nothing more than a token that defines a critical section. That object, in no way, has any protection from multithreaded access itself.
The way this works is that the CLR reserves a 4 byte (32-bit systems) section in the object header (type handle) called the sync block. The sync block is nothing more than an index into an array that stores the actual critical section information. When you use the lock keyword the CLR will modify this sync block value accordingly.
There are advantages and disadvantages to this scheme. The advantage is that it made for a fairly elegant solution to defining critical sections. One obvious disadvantage is that each object instance contains the sync block and most instances never use it so it would seem to be a waste of space in most cases. Another disadvantage is that boxed value types can be used which is almost always wrong and certainly leads to confusion.
I remember way back when .NET was first released that there was a lot of chatter over whether the lock keyword was good or bad for the language. The general consensus (at least as I remember it) was that it was bad because the using keyword could have been easily used instead. In fact, a solution that used the using keyword actually would have made more sense because it could have been done without the need for the sync block. The C# design team even went on record to say that had they been given a second chance the lock keyword never would have made it into the language. [1]
[1] The only reference I could find for this is on Jon Skeet's website here.

Are there any cases when it's preferable to use a plain old Thread object instead of one of the newer constructs?

I see a lot of people in blog posts and here on SO either avoiding or advising against the usage of the Thread class in recent versions of C# (and I mean of course 4.0+, with the addition of Task & friends). Even before, there were debates about the fact that a plain old thread's functionality can be replaced in many cases by the ThreadPool class.
Also, other specialized mechanisms are further rendering the Thread class less appealing, such as Timers replacing the ugly Thread + Sleep combo, while for GUIs we have BackgroundWorker, etc.
Still, the Thread seems to remain a very familiar concept for some people (myself included), people that, when confronted with a task that involves some kind of parallel execution, jump directly to using the good old Thread class. I've been wondering lately if it's time to amend my ways.
So my question is, are there any cases when it's necessary or useful to use a plain old Thread object instead of one of the above constructs?
The Thread class cannot be made obsolete because obviously it is an implementation detail of all those other patterns you mention.
But that's not really your question; your question is
are there any cases when it's necessary or useful to use a plain old Thread object instead of one of the above constructs?
Sure. In precisely those cases where one of the higher-level constructs does not meet your needs.
My advice is that if you find yourself in a situation where existing higher-abstraction tools do not meet your needs, and you wish to implement a solution using threads, then you should identify the missing abstraction that you really need, and then implement that abstraction using threads, and then use the abstraction.
Threads are a basic building block for certain things (namely parallelism and asynchrony) and thus should not be taken away. However, for most people and most use cases there are more appropriate things to use which you mentioned, such as thread pools (which provide a nice way of handling many small jobs in parallel without overloading the machine by spawning 2000 threads at once), BackgroundWorker (which encapsulates useful events for a single shortlived piece of work).
But just because in many cases those are more appropriate, as they shield the programmer from needlessly reinventing the wheel, making stupid mistakes and the like, that does not mean that the Thread class is obsolete. It is still used by the abstractions named above, and you would still need it if you need fine-grained control over threads that is not covered by the more specialized classes.
In a similar vein, .NET doesn't forbid the use of arrays, despite List<T> being a better fit for many cases where people use arrays. Simply because you may still want to build things that are not covered by the standard lib.
Task and Thread are different abstractions. If you want to model a thread, the Thread class is still the most appropriate choice. E.g. if you need to interact with the current thread, I don't see any better types for this.
However, as you point out .NET has added several dedicated abstractions which are preferable over Thread in many cases.
The Thread class is not obsolete, it is still useful in special circumstances.
Where I work we wrote a 'background processor' as part of a content management system: a Windows service that monitors directories, e-mail addresses and RSS feeds, and every time something new shows up execute a task on it - typically to import the data.
Attempts to use the thread pool for this did not work: it tried to execute too much at the same time and thrashed the disks, so we implemented our own polling and execution system using the Thread class directly.
The new options make direct use and management of the (expensive) threads less frequent.
people that, when confronted with a task that involves some kind of parallel execution, jump directly to using the good old Thread class.
Which is a very expensive and relatively complex way of doing stuff in parallel.
Note that the expense matters most: You cannot use a full thread to do a small job, it would be counterproductive. The ThreadPool combats the costs, the Task class the complexities (exceptions, waiting and canceling).
To answer the question of "are there any cases when it's necessary or useful to use a plain old Thread object", I'd say a plain old Thread is useful (but not necessary) when you have a long running process that you won't ever interact with from a different thread.
For example, if you're writing an application that subscribes to receive messages from some sort of message queue, and your application is going to do more than just process those messages, then it would be useful to use a Thread, because the thread will be self-contained (i.e. you aren't waiting on it to get done) and it isn't short-lived. The ThreadPool class is more for queuing up a bunch of short-lived work items and letting the pool process each one efficiently as a thread becomes available. Tasks can be used where you would use Thread directly, but in the above scenario I don't think they would buy you much. They help you interact with the thread more easily (which the above scenario doesn't need), and they help determine how many threads should actually be used for the given set of tasks based on the number of processors you have (which isn't what you want, so you'd tell the Task your work is LongRunning, in which case the current 4.0 implementation would simply create a separate non-pooled Thread).
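The two options compared above could be sketched roughly as follows. The class and method names are hypothetical, and listenLoop stands in for whatever message-processing loop the application runs. Whether a LongRunning task really gets its own non-pooled thread is an implementation detail, as the answer notes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Option 1: a plain dedicated thread for a self-contained, long-lived loop.
    public static Thread StartListenerThread(Action listenLoop)
    {
        var thread = new Thread(() => listenLoop())
        {
            IsBackground = true,
            Name = "queue-listener"
        };
        thread.Start();
        return thread;
    }

    // Option 2: a Task with the LongRunning hint, which tells the scheduler
    // that this work should not occupy a regular pool thread.
    public static Task StartListenerTask(Action listenLoop)
    {
        return Task.Factory.StartNew(
            listenLoop,
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
}
```

The Task version mainly adds a handle you can wait on and observe exceptions through. If nothing ever waits on the listener, the plain Thread says the same thing more directly.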
Probably not the answer you were expecting, but I use Thread all the time when coding against the .NET Micro Framework. MF is quite cut down and doesn't include higher level abstractions and the Thread class is super flexible when you need to get the last bit of performance out of a low MHz CPU.
You could compare the Thread class to ADO.NET. It's not the recommended tool for getting the job done, but it's not obsolete. Other tools build on top of it to ease the job.
It's not wrong to use the Thread class over other things, especially if those things don't provide functionality that you need.
It's definitely not obsolete.
The problem with multithreaded apps is that they are very hard to get right (behavior is often nondeterministic, and input, output and internal state all matter), so a programmer should push as much work as possible to the framework and tools. Abstract it away. But the mortal enemy of abstraction is performance.
So my question is, are there any cases when it's necessary or useful
to use a plain old Thread object instead of one of the above
constructs?
I'd go with Threads and locks only if there are serious performance problems or demanding performance goals.
I've always used the Thread class when I need to keep count and control over the threads I've spun up. I realize I could use the threadpool to hold all of my outstanding work, but I've never found a good way to keep track of how much work is currently being done or what the status is.
Instead, I create a collection and place the threads in them after I spin them up - the very last thing a thread does is remove itself from the collection. That way, I can always tell how many threads are running, and I can use the collection to ask each what it's doing. If there's a case when I need to kill them all, normally you'd have to set some kind of "Abort" flag in your application, wait for every thread to notice that on its own and self-terminate - in my case, I can walk the collection and issue a Thread.Abort to each one in turn.
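A rough sketch of the bookkeeping described above, with one deliberate substitution: it signals the threads through a CancellationToken instead of calling Thread.Abort, because Thread.Abort is not supported on .NET Core and .NET 5 or later. The class name ThreadTracker and its members are hypothetical, not the author's code.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch: keep every running thread in a collection so the
// caller can count them and ask them all to stop.
public sealed class ThreadTracker
{
    private readonly object _gate = new object();
    private readonly HashSet<Thread> _running = new HashSet<Thread>();
    private readonly CancellationTokenSource _stop = new CancellationTokenSource();

    public int RunningCount
    {
        get { lock (_gate) { return _running.Count; } }
    }

    // Start a tracked thread; `work` should watch the token and return once it is cancelled.
    public void Start(Action<CancellationToken> work)
    {
        Thread thread = null;
        thread = new Thread(() =>
        {
            try { work(_stop.Token); }
            finally
            {
                // The very last thing a thread does is remove itself.
                lock (_gate) { _running.Remove(thread); }
            }
        });
        thread.IsBackground = true;
        lock (_gate)
        {
            // Register and start under the lock so StopAll never sees an unstarted thread.
            _running.Add(thread);
            thread.Start();
        }
    }

    // Signal every thread to stop, then wait for each one to finish.
    public void StopAll()
    {
        _stop.Cancel();
        Thread[] snapshot;
        lock (_gate)
        {
            snapshot = new Thread[_running.Count];
            _running.CopyTo(snapshot);
        }
        foreach (Thread thread in snapshot) thread.Join();
    }
}
```

The trade-off against Thread.Abort is that each piece of work has to check the token at sensible points. In return, a thread is never torn down in the middle of an operation.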
In that case, I haven't found a better way than working directly with the Thread class. As Eric Lippert mentioned, the others are just higher-level abstractions, and it's appropriate to work with the lower-level classes when the available high-level implementations don't meet your need. Just as you sometimes need to do Win32 API calls when .NET doesn't address your exact needs, there will always be cases where the Thread class is the best choice despite recent "advancements."

C# Delegates and Threads!

What exactly do I need delegates, and threads for?
Delegates act as the logical (but safe) equivalent to function-pointers; they allow you to talk about an operation in an abstract way. The typical example of this is events, but I'm going to use a more "functional programming" example: searching in a list:
List<Person> people = ...
Person fred = people.Find( x => x.Name == "Fred");
Console.WriteLine(fred.Id);
The "lambda" here is essentially an instance of a delegate - a delegate of type Predicate<Person> - i.e. "given a person, is something true or false". Using delegates allows very flexible code - i.e. the List<T>.Find method can find all sorts of things based on the delegate that the caller passes in.
In this way, they act largely like a 1-method interface - but much more succinctly.
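To make the point concrete, here is a small self-contained sketch. The Person class and the sample data are made up for illustration, since the snippet above leaves the list contents out. It shows that the same Find call accepts either a lambda or a named method, because both convert to Predicate<Person>.

```csharp
using System;
using System.Collections.Generic;

public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class DelegateDemo
{
    // Made-up sample data, for illustration only.
    public static List<Person> SamplePeople()
    {
        return new List<Person>
        {
            new Person { Id = 1, Name = "Wilma" },
            new Person { Id = 2, Name = "Fred" },
            new Person { Id = 3, Name = "Barney" }
        };
    }

    // A named method whose signature matches Predicate<Person>.
    public static bool IsFred(Person p)
    {
        return p.Name == "Fred";
    }

    // The caller decides what "matching" means by passing a delegate.
    public static Person FindFirst(List<Person> people, Predicate<Person> match)
    {
        return people.Find(match);
    }
}
```

Calling DelegateDemo.FindFirst(people, DelegateDemo.IsFred) and DelegateDemo.FindFirst(people, x => x.Name == "Fred") should give the same result. When nothing matches, List<T>.Find returns the default value, which is null for a class like Person.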
Delegates: Basically, a delegate is a way to reference a method. It's like a pointer to a method: you can point it at any method that matches its signature, and use it to pass the reference to that method around.
A thread is a sequential stream of instructions that execute one after another to complete a computation. You can have different threads running simultaneously to accomplish a specific task. At any given moment, a thread runs on a single logical processor.
Delegates are used to add methods to events dynamically.
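As a small illustration of adding and removing methods on an event at run time, consider the sketch below. The Importer class and its ProgressChanged event are hypothetical names chosen for the example.

```csharp
using System;

public class Importer
{
    // Outside code can only subscribe (+=) or unsubscribe (-=).
    public event Action<int> ProgressChanged;

    public void Run()
    {
        for (int percent = 0; percent <= 100; percent += 50)
        {
            // Calls every currently subscribed handler; does nothing if there are none.
            ProgressChanged?.Invoke(percent);
        }
    }
}
```

A caller would write importer.ProgressChanged += handler; to start receiving notifications and importer.ProgressChanged -= handler; to stop, where handler is any method or lambda that takes an int.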
Threads run inside of processes, and allow you to run 2 or more tasks at once that share resources.
I'd suggest searching on these terms; there is plenty of information out there. They are pretty fundamental concepts, and Wikipedia is a good high-level place to start:
http://en.wikipedia.org/wiki/Thread_(computer_science)
http://en.wikipedia.org/wiki/C_Sharp_(programming_language)
Concrete examples always help me so here is one for threads. Consider your web server. As requests arrive at the server, they are sent to the Web Server process for handling. It could handle each as it arrives, fully processing the request and producing the page before turning to the next one. But consider just how much of the processing takes place at hard drive speeds (rather than CPU speeds) as the requested page is pulled from the disk (or data is pulled from the database) before the response can be fulfilled.
By pulling threads from a thread pool and giving each request its own thread, we can take care of the non-disk needs for hundreds of requests before the disk has returned data for the first one. This will permit a degree of virtual parallelism that can significantly enhance performance. Keep in mind that there is a lot more to Web Server performance but this should give you a concrete model for how threading can be useful.
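A toy sketch of the idea, not how a real web server is built: each simulated request is handed to a pool thread, and the caller waits until all of them have finished. PoolDemo and the handleRequest callback are hypothetical names.

```csharp
using System;
using System.Threading;

public static class PoolDemo
{
    // Handle each "request" on a pool thread and wait until all are done.
    // Returns how many requests completed without throwing.
    public static int HandleAll(int requestCount, Action<int> handleRequest)
    {
        int handled = 0;
        using (var done = new CountdownEvent(requestCount))
        {
            for (int i = 0; i < requestCount; i++)
            {
                int requestId = i; // copy so each callback sees its own id
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try
                    {
                        handleRequest(requestId);
                        Interlocked.Increment(ref handled);
                    }
                    catch (Exception ex)
                    {
                        Console.Error.WriteLine(ex);
                    }
                    finally
                    {
                        done.Signal();
                    }
                });
            }
            done.Wait();
        }
        return handled;
    }
}
```

While one request is blocked waiting on the disk or the database, the pool can run others, which is the "virtual parallelism" described above.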
They are useful for the same reason high-level languages are useful. You don't need them for anything, since really they are just abstractions over what is really happening. They do make things significantly easier and faster to program or understand.
Marc Gravell provided a nice answer for 'what is a delegate.'
Andrew Troelsen defines a thread as
...a path of execution within a process. "Pro C# 2008 and the .NET 3.5 Platform," APress.
All processes that are run on your system have at least one thread. Let's call it the main thread. You can create additional threads for any variety of reasons, but the clearest example for illustrating the purpose of threads is printing.
Let's say you open your favorite word processing application (WPA), type a few lines, and then want to print those lines. If your WPA uses the main thread to print the document, the WPA's user interface will be 'frozen' until the printing is finished. This is because the main thread has to print the lines before it can process any user interface events, i.e., button clicks, mouse movements, etc. It's as if the code were written like this:
do
{
    ProcessUserInterfaceEvents();
    PrintDocument();
} while (true);
Clearly, this is not what users want. Users want the user interface to be responsive while the document is being printed.
The answer, of course, is to print the lines in a second thread. In this way, the user interface can focus on processing user interface events while the secondary thread focuses on printing the lines.
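In code, moving the printing to a second thread might look roughly like this. PrintDocument here is a stand-in that only sleeps to simulate slow work, and all names are hypothetical.

```csharp
using System;
using System.Threading;

public static class PrintDemo
{
    // Stand-in for slow printing work; it only sleeps once per line.
    public static void PrintDocument(string[] lines)
    {
        foreach (string line in lines)
        {
            Thread.Sleep(10);
        }
    }

    // Start printing on a secondary thread and return immediately,
    // so the calling (main) thread stays free to process UI events.
    public static Thread PrintInBackground(string[] lines)
    {
        var printer = new Thread(() => PrintDocument(lines))
        {
            IsBackground = true,
            Name = "printer"
        };
        printer.Start();
        return printer;
    }
}
```

The main loop then only needs to call ProcessUserInterfaceEvents(), and it can check the returned thread's IsAlive property if it wants to know whether printing has finished.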
The illusion is that both tasks happen simultaneously. On a single processor machine, this cannot be true since the processor can only execute one thread at a time. However, switching between the threads happens so fast that the illusion is usually maintained. On a multi-processor (or multi-core) machine, this can be literally true since the main thread can run on one processor while the secondary thread runs on another processor.
In .NET, threading is a breeze. You can utilize the System.Threading.ThreadPool class, use asynchronous delegates, or create your own System.Threading.Thread objects.
If you are new to threading, I would throw out two cautions.
First, you can actually hurt your application's performance if you choose the wrong threading model. Be careful to avoid using too many threads or trying to thread things that should really happen sequentially.
Second (and more importantly), be aware that if you share data between threads, you will likely need to synchronize access to that shared data, e.g., using the lock keyword in C#. There is a wealth of information on this topic available online, so I won't repeat it here. Just be aware that you can run into intermittent, not-always-repeatable bugs if you do not do this carefully.
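A minimal sketch of the lock keyword protecting shared data, assuming a simple shared counter. The Counter and CounterDemo names are hypothetical.

```csharp
using System.Threading;

public sealed class Counter
{
    private readonly object _sync = new object();
    private int _value;

    public int Value
    {
        get { lock (_sync) { return _value; } }
    }

    public void Increment()
    {
        // Only one thread at a time may run this read-modify-write.
        lock (_sync) { _value++; }
    }
}

public static class CounterDemo
{
    // Start several threads that all increment the same counter,
    // wait for them to finish, and return the final value.
    public static int Run(int threadCount, int incrementsPerThread)
    {
        var counter = new Counter();
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < incrementsPerThread; n++) counter.Increment();
            });
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
        return counter.Value;
    }
}
```

Without the lock, _value++ is a separate read, add and write, so two threads can overwrite each other's update and the final count can come out lower than expected. That is exactly the kind of intermittent bug mentioned above.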
Your question is too vague...
But you probably just want to know how to use them in order to have a window, a time consuming process running and a progress bar...
So create a thread to do the time consuming process and use the delegates to increase the progress bar! :)
