CLR via C# 4th Ed. - Confused about waiting for Task deadlock

CLR via C# 4th Ed. - Confused about waiting for Task deadlock - c#

Jeffrey Richter pointed out in his book 'CLR via C#' the example of a possible deadlock I don't understand (page 702, bordered paragraph).
The example is a thread that runs Task and call Wait() for this Task. If the Task is not started it should possible that the Wait() call is not blocking, instead it's running the not started Task. If a lock is entered before the Wait() call and the Task also try to enter this lock can result in a deadlock.
But the locks are entered in the same thread, should this end up in a deadlock scenario?
The following code produce the expected output.
class Program
{
static object lockObj = new object();
static void Main(string[] args)
{
Task.Run(() =>
{
Console.WriteLine("Program starts running on thread {0}",
Thread.CurrentThread.ManagedThreadId);
var taskToRun = new Task(() =>
{
lock (lockObj)
{
for (int i = 0; i < 10; i++)
Console.WriteLine("{0} from Thread {1}",
i, Thread.CurrentThread.ManagedThreadId);
}
});
taskToRun.Start();
lock (lockObj)
{
taskToRun.Wait();
}
}).Wait() ;
}
}
/* Console output
Program starts running on thread 3
0 from Thread 3
1 from Thread 3
2 from Thread 3
3 from Thread 3
4 from Thread 3
5 from Thread 3
6 from Thread 3
7 from Thread 3
8 from Thread 3
9 from Thread 3
*/
No deadlock occured.
J. Richter wrote in his book "CLR via C#" 4th Edition on page 702:
When a thread calls the Wait method, the system checks if the Task that the thread is waiting for has started executing. If it has, then the thread calling Wait will block until the Task has completed running. But if the Task has not started executing yet, then the system may (depending on the TaskScheduler) execute the Trask by using the thread that called Wait. If this happens, then the thread calling Wait does not block; it executes the Task and returns immediatlely. This is good in that no thread has blocked, thereby reducing resource usage (by not creating a thread to replace the blocked thread) while improving performance (no time is spet to create a thread an there is no contexte switcing). But it can also be bad if, for example, thre thread has taken a thread synchronization lock before calling Wait and thren the Task tries to take the same lock, resulting in a deadlocked thread!
If I'm understand the paragraph correctly, the code above has to end in a deadlock!?

You're taking my usage of the word "lock" too literally. The C# "lock" statement (which my book discourages the use of), internally leverages Monitor.Enter/Exit. The Monitor lock is a lock that supports thread ownership & recursion. Therefore, a single thread can acquire this kind of lock multiple times successfully. But, if you use a different kind of lock, like a Semaphore(Slim), an AutoResetEvent(Slim) or a ReaderWriterLockSlim (without recursion), then when a single thread tries to acquire any of these locks multiple times, deadlock occurs.

In this example, you're dealing with task inlining, a not-so-rare behavior of the TPL's default task scheduler. It results in the task being executed on the same thread which is already waiting for it with Task.Wait(), rather than on a random pool thread. In which case, there is no deadlock.
Change your code like below and you'll have a dealock:
taskToRun.Start();
lock (lockObj)
{
//taskToRun.Wait();
((IAsyncResult)taskToRun).AsyncWaitHandle.WaitOne();
}
The task inlining is nondeterministic, it may or may not happen. You should make no assumptions. Check Task.Wait and “Inlining” by Stephen Toub for more details.
Updated, the lock does not affect the task inlining here. Your code still runs without deadlock if you move taskToRun.Start() inside the lock:
lock (lockObj)
{
taskToRun.Start();
taskToRun.Wait();
}
What does cause the inlining here is the circumstance that the main thread is calling taskToRun.Wait() right after taskToRun.Start(). Here's what happens behind the scene:
taskToRun.Start() queues the task for execution by the task scheduler, but it hasn't been allocated a pool thread yet.
On the same thread, the TPL code inside taskToRun.Wait() checks if the task has already been allocated a pool thread (it hasn't) and executes it inline on the main thread. In which case, it's OK to acquired the same lock twice without a deadlock.
There is also a TPL Task Scheduler thread. If this thread gets a chance to execute before taskToRun.Wait() is called on the main thread, inlining doesn't happen and you get a deadlock. Adding Thread.Sleep(100) before Task.Wait() would be modelling this scenario. Inlining also doesn't happen if you don't use Task.Wait() and rather use something like AsyncWaitHandle.WaitOne() above.
As to the quote you've added to your question, it depends on how you read it. One thing is for sure: the same lock from the main thread can be entered inside the task, when the task gets inlined, without a deadlock. You just cannot make any assumptions that it will get inlined.

In your example, no deadlock occurs because the thread scheduling the task and the thread executing the task happen to be the same. If you were to modify the code such that your task ran on a different thread, you would see the deadlock occur, because two threads would then be contending for a lock on the same object.
Your example, modified to create a deadlock:
class Program {
static object lockObj = new object();
static void Main(string[] args) {
Console.WriteLine("Program starts running on thread {0}",
Thread.CurrentThread.ManagedThreadId);
var taskToRun = new Task(() => {
lock (lockObj) {
for (int i = 0; i < 10; i++)
Console.WriteLine("{0} from Thread {1}",
i, Thread.CurrentThread.ManagedThreadId);
}
});
lock (lockObj) {
taskToRun.Start();
taskToRun.Wait();
}
}
}

This example code has two standard threading problems. To understand it, you first have to understand thread races. When you start a thread, you can never assume it will start running right away. Nor can you assume that the code inside the thread arrives at a particular statement at a particular moment in time.
What matters a great deal here is whether or not the task arrives at the lock statement before the main thread does. In other words, whether it races ahead of the code in the main thread. Do model this as a horse race, the thread that acquired the lock is the horse that wins.
If it is the task that wins, pretty common on modern machines with multiple processor cores or a simple program that doesn't have any other threads active (and probably when you test the code) then nothing goes wrong. It acquires the lock and prevents the main thread from doing the same when it, later, arrives at the lock statement. So you'll see the console output, the task finishes, the main thread now acquires the lock and the Wait() call quickly completes.
But if the thread pool is already busy with other threads, or the machine is busy executing threads in other programs, or you are unlucky and you get an email just as the task starts running, then the code in the task doesn't start running right away and it is the main thread that acquired the lock first. The task can now no longer enter the lock statement so it cannot complete. And the main thread can not complete, Wait() will never return. A deadly embrace called deadlock.
Deadlock is relatively easy to debug, you've got all the time in the world to attach a debugger and look at the active threads to see why they are blocked. Threading race bugs are incredibly difficult to debug, they happen too infrequently and it can be very difficult to reason through the ordering problem that causes them. A common approach to diagnose thread races is to add tracing to the program so you can see the order. Which changes the timing and can make the bug disappear. Lots of programs were shipped with the tracing left on because they couldn't diagnose the problem :)

Thanks #jeffrey-richter for pointing it out, #embee there are scenario when we use locks other than Monitor than a single thread tries to acquire any of these locks multiple times, deadlock occurs. Check out the example below
The following code produce the expected deadlock. It need not be nested task the deadlock can occur without nesting also
class Program
{
static AutoResetEvent signalEvent = new AutoResetEvent(false);
static void Main(string[] args)
{
Task.Run(() =>
{
Console.WriteLine("Program starts running on thread {0}",
Thread.CurrentThread.ManagedThreadId);
var taskToRun = new Task(() =>
{
signalEvent.WaitOne();
for (int i = 0; i < 10; i++)
Console.WriteLine("{0} from Thread {1}",
i, Thread.CurrentThread.ManagedThreadId);
});
taskToRun.Start();
signalEvent.Set();
taskToRun.Wait();
}).Wait() ;
}
}

Related

Await operator in C# spanning a background Thread

Lets assume I have the following simple program which uses the await operator in both DownloadDocsMainPageAsync() and Main(). While I understand that the current awaitable method gets suspended and continue from that point after the results are available, I need some clarity on the following points .
a) If the execution from Main() starts on Thread A from the threadpool , as soon as it encounters the await operator will this thread be returned to the threadpool for executing other operations in the program , for eg: if its a web app then for invocation of some Controller methods after button clicks from UI?
b) Will the await operator always take the execution on a new thread from the threadpool or in this case assuming there is no other method to be executed apart from Main() ,will it continue execution on the same thread itself (ThreadA)? If my understanding is correct who decides this , is it Garbage collector of CLR?
using System;
using System.Net.Http;
using System.Threading.Tasks;
public class AwaitOperator
{
public static async Task Main()
{
Task<int> downloading = DownloadDocsMainPageAsync();
Console.WriteLine($"{nameof(Main)}: Launched downloading.");
int bytesLoaded = await downloading;
Console.WriteLine($"{nameof(Main)}: Downloaded {bytesLoaded} bytes.");
}
private static async Task<int> DownloadDocsMainPageAsync()
{
Console.WriteLine($"{nameof(DownloadDocsMainPageAsync)}: About to start downloading.");
var client = new HttpClient();
byte[] content = await client.GetByteArrayAsync("https://learn.microsoft.com/en-us/");
Console.WriteLine($"{nameof(DownloadDocsMainPageAsync)}: Finished downloading.");
return content.Length;
}
}

Actually, async/await is not about threads (almost), but just about control flow control. So, in your code execution goes in a Main thread (which is not from a thread pool, by the way) until reaches await client.GetByteArrayAsync. Here the real low-level downloading is internally offloaded to the OS level and the program just waits the downloading result. And still no additional thread are spawned. But, when downloading finished, the .NET runtime want to continue execution after await. And here it can see no SynchronizationContext (as a console application does not has it) and then runtime executes the code after await in any thread available in thread pool. So, the rest of code after downloading will be executed in the thread from the pool.
If you will add a SynchronizationContext (or just move the code in WinForms app where the context exists out-of-the-box) you will see that all code will be executed in the main thread on no threads will be spawned/taken in from the thread pool as the runtime will see SynchronizationContext and will schedule after-await code on the original thread.
So, the answers
a) Main starts on the Main thread, not on the thread pool's thread. await itself does not actually spawn any threads. On the await, if the current thread was from thread pool, this thread will be put back in thread pool and will be available for future work. There is an exception, when the await will continue immediately and synchronously (see below).
b) runtime decides on which thread execution will be continued after 'await' depending of the current SynchronizationContext, ConfigureAwait settings and the availability of the operation result on the moment of reaching await.
In particular
if SynchronizationContext present and ConfigureAwait is set to true (or omitted), then code always continue in the current thread.
if SynchronizationContext does not present or ConfigureAwait is set to false, code will continue in any available thread (main thread or thread pool)
if you write something like
var task = DoSomeWorkAsync();
//some synchronous work which takes a while
await task;
then you can have a situation, when task is already finished on the moment when the code reaches await. In this case runtime can continue execution after await synchronously in the same thread. But this case is implementation-specific, as I know.
additionally, this is a special class TaskCompletionSource<TResult> (docs here) which provides explicit control over the task state and, in particular, may switch execution on any thread selected by the code owning TaskCompletionSource instance (see sample in #TheodorZoulias comment or here).

C# Cannot join main thread

I have very simple code which behavior I cannot understand.
class Program
{
static void Main(string[] args)
{
// Get reference to main thread
Thread mainThread = Thread.CurrentThread;
// Start second thread
new Thread(() =>
{
Console.WriteLine("Working...");
Thread.Sleep(1000);
Console.WriteLine("Work finished. Waiting for main thread to end...");
mainThread.Join(); // Obviously this join cannot pass
Console.WriteLine("This message never prints. Why???");
}).Start();
Thread.Sleep(300);
Console.WriteLine("Main thread ended");
}
}
Output of this never-ending program is:
Working...
Main thread ended
Work finished. Waiting for main thread to end...
Why the threads' code get stuck on the Join() method call? By other print outs can be find out, that already before Join() call, the propertyIsAlive of mainThread is set to false and the ThreadState is Background, Stopped, WaitSleepJoin. Also removing sleeps doesn't make any difference.
What's the reasoning for this behavior? What mystery is under the hood of Join() method and execution of Main method?

Join() works as you expect it to, the issue here is with the assumption that the thread running Main() terminates when Main() returns, which is not always the case.
Your Main() method is called by the .NET framework, and when the method returns, there is additional code executed by the framework before the main thread (and hence the process) exits. Specifically, one of the things the framework does as part of the post-Main code is wait for all foreground threads to exit.
This essentially results in a classic deadlock situation - the main thread is waiting for your worker thread to exit, and your worker thread is waiting for the main thread to exit.
Of course, if you make your worker thread a background thread, (by setting IsBackground = true before starting it) then the post-Main code won't wait for it to exit, eliminating the deadlock. However your Join() will still never return because when the main thread does exit, the process also exits.
For more details on the framework internals that run before and after Main(), you can take a look at the .NET Core codebase on GitHub. The overall method that runs Main() is Assembly::ExecuteMainMethod, and the code that runs after Main() returns is Assembly::RunMainPost.

Thread blocked after await

With this code:
static void Main(string[] args)
{
Console.WriteLine("Main Thread Pre - " + GetNativeThreadId(System.Threading.Thread.CurrentThread));
Task.Run(() => AsyncMethod()).Wait();
Console.WriteLine("Main Thread Post - " + GetNativeThreadId(System.Threading.Thread.CurrentThread));
Console.ReadKey();
}
static async Task AsyncMethod()
{
Console.WriteLine("AsyncMethod Thread Pre - " + GetNativeThreadId(System.Threading.Thread.CurrentThread));
await Task.Delay(4000).ConfigureAwait(false);
Console.WriteLine("AsyncMethod Thread Post - " + GetNativeThreadId(System.Threading.Thread.CurrentThread));
}
The output is:
Main Thread Pre - 8652
AsyncMethod Thread Pre - 4764
AsyncMethod Thread Post - 1768
Main Thread Post - 8652
Using the Concurrency Visualizer, I can see that during the 4 second delay, thread 4764 is stuck in Synchronization. It is eventually unblocked by the main thread on shutdown.
Shouldn't thread 4764 be returned to the ThreadPool once it hits the await? (That being said I don't know what that would look like inside the Concurrency Visualizer)

Shouldn't thread 4764 be returned to the ThreadPool once it hits the await?
Yes. And it is.
(That being said I don't know what that would look like inside the Concurrency Visualizer)
That's easy enough to check. Just explicitly execute some code in the thread pool, and take a look at what that thread looks like in the visualizer when it's not busy.
For example:
ThreadPool.QueueUserWorkItem(o =>
{
Console.WriteLine("worker: " + GetNativeThreadId(System.Threading.Thread.CurrentThread));
Thread.Sleep(250);
});
(I added the sleep so it shows up more easily in the visualizer as having done something :) ).
And when you do, you'll see that what it looks like is just like what you see. :)
When I ran this, the thread pool even used the same worker thread that it used for the original Task. And you can see that a thread pool worker thread sits in the Synchronization state while it's waiting for more work.
Which makes sense. At an abstract level, what's the thread pool doing? The whole point is for it to have threads which already exist. But you don't want those threads actually out there working unless they have something to work on. That would burn CPU time for no reason. So instead, they wait on a synchronization object.
When something (like Task) wants to use one, it queues a work item, and then the thread pool signals to the thread it's got something to do. This wakes up the thread, it does its work, and then it blocks on the synchronization object again, waiting for something else to do.
If you check the call stack for the relevant threads, you'll see the worker thread waiting on a call to WaitForSingleObject(), and you'll see that the thread pool ultimately unblocks the thread using ReleaseSemaphore().
And this shows up as the Synchronization state for the thread pool thread, just as you saw.

Task continuation parallel execution with async/await

In the context of a console application making use of async/await constructs, I would like to know if it's possible for "continuations" to run in parallel on multiple threads on different CPUs.
I think this is the case, as continuations are posted on the default task scheduler (no SynchronizationContext in console app), which is the thread pool.
I know that async/await construct do not construct any additional thread. Still there should be at least one thread constructed per CPU by the thread pool, and therefore if continuations are posted on the thread pool, it could schedule task continuations in parrallel on different CPUs ... that's what I thought, but for some reason I got really confused yesterday regarding this and I am not so sure anymore.
Here is some simple code :
public class AsyncTest
{
int i;
public async Task DoOpAsync()
{
await SomeOperationAsync();
// Does the following code continuation can run
// in parrallel ?
i++;
// some other continuation code ....
}
public void Start()
{
for (int i=0; i<1000; i++)
{ var _ = DoOpAsync(); } // dummy variable to bypass warning
}
}
SomeOperationAsync does not create any thread in itself, and let's say for the sake of the example that it just sends some request asynchronously relying on I/O completion port so not blocking any thread at all.
Now, if I call Start method which will issue 1000 async operations, is it possible for the continuation code of the async method (after the await) to be run in parallel on different CPU threads ? i.e do I need to take care of thread synchronization in this case and synchronize access to field "i" ?

Yes, you should put thread synchronization logic around i++ because it is possible that multiple threads would be executing code after await at the same time.
As a result of your for loop, number of Tasks will be created. These Tasks will be executed on different Thread Pool threads. Once these Tasks are completed the continuation i.e. the code after the await, will be executed again on different Thread Pool threads. This makes it possible that multiple threads would be doing i++ at the same time

Your understanding is correct: in Console applications, by default continuations will be scheduled to the thread pool due to the default SynchronizationContext.
Each async method does start synchronously, so your for loop will execute the beginning of DoOpAsync on the same thread. Assuming that SomeOperationAsync returns an incomplete Task, the continuations will be scheduled on the thread pool.
So each of the invocations of DoOpAsync may continue in parallel.

c#: what is a thread polling?

What does it mean when one says no polling is allowed when implimenting your thread solution since it's wasteful, it has latency and it's non-deterministic. Threads should not use polling to signal each other.
EDIT
Based on your answers so far, I believe my threading implementation (taken from: http://www.albahari.com/threading/part2.aspx#_AutoResetEvent) below is not using polling. Please correct me if I am wrong.
using System;
using System.Threading;
using System.Collections.Generic;
class ProducerConsumerQueue : IDisposable {
EventWaitHandle _wh = new AutoResetEvent (false);
Thread _worker;
readonly object _locker = new object();
Queue<string> _tasks = new Queue<string>();
public ProducerConsumerQueue() (
_worker = new Thread (Work);
_worker.Start();
}
public void EnqueueTask (string task) (
lock (_locker) _tasks.Enqueue (task);
_wh.Set();
}
public void Dispose() (
EnqueueTask (null); // Signal the consumer to exit.
_worker.Join(); // Wait for the consumer's thread to finish.
_wh.Close(); // Release any OS resources.
}
void Work() (
while (true)
{
string task = null;
lock (_locker)
if (_tasks.Count > 0)
{
task = _tasks.Dequeue();
if (task == null) return;
}
if (task != null)
{
Console.WriteLine ("Performing task: " + task);
Thread.Sleep (1000); // simulate work...
}
else
_wh.WaitOne(); // No more tasks - wait for a signal
}
}
}

Your question is very unclear, but typically "polling" refers to periodically checking for a condition, or sampling a value. For example:
while (true)
{
Task task = GetNextTask();
if (task != null)
{
task.Execute();
}
else
{
Thread.Sleep(5000); // Avoid tight-looping
}
}
Just sleeping is a relatively inefficient way of doing this - it's better if there's some coordination so that the thread can wake up immediately when something interesting happens, e.g. via Monitor.Wait/Pulse or Manual/AutoResetEvent... but depending on the context, that's not always possible.
In some contexts you may not want the thread to actually sleep - you may want it to become available for other work. For example, you might use a Timer of one sort or other to periodically poll a mailbox to see whether there's any incoming mail - but you don't need the thread to actually be sleeping when it's not checking; it can be reused by another thread-pool task.

Here you go: check out this website:
http://msdn.microsoft.com/en-us/library/dsw9f9ts%28VS.71%29.aspx
Synchronization Techniques
There are two approaches to synchronization, polling and using synchronization objects. Polling repeatedly checks the status of an asynchronous call from within a loop. Polling is the least efficient way to manage threads because it wastes resources by repeatedly checking the status of the various thread properties.
For example, the IsAlive property can be used when polling to see if a thread has exited. Use this property with caution because a thread that is alive is not necessarily running. You can use the thread's ThreadState property to get more detailed information about a thread's status. Because threads can be in more than one state at any given time, the value stored in ThreadState can be a combination of the values in the System.Threading.Threadstate enumeration. Consequently, you should carefully check all relevant thread states when polling. For example, if a thread's state indicates that it is not Running, it may be done. On the other hand, it may be suspended or sleeping.
Waiting for a Thread to Finish
The Thread.Join method is useful for determining if a thread has completed before starting another task. The Join method waits a specified amount of time for a thread to end. If the thread ends before the timeout, Join returns True; otherwise it returns False. For information on Join, see Thread.Join Method
Polling sacrifices many of the advantages of multithreading in return for control over the order that threads run. Because it is so inefficient, polling generally not recommended. A more efficient approach would use the Join method to control threads. Join causes a calling procedure to wait either until a thread is done or until the call times out if a timeout is specified. The name, join, is based on the idea that creating a new thread is a fork in the execution path. You use Join to merge separate execution paths into a single thread again
One point should be clear: Join is a synchronous or blocking call. Once you call Join or a wait method of a wait handle, the calling procedure stops and waits for the thread to signal that it is done.
Copy
Sub JoinThreads()
Dim Thread1 As New System.Threading.Thread(AddressOf SomeTask)
Thread1.Start()
Thread1.Join() ' Wait for the thread to finish.
MsgBox("Thread is done")
End Sub
These simple ways of controlling threads, which are useful when you are managing a small number of threads, are difficult to use with large projects. The next section discusses some advanced techniques you can use to synchronize threads.
Hope this helps.
PK

Polling can be used in reference to the four asyncronous patterns .NET uses for delegate execution.
The 4 types (I've taken these descriptions from this well explained answer) are:
Polling: waiting in a loop for IAsyncResult.Completed to be true
I'll call you
You call me
I don't care what happens (fire and forget)
So for an example of 1:
Action<IAsyncResult> myAction = (IAsyncResult ar) =>
{
// Send Nigerian Prince emails
Console.WriteLine("Starting task");
Thread.Sleep(2000);
// Finished
Console.WriteLine("Finished task");
};
IAsyncResult result = myAction.BeginInvoke(null,null,null);
while (!result.IsCompleted)
{
// Do something while you wait
Console.WriteLine("I'm waiting...");
}
There's alternative ways of polling, but in general it means "I we there yet", "I we there yet", "I we there yet"

What does it mean when one says no
polling is allowed when implimenting
your thread solution since it's
wasteful, it has latency and it's
non-deterministic. Threads should not
use polling to signal each other.
I would have to see the context in which this statement was made to express an opinion on it either way. However, taken as-is it is patently false. Polling is a very common and very accepted strategy for signaling threads.
Pretty much all lock-free thread signaling strategies use polling in some form or another. This is clearly evident in how these strategies typically spin around in a loop until a certain condition is met.
The most frequently used scenario is the case of signaling a worker thread that it is time to terminate. The worker thread will periodically poll a bool flag at safe points to see if a shutdown was requested.
private volatile bool shutdownRequested;
void WorkerThread()
{
while (true)
{
// Do some work here.
// This is a safe point so see if a shutdown was requested.
if (shutdownRequested) break;
// Do some more work here.
}
}

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.