Task continuation parallel execution with async/await


In the context of a console application making use of async/await constructs, I would like to know if it's possible for "continuations" to run in parallel on multiple threads on different CPUs.
I think this is the case, as continuations are posted on the default task scheduler (no SynchronizationContext in console app), which is the thread pool.
I know that the async/await constructs do not create any additional threads. Still, the thread pool should create at least one thread per CPU, so if continuations are posted to the thread pool, it could schedule them in parallel on different CPUs. That's what I thought, but for some reason I got really confused about this yesterday and I am not so sure anymore.
Here is some simple code:
public class AsyncTest
{
    int i;

    public async Task DoOpAsync()
    {
        await SomeOperationAsync();
        // Can the following continuation code run
        // in parallel?
        i++;
        // some other continuation code ....
    }

    public void Start()
    {
        for (int i = 0; i < 1000; i++)
        { var _ = DoOpAsync(); } // dummy variable to bypass warning
    }
}
SomeOperationAsync does not create any thread itself; let's say, for the sake of the example, that it just sends a request asynchronously, relying on an I/O completion port, so it does not block any thread at all.
Now, if I call the Start method, which issues 1000 async operations, is it possible for the continuation code of the async method (after the await) to run in parallel on different CPU threads? That is, do I need to take care of thread synchronization in this case and synchronize access to the field "i"?

Yes, you should put thread synchronization logic around i++, because it is possible for multiple threads to be executing the code after the await at the same time.
As a result of your for loop, a number of Tasks will be created. When each awaited operation completes, its continuation (the code after the await) is scheduled on a thread pool thread, and different continuations may be picked up by different thread pool threads. This makes it possible for multiple threads to be doing i++ at the same time.

Your understanding is correct: in Console applications, by default continuations will be scheduled to the thread pool, because there is no SynchronizationContext (the current context is null) and the default behavior applies.
Each async method does start synchronously, so your for loop will execute the beginning of DoOpAsync on the same thread. Assuming that SomeOperationAsync returns an incomplete Task, the continuations will be scheduled on the thread pool.
So each of the invocations of DoOpAsync may continue in parallel.
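A minimal sketch of what this means for the counter, assuming Task.Delay as a stand-in for SomeOperationAsync (the class and member names below are hypothetical, not from the question). It increments one field with a plain ++ and another with Interlocked.Increment so the two can be compared:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class CounterDemo
{
    private int _unsafeCount; // incremented with a plain ++ (not atomic)
    private int _safeCount;   // incremented with Interlocked (atomic)

    public int UnsafeCount => _unsafeCount;
    public int SafeCount => _safeCount;

    private async Task DoOpAsync()
    {
        // Assumption: Task.Delay stands in for SomeOperationAsync.
        await Task.Delay(10);

        // With no SynchronizationContext, this continuation runs on
        // whichever thread pool thread picks it up.
        _unsafeCount++;
        Interlocked.Increment(ref _safeCount);
    }

    public Task RunAsync(int operations)
    {
        var tasks = new Task[operations];
        for (int n = 0; n < operations; n++)
            tasks[n] = DoOpAsync();
        return Task.WhenAll(tasks);
    }

    public static async Task Main()
    {
        var demo = new CounterDemo();
        await demo.RunAsync(1000);
        Console.WriteLine($"Interlocked: {demo.SafeCount}");   // always 1000
        Console.WriteLine($"Plain ++:    {demo.UnsafeCount}"); // may be lower if increments raced
    }
}
```

Whether the plain counter actually loses updates on a given run depends on timing, which is exactly why the unsynchronized version cannot be relied on.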

Related

How does asynchronous programming work with threads when using Thread.Sleep()?

Presumptions/Prelude:
In previous questions, we note that Thread.Sleep blocks threads; see: When to use Task.Delay, when to use Thread.Sleep?
We also note that console apps have three threads: The main thread, the GC thread & the finalizer thread IIRC. All other threads are debugger threads.
We know that async does not spin up new threads, and it instead runs on the synchronization context, "uses time on the thread only when the method is active". https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/task-asynchronous-programming-model
Setup:
In a sample console app, we can see that neither the sibling nor the parent code is affected by a call to Thread.Sleep, at least until the await is called (unknown if further).
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

var sw = new Stopwatch();
sw.Start();
Console.WriteLine($"{sw.Elapsed}");
var asyncTests = new AsyncTests();
var go1 = asyncTests.WriteWithSleep();
var go2 = asyncTests.WriteWithoutSleep();
await go1;
await go2;
sw.Stop();
Console.WriteLine($"{sw.Elapsed}");

public class AsyncTests
{
    Stopwatch sw1 = new Stopwatch();

    public async Task WriteWithSleep()
    {
        sw1.Start();
        await Task.Delay(1000);
        Console.WriteLine("Delayed 1 seconds");
        Console.WriteLine($"{sw1.Elapsed}");
        Thread.Sleep(9000); // blocks the thread running this continuation
        Console.WriteLine("Delayed 10 seconds");
        Console.WriteLine($"{sw1.Elapsed}");
        sw1.Stop();
    }

    public async Task WriteWithoutSleep()
    {
        await Task.Delay(3000);
        Console.WriteLine("Delayed 3 second.");
        Console.WriteLine($"{sw1.Elapsed}");
        await Task.Delay(6000);
        Console.WriteLine("Delayed 9 seconds.");
        Console.WriteLine($"{sw1.Elapsed}");
    }
}
Question:
If the thread is blocked from execution during Thread.Sleep, how is it that it continues to process the parent and sibling? Some answer that it is background threads, but I see no evidence of multithreading background threads. What am I missing?

I see no evidence of multithreading background threads. What am I missing?
Possibly you are looking in the wrong place, or using the wrong tools. There's a handy property that might be of use to you, in the form of Thread.CurrentThread.ManagedThreadId. According to the docs,
A thread's ManagedThreadId property value serves to uniquely identify that thread within its process.
The value of the ManagedThreadId property does not vary over time
This means that all code running on the same thread will always see the same ManagedThreadId value. If you sprinkle some extra WriteLines into your code, you'll be able to see that your tasks may run on several different threads during their lifetimes. It is even entirely possible for some async applications to have all their tasks run on the same thread, though you probably won't see that behaviour in your code under normal circumstances.
Here's some example output from my machine, not guaranteed to be the same on yours, nor is it necessarily going to be the same output on successive runs of the same application.
00:00:00.0000030
* WriteWithSleep on thread 1 before await
* WriteWithoutSleep on thread 1 before first await
* WriteWithSleep on thread 4 after await
Delayed 1 seconds
00:00:01.0203244
* WriteWithoutSleep on thread 5 after first await
Delayed 3 second.
00:00:03.0310891
* WriteWithoutSleep on thread 6 after second await
Delayed 9 seconds.
00:00:09.0609263
Delayed 10 seconds
00:00:10.0257838
00:00:10.0898976
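The extra WriteLine calls that produce starred lines like the ones above could look roughly like this. It is only a sketch: the Log helper is hypothetical, the delays are shortened, and the thread numbers printed will differ between machines and runs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadIdDemo
{
    // Hypothetical helper: reports which managed thread is running the caller.
    private static void Log(string method, string stage) =>
        Console.WriteLine($"* {method} on thread {Thread.CurrentThread.ManagedThreadId} {stage}");

    public static async Task WriteWithSleep()
    {
        Log(nameof(WriteWithSleep), "before await");
        await Task.Delay(100);
        Log(nameof(WriteWithSleep), "after await");
        Thread.Sleep(900); // blocks only the thread running this continuation
    }

    public static async Task WriteWithoutSleep()
    {
        Log(nameof(WriteWithoutSleep), "before first await");
        await Task.Delay(300);
        Log(nameof(WriteWithoutSleep), "after first await");
        await Task.Delay(600);
        Log(nameof(WriteWithoutSleep), "after second await");
    }

    public static async Task Main()
    {
        var go1 = WriteWithSleep();    // runs synchronously up to its first await
        var go2 = WriteWithoutSleep(); // likewise, still on the calling thread
        await go1;
        await go2;
    }
}
```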
The business of running tasks on threads is handled by a TaskScheduler. You could write one that forces code to be single threaded, but that's not often a useful thing to do. The default scheduler uses a threadpool, and as such tasks can be run on a number of different threads.

The Task.Delay method is implemented basically like this (simplified¹):
public static Task Delay(int millisecondsDelay)
{
var tcs = new TaskCompletionSource();
_ = new Timer(_ => tcs.SetResult(), null, millisecondsDelay, -1);
return tcs.Task;
}
The Task is completed on the callback of a System.Threading.Timer component, and according to the documentation this callback is invoked on a ThreadPool thread:
The method does not execute on the thread that created the timer; it executes on a ThreadPool thread supplied by the system.
So when you await the task returned by the Task.Delay method, the continuation after the await runs on the ThreadPool. The ThreadPool typically has more than one thread available immediately on demand, so it's not difficult to introduce concurrency and parallelism if you create 2 tasks at once, like you do in your example. The main thread of a console application is not equipped with a SynchronizationContext by default, so there is no mechanism in place to prevent the observed concurrency.
¹ For demonstration purposes only. The Timer reference is not stored anywhere, so it might be garbage collected before the callback is invoked, resulting in the Task never completing.
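One way to address the garbage collection concern from the footnote is to have the callback capture the timer itself, so the timer stays reachable for as long as it is scheduled. This is a sketch under that assumption, not the real Task.Delay implementation; the class name DelaySketch is made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class DelaySketch
{
    public static Task Delay(int millisecondsDelay)
    {
        var tcs = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        Timer timer = null!;
        // The timer is created stopped, so the callback cannot run before
        // the 'timer' variable has been assigned.
        timer = new Timer(_ =>
        {
            timer.Dispose();        // capturing 'timer' keeps it reachable until it fires
            tcs.TrySetResult(true); // completes the task on a ThreadPool thread
        }, null, Timeout.Infinite, Timeout.Infinite);

        timer.Change(millisecondsDelay, Timeout.Infinite); // fire once
        return tcs.Task;
    }

    public static async Task Main()
    {
        await Delay(200);
        Console.WriteLine(
            $"Resumed on a thread pool thread: {Thread.CurrentThread.IsThreadPoolThread}");
    }
}
```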

I am not accepting my own answer, I will accept someone else's answer because they helped me figure this out. First, in the context of my question, I was using async Main. It was very hard to choose between Theodor's & Rook's answer. However, Rook's answer provided me with one thing that helped me fish: Thread.CurrentThread.ManagedThreadId
These are the results of my running code:
1 00:00:00.0000767
Not Delayed.
1 00:00:00.2988809
Delayed 1 second.
4 00:00:01.3392148
Delayed 3 second.
5 00:00:03.3716776
Delayed 9 seconds.
5 00:00:09.3838139
Delayed 10 seconds
4 00:00:10.3411050
4 00:00:10.5313519
I notice that there are 3 threads here. The initial thread (1) runs the first calling method and the part of WriteWithSleep() up to the point where Task.Delay is started and later awaited. Once that Task.Delay completes, execution does not come back to Thread 1; instead, the rest of Main and the remainder of WriteWithSleep run on Thread 4.
WriteWithoutSleep uses its own thread (5).
So my error was believing that there were only 3 threads. I believed the answer to this question: https://stackoverflow.com/questions/3476642/why-does-this-simple-net-console-app-have-so-many-threads#:~:text=You%20should%20only%20see%20three,see%20are%20debugger%2Drelated%20threads.
However, that question may not have been async, or may not have considered these additional worker threads from the threadpool.
Thank you all for your assistance in figuring out this question.

Await operator in C# spanning a background Thread

Let's assume I have the following simple program, which uses the await operator in both DownloadDocsMainPageAsync() and Main(). While I understand that the current awaitable method gets suspended and continues from that point after the results are available, I need some clarity on the following points.
a) If the execution of Main() starts on Thread A from the thread pool, will this thread be returned to the thread pool as soon as it encounters the await operator, so that it can execute other operations in the program? For example, if it's a web app, could it then serve the invocation of some Controller methods after button clicks from the UI?
b) Will the await operator always resume execution on a new thread from the thread pool, or in this case, assuming there is no other method to be executed apart from Main(), will it continue execution on the same thread (Thread A)? If my understanding is correct, who decides this? Is it the garbage collector of the CLR?
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class AwaitOperator
{
    public static async Task Main()
    {
        Task<int> downloading = DownloadDocsMainPageAsync();
        Console.WriteLine($"{nameof(Main)}: Launched downloading.");
        int bytesLoaded = await downloading;
        Console.WriteLine($"{nameof(Main)}: Downloaded {bytesLoaded} bytes.");
    }

    private static async Task<int> DownloadDocsMainPageAsync()
    {
        Console.WriteLine($"{nameof(DownloadDocsMainPageAsync)}: About to start downloading.");
        var client = new HttpClient();
        byte[] content = await client.GetByteArrayAsync("https://learn.microsoft.com/en-us/");
        Console.WriteLine($"{nameof(DownloadDocsMainPageAsync)}: Finished downloading.");
        return content.Length;
    }
}

Actually, async/await is (almost) not about threads; it is about control flow. In your code, execution proceeds on the main thread (which, by the way, is not a thread pool thread) until it reaches await client.GetByteArrayAsync. At that point the real low-level downloading is offloaded to the OS, and the program just waits for the result. No additional threads are spawned. When the download finishes, the .NET runtime needs to continue execution after the await. It sees no SynchronizationContext (a console application does not have one), so it runs the code after the await on any available thread pool thread. So the rest of the code after the download is executed on a thread from the pool.
If you add a SynchronizationContext (or just move the code into a WinForms app, where the context exists out of the box), you will see that all the code is executed on the main thread and no threads are spawned or taken from the thread pool, because the runtime sees the SynchronizationContext and schedules the after-await code on the original thread.
So, the answers:
a) Main starts on the main thread, not on a thread pool thread. await itself does not actually spawn any threads. At the await, if the current thread came from the thread pool, it is returned to the thread pool and becomes available for future work. There is an exception: the case where the await continues immediately and synchronously (see below).
b) The runtime decides on which thread execution continues after 'await', depending on the current SynchronizationContext, the ConfigureAwait settings, and whether the operation result is already available at the moment the await is reached.
In particular:
if a SynchronizationContext is present and ConfigureAwait is set to true (or omitted), then the code always continues on the thread associated with that context (for a UI context, the original UI thread).
if no SynchronizationContext is present, or ConfigureAwait is set to false, the code will continue on any available thread (main thread or thread pool).
if you write something like
var task = DoSomeWorkAsync();
//some synchronous work which takes a while
await task;
then you can have a situation where the task is already finished at the moment the code reaches the await. In this case the runtime can continue execution after the await synchronously on the same thread. But this case is implementation-specific, as far as I know.
additionally, there is a special class TaskCompletionSource<TResult> (docs here) which provides explicit control over the task state and, in particular, may switch execution to any thread selected by the code that owns the TaskCompletionSource instance (see the sample in @TheodorZoulias's comment or here).
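The cases above can be observed with a small console sketch that records the managed thread ID before and after an await. The names here (AwaitThreadDemo, AwaitAndReport) are hypothetical, and the result for the incomplete task assumes there is no SynchronizationContext, as in a plain console app.

```csharp
using System;
using System.Threading.Tasks;

public static class AwaitThreadDemo
{
    // Returns the managed thread IDs observed before and after awaiting 'task'.
    public static async Task<(int Before, int After)> AwaitAndReport(Task task)
    {
        int before = Environment.CurrentManagedThreadId;
        await task;
        int after = Environment.CurrentManagedThreadId;
        return (before, after);
    }

    public static async Task Main()
    {
        // Already-completed task: the method keeps running synchronously,
        // so both IDs are the same.
        var completed = await AwaitAndReport(Task.CompletedTask);
        Console.WriteLine($"completed task: {completed.Before} -> {completed.After}");

        // Incomplete task: the continuation is scheduled when the delay ends,
        // typically on a thread pool thread.
        var pending = await AwaitAndReport(Task.Delay(100));
        Console.WriteLine($"pending task:   {pending.Before} -> {pending.After}");
    }
}
```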

async Task - What actually happens on the CPU?

I've been reading about Tasks after asking this question and seeing that I completely misunderstood the concept. Answers such as the top answers here and here explain the idea, but I still don't get it.
So I've made this a very specific question: What actually happens on the CPU when a Task is executed?
This is what I've understood after some reading: A Task will share CPU time with the caller (and let's assume the caller is the "UI"), so that if it's CPU-intensive, it will slow down the UI. If the Task is not CPU-intensive, it will be running "in the background". Seems clear enough... until tested. The following code should allow the user to click on the button, and then alternately show "Shown" and "Button". But in reality, the Form is completely busy (no user input possible) until the "Shown"s are all shown.
public Form1()
{
    InitializeComponent();
    Shown += Form1_Shown;
}

private async void Form1_Shown(object sender, EventArgs e)
{
    await Doit("Shown");
}

private async Task Doit(string s)
{
    WebClient client = new WebClient();
    for (int i = 0; i < 10; i++)
    {
        client.DownloadData(uri); // This is here in order to delay the Text writing without much CPU use.
        textBox1.Text += s + "\r\n";
        this.Update(); // textBox1.
    }
}

private async void button1_Click(object sender, EventArgs e)
{
    await Doit("Button");
}
Can someone please tell me what is actually happening on the CPU when a Task is executed (e.g. "When the CPU is not used by the UI, the Task uses it, except for when… etc.")?

The key to understanding this is that there are two kinds of tasks - one that executes code (what I call Delegate Tasks), and one that represents a future event (what I call Promise Tasks). Those two tasks are completely different, even though they're both represented by an instance of Task in .NET. I have some pretty pictures on my blog that may help understand how these types of task are different.
Delegate Tasks are the ones created by Task.Run and friends. They execute code on the thread pool (or possibly another TaskScheduler if you're using a TaskFactory). Most of the "task parallel library" documentation deals with Delegate Tasks. These are used to spread CPU-bound algorithms across multiple CPUs, or to push CPU-bound work off a UI thread.
Promise Tasks are the ones created by TaskCompletionSource<T> and friends (including async). These are the ones used for asynchronous programming, and are a natural fit for I/O-bound code.
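A short sketch of the difference, with hypothetical names: the first task is created by Task.Run and has code that a thread pool thread executes; the second comes from a TaskCompletionSource<int> and has no code at all, it simply completes when someone calls SetResult.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskKindsDemo
{
    // Delegate Task: the lambda is code that runs on a thread pool thread.
    public static Task<int> SumOnThreadPool(int n) =>
        Task.Run(() =>
        {
            int sum = 0;
            for (int i = 1; i <= n; i++) sum += i;
            return sum;
        });

    public static async Task Main()
    {
        Console.WriteLine(await SumOnThreadPool(100)); // 5050

        // Promise Task: nothing "executes"; the task only represents a future event.
        var tcs = new TaskCompletionSource<int>();
        Task<int> promise = tcs.Task;
        Console.WriteLine(promise.IsCompleted); // False

        tcs.SetResult(42); // the event happens; the task completes
        Console.WriteLine(await promise); // 42
    }
}
```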
Note that your example code will cause a compiler warning to the effect that your "asynchronous" method Doit is not actually asynchronous but is instead synchronous. So as it stands right now, it will synchronously call DownloadData, blocking the UI thread until the download completes, and then it will update the text box and finally return an already-completed task.
To make it asynchronous, you have to use await:
private async Task Doit(string s)
{
    WebClient client = new WebClient();
    for (int i = 0; i < 10; i++)
    {
        await client.DownloadDataTaskAsync(uri);
        textBox1.Text += s + "\r\n";
        this.Update(); // textBox1.
    }
}
Now it's returning an incomplete task when it hits the await, which allows the UI thread to return to its message processing loop. When the download completes, the remainder of this method will be queued to the UI thread as a message, and it will resume executing that method when it gets around to it. When the Doit method completes, then the task it returned earlier will complete.
So, tasks returned by async methods logically represent that method. The task itself is a Promise Task, not a Delegate Task, and does not actually "execute". The method is split into multiple parts (at each await point) and executes in chunks, but the task itself does not execute anywhere.
For further reading, I have a blog post on how async and await actually work (and how they schedule the chunks of the method), and another blog post on why asynchronous I/O tasks do not need to block threads.

As per your linked answers, Tasks and Threads are totally different concepts, and you are also getting confused with async / await
A Task is just a representation of some work to be done. It says nothing about HOW that work should be done.
A Thread is a representation of some work that is running on the CPU, but is sharing the CPU time with other threads that it can know nothing about.
You can run a Task on a Thread using Task.Run(). Your Task will run asynchronously and independently of any other code providing a threadpool thread is available.
You can also run a Task asynchronously on the SAME thread using async / await. Anytime the thread hits an await, it can save the current stack state, then travel back up the stack and carry on with other work until the awaited task has finished. Your Doit() code never awaits anything, so will run synchronously on your GUI thread until complete.

Tasks use the ThreadPool; you can read extensively about what it is and how it works here.
But in a nutshell, when a task is executed, the Task Scheduler looks in the ThreadPool to see if there is a thread available to run the action of the task. If not, it's going to be queued until one becomes available.
A ThreadPool is just a collection of already-instantiated threads made available so that multithreaded code can safely use concurrent programming without overwhelming the CPU with context-switching all the time.
Now, the problem with your code is that even though you return an object of type Task, you are not running anything concurrently - No separate thread is ever started!
In order to do that, you have two options. Either you start your Doit method as a Task, with
Option 1
Task.Run(() => Doit(s));
This will run the whole Doit method on another thread from the thread pool, but it will lead to more problems, because in this method you are trying to access UI controls. Therefore, you will need either to marshal those calls to the UI thread, or rethink your code so that the UI access is done directly on the UI thread after the asynchronous task completes.
Option 2 (preferred, if you can)
You use .NET APIs which are already asynchronous, such as client.DownloadDataTaskAsync(); instead of client.DownloadData();
Now, in your case, the problem is that you need to make 10 calls, which return 10 different objects of type Task<byte[]>, and you want to await the completion of all of them, not just one.
To do this, create a List<Task<byte[]>> returnedTasks and add to it every value returned from DownloadDataTaskAsync(). Once this is done, you can use the following as the return value of your Doit method.
return Task.WhenAll(returnedTasks);
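As a rough illustration of the Task.WhenAll approach, here is a sketch that avoids the network: FakeDownloadAsync is a hypothetical stand-in for DownloadDataTaskAsync that waits briefly and returns a byte array whose length equals its id.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for client.DownloadDataTaskAsync(uri).
    private static async Task<byte[]> FakeDownloadAsync(int id)
    {
        await Task.Delay(50);
        return new byte[id];
    }

    public static async Task<int> DownloadAllAsync(int count)
    {
        var returnedTasks = new List<Task<byte[]>>();
        for (int id = 1; id <= count; id++)
            returnedTasks.Add(FakeDownloadAsync(id)); // all downloads start without waiting

        byte[][] results = await Task.WhenAll(returnedTasks); // wait for every one of them
        return results.Sum(r => r.Length);
    }

    public static async Task Main()
    {
        Console.WriteLine(await DownloadAllAsync(10)); // 55 (1 + 2 + ... + 10)
    }
}
```

Note that this starts all the operations at once, which is a different behavior from awaiting each download inside the loop one after another.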

How can I have two separate task schedulers?

I am writing a game, and using OpenGL I require that some work be offloaded to the rendering thread where an OpenGL context is active, but everything else is handled by the normal thread pool.
Is there a way I can force a Task to be executed in a special thread-pool, and any new tasks created from an async also be dispatched to that thread pool?
I want a few specialized threads for rendering, and I would like to be able to use async and await for example for creating and filling a vertex buffer.
If I just use a custom task scheduler and a new Factory(new MyScheduler()) it seems that any subsequent Task objects will be dispatched to the thread pool anyway where Task.Factory.Scheduler suddenly is null.
The following code should show what I want to be able to do:
public async Task Initialize()
{
    // The two following tasks should run on the rendering thread pool
    // They cannot run synchronously because that will cause them to fail.
    this.VertexBuffer = await CreateVertexBuffer();
    this.IndexBuffer = await CreateIndexBuffer();
    // This should be dispatched, or run synchronously, on the normal thread pool
    Vertex[] vertices = CreateVertices();
    // Issue task for filling vertex buffer on rendering thread pool
    var fillVertexBufferTask = FillVertexBufffer(vertices, this.VertexBuffer);
    // This should be dispatched, or run synchronously, on the normal thread pool
    short[] indices = CreateIndices();
    // Wait for tasks on the rendering thread pool to complete.
    await FillIndexBuffer(indices, this.IndexBuffer);
    await fillVertexBufferTask; // Wait for the rendering task to complete.
}
Is there any way to achieve this, or is it outside the scope of async/await?

This is possible, and it is basically the same thing Microsoft did for the Windows Forms and WPF synchronization contexts.
First Part - You are in the OpenGL thread, and want to put some work into the thread pool, and after this work is done you want back into the OpenGL thread.
I think the best way for you to go about this is to implement your own SynchronizationContext. This thing basically controls how the TaskScheduler works and how it schedules the task. The default implementation simply sends the tasks to the thread pool. What you need to do is to send the task to a dedicated thread (that holds the OpenGL context) and execute them one by one there.
The key to the implementation is to override the Post and Send methods. Both methods are expected to execute the callback; Send has to wait for the call to finish and Post does not. In the default implementation, which uses the thread pool, Send simply calls the callback directly and Post delegates the callback to the thread pool.
For the execution queue of your OpenGL thread, I think a Thread that reads from a BlockingCollection should do nicely. Just send the callbacks to this queue. You may also need some callback in case your Post method is called from the wrong thread and you need to wait for the task to finish.
But all in all, this way should work. async/await ensures that the SynchronizationContext is restored after an async call that is executed on the thread pool, for example. So you should be able to return to the OpenGL thread after you have pushed some work off to another thread.
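The first part could be sketched roughly as follows. This is a minimal illustration rather than a production-ready context: the type names are made up, the dedicated thread merely stands in for the thread holding the OpenGL context, and exceptions thrown by callbacks are not marshalled back to the caller.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class SingleThreadSynchronizationContext : SynchronizationContext, IDisposable
{
    private readonly BlockingCollection<(SendOrPostCallback Callback, object? State)> _queue =
        new BlockingCollection<(SendOrPostCallback, object?)>();
    private readonly Thread _thread;

    public int ThreadId => _thread.ManagedThreadId;

    public SingleThreadSynchronizationContext()
    {
        _thread = new Thread(RunLoop) { IsBackground = true };
        _thread.Start();
    }

    private void RunLoop()
    {
        // Make this context current so that awaits on this thread capture it.
        SetSynchronizationContext(this);
        foreach (var (callback, state) in _queue.GetConsumingEnumerable())
            callback(state); // callbacks run one by one on the dedicated thread
    }

    // Post: queue the callback and return immediately.
    public override void Post(SendOrPostCallback d, object? state) => _queue.Add((d, state));

    // Send: queue the callback and block until it has run.
    public override void Send(SendOrPostCallback d, object? state)
    {
        if (Thread.CurrentThread == _thread) { d(state); return; } // avoid self-deadlock
        using (var done = new ManualResetEventSlim())
        {
            Post(_ => { try { d(state); } finally { done.Set(); } }, null);
            done.Wait();
        }
    }

    public void Dispose() => _queue.CompleteAdding(); // lets the loop thread exit
}

public static class RenderContextDemo
{
    public static Task<(int Before, int After)> RunOnContextAsync(
        SingleThreadSynchronizationContext context)
    {
        var tcs = new TaskCompletionSource<(int, int)>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        context.Post(async _ =>
        {
            int before = Environment.CurrentManagedThreadId;
            await Task.Delay(50); // the context is captured here
            int after = Environment.CurrentManagedThreadId; // posted back to the same thread
            tcs.SetResult((before, after));
        }, null);
        return tcs.Task;
    }

    public static async Task Main()
    {
        using var context = new SingleThreadSynchronizationContext();
        var (before, after) = await RunOnContextAsync(context);
        Console.WriteLine(before == context.ThreadId && after == context.ThreadId); // True
    }
}
```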
Second Part - You are in another thread and want to send some work into the OpenGL thread and await the completion of that work.
This is possible too. My idea in this case is that you don't use Tasks but other awaitable objects. In general, any object can be awaitable. It just has to expose a public method GetAwaiter() that returns an object implementing the INotifyCompletion interface (along with an IsCompleted property and a GetResult() method). What await does is put the remainder of the method into a new Action and pass this action to the OnCompleted method of that interface. The awaiter is expected to call the scheduled actions once the operation it is awaiting is done. This awaiter also has to ensure that the SynchronizationContext is captured and that the continuations are executed on the captured SynchronizationContext. That sounds complicated, but once you get the hang of it, it is fairly easy. What helped me a lot is the reference source of the YieldAwaiter (this is basically what happens if you use await Task.Yield()). This is not what you need, but I think it is a place to start.
The method that returns the awaiter has to take care of sending the actual work to the thread that has to execute it (you maybe already have the execution queue from the first part) and the awaiter has to trigger once that work is done.
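To show the shape of a custom awaitable, here is a minimal sketch that hops to the thread pool instead of to a rendering thread, so that it stays self-contained. ThreadPoolSwitch is a hypothetical name; an awaiter for the OpenGL case would queue the continuation to the rendering thread's work queue rather than call ThreadPool.QueueUserWorkItem.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Awaitable: all that 'await' needs is a GetAwaiter() method.
public readonly struct ThreadPoolSwitch
{
    public Awaiter GetAwaiter() => new Awaiter();

    public readonly struct Awaiter : INotifyCompletion
    {
        // Always report "not completed" so that OnCompleted is called.
        public bool IsCompleted => false;

        // 'continuation' is the rest of the async method after the await.
        public void OnCompleted(Action continuation) =>
            ThreadPool.QueueUserWorkItem(_ => continuation());

        public void GetResult() { }
    }
}

public static class AwaitableDemo
{
    public static async Task<bool> HopAsync()
    {
        await new ThreadPoolSwitch();
        // From here on, the method runs on a thread pool thread.
        return Thread.CurrentThread.IsThreadPoolThread;
    }

    public static async Task Main()
    {
        Console.WriteLine(await HopAsync()); // True
    }
}
```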
Conclusion
Make no mistake: that is a lot of work. But if you do all that, you will have fewer problems down the line, because you can seamlessly use the async/await pattern as if you were working inside Windows Forms or WPF, and that is a huge plus.

First, realize that await introduces the special behavior after the method is called; that is to say, this code:
this.VertexBuffer = await CreateVertexBuffer();
is pretty much the same as this code:
var createVertexBufferTask = CreateVertexBuffer();
this.VertexBuffer = await createVertexBufferTask;
So, you'll have to explicitly schedule code to execute a method within a different context.
You mention using a MyScheduler but I don't see your code using it. Something like this should work:
this.factory = new TaskFactory(CancellationToken.None, TaskCreationOptions.DenyChildAttach, TaskContinuationOptions.None, new MyScheduler());

public async Task Initialize()
{
    // Since you mention OpenGL, I'm assuming this method is called on the UI thread.
    // Run these methods on the rendering thread pool.
    this.VertexBuffer = await this.factory.StartNew(() => CreateVertexBuffer()).Unwrap();
    this.IndexBuffer = await this.factory.StartNew(() => CreateIndexBuffer()).Unwrap();
    // Run these methods on the normal thread pool.
    Vertex[] vertices = await Task.Run(() => CreateVertices());
    var fillVertexBufferTask = Task.Run(() => FillVertexBufffer(vertices, this.VertexBuffer));
    short[] indices = await Task.Run(() => CreateIndices());
    await Task.Run(() => FillIndexBuffer(indices, this.IndexBuffer));
    // Wait for the rendering task to complete.
    await fillVertexBufferTask;
}
I would look into combining those multiple Task.Run calls, or (if Initialize is called on a normal thread pool thread) removing them completely.
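To see which scheduler a piece of work actually runs on, a small check like the following may help. It is only a sketch: the exclusive scheduler of ConcurrentExclusiveSchedulerPair is used here as a stand-in for MyScheduler, and the class and method names are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SchedulerDemo
{
    // Assumption: this built-in scheduler stands in for the custom MyScheduler.
    public static readonly TaskScheduler RenderScheduler =
        new ConcurrentExclusiveSchedulerPair().ExclusiveScheduler;

    private static readonly TaskFactory Factory = new TaskFactory(
        CancellationToken.None, TaskCreationOptions.DenyChildAttach,
        TaskContinuationOptions.None, RenderScheduler);

    // Work started through the factory sees the custom scheduler as TaskScheduler.Current.
    public static Task<bool> ViaFactory() =>
        Factory.StartNew(() => TaskScheduler.Current == RenderScheduler);

    // Work started through Task.Run always targets the default (thread pool) scheduler.
    public static Task<bool> ViaTaskRun() =>
        Task.Run(() => TaskScheduler.Current == RenderScheduler);

    public static async Task Main()
    {
        Console.WriteLine(await ViaFactory()); // True
        Console.WriteLine(await ViaTaskRun()); // False
    }
}
```

When the delegate passed to StartNew is itself an async method, StartNew returns a Task<Task<T>>, which is why the answer's code calls Unwrap() before awaiting.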

How does the runtime know when to spawn a thread when using "await"?

EDIT
I took Jon's comment and retried the whole thing. And indeed, it is blocking the UI thread. I must have messed up my initial test somehow. The string "OnResume exits" is written after SomeAsync has finished. If the method is changed to use await Task.WhenAll(t) it will (as expected) not block. Thanks for the input!
I was first thinking about deleting the question because the initial assumption was just wrong but I think the answers contains valuable information that should not be lost.
The original post:
Trying to understand the deeper internals of async-await. The example below is from an Android app using Xamarin. OnResume() executes on the UI thread.
SomeAsync() starts a new task (= it spawns a thread). Then it is using Task.WaitAll() to perform a blocking wait (let's not discuss now if WhenAll() would be a better option).
I can see that the UI is not getting blocked while Task.WaitAll() is running. So SomeAsync() does not run on the UI thread. This means that a new thread was created.
How does the await "know" that it has to spawn a thread here? Will it always do it? If I change the WaitAll() to WhenAll(), there would be no need for an additional thread, as far as I understand.
// This runs on the UI thread.
protected override async void OnResume()
{
    // What happens here? Not necessarily a new thread I suppose. But what else?
    Console.WriteLine ("OnResume is about to call an async method.");
    await SomeAsync();
    // Here we are back on the current sync context, which is the UI thread.
    SomethingElse();
    Console.WriteLine ("OnResume exits");
}

Task<int> SomeAsync()
{
    var t = Task.Factory.StartNew (() => {
        Console.WriteLine("Working really hard!");
        Thread.Sleep(10000);
        Console.WriteLine("Done working.");
    });
    Task.WhenAll (t);
    return Task.FromResult (42);
}

Simple: it never spawns a thread for await. If the awaitable has already completed, it just keeps running; if the awaitable has not completed, it simply tells the awaitable instance to add a continuation (via a fairly complex state machine). When the thing that is being completed completes, that will invoke the continuations (typically via the sync-context, if one - else synchronously on the thread that is marking the work as complete). However! The sync-context could theoretically be one that chooses to push things onto the thread-pool (most UI sync-contexts, however, push things to the UI thread).

I think you will find this thread interesting: How does C# 5.0's async-await feature differ from the TPL?
In short, await does not start any threads.
What it does is just "split" the code at the line where 'await' is placed; everything after that line is added as a continuation to the Task.
Note the Task. And note that you've got Factory.StartNew. So, in your code, it is the Factory who actually starts the task - and it includes placing it on some thread, be it UI or pool or any other task scheduler. This means, that the "Task" is usually already assigned to some scheduler when you perform the await.
Of course, it does not have to be assigned, nor started at all. The only important thing is that you need to have a Task, any, really.
If the Task is not started - the await does not care. It simply attaches continuation, and it's up to you to start the task later. And to assign it to proper scheduler.
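A small sketch of that last point, using hypothetical names: a task created with the Task constructor is not started, awaiting it only attaches the continuation, and nothing completes until Start is called.

```csharp
using System;
using System.Threading.Tasks;

public static class ColdTaskDemo
{
    // Awaiting attaches the rest of this method as a continuation of 'task'.
    private static async Task<int> AwaitIt(Task<int> task) => await task;

    public static async Task<(bool CompletedBeforeStart, int Result)> RunAsync()
    {
        var cold = new Task<int>(() => 42);  // created, but not started
        Task<int> waiter = AwaitIt(cold);    // await does not start it either

        bool completedBeforeStart = waiter.IsCompleted; // false: nothing has run yet

        cold.Start();                        // now it is handed to a scheduler
        int result = await waiter;
        return (completedBeforeStart, result);
    }

    public static async Task Main()
    {
        var (completedBeforeStart, result) = await RunAsync();
        Console.WriteLine(completedBeforeStart); // False
        Console.WriteLine(result);               // 42
    }
}
```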
