Using Task.Yield to overcome ThreadPool starvation while implementing producer/consumer pattern - c#

Answering the question: Task.Yield - real usages?
I proposed to use Task.Yield allowing a pool thread to be reused by other tasks. In such pattern:
CancellationTokenSource cts;
void Start()
{
cts = new CancellationTokenSource();
// run async operation
var task = Task.Run(() => SomeWork(cts.Token), cts.Token);
// wait for completion
// after the completion handle the result/ cancellation/ errors
}
async Task<int> SomeWork(CancellationToken cancellationToken)
{
int result = 0;
bool loopAgain = true;
while (loopAgain)
{
// do something ... means a substantial work or a micro batch here - not processing a single byte
loopAgain = /* check for loop end && */ cancellationToken.IsCancellationRequested;
if (loopAgain) {
// reschedule the task to the threadpool and free this thread for other waiting tasks
await Task.Yield();
}
}
cancellationToken.ThrowIfCancellationRequested();
return result;
}
void Cancel()
{
// request cancelation
cts.Cancel();
}
But one user wrote
I don't think using Task.Yield to overcome ThreadPool starvation while
implementing producer/consumer pattern is a good idea. I suggest you
ask a separate question if you want to go into details as to why.
Anybody knows, why is not a good idea?

There are some good points left in the comments to your question. Being the user you quoted, I'd just like to sum it up: use the right tool for the job.
Using ThreadPool doesn't feel like the right tool for executing multiple continuous CPU-bound tasks, even if you try to organize some cooperative execution by turning them into state machines which yield CPU time to each other with await Task.Yield(). Thread switching is rather expensive; by doing await Task.Yield() on a tight loop you add a significant overhead. Besides, you should never take over the whole ThreadPool, as the .NET framework (and the underlying OS process) may need it for other things. On a related note, TPL even has the TaskCreationOptions.LongRunning option that requests to not run the task on a ThreadPool thread (rather, it creates a normal thread with new Thread() behind the scene).
That said, using a custom TaskScheduler with limited parallelism on some dedicated, out-of-pool threads with thread affinity for individual long-running tasks might be a different thing. At least, await continuations would be posted on the same thread, which should help reducing the switching overhead. This reminds me of a different problem I was trying to solve a while ago with ThreadAffinityTaskScheduler.
Still, depending on a particular scenario, it's usually better to use an existing well-established and tested tool. To name a few: Parallel Class, TPL Dataflow, System.Threading.Channels, Reactive Extensions.
There is also a whole range of existing industrial-strength solutions to deal with Publish-Subscribe pattern (RabbitMQ, PubNub, Redis, Azure Service Bus, Firebase Cloud Messaging (FCM), Amazon Simple Queue Service (SQS) etc).

After a bit of debating on the issue with other users - who are worried about the context switching and its influence on the performance.
I see what they are worried about.
But I meant: do something ... inside the loop to be a substantial task - usually in the form of a message handler which reads a message from the queue and processes it. The message handlers are usually user defined and the message bus executes them using some sort of dispatcher. The user can implement a handler which executes synchronously (nobody knows what the user will do), and without Task.Yield that will block the thread to process those synchronous tasks in a loop.
Not to be empty worded i added tests to github: https://github.com/BBGONE/TestThreadAffinity
They compare the ThreadAffinityTaskScheduler, .NET ThreadScheduler with BlockingCollection and .NET ThreadScheduler with Threading.Channels.
The tests show that for Ultra Short jobs the performance degradation is
around 15%. To use the Task.Yield without the performance degradation (even small) - it is not to use extremely short tasks and if the task is too short then combine shorter tasks into a bigger batch.
[The price of context switch] = [context switch duration] / ([job duration]+[context switch duration]).
In that case the influence of the switching the tasks is negligible on the performance. But it adds a better task cooperation and responsiveness of the system.
For long running tasks it is better to use a custom Scheduler which executes tasks on its own dedicated thread pool - (like the WorkStealingTaskScheduler).
For the mixed jobs - which can contain different parts - short running CPU bound, asynchronous and long running code parts. It is better to split the task into subtasks.
private async Task HandleLongRunMessage(TestMessage message, CancellationToken token = default(CancellationToken))
{
// SHORT SYNCHRONOUS TASK - execute as is on the default thread (from thread pool)
CPU_TASK(message, 50);
// IO BOUND ASYNCH TASK - used as is
await Task.Delay(50);
// BUT WRAP the LONG SYNCHRONOUS TASK inside the Task
// which is scheduled on the custom thread pool
// (to save threadpool threads)
await Task.Factory.StartNew(() => {
CPU_TASK(message, 100000);
}, token, TaskCreationOptions.DenyChildAttach, _workStealingTaskScheduler);
}

Related

Task based vs. thread based Watchdog - but async needed

We're using watchdogs to determine whether a connected system is still alive or not.
In the previous code we used TCP directly and treated the watchdog in a separate thread. Now is a new service used that provides it's data using gRPC.
For that we tried using the async interface with tasks but a task based watchdog will fail.
I wrote a small DEMO that abstracts the code and illustrates the problem. You can switch between task based watchdog and thread based watchdog by commenting out line 18 with //.
The demo contains this code that causes the problem:
async Task gRPCSendAsync(CancellationToken cancellationToken = default) => await Task.Yield();
async Task gRPCReceiveAsync(CancellationToken cancellationToken = default) => await Task.Yield();
var start = DateTime.UtcNow;
await gRPCSendAsync(cancellationToken).ConfigureAwait(false);
await gRPCReceiveAsync(cancellationToken).ConfigureAwait(false);
var end = DateTime.UtcNow;
if ((end - start).TotalMilliseconds >= 100)
// signal failing
If this code is used in Task.Run it will signal failing if the application has a lot cpu-work to do in other tasks.
If a dedicated thread is used the watchdog works as expected and no problem is raise.
I do understand the problem: All code after await may be (if not finished already or does not contain a "real" await) queued to the thread pool. But the thread pool has other things to do so that it took too long to finish the method.
Yes the simple answer is: USE THREAD.
But using a thread limits us to only use synchronous methods. There is no way to call an async method out of a thread. I created another sample that shows that all code after first await will be queued to thread bool so that CallAsync().Wait() will not work. (Btw. that issue is much more handled here.)
We're having a lot of async code that may be used within such time critical operations.
So the question is: Is there any way to perform that that operations using tasks with async/await?
Maybe I'm completely wrong and creating an task based watchdog should be done very differently.
thoughts
I was thinking about System.Threading.Timer but the problem of async sending and async receiving will cause that problem anyways.
Here is how you could use Stephen Cleary's AsyncContext class from the Nito.AsyncEx.Context package, in order to constrain an asynchronous workflow to a dedicated thread:
await Task.Factory.StartNew(() =>
{
AsyncContext.Run(async () =>
{
await DoTheWatchdogAsync(watchdogCts.Token);
});
}, TaskCreationOptions.LongRunning);
The call to AsyncContext.Run will block until the supplied asynchronous operation is completed. All asynchronous continuations created by the DoTheWatchdogAsync will be processed internally by the AsyncContext on the current thread. In the above example the current thread is not a ThreadPool thread, because of the flag TaskCreationOptions.LongRunning used in the construction of the wrapper Task. You could confirm this by querying the property Thread.CurrentThread.IsThreadPoolThread.
If you prefer you could use a traditional Thread constructor instead of the somewhat unconventional Task.Factory.StartNew+LongRunning.

Waiting on a continuous UI background polling task

I am somewhat new to parallel programming C# (When I started my project I worked through the MSDN examples for TPL) and would appreciate some input on the following example code.
It is one of several background worker tasks. This specific task pushes status messages to a log.
var uiCts = new CancellationTokenSource();
var globalMsgQueue = new ConcurrentQueue<string>();
var backgroundUiTask = new Task(
() =>
{
while (!uiCts.IsCancellationRequested)
{
while (globalMsgQueue.Count > 0)
ConsumeMsgQueue();
Thread.Sleep(backgroundUiTimeOut);
}
},
uiCts.Token);
// Somewhere else entirely
backgroundUiTask.Start();
Task.WaitAll(backgroundUiTask);
I'm asking for professional input after reading several topics like Alternatives to using Thread.Sleep for waiting, Is it always bad to use Thread.Sleep()?, When to use Task.Delay, when to use Thread.Sleep?, Continuous polling using Tasks
Which prompts me to use Task.Delay instead of Thread.Sleep as a first step and introduce TaskCreationOptions.LongRunning.
But I wonder what other caveats I might be missing? Is polling the MsgQueue.Count a code smell? Would a better version rely on an event instead?
First of all, there's no reason to use Task.Start or use the Task constructor. Tasks aren't threads, they don't run themselves. They are a promise that something will complete in the future and may or may not produce any results. Some of them will run on a threadpool thread. Use Task.Run to create and run the task in a single step when you need to.
I assume the actual problem is how to create a buffered background worker. .NET already offers classes that can do this.
ActionBlock< T >
The ActionBlock class already implements this and a lot more - it allows you to specify how big the input buffer is, how many tasks will process incoming messages concurrently, supports cancellation and asynchronous completion.
A logging block could be as simple as this :
_logBlock=new ActionBlock<string>(msg=>File.AppendAllText("myLog.txt",msg));
The ActionBlock class itself takes care of buffering the inputs, feeding new messages to the worker function when it arrives, potentially blocking senders if the buffer gets full etc. There's no need for polling.
Other code can use Post or SendAsync to send messages to the block :
_block.Post("some message");
When we are done, we can tell the block to Complete() and await for it to process any remaining messages :
_block.Complete();
await _block.Completion;
Channels
A newer, lower-level option is to use Channels. You can think of channels as a kind of asynchronous queue, although they can be used to implement complex processing pipelines. If ActionBlock was written today, it would use Channels internally.
With channels, you need to provide the "worker" task yourself. There's no need for polling though, as the ChannelReader class allows you to read messages asynchronously or even use await foreach.
The writer method could look like this :
public ChannelWriter<string> LogIt(string path,CancellationToken token=default)
{
var channel=Channel.CreateUnbounded<string>();
var writer=channel.Writer;
_=Task.Run(async ()=>{
await foreach(var msg in channel.Reader.ReadAllAsync(token))
{
File.AppendAllText(path,msg);
}
},token).ContinueWith(t=>writer.TryComplete(t.Exception);
return writer;
}
....
_logWriter=LogIt(somePath);
Other code can send messages by using WriteAsync or TryWrite, eg :
_logWriter.TryWrite(someMessage);
When we're done, we can call Complete() or TryComplete() on the writer :
_logWriter.TryComplete();
The line
.ContinueWith(t=>writer.TryComplete(t.Exception);
is needed to ensure the channel is closed even if an exception occurs or the cancellation token is signaled.
This may seem too cumbersome at first. Channels allow us to easily run initialization code or carry state from one message to the next. We could open a stream before the loop starts and use it instead of reopening the file each time we call File.AppendAllText, eg :
public ChannelWriter<string> LogIt(string path,CancellationToken token=default)
{
var channel=Channel.CreateUnbounded<string>();
var writer=channel.Writer;
_=Task.Run(async ()=>{
//***** Can't do this with an ActionBlock ****
using(var writer=File.AppendText(somePath))
{
await foreach(var msg in channel.Reader.ReadAllAsync(token))
{
writer.WriteLine(msg);
//Or
//await writer.WriteLineAsync(msg);
}
}
},token).ContinueWith(t=>writer.TryComplete(t.Exception);
return writer;
}
Definitely Task.Delay is better than Thread.Sleep, because you will not be blocking the thread on the pool, and during the wait the thread on the pool will be available to handle other tasks. Then, you don't need to make your task long-running. Long-running tasks are run in a dedicated thread, and then Task.Delay is meaningless.
Instead, I will recommend a different approach. Just use System.Threading.Timer and make your life simple. Timers are kernel objects that will run their callback on the thread pool, and you will not have to worry about delay or sleep.
The TPL Dataflow library is the preferred tool for this kind of job. It allows building efficient producer-consumer pairs quite easily, and more complex pipelines as well, while offering a complete set of configuration options. In your case using a single ActionBlock should be enough.
A simpler solution you might consider is to use a BlockingCollection. It has the advantage of not requiring the installation of any package (because it is built-in), and it's also much easier to learn. You don't have to learn more than the methods Add, CompleteAdding, and GetConsumingEnumerable. It also supports cancellation. The drawback is that it's a blocking collection, so it blocks the consumer thread while waiting for new messages to arrive, and the producer thread while waiting for available space in the internal buffer (only if you specify a boundedCapacity in the constructor).
var uiCts = new CancellationTokenSource();
var globalMsgQueue = new BlockingCollection<string>();
var backgroundUiTask = new Task(() =>
{
foreach (var item in globalMsgQueue.GetConsumingEnumerable(uiCts.Token))
{
ConsumeMsgQueueItem(item);
}
}, uiCts.Token);
The BlockingCollection uses a ConcurrentQueue internally as a buffer.

Synchronous I/O within an async/await-based Windows Service

Let's say I have a Windows Service which is doing some bit of work, then sleeping for a short amount of time, over and over forever (until the service is shut down). So in the service's OnStart, I could start up a thread whose entry point is something like:
private void WorkerThreadFunc()
{
while (!shuttingDown)
{
DoSomething();
Thread.Sleep(10);
}
}
And in the service's OnStop, I somehow set that shuttingDown flag and then join the thread. Actually there might be several such threads, and other threads too, all started in OnStart and shut down/joined in OnStop.
If I want to instead do this sort of thing in an async/await based Windows Service, it seems like I could have OnStart create cancelable tasks but not await (or wait) on them, and have OnStop cancel those tasks and then Task.WhenAll().Wait() on them. If I understand correctly, the equivalent of the "WorkerThreadFunc" shown above might be something like:
private async Task WorkAsync(CancellationToken cancel)
{
while (true)
{
cancel.ThrowIfCancellationRequested();
DoSomething();
await Task.Delay(10, cancel).ConfigureAwait(false);
}
}
Question #1: Uh... right? I am new to async/await and still trying to get my head around it.
Assuming that's right, now let's say that DoSomething() call is (or includes) a synchronous write I/O to some piece of hardware. If I'm understanding correctly:
Question #2: That is bad? I shouldn't be doing synchronous I/O within a Task in an async/await-based program? Because it ties up a thread from the thread pool while the I/O is happening, and threads from the thread pool are a highly limited resource? Please note that I might have dozens of such Workers going simultaneously to different pieces of hardware.
I am not sure I'm understanding that correctly - I am getting the idea that it's bad from articles like Stephen Cleary's "Task.Run Etiquette Examples: Don't Use Task.Run for the Wrong Thing", but that's specifically about it being bad to do blocking work within Task.Run. I'm not sure if it's also bad if I'm just doing it directly, as in the "private async Task Work()" example above?
Assuming that's bad too, then if I understand correctly I should instead utilize the nonblocking version of DoSomething (creating a nonblocking version of it if it doesn't already exist), and then:
private async Task WorkAsync(CancellationToken cancel)
{
while (true)
{
cancel.ThrowIfCancellationRequested();
await DoSomethingAsync(cancel).ConfigureAwait(false);
await Task.Delay(10, cancel).ConfigureAwait(false);
}
}
Question #3: But... what if DoSomething is from a third party library, which I must use and cannot alter, and that library doesn't expose a nonblocking version of DoSomething? It's just a black box set in stone that at some point does a blocking write to a piece of hardware.
Maybe I wrap it and use TaskCompletionSource? Something like:
private async Task WorkAsync(CancellationToken cancel)
{
while (true)
{
cancel.ThrowIfCancellationRequested();
await WrappedDoSomething().ConfigureAwait(false);
await Task.Delay(10, cancel).ConfigureAwait(false);
}
}
private Task WrappedDoSomething()
{
var tcs = new TaskCompletionSource<object>();
DoSomething();
tcs.SetResult(null);
return tcs.Task;
}
But that seems like it's just pushing the issue down a bit further rather than resolving it. WorkAsync() will still block when it calls WrappedDoSomething(), and only get to the "await" for that after WrappedDoSomething() has already completed the blocking work. Right?
Given that (if I understand correctly) in the general case async/await should be allowed to "spread" all the way up and down in a program, would this mean that if I need to use such a library, I essentially should not make the program async/await-based? I should go back to the Thread/WorkerThreadFunc/Thread.Sleep world?
What if an async/await-based program already exists, doing other things, but now additional functionality that uses such a library needs to be added to it? Does that mean that the async/await-based program should be rewritten as a Thread/etc.-based program?
Actually there might be several such threads, and other threads too, all started in OnStart and shut down/joined in OnStop.
On a side note, it's usually simpler to have a single "master" thread that will start/join all the others. Then OnStart/OnStop just deals with the master thread.
If I want to instead do this sort of thing in an async/await based Windows Service, it seems like I could have OnStart create cancelable tasks but not await (or wait) on them, and have OnStop cancel those tasks and then Task.WhenAll().Wait() on them.
That's a perfectly acceptable approach.
If I understand correctly, the equivalent of the "WorkerThreadFunc" shown above might be something like:
Probably want to pass the CancellationToken down; cancellation can be used by synchronous code, too:
private async Task WorkAsync(CancellationToken cancel)
{
while (true)
{
DoSomething(cancel);
await Task.Delay(10, cancel).ConfigureAwait(false);
}
}
Question #1: Uh... right? I am new to async/await and still trying to get my head around it.
It's not wrong, but it only saves you one thread on a Win32 service, which doesn't do much for you.
Question #2: That is bad? I shouldn't be doing synchronous I/O within a Task in an async/await-based program? Because it ties up a thread from the thread pool while the I/O is happening, and threads from the thread pool are a highly limited resource? Please note that I might have dozens of such Workers going simultaneously to different pieces of hardware.
Dozens of threads are not a lot. Generally, asynchronous I/O is better because it doesn't use any threads at all, but in this case you're on the desktop, so threads are not a highly limited resource. async is most beneficial on UI apps (where the UI thread is special and needs to be freed), and ASP.NET apps that need to scale (where the thread pool limits scalability).
Bottom line: calling a blocking method from an asynchronous method is not bad but it's not the best, either. If there is an asynchronous method, call that instead. But if there isn't, then just keep the blocking call and document it in the XML comments for that method (because an asynchronous method blocking is rather surprising behavior).
I am getting the idea that it's bad from articles like Stephen Cleary's "Task.Run Etiquette Examples: Don't Use Task.Run for the Wrong Thing", but that's specifically about it being bad to do blocking work within Task.Run.
Yes, that is specifically about using Task.Run to wrap synchronous methods and pretend they're asynchronous. It's a common mistake; all it does is trade one thread pool thread for another.
Assuming that's bad too, then if I understand correctly I should instead utilize the nonblocking version of DoSomething (creating a nonblocking version of it if it doesn't already exist)
Asynchronous is better (in terms of resource utilization - that is, fewer threads used), so if you want/need to reduce the number of threads, you should use async.
Question #3: But... what if DoSomething is from a third party library, which I must use and cannot alter, and that library doesn't expose a nonblocking version of DoSomething? It's just a black box set in stone that at some point does a blocking write to a piece of hardware.
Then just call it directly.
Maybe I wrap it and use TaskCompletionSource?
No, that doesn't do anything useful. That just calls it synchronously and then returns an already-completed task.
But that seems like it's just pushing the issue down a bit further rather than resolving it. WorkAsync() will still block when it calls WrappedDoSomething(), and only get to the "await" for that after WrappedDoSomething() has already completed the blocking work. Right?
Yup.
Given that (if I understand correctly) in the general case async/await should be allowed to "spread" all the way up and down in a program, would this mean that if I need to use such a library, I essentially should not make the program async/await-based? I should go back to the Thread/WorkerThreadFunc/Thread.Sleep world?
Assuming you already have a blocking Win32 service, it's probably fine to just keep it as it is. If you are writing a new one, personally I would make it async to reduce threads and allow asynchronous APIs, but you don't have to do it either way. I prefer Tasks over Threads in general, since it's much easier to get results from Tasks (including exceptions).
The "async all the way" rule only goes one way. That is, once you call an async method, then its caller should be async, and its caller should be async, etc. It does not mean that every method called by an async method must be async.
So, one good reason to have an async Win32 service would be if there's an async-only API you need to consume. That would cause your DoSomething method to become async DoSomethingAsync.
What if an async/await-based program already exists, doing other things, but now additional functionality that uses such a library needs to be added to it? Does that mean that the async/await-based program should be rewritten as a Thread/etc.-based program?
No. You can always just block from an async method. With proper documentation so when you are reusing/maintaining this code a year from now, you don't swear at your past self. :)
If you still spawn your threads, well, yes, it's bad. Because it will not give you any benefit as the thread is still allocated and consuming resources for the specific purpose of running your worker function. Running a few threads to be able to do work in parallel within a service has a minimal impact on your application.
If DoSomething() is synchronous, you could switch to the Timer class instead. It allows multiple timers to use a smaller amount of threads.
If it's important that the jobs can complete, you can modify your worker classes like this:
SemaphoreSlim _shutdownEvent = new SemaphoreSlim(0,1);
public async Task Stop()
{
return await _shutdownEvent.WaitAsync();
}
private void WorkerThreadFunc()
{
while (!shuttingDown)
{
DoSomething();
Thread.Sleep(10);
}
_shutdownEvent.Release();
}
.. which means that during shutdown you can do this:
var tasks = myServices.Select(x=> x.Stop());
Task.WaitAll(tasks);
A thread can only do one thing at a time. While it is working on your DoSomething it can't do anything else.
In an interview Eric Lippert described async-await in a restaurant metaphor. He suggests to use async-await only for functionality where your thread can do other things instead of waiting for a process to complete, like respond to operator input.
Alas, your thread is not waiting, it is doing hard work in DoSomething. And as long as DoSomething is not awaiting, your thread will not return from DoSomething to do the next thing.
So if your thread has something meaningful to do while procedure DoSomething is executing, it's wise to let another thread do the DoSomething, while your original thread is doing the meaningful stuff. Task.Run( () => DoSomething()) could do this for you. As long as the thread that called Task.Run doesn't await for this task, it is free to do other things.
You also want to cancel your process. DoSomething can't be cancelled. So even if cancellation is requested you'll have to wait until DoSomething is completed.
Below is your DoSomething in a form with a Start button and a Cancel button. While your thread is DoingSomething, one of the meaningful things your GUI thread may want to do is respond to pressing the cancel button:
void CancellableDoSomething(CancellationToken token)
{
while (!token.IsCancellationRequested)
{
DoSomething()
}
}
async Task DoSomethingAsync(CancellationToken token)
{
var task = Task.Run(CancellableDoSomething(token), token);
// if you have something meaningful to do, do it now, otherwise:
return Task;
}
CancellationTokenSource cancellationTokenSource = null;
private async void OnButtonStartSomething_Clicked(object sender, ...)
{
if (cancellationTokenSource != null)
// already doing something
return
// else: not doing something: start doing something
cancellationTokenSource = new CancellationtokenSource()
var task = AwaitDoSomethingAsync(cancellationTokenSource.Token);
// if you have something meaningful to do, do it now, otherwise:
await task;
cancellationTokenSource.Dispose();
cancellationTokenSource = null;
}
private void OnButtonCancelSomething(object sender, ...)
{
if (cancellationTokenSource == null)
// not doing something, nothing to cancel
return;
// else: cancel doing something
cancellationTokenSource.Cancel();
}

async Task - What actually happens on the CPU?

I've been reading about Tasks after asking this question and seeing that I completely misunderstood the concept. Answers such as the top answers here and here explain the idea, but I still don't get it.
So I've made this a very specific question: What actually happens on the CPU when a Task is executed?
This is what I've understood after some reading: A Task will share CPU time with the caller (and let's assume the caller is the "UI") so that if it's CPU-intensive - it will slow down the UI. If the Task is not CPU-intensive - it will be running "in the background". Seems clear enough …… until tested. The following code should allow the user to click on the button, and then alternately show "Shown" and "Button". But in reality: the Form is completely busy (-no user input possible) until the "Shown"s are all shown.
public Form1()
{
InitializeComponent();
Shown += Form1_Shown;
}
private async void Form1_Shown(object sender, EventArgs e)
{
await Doit("Shown");
}
private async Task Doit(string s)
{
WebClient client = new WebClient();
for (int i = 0; i < 10; i++)
{
client.DownloadData(uri);//This is here in order to delay the Text writing without much CPU use.
textBox1.Text += s + "\r\n";
this.Update();//textBox1.
}
}
private async void button1_Click(object sender, EventArgs e)
{
await Doit("Button");
}
Can someone please tell me what is actually happening on the CPU when a Task is executed (e.g. "When the CPU is not used by the UI, the Task uses it, except for when… etc.")?
The key to understanding this is that there are two kinds of tasks - one that executes code (what I call Delegate Tasks), and one that represents a future event (what I call Promise Tasks). Those two tasks are completely different, even though they're both represented by an instance of Task in .NET. I have some pretty pictures on my blog that may help understand how these types of task are different.
Delegate Tasks are the ones created by Task.Run and friends. They execute code on the thread pool (or possibly another TaskScheduler if you're using a TaskFactory). Most of the "task parallel library" documentation deals with Delegate Tasks. These are used to spread CPU-bound algorithms across multiple CPUs, or to push CPU-bound work off a UI thread.
Promise Tasks are the ones created by TaskCompletionSource<T> and friends (including async). These are the ones used for asynchronous programming, and are a natural fit for I/O-bound code.
Note that your example code will cause a compiler warning to the effect that your "asynchronous" method Doit is not actually asynchronous but is instead synchronous. So as it stands right now, it will synchronously call DownloadData, blocking the UI thread until the download completes, and then it will update the text box and finally return an already-completed task.
To make it asynchronous, you have to use await:
private async Task Doit(string s)
{
WebClient client = new WebClient();
for (int i = 0; i < 10; i++)
{
await client.DownloadDataTaskAsync(uri);
textBox1.Text += s + "\r\n";
this.Update();//textBox1.
}
}
Now it's returning an incomplete task when it hits the await, which allows the UI thread to return to its message processing loop. When the download completes, the remainder of this method will be queued to the UI thread as a message, and it will resume executing that method when it gets around to it. When the Doit method completes, then the task it returned earlier will complete.
So, tasks returned by async methods logically represent that method. The task itself is a Promise Task, not a Delegate Task, and does not actually "execute". The method is split into multiple parts (at each await point) and executes in chunks, but the task itself does not execute anywhere.
For further reading, I have a blog post on how async and await actually work (and how they schedule the chunks of the method), and another blog post on why asynchronous I/O tasks do not need to block threads.
As per your linked answers, Tasks and Threads are totally different concepts, and you are also getting confused with async / await
A Task is just a representation of some work to be done. It says nothing about HOW that work should be done.
A Thread is a representation of some work that is running on the CPU, but is sharing the CPU time with other threads that it can know nothing about.
You can run a Task on a Thread using Task.Run(). Your Task will run asynchronously and independently of any other code providing a threadpool thread is available.
You can also run a Task asynchronously on the SAME thread using async / await. Anytime the thread hits an await, it can save the current stack state, then travel back up the stack and carry on with other work until the awaited task has finished. Your Doit() code never awaits anything, so will run synchronously on your GUI thread until complete.
Tasks use the ThreadPool you can read extensively about what it is and how it works here
But in a nutshell, when a task is executed, the Task Scheduler looks in the ThreadPool to see if there is a thread available to run the action of the task. If not, it's going to be queued until one becomes available.
A ThreadPool is just a collection of already-instantiated threads made available so that multithreaded code can safely use concurrent programming without overwhelming the CPU with context-switching all the time.
Now, the problem with your code is that even though you return an object of type Task, you are not running anything concurrently - No separate thread is ever started!
In order to do that, you have two options, either you start yourDoit method as a Task, with
Option1
Task.Run(() => DoIt(s));
This will run the whole DoIt method on another thread from the Thread Pool, but it will lead to more problems, because in this method, you're trying to access UI-controls. therefore, you will need either to marshal those calls to the UI thread, or re-think your code so that the UI access is done directly on the UI thread after the asynchronous tasks completes.
Option 2 (preferred, if you can)
You use .net APIs which are already asynchronous, such as client.DownloadDataTaskAsync(); instead of client.DownloadData();
now, in your case, the problem is that you will need to have 10 calls, which are going to return 10 different objects of type Task<byte[]> and you want to await on the completion of all of them, not just one.
In order to do this, you will need to create a List<Task<byte[]>> returnedTasks and you will add to it all returned value from DownloadDataTaskAsync(). then, once this is done, you can use the following return value for your DoIt method.
return Task.WhenAll(returnedTasks);

Force a Task to continue on the current thread?

I'm making a port of the AKKA framework for .NET (don't take this too serious now, it is a weekend hack of the Actor part of it right now)
I'm having some problems with the "Future" support in it.
In Java/Scala Akka, Futures are to be awaited synchronously with an Await call.
Much like the .NET Task.Wait()
My goal is to support true async await for this.
It works right now, but the continuation is executed on the wrong thread in my current solution.
This is the result when passing a message to one of my actors that contain an await block for a future.
As you can see, the actor always executes on the same thread, while the await block executes on a random threadpool thread.
actor thread: 6
await thread 10
actor thread: 6
await thread 12
actor thread: 6
actor thread: 6
await thread 13
...
The actor gets a message using a DataFlow BufferBlock<Message>
Or rather, I use RX over the bufferblock to subscribe to messages.
It is configured like this:
var messages = new BufferBlock<Message>()
{
BoundedCapacity = 100,
TaskScheduler = TaskScheduler.Default,
};
messages.AsObservable().Subscribe(this);
So far so good.
However, when I await on a future result.
like so:
protected override void OnReceive(IMessage message)
{
....
var result = await Ask(logger, m);
// This is not executed on the same thread as the above code
result.Match()
.With<SomeMessage>(t => {
Console.WriteLine("await thread {0}",
System.Threading.Thread.CurrentThread.GetHashCode());
})
.Default(_ => Console.WriteLine("Unknown message"));
...
I know this is normal behavior of async await, but I really must ensure that only one thread has access to my actor.
I don't want the future to run synchronously, I want to to run async just like normal, but I want the continuation to run on the same thread as the message processor/actor does.
My code for the future support looks like this:
public Task<IMessage> Ask(ActorRef actor, IMessage message)
{
TaskCompletionSource<IMessage> result =
new TaskCompletionSource<IMessage>();
var future = Context.ActorOf<FutureActor>(name : Guid.NewGuid().ToString());
// once this object gets a response,
// we set the result for the task completion source
var futureActorRef = new FutureActorRef(result);
future.Tell(new SetRespondTo(), futureActorRef);
actor.Tell(message, future);
return result.Task;
}
Any ideas what I can do to force the continuation to run on the same thread that started the above code?
I'm making a port of the AKKA framework for .NET
Sweet. I went to an Akka talk at CodeMash '13 despite having never touched Java/Scala/Akka. I saw a lot of potential there for a .NET library/framework. Microsoft is working on something similar, which I hope will eventually be made generally available (it's currently in a limited preview).
I suspect that staying in the Dataflow/Rx world as much as possible is the easier approach; async is best when you have asynchronous operations (with a single start and single result for each operation), while Dataflow and Rx work better with streams and subscriptions (with a single start and multiple results). So my first gut reaction is to either link the buffer block to an ActionBlock with a specific scheduler, or use ObserveOn to move the Rx notifications to a specific scheduler, instead of trying to do it on the async side. Of course I'm not really familiar with the Akka API design, so take that with a grain of salt.
Anyway, my async intro describes the only two reliable options for scheduling await continuations: SynchronizationContext.Current and TaskScheduler.Current. If your Akka port is more of a framework (where your code does the hosting, and end-user code is always executed by your code), then a SynchronizationContext may make sense. If your port is more of a library (where end-user code does the hosting and calls your code as necessary), then a TaskScheduler would make more sense.
There aren't many examples of a custom SynchronizationContext, because that's pretty rare. I do have an AsyncContextThread type in my AsyncEx library which defines both a SynchronizationContext and a TaskScheduler for that thread. There are several examples of custom TaskSchedulers, such as the Parallel Extensions Extras which has an STA scheduler and a "current thread" scheduler.
Task scheduler decides whether to run a task on a new thread or on the current thread.
There is an option to force running it on a new thread, but none forcing it to run on the current thread.
But there is a method Task.RunSynchronously() which Runs the Task synchronously on the current TaskScheduler.
Also if you are using async/await there is already a similar question on that.

Categories