How Task.WhenAll works under the hood

How does Task.WhenAll work under the hood? Does it create a separate thread that finishes once all of the tasks have signaled completion? My guess is that it creates a new thread, hands the work for each task to the system drivers, and waits for them at the end, but I am not sure whether that is correct.

No, Task.WhenAll doesn't create a thread. It is possible that some of the element tasks passed to Task.WhenAll have created threads (but optimally they would not). Task.WhenAll itself just calls ContinueWith on the element tasks, passing a piece of code that checks the other task states. There is no "wait".
Here is an example of how Task.WhenAll may be implemented. (It is not the Microsoft code)
Task MyWhenAll(IEnumerable<Task> tasks)
{
    var a = tasks.ToArray();
    var tcs = new TaskCompletionSource<bool>();
    Array.ForEach(a, WatchTask);
    return tcs.Task;
    async void WatchTask(Task t)
    {
        try {
            await t;
        }
        catch {}
        if (a.All(element => element.IsCompleted)) {
            if (a.Any(element => element.IsFaulted))
                // omitted logic for adding each individual exception
                // to the aggregate
                tcs.TrySetException(new AggregateException());
            else
                tcs.TrySetResult(true);
        }
    }
}
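To make the "no thread, no wait" point concrete, here is a minimal sketch (the class and method names are made up for illustration). It builds the element tasks from TaskCompletionSource, so no thread is executing them, and shows that the combined task stays incomplete until the element tasks are completed by hand.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllNoThreadDemo
{
    // Returns whether the combined task was complete before any element
    // task finished, plus the results once all of them have finished.
    public static async Task<(bool CompletedEarly, int[] Results)> RunAsync()
    {
        // TaskCompletionSource tasks are pure "promises": no thread runs them.
        var sources = Enumerable.Range(0, 3)
            .Select(_ => new TaskCompletionSource<int>())
            .ToArray();

        Task<int[]> all = Task.WhenAll(sources.Select(s => s.Task));
        bool completedEarly = all.IsCompleted; // nothing has finished yet

        // Completing the element tasks is what lets the combined task finish.
        for (int i = 0; i < sources.Length; i++)
            sources[i].SetResult(i);

        int[] results = await all;
        return (completedEarly, results);
    }

    public static async Task Main()
    {
        var (early, results) = await RunAsync();
        Console.WriteLine($"completed early: {early}; results: {string.Join(", ", results)}");
    }
}
```

The results array follows the order of the tasks passed in, not the order in which they completed.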

Related

Why Task.WhenAll requires a manually created Task to be started when the same doesn't require for async methods?

Ex, the following code manually instantiates a Task and passes to a Task.WhenAll in a List<T>
public async Task Do3()
{
var task1 = new Task(async () => { await Task.Delay(2000); Console.WriteLine("########## task1"); });
var taskList = new List<Task>() { task1};
taskList[0].Start();
var taskDone = Task.WhenAll(taskList);
await taskDone;
}
Without starting the Task it doesn't work: called from a console app, it hangs forever. The version below, however, works just fine without starting anything:
public async Task Do3()
{
//var task1 = new Task(async () => { await Task.Delay(2000); Console.WriteLine("########## task1"); });
var taskList = new List<Task>() { SubDo1() };
//taskList[0].Start();
var taskDone = Task.WhenAll(taskList);
await taskDone;
}
public async Task SubDo1()
{
await Task.Delay(2000);
Console.WriteLine("########## task1");
}

Task is used in two completely different ways here; when you call an async method: you are starting it yourself; at this point, two things can happen:
it can run to completion (eventually) without ever reaching a truly asynchronous state, and return a completed (or faulted) task to the caller
it can reach an incomplete awaitable (in this case await Task.Delay), at which point it creates a state machine that represents the current position, schedules a completion operation on that incomplete awaitable (to do whatever comes next), and then returns an incomplete task to the caller
It is not "not started"; to return anything to the caller: we have started it. However, unlike Task.Start(), we start that work on our current thread - not an external worker thread - with other threads only getting involved based on how that incomplete awaitable schedules the completion callbacks that the compiler gives it.
This is very different to the new Task(...) scenario, where nothing is initially started. That's why they behave differently. Note also the Remarks section of the Task constructor documentation: it is a very niche API, and honestly not hugely recommended.
Additionally: when you don't immediately await an async method, you're essentially going into concurrent territory (assuming the awaitable won't always complete synchronously). In some cases this matters, and may cause threading problems such as race conditions. It shouldn't matter much in this case, though.
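The difference between a constructed task and one returned by an async method can be seen from Task.Status. The following is a small sketch, with WorkAsync as a hypothetical stand-in for SubDo1:

```csharp
using System;
using System.Threading.Tasks;

public static class HotVsColdDemo
{
    // Hypothetical async method, standing in for SubDo1 above.
    public static async Task WorkAsync()
    {
        await Task.Delay(50);
    }

    public static (TaskStatus Cold, TaskStatus Hot) Statuses()
    {
        var cold = new Task(() => { }); // constructed, but never scheduled
        Task hot = WorkAsync();         // has already run up to its first await
        return (cold.Status, hot.Status);
    }

    public static void Main()
    {
        var (cold, hot) = Statuses();
        Console.WriteLine($"new Task(...): {cold}");
        Console.WriteLine($"async method:  {hot}");
    }
}
```

The constructed task reports Created until Start() is called, which is why awaiting it (directly or through Task.WhenAll) never finishes. Separately, the non-generic Task constructor only accepts Action-style delegates, so an async lambda passed to it is compiled as async void: even after Start(), the outer task completes at the lambda's first await rather than when its body finishes.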

Task.WaitAll() deadlocking

I want to call an asynchronous method multiple times in a xUnit test and wait for all calls to complete before I continue execution. I read that I can use Task.WhenAll() and Task.WaitAll() for precisely this scenario. For some reason however, the code is deadlocking.
[Fact]
public async Task GetLdapEntries_ReturnsLdapEntries()
{
var ldapEntries = _fixture.CreateMany<LdapEntryDto>(2).ToList();
var creationTasks = new List<Task>();
foreach (var led in ldapEntries)
{
var task = _attributesServiceClient.CreateLdapEntry(led);
task.Start();
creationTasks.Add(task);
}
Task.WaitAll(creationTasks.ToArray()); //<-- deadlock(?) here
//await Task.WhenAll(creationTasks);
var result = await _ldapAccess.GetLdapEntries();
result.Should().BeEquivalentTo(ldapEntries);
}
public async Task<LdapEntryDto> CreateLdapEntry(LdapEntryDto ldapEntryDto)
{
using (var creationResponse = await _httpClient.PostAsJsonAsync<LdapEntryDto>("", ldapEntryDto))
{
if (creationResponse.StatusCode == HttpStatusCode.Created)
{
return await creationResponse.Content.ReadAsAsync<LdapEntryDto>();
}
throw await buildException(creationResponse);
}
}
The system under test is a wrapper around an HttpClient that calls a web service, awaits the response, and possibly awaits reading the response's content that is finally deserialized and returned.
When I change the foreach part in the test to the following (ie, don't use Task.WhenAll() / WaitAll()), the code is running without a deadlock:
foreach (var led in ldapEntries)
{
await _attributesServiceClient.CreateLdapEntry(led);
}
What exactly is happening?
EDIT: While this question has been marked as duplicate, I don't see how the linked question relates to this one. The code examples in the link all use .Result which, as far as I understand, blocks the execution until the task has finished. In contrast, Task.WhenAll() returns a task that can be awaited and that finishes when all tasks have finished. So why is awaiting Task.WhenAll() deadlocking?

The code you posted cannot possibly have the behavior described. The first call to Task.Start would throw an InvalidOperationException, failing the test.
I read that I can use Task.WhenAll() and Task.WaitAll() for precisely this scenario.
No; to asynchronously wait on multiple tasks, you must use Task.WhenAll, not Task.WaitAll.
Example:
[Fact]
public async Task GetLdapEntries_ReturnsLdapEntries()
{
var ldapEntries = new List<int> { 0, 1 };
var creationTasks = new List<Task>();
foreach (var led in ldapEntries)
{
var task = CreateLdapEntry(led);
creationTasks.Add(task);
}
await Task.WhenAll(creationTasks);
}
public async Task<string> CreateLdapEntry(int ldapEntryDto)
{
await Task.Delay(500);
return "";
}

Task.WaitAll() can deadlock because it blocks the current thread until the tasks finish. If the continuations after each await inside those tasks need to resume on a context owned by the blocked thread (for example a single-threaded synchronization context, as installed by UI frameworks and some test runners), they can never run, because the thread that would run them is the one sitting in Task.WaitAll().
I'm not sure why WhenAll would also deadlock for you here, though; it definitely shouldn't.
PS: you don't need to call Start on tasks returned by an async method: they are "hot" (already started) when they are returned.
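As a quick check of the point about Start, the sketch below (CreateEntryAsync is a hypothetical stand-in for CreateLdapEntry) calls Start on a task returned by an async method and reports whether it throws:

```csharp
using System;
using System.Threading.Tasks;

public static class StartOnHotTaskDemo
{
    // Hypothetical stand-in for an async service call.
    public static async Task<string> CreateEntryAsync(int id)
    {
        await Task.Delay(10);
        return $"entry-{id}";
    }

    // Returns true if calling Start() on the returned task throws.
    public static bool StartThrows()
    {
        Task<string> task = CreateEntryAsync(1);
        try
        {
            task.Start(); // not valid for a task produced by an async method
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine($"Start() threw: {StartThrows()}");
    }
}
```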

ContinueWith chaining not working as expected

I have this example code:
static void Main(string[] args) {
var t1 = Task.Run(async () => {
Console.WriteLine("Putting in fake processing 1.");
await Task.Delay(300);
Console.WriteLine("Fake processing finished 1. ");
});
var t2 = t1.ContinueWith(async (c) => {
Console.WriteLine("Putting in fake processing 2.");
await Task.Delay(200);
Console.WriteLine("Fake processing finished 2.");
});
var t3 = t2.ContinueWith(async (c) => {
Console.WriteLine("Putting in fake processing 3.");
await Task.Delay(100);
Console.WriteLine("Fake processing finished 3.");
});
Console.ReadLine();
}
The console output baffles me:
Putting in fake processing 1.
Fake processing finished 1.
Putting in fake processing 2.
Putting in fake processing 3.
Fake processing finished 3.
Fake processing finished 2.
I am trying to chain the tasks so they execute one after another. What am I doing wrong? I can't use await: this is just example code. In reality I am queueing incoming tasks (some asynchronous, some not) and want to execute them in the same order they came in, with no parallelism. ContinueWith seemed better than creating a ConcurrentQueue and handling everything myself, but it just doesn't work...

Take a look at the type of t2. It's a Task<Task>. t2 completes when it finishes starting the task that does the actual work, not when that work actually finishes.
The smallest change to get your code working is to call Unwrap() after both calls to ContinueWith, so that you get the task that represents the completion of your work.
The more idiomatic solution would be to simply remove the ContinueWith calls entirely and just use await to add continuations to tasks.
Interestingly enough, you would see the same behavior for t1 if you used Task.Factory.StartNew, but Task.Run is specifically designed to work with async lambdas: its overloads that accept a Func<Task> delegate internally unwrap the result, returning a task that represents the completion of the inner task rather than a task that represents starting it, which is why you don't need to unwrap t1.
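A minimal sketch of the Unwrap fix, under the assumption that each step is an async method (Step and the log are invented here for illustration). The log is returned so the ordering can be inspected:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class UnwrapDemo
{
    public static async Task<string[]> RunAsync()
    {
        var log = new ConcurrentQueue<string>();

        // Hypothetical unit of fake processing.
        async Task Step(int n, int delayMs)
        {
            log.Enqueue($"start {n}");
            await Task.Delay(delayMs);
            log.Enqueue($"end {n}");
        }

        Task t1 = Task.Run(() => Step(1, 30));
        // ContinueWith yields a Task<Task> here; Unwrap() turns it into a Task
        // that completes only when the inner async work has finished.
        Task t2 = t1.ContinueWith(_ => Step(2, 20)).Unwrap();
        Task t3 = t2.ContinueWith(_ => Step(3, 10)).Unwrap();

        await t3;
        return log.ToArray();
    }

    public static async Task Main()
    {
        foreach (var line in await RunAsync())
            Console.WriteLine(line);
    }
}
```

With the Unwrap calls in place, each step starts only after the previous one has logged its end.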

in reality I am queueing incoming tasks (some asynchronous, some not) and want to execute them in the same order they came in but with no parallelism
You probably want to use TPL Dataflow for that. Specifically, ActionBlock.
var block = new ActionBlock<object>(async item =>
{
// Handle synchronous item
var action = item as Action;
if (action != null)
action();
// Handle asynchronous item
var func = item as Func<Task>;
if (func != null)
await func();
});
// To queue a synchronous item
Action synchronous = () => Thread.Sleep(1000);
block.Post(synchronous);
// To queue an asynchronous item
Func<Task> asynchronous = async () => { await Task.Delay(1000); };
block.Post(asynchronous);

Running multiple async tasks and waiting for them all to complete

I need to run multiple async tasks in a console application, and wait for them all to complete before further processing.
There are many articles out there, but I seem to get more confused the more I read. I've read and understand the basic principles of the Task library, but I'm clearly missing a link somewhere.
I understand that it's possible to chain tasks so that they start after another completes (which is pretty much the scenario for all the articles I've read), but I want all my Tasks running at the same time, and I want to know once they're all completed.
What's the simplest implementation for a scenario like this?

Neither answer mentioned the awaitable Task.WhenAll:
var task1 = DoWorkAsync();
var task2 = DoMoreWorkAsync();
await Task.WhenAll(task1, task2);
The main difference between Task.WaitAll and Task.WhenAll is that the former will block (similar to using Wait on a single task) while the latter will not and can be awaited, yielding control back to the caller until all tasks finish.
Moreover, exception handling differs:
Task.WaitAll:
At least one of the Task instances was canceled -or- an exception was thrown during the execution of at least one of the Task instances. If a task was canceled, the AggregateException contains an OperationCanceledException in its InnerExceptions collection.
Task.WhenAll:
If any of the supplied tasks completes in a faulted state, the returned task will also complete in a Faulted state, where its exceptions will contain the aggregation of the set of unwrapped exceptions from each of the supplied tasks.
If none of the supplied tasks faulted but at least one of them was canceled, the returned task will end in the Canceled state.
If none of the tasks faulted and none of the tasks were canceled, the resulting task will end in the RanToCompletion state.
If the supplied array/enumerable contains no tasks, the returned task will immediately transition to a RanToCompletion state before it's returned to the caller.
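One practical consequence of the Task.WhenAll behavior quoted above: awaiting the combined task rethrows only a single exception, while the task's Exception property still holds all of them. A small sketch, with FailAsync as a made-up helper:

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllExceptionsDemo
{
    // Hypothetical helper that always fails asynchronously.
    static async Task FailAsync(string message)
    {
        await Task.Yield();
        throw new InvalidOperationException(message);
    }

    public static async Task<(string Caught, int Aggregated)> RunAsync()
    {
        Task all = Task.WhenAll(FailAsync("first"), FailAsync("second"));
        try
        {
            await all; // rethrows only one of the inner exceptions
            return ("none", 0);
        }
        catch (InvalidOperationException ex)
        {
            // The combined task still carries every failure.
            int count = all.Exception?.InnerExceptions.Count ?? 0;
            return (ex.Message, count);
        }
    }

    public static async Task Main()
    {
        var (caught, aggregated) = await RunAsync();
        Console.WriteLine($"caught: {caught}; aggregated: {aggregated}");
    }
}
```

If every failure matters, keep a reference to the combined task (as `all` does here) and inspect its Exception after the await throws.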

You could create many tasks like:
List<Task> TaskList = new List<Task>();
foreach(...)
{
var LastTask = new Task(SomeFunction);
LastTask.Start();
TaskList.Add(LastTask);
}
Task.WaitAll(TaskList.ToArray());

You can use WhenAll, which returns an awaitable Task, or WaitAll, which has no return value and blocks further code execution (similar to Thread.Sleep) until all tasks are completed, canceled or faulted.
Any of the supplied tasks completes in a faulted state
WhenAll: A task with the faulted state will be returned. The exceptions will contain the aggregation of the set of unwrapped exceptions from each of the supplied tasks.
WaitAll: An AggregateException will be thrown.
None of the supplied tasks faulted but at least one of them was canceled
WhenAll: The returned task will end in the TaskStatus.Canceled state.
WaitAll: An AggregateException will be thrown which contains an OperationCanceledException in its InnerExceptions collection.
An empty list was given
WhenAll: The returned task will immediately transition to a TaskStatus.RanToCompletion state before it's returned to the caller.
WaitAll: Returns immediately, as there is nothing to wait for. (According to the documentation, an ArgumentException is thrown when the array contains a null element.)
Blocking behavior
WhenAll: Doesn't block the current thread.
WaitAll: Blocks the current thread.
Example
var tasks = new Task[] {
TaskOperationOne(),
TaskOperationTwo()
};
Task.WaitAll(tasks);
// or
await Task.WhenAll(tasks);
If you want to run the tasks in a particular/specific order you can get inspiration from this answer.

The best option I've seen is the following extension method:
public static Task ForEachAsync<T>(this IEnumerable<T> sequence, Func<T, Task> action) {
return Task.WhenAll(sequence.Select(action));
}
Call it like this:
await sequence.ForEachAsync(item => item.SomethingAsync(blah));
Or with an async lambda:
await sequence.ForEachAsync(async item => {
var more = await GetMoreAsync(item);
await more.FrobbleAsync();
});

Yet another answer, but I often find myself needing to load data concurrently and store it in variables, like:
var cats = new List<Cat>();
var dog = new Dog();
var loadDataTasks = new Task[]
{
Task.Run(async () => cats = await LoadCatsAsync()),
Task.Run(async () => dog = await LoadDogAsync())
};
try
{
await Task.WhenAll(loadDataTasks);
}
catch (Exception ex)
{
// handle exception
}

Do you want to chain the Tasks, or can they be invoked in a parallel manner?
For chaining
Just do something like
Task.Run(...).ContinueWith(...).ContinueWith(...).ContinueWith(...);
Task.Factory.StartNew(...).ContinueWith(...).ContinueWith(...).ContinueWith(...);
and don't forget to check the previous Task instance in each ContinueWith as it might be faulted.
For the parallel manner
The most simple method I came across: Parallel.Invoke
Otherwise there's Task.WaitAll or you can even use WaitHandles for doing a countdown to zero actions left (wait, there's a new class: CountdownEvent), or ...

This is how I do it with an array of Func<Task>:
var tasks = new Func<Task>[]
{
() => myAsyncWork1(),
() => myAsyncWork2(),
() => myAsyncWork3()
};
await Task.WhenAll(tasks.Select(task => task()).ToArray()); //Async
Task.WaitAll(tasks.Select(task => task()).ToArray()); //Or use WaitAll for Sync

I prepared a piece of code to show you how to use the task for some of these scenarios.
// Method to await all tasks together
public async Task RunMultipleTaskParallel(Task[] tasks)
{
    await Task.WhenAll(tasks);
}
// Method to await tasks one by one, in array order.
// Note: tasks returned by async methods are already running when they are
// passed in, so this only controls the order in which they are awaited.
public async Task RunMultipleTaskOneByOne(Task[] tasks)
{
    for (int i = 0; i < tasks.Length; i++)
        await tasks[i];
}
// Method to await tasks in batches of i
public async Task RunMultipleTaskParallel(Task[] tasks, int i)
{
    var countTask = tasks.Length;
    var remainTasks = 0;
    do
    {
        int toTake = (countTask < i) ? countTask : i;
        var limitedTasks = tasks.Skip(remainTasks)
                                .Take(toTake);
        remainTasks += toTake;
        await RunMultipleTaskParallel(limitedTasks.ToArray());
    } while (remainTasks < countTask);
}

There should be a more succinct solution than the accepted answer. It shouldn't take three steps to run multiple tasks simultaneously and get their results.
1. Create tasks
2. await Task.WhenAll(tasks)
3. Get task results (e.g., task1.Result)
Here's a method that cuts this down to two steps:
public async Task<Tuple<T1, T2>> WhenAllGeneric<T1, T2>(Task<T1> task1, Task<T2> task2)
{
await Task.WhenAll(task1, task2);
return Tuple.Create(task1.Result, task2.Result);
}
You can use it like this:
var taskResults = await WhenAllGeneric(DoWorkAsync(), DoMoreWorkAsync());
var doWorkResult = taskResults.Item1;
var doMoreWorkResult = taskResults.Item2;
This removes the need for the temporary task variables. The problem is that while it works for two tasks, you'd need to update it for three tasks, or any other number of tasks. It also doesn't work well if one of the tasks doesn't return anything. Really, the .NET library should provide something that can do this.
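For the narrower case where all tasks return the same type, the library does already cover this: the Task.WhenAll overloads that take Task<TResult> elements return a Task<TResult[]>. A short sketch, with SquareAsync as an invented example method:

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllResultsDemo
{
    // Hypothetical async computation.
    static async Task<int> SquareAsync(int n)
    {
        await Task.Delay(10);
        return n * n;
    }

    public static async Task<int[]> RunAsync()
    {
        Task<int>[] tasks = { SquareAsync(2), SquareAsync(3), SquareAsync(4) };
        // One await yields every result, in the same order as the input tasks.
        int[] results = await Task.WhenAll(tasks);
        return results;
    }

    public static async Task Main()
    {
        Console.WriteLine(string.Join(", ", await RunAsync()));
    }
}
```

This does not help with tasks of different result types, which is the case the tuple-returning helper above is aimed at.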

If you're using the async/await pattern, you can run several tasks in parallel like this:
public async Task DoSeveralThings()
{
// Start all the tasks (assuming each method returns a Task<T>)
var first = DoFirstThingAsync();
var second = DoSecondThingAsync();
// Then wait for them to complete
var firstResult = await first;
var secondResult = await second;
}

Regarding the usage of SemaphoreSlim with Async/Await

I am not an advanced developer. I'm just trying to get a hold on the task library and just googling. I've never used the class SemaphoreSlim so I would like to know what it does. Here I present code where SemaphoreSlim is used with async & await but which I do not understand. Could someone help me to understand the code below.
1st set of code
await WorkerMainAsync();
async Task WorkerMainAsync()
{
SemaphoreSlim ss = new SemaphoreSlim(10);
while (true)
{
await ss.WaitAsync();
// you should probably store this task somewhere and then await it
var task = DoPollingThenWorkAsync();
}
}
async Task DoPollingThenWorkAsync(SemaphoreSlim semaphore)
{
var msg = Poll();
if (msg != null)
{
await Task.Delay(3000); // process the I/O-bound job
}
// this assumes you don't have to worry about exceptions
// otherwise consider try-finally
semaphore.Release();
}
Firstly, the WorkerMainAsync will be called and a SemaphoreSlim is used. Why is 10 passed to the constructor of SemaphoreSlim?
When does the control come out of the while loop again?
What does ss.WaitAsync(); do?
The DoPollingThenWorkAsync() function expects a SemaphoreSlim but is not passed anything when it is called. Is this a typo?
Why is await Task.Delay(3000); used?
They could simply call Task.Delay(3000), so why do they use await here?
2nd set of code for same purpose
async Task WorkerMainAsync()
{
SemaphoreSlim ss = new SemaphoreSlim(10);
List<Task> trackedTasks = new List<Task>();
while (DoMore())
{
await ss.WaitAsync();
trackedTasks.Add(Task.Run(() =>
{
DoPollingThenWorkAsync();
ss.Release();
}));
}
await Task.WhenAll(trackedTasks);
}
void DoPollingThenWorkAsync()
{
var msg = Poll();
if (msg != null)
{
Thread.Sleep(2000); // process the long running CPU-bound job
}
}
Here a task that calls ss.Release is added to a list. I really do not understand how the tasks can run after being added to a list.
trackedTasks.Add(Task.Run(async () =>
{
await DoPollingThenWorkAsync();
ss.Release();
}));
I am looking forward to a good explanation and help understanding the two sets of code. Thanks

why 10 is passing to SemaphoreSlim constructor.
They are using SemaphoreSlim to limit to 10 tasks at a time. The semaphore is "taken" before each task is started, and each task "releases" it when it finishes. For more about semaphores, see MSDN.
they can use simply Task.Delay(3000) but why they use await here.
Task.Delay creates a task that completes after the specified time interval and returns it. Like most Task-returning methods, Task.Delay returns immediately; it is the returned Task that has the delay. So if the code did not await it, there would be no delay.
just really do not understand after adding task to list how they can run?
In the Task-based Asynchronous Pattern, Task objects are returned "hot". This means they're already running by the time they're returned. The await Task.WhenAll at the end is waiting for them all to complete.
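To see the throttling effect described in this answer, the sketch below tracks how many jobs are inside the semaphore at once. The job body, the counters and the parameter names are assumptions for illustration; unlike the question's first snippet, it releases the semaphore in a finally block.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottleDemo
{
    // Runs `jobs` fake I/O-bound jobs, at most `limit` at a time, and
    // returns the highest number of jobs observed running concurrently.
    public static async Task<int> RunAsync(int jobs, int limit)
    {
        var semaphore = new SemaphoreSlim(limit);
        var gate = new object();
        int running = 0, peak = 0;

        async Task WorkAsync()
        {
            await semaphore.WaitAsync(); // take a slot, waiting without blocking a thread
            try
            {
                lock (gate) { running++; peak = Math.Max(peak, running); }
                await Task.Delay(20);    // stands in for the I/O-bound job
                lock (gate) { running--; }
            }
            finally
            {
                semaphore.Release();     // always give the slot back
            }
        }

        // Every task is "hot" as soon as WorkAsync() is called; WhenAll only
        // observes their completion.
        await Task.WhenAll(Enumerable.Range(0, jobs).Select(_ => WorkAsync()));
        return peak;
    }

    public static async Task Main()
    {
        int peak = await RunAsync(jobs: 10, limit: 3);
        Console.WriteLine($"peak concurrency: {peak}");
    }
}
```

The observed peak never exceeds the value passed to the SemaphoreSlim constructor, which is the role the 10 plays in the question's code.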
