Task.WhenAll(IEnumerable): Tasks are started twice?


I just stumbled upon one of the overloads of Task.WhenAll, the one that takes an IEnumerable as parameter
public static Task WhenAll(IEnumerable<Task<TResult>> tasks)
I thought I'd try this function with the following short program.
In a Test class:
// contains the numbers of the tasks that have been run
private HashSet<int> completedTasks = new HashSet<int>();
// async function: waits a while and marks that it has been run
async Task<int> Calculate(int taskNr)
{
string msg = completedTasks.Contains(taskNr) ?
"This task has been run before" :
"This is the first time this task runs";
Console.WriteLine($"Start task {taskNr} {msg}");
await Task.Delay(TimeSpan.FromMilliseconds(100));
Console.WriteLine($"Finished task {taskNr}");
// mark that this task has been run:
completedTasks.Add(taskNr);
return taskNr;
}
// async test function that uses Task.WhenAll(IEnumerable)
public async Task TestAsync()
{
Console.Write("Create the task enumerators... ");
IEnumerable<Task<int>> tasks = Enumerable.Range(1, 3)
.Select(i => Calculate(i));
Console.WriteLine("Done!");
Console.WriteLine("Start Tasks and await");
await Task.WhenAll(tasks);
Console.WriteLine("Finished waiting. Results:");
foreach (var task in tasks)
{
Console.WriteLine(task.Result);
}
}
Finally the main program:
static void Main(string[] args)
{
var testClass = new TestClass();
Task t = Task.Run(() => testClass.TestAsync());
t.Wait();
}
The output is as follows:
Create the task enumerators... Done!
Start Tasks and await
Start task 1 This is the first time this task runs
Start task 2 This is the first time this task runs
Start task 3 This is the first time this task runs
Finished task 2
Finished task 3
Finished task 1
Finished waiting. Results:
Start task 1 This task has been run before
Finished task 1
1
Start task 2 This task has been run before
Finished task 2
2
Start task 3 This task has been run before
Finished task 3
3
Apparently each task is run twice! What am I doing wrong?
Even stranger: if I enumerate over the sequence of tasks using ToList() before calling Task.WhenAll, the function works as expected!

Your problem is deferred execution. Change this line
IEnumerable<Task<int>> tasks = Enumerable.Range(1, 3)
.Select(i => Calculate(i));
to
var tasks = Enumerable.Range(1, 3)
.Select(i => Calculate(i)).ToList();
Select() does not execute the "query" immediately; it returns a lazily evaluated sequence (an IEnumerable<Task<int>>). Only when you iterate through that sequence is the inner lambda called for the values 1 to 3.
In your version, every time you iterate through tasks, Calculate(i) is called again and new tasks are created.
With .ToList() the query is executed once and the resulting sequence of Task<int> is stored in a List<Task<int>> (and not generated again when that list is enumerated a second time).
When you call Task.WhenAll(tasks), this method iterates through tasks and thereby starts each task. When you later iterate again (with your foreach loop to output the results), the query is executed again and thereby new tasks are started.
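The double enumeration can be made visible by counting how often the selector runs. The following is a minimal sketch; the names DeferredExecutionDemo, CountInvocationsAsync and the materialize flag are illustrative assumptions and do not appear in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class DeferredExecutionDemo
{
    // Returns how many times the Select lambda ran after the sequence was
    // enumerated twice: once by Task.WhenAll and once by foreach.
    public static async Task<int> CountInvocationsAsync(bool materialize)
    {
        int invocations = 0;

        // Stand-in for Calculate(i): counts the call, then completes later.
        async Task<int> CalculateAsync(int i)
        {
            invocations++; // runs synchronously while the sequence is enumerated
            await Task.Delay(10);
            return i;
        }

        IEnumerable<Task<int>> tasks =
            Enumerable.Range(1, 3).Select(i => CalculateAsync(i));

        if (materialize)
        {
            tasks = tasks.ToList(); // run the query once and keep the tasks
        }

        await Task.WhenAll(tasks);        // first enumeration
        foreach (Task<int> task in tasks) // second enumeration
        {
            await task;
        }

        return invocations;
    }
}
```

Calling CountInvocationsAsync(false) should yield 6, because the lazy sequence creates three new tasks on each enumeration, while CountInvocationsAsync(true) should yield 3.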

Related

How to synchronize the recurrent execution of three tasks that depend on each other?

I would like to ask expert developers in C#. I have three recurrent tasks that my program needs to do. Task 2 depends on task 1 and task 3 depends on task 2, but task 1 doesn't need to wait for the other two tasks to finish in order to start again (the program is continuously running). Since each task takes some time, I would like to run each task in one thread or a C# Task. Once task 1 finishes task 2 starts and task 1 starts again ... etc.
I'm not sure what is the best way to implement this. I hope someone can guide me on this.

One way to achieve this is to use TPL Dataflow, a library built on top of the Task Parallel Library (namespace System.Threading.Tasks.Dataflow). It provides a set of classes that allow you to arrange your tasks into "blocks". You create a method that does A, B and C sequentially, and the library then takes care of running multiple invocations of that method simultaneously. Here's a small example:
async Task Main()
{
var actionBlock = new ActionBlock<int>(DoTasksAsync, new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 2 // This is the number of simultaneous executions of DoTasksAsync that will be run
});
await actionBlock.SendAsync(1);
await actionBlock.SendAsync(2);
actionBlock.Complete();
await actionBlock.Completion;
}
async Task DoTasksAsync(int input)
{
await DoTaskAAsync();
await DoTaskBAsync();
await DoTaskCAsync();
}

I would probably use some kind of queue pattern.
I am not sure whether task 1 needs to be thread-safe, so I will keep it simple:
Task 1 is always executing. As soon as it finishes, it posts a message on some queue and starts over.
Task 2 is listening to the queue. Whenever a message is available, it starts working on it.
Whenever task 2 finishes working, it calls task 3, so that it can do its work.
As one of the comments mentioned, you should probably be able to use async/await successfully in your code, especially between task 2 and task 3. Note that task 1 can run in parallel to tasks 2 and 3, since it does not depend on any of the other tasks.
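The queue pattern described above could be sketched with System.Threading.Channels. This is only one possible shape, under the assumption that a single consumer is enough; QueuePipelineDemo, Task1Async, Task2Async and Task3Async are hypothetical stand-ins for the real work, and the loop is bounded by a rounds parameter so that the example terminates.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class QueuePipelineDemo
{
    // Task 1 acts as the producer; one consumer runs task 2 and then
    // task 3 for every message that task 1 posted on the queue.
    public static async Task<List<string>> RunAsync(int rounds)
    {
        var queue = Channel.CreateUnbounded<int>();
        var results = new List<string>(); // only touched by the consumer

        Task producer = Task.Run(async () =>
        {
            try
            {
                for (int round = 1; round <= rounds; round++)
                {
                    int value = await Task1Async(round);
                    await queue.Writer.WriteAsync(value); // post, then start over
                }
            }
            finally
            {
                queue.Writer.Complete(); // lets the consumer loop end
            }
        });

        Task consumer = Task.Run(async () =>
        {
            await foreach (int value in queue.Reader.ReadAllAsync())
            {
                string intermediate = await Task2Async(value);
                results.Add(await Task3Async(intermediate));
            }
        });

        await Task.WhenAll(producer, consumer);
        return results;
    }

    // Hypothetical placeholders for the three real tasks.
    private static async Task<int> Task1Async(int round)
    {
        await Task.Delay(10);
        return round * 10;
    }

    private static async Task<string> Task2Async(int value)
    {
        await Task.Delay(10);
        return $"item {value}";
    }

    private static async Task<string> Task3Async(string text)
    {
        await Task.Delay(10);
        return text + " done";
    }
}
```

In a long-running program the for loop would become a loop that runs until cancellation, and the unbounded channel could be replaced by a bounded one if task 1 is much faster than tasks 2 and 3.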

You could use the ParallelLoop method below. This method starts an asynchronous workflow, where the three tasks are invoked in parallel to each other, but sequentially to themselves. So you don't need to add synchronization inside each task, unless some task produces global side-effects that are visible from some other task.
The tasks are invoked on the ThreadPool, with the Task.Run method.
/// <summary>
/// Invokes three actions repeatedly in parallel on the ThreadPool, with the
/// action2 depending on the action1, and the action3 depending on the action2.
/// Each action is invoked sequentially to itself.
/// </summary>
public static async Task ParallelLoop<TResult1, TResult2>(
Func<TResult1> action1,
Func<TResult1, TResult2> action2,
Action<TResult2> action3,
CancellationToken cancellationToken = default)
{
// Arguments validation omitted
var task1 = Task.FromResult<TResult1>(default);
var task2 = Task.FromResult<TResult2>(default);
var task3 = Task.CompletedTask;
try
{
int counter = 0;
while (true)
{
counter++;
var result1 = await task1.ConfigureAwait(false);
cancellationToken.ThrowIfCancellationRequested();
task1 = Task.Run(action1); // Restart the task1
if (counter <= 1) continue; // In the first loop result1 is undefined
var result2 = await task2.ConfigureAwait(false);
cancellationToken.ThrowIfCancellationRequested();
task2 = Task.Run(() => action2(result1)); // Restart the task2
if (counter <= 2) continue; // In the second loop result2 is undefined
await task3.ConfigureAwait(false);
cancellationToken.ThrowIfCancellationRequested();
task3 = Task.Run(() => action3(result2)); // Restart the task3
}
}
finally
{
// Prevent fire-and-forget
Task allTasks = Task.WhenAll(task1, task2, task3);
try { await allTasks.ConfigureAwait(false); } catch { allTasks.Wait(); }
// Propagate all errors in an AggregateException
}
}
There is an obvious pattern in the implementation, that makes it trivial to add overloads having more than three actions. Each added action will require its own generic type parameter (TResult3, TResult4 etc).
Usage example:
var cts = new CancellationTokenSource();
Task loopTask = ParallelLoop(() =>
{
// First task
Thread.Sleep(1000); // Simulates synchronous work
return "OK"; // The result that is passed to the second task
}, result =>
{
// Second task
Thread.Sleep(1000); // Simulates synchronous work
return result + "!"; // The result that is passed to the third task
}, result =>
{
// Third task
Thread.Sleep(1000); // Simulates synchronous work
}, cts.Token);
In case any of the tasks fails, the whole loop will stop (with the loopTask.Exception containing the error). Since the tasks depend on each other, recovering from a single failed task is not possible¹. What you could do is to execute the whole loop through a Polly Retry policy, to make sure that the loop will be reincarnated in case of failure. If you are unfamiliar with the Polly library, you could use the simple and featureless RetryUntilCanceled method below:
public static async Task RetryUntilCanceled(Func<Task> action,
CancellationToken cancellationToken)
{
while (true)
{
cancellationToken.ThrowIfCancellationRequested();
try { await action().ConfigureAwait(false); }
catch { if (cancellationToken.IsCancellationRequested) throw; }
}
}
Usage:
Task loopTask = RetryUntilCanceled(() => ParallelLoop(() =>
{
//...
}, cts.Token), cts.Token);
Before exiting the process you are advised to Cancel() the CancellationTokenSource and Wait() (or await) the loopTask, in order for the loop to terminate gracefully. Otherwise some tasks may be aborted in the middle of their work.
¹ It is actually possible, and probably preferable, to execute each individual task through a Polly Retry policy. The parallel loop will be suspended until the failed task is retried successfully.

async & await - How to wait until all Tasks are done?

ok. I made a simple console app to figure out how to make all this work. Once I have the basic outline working, then I'll apply it to the real application.
The idea is that we have a lot of database calls to execute that we know are going to take a long time. We do NOT want to (or have to) wait for one database call to be completed before we make the next. They can all run at the same time.
But, before making all of the calls, we need to perform a "starting" task. And when all of the calls are complete, we need to perform a "finished" task.
Here's where I'm at now:
static void Main(string[] args)
{
Console.WriteLine("starting");
PrintAsync().Wait();
Console.WriteLine("ending"); // Must not fire until all tasks are finished
Console.Read();
}
// Missing an "await", I know. But what do I await for?
static async Task PrintAsync()
{
Task.Run(() => PrintOne());
Task.Run(() => PrintTwo());
}
static void PrintOne()
{
Console.WriteLine("one - start");
Thread.Sleep(3000);
Console.WriteLine("one - finish");
}
static void PrintTwo()
{
Console.WriteLine("two - start");
Thread.Sleep(3000);
Console.WriteLine("two - finish");
}
But no matter what I try, Ending always gets printed too early:
starting
ending
one - start
two - start
one - finish
two - finish
What IS working right is that PrintTwo() starts before PrintOne() is done. But how do I properly wait for PrintAsync() to finish before doing anything else?

You need to await the completion of the inner tasks:
static async Task PrintAsync()
{
await Task.WhenAll(Task.Run(() => PrintOne()), Task.Run(() => PrintTwo()));
}
Explanation: async Task denotes an awaitable method. Inside this method you can also await tasks. If you don't, the tasks are simply let loose and run on their own. Task.Run returns a Task which can be awaited. If you want both tasks to run in parallel, you can take the tasks from the return values and pass them to the awaitable method Task.WhenAll.
EDIT: Actually, Visual Studio marks the original code with a green squiggly line. When you hover over it with the mouse, you get warning CS4014 ("Because this call is not awaited, execution of the current method continues before the call is completed"). This should explain why "ending" is printed before the tasks have finished.
EDIT 2:
If you have a collection of parameters and want to call an async method once per parameter, you can also do it with a Select call in one line:
static async Task DatabaseCallsAsync()
{
// List of input parameters
List<int> inputParameters = new List<int> {1,2,3,4,5};
await Task.WhenAll(inputParameters.Select(x => DatabaseCallAsync($"Task {x}")));
}
static async Task DatabaseCallAsync(string taskName)
{
Console.WriteLine($"{taskName}: start");
await Task.Delay(3000);
Console.WriteLine($"{taskName}: finish");
}
The last part is similar to a previous answer

OP here. I'm going to leave the answer by Mong Zhu marked as correct, as it led me to the solution. But I also want to share the final result here, which includes excellent feedback in the comments from juharr. Here's what I came up with:
class Program
{
static void Main(string[] args)
{
Console.WriteLine("starting");
DatabaseCallsAsync().Wait();
Console.WriteLine("ending"); // Must not fire until all database calls are complete.
Console.Read();
}
static async Task DatabaseCallsAsync()
{
// This is one way to do it...
var tasks = new List<Task>();
for (int i = 0; i < 3; i++)
{
tasks.Add(DatabaseCallAsync($"Task {i}"));
}
await Task.WhenAll(tasks.ToArray());
// This is another. Same result...
List<int> inputParameters = new List<int> { 1, 2, 3, 4, 5 };
await Task.WhenAll(inputParameters.Select(x => DatabaseCallAsync($"Task {x}")));
}
static async Task DatabaseCallAsync(string taskName)
{
Console.WriteLine($"{taskName}: start");
await Task.Delay(3000);
Console.WriteLine($"{taskName}: finish");
}
}
Here's the result:
starting
Task 0: start
Task 1: start
Task 2: start
Task 2: finish
Task 0: finish
Task 1: finish
ending

Execute list of async tasks

I have a list of async functions I want to execute in order. When I run the following code I get the output:
Task 1 before
Task 2 before
Finished tasks
Why are my async functions not being awaited correctly?
[Test]
public async Task AsyncTaskList()
{
var data = "I'm data";
var tasks = new List<Func<object, Task>>() {Task1, Task2};
tasks.ForEach(async task =>
{
await task(data);
});
Debug.WriteLine("Finished tasks");
}
private static async Task Task1(object data)
{
Debug.WriteLine("Task 1 before");
await Task.Delay(1000);
Debug.WriteLine("Task 1 after");
}
private static async Task Task2(object data)
{
Debug.WriteLine("Task 2 before");
await Task.Delay(1000);
Debug.WriteLine("Task 2 after");
}

Because the await inside your ForEach delegate actually completes after the method exits. Change it to an actual foreach loop and awaiting will work as expected.
ForEach has no specific handling for Func<Task> (few delegate-accepting methods in the Base Class Library do, and you should note that they will almost invariably return a Task themselves). List<T>.ForEach takes an Action<T>, so your async lambda becomes an async void method whose completion nobody can observe. ForEach will only run the synchronous portion of your lambda, which is the portion preceding the first await of a Task that does not complete synchronously (Task.Delay in your case). This is why you're seeing the "before" messages popping up at the expected time. As soon as your delegate hits await Task.Delay, the rest of your lambda is scheduled to run sometime in the future and ForEach moves on to the next item in the list. The scheduled continuations will then run unobserved and complete later.
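The suggested change to a plain foreach loop could look like the sketch below. It records messages in a list instead of writing to Debug so that the order is easy to inspect; SequentialAsyncDemo, RunInOrderAsync, Step1Async and Step2Async are illustrative names, not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SequentialAsyncDemo
{
    // Awaits each function before starting the next one.
    public static async Task<List<string>> RunInOrderAsync()
    {
        var log = new List<string>();
        var steps = new List<Func<List<string>, Task>> { Step1Async, Step2Async };

        foreach (Func<List<string>, Task> step in steps)
        {
            await step(log); // the loop body is part of the async method
        }

        log.Add("Finished tasks");
        return log;
    }

    private static async Task Step1Async(List<string> log)
    {
        log.Add("Task 1 before");
        await Task.Delay(50);
        log.Add("Task 1 after");
    }

    private static async Task Step2Async(List<string> log)
    {
        log.Add("Task 2 before");
        await Task.Delay(50);
        log.Add("Task 2 after");
    }
}
```

With this version the log should read "Task 1 before", "Task 1 after", "Task 2 before", "Task 2 after", "Finished tasks", because the enclosing method does not continue until each awaited step has completed.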

Task.WaitAll returns before all tasks have completed

WaitAll is not working: I get Done before the tasks are completed.
I have some groups. Each group contains items, and I want each item to be processed in a separate task.
static void Main(string[] args)
{
List<Task> TaskList = new List<Task>();
foreach (string group in groups)
{
Task task = new Task(() => { scan_group(group, timeout); });
task.Start();
TaskList.Add(task);
}
Task.WaitAll(TaskList.ToArray());
Console.WriteLine("- Done !" );
}
public static void scan_group (){
Task.Factory.StartNew(() => Parallel.ForEach<string>(items, x =>
{
scan(x, y);
}));
}

All your scan_group does is start a task without waiting for it to complete, or returning it to be waited for outside.
So you only wait for the internal tasks to be created, you don't wait for them to complete. That's why you get Done before your internal tasks run.
If you want to wait for the scan_group tasks, return them and store them in the list instead of creating tasks using the task constructor. For example:
foreach (string group in groups)
{
TaskList.Add(Task.Factory.StartNew(() => Parallel.ForEach<string>(items, x => scan(x, y))));
}
Task.WaitAll(TaskList.ToArray());
Note: Using the Task constructor directly is almost never the right solution. Also, Task.Run is preferable to Task.Factory.StartNew if you're on .NET 4.5 or above.

This is happening because you are not awaiting any tasks. Task.WaitAll does wait until the tasks that call scan_group() are done, but those tasks finish as soon as scan_group() returns. Your scan_group() method should therefore be an async method that returns a Task.
Inside this method you can then use await on any task you create there. At the moment you are only starting a new task in the scan_group() method, but you do not await it. The task is started on another thread, while the thread that started it moves on and leaves the method. The calling method sees this as your task being done.
Some implementation:
Declaration of scan_group():
public static async Task scan_group()
{
// Create one task per item inside this method.
List<Task> itemTasks = new List<Task>();
// Assumes scan is synchronous, as in the question, so each call is wrapped in Task.Run.
items.ForEach(x => itemTasks.Add(Task.Run(() => scan(x, y))));
await Task.WhenAll(itemTasks);
}
The call site must then add the task returned by scan_group to TaskList directly instead of wrapping the call in new Task(...), which would discard the returned task.

Running multiple async tasks and waiting for them all to complete

I need to run multiple async tasks in a console application, and wait for them all to complete before further processing.
There's many articles out there, but I seem to get more confused the more I read. I've read and understand the basic principles of the Task library, but I'm clearly missing a link somewhere.
I understand that it's possible to chain tasks so that they start after another completes (which is pretty much the scenario for all the articles I've read), but I want all my Tasks running at the same time, and I want to know once they're all completed.
What's the simplest implementation for a scenario like this?

Both answers didn't mention the awaitable Task.WhenAll:
var task1 = DoWorkAsync();
var task2 = DoMoreWorkAsync();
await Task.WhenAll(task1, task2);
The main difference between Task.WaitAll and Task.WhenAll is that the former will block (similar to using Wait on a single task) while the latter will not and can be awaited, yielding control back to the caller until all tasks finish.
More so, exception handling differs:
Task.WaitAll:
At least one of the Task instances was canceled -or- an exception was thrown during the execution of at least one of the Task instances. If a task was canceled, the AggregateException contains an OperationCanceledException in its InnerExceptions collection.
Task.WhenAll:
If any of the supplied tasks completes in a faulted state, the returned task will also complete in a Faulted state, where its exceptions will contain the aggregation of the set of unwrapped exceptions from each of the supplied tasks.
If none of the supplied tasks faulted but at least one of them was canceled, the returned task will end in the Canceled state.
If none of the tasks faulted and none of the tasks were canceled, the resulting task will end in the RanToCompletion state.
If the supplied array/enumerable contains no tasks, the returned task will immediately transition to a RanToCompletion state before it's returned to the caller.
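The difference in exception handling can be observed with two tasks that have already failed. This is a small sketch; WhenAllVersusWaitAllDemo and its members are hypothetical names. It also shows a detail that the quoted documentation does not spell out: awaiting the task returned by Task.WhenAll rethrows only the first exception, while the full set remains available on the task's Exception property.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllVersusWaitAllDemo
{
    // Creates a task that is already in the Faulted state.
    private static Task FailAsync(string message) =>
        Task.FromException(new InvalidOperationException(message));

    // Blocking wait: failures arrive wrapped in an AggregateException.
    public static int CountWaitAllErrors()
    {
        try
        {
            Task.WaitAll(FailAsync("first"), FailAsync("second"));
            return 0;
        }
        catch (AggregateException ex)
        {
            return ex.InnerExceptions.Count;
        }
    }

    // Awaiting: only the first exception is rethrown; the combined task
    // still carries every exception in its Exception property.
    public static async Task<(string Thrown, int Total)> InspectWhenAllErrorsAsync()
    {
        Task all = Task.WhenAll(FailAsync("first"), FailAsync("second"));
        try
        {
            await all;
            return ("none", 0);
        }
        catch (InvalidOperationException ex)
        {
            return (ex.Message, all.Exception.InnerExceptions.Count);
        }
    }
}
```

CountWaitAllErrors() should return 2, and InspectWhenAllErrorsAsync() should return ("first", 2). Keeping a reference to the combined task, as done here with the all variable, is what makes the remaining exceptions reachable after the await.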

You could create many tasks like:
List<Task> TaskList = new List<Task>();
foreach(...)
{
var LastTask = new Task(SomeFunction);
LastTask.Start();
TaskList.Add(LastTask);
}
Task.WaitAll(TaskList.ToArray());

You can use WhenAll, which returns an awaitable Task, or WaitAll, which returns void and blocks further code execution (similar to Thread.Sleep) until all tasks are completed, canceled or faulted.
The two methods compare as follows:
Any of the supplied tasks completes in a faulted state
    WhenAll: A task in the faulted state is returned. Its exceptions contain the aggregation of the set of unwrapped exceptions from each of the supplied tasks.
    WaitAll: An AggregateException is thrown.
None of the supplied tasks faulted but at least one of them was canceled
    WhenAll: The returned task ends in the TaskStatus.Canceled state.
    WaitAll: An AggregateException is thrown which contains an OperationCanceledException in its InnerExceptions collection.
An empty list was given
    WhenAll: The returned task immediately transitions to the TaskStatus.RanToCompletion state before it is returned to the caller.
    WaitAll: The call returns immediately, since there is nothing to wait for. (Both methods throw an ArgumentException only when the collection contains a null task.)
Blocking
    WhenAll: Doesn't block the current thread.
    WaitAll: Blocks the current thread.
Example
var tasks = new Task[] {
TaskOperationOne(),
TaskOperationTwo()
};
Task.WaitAll(tasks);
// or
await Task.WhenAll(tasks);
If you want to run the tasks in a particular/specific order you can get inspiration from this answer.

The best option I've seen is the following extension method:
public static Task ForEachAsync<T>(this IEnumerable<T> sequence, Func<T, Task> action) {
return Task.WhenAll(sequence.Select(action));
}
Call it like this:
await sequence.ForEachAsync(item => item.SomethingAsync(blah));
Or with an async lambda:
await sequence.ForEachAsync(async item => {
var more = await GetMoreAsync(item);
await more.FrobbleAsync();
});

Yet another answer...but I usually find myself in a case, when I need to load data simultaneously and put it into variables, like:
var cats = new List<Cat>();
var dog = new Dog();
var loadDataTasks = new Task[]
{
Task.Run(async () => cats = await LoadCatsAsync()),
Task.Run(async () => dog = await LoadDogAsync())
};
try
{
await Task.WhenAll(loadDataTasks);
}
catch (Exception ex)
{
// handle exception
}

Do you want to chain the Tasks, or can they be invoked in a parallel manner?
For chaining
Just do something like
Task.Run(...).ContinueWith(...).ContinueWith(...).ContinueWith(...);
Task.Factory.StartNew(...).ContinueWith(...).ContinueWith(...).ContinueWith(...);
and don't forget to check the previous Task instance in each ContinueWith as it might be faulted.
For the parallel manner
The most simple method I came across: Parallel.Invoke
Otherwise there's Task.WaitAll or you can even use WaitHandles for doing a countdown to zero actions left (wait, there's a new class: CountdownEvent), or ...

This is how I do it with an array Func<>:
var tasks = new Func<Task>[]
{
() => myAsyncWork1(),
() => myAsyncWork2(),
() => myAsyncWork3()
};
await Task.WhenAll(tasks.Select(task => task()).ToArray()); //Async
Task.WaitAll(tasks.Select(task => task()).ToArray()); //Or use WaitAll for Sync

I prepared a piece of code to show you how to use the task for some of these scenarios.
// method to run tasks in a parallel
public async Task RunMultipleTaskParallel(Task[] tasks) {
await Task.WhenAll(tasks);
}
// method to await tasks one by one, in array order
// (assumes the tasks are already running, so only the awaits are sequential)
public async Task RunMultipleTaskOneByOne(Task[] tasks)
{
for (int i = 0; i < tasks.Length; i++)
await tasks[i];
}
// method to run i task in parallel
public async Task RunMultipleTaskParallel(Task[] tasks, int i)
{
var countTask = tasks.Length;
var remainTasks = 0;
do
{
int toTake = (countTask < i) ? countTask : i;
var limitedTasks = tasks.Skip(remainTasks)
.Take(toTake);
remainTasks += toTake;
await RunMultipleTaskParallel(limitedTasks.ToArray());
} while (remainTasks < countTask);
}
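Because a Task[] holds tasks that have usually been started already, batching the array limits how many are awaited together rather than how many actually run at once. If the goal is to cap real concurrency, one alternative (a different technique from the batching above, shown here only as a sketch) is to pass task factories and gate them with a SemaphoreSlim. ThrottledRunner, RunAsync and MeasurePeakAsync are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledRunner
{
    // Starts each work item only after acquiring the semaphore, so at most
    // maxParallel items are between WaitAsync and Release at any time.
    public static async Task RunAsync(IEnumerable<Func<Task>> workItems, int maxParallel)
    {
        using var gate = new SemaphoreSlim(maxParallel);

        List<Task> running = workItems.Select(async work =>
        {
            await gate.WaitAsync();
            try
            {
                await work();
            }
            finally
            {
                gate.Release();
            }
        }).ToList(); // materialize so that every item is started exactly once

        await Task.WhenAll(running);
    }

    // Illustration helper: reports the highest number of work items that
    // were observed running at the same time.
    public static async Task<int> MeasurePeakAsync(int itemCount, int maxParallel)
    {
        int current = 0, peak = 0;
        var sync = new object();

        Func<Task> work = async () =>
        {
            lock (sync) { current++; peak = Math.Max(peak, current); }
            await Task.Delay(20);
            lock (sync) { current--; }
        };

        await RunAsync(Enumerable.Repeat(work, itemCount), maxParallel);
        return peak;
    }
}
```

For example, MeasurePeakAsync(6, 2) should report a peak of 2: the first two items pass the gate immediately and the remaining four wait until a slot is released.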

There should be a more succinct solution than the accepted answer. It shouldn't take three steps to run multiple tasks simultaneously and get their results.
1. Create tasks
2. await Task.WhenAll(tasks)
3. Get task results (e.g., task1.Result)
Here's a method that cuts this down to two steps:
public async Task<Tuple<T1, T2>> WhenAllGeneric<T1, T2>(Task<T1> task1, Task<T2> task2)
{
await Task.WhenAll(task1, task2);
return Tuple.Create(task1.Result, task2.Result);
}
You can use it like this:
var taskResults = await WhenAllGeneric(DoWorkAsync(), DoMoreWorkAsync());
var DoWorkResult = taskResults.Item1;
var DoMoreWorkResult = taskResults.Item2;
This removes the need for the temporary task variables. The problem with using this is that while it works for two tasks, you'd need to update it for three tasks, or any other number of tasks. Also, it doesn't work well if one of the tasks doesn't return anything. Really, the .NET library should provide something that can do this.

If you're using the async/await pattern, you can run several tasks in parallel like this:
public async Task DoSeveralThings()
{
// Start all the tasks
// (both methods must return Task<T> for the awaits below to yield results)
var first = DoFirstThingAsync();
var second = DoSecondThingAsync();
// Then wait for them to complete
var firstResult = await first;
var secondResult = await second;
}
