I have this example code:
static void Main(string[] args) {
var t1 = Task.Run(async () => {
Console.WriteLine("Putting in fake processing 1.");
await Task.Delay(300);
Console.WriteLine("Fake processing finished 1. ");
});
var t2 = t1.ContinueWith(async (c) => {
Console.WriteLine("Putting in fake processing 2.");
await Task.Delay(200);
Console.WriteLine("Fake processing finished 2.");
});
var t3 = t2.ContinueWith(async (c) => {
Console.WriteLine("Putting in fake processing 3.");
await Task.Delay(100);
Console.WriteLine("Fake processing finished 3.");
});
Console.ReadLine();
}
The console output baffles me:
Putting in fake processing 1.
Fake processing finished 1.
Putting in fake processing 2.
Putting in fake processing 3.
Fake processing finished 3.
Fake processing finished 2.
I am trying to chain the tasks so they execute one after another, what am I doing wrong? And I can't use await, this is just example code, in reality I am queueing incoming tasks (some asynchronous, some not) and want to execute them in the same order they came in but with no parallelism, ContinueWith seemed better than creating a ConcurrentQueue and handling everythning myself, but it just doesn't work...
Take a look at the type of t2. It's a Task<Task>. t2 will be completed when it finishes starting the task that does the actual work not when that work actually finishes.
The smallest change to your code to get it to work would be to add an unwrap after both your second and third calls to ContinueWith, so that you get out the task that represents the completion of your work.
The more idiomatic solution would be to simply remove the ContinueWith calls entirely and just use await to add continuations to tasks.
Interestingly enough, you would see the same behavior for t1 if you used Task.Factory.StartNew, but Task.Run is specifically designed to work with async lambdas and actually internally unwraps all Action<Task> delegates to return the result of the task returned, rather than a task that represents starting that task, which is why you don't need to unwrap that task.
in reality I am queueing incoming tasks (some asynchronous, some not) and want to execute them in the same order they came in but with no parallelism
You probably want to use TPL Dataflow for that. Specifically, ActionBlock.
var block = new ActionBlock<object>(async item =>
{
// Handle synchronous item
var action = item as Action;
if (action != null)
action();
// Handle asynchronous item
var func = item as Func<Task>;
if (func != null)
await func();
});
// To queue a synchronous item
Action synchronous = () => Thread.Sleep(1000);
block.Post(synchronous);
// To queue an asynchronous item
Func<Task> asynchronous = async () => { await Task.Delay(1000); };
blockPost(asynchronous);
Related
How does Task.WhenAll works under the hood? Does it create separate thread which finished once all of tasks receive callback about finish. I have a suggestion, that under the hood it creates new thread and pass work to system drivers for each of the task and waits for them at the end, but not sure about is it correct or not?
No, Task.WhenAll doesn't create a thread. It is possible that some of the element tasks passed to Task.WhenAll have created threads (but optimally they would not). Task.WhenAll itself just calls ContinueWith on the element tasks, passing a piece of code that checks the other task states. There is no "wait".
Here is an example of how Task.WhenAll may be implemented. (It is not the Microsoft code)
Task MyWhenAll(IEnumerable<Task> tasks)
{
var a = tasks.ToArray();
var tcs = new TaskCompletionSource<bool>();
Array.ForEach(a, WatchTask);
return tcs.Task;
async void WatchTask(Task t)
{
try {
await t;
}
catch {}
if (a.All(element => element.IsCompleted)) {
if (a.Any(element => element.IsFaulted))
// omitted logic for adding each individual exception
// to the aggregate
tcs.TrySetException(new AggregateException());
else
tcs.TrySetResult(true);
}
}
}
I would like to ask expert developers in C#. I have three recurrent tasks that my program needs to do. Task 2 depends on task 1 and task 3 depends on task 2, but task 1 doesn't need to wait for the other two tasks to finish in order to start again (the program is continuously running). Since each task takes some time, I would like to run each task in one thread or a C# Task. Once task 1 finishes task 2 starts and task 1 starts again ... etc.
I'm not sure what is the best way to implement this. I hope someone can guide me on this.
One way to achieve this is using something called the the Task Parallel Library. This provides a set of classes that allow you to arrange your tasks into "blocks". You create a method that does A, B and C sequentially, then TPL will take care of running multiple invocations of that method simultaneously. Here's a small example:
async Task Main()
{
var actionBlock = new ActionBlock<int>(DoTasksAsync, new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 2 // This is the number of simultaneous executions of DoTasksAsync that will be run
};
await actionBlock.SendAsync(1);
await actionBlock.SendAsync(2);
actionBlock.Complete();
await actionBlock.Completion;
}
async Task DoTasksAsync(int input)
{
await DoTaskAAsync();
await DoTaskBAsync();
await DoTaskCAsync();
}
I would probably use some kind of queue pattern.
I am not sure what the requirements for if task 1 is threadsafe or not, so I will keep it simple:
Task 1 is always executing. As soon as it finished, it posts a message on some queue and starts over.
Task 2 is listening to the queue. Whenever a message is available, it starts working on it.
Whenever task 2 finishes working, it calls task 3, so that it can do it's work.
As one of the comments mentioned, you should probably be able to use async/await successfully in your code. Especially between task 2 and 3. Note that task 1 can be run in parallel to task 2 and 3, since it is not dependent on any of the other task.
You could use the ParallelLoop method below. This method starts an asynchronous workflow, where the three tasks are invoked in parallel to each other, but sequentially to themselves. So you don't need to add synchronization inside each task, unless some task produces global side-effects that are visible from some other task.
The tasks are invoked on the ThreadPool, with the Task.Run method.
/// <summary>
/// Invokes three actions repeatedly in parallel on the ThreadPool, with the
/// action2 depending on the action1, and the action3 depending on the action2.
/// Each action is invoked sequentially to itself.
/// </summary>
public static async Task ParallelLoop<TResult1, TResult2>(
Func<TResult1> action1,
Func<TResult1, TResult2> action2,
Action<TResult2> action3,
CancellationToken cancellationToken = default)
{
// Arguments validation omitted
var task1 = Task.FromResult<TResult1>(default);
var task2 = Task.FromResult<TResult2>(default);
var task3 = Task.CompletedTask;
try
{
int counter = 0;
while (true)
{
counter++;
var result1 = await task1.ConfigureAwait(false);
cancellationToken.ThrowIfCancellationRequested();
task1 = Task.Run(action1); // Restart the task1
if (counter <= 1) continue; // In the first loop result1 is undefined
var result2 = await task2.ConfigureAwait(false);
cancellationToken.ThrowIfCancellationRequested();
task2 = Task.Run(() => action2(result1)); // Restart the task2
if (counter <= 2) continue; // In the second loop result2 is undefined
await task3.ConfigureAwait(false);
cancellationToken.ThrowIfCancellationRequested();
task3 = Task.Run(() => action3(result2)); // Restart the task3
}
}
finally
{
// Prevent fire-and-forget
Task allTasks = Task.WhenAll(task1, task2, task3);
try { await allTasks.ConfigureAwait(false); } catch { allTasks.Wait(); }
// Propagate all errors in an AggregateException
}
}
There is an obvious pattern in the implementation, that makes it trivial to add overloads having more than three actions. Each added action will require its own generic type parameter (TResult3, TResult4 etc).
Usage example:
var cts = new CancellationTokenSource();
Task loopTask = ParallelLoop(() =>
{
// First task
Thread.Sleep(1000); // Simulates synchronous work
return "OK"; // The result that is passed to the second task
}, result =>
{
// Second task
Thread.Sleep(1000); // Simulates synchronous work
return result + "!"; // The result that is passed to the third task
}, result =>
{
// Third task
Thread.Sleep(1000); // Simulates synchronous work
}, cts.Token);
In case any of the tasks fails, the whole loop will stop (with the loopTask.Exception containing the error). Since the tasks depend on each other, recovering from a single failed task is not possible¹. What you could do is to execute the whole loop through a Polly Retry policy, to make sure that the loop will be reincarnated in case of failure. If you are unfamiliar with the Polly library, you could use the simple and featureless RetryUntilCanceled method below:
public static async Task RetryUntilCanceled(Func<Task> action,
CancellationToken cancellationToken)
{
while (true)
{
cancellationToken.ThrowIfCancellationRequested();
try { await action().ConfigureAwait(false); }
catch { if (cancellationToken.IsCancellationRequested) throw; }
}
}
Usage:
Task loopTask = RetryUntilCanceled(() => ParallelLoop(() =>
{
//...
}, cts.Token), cts.Token);
Before exiting the process you are advised to Cancel() the CancellationTokenSource and Wait() (or await) the loopTask, in order for the loop to terminate gracefully. Otherwise some tasks may be aborted in the middle of their work.
¹ It is actually possible, and probably preferable, to execute each individual task through a Polly Retry policy. The parallel loop will be suspended until the failed task is retried successfully.
Suppose I have the following code (just for learninig purposes):
static async Task Main(string[] args)
{
var results = new ConcurrentDictionary<string, int>();
var tasks = Enumerable.Range(0, 100).Select(async index =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
});
await Task.WhenAll(tasks);
Console.WriteLine($"Items in dictionary {results.Count}");
}
static async Task<int> DoAsyncJob(int i)
{
// simulate some I/O bound operation
await Task.Delay(100);
return i * 10;
}
I want to know what will be the difference if I make it as follows:
var tasks = Enumerable.Range(0, 100)
.Select(index => Task.Run(async () =>
{
var res = await DoAsyncJob(index);
results.TryAdd(index.ToString(), res);
}));
I get the same results in both cases. But does code executes similarly?
Task.Run is there to execute CPU bound, synchronous, operations in a thread pool thread. As it is the operation you're running is already asynchronous, so using Task.Run means you're scheduling work to run in a thread pool thread, and that work is merely starting an asynchronous operation, which then completes almost immediately, and goes off to do whatever asynchronous work it has to do without blocking that thread pool thread. So by using Task.Run you're waiting to schedule work in the thread pool, but then not actually don't any meaningful work. You're better off just starting the asynchronous operation in the current thread.
The only exception would be if DoAsyncJob were implemented improperly and for some reason wasn't actually asynchronous, contrary to its name and signature, and actually did a lot of synchronous work before returning. But if it is doing that, you should just fix that buggy method, rather than using Task.Run to call it.
On a side note, there's no reason to have a ConcurrentDictionary to collect the results here. Task.WhenAll returns a collection of the results of all of the tasks you've executed. Just use that. Now you don't even need a method to wrap your asynchronous method and process the result in any special way, simplifying the code further:
var tasks = Enumerable.Range(0, 100).Select(DoAsyncJob);
var results = await Task.WhenAll(tasks);
Console.WriteLine($"Items in results {results.Count}");
Yes, both cases execute similarly. Actually, they execute in exactly the same way.
Lets say i have the following code for example:
private async Task ManageClients()
{
for (int i =0; i < listClients.Count; i++)
{
if (list[i] == 0)
await DoSomethingWithClientAsync();
else
await DoOtherThingAsync();
}
DoOtherWork();
}
My questions are:
1. Will the for() continue and process other clients on the list?, or it
will await untill it finishes one of the tasks.
2. Is even a good practice to use async/await inside a loop?
3. Can it be done in a better way?
I know it was a really simple example, but I'm trying to imagine what would happen if that code was a server with thousands of clients.
In your code example, the code will "block" when the loop reaches await, meaning the other clients will not be processed until the first one is complete. That is because, while the code uses asynchronous calls, it was written using a synchronous logic mindset.
An asynchronous approach should look more like this:
private async Task ManageClients()
{
var tasks = listClients.Select( client => DoSomethingWithClient() );
await Task.WhenAll(tasks);
DoOtherWork();
}
Notice there is only one await, which simultaneously awaits all of the clients, and allows them to complete in any order. This avoids the situation where the loop is blocked waiting for the first client.
If a thread that is executing an async function is calling another async function, this other function is executed as if it was not async until it sees a call to a third async function. This third async function is executed also as if it was not async.
This goes on, until the thread sees an await.
Instead of really doing nothing, the thread goes up the call stack, to see if the caller was not awaiting for the result of the called function. If not, the thread continues the statements in the caller function until it sees an await. The thread goes up the call stack again to see if it can continue there.
This can be seen in the following code:
var taskDoSomething = DoSomethingAsync(...);
// because we are not awaiting, the following is done as soon as DoSomethingAsync has to await:
DoSomethingElse();
// from here we need the result from DoSomethingAsync. await for it:
var someResult = await taskDoSomething;
You can even call several sub-procedures without awaiting:
var taskDoSomething = DoSomethingAsync(...);
var taskDoSomethingElse = DoSomethingElseAsync(...);
// we are here both tasks are awaiting
DoSomethingElse();
Once you need the results of the tasks, if depends what you want to do with them. Can you continue processing if one task is completed but the other is not?
var someResult = await taskDoSomething;
ProcessResult(someResult);
var someOtherResult = await taskDoSomethingelse;
ProcessBothResults(someResult, someOtherResult);
If you need the result of all tasks before you can continue, use Task.WhenAll:
Task[] allTasks = new Task[] {taskDoSomething, taskDoSomethingElse);
await Task.WhenAll(allTasks);
var someResult = taskDoSomething.Result;
var someOtherResult = taskDoSomethingElse.Result;
ProcessBothResults(someResult, someOtherResult);
Back to your question
If you have a sequence of items where you need to start awaitable tasks, it depends on whether the tasks need the result of other tasks or not. In other words can task[2] start if task[1] has not been completed yet? Do Task[1] and Task[2] interfere with each other if they run both at the same time?
If they are independent, then start all Tasks without awaiting. Then use Task.WhenAll to wait until all are finished. The Task scheduler will take care that not to many tasks will be started at the same time. Be aware though, that starting several tasks could lead to deadlocks. Check carefully if you need critical sections
var clientTasks = new List<Task>();
foreach(var client in clients)
{
if (list[i] == 0)
clientTasks.Add(DoSomethingWithClientAsync());
else
clientTasks.Add(DoOtherThingAsync());
}
// if here: all tasks started. If desired you can do other things:
AndNowForSomethingCompletelyDifferent();
// later we need the other tasks to be finished:
var taskWaitAll = Task.WhenAll(clientTasks);
// did you notice we still did not await yet, we are still in business:
MontyPython();
// okay, done with frolicking, we need the results:
await taskWaitAll;
DoOtherWork();
This was the scenario where all Tasks where independent: no task needed the other to be completed before it could start. However if you need Task[2] to be completed before you can start Task[3] you should await:
foreach(var client in clients)
{
if (list[i] == 0)
await DoSomethingWithClientAsync());
else
await DoOtherThingAsync();
}
I have a slightly complex requirement of performing some tasks in parallel, and having to wait for some of them to finish before continuing. Now, I am encountering unexpected behavior, when I have a number of tasks, that I want executed in parallel, but inside a ContinueWith handler. I have whipped up a small sample to illustrate the problem:
var task1 = Task.Factory.StartNew(() =>
{
Console.WriteLine("11");
Thread.Sleep(1000);
Console.WriteLine("12");
}).ContinueWith(async t =>
{
Console.WriteLine("13");
var innerTasks = new List<Task>();
for (var i = 0; i < 10; i++)
{
var j = i;
innerTasks.Add(Task.Factory.StartNew(() =>
{
Console.WriteLine("1_" + j + "_1");
Thread.Sleep(500);
Console.WriteLine("1_" + j + "_2");
}));
}
await Task.WhenAll(innerTasks.ToArray());
//Task.WaitAll(innerTasks.ToArray());
Thread.Sleep(1000);
Console.WriteLine("14");
});
var task2 = Task.Factory.StartNew(() =>
{
Console.WriteLine("21");
Thread.Sleep(1000);
Console.WriteLine("22");
}).ContinueWith(t =>
{
Console.WriteLine("23");
Thread.Sleep(1000);
Console.WriteLine("24");
});
Console.WriteLine("1");
await Task.WhenAll(task1, task2);
Console.WriteLine("2");
The basic pattern is:
- Task 1 should be executed in parallel with Task 2.
- Once the first part of part 1 is done, it should do some more things in parallel. I want to complete, once everything is done.
I expect the following result:
1 <- Start
11 / 21 <- The initial task start
12 / 22 <- The initial task end
13 / 23 <- The continuation task start
Some combinations of "1_[0..9]_[1..2]" and 24 <- the "inner" tasks of task 1 + the continuation of task 2 end
14 <- The end of the task 1 continuation
2 <- The end
Instead, what happens, is that the await Task.WhenAll(innerTasks.ToArray()); does not "block" the continuation task from completing. So, the inner tasks execute after the outer await Task.WhenAll(task1, task2); has completed. The result is something like:
1 <- Start
11 / 21 <- The initial task start
12 / 22 <- The initial task end
13 / 23 <- The continuation task start
Some combinations of "1_[0..9]_[1..2]" and 24 <- the "inner" tasks of task 1 + the continuation of task 2 end
2 <- The end
Some more combinations of "1_[0..9]_[1..2]" <- the "inner" tasks of task 1
14 <- The end of the task 1 continuation
If, instead, I use Task.WaitAll(innerTasks.ToArray()), everything seems to work as expected. Of course, I would not want to use WaitAll, so I won't block any threads.
My questions are:
Why is this unexpected behavior occuring?
How can I remedy the situation without blocking any threads?
Thanks a lot in advance for any pointers!
You're using the wrong tools. Instead of StartNew, use Task.Run. Instead of ContinueWith, use await:
var task1 = Task1();
var task2 = Task2();
Console.WriteLine("1");
await Task.WhenAll(task1, task2);
Console.WriteLine("2");
private async Task Task1()
{
await Task.Run(() =>
{
Console.WriteLine("11");
Thread.Sleep(1000);
Console.WriteLine("12");
});
Console.WriteLine("13");
var innerTasks = new List<Task>();
for (var i = 0; i < 10; i++)
{
innerTasks.Add(Task.Run(() =>
{
Console.WriteLine("1_" + i + "_1");
Thread.Sleep(500);
Console.WriteLine("1_" + i + "_2");
}));
await Task.WhenAll(innerTasks);
}
Thread.Sleep(1000);
Console.WriteLine("14");
}
private async Task Task2()
{
await Task.Run(() =>
{
Console.WriteLine("21");
Thread.Sleep(1000);
Console.WriteLine("22");
});
Console.WriteLine("23");
Thread.Sleep(1000);
Console.WriteLine("24");
}
Task.Run and await are superior here because they correct a lot of unexpected behavior in StartNew/ContinueWith. In particular, asynchronous delegates and (for Task.Run) always using the thread pool.
I have more detailed info on my blog regarding why you shouldn't use StartNew and why you shouldn't use ContinueWith.
As noted in the comments, what you're seeing is normal. The Task returned by ContinueWith() completes when the delegate passed to and invoked by ContinueWith() finishes executing. This happens the first time the anonymous method uses the await statement, and the delegate returns a Task object itself that represents the eventual completion of the entire anonymous method.
Since you are only waiting on the ContinueWith() task, and this task only represents the availability of the task that represents the anonymous method, not the completion of that task, your code doesn't wait.
From your example, it's not clear what the best fix is. But if you make this small change, it will do what you want:
await Task.WhenAll(await task1, task2);
I.e. in the WhenAll() call, don't wait on the ContinueWith() task itself, but rather on the task that task will eventually return. Use await here to avoid blocking the thread while you wait for that task to be available.
When using async methods/lambdas with StartNew, you either wait on the returned task and the contained task:
var task = Task.Factory.StartNew(async () => { /* ... */ });
task.Wait();
task.Result.Wait();
// consume task.Result.Result
Or you use the extension method Unwrap on the result of StartNew and wait on the task it returns.
var task = Task.Factory.StartNew(async () => { /* ... */ })
.Unwrap();
task.Wait();
// consume task.Result
The following discussion goes along the line that Task.Factory.StartNew and ContinueWith should be avoided in specific cases, such as when you don't provide creation or continuation options or when you don't provide a task scheduler.
I don't agree that Task.Factory.StartNew shouldn't be used, I agree that you should use (or consider using) Task.Run wherever you use a Task.Factory.StartNew method overload that doesn't take TaskCreationOptions or a TaskScheduler.
Note that this only applies to the default Task.Factory. I've used custom task factories where I chose to use the StartNew overloads without options and task scheduler, because I configured the factories specific defaults for my needs.
Likewise, I don't agree that ContinueWith shouldn't be used, I agree that you should use (or consider using) async/await wherever you use a ContinueWith method overload that doesn't take TaskContinuationOptions or a TaskScheduler.
For instance, up to C# 5, the most practical way to workaround the limitation of await not being supported in catch and finally blocks is to use ContinueWith.
C# 6:
try
{
return await something;
}
catch (SpecificException ex)
{
await somethingElse;
// throw;
}
finally
{
await cleanup;
}
Equivalent before C# 6:
return await something
.ContinueWith(async somethingTask =>
{
var ex = somethingTask.Exception.InnerException as SpecificException;
if (ex != null)
{
await somethingElse;
// await somethingTask;
}
},
CancellationToken.None,
TaskContinuationOptions.DenyChildAttach | TaskContinuationOptions.NotOnRanToCompletion,
TaskScheduler.Default)
.Unwrap()
.ContinueWith(async catchTask =>
{
await cleanup;
await catchTask;
},
CancellationToken.None,
TaskContinuationOptions.DenyChildAttach,
TaskScheduler.Default)
.Unwrap();
Since, as I told, in some cases I have a TaskFactory with specific defaults, I've defined a few extension methods that take a TaskFactory, reducing the error chance of not passing one of the arguments (I known I can always forget to pass the factory itself):
public static Task ContinueWhen(this TaskFactory taskFactory, Task task, Action<Task> continuationAction)
{
return task.ContinueWith(continuationAction, taskFactory.CancellationToken, taskFactory.ContinuationOptions, taskFactory.Scheduler);
}
public static Task<TResult> ContinueWhen<TResult>(this TaskFactory taskFactory, Task task, Func<Task, TResult> continuationFunction)
{
return task.ContinueWith(continuationFunction, taskFactory.CancellationToken, taskFactory.ContinuationOptions, taskFactory.Scheduler);
}
// Repeat with argument combinations:
// - Task<TResult> task (instead of non-generic Task task)
// - object state
// - bool notOnRanToCompletion (useful in C# before 6)
Usage:
// using namespace that contains static task extensions class
var task = taskFactory.ContinueWhen(existsingTask, t => Continue(a, b, c));
var asyncTask = taskFactory.ContinueWhen(existingTask, async t => await ContinueAsync(a, b, c))
.Unwrap();
I decided not to mimic Task.Run by not overloading the same method name to unwrapping task-returning delegates, it's really not always what you want. Actually, I didn't even implement ContinueWhenAsync extension methods so you need to use Unwrap or two awaits.
Often, these continuations are I/O asynchronous operations, and the pre- and post-processing overhead should be so small that you shouldn't care if it starts running synchronously up to the first yielding point, or even if it completes synchronously (e.g. using an underlying MemoryStream or a mocked DB access). Also, most of them don't depend on a synchronization context.
Whenever you apply the Unwrap extension method or two awaits, you should check if the task falls in this category. If so, async/await is most probably a better choice than starting a task.
For asynchronous operations with a non-negligible synchronous overhead, starting a new task may be preferable. Even so, a notable exception where async/await is still a better choice is if your code is async from the start, such as an async method invoked by a framework or host (ASP.NET, WCF, NServiceBus 6+, etc.), as the overhead is your actual business. For long processing, you may consider using Task.Yield with care. One of the tenets of asynchronous code is to not be too fine grained, however, too coarse grained is just as bad: a set of heavy-duty tasks may prevent the processing of queued lightweight tasks.
If the asynchronous operation depends on a synchronization context, you can still use async/await if you're within that context (in this case, think twice or more before using .ConfigureAwait(false)), otherwise, start a new task using a task scheduler from the respective synchronization context.