How does Task.WhenAll works under the hood? Does it create separate thread which finished once all of tasks receive callback about finish. I have a suggestion, that under the hood it creates new thread and pass work to system drivers for each of the task and waits for them at the end, but not sure about is it correct or not?
No, Task.WhenAll doesn't create a thread. It is possible that some of the element tasks passed to Task.WhenAll have created threads (but optimally they would not). Task.WhenAll itself just calls ContinueWith on the element tasks, passing a piece of code that checks the other task states. There is no "wait".
Here is an example of how Task.WhenAll may be implemented. (It is not the Microsoft code)
Task MyWhenAll(IEnumerable<Task> tasks)
{
var a = tasks.ToArray();
var tcs = new TaskCompletionSource<bool>();
Array.ForEach(a, WatchTask);
return tcs.Task;
async void WatchTask(Task t)
{
try {
await t;
}
catch {}
if (a.All(element => element.IsCompleted)) {
if (a.Any(element => element.IsFaulted))
// omitted logic for adding each individual exception
// to the aggregate
tcs.TrySetException(new AggregateException());
else
tcs.TrySetResult(true);
}
}
}
Ex, the following code manually instantiates a Task and passes to a Task.WhenAll in a List<T>
public async Task Do3()
{
var task1 = new Task(async () => { await Task.Delay(2000); Console.WriteLine("########## task1"); });
var taskList = new List<Task>() { task1};
taskList[0].Start();
var taskDone = Task.WhenAll(taskList);
await taskDone;
}
without starting the Task it doesn't work, it hangs forever calling from a console app, but the below works just fine without starting it
public async Task Do3()
{
//var task1 = new Task(async () => { await Task.Delay(2000); Console.WriteLine("########## task1"); });
var taskList = new List<Task>() { SubDo1() };
//taskList[0].Start();
var taskDone = Task.WhenAll(taskList);
await taskDone;
}
public async Task SubDo1()
{
await Task.Delay(2000);
Console.WriteLine("########## task1");
}
Task is used in two completely different ways here; when you call an async method: you are starting it yourself; at this point, two things can happen:
it can run to completion (eventually) without ever reaching a truly asynchronous state, and return a completed (or faulted) task to the caller
it can reach an incomplete awaitable (in this case await Task.Delay), at which point it creates a state machine that represents the current position, schedules a completion operation on that incomplete awaitable (to do whatever comes next), and then returns an incomplete task to the caller
It is not "not started"; to return anything to the caller: we have started it. However, unlike Task.Start(), we start that work on our current thread - not an external worker thread - with other threads only getting involved based on how that incomplete awaitable schedules the completion callbacks that the compiler gives it.
This is very different to the new Task(...) scenario, where nothing is initially started. That's why they behave differently. Note also the Remarks section of the Task constructor here - it is a very niche API, and honestly: not hugely recommended.
Additionally: when you don't immediately await an async method, you're essentially going into concurrent territory (assuming the awaitable won't always complete synchronously). In some cases, this matters, and may cause threading problems re race-conditions. It shouldn't matter much in this case, though.
** I've summarised this question at the bottom with an edit **
This has been asked before but I think my circumstances are different.
I am processing multiple requests simultaneously.
This is my code to do that, it runs in a loop. I've removed a bit of code that handles the taskAllocated variable for brevity.
while (!taskAllocated)
{
lock (_lock)
{
// Find an empty slot in the task queue to insert this task
for (i = 0; i < MaxNumTasks; i++)
{
if (_taskQueue[i] == null)
{
_taskQueue[i] = Task.Run(() => Process());
_taskQueue[i].ContinueWith(ProcessCompleted);
break;
}
}
}
}
Process is a typical async Task Process() { CpuIntensiveStuff(); } method.
I've been running the above code, and it has been working fine. It multithreads nicely. Whenever an item comes in, it will find an empty slot in the task queue, and kick it off. When the task completes, the ProcessCompleted method runs, and frees up the slot.
But then I thought, shouldn't I be using await inside my Task.Run? Something like:
_taskQueue[i] = Task.Run(async () => await Process());
After thinking about it, I'm not sure. ContinueWith triggers correctly, when the task has completed, so perhaps it's not necessary.
I ask because I wanted to monitor and log how long each task takes to complete.
So Instead of Process(), I would make another method like:
async Task DoProcess()
{
var sw = Stopwatch.StartNew();
Process();
sw.Stop();
Log(sw.ElapsedMilliseconds);
}
And it occurred to me that if I did that, I wasn't sure if I'd need to await Process(); or not, in addition to not knowing if I should await inside the Task.Run()
I'm in a bit of a tizz about this. Can anyone offer guidance?
Edit:
To summarise:
If Somemethod is:
void SomeMethod() { }
Then
Task.Run(() => SomeMethod()); is great, calls SomeMethod on a new 'thread' (not technically, but you know what I mean).
However, my SomeMethod is actually:
async Task SomeMethod() { }
Do you need to do anything special with Task.Run()?
My code, I am not, I am just straight up ignoring that it's an async Task, and that seems to work:
Task.Run(() => SomeMethod()); // SomeMethod is async Task but I am ignoring that
But I'm not convinced that it a) should work or b) is a good idea. The alternative could be to do:
Task.Run(async() => await SomeMethod());
But is there any point? And this is compounded by the fact I want to really do:
Task.Run(() =>
{
someCode();
var x = startTimer();
SomeMethod();
var y = stopTimer();
someMoreCode()
});
but without await I'm not sure it will wait for somemethod to finish and the timer will be wrong.
Things become more clear if you do not use anonymous methods. For example,
Task.Run(() => Process())
is equivalent to this:
Task.Run(DoSomething);
Task DoSomething() {
return Process();
}
Whereas
Task.Run(async () => await Process())
is equivalent to this:
Task.Run(DoSomething);
async Task DoSomething() {
await Process();
}
In most cases, there is no functional difference between return SomethingThatReturnsATask() and return await SomethingThatReturnsATask(), and you usually want to return the Task directly and not use await (for reasons described here). When used inside Task.Run, things could easily go bad if the .NET team didn't have your back.
It is important to note that asynchronous methods start running on the same thread just like any other method. The magic happens at the first await that acts on an incomplete Task. At that point, await returns its own incomplete Task. That's important - it returns, with a promise to do the rest later.
This could have meant that the Task returned from Task.Run would complete whenever Process() returns a Task. And since Process() returns a Task at the first await, that would happen when it has not yet totally completed.
The .NET team has your back
That is not the case however, because Task.Run has a specific overload for when you give it a method returning a Task. And if you look at the code, it returns a Task *that is tied to the Task you return.
That means that the Task returned from Task.Run(() => Process()) will not complete until the Task returned from Process() has completed.
So your code is fine the way it is.
I have a slightly complex requirement of performing some tasks in parallel, and having to wait for some of them to finish before continuing. Now, I am encountering unexpected behavior, when I have a number of tasks, that I want executed in parallel, but inside a ContinueWith handler. I have whipped up a small sample to illustrate the problem:
var task1 = Task.Factory.StartNew(() =>
{
Console.WriteLine("11");
Thread.Sleep(1000);
Console.WriteLine("12");
}).ContinueWith(async t =>
{
Console.WriteLine("13");
var innerTasks = new List<Task>();
for (var i = 0; i < 10; i++)
{
var j = i;
innerTasks.Add(Task.Factory.StartNew(() =>
{
Console.WriteLine("1_" + j + "_1");
Thread.Sleep(500);
Console.WriteLine("1_" + j + "_2");
}));
}
await Task.WhenAll(innerTasks.ToArray());
//Task.WaitAll(innerTasks.ToArray());
Thread.Sleep(1000);
Console.WriteLine("14");
});
var task2 = Task.Factory.StartNew(() =>
{
Console.WriteLine("21");
Thread.Sleep(1000);
Console.WriteLine("22");
}).ContinueWith(t =>
{
Console.WriteLine("23");
Thread.Sleep(1000);
Console.WriteLine("24");
});
Console.WriteLine("1");
await Task.WhenAll(task1, task2);
Console.WriteLine("2");
The basic pattern is:
- Task 1 should be executed in parallel with Task 2.
- Once the first part of part 1 is done, it should do some more things in parallel. I want to complete, once everything is done.
I expect the following result:
1 <- Start
11 / 21 <- The initial task start
12 / 22 <- The initial task end
13 / 23 <- The continuation task start
Some combinations of "1_[0..9]_[1..2]" and 24 <- the "inner" tasks of task 1 + the continuation of task 2 end
14 <- The end of the task 1 continuation
2 <- The end
Instead, what happens, is that the await Task.WhenAll(innerTasks.ToArray()); does not "block" the continuation task from completing. So, the inner tasks execute after the outer await Task.WhenAll(task1, task2); has completed. The result is something like:
1 <- Start
11 / 21 <- The initial task start
12 / 22 <- The initial task end
13 / 23 <- The continuation task start
Some combinations of "1_[0..9]_[1..2]" and 24 <- the "inner" tasks of task 1 + the continuation of task 2 end
2 <- The end
Some more combinations of "1_[0..9]_[1..2]" <- the "inner" tasks of task 1
14 <- The end of the task 1 continuation
If, instead, I use Task.WaitAll(innerTasks.ToArray()), everything seems to work as expected. Of course, I would not want to use WaitAll, so I won't block any threads.
My questions are:
Why is this unexpected behavior occuring?
How can I remedy the situation without blocking any threads?
Thanks a lot in advance for any pointers!
You're using the wrong tools. Instead of StartNew, use Task.Run. Instead of ContinueWith, use await:
var task1 = Task1();
var task2 = Task2();
Console.WriteLine("1");
await Task.WhenAll(task1, task2);
Console.WriteLine("2");
private async Task Task1()
{
await Task.Run(() =>
{
Console.WriteLine("11");
Thread.Sleep(1000);
Console.WriteLine("12");
});
Console.WriteLine("13");
var innerTasks = new List<Task>();
for (var i = 0; i < 10; i++)
{
innerTasks.Add(Task.Run(() =>
{
Console.WriteLine("1_" + i + "_1");
Thread.Sleep(500);
Console.WriteLine("1_" + i + "_2");
}));
await Task.WhenAll(innerTasks);
}
Thread.Sleep(1000);
Console.WriteLine("14");
}
private async Task Task2()
{
await Task.Run(() =>
{
Console.WriteLine("21");
Thread.Sleep(1000);
Console.WriteLine("22");
});
Console.WriteLine("23");
Thread.Sleep(1000);
Console.WriteLine("24");
}
Task.Run and await are superior here because they correct a lot of unexpected behavior in StartNew/ContinueWith. In particular, asynchronous delegates and (for Task.Run) always using the thread pool.
I have more detailed info on my blog regarding why you shouldn't use StartNew and why you shouldn't use ContinueWith.
As noted in the comments, what you're seeing is normal. The Task returned by ContinueWith() completes when the delegate passed to and invoked by ContinueWith() finishes executing. This happens the first time the anonymous method uses the await statement, and the delegate returns a Task object itself that represents the eventual completion of the entire anonymous method.
Since you are only waiting on the ContinueWith() task, and this task only represents the availability of the task that represents the anonymous method, not the completion of that task, your code doesn't wait.
From your example, it's not clear what the best fix is. But if you make this small change, it will do what you want:
await Task.WhenAll(await task1, task2);
I.e. in the WhenAll() call, don't wait on the ContinueWith() task itself, but rather on the task that task will eventually return. Use await here to avoid blocking the thread while you wait for that task to be available.
When using async methods/lambdas with StartNew, you either wait on the returned task and the contained task:
var task = Task.Factory.StartNew(async () => { /* ... */ });
task.Wait();
task.Result.Wait();
// consume task.Result.Result
Or you use the extension method Unwrap on the result of StartNew and wait on the task it returns.
var task = Task.Factory.StartNew(async () => { /* ... */ })
.Unwrap();
task.Wait();
// consume task.Result
The following discussion goes along the line that Task.Factory.StartNew and ContinueWith should be avoided in specific cases, such as when you don't provide creation or continuation options or when you don't provide a task scheduler.
I don't agree that Task.Factory.StartNew shouldn't be used, I agree that you should use (or consider using) Task.Run wherever you use a Task.Factory.StartNew method overload that doesn't take TaskCreationOptions or a TaskScheduler.
Note that this only applies to the default Task.Factory. I've used custom task factories where I chose to use the StartNew overloads without options and task scheduler, because I configured the factories specific defaults for my needs.
Likewise, I don't agree that ContinueWith shouldn't be used, I agree that you should use (or consider using) async/await wherever you use a ContinueWith method overload that doesn't take TaskContinuationOptions or a TaskScheduler.
For instance, up to C# 5, the most practical way to workaround the limitation of await not being supported in catch and finally blocks is to use ContinueWith.
C# 6:
try
{
return await something;
}
catch (SpecificException ex)
{
await somethingElse;
// throw;
}
finally
{
await cleanup;
}
Equivalent before C# 6:
return await something
.ContinueWith(async somethingTask =>
{
var ex = somethingTask.Exception.InnerException as SpecificException;
if (ex != null)
{
await somethingElse;
// await somethingTask;
}
},
CancellationToken.None,
TaskContinuationOptions.DenyChildAttach | TaskContinuationOptions.NotOnRanToCompletion,
TaskScheduler.Default)
.Unwrap()
.ContinueWith(async catchTask =>
{
await cleanup;
await catchTask;
},
CancellationToken.None,
TaskContinuationOptions.DenyChildAttach,
TaskScheduler.Default)
.Unwrap();
Since, as I told, in some cases I have a TaskFactory with specific defaults, I've defined a few extension methods that take a TaskFactory, reducing the error chance of not passing one of the arguments (I known I can always forget to pass the factory itself):
public static Task ContinueWhen(this TaskFactory taskFactory, Task task, Action<Task> continuationAction)
{
return task.ContinueWith(continuationAction, taskFactory.CancellationToken, taskFactory.ContinuationOptions, taskFactory.Scheduler);
}
public static Task<TResult> ContinueWhen<TResult>(this TaskFactory taskFactory, Task task, Func<Task, TResult> continuationFunction)
{
return task.ContinueWith(continuationFunction, taskFactory.CancellationToken, taskFactory.ContinuationOptions, taskFactory.Scheduler);
}
// Repeat with argument combinations:
// - Task<TResult> task (instead of non-generic Task task)
// - object state
// - bool notOnRanToCompletion (useful in C# before 6)
Usage:
// using namespace that contains static task extensions class
var task = taskFactory.ContinueWhen(existsingTask, t => Continue(a, b, c));
var asyncTask = taskFactory.ContinueWhen(existingTask, async t => await ContinueAsync(a, b, c))
.Unwrap();
I decided not to mimic Task.Run by not overloading the same method name to unwrapping task-returning delegates, it's really not always what you want. Actually, I didn't even implement ContinueWhenAsync extension methods so you need to use Unwrap or two awaits.
Often, these continuations are I/O asynchronous operations, and the pre- and post-processing overhead should be so small that you shouldn't care if it starts running synchronously up to the first yielding point, or even if it completes synchronously (e.g. using an underlying MemoryStream or a mocked DB access). Also, most of them don't depend on a synchronization context.
Whenever you apply the Unwrap extension method or two awaits, you should check if the task falls in this category. If so, async/await is most probably a better choice than starting a task.
For asynchronous operations with a non-negligible synchronous overhead, starting a new task may be preferable. Even so, a notable exception where async/await is still a better choice is if your code is async from the start, such as an async method invoked by a framework or host (ASP.NET, WCF, NServiceBus 6+, etc.), as the overhead is your actual business. For long processing, you may consider using Task.Yield with care. One of the tenets of asynchronous code is to not be too fine grained, however, too coarse grained is just as bad: a set of heavy-duty tasks may prevent the processing of queued lightweight tasks.
If the asynchronous operation depends on a synchronization context, you can still use async/await if you're within that context (in this case, think twice or more before using .ConfigureAwait(false)), otherwise, start a new task using a task scheduler from the respective synchronization context.
I am not sure whether this is possible, so here me out:
I have a sequence of action to perform multiple
async Task MethodA(...)
{
// some code
// a call to specific Async IO bound method
// some code
}
there are also MethodB(), MethodC(), etc, and all of the have exactly the same code, except for the call to specific Async IO bound method. I am trying to find a way to pass a task pointer to a method, so that we can execute it later in a Method().
What i am currently doing is this:
private async Task Method(Func<Task<Entity>> func)
{
// some code
var a = await task.Run(func);
// some code
}
var task = async () => await CallToIOBoundTask(params);
Method(task);
This code, however, pulls a new thread each time, which is not required for IO bound task, and should be avoided.
So, is there a way to refactor the code so that no ThreadPool thread is used? A goal is to have a code like this:
private async Task Method(Task<Entity> task)
{
// some code
var a = await task;
// some code
}
It is also important to mention that different IO calls have different method signatures. Also, a task can start to execute only in Method() body, and not before.
Of course, simply invoke the func, get back a task, and await it:
async Task Method(Func<Task<Entity>> func)
{
// some code
var a = await func();
// some code
}
Also, when you're sending that lambda expression, since all it's doing is calling an async method which in itself returns a task, it doesn't need to be async in itself:
Method(() => CallToIOBoundTask(params));
That's fine as long as all these calls return Task<Entity>. If not, you can only use Task (which means starting the operation and awaiting its completion) and you won't be able to use the result.