I want to call an asynchronous method multiple times in a xUnit test and wait for all calls to complete before I continue execution. I read that I can use Task.WhenAll() and Task.WaitAll() for precisely this scenario. For some reason however, the code is deadlocking.
[Fact]
public async Task GetLdapEntries_ReturnsLdapEntries()
{
var ldapEntries = _fixture.CreateMany<LdapEntryDto>(2).ToList();
var creationTasks = new List<Task>();
foreach (var led in ldapEntries)
{
var task = _attributesServiceClient.CreateLdapEntry(led);
task.Start();
creationTasks.Add(task);
}
Task.WaitAll(creationTasks.ToArray()); //<-- deadlock(?) here
//await Task.WhenAll(creationTasks);
var result = await _ldapAccess.GetLdapEntries();
result.Should().BeEquivalentTo(ldapEntries);
}
public async Task<LdapEntryDto> CreateLdapEntry(LdapEntryDto ldapEntryDto)
{
using (var creationResponse = await _httpClient.PostAsJsonAsync<LdapEntryDto>("", ldapEntryDto))
{
if (creationResponse.StatusCode == HttpStatusCode.Created)
{
return await creationResponse.Content.ReadAsAsync<LdapEntryDto>();
}
throw await buildException(creationResponse);
}
}
The system under test is a wrapper around an HttpClient that calls a web service, awaits the response, and possibly awaits reading the response's content that is finally deserialized and returned.
When I change the foreach part in the test to the following (ie, don't use Task.WhenAll() / WaitAll()), the code is running without a deadlock:
foreach (var led in ldapEntries)
{
await _attributesServiceClient.CreateLdapEntry(led);
}
What exactly is happening?
EDIT: While this question has been marked as duplicate, I don't see how the linked question relates to this one. The code examples in the link all use .Result which, as far as I understand, blocks the execution until the task has finished. In contrast, Task.WhenAll() returns a task that can be awaited and that finishes when all tasks have finished. So why is awaiting Task.WhenAll() deadlocking?
The code you posted cannot possibly have the behavior described. The first call to Task.Start would throw an InvalidOperationException, failing the test.
I read that I can use Task.WhenAll() and Task.WaitAll() for precisely this scenario.
No; to asynchronously wait on multiple tasks, you must use Task.WhenAll, not Task.WaitAll.
Example:
[Fact]
public async Task GetLdapEntries_ReturnsLdapEntries()
{
var ldapEntries = new List<int> { 0, 1 };
var creationTasks = new List<Task>();
foreach (var led in ldapEntries)
{
var task = CreateLdapEntry(led);
creationTasks.Add(task);
}
await Task.WhenAll(creationTasks);
}
public async Task<string> CreateLdapEntry(int ldapEntryDto)
{
await Task.Delay(500);
return "";
}
Task.WaitAll() will deadlock simply because it blocks the current thread while the tasks are not finished (and since you are using async/await and not threads, all of your tasks are running on the same thread, and you are not letting your awaited tasks to go back to the calling point because the thread they are running in -the same one where you called Task.WaitAll()-, is blocked).
Not sure why WhenAll is also deadlocking for you here though, it definitely shouldn't.
PS: you don't need to call Start on tasks returned by an async method: they are "hot" (already started) already upon creation
Related
Ex, the following code manually instantiates a Task and passes to a Task.WhenAll in a List<T>
public async Task Do3()
{
var task1 = new Task(async () => { await Task.Delay(2000); Console.WriteLine("########## task1"); });
var taskList = new List<Task>() { task1};
taskList[0].Start();
var taskDone = Task.WhenAll(taskList);
await taskDone;
}
without starting the Task it doesn't work, it hangs forever calling from a console app, but the below works just fine without starting it
public async Task Do3()
{
//var task1 = new Task(async () => { await Task.Delay(2000); Console.WriteLine("########## task1"); });
var taskList = new List<Task>() { SubDo1() };
//taskList[0].Start();
var taskDone = Task.WhenAll(taskList);
await taskDone;
}
public async Task SubDo1()
{
await Task.Delay(2000);
Console.WriteLine("########## task1");
}
Task is used in two completely different ways here; when you call an async method: you are starting it yourself; at this point, two things can happen:
it can run to completion (eventually) without ever reaching a truly asynchronous state, and return a completed (or faulted) task to the caller
it can reach an incomplete awaitable (in this case await Task.Delay), at which point it creates a state machine that represents the current position, schedules a completion operation on that incomplete awaitable (to do whatever comes next), and then returns an incomplete task to the caller
It is not "not started"; to return anything to the caller: we have started it. However, unlike Task.Start(), we start that work on our current thread - not an external worker thread - with other threads only getting involved based on how that incomplete awaitable schedules the completion callbacks that the compiler gives it.
This is very different to the new Task(...) scenario, where nothing is initially started. That's why they behave differently. Note also the Remarks section of the Task constructor here - it is a very niche API, and honestly: not hugely recommended.
Additionally: when you don't immediately await an async method, you're essentially going into concurrent territory (assuming the awaitable won't always complete synchronously). In some cases, this matters, and may cause threading problems re race-conditions. It shouldn't matter much in this case, though.
Lets say i have the following code for example:
private async Task ManageClients()
{
for (int i =0; i < listClients.Count; i++)
{
if (list[i] == 0)
await DoSomethingWithClientAsync();
else
await DoOtherThingAsync();
}
DoOtherWork();
}
My questions are:
1. Will the for() continue and process other clients on the list?, or it
will await untill it finishes one of the tasks.
2. Is even a good practice to use async/await inside a loop?
3. Can it be done in a better way?
I know it was a really simple example, but I'm trying to imagine what would happen if that code was a server with thousands of clients.
In your code example, the code will "block" when the loop reaches await, meaning the other clients will not be processed until the first one is complete. That is because, while the code uses asynchronous calls, it was written using a synchronous logic mindset.
An asynchronous approach should look more like this:
private async Task ManageClients()
{
var tasks = listClients.Select( client => DoSomethingWithClient() );
await Task.WhenAll(tasks);
DoOtherWork();
}
Notice there is only one await, which simultaneously awaits all of the clients, and allows them to complete in any order. This avoids the situation where the loop is blocked waiting for the first client.
If a thread that is executing an async function is calling another async function, this other function is executed as if it was not async until it sees a call to a third async function. This third async function is executed also as if it was not async.
This goes on, until the thread sees an await.
Instead of really doing nothing, the thread goes up the call stack, to see if the caller was not awaiting for the result of the called function. If not, the thread continues the statements in the caller function until it sees an await. The thread goes up the call stack again to see if it can continue there.
This can be seen in the following code:
var taskDoSomething = DoSomethingAsync(...);
// because we are not awaiting, the following is done as soon as DoSomethingAsync has to await:
DoSomethingElse();
// from here we need the result from DoSomethingAsync. await for it:
var someResult = await taskDoSomething;
You can even call several sub-procedures without awaiting:
var taskDoSomething = DoSomethingAsync(...);
var taskDoSomethingElse = DoSomethingElseAsync(...);
// we are here both tasks are awaiting
DoSomethingElse();
Once you need the results of the tasks, if depends what you want to do with them. Can you continue processing if one task is completed but the other is not?
var someResult = await taskDoSomething;
ProcessResult(someResult);
var someOtherResult = await taskDoSomethingelse;
ProcessBothResults(someResult, someOtherResult);
If you need the result of all tasks before you can continue, use Task.WhenAll:
Task[] allTasks = new Task[] {taskDoSomething, taskDoSomethingElse);
await Task.WhenAll(allTasks);
var someResult = taskDoSomething.Result;
var someOtherResult = taskDoSomethingElse.Result;
ProcessBothResults(someResult, someOtherResult);
Back to your question
If you have a sequence of items where you need to start awaitable tasks, it depends on whether the tasks need the result of other tasks or not. In other words can task[2] start if task[1] has not been completed yet? Do Task[1] and Task[2] interfere with each other if they run both at the same time?
If they are independent, then start all Tasks without awaiting. Then use Task.WhenAll to wait until all are finished. The Task scheduler will take care that not to many tasks will be started at the same time. Be aware though, that starting several tasks could lead to deadlocks. Check carefully if you need critical sections
var clientTasks = new List<Task>();
foreach(var client in clients)
{
if (list[i] == 0)
clientTasks.Add(DoSomethingWithClientAsync());
else
clientTasks.Add(DoOtherThingAsync());
}
// if here: all tasks started. If desired you can do other things:
AndNowForSomethingCompletelyDifferent();
// later we need the other tasks to be finished:
var taskWaitAll = Task.WhenAll(clientTasks);
// did you notice we still did not await yet, we are still in business:
MontyPython();
// okay, done with frolicking, we need the results:
await taskWaitAll;
DoOtherWork();
This was the scenario where all Tasks where independent: no task needed the other to be completed before it could start. However if you need Task[2] to be completed before you can start Task[3] you should await:
foreach(var client in clients)
{
if (list[i] == 0)
await DoSomethingWithClientAsync());
else
await DoOtherThingAsync();
}
I'm still getting up to speed with async & multi threading. I'm trying to monitor when the Task I Start is still running (to show in a UI). However it's indicating that it is RanToCompletion earlier than I want, when it hits an await, even when I consider its Status as still Running.
Here is the sample I'm doing. It all seems to be centred around the await's. When it hits an await, it is then marked as RanToCompletion.
I want to keep track of the main Task which starts it all, in a way which indicates to me that it is still running all the way to the end and only RanToCompletion when it is all done, including the repo call and the WhenAll.
How can I change this to get the feedback I want about the tskProdSeeding task status?
My Console application Main method calls this:
Task tskProdSeeding;
tskProdSeeding = Task.Factory.StartNew(SeedingProd, _cts.Token);
Which the runs this:
private async void SeedingProd(object state)
{
var token = (CancellationToken)state;
while (!token.IsCancellationRequested)
{
int totalSeeded = 0;
var codesToSeed = await _myRepository.All().ToListAsync(token);
await Task.WhenAll(Task.Run(async () =>
{
foreach (var code in codesToSeed)
{
if (!token.IsCancellationRequested)
{
try
{
int seedCountByCode = await _myManager.SeedDataFromLive(code);
totalSeeded += seedCountByCode;
}
catch (Exception ex)
{
_logger.InfoFormat(ex.ToString());
}
}
}
}, token));
Thread.Sleep(30000);
}
}
If you use async void the outer task can't tell when the task is finished, you need to use async Task instead.
Second, once you do switch to async Task, Task.Factory.StartNew can't handle functions that return a Task, you need to switch to Task.Run(
tskProdSeeding = Task.Run(() => SeedingProd(_cts.Token), _cts.Token);
Once you do both of those changes you will be able to await or do a .Wait() on tskProdSeeding and it will properly wait till all the work is done before continuing.
Please read "Async/Await - Best Practices in Asynchronous Programming" to learn more about not doing async void.
Please read "StartNew is Dangerous" to learn more about why you should not be using StartNew the way you are using it.
P.S. In SeedingProd you should switch it to use await Task.Delay(30000); insetad of Thread.Sleep(30000);, you will then not tie up a thread while it waits. If you do this you likely could drop the
tskProdSeeding = Task.Run(() => SeedingProd(_cts.Token), _cts.Token);
and just make it
tskProdSeeding = SeedingProd(_cts.Token);
because the function no-longer has a blocking call inside of it.
I'm not convinced that you need a second thread (Task.Run or StartNew) at all. It looks like the bulk of the work is I/O-bound and if you're doing it asynchronously and using Task.Delay instead of Thread.Sleep, then there is no thread consumed by those operations and your UI shouldn't freeze. The first thing anyone new to async needs to understand is that it's not the same thing as multithreading. The latter is all about consuming more threads, the former is all about consuming fewer. Focus on eliminating the blocking and you shouldn't need a second thread.
As others have noted, SeedingProd needs to return a Task, not void, so you can observe its completion. I believe your method can be reduced to this:
private async Task SeedingProd(CancellationToken token)
{
while (!token.IsCancellationRequested)
{
int totalSeeded = 0;
var codesToSeed = await _myRepository.All().ToListAsync(token);
foreach (var code in codesToSeed)
{
if (token.IsCancellationRequested)
return;
try
{
int seedCountByCode = await _myManager.SeedDataFromLive(code);
totalSeeded += seedCountByCode;
}
catch (Exception ex)
{
_logger.InfoFormat(ex.ToString());
}
}
await Task.Dealy(30000);
}
}
Then simply call the method, without awaiting it, and you'll have your task.
Task mainTask = SeedingProd(token);
When you specify async on a method, it compiles into a state machine with a Task, so SeedingProd does not run synchronously, but acts as a Task even if returns void. So when you call Task.Factory.StartNew(SeedingProd) you start a task that kick off another task - that's why the first one finishes immediately before the second one. All you have to do is add the Task return parameter instead of void:
private async Task SeedingProdAsync(CancellationToken ct)
{
...
}
and call it as simply as this:
Task tskProdSeeding = SeedingProdAsync(_cts.Token);
I'm working on a C# console application that will be responsible for running an array of tasks. The basic structure for that is as follows:
var tasks = workItems.Select(x => Task.Factory.StartNew(() =>
{
DoSomeWork(x);
})).ToArray();
Task.WaitAll(tasks);
The problem is that DoSomeWork() is an async method which awaits the result of another task, which also needs to await a call to the Facebook API.
http://facebooksdk.net/docs/reference/SDK/Facebook.FacebookClient.html#GetTaskAsync(string)
public async void DoSomeWork(WorkItem item)
{
var results = await GetWorkData();
}
public async Task<List<WorkData>> GetWorkData()
{
var fbClient = new FacebookClient();
var task = fbClient.GetTaskAsync("something");
var fbResults = await task;
};
I thought I would be able to support this notion of nested tasks with the call to Task.WaitAll() but the execution of the parent tasks finishes almost immediately. Putting a Console.ReadLine() at the end of the application to prevent it from early execution shows that the result will indeed come back later from Facebook.
Am I missing something obvious or is there a better way to block for my Task collection that will allow for this kind of scenario?
DoSomeWork needs to not return void. The by not returning a Task the caller has no way of knowing when if finishes.
Additionally, it's already asynchronous, so there is no reason to use StartNew here. Just call DoSomeWork directly, since it should be returning a Task.
I am not an advanced developer. I'm just trying to get a hold on the task library and just googling. I've never used the class SemaphoreSlim so I would like to know what it does. Here I present code where SemaphoreSlim is used with async & await but which I do not understand. Could someone help me to understand the code below.
1st set of code
await WorkerMainAsync();
async Task WorkerMainAsync()
{
SemaphoreSlim ss = new SemaphoreSlim(10);
while (true)
{
await ss.WaitAsync();
// you should probably store this task somewhere and then await it
var task = DoPollingThenWorkAsync();
}
}
async Task DoPollingThenWorkAsync(SemaphoreSlim semaphore)
{
var msg = Poll();
if (msg != null)
{
await Task.Delay(3000); // process the I/O-bound job
}
// this assumes you don't have to worry about exceptions
// otherwise consider try-finally
semaphore.Release();
}
Firstly, the WorkerMainAsync will be called and a SemaphoreSlim is used. Why is 10 passed to the constructor of SemaphoreSlim?
When does the control come out of the while loop again?
What does ss.WaitAsync(); do?
The DoPollingThenWorkAsync() function is expecting a SemaphoreSlim but is not passed anything when it is called. Is this typo?
Why is await Task.Delay(3000); used?
They could simply use Task.Delay(3000) but why do they use await here instead?
2nd set of code for same purpose
async Task WorkerMainAsync()
{
SemaphoreSlim ss = new SemaphoreSlim(10);
List<Task> trackedTasks = new List<Task>();
while (DoMore())
{
await ss.WaitAsync();
trackedTasks.Add(Task.Run(() =>
{
DoPollingThenWorkAsync();
ss.Release();
}));
}
await Task.WhenAll(trackedTasks);
}
void DoPollingThenWorkAsync()
{
var msg = Poll();
if (msg != null)
{
Thread.Sleep(2000); // process the long running CPU-bound job
}
}
Here is a task & ss.Release added to a list. I really do not understand how tasks can run after adding to a list?
trackedTasks.Add(Task.Run(async () =>
{
await DoPollingThenWorkAsync();
ss.Release();
}));
I am looking forward for a good explanation & help to understand the two sets of code. Thanks
why 10 is passing to SemaphoreSlim constructor.
They are using SemaphoreSlim to limit to 10 tasks at a time. The semaphore is "taken" before each task is started, and each task "releases" it when it finishes. For more about semaphores, see MSDN.
they can use simply Task.Delay(3000) but why they use await here.
Task.Delay creates a task that completes after the specified time interval and returns it. Like most Task-returning methods, Task.Delay returns immediately; it is the returned Task that has the delay. So if the code did not await it, there would be no delay.
just really do not understand after adding task to list how they can run?
In the Task-based Asynchronous Pattern, Task objects are returned "hot". This means they're already running by the time they're returned. The await Task.WhenAll at the end is waiting for them all to complete.