Async Task.WhenAll with timeout - issue with completed tasks accumulating - c#

I have created the following in order to execute multiple async tasks with a timeout. I was looking for something that will allow extracting results from the tasks - taking only those that beat the timeout, regardless if the rest of tasks failed to do so (simplified):
TimeSpan timeout = TimeSpan.FromSeconds(5.0);
Task<Task>[] tasksOfTasks =
{
Task.WhenAny(SomeTaskAsync("a"), Task.Delay(timeout)),
Task.WhenAny(SomeTaskAsync("b"), Task.Delay(timeout)),
Task.WhenAny(SomeTaskAsync("c"), Task.Delay(timeout))
};
Task[] completedTasks = await Task.WhenAll(tasksOfTasks);
List<MyResult> results = completedTasks.OfType<Task<MyResult>>().Select(task => task.Result).ToList();
I have implemented this in a non-static class in the (Web API) server.
This worked well on the first call, however additional calls caused completedTasks to strangely accumulate tasks from previous calls to the server (as shown by the debugger). On the second call there were 6 completed tasks, on the third call 9 and so on.
My questions:
Any idea why is that?
I assume it's because the previous tasks weren't cancelled however this code is in a new instance of a class!
Any idea how to avoid this accumulation?
PS: See my answer to this question.

I couldn't use my psychic debugging to understand why your code "caused completedTasks to strangely accumulate tasks from previous calls" but it does probably expose some of your misunderstandings.
Here's a working example based on your code (using string instead of MyResult):
Task<string> timeoutTask =
Task.Delay(TimeSpan.FromSeconds(5)).ContinueWith(_ => string.Empty);
Task<Task<string>>[] tasksOfTasks =
{
Task.WhenAny(SomeTaskAsync("a"), timeoutTask),
Task.WhenAny(SomeTaskAsync("b"), timeoutTask),
Task.WhenAny(SomeTaskAsync("c"), timeoutTask)
};
Task<string>[] completedTasks = await Task.WhenAll(tasksOfTasks);
List<string> results = completedTasks.Where(task => task != timeoutTask).
Select(task => task.Result).ToList();
So, what's different:
I'm using the same timeout task for all WhenAny calls. There's no need to use more, and they could complete in slightly different times.
I make the timeout task return a value, so it's actually a Task<string> and not a Task.
That makes each WhenAny call also return a Task<string> (and tasksOfTasks be Task<Task<string>>[]) which would make it possible to actually return a result out of these tasks.
After awaiting we need to filter out WhenAny calls that returned our timeout task, because there would be no result there (only string.Empty) using completedTasks.Where(task => task != timeoutTask).
P.S : I've also answered that question and I would (surprisingly) recommend you use my solution.
Note: Using the Task.Result property isn't advisable. You should await it instead (even when you know it's already completed)
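As a rough illustration of the note above, the same timeout pattern can be written so that completed tasks are awaited instead of read through .Result. This is only a sketch: SomeTaskAsync here is a hypothetical stand-in that takes an artificial delay, and ResultsWithinAsync is a helper name invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class TimeoutDemo
{
    // Hypothetical stand-in for SomeTaskAsync: completes after delayMs milliseconds.
    public static async Task<string> SomeTaskAsync(string name, int delayMs)
    {
        await Task.Delay(delayMs);
        return name;
    }

    // Returns the results of the tasks that finished before the shared timeout.
    public static async Task<List<string>> ResultsWithinAsync(
        TimeSpan timeout, params Task<string>[] tasks)
    {
        Task timeoutTask = Task.Delay(timeout);

        // Each WhenAny completes with whichever finished first: the work or the timeout.
        Task[] winners = await Task.WhenAll(
            tasks.Select(t => Task.WhenAny(t, timeoutTask)));

        var results = new List<string>();
        // OfType drops the timeout task, which is not a Task<string>.
        foreach (Task<string> finished in winners.OfType<Task<string>>())
        {
            // The task has already completed, so this await does not actually wait.
            // If the task faulted, the await rethrows its exception.
            results.Add(await finished);
        }
        return results;
    }

    public static async Task Main()
    {
        List<string> results = await ResultsWithinAsync(
            TimeSpan.FromSeconds(1),
            SomeTaskAsync("a", 50),
            SomeTaskAsync("b", 100),
            SomeTaskAsync("c", 5000)); // too slow, dropped
        Console.WriteLine(string.Join(", ", results)); // prints "a, b"
    }
}
```

Note that the slow task is not cancelled here; it is simply ignored. Passing a CancellationToken into the real operation would be needed to actually stop it.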

Related

c#, multiple async task execute [duplicate]

In terms of performance, will these 2 methods run GetAllWidgets() and GetAllFoos() in parallel?
Is there any reason to use one over the other? There seems to be a lot happening behind the scenes with the compiler so I don't find it clear.
============= MethodA: Using multiple awaits ======================
public async Task<IHttpActionResult> MethodA()
{
var customer = new Customer();
customer.Widgets = await _widgetService.GetAllWidgets();
customer.Foos = await _fooService.GetAllFoos();
return Ok(customer);
}
=============== MethodB: Using Task.WaitAll =====================
public async Task<IHttpActionResult> MethodB()
{
var customer = new Customer();
var getAllWidgetsTask = _widgetService.GetAllWidgets();
var getAllFoosTask = _fooService.GetAllFoos();
Task.WaitAll(new Task[] {getAllWidgetsTask, getAllFoosTask});
customer.Widgets = getAllWidgetsTask.Result;
customer.Foos = getAllFoosTask.Result;
return Ok(customer);
}
=====================================
The first option will not execute the two operations concurrently. It will execute the first and await its completion, and only then the second.
The second option will execute both concurrently but will wait for them synchronously (i.e. while blocking a thread).
You shouldn't use either option: the first completes more slowly than the second, and the second blocks a thread needlessly.
You should wait for both operations asynchronously with Task.WhenAll:
public async Task<IHttpActionResult> MethodB()
{
var customer = new Customer();
var getAllWidgetsTask = _widgetService.GetAllWidgets();
var getAllFoosTask = _fooService.GetAllFoos();
await Task.WhenAll(getAllWidgetsTask, getAllFoosTask);
customer.Widgets = await getAllWidgetsTask;
customer.Foos = await getAllFoosTask;
return Ok(customer);
}
Note that once Task.WhenAll has completed, both tasks have already completed, so awaiting them returns immediately.
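The timing difference between the two styles can be seen in a small console sketch. FetchAsync is a hypothetical stand-in for calls like GetAllWidgets() and GetAllFoos(), with an artificial delay; the exact timings will vary by machine.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    // Hypothetical stand-in for a service call: returns value after delayMs milliseconds.
    public static async Task<int> FetchAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    // Awaits each call before starting the next one.
    public static async Task<(int Sum, long Ms)> SequentialAsync(int delayMs)
    {
        var sw = Stopwatch.StartNew();
        int a = await FetchAsync(1, delayMs); // the second call has not started yet
        int b = await FetchAsync(2, delayMs);
        return (a + b, sw.ElapsedMilliseconds);
    }

    // Starts both calls, then waits for both asynchronously.
    public static async Task<(int Sum, long Ms)> ConcurrentAsync(int delayMs)
    {
        var sw = Stopwatch.StartNew();
        Task<int> first = FetchAsync(1, delayMs);
        Task<int> second = FetchAsync(2, delayMs);
        int[] values = await Task.WhenAll(first, second);
        return (values[0] + values[1], sw.ElapsedMilliseconds);
    }

    public static async Task Main()
    {
        var sequential = await SequentialAsync(500);
        var concurrent = await ConcurrentAsync(500);
        Console.WriteLine($"sequential: {sequential.Ms} ms, concurrent: {concurrent.Ms} ms");
    }
}
```

With a 500 ms delay the sequential version should take roughly twice as long as the concurrent one, and no thread is blocked in either case.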
Short answer: No.
Task.WaitAll blocks the thread. await, by contrast, returns an incomplete task to the caller as soon as it reaches an operation that has not finished, and registers the rest of the method as a continuation.
The "bulk" waiting method you were looking for is Task.WhenAll, which creates a new Task that completes when all the tasks passed to it are done.
Like so: await Task.WhenAll(getAllWidgetsTask, getAllFoosTask);
So much for the blocking question.
Also, your first function does not run both calls in parallel. To get that with await you'd have to write something like this:
var widgetsTask = _widgetService.GetAllWidgets();
var foosTask = _fooService.GetAllFoos();
customer.Widgets = await widgetsTask;
customer.Foos = await foosTask;
This makes the first example behave much like the Task.WhenAll version.
As an addition to what @i3arnon said: you will notice that when you use await you are forced to declare the enclosing method as async, but with Task.WaitAll you are not. That should tell you that there is more to it than what the main answer says. Here it is:
Task.WaitAll blocks until the given tasks finish; it does not pass control back to the caller while those tasks are running. Also, as mentioned, the tasks run asynchronously with respect to each other, not with respect to the caller.
await does not block the calling thread. It does suspend execution of the code below it, but while the task is running, control is returned to the caller. Because control is returned to the caller (the called method is running asynchronously), you have to mark the method as async.
Hopefully the difference is clear. Cheers
Only your second option will run them in parallel. Your first will wait on each call in sequence.
As soon as you invoke the async method it will start executing. Whether it will execute on the current thread (and thus run synchronously) or it will run async is not possible to determine.
Thus, in your first example the first method will start doing work, but then you artificially stop the flow of the code with the await, so the second method will not be invoked before the first has finished executing.
The second example invokes both methods without stopping the flow with an await. Thus they will potentially run in parallel if the methods are asynchronous.

Select with asynchronous method

I found a question that was really useful to me, but I still can't work out what the LINQ equivalent of the following construct is:
public async Task<List<ObjectInfo>> GetObjectsInfo(string[] objectIds)
{
var result = new List<ObjectInfo>(objectIds.Length);
foreach (var id in objectIds)
{
result.Add(await GetObjectInfo(id));
}
return result;
}
If I wrote instead
var result = await Task.WhenAll(objectIds.Select(id => GetObjectInfo(id)));
wouldn't these tasks be started simultaneously?
In my case it would be better to run them in series.
Edit 1: In answer to Theodor Zoulias's comment.
Of course, I forgot the Async suffix in the method names!
The method GetObjectInfoAsync makes an HTTP request to an external service. That service also limits request frequency, so I use the following construct:
using (var throttler = new Throttler(clientId))
{
while (!throttler.IsCallAllowed(out var waitTime))
{
await Task.Delay(waitTime);
}
var response = await client.PerformHttpRequestAsync(request);
return response.Content.FromJson<TResponse>(serializerSettings);
}
Throttler knows the time of the last request for each client.
You should take the following into account when using Task.WhenAll.
Quotes taken from the Microsoft documentation:
If any of the supplied tasks completes in a faulted state, the returned task will also complete in a Faulted state, where its exceptions will contain the aggregation of the set of unwrapped exceptions from each of the supplied tasks.
If none of the supplied tasks faulted but at least one of them was canceled, the returned task will end in the Canceled state.
If none of the tasks faulted and none of the tasks were canceled, the resulting task will end in the RanToCompletion state.
If the supplied array/enumerable contains no tasks, the returned task will immediately transition to a RanToCompletion state before it's returned to the caller.
That means that while the items may run simultaneously (you can't know how, since it depends on the machine's resources, and you can't know the order), the behavior above can't be avoided.
If you want specific error handling per call and a specific order, then the first solution is the way to go. It partly defeats the point of async/await, but not entirely: even with sequential execution of the async items, your thread at least will not go to sleep and can still be used until the awaitable is ready.
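The faulted-state behavior quoted above can be observed with a short sketch. FailAsync and ObserveAsync are hypothetical names used only for this example. Awaiting the task returned by Task.WhenAll rethrows a single exception, while the task's Exception property still holds every failure.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllErrorsDemo
{
    // Hypothetical operation that always fails asynchronously.
    public static async Task<int> FailAsync(string message)
    {
        await Task.Yield();
        throw new InvalidOperationException(message);
    }

    // Returns the message of the exception that await rethrew,
    // plus the number of exceptions stored on the WhenAll task.
    public static async Task<(string Rethrown, int Collected)> ObserveAsync()
    {
        Task<int[]> all = Task.WhenAll(FailAsync("first"), FailAsync("second"));
        try
        {
            await all;
            return ("none", 0);
        }
        catch (InvalidOperationException ex)
        {
            // await surfaces only one exception; the task keeps all of them.
            int collected = all.Exception?.InnerExceptions.Count ?? 0;
            return (ex.Message, collected);
        }
    }

    public static async Task Main()
    {
        var (rethrown, collected) = await ObserveAsync();
        Console.WriteLine($"await rethrew \"{rethrown}\"; the task holds {collected} exceptions");
    }
}
```

With the sequential foreach version, by contrast, the first failing call throws at its own await, so each call can be wrapped in its own try/catch.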

Awaiting async tasks instantly vs declaring first and then awaiting

Let's look at the following 2 examples:
public class MyClass
{
public async Task Main()
{
var result1 = "";
var result2 = "";
var request1 = await DelayMe();
var request2 = await DelayMe();
result1 = request1;
result2 = request2;
}
private static async Task<String> DelayMe()
{
await Task.Delay(2000);
return "";
}
}
And:
public class MyClass
{
public async Task Main()
{
var result1 = "";
var result2 = "";
var request1 = DelayMe();
var request2 = DelayMe();
result1 = await request1;
result2 = await request2;
}
private static async Task<String> DelayMe()
{
await Task.Delay(2000);
return "";
}
}
The first example shows how you would typically write async/await code, where one thing happens after the other and each call is awaited before the next starts.
The second one calls the async Task method first but awaits it later.
The first example takes a bit over 4000ms to execute because the first request is awaited to completion before the second is made; the second example takes a bit over 2000ms. This happens because the Task actually starts running as soon as execution steps over the var request1 = DelayMe(); line, which means that request1 and request2 are running in parallel. At this point it looks like the await keyword just ensures that the Task has completed.
The second approach feels and acts like an await Task.WhenAll(request1, request2), but in this scenario, if something fails in the 2 requests, you will get an exception instantly instead of waiting for everything to compute and then getting an AggregateException.
My question is: is there a drawback (performance or otherwise) in using the second approach to run multiple awaitable Tasks in parallel when the result of one doesn't depend on the execution of the other? Looking at the lowered code, it looks like the second example generates an equal amount of System.Threading.Tasks.Task`1 per awaited item while the first one doesn't. Is this still going through the async await state-machine flow?
if something fails in the 2 requests, you will get an exception instantly instead of waiting for everything to compute and then getting an AggregateException.
If something fails in the first request, then yes. If something fails in the second request, then no, you wouldn't check the second request results until that task is awaited.
My question is: is there a drawback (performance or otherwise) in using the second approach to run multiple awaitable Tasks in parallel when the result of one doesn't depend on the execution of the other? Looking at the lowered code, it looks like the second example generates an equal amount of System.Threading.Tasks.Task`1 per awaited item while the first one doesn't. Is this still going through the async await state-machine flow?
It's still going through the state machine flow. I tend to recommend await Task.WhenAll because the intent of the code is more explicit, but there are some people who don't like the "always wait even when there are exceptions" behavior. The flip side to that is that Task.WhenAll always collects all the exceptions - if you have fail-fast behavior, then some exceptions could be ignored.
Regarding performance, concurrent execution would be better because you can do multiple operations concurrently. There's no danger of threadpool exhaustion from this because async/await does not use additional threads.
As a side note, I recommend using the term "asynchronous concurrency" for this rather than "parallel", since to many people "parallel" implies parallel processing, i.e., Parallel or PLINQ, which would be the wrong technologies to use in this case.
The drawback of using the second approach to run multiple awaitable tasks in parallel is that the parallelism is not obvious. And non-obvious parallelism (implicit multithreading, in other words) is dangerous because the bugs it can introduce are notoriously inconsistent and only sporadically observed. Let's suppose that the actual DelayMe running in the production environment was the one below:
private static int delaysCount = 0;
private static async Task<String> DelayMe()
{
await Task.Delay(2000);
return (++delaysCount).ToString();
}
Sequentially awaited calls to DelayMe will return increasing numbers. Concurrently awaited calls will occasionally return the same number.
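One possible way to make such a counter safe under concurrent calls is an atomic increment. The sketch below assumes that is the only shared state; DelayMeSafe and DistinctResultsAsync are hypothetical names, and the delay is shortened to 50 ms to keep the example quick.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    private static int delaysCount = 0;

    // Same shape as DelayMe above, but the increment is atomic,
    // so two continuations running at the same time cannot read the same value.
    public static async Task<string> DelayMeSafe()
    {
        await Task.Delay(50);
        return Interlocked.Increment(ref delaysCount).ToString();
    }

    // Starts all calls concurrently and counts how many distinct values came back.
    public static async Task<int> DistinctResultsAsync(int calls)
    {
        string[] results = await Task.WhenAll(
            Enumerable.Range(0, calls).Select(_ => DelayMeSafe()));
        return results.Distinct().Count();
    }

    public static async Task Main()
    {
        int distinct = await DistinctResultsAsync(100);
        Console.WriteLine($"{distinct} distinct values out of 100 calls");
    }
}
```

This fixes only the counter. The broader point stands: code that was written assuming sequential calls may hide other shared state that needs the same scrutiny.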

How to make sure a task is started and safely start it if not?

I get an IEnumerable<Task> tasks from somewhere that I do not control. I don't know if the tasks are manually created using new Task, Task.Run, or if they are a result of an async method call async Task DoSomethingAsync().
If I do await Task.WhenAll(tasks), I risk hanging indefinitely because maybe one or more of the tasks are not started.
I can't do tasks.ForEach(t => t.Start()), because then I will get an InvalidOperationException "Start may not be called on a promise-style task" if it's from an async method call (already started).
I can't do await Task.WhenAll(tasks.Select(t => Task.Run(async () => await t))) because each t still does not start just by awaiting it.
I assume the solution has something to do with checking each task's Status and Start() based on that, but I also assume that it can be tricky because that status could change at any time, right? If this is still the way to go, which statuses would be correct to check and what threading issues should I worry about?
Non working case example:
//making an IEnumerable as an example, remember I don't control this part
Task t = new Task( () => Console.WriteLine("started"));
IEnumerable<Task> tasks = new[] {t};
//here I receive the tasks
await Task.WhenAll(tasks);//waits forever because t is not started
Working case example:
//calls the async function, starting it.
Task t = DoSomethingAsync();
IEnumerable<Task> tasks = new[] {t};
//here I receive the tasks and it will complete because the task is already started
await Task.WhenAll(tasks);
async Task DoSomethingAsync() => Console.WriteLine("started");
If for whatever reason you cannot change the code so that it stops returning unstarted tasks, you can check Status and start the task if its status is Created:
if (task.Status == TaskStatus.Created)
task.Start();
All other task statuses indicate that the task is either completed, running, or being scheduled, so you don't need to start tasks in those statuses.
Of course, in theory this introduces a race condition, because the task can be started right between your check and the Start call. But, as Servy correctly pointed out in the comments, if there ever is a race condition here, it means another party (the one that created the task) is also trying to start it. Even if you handle the exception (InvalidOperationException), the other party is unlikely to do so, and will therefore get an exception while trying to start their own task. So only one side (either you, or the code that created the task) should be trying to start it.
That said, it is much better to ensure you never get an unstarted task in the first place, because it's simply bad design to return such tasks to external code, at least without explicitly indicating it (although for some use cases it's OK to use unstarted tasks internally).
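Putting the status check together with Task.WhenAll might look like the sketch below. TaskStarter and WhenAllStartedAsync are hypothetical names, and catching InvalidOperationException is a defensive assumption for the race described above, not a recommendation to rely on it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class TaskStarter
{
    // Starts any task that is still in the Created state, then waits for all of them.
    public static Task WhenAllStartedAsync(IEnumerable<Task> tasks)
    {
        List<Task> list = tasks.ToList(); // enumerate the source only once
        foreach (Task task in list)
        {
            // Tasks from async methods or Task.Run are never in the Created state.
            if (task.Status == TaskStatus.Created)
            {
                try
                {
                    task.Start();
                }
                catch (InvalidOperationException)
                {
                    // Someone else started the task between the check and Start().
                }
            }
        }
        return Task.WhenAll(list);
    }

    public static async Task Main()
    {
        var cold = new Task(() => Console.WriteLine("cold task ran"));
        Task hot = Task.Delay(10); // already running
        await WhenAllStartedAsync(new[] { cold, hot });
        Console.WriteLine("all done");
    }
}
```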

Calling async methods from non-async code

I'm in the process of updating a library that has an API surface that was built in .NET 3.5. As a result, all methods are synchronous. I can't change the API (i.e., convert return values to Task) because that would require that all callers change. So I'm left with how to best call async methods in a synchronous way. This is in the context of ASP.NET 4, ASP.NET Core, and .NET/.NET Core console applications.
I may not have been clear enough - the situation is that I have existing code that is not async aware, and I want to use new libraries such as System.Net.Http and the AWS SDK that support only async methods. So I need to bridge the gap, and be able to have code that can be called synchronously but then can call async methods elsewhere.
I've done a lot of reading, and there are a number of times this has been asked and answered.
Calling async method from non async method
Synchronously waiting for an async operation, and why does Wait() freeze the program here
Calling an async method from a synchronous method
How would I run an async Task<T> method synchronously?
Calling async method synchronously
How to call asynchronous method from synchronous method in C#?
The problem is that most of the answers are different! The most common approach I've seen is to use .Result, but this can deadlock. I've tried all the following, and they work, but I'm not sure which is the best approach to avoid deadlocks, have good performance, and play nicely with the runtime (in terms of honoring task schedulers, task creation options, etc). Is there a definitive answer? What is the best approach?
private static T taskSyncRunner<T>(Func<Task<T>> task)
{
T result;
// approach 1
result = Task.Run(async () => await task()).ConfigureAwait(false).GetAwaiter().GetResult();
// approach 2
result = Task.Run(task).ConfigureAwait(false).GetAwaiter().GetResult();
// approach 3
result = task().ConfigureAwait(false).GetAwaiter().GetResult();
// approach 4
result = Task.Run(task).Result;
// approach 5
result = Task.Run(task).GetAwaiter().GetResult();
// approach 6
var t = task();
t.RunSynchronously();
result = t.Result;
// approach 7
var t1 = task();
Task.WaitAll(t1);
result = t1.Result;
// approach 8?
return result;
}
So I'm left with how to best call async methods in a synchronous way.
First, this is an OK thing to do. I'm stating this because it is common on Stack Overflow to point this out as a deed of the devil as a blanket statement without regard for the concrete case.
It is not required to be async all the way for correctness. Blocking on something async to make it sync has a performance cost that might matter or might be totally irrelevant. It depends on the concrete case.
Deadlocks come from two threads trying to enter the same single-threaded synchronization context at the same time. Any technique that avoids this reliably avoids deadlocks caused by blocking.
In your code snippet, all calls to .ConfigureAwait(false) are pointless because the return value is not awaited. ConfigureAwait returns a struct that, when awaited, exhibits the behavior that you requested. If that struct is simply dropped, it does nothing.
RunSynchronously is invalid to use because not all tasks can be processed that way. This method is meant for CPU-based tasks, and it can fail to work under certain circumstances.
.GetAwaiter().GetResult() is different from Result/Wait() in that it mimics the await exception propagation behavior. You need to decide if you want that or not. (So research what that behavior is; no need to repeat it here.) If your task contains a single exception then the await error behavior is usually convenient and has little downside. If there are multiple exceptions, for example from a failed Parallel loop where multiple tasks failed, then await will drop all exceptions but the first one. That makes debugging harder.
All these approaches have similar performance. They will allocate an OS event one way or another and block on it. That's the expensive part. The other machinery is rather cheap compared to that. I don't know which approach is absolutely cheapest.
In case an exception is being thrown, that is going to be the most expensive part. On .NET 5, exceptions are processed at a rate of at most 200,000 per second on a fast CPU. Deep stacks are slower, and the task machinery tends to rethrow exceptions multiplying their cost. There are ways of blocking on a task without the exception being rethrown, for example task.ContinueWith(_ => { }, TaskContinuationOptions.ExecuteSynchronously).Wait();.
I personally like the Task.Run(() => DoSomethingAsync()).Wait(); pattern because it avoids deadlocks categorically, it is simple and it does not hide some exceptions that GetResult() might hide. But you can use GetResult() as well with this.
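The difference in exception propagation between Wait() and GetAwaiter().GetResult() can be checked with a small console sketch. FailAsync and TypeThrownBy are hypothetical helpers written for this example.

```csharp
using System;
using System.Threading.Tasks;

public static class BlockingDemo
{
    // Hypothetical async operation that always fails.
    public static async Task<int> FailAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("boom");
    }

    // Runs a blocking call and reports the type of exception it threw.
    public static string TypeThrownBy(Action block)
    {
        try
        {
            block();
            return "none";
        }
        catch (Exception ex)
        {
            return ex.GetType().Name;
        }
    }

    public static void Main()
    {
        // Wait() wraps the failure in an AggregateException.
        Console.WriteLine(TypeThrownBy(
            () => Task.Run(() => FailAsync()).Wait())); // prints "AggregateException"

        // GetAwaiter().GetResult() rethrows the original exception, like await does.
        Console.WriteLine(TypeThrownBy(
            () => Task.Run(() => FailAsync()).GetAwaiter().GetResult())); // prints "InvalidOperationException"
    }
}
```

Both calls go through Task.Run, so the async method starts on a thread pool thread without the caller's synchronization context, which is what makes the blocking wait safe from the deadlock described above.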
I'm in the process of updating a library that has an API surface that was built in .NET 3.5. As a result, all methods are synchronous. I can't change the API (i.e., convert return values to Task) because that would require that all callers change. So I'm left with how to best call async methods in a synchronous way.
There is no universal "best" way to perform the sync-over-async anti-pattern. Only a variety of hacks that each have their own drawbacks.
What I recommend is that you keep the old synchronous APIs and then introduce asynchronous APIs alongside them. You can do this using the "boolean argument hack" as described in my MSDN article on Brownfield Async.
First, a brief explanation of the problems with each approach in your example:
1. ConfigureAwait only makes sense when there is an await; otherwise, it does nothing.
2. Result will wrap exceptions in an AggregateException; if you must block, use GetAwaiter().GetResult() instead.
3. Task.Run will execute its code on a thread pool thread (obviously). This is fine only if the code can run on a thread pool thread.
4. RunSynchronously is an advanced API used in extremely rare situations when doing dynamic task-based parallelism. You're not in that scenario at all.
5. Task.WaitAll with a single task is the same as just Wait().
6. async () => await x is just a less-efficient way of saying () => x.
7. Blocking on a task started from the current thread can cause deadlocks.
Here's the breakdown:
// Problems (1), (3), (6)
result = Task.Run(async () => await task()).ConfigureAwait(false).GetAwaiter().GetResult();
// Problems (1), (3)
result = Task.Run(task).ConfigureAwait(false).GetAwaiter().GetResult();
// Problems (1), (7)
result = task().ConfigureAwait(false).GetAwaiter().GetResult();
// Problems (2), (3)
result = Task.Run(task).Result;
// Problems (3)
result = Task.Run(task).GetAwaiter().GetResult();
// Problems (2), (4)
var t = task();
t.RunSynchronously();
result = t.Result;
// Problems (2), (5)
var t1 = task();
Task.WaitAll(t1);
result = t1.Result;
Instead of any of these approaches, since you have existing, working synchronous code, you should use it alongside the newer naturally-asynchronous code. For example, if your existing code used WebClient:
public string Get()
{
using (var client = new WebClient())
return client.DownloadString(...);
}
and you want to add an async API, then I would do it like this:
private async Task<string> GetCoreAsync(bool sync)
{
using (var client = new WebClient())
{
return sync ?
client.DownloadString(...) :
await client.DownloadStringTaskAsync(...);
}
}
public string Get() => GetCoreAsync(sync: true).GetAwaiter().GetResult();
public Task<string> GetAsync() => GetCoreAsync(sync: false);
or, if you must use HttpClient for some reason:
private string GetCoreSync()
{
using (var client = new WebClient())
return client.DownloadString(...);
}
private static HttpClient HttpClient { get; } = ...;
private async Task<string> GetCoreAsync(bool sync)
{
return sync ?
GetCoreSync() :
await HttpClient.GetStringAsync(...);
}
public string Get() => GetCoreAsync(sync: true).GetAwaiter().GetResult();
public Task<string> GetAsync() => GetCoreAsync(sync: false);
With this approach, your logic would go into the Core methods, which may be run synchronously or asynchronously (as determined by the sync parameter). If sync is true, then the core methods must return an already-completed task. For the implementation, use synchronous APIs to run synchronously, and use asynchronous APIs to run asynchronously.
Eventually, I recommend deprecating the synchronous APIs.
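Since the snippets above elide their arguments, here is a self-contained sketch of the same boolean-argument pattern using file I/O instead of HTTP. TextLoader and its members are hypothetical names, and File.ReadAllTextAsync assumes .NET Core 2.0 or later.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class TextLoader
{
    // Shared logic lives here. When sync is true, no await is ever reached,
    // so the returned task is already completed.
    private static async Task<string> LoadCoreAsync(string path, bool sync)
    {
        string text = sync
            ? File.ReadAllText(path)
            : await File.ReadAllTextAsync(path);
        return text.Trim();
    }

    // Existing synchronous API: safe to block because the task is already complete.
    public static string Load(string path) =>
        LoadCoreAsync(path, sync: true).GetAwaiter().GetResult();

    // New asynchronous API alongside it.
    public static Task<string> LoadAsync(string path) =>
        LoadCoreAsync(path, sync: false);

    public static async Task Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "  hello  ");
        Console.WriteLine(Load(path));            // prints "hello"
        Console.WriteLine(await LoadAsync(path)); // prints "hello"
        File.Delete(path);
    }
}
```

The point of the pattern is that the trimming step (standing in for real business logic) is written once, while each public method picks the matching kind of I/O.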
I just went through this very thing with the AWS S3 SDK. It used to be sync, and I built a bunch of code on that, but now it's async. And that's fine: they changed it, there is nothing to be gained by moaning about it, move on.
So I need to update my app, and my options are either to refactor a large part of my app to be async, or to "hack" the S3 async API to behave like sync.
I'll eventually get around to the larger async refactoring (there are many benefits), but for today I have bigger fish to fry, so I chose to fake the sync.
The original sync code was
ListObjectsResponse response = api.ListObjects(request);
and a really simple async equivalent that works for me is
Task<ListObjectsV2Response> task = api.ListObjectsV2Async(rq2);
ListObjectsV2Response rsp2 = task.GetAwaiter().GetResult();
While I get it that purists might pillory me for this, the reality is that this is just one of many pressing issues and I have finite time so I need to make tradeoffs. Perfect? No. Works? Yes.
You can call an async method from a non-async method. See the code below.
public ActionResult Test()
{
TestClass result = Task.Run(async () => await GetNumbers()).GetAwaiter().GetResult();
return PartialView(result);
}
public async Task<TestClass> GetNumbers()
{
TestClass obj = new TestClass();
HttpResponseMessage response = await APICallHelper.GetData(Functions.API_Call_Url.GetCommonNumbers);
if (response.IsSuccessStatusCode)
{
var result = await response.Content.ReadAsStringAsync();
obj = JsonConvert.DeserializeObject<TestClass>(result);
}
return obj;
}
