Run async operation in parallel without Task.WhenAll

I need to run three async I/O operations in parallel; specifically, they are database calls. So I wrote the following code:
// I need to know whether these tasks started running here
var task1 = _repo.GetThingOneAsync();
var task2 = _repo.GetThingTwoAsync();
var task3 = _repo.GetThingThreeAsync();
// await the results
var task1Result = await task1;
var task2Result = await task2;
var task3Result = await task3;
The GetThingOneAsync(), GetThingTwoAsync() and GetThingThreeAsync() methods are pretty much the same, except that they have different return types (Task<string>, Task<int>, Task<IEnumerable<int>>). An example of one of the database calls is the following:
public async Task<IEnumerable<int>> GetThingOneAsync(string code)
{
return await db.DrType.Where(t => t.Code == code).Select(t => t.IdType).ToListAsync();
}
In debug mode I can see that var task1 = _repo.GetThingOneAsync(); starts running the GetThingOneAsync() async method (the same goes for the other two tasks).
My colleagues say that _repo.GetThingOneAsync() does not start the async operation. They say that the operation starts when we reach await (that statement seems wrong to me).
So they suggest changing my code to the following:
var task1 = _repo.GetThingOneAsync();
var task2 = _repo.GetThingTwoAsync();
var task3 = _repo.GetThingThreeAsync();
await Task.WhenAll(task1, task2, task3);
// then get the result of each task through `Result` property.
In my opinion, it's the same as what I wrote at the very beginning of the question, except that Task.WhenAll waits for later tasks to finish if an earlier task faults (this is Servy's comment from this question).
I know that my question is kind of a duplicate, but I want to know whether I'm doing things right or wrong.

My colleagues say that _repo.GetThingOneAsync() does not start the async operation. They say that the operation starts when we reach await (that statement seems wrong to me).
They're wrong. The operation starts when the method is called.
This is easy to prove by starting an operation that has some observable side effect (like writing to a database). Call the method and then block the application, e.g., with Console.ReadKey(). You will then see the operation complete (in the database) without an await.
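A minimal sketch of that experiment, assuming a console app (no synchronization context). An in-memory flag stands in for the database row, Task.Delay stands in for the I/O, and the names StartDemo, SaveAsync and Saved are hypothetical:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StartDemo
{
    // Set to 1 once the simulated write has finished.
    public static int Saved;

    // Hypothetical stand-in for a repository method that writes to a database.
    public static async Task SaveAsync()
    {
        await Task.Delay(100);               // simulated I/O latency
        Interlocked.Exchange(ref Saved, 1);  // the observable side effect
    }

    // Calls SaveAsync, never awaits the returned task, then blocks the caller.
    public static bool RunWithoutAwait()
    {
        Interlocked.Exchange(ref Saved, 0);
        Task pending = SaveAsync();  // the operation starts here
        Thread.Sleep(1000);          // block the caller, like Console.ReadKey()
        return Volatile.Read(ref Saved) == 1;
    }

    public static void Main()
    {
        // Expected to print True: the write completed although the caller never awaited it.
        Console.WriteLine(RunWithoutAwait());
    }
}
```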
The remainder of the question is all about stylistic preference. There's a slight semantic difference between these options, but usually it's not important.
var task1 = _repo.GetThingOneAsync();
var task2 = _repo.GetThingTwoAsync();
var task3 = _repo.GetThingThreeAsync();
// await the results
var task1Result = await task1;
var task2Result = await task2;
var task3Result = await task3;
The code above will await (asynchronously wait) for each task to complete, one at a time. If all three complete successfully, then this code is equivalent to the Task.WhenAll approach. The difference is that if task1 or task2 has an exception, then task3 is never awaited.
This is my personal favorite:
var task1 = _repo.GetThingOneAsync();
var task2 = _repo.GetThingTwoAsync();
var task3 = _repo.GetThingThreeAsync();
await Task.WhenAll(task1, task2, task3);
var task1Result = await task1;
var task2Result = await task2;
var task3Result = await task3;
I like the Task.WhenAll because it's explicit. Reading the code, it's clear that it's doing asynchronous concurrency because there's a Task.WhenAll right there.
I like using await instead of Result because it's more resilient to code changes. In particular, if someone who doesn't like Task.WhenAll comes along and removes it, you still end up awaiting those tasks instead of using Result, which can cause deadlocks and wrap exceptions in AggregateException. The only reason Result works after Task.WhenAll is because those tasks have already been completed and their exceptions have already been observed.
But this is largely opinion.
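The semantic difference around exceptions can be sketched as follows. FailAsync is a hypothetical operation that always faults; the example only illustrates the behaviour described above and is not taken from the question's code:

```csharp
using System;
using System.Threading.Tasks;

public static class FaultDemo
{
    // Hypothetical operation that fails after a short delay.
    public static async Task<int> FailAsync(string name)
    {
        await Task.Delay(50);
        throw new InvalidOperationException(name);
    }

    // Sequential awaits: the first failing await throws, so `second` is never awaited.
    public static async Task<string> SequentialAsync()
    {
        Task<int> first = FailAsync("first");
        Task<int> second = FailAsync("second");
        try
        {
            await first;
            await second; // not reached
            return "no failure";
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message; // "first"
        }
    }

    // Task.WhenAll: waits for both tasks and records both exceptions.
    public static async Task<int> CountCollectedAsync()
    {
        Task<int> first = FailAsync("first");
        Task<int> second = FailAsync("second");
        Task all = Task.WhenAll(first, second);
        try
        {
            await all; // rethrows only one of the exceptions
        }
        catch (InvalidOperationException)
        {
        }
        return all.Exception.InnerExceptions.Count; // both are recorded here
    }

    public static async Task Main()
    {
        Console.WriteLine(await SequentialAsync());     // first
        Console.WriteLine(await CountCollectedAsync()); // 2
    }
}
```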

I agree with @MickyD: the tasks have been created on the initial call. The two calls are similar in effect.
A few nuances, though. When you call the GetThingOneAsync method, it executes synchronously until it reaches the first await on an operation that has not yet completed; that is when it returns the Task. If the async method never awaits anything incomplete, it runs to the end and returns an already-completed Task. So if these were compute-intensive routines (it doesn't look like it), you would not be achieving any parallelism; you would need to use Task.Run to achieve simultaneous execution.
Another point is that if you use await from the UI thread, then all of the continuations will run on the UI thread, just scheduled at different times. This is mostly OK if the Task is doing I/O, because the thread is free while the read/write is pending. However, it can start to add up, so if you are going to do anything substantial then you should put it on the thread pool (i.e. with Task.Run).
As for the comment from your colleagues, as I said, the task1,2,3 do start running before the awaits. But when you hit the await, the method that you are currently executing will suspend and return a Task. So it is somewhat correct that it is the await that creates the Task -- just that the task you are thinking about in your question (task1,2,3) is the one created when GetThingXxxAsync hits an await, not the one created when your main routine awaits task1,2,3.
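A small sketch of these nuances. SlowAsync and NeverYieldsAsync are hypothetical methods; Task.Delay stands in for real I/O, and the loop stands in for compute-intensive work:

```csharp
using System;
using System.Threading.Tasks;

public static class SyncPartDemo
{
    // Records that the synchronous first part of SlowAsync has run.
    public static bool ReachedFirstAwait;

    public static async Task<int> SlowAsync()
    {
        ReachedFirstAwait = true;  // runs synchronously, inside the caller's call
        await Task.Delay(100);     // first incomplete await: control returns to the caller here
        return 1;
    }

    // Awaits only an already-completed task, so it never yields to the caller.
    public static async Task<int> NeverYieldsAsync()
    {
        await Task.CompletedTask;
        return 2;
    }

    public static async Task Main()
    {
        Task<int> slow = SlowAsync();
        Console.WriteLine(ReachedFirstAwait);  // True
        Console.WriteLine(slow.IsCompleted);   // False

        Task<int> fast = NeverYieldsAsync();
        Console.WriteLine(fast.IsCompleted);   // True

        // CPU-bound work needs Task.Run to execute on a thread-pool thread.
        Task<long> sum = Task.Run(() =>
        {
            long s = 0;
            for (int i = 1; i <= 1000; i++) s += i;
            return s;
        });
        Console.WriteLine(await sum);          // 500500
        await slow;
    }
}
```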

Related

c#, multiple async task execute [duplicate]

In terms of performance, will these 2 methods run GetAllWidgets() and GetAllFoos() in parallel?
Is there any reason to use one over the other? There seems to be a lot happening behind the scenes with the compiler so I don't find it clear.
============= MethodA: Using multiple awaits ======================
public async Task<IHttpActionResult> MethodA()
{
var customer = new Customer();
customer.Widgets = await _widgetService.GetAllWidgets();
customer.Foos = await _fooService.GetAllFoos();
return Ok(customer);
}
=============== MethodB: Using Task.WaitAll =====================
public async Task<IHttpActionResult> MethodB()
{
var customer = new Customer();
var getAllWidgetsTask = _widgetService.GetAllWidgets();
var getAllFoosTask = _fooService.GetAllFoos();
Task.WaitAll(new Task[] {getAllWidgetsTask, getAllFoosTask});
customer.Widgets = getAllWidgetsTask.Result;
customer.Foos = getAllFoosTask.Result;
return Ok(customer);
}
=====================================

The first option will not execute the two operations concurrently. It will execute the first and await its completion, and only then the second.
The second option will execute both concurrently but will wait for them synchronously (i.e. while blocking a thread).
You shouldn't use either option, since the first completes more slowly than the second and the second blocks a thread needlessly.
You should wait for both operations asynchronously with Task.WhenAll:
public async Task<IHttpActionResult> MethodB()
{
var customer = new Customer();
var getAllWidgetsTask = _widgetService.GetAllWidgets();
var getAllFoosTask = _fooService.GetAllFoos();
await Task.WhenAll(getAllWidgetsTask, getAllFoosTask);
customer.Widgets = await getAllWidgetsTask;
customer.Foos = await getAllFoosTask;
return Ok(customer);
}
Note that once Task.WhenAll has completed, both tasks have already completed, so awaiting them completes immediately.

Short answer: No.
Task.WaitAll is blocking; await returns a task to the caller as soon as it is encountered and registers the remaining part of the function as a continuation.
The "bulk" waiting method you were looking for is Task.WhenAll, which actually creates a new Task that finishes when all tasks that were handed to the function are done.
Like so: await Task.WhenAll(getAllWidgetsTask, getAllFoosTask);
That covers the blocking matter.
Also, your first function does not execute both functions in parallel. To get this working with await you'd have to write something like this:
var widgetsTask = _widgetService.GetAllWidgets();
var foosTask = _fooService.GetAllFoos();
customer.Widgets = await widgetsTask;
customer.Foos = await foosTask;
This will make the first example act very similarly to the Task.WhenAll method.

As an addition to what @i3arnon said: you will see that when you use await you are forced to declare the enclosing method as async, but with WaitAll you are not. That should tell you that there is more to it than what the main answer says. Here it is:
WaitAll will block until the given tasks finish; it does not pass control back to the caller while those tasks are running. Also, as mentioned, the tasks run asynchronously with respect to each other, not to the caller.
await will not block the calling thread. It does suspend the execution of the code below it, but while the task is running, control is returned to the caller. Because control is returned to the caller (the called method is running asynchronously), you have to mark the method as async.
Hopefully the difference is clear. Cheers
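The blocking difference can be seen in a short sketch. The two Task.Delay calls are stand-ins for real work, and WaitVsWhenDemo is a hypothetical name:

```csharp
using System;
using System.Threading.Tasks;

public static class WaitVsWhenDemo
{
    public static bool WhenAllReturnsBeforeCompletion()
    {
        Task a = Task.Delay(200);
        Task b = Task.Delay(300);

        Task all = Task.WhenAll(a, b);         // returns immediately with a new task
        bool stillPending = !all.IsCompleted;  // nothing has finished yet

        Task.WaitAll(a, b);                    // blocks this thread until both finish
        return stillPending && a.IsCompleted && b.IsCompleted;
    }

    public static void Main()
    {
        Console.WriteLine(WhenAllReturnsBeforeCompletion()); // True
    }
}
```

Note that the method using Task.WaitAll is not async, while awaiting the task returned by Task.WhenAll would require it to be.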

Only your second option will run them in parallel. Your first will wait on each call in sequence.

As soon as you invoke the async method it will start executing. Whether it will execute on the current thread (and thus run synchronously) or run asynchronously cannot be determined from the call site.
Thus, in your first example the first method will start doing work, but then you artificially stop the flow of the code with the await. And thus the second method will not be invoked before the first is done executing.
The second example invokes both methods without stopping the flow with an await. Thus they will potentially run in parallel if the methods are asynchronous.

Can anyone tell which code block does parallel operation and why?

Code Block 1 :-
var services1 = new service1();
var services2 = new service2();
var result1 = await service1.GetData();
var result2 = await service2.GetData();
Code Block 2 :-
var services1 = new service1();
var services2 = new service2();
var task1 = await service1.GetData();
var task2 = await service2.GetData();
Task.WhenAll(task1,task2);
I got this question in my quiz today.
The options to choose from were CB1 or CB2.

Your first example is fine as long as await service1.GetData() does not throw an exception. If it does, service2.GetData() is never invoked, so there is no result or exception from it to observe.
It will, however, serialise the operations, as service2.GetData() will not be invoked until service1.GetData() has completed.
Your second example will not compile, unless you meant to do this:
var service1 = new service1();
var service2 = new service2();
var task1 = service1.GetData();
var task2 = service2.GetData();
await Task.WhenAll(task1, task2);
Where the Task.WhenAll is awaited rather than service1.GetData() and service2.GetData().
Then you can safely access the results like this:
var result1 = task1.Result;
var result2 = task2.Result;
The difference here is that there is only one place where an exception can be thrown: the await on Task.WhenAll. The task it returns aggregates the exceptions from all provided tasks, although await rethrows only the first of them.
It will also allow service2.GetData() to be invoked whilst any asynchronous work done by service1.GetData() is executing.
There is a third option as well, assuming service1.GetData() and service2.GetData() have the same return type:
var service1 = new service1();
var service2 = new service2();
var results = await Task.WhenAll(service1.GetData(), service2.GetData());
That way, the result of each Task will be added to an array (here results).
You could then extract the individual values:
var result1 = results[0];
var result2 = results[1];
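A runnable sketch of this third option. GetDataAsync is a hypothetical stand-in for service1.GetData() and service2.GetData(), with Task.Delay simulating the request. The results array follows the order of the arguments, not the order in which the tasks finish:

```csharp
using System;
using System.Threading.Tasks;

public static class ResultsDemo
{
    // Hypothetical stand-in for service.GetData(): returns `value` after `delayMs`.
    public static async Task<string> GetDataAsync(string value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    public static async Task<string[]> GetBothAsync()
    {
        // The slower task is listed first; its result still lands at index 0.
        return await Task.WhenAll(GetDataAsync("one", 200), GetDataAsync("two", 50));
    }

    public static async Task Main()
    {
        string[] results = await GetBothAsync();
        Console.WriteLine(string.Join(",", results)); // one,two
    }
}
```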

Normally I wouldn't answer a homework question, since you should have learned it in class. But I feel the need to answer this one, because it's a bad question and I fear you are being helped to misunderstand asynchronous programming.
Parallel != asynchronous.
"Parallel" means that two or more pieces of code are being executed at the same time. That means there is more than one thread. It's about how code runs.
"Asynchronous" means that while a block of code is waiting for some external operation, the thread is freed to do some other work, instead of locking the thread. It's about how code waits.
Let's assume that GetData() makes a network request to get the data. This is what happens in that second example:
service1.GetData() runs until the network request is sent and returns a Task.
service2.GetData() runs until the network request is sent and returns a Task.
So far, both network requests have been sent and we're waiting for responses. Everything has happened on the same thread, not in parallel. But we still need to run the continuation of each (everything after await in GetData()) after each response is received. How those run depends on if the application has a synchronization context.
If there is a synchronization context (ASP.NET, or UI app, for example) then nothing will run in parallel. The continuation of each call to GetData() will run one after the other on the same thread.
If there is no synchronization context (ASP.NET Core or a console app, for example, or when ConfigureAwait(false) is used inside GetData()), then each continuation will run on a ThreadPool thread as soon as the responses come back, which may happen in parallel.
If your teacher wants you to put B, then put the answer that will get you the marks. But it might actually be wrong, unless you have been given more detail about the type of application and if it has a synchronization context.
Also, there should be an await before Task.WhenAll().
Microsoft has an excellent series of articles about Asynchronous programming with async and await that is worth the read. You will find the other articles in that series in the table of contents on the left of that first article.

Awaiting async tasks instantly vs declaring first and then awaiting

Let's look at the following 2 examples:
public class MyClass
{
public async Task Main()
{
var result1 = "";
var result2 = "";
var request1 = await DelayMe();
var request2 = await DelayMe();
result1 = request1;
result2 = request2;
}
private static async Task<String> DelayMe()
{
await Task.Delay(2000);
return "";
}
}
And:
public class MyClass
{
public async Task Main()
{
var result1 = "";
var result2 = "";
var request1 = DelayMe();
var request2 = DelayMe();
result1 = await request1;
result2 = await request2;
}
private static async Task<String> DelayMe()
{
await Task.Delay(2000);
return "";
}
}
The first example shows how you would typically write async await code, where one thing happens after the other and each call is awaited properly.
The second one is first calling the async Task method but it's awaiting it later.
The first example takes a bit over 4000ms to execute because the await is computing the first request before it makes the second; but the second example takes a bit over 2000ms. This happens because the Task actually starts running as soon as the execution steps over the var request1 = DelayMe(); line which means that request1 and request2 are running in parallel. At this point it looks like the await keyword just ensures that the Task is computed.
The second approach feels and acts like a await Task.WhenAll(request1, request2), but in this scenario, if something fails in the 2 requests, you will get an exception instantly instead of waiting for everything to compute and then getting an AggregateException.
My question is: is there a drawback (performance or otherwise) in using the second approach to run multiple awaitable Tasks in parallel when the result of one doesn't depend on the execution of the other? Looking at the lowered code, it looks like the second example generates an equal number of System.Threading.Tasks.Task`1 per awaited item while the first one doesn't. Is this still going through the async await state-machine flow?

if something fails in the 2 requests, you will get an exception instantly instead of waiting for everything to compute and then getting an AggregateException.
If something fails in the first request, then yes. If something fails in the second request, then no, you wouldn't check the second request results until that task is awaited.
My question is: is there a drawback (performance or otherwise) in using the second approach to run multiple awaitable Tasks in parallel when the result of one doesn't depend on the execution of the other? Looking at the lowered code, it looks like the second example generates an equal number of System.Threading.Tasks.Task`1 per awaited item while the first one doesn't. Is this still going through the async await state-machine flow?
It's still going through the state machine flow. I tend to recommend await Task.WhenAll because the intent of the code is more explicit, but there are some people who don't like the "always wait even when there are exceptions" behavior. The flip side to that is that Task.WhenAll always collects all the exceptions - if you have fail-fast behavior, then some exceptions could be ignored.
Regarding performance, concurrent execution would be better because you can do multiple operations concurrently. There's no danger of threadpool exhaustion from this because async/await does not use additional threads.
As a side note, I recommend using the term "asynchronous concurrency" for this rather than "parallel", since to many people "parallel" implies parallel processing, i.e., Parallel or PLINQ, which would be the wrong technologies to use in this case.
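The performance point can be checked with a rough timing sketch. It reuses the shape of DelayMe from the question with a shorter delay; TimingDemo and its method names are hypothetical, and the measured values vary from run to run:

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class TimingDemo
{
    private static async Task<string> DelayMe()
    {
        await Task.Delay(500);
        return "";
    }

    // Each call is awaited before the next one starts: roughly 2 x 500 ms.
    public static async Task<long> SequentialMsAsync()
    {
        var sw = Stopwatch.StartNew();
        await DelayMe();
        await DelayMe();
        return sw.ElapsedMilliseconds;
    }

    // Both calls are started first, then awaited together: roughly 500 ms.
    public static async Task<long> ConcurrentMsAsync()
    {
        var sw = Stopwatch.StartNew();
        Task<string> request1 = DelayMe();
        Task<string> request2 = DelayMe();
        await Task.WhenAll(request1, request2);
        return sw.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        Console.WriteLine("sequential: " + await SequentialMsAsync() + " ms");
        Console.WriteLine("concurrent: " + await ConcurrentMsAsync() + " ms");
    }
}
```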

The drawback of using the second approach to run multiple awaitable tasks in parallel is that the parallelism is not obvious. And non-obvious parallelism (implicit multithreading, in other words) is dangerous because the bugs it can introduce are notoriously inconsistent and sporadically observed. Let's suppose that the actual DelayMe running in the production environment was the one below:
private static int delaysCount = 0;
private static async Task<String> DelayMe()
{
await Task.Delay(2000);
return (++delaysCount).ToString();
}
Sequentially awaited calls to DelayMe will return increasing numbers. Calls awaited in parallel will occasionally return the same number.
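One possible way to remove that race, sketched under the assumption that the counter is the only shared state, is to make the increment atomic with Interlocked.Increment. DelayMeSafe and CounterDemo are hypothetical names. The race itself only arises when continuations can run on different threads, for example when there is no synchronization context:

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CounterDemo
{
    private static int delaysCount = 0;

    // Same shape as DelayMe above, but the increment is atomic.
    public static async Task<string> DelayMeSafe()
    {
        await Task.Delay(50);
        return Interlocked.Increment(ref delaysCount).ToString();
    }

    // Starts `calls` operations concurrently and counts the distinct results.
    public static async Task<int> DistinctResultsAsync(int calls)
    {
        string[] results = await Task.WhenAll(
            Enumerable.Range(0, calls).Select(_ => DelayMeSafe()));
        return results.Distinct().Count();
    }

    public static async Task Main()
    {
        Console.WriteLine(await DistinctResultsAsync(100)); // 100
    }
}
```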

How to make two Tasks run with an even distribution of cpu time

They start out even, but eventually processTasks never gets hit.
Originally I had this as two threads when the tasks were simple. Someone suggested async/await tasks, and being new to C# I had no reason to doubt them.
Task monitorTasks= new Task (monitor.start );
Task processTasks= new Task( () => processor.process(ref param, param2) );
monitorTasks.Start();
processTasks.Start();
await processTasks;
Have I executed this wrong? Is my problem inevitable while running two tasks? Should they be threads? How can I avoid this?
edit
To clarify. The tasks are never intended to end. They will always be processing and monitoring while triggering events that notify watchers of monitor outputs or processor outputs.

If you await Task.WhenAll, it will wait until all tasks have been processed:
await Task.WhenAll(monitorTasks, processTasks);
https://msdn.microsoft.com/en-us/library/system.threading.tasks.task.whenall(v=vs.110).aspx

Task.WaitAll blocks the current thread until everything has completed.
Task.WhenAll returns a task which represents the action of waiting until everything has completed.
Task.WhenAll Method
Creates a task that will complete when all of the Task objects in an
enumerable collection have completed.
Task.WaitAll Method
Waits for all of the provided Task objects to complete execution.
If you want to block while waiting on started tasks (which is seemingly what you want):
Task monitorTasks= new Task (monitor.start );
Task processTasks= new Task( () => processor.process(ref param, param2) );
monitorTasks.Start();
processTasks.Start();
Task.WaitAll(new Task[] { monitorTasks, processTasks });
If you are using async await, see Asynchronous programming with async and await
Then you could do something like this
var task1 = DoWorkAsync();
var task2 = DoMoreWorkAsync();
await Task.WhenAll(task1, task2);

I couldn't get tasks to run evenly.
The monitor task was getting constantly flooded whereas the processor task was getting tasks less frequently, which is when I suspect the monitor task took over.
Since no one could help me, my solution was to turn them back into threads and set the priority of the threads: lower than normal for the monitor task, and higher than normal for the processor task.
This seems to have solved my problem.

difference between await Task(ReadFromIO) and await Task.WhenAll(task1,task2);

I read in a book about the differences between the two methods below.
private async Task GetDataAsync() {
var task1 = ReadDataFromIOAsync();
var task2 = ReadDataFromIOAsync();
// Here we can do more processing
// that doesn't need the data from the previous calls.
// Now we need the data so we have to wait
await Task.WhenAll(task1, task2);
// Now we have data to show.
lblResult.Content = task1.Result;
lblResult2.Content = task2.Result;
}
private async Task GetDataAsync() {
var task1 = ReadDataFromIOAsync();
var task2 = ReadDataFromIOAsync();
lblResult.Content = await task1;
lblResult2.Content = await task2;
}
I understood what's happening in the first method's await statement. But for the second one, though I understood the logic, I couldn't understand the pitfall of the second implementation compared to the first. In the book, they mentioned that the compiler rewrites the method twice. What I understood is that, because of the two await calls, there could be more of a time delay than in the first one, as we separately call await for each task here. Can someone explain it to me in a better way?

I don't know what point your book was trying to make, but I agree with your initial guess at its interpretation.
Potentially the issue is that there could be a period of time where lblResult shows new data and lblResult2 shows old data if task2 takes longer than task1 to process. In the first method you wait until both tasks finish and then update both labels at the same time (and when you exit your method both get repainted on the screen at the same time). In the second method you update the first label, then you give the message loop an opportunity to repaint the screen, then some time later you update the second label and have that value get updated on the screen.
I guess you would have a slightly more complex state machine for the second example too, but the overhead of that is negligible; I am confident the book was trying to point out the issue you and I both came up with.

Essentially what the first method is doing is this:
Start task1
Start task2
Asynchronously wait for task1 and task2 to complete
The second implementation does this:
Start task1
Start task2
Asynchronously wait for task1 to complete.
Once task1 is complete, resume execution and then asynchronously wait for task2 to complete.
So with the second approach you are individually awaiting the results of each task rather than waiting for both tasks to complete. In the case where task1 completes before task2, the code will resume execution and then return straight away; that will result in an extra context switch, which may take extra time. Also, for the case of multiple awaits the compiler may end up generating a more complex state machine, but the effect of that should be negligible.
In either case you are not using the result until both tasks are complete so the behavior of the application shouldn't be too different.
