async all the way down issue - c#

I have an async asp.net controller. This controller calls an async method. The method that actually performs the async IO work is deep down in my application. The series of methods between the controller and the last method in the chain are all marked with the async modifier. Here is an example of how I have the code setup:
public async Task<ActionResult> Index(int[] ids)
{
List<int> listOfDataPoints = dataPointService(ids);
List<Task> dpTaskList = new List<Task>();
foreach (var x in listOfDataPoints)
{
dpTaskList.Add(C_Async(x));
}
await Task.WhenAll(dpTaskList);
return View();
}
private async Task C_Async(int id)
{
//this method executes very fast
var idTemp = paddID(id);
await D_Async(idTemp);
}
private async Task D_Async(string id)
{
//this method executes very fast
await E_Async(id);
}
private async Task E_Async(string url)
{
//this method performs the actual async IO
result = await new WebClient().DownloadStringTaskAsync(new Uri(url))
saveContent(result);
}
As you can see the controller calls C_Async(x) asynchronously then there is a chain of async methods to E_Async. There are methods between the controller and E_Async and all have the async modifier. Is there a performance penalty since there are methods using the async modifyer but not doing any async IO work?
Note: This is a simplified version of the real code there are more async methods between the controller and the E_Async method.

Yes. There is a penalty (though not a huge one), and if you don't need to be async don't be. This pattern is often called "return await" where you can almost always remove both the async and the await. Simply return the task you already have that represents the asynchronous operations:
private Task C_Async(int id)
{
// This method executes very fast
var idTemp = paddID(id);
return D_Async(idTemp);
}
private Task D_Async(string id)
{
// This method executes very fast
return E_Async(id);
}
In this specific case Index will only await the tasks that E_Async returns. That means that after all the I/O is done the next line of code will directly be return View();. C_Async and D_Async already ran and finished in the synchronous call.

You must be careful about the thread message pumps and what async really does. The sample below calls into an async method which calls two other async methods which start two tasks to do the actual work which wait 2 and 3 seconds.
13.00 6520 .ctor Calling async method
13.00 6520 RunSomethingAsync Before
13.00 6520 GetSlowString Before
13.00 5628 OtherTask Sleeping for 2s
15.00 5628 OtherTask Sleeping done
15.00 6520 GetVerySlow Inside
15.00 2176 GetVerySlow Sleeping 3s
18.00 2176 GetVerySlow Sleeping Done
18.00 6520 RunSomethingAsync After GetSlowOtherTaskResultGetVerySlowReturn
As you can see the calls are serialized which might not be what you want when you after performance. Perhaps the two distinct await calls do not depend on each other and can be started directly as tasks.
All methods until GetSlowStringBefore are called on the UI or ASP.NET thread that started the async operation (if it it has a message pump). Only the last call with the result of the operation are marshalled back to the initiating thread.
The performance penalty is somewhere in the ContextSwitch region to wake up an already existing thread. This should be somewhere at microsecond level. The most expensive stuff would be the creation of the managed objects and the garbage collector cleaning up the temporary objects. If you call this in a tight loop you will be GC bound because there is an upper limit how many threads can be created. In that case TPL will buffer your tasks in queues which require memory allocations and then drain the queues with n worker threads from the thread pool.
On my Core I7 I get an overhead of 2microseconds for each call (comment out the Debug.Print line) and a memory consumption of 6,5GB for 5 million calls in a WPF application which gives you a memory overhead of 130KB per asynchronous operation chain. If you are after high scalability you need to watch after your GC. Until Joe Duffy has finished his new language we have to use CLR we currently have.
public partial class MainWindow : Window
{
public MainWindow()
{
InitializeComponent();
Print("Calling async method");
RunSomethingAsync();
}
private async void RunSomethingAsync()
{
Print("Before");
string msg = await GetSlowString();
Print("After " + msg);
cLabel.Content = msg;
}
void Print(string message, [CallerMemberName] string method = "")
{
Debug.Print("{0:N2} {1} {2} {3}", DateTime.Now.Second, AppDomain.GetCurrentThreadId(), method, message);
}
private async Task<string> GetSlowString()
{
Print("Before");
string otherResult = await OtherTask();
return "GetSlow" + otherResult + await GetVerySlow(); ;
}
private Task<string> OtherTask()
{
return Task.Run(() =>
{
Print("Sleeping for 2s");
Thread.Sleep(2 * 1000);
Print("Sleeping done");
return "OtherTaskResult";
});
}
private Task<string> GetVerySlow()
{
Print("Inside");
return Task.Run(() =>
{
Print("Sleeping 3s");
Thread.Sleep(3000);
Print("Sleeping Done");
return "GetVerySlowReturn";
});
}
}

Related

Is Sleep really blocking the execution?

Nearly every introduction about async programming for C# warns against using the Sleep instruction, because it would block the whole thread.
But I found that during sleep, the Tasks from the queue are being fetched and executed. See:
using System;
using System.Threading.Tasks;
namespace TestApp {
class Program
{
static void Main(string[] args)
{
Console.WriteLine("Main");
Program.step1();
for (int i = 0; i < 6; i++) {
System.Threading.Thread.Sleep(200);
Console.WriteLine("Sleep-Loop");
}
}
private static async void step1() {
await Task.Delay(400);
Console.WriteLine("Step1");
Program.step2();
}
private static async void step2() {
await Task.Delay(400);
Console.WriteLine("Step2");
}
}
}
The output:
Main
Sleep-Loop
Sleep-Loop
Step1
Sleep-Loop
Sleep-Loop
Step2
Sleep-Loop
Sleep-Loop
My questions:
Is Sleep really allows the queued tasks to execute, or something else happens?
If yes, then does this also happen in every other cases of idleness? For example during polling?
In the above example, if we comment out the loop, then the application exits before any tasks could get executed. Is there another way to prevent that?
In C# 7.3 you can have async entry points, I suggest using that.
Some notes :
Don't use async void, it has subtleties with the way it deals with errors, if you see yourself writing async void then think about what you are doing. If it's not for an event handler you are probably doing something wrong
If you want to wait for a bunch of tasks to finish, use Task.WhenAll
Modified example
static async Task Main(string[] args)
{
Console.WriteLine("Start Task");
var task = Program.step1();
for (int i = 0; i < 6; i++)
{
await Task.Delay(100);
Console.WriteLine("Sleep-Loop");
}
Console.WriteLine("waiting for the task to finish");
await task;
Console.WriteLine("finished");
Console.ReadKey();
}
private static async Task step1()
{
await Task.Delay(1000);
Console.WriteLine("Step1");
await Program.step2();
}
private static async Task step2()
{
await Task.Delay(1000);
Console.WriteLine("Step2");
}
It's important to note Tasks are not threads and async is not parallel, however they can be.
9 times out of 10 if you are using the async await pattern it is for IO bound work to use operating system I/O completion ports so you can free up threads. It's a scalability and UI responsiveness feature.
If you aren't doing any I/O work, then there is actually very little need for the async await pattern at all, and as such CPU work should probably be just wrapped in a Task.Run at the point of calling. Not wrapped in an async method.
At this point it's also good to note just using tasks are not the async and await pattern. Although they both have tasks in common, they are not the same thing.
Lastly, if you find you need to use asynchronous code in a fire and forget way, think very carefully how you will handle any errors.
Here are some guidelines.
If you want to do I/O work, use the async await pattern.
If you want to do CPU work use Task.Run.
Never use async void unless it's for an event handler.
Never wrap CPU work in an async method, let the caller use Task.Run
If you need to wait for a task, await it, never call Result, or Wait or use Task.WhenAll

Nested await or async's "glutting"

Let's consider the next procedures hierarhy
Main.cs:
// timer callback
{
Plot.UpdateData();
}
Plot.cs:
public async void UpdateData()
{
await CalculateData();
// ...
}
private async Task CalculateData()
{
await Calculations.Calculate();
// UI updating
// ...
}
Calculations.cs:
public static async Task<bool> Calculate()
{
async Task<bool> CalculateLR()
{
var task1 = Calculate1();
var task2 = Calculate2();
await Task.WhenAll(new[] { task1, task2 });
return true;
}
var leftTask = CalculateLR();
var rightTask = CalculateLR();
await Task.WhenAll(new[] { leftTask, rightTask });
await Calculate3();
return true;
}
Here I have some basic calculations (in Calculate1-Calculate3 procedures) of Calculations.cs file and some interaction with UI. The "entry point" Plot.UpdateData is placed in Device.StartTimer( block of the main form.
It works, but I think this structure creates excess threads. So my question is can this hierarhy be optimized without loss of asynchronous advantages?
Or, other words, which procedures should be awaited in case of nested calls. Where is first non-awaited call should lie? Thanks.
First thing to note: async/await is about tasks, not threads. Under certain circumstances, a task can be thread equivalent, and in most cases it is not (the same thread can serve a lot of tasks sequentially conveyor-style, depending on how they're scheduled for continuation and what is awaiting condition, for example).
And I could strongly recommend to address this for further reading as very comprehensive source:
https://blog.stephencleary.com/2013/11/there-is-no-thread.html
https://blog.stephencleary.com/2014/05/a-tour-of-task-part-1-constructors.html

Issue with calling async method inside Parallel.Foreach method

I have case there i want to call one asyn method inside paralle.Foreach loop
public void ItemCheck<T>(IList<T> items,int id)
{
Parallel.ForEach(items, (current) =>
{
PostData(current,id);
});
Console.log("ItemCheck executed")
}
public async void PostData<T>(T obj, int id)
{
Console.lgo("PosstData executed")
}
Output :
ItemCheck executed
PosstData executed
Why it happens like this ?? Before completing execution of PostData method,next line is executed.How can i solve this issue.Anyone help on this
Why it happens like this ??
Because you're using async void.
Also - as Jon Skeet mentioned - there's no point in doing parallelism here. Parallel processing is splitting the work over multiple threads. What you really want is concurrency, not parallelism, which can be done with Task.WhenAll:
public async Task ItemCheckAsync<T>(IList<T> items, int id)
{
var tasks = items.Select(current => PostDataAsync(current, id));
await Task.WhenAll(tasks);
}
public async Task PostDataAsync<T>(T obj, int id)
The phrase "in parallel" is commonly used to mean "doing more than one thing at a time", but that usage has misled you into using Parallel, which is not the appropriate tool in this case. This is one reason why I strongly prefer the term "concurrent", and reserve the term "parallel" for things that the Parallel class does.
The problem is that your PostData method is executed asynchronous and nothing tells the parallel loop to wait until completion of all task.
An alternative i would use to sync the execution flow:
var tasks = items
.AsParallel()
.WithDegreeOfParallelisum(...)
.Select(async item => await PostData(item, id))
.ToArray();
Task.WaitAll(tasks); // this will wait for all tasks to finnish
Also your async methods, even void they should return Task not void. Being more explicit about your code is one plus of this approach, additionally, you can use the task for any other operations, like in the case be waited to finish.
public async Task PostData<T>(T obj, int id)
Do you even need to create/start async task in parallel and then wait for them? The result of this is just creating tasks in parallel (which is not a heavy operation, so why do it in parallel ?).
If you dont do any additional heavy work in the parallel loop except the PostData method, i think you don't need the parallel loop at all.

What task is being returned and why is this task status RanToCompletion

I am rather new to task based programming and trying to determine how to return a task and verify that it has been started. The code that I got to work was not what I was expecting. The console application is as follows:
public static void Main(string[] args)
{
var mySimple = new Simple();
var cts = new CancellationTokenSource();
var task = mySimple.RunSomethingAsync(cts.Token);
while (task.Status != TaskStatus.RanToCompletion)
{
Console.WriteLine("Starting...");
Thread.Sleep(100);
}
Console.WriteLine("It is started");
Console.ReadKey();
cts.Cancel();
}
public class Simple
{
public async void RunSomething(CancellationToken token)
{
var count = 0;
while (true)
{
if (token.IsCancellationRequested)
{
break;
}
Console.WriteLine(count++);
await Task.Delay(TimeSpan.FromMilliseconds(1000), token).ContinueWith(task => { });
}
}
public Task RunSomethingAsync(CancellationToken token)
{
return Task.Run(() => this.RunSomething(token));
}
}
The output is:
Starting...
0
It is started
1
2
3
4
Why is the task that is being returned have a status as TaskStatus.RanToCompletion compared to TaskStatus.Running as we see that the while loop is still executing? Am I checking the status of the task of putting the RunSomething task on the threadpool rather than the RunSomething task itself?
RunSomething is an async void method, meaning it exposes no means of the caller ever determining when it finishes, they can only ever start the operation and then have no idea what happens next. You then wrap a call to it inside of Task.Run, this is schedluing a thread pool thread to start RunSomething. It will then complete as soon as it has finished starting that Task.
If RunSomething actually returned a Task, then the caller would be able to determine when it actually finished, and if you waited on it it wouldn't actually indicate that it was done until that asynchronous operation was actually finished (there would be no reason to use Task.Run to start it in another thead, you'd be better off just calling it directly and not wasting the effort of moving that to a thread pool thread).
Never use async void (https://msdn.microsoft.com/en-us/magazine/jj991977.aspx)
instead you should use async Task
If you need to call an async method from a non-async (such as from a static void main) you should do something like this:
mySimple.RunSomethingAsync(cts.Token).GetAwaiter().GetResult();
That will effectively make the method a synchronous call.
You can use async void, but only for events.

Inside a loop,does each async call get chained to the returned task using task's continuewith?

The best practice is to collect all the async calls in a collection inside the loop and do Task.WhenAll(). Yet, want to understand what happens when an await is encountered inside the loop, what would the returned Task contain? what about further async calls? Will it create new tasks and add them to the already returned Task sequentially?
As per the code below
private void CallLoopAsync()
{
var loopReturnedTask = LoopAsync();
}
private async Task LoopAsync()
{
int count = 0;
while(count < 5)
{
await SomeNetworkCallAsync();
count++;
}
}
The steps I assumed are
LoopAsync gets called
count is set to zero, code enters while loop, condition is checked
SomeNetworkCallAsync is called,and the returned task is awaited
New task/awaitable is created
New task is returned to CallLoopAsync()
Now, provided there is enough time for the process to live, How / In what way, will the next code lines like count++ and further SomeNetworkCallAsync be executed?
Update - Based on Jon Hanna and Stephen Cleary:
So there is one Task and the implementation of that Task will involve
5 calls to NetworkCallAsync, but the use of a state-machine means
those tasks need not be explicitly chained for this to work. This, for
example, allows it to decide whether to break the looping or not based
on the result of a task, and so on.
Though they are not chained, each call will wait for the previous call to complete as we have used await (in state m/c, awaiter.GetResult();). It behaves as if five consecutive calls have been made and they are executed one after the another (only after the previous call gets completed). If this is true, we have to be bit more careful in how we are composing the async calls.For ex:
Instead of writing
private async Task SomeWorkAsync()
{
await SomeIndependentNetworkCall();// 2 sec to complete
var result1 = await GetDataFromNetworkCallAsync(); // 2 sec to complete
await PostDataToNetworkAsync(result1); // 2 sec to complete
}
It should be written
private Task[] RefactoredSomeWorkAsync()
{
var task1 = SomeIndependentNetworkCall();// 2 sec to complete
var task2 = GetDataFromNetworkCallAsync()
.ContinueWith(result1 => PostDataToNetworkAsync(result1)).Unwrap();// 4 sec to complete
return new[] { task1, task2 };
}
So that we can say RefactoredSomeWorkAsync is faster by 2 seconds, because of the possibility of parallelism
private async Task CallRefactoredSomeWorkAsync()
{
await Task.WhenAll(RefactoredSomeWorkAsync());//Faster, 4 sec
await SomeWorkAsync(); // Slower, 6 sec
}
Is this correct? - Yes. Along with "async all the way", "Accumulate tasks all the way" is good practice. Similar discussion is here
When count is zero, new task will be created because of await and be returned
No. It will not. It will simply call the async method consequently, without storing or returning the result. The value in loopReturnedTask will store the Task of LoopAsync, not related to SomeNetworkCallAsync.
await SomeNetworkCallAsync(); // call, wait and forget the result
You may want to read the MSDN article on async\await.
To produce code similar to what async and await do, if those keywords didn't exist, would require code a bit like:
private struct LoopAsyncStateMachine : IAsyncStateMachine
{
public int _state;
public AsyncTaskMethodBuilder _builder;
public TestAsync _this;
public int _count;
private TaskAwaiter _awaiter;
void IAsyncStateMachine.MoveNext()
{
try
{
if (_state != 0)
{
_count = 0;
goto afterSetup;
}
TaskAwaiter awaiter = _awaiter;
_awaiter = default(TaskAwaiter);
_state = -1;
loopBack:
awaiter.GetResult();
awaiter = default(TaskAwaiter);
_count++;
afterSetup:
if (_count < 5)
{
awaiter = _this.SomeNetworkCallAsync().GetAwaiter();
if (!awaiter.IsCompleted)
{
_state = 0;
_awaiter = awaiter;
_builder.AwaitUnsafeOnCompleted<TaskAwaiter, TestAsync.LoopAsyncStateMachine>(ref awaiter, ref this);
return;
}
goto loopBack;
}
_state = -2;
_builder.SetResult();
}
catch (Exception exception)
{
_state = -2;
_builder.SetException(exception);
return;
}
}
[DebuggerHidden]
void IAsyncStateMachine.SetStateMachine(IAsyncStateMachine param0)
{
_builder.SetStateMachine(param0);
}
}
public Task LoopAsync()
{
LoopAsyncStateMachine stateMachine = new LoopAsyncStateMachine();
stateMachine._this = this;
AsyncTaskMethodBuilder builder = AsyncTaskMethodBuilder.Create();
stateMachine._builder = builder;
stateMachine._state = -1;
builder.Start(ref stateMachine);
return builder.Task;
}
(The above is based on what happens when you use async and await except that the result of that uses names that cannot be valid C# class or field names, along with some extra attributes. If its MoveNext() reminds you of an IEnumerator that's not entirely irrelevant, the mechanism by which await and async produce an IAsyncStateMachine to implement a Task is similar in many ways to how yield produces an IEnumerator<T>).
The result is a single Task which comes from AsyncTaskMethodBuilder and makes use of LoopAsyncStateMachine (which is close to the hidden struct that the async produces). Its MoveNext() method is first called upon the task being started. It will then use an awaiter on SomeNetworkCallAsync. If it is already completed it moves on to the next stage (increment count and so on), otherwise it stores the awaiter in a field. On subsequent uses it will be called because the SomeNetworkCallAsync() task has returned, and it will get the result (which is void in this case, but could be a value if values were returned). It then attempts further loops and again returns when it is waiting on a task that is not yet completed.
When it finally reaches a count of 5 it calls SetResult() on the builder, which sets the result of the Task that LoopAsync had returned.
So there is one Task and the implementation of that Task will involve 5 calls to NetworkCallAsync, but the use of a state-machine means those tasks need not be explicitly chained for this to work. This, for example, allows it to decide whether to break the looping or not based on the result of a task, and so on.
When an async method first yields at an await, it returns a Task (or Task<T>). This is not the task being observed by the await; it is a completely different task created by the async method. The async state machine controls the lifetime of that Task.
One way to think of it is to consider the returned Task as representing the method itself. The returned Task will only complete when the method completes. If the method returns a value, then that value is set as the result of the task. If the method throws an exception, then that exception is captured by the state machine and placed on that task.
So, there's no need for attaching continuations to the returned task. The returned task will not complete until the method is done.
How / In what way, will the next code lines like count++ and further SomeNetworkCallAsync be executed?
I do explain this in my async intro post. In summary, when a method awaits, it captures a "current context" (SynchronizationContext.Current unless it is null, in which case it uses TaskScheduler.Current). When the await completes, it resumes executing its async method within that context.
That's what technically happens; but in the vast majority of cases, this simply means:
If an async method starts on a UI thread, then it will resume on that same UI thread.
If an async method starts within an ASP.NET request context, then it will resume with that same request context (not necessarily on the same thread, though).
Otherwise, the async method resumes on a thread pool thread.

Categories