async await calling long running sync AND async methods - c#

I am using asp.net web api 2 and Entity Framework 6.
Original pseudo code
public IHttpActionResult GetProductLabel(int productId)
{
var productDetails = repository.GetProductDetails(productId);
var label = labelCalculator.Render(productDetails);
return Ok(label);
}
Modified code
public async Task<IHttpActionResult> GetProductLabel(int productId)
{
var productDetails = await repository.GetProductDetailsAsync(productId); // 1 long second as this call goes into sub services
var label = labelCalculator.Render(productDetails); // 1.5 seconds synchrounous code
return Ok(label);
}
Before my change everything ran synchronously.
After my change the call to a remote service which is calling again a database is done the async-await way.
Then I do a sync call to a rendering library which offers only sync methods. The calculation takes 1,5 seconds.
Is there still a benefit that I did the remote database_service call the async-await way but the 2nd call not? And is there anything I could still improve?
Note
The reason why I ask this is because:
"With async controllers when a process is waiting for I/O to complete, its thread is freed up for the server to use for processing other requests."
So when the first remote database_service call is processing and awaiting for that 1 second the thread is returned to IIS??!!
But what about the 2nd label calculation taking 1,5 seconds that will block the current thread again for 1,5 seconds ?
So I release and block the thread, that does not make sense or what do you think?

The rendering library is not simply "blocking a thread", it is doing work performing your rendering. There is nothing better you can do.

Is there still a benefit that I did the remote database_service call
the async-await way but the 2nd call not?
Yes, this call is now non-blocking and can run alongside other code, even though it is only for 1 second.
And is there anything I could still improve?
If you can have the second call run asynchronously too, the whole method can run asynchronously and won't block at all.

Asynchronous code creates a certain continuation under the hood, it is a sort of synthetic sugar to make asynchronous programming feel more synchonious.
In general, depending on the operations themselves, it could help to make both long running tasks asynchronous. It would use different tasks under the hood for each different long running task and run them asynchronously.
At the moment, these tasks run completely synchoniously within GetProductLabel, meaning that if it is the only method you are calling you would not tell th difference between synchonious code.
If it is possible I would make the second method async, since I am not familiar with any sagnificant drawbacks of using tasks and async await.
In your case there is nothing better you can do, and it won't make much difference, since you have to run it synchronously, as you are using the result from the first method.

Related

Not sure where to use an await operator in an async method

After reading lot of stuff about async/await, I still not sure where do I use await operator in my async method:
public async Task<IActionResult> DailySchedule(int professionalId, DateTime date)
{
var professional = professionalServices.Find(professionalId);
var appointments = scheduleService.SearchForAppointments(date, professional);
appointments = scheduleService.SomeCalculation(appointments);
return PartialView(appointments);
}
Should I create an async version for all 3 method and call like this?
var professional = await professionalServices.FindAsync(professionalId);
var appointments = await scheduleService.SearchForAppointmentsAsync(date, professional);
appointments = await scheduleService.SomeCalculationAsync(appointments);
or Should I make async only the first one ?
var professional = await professionalServices.FindAsync(professionalId);
var appointments = scheduleService.SearchForAppointments(date, professional);
appointments = scheduleService.SomeCalculation(appointments);
What´s is the difference?
I still not sure where do I use await operator in my async method
You're approaching the problem from the wrong end.
The first thing to change when converting to async is the lowest-level API calls. There are some operations that are naturally asynchronous - specifically, I/O operations. Other operations are naturally synchronous - e.g., CPU code.
Based on the names "Find", "SearchForAppointments, and "SomeCalculation", I'd suspect that Find and SearchForAppointments are I/O-based, possibly hitting a database, while SomeCalculation is just doing some CPU calculation.
But don't change Find to FindAsync just yet. That's still going the wrong way. The right way is to start at the lowest API, so whatever I/O that Find eventually does. For example, if this is an EF query, then use the EF6 asynchronous methods within Find. Then Find should become FindAsync, which should drive DailySchedule to be DailyScheduleAsync, which causes its callers to become async, etc. That's the way that async should grow.
As VSG24 has said, you should await each and every call that can be awaited that is because this will help you keep the main thread free from any long running task — that are for instance, tasks that download data from internet, tasks that save the data to the disk using I/O etc. The problem was that whenever you had to do a long running task, it always froze the UI. To overcome this, asynchronous pattern was used and thus this async/await allows you create a function that does the long running task on the background and your main thread keeps talking to the users.
I still not sure where do I use await operator in my async method
The intuition is that every function ending with Async can be awaited (provided their signature also matches, the following) and that they return either a Task, or Task<T>. Then they can be awaited using a function that returns void — you cannot await void. So the functions where there will be a bit longer to respond, you should apply await there. My own opinion is that a function that might take more than 1 second should be awaited — because that is a 1 second on your device, maybe your client has to wait for 5 seconds, or worse 10 seconds. They are just going to close the application and walk away saying, It doesn't work!.
Ok, if one of these is very very fast, it does not need to be async, right ?
Fast in what manner? Don't forget that your clients may not have very very fast machines and they are definitely going to suffer from the frozen applications. So them a favor and always await the functions.
What´s is the difference?
The difference is that in the last code sample, only the first call will be awaited and the others will execute synchronously. Visual Studio will also explain this part to you by providing green squiggly lines under the code that you can see by hovering over it. That is why, you should await every async call where possible.
Tip: If the process in the function is required, such as loading all of the data before starting the application then you should avoid async/await pattern and instead use the synchronous approach to download the data and perform other tasks. So it entirely depends on what you are doing instead of what the document says. :-)
Every async method call needs to be awaited, so that every method call releases the thread and thus not causing a block. Therefore in your case this is how you should do it:
var professional = await professionalServices.FindAsync(professionalId);
var appointments = await scheduleService.SearchForAppointmentsasync(date, professional);
appointments = await scheduleService.SomeCalculationAsync(appointments);

await Task.Run vs await

I've searched the web and seen a lot of questions regarding Task.Run vs await async, but there is this specific usage scenario where I don't not really understand the difference. Scenario is quite simple i believe.
await Task.Run(() => LongProcess());
vs
await LongProcess());
where LongProcess is a async method with a few asynchronous calls in it like calling db with await ExecuteReaderAsync() for instance.
Question:
Is there any difference between the two in this scenario? Any help or input appreciated, thanks!
Task.Run may post the operation to be processed at a different thread. That's the only difference.
This may be of use - for example, if LongProcess isn't truly asynchronous, it will make the caller return faster. But for a truly asynchronous method, there's no point in using Task.Run, and it may result in unnecessary waste.
Be careful, though, because the behaviour of Task.Run will change based on overload resolution. In your example, the Func<Task> overload will be chosen, which will (correctly) wait for LongProcess to finish. However, if a non-task-returning delegate was used, Task.Run will only wait for execution up to the first await (note that this is how TaskFactory.StartNew will always behave, so don't use that).
Quite often people think that async-await is done by several threads. In fact it is all done by one thread.
See the addition below about this one thread statement
The thing that helped me a lot to understand async-await is this interview with Eric Lippert about async-await. Somewhere in the middle he compares async await with a cook who has to wait for some water to boil. Instead of doing nothing, he looks around to see if there is still something else to do like slicing the onions. If that is finished, and the water still doesn't boil he checks if there is something else to do, and so forth until he has nothing to do but wait. In that case he returns to the first thing he waited for.
If your procedure calls an awaitable function, we are certain that somewhere in this awaitable function there is a call to an awaitable function, otherwise the function wouldn't be awaitable. In fact, your compiler will warn you if you forget to await somewhere in your awaitable function.
If your awaitable function calls the other awaitable function, then the thread enters this other function and starts doing the things in this function and goes deeper into other functions until he meets an await.
Instead of waiting for the results, the thread goes up in his call stack to see if there are other pieces of code he can process until he sees an await. Go up again in the call stack, process until await, etc. Once everyone is awaiting the thread looks for the bottom await and continues once that is finished.
This has the advantage, that if the caller of your awaitable function does not need the result of your function, but can do other things before the result is needed, these other things can be done by the thread instead of waiting inside your function.
A call without waiting immediately for the result would look like this:
private async Task MyFunction()
{
Task<ReturnType>taskA = SomeFunctionAsync(...)
// I don't need the result yet, I can do something else
DoSomethingElse();
// now I need the result of SomeFunctionAsync, await for it:
ReturnType result = await TaskA;
// now you can use object result
}
Note that in this scenario everything is done by one thread. As long as your thread has something to do he will be busy.
Addition. It is not true that only one thread is involved. Any thread who has nothing to do might continue processing your code after an await. If you check the thread id, you can see that this id can be changed after the await. The continuing thread has the same context as the original thread, so you can act as if it was the original thread. No need to check for InvokeRequired, no need to use mutexes or critical sections. For your code this is as if there is one thread involved.
The link to the article in the end of this answer explains a bit more about thread context
You'll see awaitable functions mainly where some other process has to do things, while your thread just has to wait idly until the other thing is finished. Examples are sending data over the internet, saving a file, communicating with a database etc.
However, sometimes some heavy calculations has to be done, and you want your thread to be free to do something else, like respond to user input. In that case you can start an awaitable action as if you called an async function.
Task<ResultType> LetSomeoneDoHeavyCalculations(...)
{
DoSomePreparations()
// start a different thread that does the heavy calculations:
var myTask = Task.Run( () => DoHeavyCalculations(...))
// now you are free to do other things
DoSomethingElse();
// once you need the result of the HeavyCalculations await for it
var myResult = await myTask;
// use myResult
...
}
Now a different thread is doing the heavy calculations while your thread is free to do other things. Once it starts awaiting your caller can do things until he starts awaiting. Effectively your thread will be fairly free to react on user input. However, this will only be the case if everyone is awaiting. While your thread is busy doing things your thread can't react on user input. Therefore always make sure that if you think your UI thread has to do some busy processing that takes some time use Task.Run and let another thread do it
Another article that helped me: Async-Await by the brilliant explainer Stephen Cleary
This answer deals with the specific case of awaiting an async method in the event handler of a GUI application. In this case the first approach has a significant advantage over the second. Before explaining why, lets rewrite the two approaches in a way that reflects clearly the context of this answer. What follows is only relevant for event handlers of GUI applications.
private async void Button1_Click(object sender, EventArgs args)
{
await Task.Run(async () => await LongProcessAsync());
}
vs
private async void Button1_Click(object sender, EventArgs args)
{
await LongProcessAsync();
}
I added the suffix Async in the method's name, to comply with the guidlines. I also made async the anonymous delegate, just for readability reasons. The overhead of creating a state machine is minuscule, and is dwarfed by the value of communicating clearly that this Task.Run returns a promise-style Task, not an old-school delegate Task intended for background processing of CPU-bound workloads.
The advantage of the first approach is that guarantees that the UI will remain responsive. The second approach offers no such guarantee. As long as you are using the build-in async APIs of the .NET platform, the probability of the UI being blocked by the second approach is pretty small. After all, these APIs are implemented by experts¹. By the moment you start awaiting your own async methods, all guarantees are off. Unless of course your first name is Stephen, and your surname is Toub or Cleary. If that's not the case, it is quite possible that sooner or later you'll write code like this:
public static async Task LongProcessAsync()
{
TeenyWeenyInitialization(); // Synchronous
await SomeBuildInAsyncMethod().ConfigureAwait(false); // Asynchronous
CalculateAndSave(); // Synchronous
}
The problem obviously is with the method TeenyWeenyInitialization(). This method is synchronous, and comes before the first await inside the body of the async method, so it won't be awaited. It will run synchronously every time you call the LongProcessAsync(). So if you follow the second approach (without Task.Run), the TeenyWeenyInitialization() will run on the UI thread.
How bad this can be? The initialization is teeny-weeny after all! Just a quick trip to the database to get a value, read the first line of a small text file, get a value from the registry. It's all over in a couple of milliseconds. At the time you wrote the program. In your PC. Before moving the data folder in a shared drive. Before the amount of data in the database became huge.
But you may get lucky and the TeenyWeenyInitialization() remains fast forever, what about the second synchronous method, the CalculateAndSave()? This one comes after an await that is configured to not capture the context, so it runs on a thread-pool thread. It should never run on the UI thread, right? Wrong. It depends to the Task returned by SomeBuildInAsyncMethod(). If the Task is completed, a thread switch will not occur, and the CalculateAndSave() will run on the same thread that called the method. If you follow the second approach, this will be the UI thread. You may never experience a case where the SomeBuildInAsyncMethod() returned a completed Task in your development environment, but the production environment may be different in ways difficult to predict.
Having an application that performs badly is unpleasant. Having an application that performs badly and freezes the UI is even worse. Do you really want to risk it? If you don't, please use always Task.Run(async inside your event handlers. Especially when awaiting methods you have coded yourself!
¹ Disclaimer, some built-in async APIs are not properly implemented.
Important: The Task.Run runs the supplied asynchronous delegate on a ThreadPool thread, so it's required that the LongProcessAsync has no affinity to the UI thread. If it involves interaction with UI controls, then the Task.Runis not an option. Thanks to #Zmaster for pointing out this important subtlety in the comments.

Task.Run vs. direct async call for starting long-running async methods

Several times, I have found myself writing long-running async methods for things like polling loops. These methods might look something like this:
private async Task PollLoop()
{
while (this.KeepPolling)
{
var response = await someHttpClient.GetAsync(...).ConfigureAwait(false);
var content = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
// do something with content
await Task.Delay(timeBetweenPolls).ConfigureAwait(false);
}
}
The goal of using async for this purpose is that we don't need a dedicated polling thread and yet the logic is (to me) easier to understand than using something like a timer directly (also, no need to worry about reentrance).
My question is, what is the preferred method for launching such a loop from a synchronous context? I can think of at least 2 approaches:
var pollingTask = Task.Run(async () => await this.PollLoop());
// or
var pollingTask = this.PollLoop();
In either case, I can respond to exceptions using ContinueWith(). My main understanding of the difference between these two methods is that the first will initially start looping on a thread-pool thread, whereas the second will run on the current thread until the first await. Is this true? Are there other things to consider or better approaches to try?
My main understanding of the difference between these two methods is
that the first will initially start looping on a thread-pool thread,
whereas the second will run on the current thread until the first
await. Is this true?
Yes. An async method returns its task to its caller on the first await of an awaitable that is not already completed.
By convention most async methods return very quickly. Yours does as well because await someHttpClient.GetAsync will be reached very quickly.
There is no point in moving the beginning of this async method onto the thread-pool. It adds overhead and saves almost no latency. It certainly does not help throughput or scaling behavior.
Using an async lambda here (Task.Run(async () => await this.PollLoop())) is especially useless. It just wraps the task returned by PollLoop with another layer of tasks. it would be better to say Task.Run(() => this.PollLoop()).
My main understanding of the difference between these two methods is that the first will initially start looping on a thread-pool thread, whereas the second will run on the current thread until the first await. Is this true?
Yes, that's true.
In your scenario, there seem to be no need for using Task.Run though, there's practically no code between the method call and the first await, and so PollLoop() will return almost immediately. Needlessly wrapping a task in another task only makes the code less readable and adds overhead. I would rather use the second approach.
Regarding other considerations (e.g. exception handling), I think the two approaches are equivalent.
The goal of using async for this purpose is that we don't need a dedicated polling thread and yet the logic is (to me) easier to understand than using something like a timer directly
As a side-note, this is more or less what a timer would do anyway. In fact Task.Delay is implemented using a timer!

Easiest way to make controller asynchronous

I inherited a large web application that uses MVC5 and C#. Some of our controllers make several slow database calls and I want to make them asynchronous in an effort to allow the worker threads to service other requests while waiting for the database calls to complete. I want to do this with the least amount of refactoring. Say I have the following controller
public string JsonData()
{
var a = this.servicelayer.getA();
var b = this.servicelayer.getB();
return SerializeObject(new {a, b});
}
I have made the two expensive calls a, b asynchronous by leaving the service layer unchanged and rewriting the controller as
public async Task<string> JsonData()
{
var task1 = Task<something>.Run(() => this.servicelayer.getA());
var task2 = Task<somethingelse>.Run(() => this.servicelayer.getB());
await Task.WhenAll(task1, task2);
var a = await task1;
var b = await task2;
return SerializeObject(new {a, b});
}
The above code runs without any issues but I can't tell using Visual Studio if the worker threads are now available to service other requests or if using Task.Run() in a asp.net controller doesn't do what I think it does. Can anyone comment on the correctness of my code and if it can be improved in any way? Also, I read that using async in a controller has additional overhead and should be used only for long running code. What is the minimum criteria that I can use to decide if the controller needs async? I understand that every use case is different but wondering if there is a baseline that I can use as a starting point. 2 database calls? anything over 2 seconds to return?
The guideline is that you should use async whenever you have I/O. I.e., a database. The overhead is miniscule compared to any kind of I/O.
That said, blocking a thread pool thread via Task.Run is what I call "fake asynchrony". It's exactly what you don't want to do on ASP.NET.
Instead, start at your "lowest-level" code and make that truly asynchronous. E.g., EF6 supports asynchronous database queries. Then let the async code grow naturally from there towards your controller.
The only improvement the new code has is it runs both A and B concurrently and not one at a time. There's actually no real asynchrony in this code.
When you use Task.Run you are offloading work to be done on another thread, so basically you start 2 threads and release the current thread while awaiting both tasks (each of them running completely synchronously)
That means that the operation will finish faster (because of the parallelism) but will be using twice the threads and so will be less scalable.
What you do want to do is make sure all your operations are truly asynchronous. That will mean having a servicelayer.getAAsync() and servicelayer.getBAsync() so you could truly release the threads while IO is being processed:
public async Task<string> JsonData()
{
return SerializeObject(new {await servicelayer.getAAsync(), await servicelayer.getBAsync()});
}
If you can't make sure your actual IO operations are truly async, it would be better to keep the old code.
More on why to avoid Task.Run: Task.Run Etiquette Examples: Don't Use Task.Run in the Implementation

How does C# 5.0's async-await feature differ from the TPL?

I don't see the different between C#'s (and VB's) new async features, and .NET 4.0's Task Parallel Library. Take, for example, Eric Lippert's code from here:
async void ArchiveDocuments(List<Url> urls) {
Task archive = null;
for(int i = 0; i < urls.Count; ++i) {
var document = await FetchAsync(urls[i]);
if (archive != null)
await archive;
archive = ArchiveAsync(document);
}
}
It seems that the await keyword is serving two different purposes. The first occurrence (FetchAsync) seems to mean, "If this value is used later in the method and its task isn't finished, wait until it completes before continuing." The second instance (archive) seems to mean, "If this task is not yet finished, wait right now until it completes." If I'm wrong, please correct me.
Couldn't it just as easily be written like this?
void ArchiveDocuments(List<Url> urls) {
for(int i = 0; i < urls.Count; ++i) {
var document = FetchAsync(urls[i]); // removed await
if (archive != null)
archive.Wait(); // changed to .Wait()
archive = ArchiveAsync(document.Result); // added .Result
}
}
I've replaced the first await with a Task.Result where the value is actually needed, and the second await with Task.Wait(), where the wait is actually occurring. The functionality is (1) already implemented, and (2) much closer semantically to what is actually happening in the code.
I do realize that an async method is rewritten as a state machine, similar to iterators, but I also don't see what benefits that brings. Any code that requires another thread to operate (such as downloading) will still require another thread, and any code that doesn't (such as reading from a file) could still utilize the TPL to work with only a single thread.
I'm obviously missing something huge here; can anybody help me understand this a little better?
I think the misunderstanding arises here:
It seems that the await keyword is serving two different purposes. The first occurrence (FetchAsync) seems to mean, "If this value is used later in the method and its task isn't finished, wait until it completes before continuing." The second instance (archive) seems to mean, "If this task is not yet finished, wait right now until it completes." If I'm wrong, please correct me.
This is actually completely incorrect. Both of these have the same meaning.
In your first case:
var document = await FetchAsync(urls[i]);
What happens here, is that the runtime says "Start calling FetchAsync, then return the current execution point to the thread calling this method." There is no "waiting" here - instead, execution returns to the calling synchronization context, and things keep churning. At some point in the future, FetchAsync's Task will complete, and at that point, this code will resume on the calling thread's synchronization context, and the next statement (assigning the document variable) will occur.
Execution will then continue until the second await call - at which time, the same thing will happen - if the Task<T> (archive) isn't complete, execution will be released to the calling context - otherwise, the archive will be set.
In the second case, things are very different - here, you're explicitly blocking, which means that the calling synchronization context will never get a chance to execute any code until your entire method completes. Granted, there is still asynchrony, but the asynchrony is completely contained within this block of code - no code outside of this pasted code will happen on this thread until all of your code completes.
Anders boiled it down to a very succinct answer in the Channel 9 Live interview he did. I highly recommend it
The new Async and await keywords allow you to orchestrate concurrency in your applications. They don't actually introduce any concurrency in to your application.
TPL and more specifically Task is one way you can use to actually perform operations concurrently. The new async and await keyword allow you to compose these concurrent operations in a "synchronous" or "linear" fashion.
So you can still write a linear flow of control in your programs while the actual computing may or may not happen concurrently. When computation does happen concurrently, await and async allow you to compose these operations.
There is a huge difference:
Wait() blocks, await does not block. If you run the async version of ArchiveDocuments() on your GUI thread, the GUI will stay responsive while the fetching and archiving operations are running.
If you use the TPL version with Wait(), your GUI will be blocked.
Note that async manages to do this without introducing any threads - at the point of the await, control is simply returned to the message loop. Once the task being waited for has completed, the remainder of the method (continuation) is enqueued on the message loop and the GUI thread will continue running ArchiveDocuments where it left off.
The ability to turn the program flow of control into a state machine is what makes these new keywords intresting. Think of it as yielding control, rather than values.
Check out this Channel 9 video of Anders talking about the new feature.
The problem here is that the signature of ArchiveDocuments is misleading. It has an explicit return of void but really the return is Task. To me void implies synchronous as there is no way to "wait" for it to finish. Consider the alternate signature of the function.
async Task ArchiveDocuments(List<Url> urls) {
...
}
To me when it's written this way the difference is much more obvious. The ArchiveDocuments function is not one that completes synchronously but will finish later.
The await keyword does not introduce concurrency. It is like the yield keyword, it tells the compiler to restructure your code into lambda controlled by a state machine.
To see what await code would look like without 'await' see this excellent link: http://blogs.msdn.com/b/windowsappdev/archive/2012/04/24/diving-deep-with-winrt-and-await.aspx
The call to FetchAsync() will still block until it completes (unless a statement within calls await?) The key is that control is returned to the caller (because the ArchiveDocuments method itself is declared as async). So the caller can happily continue processing UI logic, respond to events, etc.
When FetchAsync() completes, it interrupts the caller to finish the loop. It hits ArchiveAsync() and blocks, but ArchiveAsync() probably just creates a new task, starts it, and returns the task. This allows the second loop to begin, while the task is processing.
The second loop hits FetchAsync() and blocks, returning control to the caller. When FetchAsync() completes, it again interrupts the caller to continue processing. It then hits await archive, which returns control to the caller until the Task created in loop 1 completes. Once that task is complete, the caller is again interrupted, and the second loop calls ArchiveAsync(), which gets a started task and begins loop 3, repeat ad nauseum.
The key is returning control to the caller while the heavy lifters are executing.

Categories