Double await operations during POST - C#

I'm using the C# HttpClient to POST data and, hypothetically, I'm also interested in the returned content. I'm optimizing my app and trying to understand the performance impact of two await calls in the same method. The question came up from the following code snippet:
public static async Task<string> AsyncRequest(string URL, string data = null)
{
    using (var client = new HttpClient())
    {
        var post = await client.PostAsync(URL, new StringContent(data, Encoding.UTF8, "application/json")).ConfigureAwait(false);
        post.EnsureSuccessStatusCode();
        var response = await post.Content.ReadAsStringAsync();
        return response;
    }
}
Assume I have error handling in there :)
I know await calls are expensive so the double await caught my attention.
After the first await completes, the POST response is already in memory. Would it be more efficient to read the result directly, like var response = post.Content.ReadAsStringAsync().Result;?
What are the performance considerations when making two await/async calls in the same method?
Will the above code result in a thread per await (2 threads), or 1 thread for the returned Task that will handle both await calls?

I know await calls are expensive so the double await caught my attention.
Why would you say they're expensive? The compiler-generated state machine is a highly optimized beast which makes sure it doesn't bloat memory. Down to the specifics: for example, the TaskAwaiter returned from a Task is a struct, not a class, so it isn't allocated on the heap. And as @usr rightly points out, sending a request over the wire makes any state-machine allocation cost negligible.
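For illustration, here's a rough, hand-written approximation of the awaiter pattern the compiler targets (the real generated code goes through AsyncTaskMethodBuilder and a state-machine struct, so treat this as a sketch only; AwaitSketch and restOfMethod are made-up names):
static void AwaitSketch(Task<string> task, Action restOfMethod)
{
    var awaiter = task.GetAwaiter();         // TaskAwaiter<string> is a struct - no heap allocation for the awaiter itself
    if (awaiter.IsCompleted)
    {
        var result = awaiter.GetResult();    // fast path: the task already finished, continue synchronously
        // ... use result ...
    }
    else
    {
        awaiter.OnCompleted(restOfMethod);   // slow path: register the rest of the method as a continuation
    }
}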
would it be more efficient to return the result directly, like var response = post.Content.ReadAsStringAsync().Result;
Marking your method async is enough for the compiler to generate a state machine. The local variables are already lifted onto that state machine. Once your first await is hit, the rest of your code becomes a continuation anyway. Using post.Content.ReadAsStringAsync().Result; is far more likely to deadlock your code than to save you any memory or make it a few microseconds faster.
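To illustrate that deadlock risk, a contrived sketch: it only deadlocks on a thread with a synchronization context, such as a WinForms/WPF UI thread or a classic ASP.NET request (the Button_Click handler here is hypothetical):
async Task<string> GetPageAsync()
{
    using (var client = new HttpClient())
    {
        // No ConfigureAwait(false): the continuation wants to resume on the captured context...
        return await client.GetStringAsync("http://example.com");
    }
}

void Button_Click(object sender, EventArgs e)
{
    // ...but .Result blocks that very context, so neither side can make progress.
    var page = GetPageAsync().Result;
}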
What are the performance considerations when making two await/async calls in the same method?
What you should be asking yourself from a performance perspective is this - Is concurrency going to be an issue within my application which makes it worth using asynchronous operations?
async shines in places where a large number of consumers will be hitting you and you want to make sure you have enough resources available to process those requests. I see people ask "why isn't this async code making my code go faster?" all the time. It won't make any noticeable difference unless you're under heavy traffic, heavy enough that, for example, you're stressing the IIS thread pool; that's where asynchrony actually pays off.
Will the above code result in a thread per await (2 threads), or 1 thread for the returned Task that will handle both await calls?
Depends on your environment. When your first await is hit, you explicitly tell it not to marshal the continuation back to the original synchronization context with ConfigureAwait(false). If you're running this from a UI thread, for example, then any code after PostAsync will run on a thread-pool worker thread. Again, this shouldn't concern you; these are micro-optimizations you won't see any benefit from.

Related

Is it really more efficient to WhenAll() on a number of tiny awaitable methods?

Given an external API method signature like the following:
Task<int> ISomething.GetValueAsync(int x)
I often see code such as the following:
public async Task Go(ISomething i)
{
    int a = await i.GetValueAsync(1);
    int b = await i.GetValueAsync(2);
    int c = await i.GetValueAsync(3);
    int d = await i.GetValueAsync(4);
    Console.WriteLine($"{a} - {b} - {c} - {d}");
}
In code reviews it is sometimes suggested this is inefficient and should be rewritten:
public async Task Go(ISomething i)
{
    Task<int> ta = i.GetValueAsync(1);
    Task<int> tb = i.GetValueAsync(2);
    Task<int> tc = i.GetValueAsync(3);
    Task<int> td = i.GetValueAsync(4);
    await Task.WhenAll(ta, tb, tc, td);
    int a = ta.Result, b = tb.Result, c = tc.Result, d = td.Result;
    Console.WriteLine($"{a} - {b} - {c} - {d}");
}
I can see the logic behind allowing parallelisation but in reality is this worthwhile? It presumably adds some overhead to the scheduler and if the methods themselves are very lightweight, it seems likely that thread parallelisation would be more costly than the time saved. Further, on a busy server running many applications, it seems unlikely there would be lots of cores sitting idle.
I can't tell if this is always a good pattern to follow, or if it's an optimisation to make on a case-by-case basis. Do Microsoft (or anyone else) give good best-practice advice? Should we always write Task-based code in this way as a matter of course?
if the methods themselves are very lightweight, it seems likely that thread parallelisation would be more costly than the time saved.
This is definitely an issue with parallel code (the parallel form of concurrency), but this is asynchronous concurrency. Presumably, GetValueAsync is a true asynchronous method, which generally implies I/O operations. And I/O operations tend to dwarf many local considerations. (Side note: the WhenAll approach actually causes fewer thread switches and less scheduling overhead, but it does increase memory overhead slightly).
So, this is a win in the general case. However, that does assume that the statements above are correct (i.e., GetValueAsync performs I/O).
As a contrary point, however, I have seen this sometimes used as a band-aid for an inadequate data access layer. If you're hitting a SQL database with four queries, then the best solution is usually to combine that data access into a single query rather than do four calls to the same SQL database concurrently.
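For example, rather than four concurrent GetValueAsync-style calls, a single combined query can fetch everything in one round-trip. A hedged sketch, assuming an EF-style data layer (db, Values and Amount are made-up names):
var ids = new[] { 1, 2, 3, 4 };
var values = await db.Values
    .Where(v => ids.Contains(v.Id))
    .ToDictionaryAsync(v => v.Id, v => v.Amount);   // one round-trip instead of four
Console.WriteLine($"{values[1]} - {values[2]} - {values[3]} - {values[4]}");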
First, what does await actually do?
public async Task M()
{
    await Task.Yield();
}
If the awaitable object has already completed, then execution continues immediately. Otherwise a callback delegate is added to the awaitable object. This callback will be invoked immediately when the task result is made available.
So what about Task.WhenAll, how does that work? The current implementation adds a callback to every incomplete task. That callback will decrement a counter atomically. When the counter reaches zero, the result will be made available.
No new I/O is scheduled and no continuations are added to the thread pool; just a small counter decrement tacked onto the end of each task's processing. Your continuation will resume on whichever thread executed the last task to complete.
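A simplified sketch of that counting idea (not the real Task.WhenAll implementation, which also propagates results, exceptions and cancellation, and handles the empty-array case):
static Task WhenAllSketch(params Task[] tasks)
{
    var tcs = new TaskCompletionSource<bool>();
    int remaining = tasks.Length;
    foreach (var task in tasks)
    {
        // Each task gets a tiny continuation that atomically decrements the shared counter.
        task.ContinueWith(_ =>
        {
            if (Interlocked.Decrement(ref remaining) == 0)
                tcs.TrySetResult(true);   // the last task to finish completes the combined task
        }, TaskContinuationOptions.ExecuteSynchronously);
    }
    return tcs.Task;
}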
Unless you have measured an actual performance problem, I wouldn't worry about the overhead of Task.WhenAll.

Why use async when I have to use await?

I've been stuck on this question for a while and haven't really found any useful clarification as to why this is.
If I have an async method like:
public async Task<bool> MyMethod()
{
    // Some logic
    return true;
}

public async void MyMethod2()
{
    var status = MyMethod(); // Visual Studio green-lines this and recommends using await
}
If I use await here, what's the point of the asynchronous method? Doesn't it make the async useless that VS is telling me to call await? Does that not defeat the purpose of offloading a task to a thread without waiting for it to finish?
Does that not defeat the purpose of offloading a task to a thread without waiting for it to finish?
Yes, of course. But that's not the purpose of await/async. The purpose is to allow you to write synchronous code that uses asynchronous operations without wasting threads, or more generally, to give the caller a measure of control over the more or less asynchronous operations.
The basic idea is that as long as you use await and async properly, the whole operation will appear to be synchronous. This is usually a good thing, because most of the things you do are synchronous - say, you don't want to create a user before you request the user name. So you'd do something like this:
var name = await GetNameAsync();
var user = await RemoteService.CreateUserAsync(name);
The two operations are synchronous with respect to each other; the second doesn't (and cannot!) happen before the first. But they aren't (necessarily) synchronous with respect to their caller. A typical example is a Windows Forms application. Imagine you have a button, and the click handler contains the code above - all the code runs on the UI thread, but at the same time, while you're awaiting, the UI thread is free to do other tasks (similar to using Application.DoEvents until the operation completes).
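A minimal sketch of that click handler (the control names and the GetNameAsync/CreateUserAsync methods are the hypothetical ones from above):
// Runs entirely on the UI thread, yet the UI stays responsive during each await.
private async void CreateUserButton_Click(object sender, EventArgs e)
{
    var name = await GetNameAsync();
    var user = await RemoteService.CreateUserAsync(name);
    resultLabel.Text = $"Created user: {user}";   // safe: we're back on the UI thread here
}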
Synchronous code is easier to write and understand, so this allows you to get most of the benefits of asynchronous operations without making your code harder to understand. And you don't lose the ability to do things asynchronously, since Task itself is just a promise, and you don't always have to await it right away. Imagine that GetNameAsync takes a lot of time, but at the same time, you have some CPU work to do before it's done:
var nameTask = GetNameAsync();
for (int i = 0; i < 100; i++) Thread.Sleep(100); // Important busy-work!
var name = await nameTask;
var user = await RemoteService.CreateUserAsync(name);
And now your code is still beautifully synchronous - await is the synchronization point - while you can do other things in parallel with the asynchronous operations. Another typical example is firing off multiple asynchronous requests in parallel while keeping the code synchronous with the completion of all of the requests:
var tasks = urls.Select(i => httpClient.GetAsync(i)).ToArray();
await Task.WhenAll(tasks);
The tasks are asynchronous with respect to each other, but not to their caller, which is still beautifully synchronous.
I've made an (incomplete) networking sample that uses await in just this way. The basic idea is that while most of the code is logically synchronous (there's a protocol to be followed - ask for login -> verify login -> read loop...; you can even see the part where multiple tasks are awaited in parallel), you only use a thread when you actually have CPU work to do. await makes this almost trivial - doing the same thing with continuations or the old Begin/End async model would be much more painful, especially with respect to error handling. await makes it look very clean.
If I use await here, what's the point of the asynchronous method?
await does not block the thread. MyMethod2 will run synchronously until it reaches the await expression. Then MyMethod2 will be suspended until the awaited task (MyMethod) is complete. While MyMethod has not completed, control returns to the caller of MyMethod2. That's the point of await - the caller can carry on doing its job.
Doesn't it make the async useless that VS is telling me to call await?
async is just a flag which means 'somewhere in the method you have one or more await'.
Does that not defeat the purpose of offloading a task to a thread without waiting for it to finish?
As described above, you don't have to wait for the task to finish. Nothing is blocked here.
NOTE: To follow framework naming standards, I suggest adding the Async suffix to asynchronous method names.
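A small sketch of that control flow, with a delay standing in for real work so the ordering is visible (method names follow the Async-suffix convention just mentioned):
static async Task<bool> MyMethodAsync()
{
    Console.WriteLine("MyMethod: before await");
    await Task.Delay(1000);                   // MyMethod2 regains control here
    Console.WriteLine("MyMethod: after await");
    return true;
}

static async Task MyMethod2Async()
{
    var statusTask = MyMethodAsync();
    Console.WriteLine("MyMethod2: still running while MyMethod is suspended");
    var status = await statusTask;            // only now do we wait for MyMethod to finish
    Console.WriteLine($"MyMethod2: done, status = {status}");
}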
An async method is not automatically executed on a different thread. Actually, the opposite is true: an async method starts executing on the calling thread. async means that this is a method that can yield to an asynchronous operation - it can return control to the caller while waiting for that other operation to complete. So async methods are a way to wait for other asynchronous operations.
Since you are doing nothing to wait for in MyMethod2, async makes no sense here, so your compiler warns you.
Interestingly, the team that implemented async methods has acknowledged that marking a method async is not really necessary, since it would be enough to just use await in the method body for the compiler to recognize it as async. The requirement of using the async keyword has been added to avoid breaking changes to existing code that uses await as a variable name.

Will async REST API calls with threads affect performance?

So one of my colleagues was saying to me -
"Why would somebody want to call async RestAPIs with thread, Its not beneficial and in return it will create a separate thread for every request."
My scenario is-
public async Task<ApiResponse<ProductLookupDto>> GetProductsByValueOfTheDay()
{
    string url = "http://domain.com?param=10120&isBasic=true";
    var result = Task.Run(() => JsonConvert.DeserializeObject<ProductLookupDto>(SerializedResults.GET(url)));
    return new ApiResponse<ProductLookupDto>()
    {
        Data = await result,
        Message = "success"
    };
}
So here I am using async with threading. I know it will create a separate thread for the request, which is my concern about performance.
I want to know: if several of my methods are called together and the response is async as I am doing now, will the threads affect performance?
Keep in mind that the response from the REST call is very large.
Your colleague doesn't really know what he's talking about - a Task will NOT create a separate thread for every request. Rather, the task scheduler determines the threads, and the default scheduler uses the thread pool. That means threads are pre-allocated and reused, especially if you schedule a LOT of tasks. Also, threads are released while the request runs thanks to I/O completion ports. Your colleague shows a serious lack of the basics here.
I think your colleague is right to be concerned about that code, and you are right to say "it pretends to be async but not really". And Jon Skeet is right to point out that the real problem here lies with SerializedResults.GET. Whatever the implementation of that is exactly, it's clearly a blocking call, otherwise it would return a Task<string> (that can be awaited) instead of a string. Wrapping a synchronous call in Task.Run and calling it asynchronous is a common mistake. You haven't eliminated blocking at all, you've simply moved the blocking call to a different thread.
So to make this asynchronous, SerializedResults.GET needs to go. It should be replaced by a call that uses a truly asynchronous library like HttpClient. Since this is a JSON API (and I'm a little bit biased), I'll show you an example using my Flurl library, which is basically a wrapper around HttpClient and Json.NET that provides some convenience methods to cut down on the noise a bit:
public async Task<ApiResponse<ProductLookupDto>> GetProductsByValueOfTheDay()
{
    string url = "http://domain.com?param=10120&isBasic=true";
    var result = await url.GetJsonAsync<ProductLookupDto>();
    return new ApiResponse<ProductLookupDto>()
    {
        Data = result,
        Message = "success"
    };
}
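If you'd rather not add a dependency, here's a roughly equivalent sketch using HttpClient and Json.NET directly (a single shared HttpClient instance is assumed):
private static readonly HttpClient client = new HttpClient();

public async Task<ApiResponse<ProductLookupDto>> GetProductsByValueOfTheDay()
{
    string url = "http://domain.com?param=10120&isBasic=true";
    var json = await client.GetStringAsync(url);                      // truly asynchronous I/O, no Task.Run
    var result = JsonConvert.DeserializeObject<ProductLookupDto>(json);
    return new ApiResponse<ProductLookupDto>()
    {
        Data = result,
        Message = "success"
    };
}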

Task.Run vs. direct async call for starting long-running async methods

Several times, I have found myself writing long-running async methods for things like polling loops. These methods might look something like this:
private async Task PollLoop()
{
    while (this.KeepPolling)
    {
        var response = await someHttpClient.GetAsync(...).ConfigureAwait(false);
        var content = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
        // do something with content
        await Task.Delay(timeBetweenPolls).ConfigureAwait(false);
    }
}
The goal of using async for this purpose is that we don't need a dedicated polling thread and yet the logic is (to me) easier to understand than using something like a timer directly (also, no need to worry about reentrance).
My question is, what is the preferred method for launching such a loop from a synchronous context? I can think of at least 2 approaches:
var pollingTask = Task.Run(async () => await this.PollLoop());
// or
var pollingTask = this.PollLoop();
In either case, I can respond to exceptions using ContinueWith(). My main understanding of the difference between these two methods is that the first will initially start looping on a thread-pool thread, whereas the second will run on the current thread until the first await. Is this true? Are there other things to consider or better approaches to try?
My main understanding of the difference between these two methods is that the first will initially start looping on a thread-pool thread, whereas the second will run on the current thread until the first await. Is this true?
Yes. An async method returns its task to its caller on the first await of an awaitable that is not already completed.
By convention most async methods return very quickly. Yours does as well because await someHttpClient.GetAsync will be reached very quickly.
There is no point in moving the beginning of this async method onto the thread-pool. It adds overhead and saves almost no latency. It certainly does not help throughput or scaling behavior.
Using an async lambda here (Task.Run(async () => await this.PollLoop())) is especially useless. It just wraps the task returned by PollLoop in another layer of tasks. It would be better to write Task.Run(() => this.PollLoop()).
My main understanding of the difference between these two methods is that the first will initially start looping on a thread-pool thread, whereas the second will run on the current thread until the first await. Is this true?
Yes, that's true.
In your scenario, though, there seems to be no need for Task.Run: there's practically no code between the method call and the first await, so PollLoop() will return almost immediately. Needlessly wrapping a task in another task only makes the code less readable and adds overhead. I would rather use the second approach.
Regarding other considerations (e.g. exception handling), I think the two approaches are equivalent.
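For the exception-handling part, one hedged sketch: instead of ContinueWith, you could wrap the loop in a small async method and await it there, so a faulted poll loop is actually observed somewhere (RunPollLoopAsync is a made-up name):
private async Task RunPollLoopAsync()
{
    try
    {
        await this.PollLoop().ConfigureAwait(false);
    }
    catch (Exception ex)
    {
        // Log or surface the failure; an unobserved faulted polling task is easy to miss.
        Console.Error.WriteLine(ex);
    }
}

// var pollingTask = this.RunPollLoopAsync();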
The goal of using async for this purpose is that we don't need a dedicated polling thread and yet the logic is (to me) easier to understand than using something like a timer directly
As a side-note, this is more or less what a timer would do anyway. In fact Task.Delay is implemented using a timer!

Easiest way to make controller asynchronous

I inherited a large web application that uses MVC5 and C#. Some of our controllers make several slow database calls and I want to make them asynchronous in an effort to allow the worker threads to service other requests while waiting for the database calls to complete. I want to do this with the least amount of refactoring. Say I have the following controller
public string JsonData()
{
    var a = this.servicelayer.getA();
    var b = this.servicelayer.getB();
    return SerializeObject(new {a, b});
}
I have made the two expensive calls a, b asynchronous by leaving the service layer unchanged and rewriting the controller as
public async Task<string> JsonData()
{
    var task1 = Task<something>.Run(() => this.servicelayer.getA());
    var task2 = Task<somethingelse>.Run(() => this.servicelayer.getB());
    await Task.WhenAll(task1, task2);
    var a = await task1;
    var b = await task2;
    return SerializeObject(new {a, b});
}
The above code runs without any issues, but I can't tell from Visual Studio whether the worker threads are now available to service other requests, or whether using Task.Run() in an ASP.NET controller doesn't do what I think it does. Can anyone comment on the correctness of my code and whether it can be improved? Also, I read that using async in a controller has additional overhead and should be used only for long-running code. What is the minimum criterion I can use to decide whether a controller needs async? I understand that every use case is different, but I'm wondering if there is a baseline I can use as a starting point. Two database calls? Anything over 2 seconds to return?
The guideline is that you should use async whenever you have I/O. I.e., a database. The overhead is miniscule compared to any kind of I/O.
That said, blocking a thread pool thread via Task.Run is what I call "fake asynchrony". It's exactly what you don't want to do on ASP.NET.
Instead, start at your "lowest-level" code and make that truly asynchronous. E.g., EF6 supports asynchronous database queries. Then let the async code grow naturally from there towards your controller.
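A hedged sketch of what that can look like with EF6 (the context, entity and method names here are made up):
// Service layer: a truly asynchronous query - no Task.Run involved.
public async Task<List<Product>> GetAAsync()
{
    using (var db = new MyDbContext())
    {
        return await db.Products.Where(p => p.IsActive).ToListAsync();
    }
}

// Controller: the asynchrony then grows naturally up the call chain.
public async Task<string> JsonData()
{
    var a = await this.servicelayer.GetAAsync();
    var b = await this.servicelayer.GetBAsync();
    return SerializeObject(new { a, b });
}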
The only improvement the new code has is it runs both A and B concurrently and not one at a time. There's actually no real asynchrony in this code.
When you use Task.Run you are offloading work to another thread, so basically you start 2 threads and release the current thread while awaiting both tasks (each of them running completely synchronously).
That means the operation will finish faster (because of the parallelism), but it will use twice as many threads and so will be less scalable.
What you do want to do is make sure all your operations are truly asynchronous. That will mean having a servicelayer.getAAsync() and servicelayer.getBAsync() so you could truly release the threads while IO is being processed:
public async Task<string> JsonData()
{
    return SerializeObject(new { a = await servicelayer.getAAsync(), b = await servicelayer.getBAsync() });
}
If you can't make sure your actual IO operations are truly async, it would be better to keep the old code.
More on why to avoid Task.Run: Task.Run Etiquette Examples: Don't Use Task.Run in the Implementation
