I inherited a large web application that uses MVC5 and C#. Some of our controllers make several slow database calls and I want to make them asynchronous in an effort to allow the worker threads to service other requests while waiting for the database calls to complete. I want to do this with the least amount of refactoring. Say I have the following controller
public string JsonData()
{
var a = this.servicelayer.getA();
var b = this.servicelayer.getB();
return SerializeObject(new {a, b});
}
I have made the two expensive calls a, b asynchronous by leaving the service layer unchanged and rewriting the controller as
public async Task<string> JsonData()
{
var task1 = Task<something>.Run(() => this.servicelayer.getA());
var task2 = Task<somethingelse>.Run(() => this.servicelayer.getB());
await Task.WhenAll(task1, task2);
var a = await task1;
var b = await task2;
return SerializeObject(new {a, b});
}
The above code runs without any issues, but I can't tell using Visual Studio if the worker threads are now available to service other requests or if using Task.Run() in an ASP.NET controller doesn't do what I think it does. Can anyone comment on the correctness of my code and whether it can be improved in any way? Also, I read that using async in a controller has additional overhead and should be used only for long-running code. What are the minimum criteria I can use to decide if the controller needs async? I understand that every use case is different, but I'm wondering if there is a baseline that I can use as a starting point. Two database calls? Anything over 2 seconds to return?
The guideline is that you should use async whenever you have I/O, e.g., a database call. The overhead is minuscule compared to any kind of I/O.
That said, blocking a thread pool thread via Task.Run is what I call "fake asynchrony". It's exactly what you don't want to do on ASP.NET.
Instead, start at your "lowest-level" code and make that truly asynchronous. E.g., EF6 supports asynchronous database queries. Then let the async code grow naturally from there towards your controller.
The only improvement the new code has is it runs both A and B concurrently and not one at a time. There's actually no real asynchrony in this code.
When you use Task.Run you are offloading work to be done on another thread, so basically you start 2 threads and release the current thread while awaiting both tasks (each of which runs completely synchronously).
That means that the operation will finish faster (because of the parallelism) but will be using twice the threads and so will be less scalable.
What you do want to do is make sure all your operations are truly asynchronous. That will mean having a servicelayer.getAAsync() and servicelayer.getBAsync() so you could truly release the threads while IO is being processed:
public async Task<string> JsonData()
{
// An anonymous type member needs a name, so await into locals first.
var a = await servicelayer.getAAsync();
var b = await servicelayer.getBAsync();
return SerializeObject(new {a, b});
}
If you can't make sure your actual IO operations are truly async, it would be better to keep the old code.
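If the two queries are independent, the concurrency of the Task.Run version can be kept without tying up extra threads: start both truly asynchronous calls first, then await them together. The following is only a minimal sketch. GetAAsync and GetBAsync are hypothetical stand-ins for the service layer, Task.Delay simulates the database I/O, and the JSON is built by hand instead of calling SerializeObject. Whether the real data-access layer tolerates two operations in flight at once is also an assumption (a single EF6 DbContext instance, for example, does not support multiple concurrent operations).

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ConcurrentAwaitSketch
{
    // Hypothetical stand-ins for servicelayer.getAAsync()/getBAsync().
    // Task.Delay simulates I/O; no thread is blocked while it is pending.
    public static async Task<int> GetAAsync()
    {
        await Task.Delay(300);
        return 1;
    }

    public static async Task<string> GetBAsync()
    {
        await Task.Delay(300);
        return "b";
    }

    public static async Task<string> JsonDataAsync()
    {
        // Start both operations before awaiting either, so they overlap.
        Task<int> taskA = GetAAsync();
        Task<string> taskB = GetBAsync();
        await Task.WhenAll(taskA, taskB);

        // Both tasks are complete here, so these awaits do not suspend.
        int a = await taskA;
        string b = await taskB;
        return $"{{\"a\":{a},\"b\":\"{b}\"}}";
    }

    public static async Task Main()
    {
        var stopwatch = Stopwatch.StartNew();
        Console.WriteLine(await JsonDataAsync());
        // The elapsed time should be close to one delay, not two.
        Console.WriteLine($"Elapsed: {stopwatch.ElapsedMilliseconds} ms");
    }
}
```

Awaiting the two calls one after the other, as in the answer above, is still fully asynchronous; it simply takes the sum of both durations instead of the longer of the two.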
More on why to avoid Task.Run: Task.Run Etiquette Examples: Don't Use Task.Run in the Implementation
Related
I've been stuck on this question for a while and haven't really found any useful clarification as to why this is.
If I have an async method like:
public async Task<bool> MyMethod()
{
// Some logic
return true;
}
public async void MyMethod2()
{
var status = MyMethod(); // Visual studio green lines this and recommends using await
}
If I use await here, what's the point of the asynchronous method? Doesn't it make the async useless that VS is telling me to call await? Does that not defeat the purpose of offloading a task to a thread without waiting for it to finish?
Does that not defeat the purpose of offloading a task to a thread without waiting for it to finish?
Yes, of course. But that's not the purpose of await/async. The purpose is to allow you to write synchronous code that uses asynchronous operations without wasting threads, or more generally, to give the caller a measure of control over the more or less asynchronous operations.
The basic idea is that as long as you use await and async properly, the whole operation will appear to be synchronous. This is usually a good thing, because most of the things you do are synchronous - say, you don't want to create a user before you request the user name. So you'd do something like this:
var name = await GetNameAsync();
var user = await RemoteService.CreateUserAsync(name);
The two operations are synchronous with respect to each other; the second doesn't (and cannot!) happen before the first. But they aren't (necessarily) synchronous with respect to their caller. A typical example is a Windows Forms application. Imagine you have a button, and the click handler contains the code above - all the code runs on the UI thread, but at the same time, while you're awaiting, the UI thread is free to do other tasks (similar to using Application.DoEvents until the operation completes).
Synchronous code is easier to write and understand, so this allows you to get most of the benefits of asynchronous operations without making your code harder to understand. And you don't lose the ability to do things asynchronously, since Task itself is just a promise, and you don't always have to await it right away. Imagine that GetNameAsync takes a lot of time, but at the same time, you have some CPU work to do before it's done:
var nameTask = GetNameAsync();
for (int i = 0; i < 100; i++) Thread.Sleep(100); // Important busy-work!
var name = await nameTask;
var user = await RemoteService.CreateUserAsync(name);
And now your code is still beautifully synchronous - await is the synchronization point - while you can do other things in parallel with the asynchronous operations. Another typical example would be firing off multiple asynchronous requests in parallel but keeping the code synchronous with the completion of all of the requests:
var tasks = urls.Select(i => httpClient.GetAsync(i)).ToArray();
await Task.WhenAll(tasks);
The tasks are asynchronous with respect to each other, but not to their caller, which is still beautifully synchronous.
I've made a (incomplete) networking sample that uses await in just this way. The basic idea is that while most of the code is logically synchronous (there's a protocol to be followed - ask for login->verify login->read loop...; you can even see the part where multiple tasks are awaited in parallel), you only use a thread when you actually have CPU work to do. Await makes this almost trivial - doing the same thing with continuations or the old Begin/End async model would be much more painful, especially with respect to error handling. Await makes it look very clean.
If I use await here, what's the point of the asynchronous method?
await does not block the thread. MyMethod2 will run synchronously until it reaches the await expression. Then MyMethod2 will be suspended until the awaited task (MyMethod) is complete. While MyMethod is not completed, control returns to the caller of MyMethod2. That's the point of await - the caller can continue doing its job.
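The order of events can be made visible with a small console program. This is a sketch only: the method names carry the Async suffix, Task.Delay stands in for real asynchronous work, and MyMethod2Async returns Task rather than void so that Main can wait for it; none of that is from the question's code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class AwaitOrderSketch
{
    // Records the order of events; a concurrent queue is used because
    // the continuations may run on thread pool threads.
    public static readonly ConcurrentQueue<string> Log = new ConcurrentQueue<string>();

    public static async Task<bool> MyMethodAsync()
    {
        Log.Enqueue("MyMethodAsync: start");
        await Task.Delay(200); // simulated asynchronous work
        Log.Enqueue("MyMethodAsync: end");
        return true;
    }

    public static async Task MyMethod2Async()
    {
        Log.Enqueue("MyMethod2Async: before await");
        bool status = await MyMethodAsync();
        Log.Enqueue("MyMethod2Async: after await, status = " + status);
    }

    public static async Task Main()
    {
        // The call returns as soon as the first incomplete await is reached.
        Task pending = MyMethod2Async();
        Log.Enqueue("Main: control is back, IsCompleted = " + pending.IsCompleted);

        await pending;
        foreach (string line in Log)
        {
            Console.WriteLine(line);
        }
    }
}
```

The "Main: control is back" entry is logged before either method finishes, which is the behaviour described above: the caller keeps running while the awaited operation is still pending.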
Doesn't it make the async useless that VS is telling me to call await?
async is just a flag which means 'somewhere in the method you have one or more await'.
Does that not defeat the purpose of offloading a task to a thread
without waiting for it to finish?
As described above, you don't have to wait for the task to finish. Nothing is blocked here.
NOTE: To follow framework naming standards, I suggest adding the Async suffix to asynchronous method names.
An async method is not automatically executed on a different thread. Actually, the opposite is true: an async method always starts executing on the calling thread. async means that this is a method that can yield to an asynchronous operation. That means it can return control to the caller while waiting for the other execution to complete. So async methods are a way to wait for other asynchronous operations.
Since nothing is awaited in MyMethod2, async makes no sense there, so your compiler warns you.
Interestingly, the team that implemented async methods has acknowledged that marking a method async is not really necessary, since it would be enough to just use await in the method body for the compiler to recognize it as async. The requirement of using the async keyword has been added to avoid breaking changes to existing code that uses await as a variable name.
I am using asp.net web api 2 and Entity Framework 6.
Original pseudo code
public IHttpActionResult GetProductLabel(int productId)
{
var productDetails = repository.GetProductDetails(productId);
var label = labelCalculator.Render(productDetails);
return Ok(label);
}
Modified code
public async Task<IHttpActionResult> GetProductLabel(int productId)
{
var productDetails = await repository.GetProductDetailsAsync(productId); // 1 long second as this call goes into sub services
var label = labelCalculator.Render(productDetails); // 1.5 seconds synchronous code
return Ok(label);
}
Before my change everything ran synchronously.
After my change, the call to a remote service, which in turn calls a database, is done the async-await way.
Then I do a sync call to a rendering library which offers only sync methods. The calculation takes 1.5 seconds.
Is there still a benefit that I did the remote database_service call the async-await way but the 2nd call not? And is there anything I could still improve?
Note
The reason why I ask this is because:
"With async controllers when a process is waiting for I/O to complete, its thread is freed up for the server to use for processing other requests."
So when the first remote database_service call is processing and awaiting for that 1 second the thread is returned to IIS??!!
But what about the 2nd call, the label calculation, which will block the current thread again for 1.5 seconds?
So I release and block the thread, that does not make sense or what do you think?
The rendering library is not simply "blocking a thread", it is doing work performing your rendering. There is nothing better you can do.
Is there still a benefit that I did the remote database_service call
the async-await way but the 2nd call not?
Yes, this call is now non-blocking and can run alongside other code, even though it is only for 1 second.
And is there anything I could still improve?
If you can have the second call run asynchronously too, the whole method can run asynchronously and won't block at all.
Asynchronous code creates a continuation under the hood; it is a sort of syntactic sugar to make asynchronous programming feel more synchronous.
In general, depending on the operations themselves, it could help to make both long-running tasks asynchronous. It would use a different task under the hood for each long-running operation and run them asynchronously.
At the moment, these tasks run completely sequentially within GetProductLabel, meaning that if it is the only method you are calling, you would not be able to tell the difference from synchronous code.
If it is possible, I would make the second method async, since I am not aware of any significant drawbacks of using tasks and async/await.
In your case there is nothing better you can do, and it won't make much difference, since you have to run it synchronously, as you are using the result from the first method.
So one of my colleagues was saying to me -
"Why would somebody want to call async REST APIs with a thread? It's not beneficial, and in return it will create a separate thread for every request."
My scenario is-
public async Task<ApiResponse<ProductLookupDto>> GetProductsByValueOfTheDay()
{
string url = "http://domain.com?param=10120&isBasic=true";
var result = Task.Run(() => JsonConvert.DeserializeObject<ProductLookupDto>(SerializedResults.GET(url)));
return new ApiResponse<ProductLookupDto>()
{
Data = await result,
Message = "success"
};
}
So here I am using async with threading. I know it will create a separate thread for the request, which is my concern regarding performance.
I want to know what happens if several of my methods are called together while the response is awaited as I am doing now. Will the extra thread affect performance?
Keep in mind that the response from the REST call is very large.
Your colleague does not really know what he is talking about - a Task will NOT create a separate thread for every request. Rather, the task scheduler determines which thread is used, and the default scheduler uses the thread pool. This means that threads are pre-allocated and reused, especially if you schedule a LOT of tasks. Also, with truly asynchronous I/O, threads are released while the request runs, thanks to completion ports. Your colleague shows a serious lack of basics here.
I think your colleague is right to be concerned with that code, and you are right to say "it pretends to be async but not really". And Jon Skeet is right to point out that the real problem here lies with SerializedResults.GET. Whatever the implementation of that is exactly, it's clearly a blocking call, otherwise it would return a Task<string> (that can be awaited) instead of a string. Wrapping a synchronous call in Task.Run and calling it asynchronous is a common mistake. You haven't eliminated blocking at all, you've simply moved the blocking call to a different thread.
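The difference can be observed, roughly, by counting busy thread pool workers while each variant is in flight. This is a sketch under stated assumptions: BlockingGet is a hypothetical stand-in for SerializedResults.GET (simulated with Thread.Sleep), TrueAsyncGet is a hypothetical asynchronous equivalent (simulated with Task.Delay), and the worker count is only a snapshot that other pool activity in the process can influence.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class FakeAsyncSketch
{
    // Hypothetical blocking call; it holds its thread for the whole duration.
    public static string BlockingGet()
    {
        Thread.Sleep(500);
        return "payload";
    }

    // Hypothetical truly asynchronous call; no thread is held while waiting.
    public static async Task<string> TrueAsyncGet()
    {
        await Task.Delay(500);
        return "payload";
    }

    // Snapshot of thread pool worker threads currently in use.
    public static int BusyWorkers()
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);
        ThreadPool.GetAvailableThreads(out int freeWorkers, out _);
        return maxWorkers - freeWorkers;
    }

    public static void Main()
    {
        // "Fake" asynchrony: the blocking call is moved to a pool thread.
        Task<string> fake = Task.Run(() => BlockingGet());
        Thread.Sleep(100); // give the work item time to start
        Console.WriteLine("Task.Run wrapper, busy pool threads: " + BusyWorkers());
        fake.Wait();

        // True asynchrony: nothing runs while the simulated I/O is pending.
        Task<string> real = TrueAsyncGet();
        Thread.Sleep(100);
        Console.WriteLine("True async, busy pool threads: " + BusyWorkers());
        real.Wait();
    }
}
```

In an otherwise idle process, the first count should be higher than the second, because the wrapped call occupies a pool thread for its full duration while the asynchronous version only needs a thread briefly to run its continuation.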
So to make this asynchronous, SerializedResults.GET needs to go. It should be replaced by a call that uses a truly asynchronous library like HttpClient. Since this is a JSON API (and I'm a little bit biased), I'll show you an example using my Flurl library, which is basically a wrapper around HttpClient and Json.NET that provides some convenience methods to cut down on the noise a bit:
public async Task<ApiResponse<ProductLookupDto>> GetProductsByValueOfTheDay()
{
string url = "http://domain.com?param=10120&isBasic=true";
var result = await url.GetJsonAsync<ProductLookupDto>();
return new ApiResponse<ProductLookupDto>()
{
Data = result,
Message = "success"
};
}
Several times, I have found myself writing long-running async methods for things like polling loops. These methods might look something like this:
private async Task PollLoop()
{
while (this.KeepPolling)
{
var response = await someHttpClient.GetAsync(...).ConfigureAwait(false);
var content = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
// do something with content
await Task.Delay(timeBetweenPolls).ConfigureAwait(false);
}
}
The goal of using async for this purpose is that we don't need a dedicated polling thread and yet the logic is (to me) easier to understand than using something like a timer directly (also, no need to worry about reentrance).
My question is, what is the preferred method for launching such a loop from a synchronous context? I can think of at least 2 approaches:
var pollingTask = Task.Run(async () => await this.PollLoop());
// or
var pollingTask = this.PollLoop();
In either case, I can respond to exceptions using ContinueWith(). My main understanding of the difference between these two methods is that the first will initially start looping on a thread-pool thread, whereas the second will run on the current thread until the first await. Is this true? Are there other things to consider or better approaches to try?
My main understanding of the difference between these two methods is
that the first will initially start looping on a thread-pool thread,
whereas the second will run on the current thread until the first
await. Is this true?
Yes. An async method returns its task to its caller on the first await of an awaitable that is not already completed.
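The "not already completed" part is easy to check in isolation. In this minimal sketch (the method names are made up for illustration), the first method awaits a task that is already complete and therefore runs to the end before returning, while the second returns a pending task at its first await:

```csharp
using System;
using System.Threading.Tasks;

public static class FirstAwaitSketch
{
    // Awaits an already completed task: no suspension happens,
    // so the method finishes synchronously.
    public static async Task AwaitCompletedAsync()
    {
        await Task.CompletedTask;
    }

    // Awaits a pending task: the method returns to its caller here
    // and resumes later when the delay elapses.
    public static async Task AwaitPendingAsync()
    {
        await Task.Delay(200);
    }

    public static async Task Main()
    {
        Task completed = AwaitCompletedAsync();
        Console.WriteLine("Completed awaitable, IsCompleted: " + completed.IsCompleted);

        Task pending = AwaitPendingAsync();
        Console.WriteLine("Pending awaitable, IsCompleted: " + pending.IsCompleted);

        await pending;
    }
}
```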
By convention most async methods return very quickly. Yours does as well because await someHttpClient.GetAsync will be reached very quickly.
There is no point in moving the beginning of this async method onto the thread-pool. It adds overhead and saves almost no latency. It certainly does not help throughput or scaling behavior.
Using an async lambda here (Task.Run(async () => await this.PollLoop())) is especially useless. It just wraps the task returned by PollLoop with another layer of tasks. It would be better to say Task.Run(() => this.PollLoop()).
My main understanding of the difference between these two methods is that the first will initially start looping on a thread-pool thread, whereas the second will run on the current thread until the first await. Is this true?
Yes, that's true.
In your scenario, there seems to be no need for Task.Run, though. There's practically no code between the method call and the first await, so PollLoop() will return almost immediately. Needlessly wrapping a task in another task only makes the code less readable and adds overhead. I would rather use the second approach.
Regarding other considerations (e.g. exception handling), I think the two approaches are equivalent.
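Launching the loop directly and observing failures with ContinueWith, as mentioned in the question, could look roughly like this. It is a sketch, not the asker's code: PollLoopAsync is a hypothetical loop in which Task.Delay replaces the HTTP call, a CancellationToken replaces the KeepPolling flag, and an exception is thrown on the third iteration purely to exercise the error path.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class PollLauncherSketch
{
    // Hypothetical polling loop that fails on its third iteration.
    public static async Task PollLoopAsync(CancellationToken token)
    {
        int iteration = 0;
        while (!token.IsCancellationRequested)
        {
            await Task.Delay(50, token).ConfigureAwait(false);
            iteration++;
            if (iteration == 3)
            {
                throw new InvalidOperationException("poll failed");
            }
        }
    }

    public static void Main()
    {
        using (var cts = new CancellationTokenSource())
        {
            // Call the method directly; it returns at its first await.
            Task pollingTask = PollLoopAsync(cts.Token);

            // Observe failures so that they are not silently dropped.
            Task observer = pollingTask.ContinueWith(
                t => Console.WriteLine(
                    "Poll loop faulted: " + t.Exception.InnerException.Message),
                TaskContinuationOptions.OnlyOnFaulted);

            // Demo only: keep the process alive until the fault is reported.
            observer.Wait();
        }
    }
}
```

The continuation only runs if the loop faults; in a real application the synchronous caller would keep the task around and cancel the token on shutdown instead of blocking on it.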
The goal of using async for this purpose is that we don't need a dedicated polling thread and yet the logic is (to me) easier to understand than using something like a timer directly
As a side-note, this is more or less what a timer would do anyway. In fact Task.Delay is implemented using a timer!
Trying to understand server-side async/await use. My understanding is that it is only useful when the thread can be freed.
If I have:
[HttpGet]
public async Task<IEnumerable<MyObject>> Get()
{
return await Task<IEnumerable<MyObject>>.Run(() => DoThingsAndGetResults());
}
IEnumerable<MyObject> DoThingsAndGetResults()
{
// ...Do some CPU computations here...
IEnumerable<MyObject> result = myDb.table.Select().ToList(); // entity framework
// ...Do some CPU computations here...
return result;
}
Will the thread ever be freed? Is async/await useless in this case? Is the only way to gain benefit from async/await to make some bigger changes and do this:
[HttpGet]
public async Task<IEnumerable<MyObject>> Get()
{
return await DoThingsAndGetResultsAsync();
}
async Task<IEnumerable<MyObject>> DoThingsAndGetResultsAsync() // an async method must return a Task type
{
// ...Do some CPU computations here...
IEnumerable<MyObject> result = await myDb.table.Select().ToListAsync(); // entity framework
// ...Do some CPU computations here...
return result;
}
Async methods are mostly useful when you have to wait for IO operation (reading from a file, querying the database, receiving response from a web server).
It is ok to use async when, for example, your database provider supports async querying like Entity Framework 6. Though it is not very useful for CPU bound operations (calculations and so on).
See related answers at ASP.NET MVC4 Async controller - Why to use?
If what you want to achieve is to free threads, the second example is better. The first one holds a thread throughout the operation.
Explanation
All that async-await does is help you write code that runs asynchronously but "looks" synchronous.
There are 2 reasons for asynchronous code:
Offloading. Used mostly in GUI threads, or other "more important" threads. (Releasing threads while waiting for CPU operations to complete).
Scalability. Used mainly in the server-side to reduce resource usage. (Releasing threads while waiting for IO to complete).
Unsurprisingly the reasons match your examples.
In the first example you run DoThingsAndGetResults on a different thread (using Task.Run) and wait for it to complete asynchronously. The first thread is released, but the second one isn't.
In the second one you only use 1 thread at a time, and you release it while waiting for IO (Entity Framework).
One of the benefits of using an async operation for long-running IO operations on the server side, such as database calls, is that you free up the thread the request was processed on, so that the web server has more threads available to process new requests. Mind that this is true when using the default task scheduler, which schedules tasks on an available thread from the thread pool. When using another task scheduler, the effect could be different...