Trying to understand server-side async/await use. My understanding is that it is only useful when the thread can be freed.
If I have:
[HttpGet]
public async Task<IEnumerable<MyObject>> Get()
{
return await Task<IEnumerable<MyObject>>.Run(() => DoThingsAndGetResults());
}
IEnumerable<MyObject> DoThingsAndGetResults()
{
// ...Do some CPU computations here...
IEnumerable<MyObject> result = myDb.table.Select().ToList(); // entity framework
// ...Do some CPU computations here...
return result;
}
Will the thread ever be freed? Is async/await useless in this case? Is the only way to benefit from async/await to make some bigger changes and do this:
[HttpGet]
public async Task<IEnumerable<MyObject>> Get()
{
return await DoThingsAndGetResultsAsync();
}
async Task<IEnumerable<MyObject>> DoThingsAndGetResultsAsync()
{
// ...Do some CPU computations here...
IEnumerable<MyObject> result = await myDb.table.Select().ToListAsync(); // entity framework
// ...Do some CPU computations here...
return result;
}
Async methods are mostly useful when you have to wait for IO operation (reading from a file, querying the database, receiving response from a web server).
It makes sense to use async when, for example, your database provider supports asynchronous queries, as Entity Framework 6 does. It is not very useful for CPU-bound operations (calculations and so on).
See related answers at ASP.NET MVC4 Async controller - Why to use?
If what you want to achieve is to free threads, the second example is better. The first one holds a thread throughout the operation.
Explanation
All that async-await does is help you write code that runs asynchronously but "looks" synchronous.
There are 2 reasons for asynchronous code:
Offloading. Used mostly in GUI threads, or other "more important" threads. (Releasing those threads while CPU-bound work completes on another thread.)
Scalability. Used mainly in the server-side to reduce resource usage. (Releasing threads while waiting for IO to complete).
Unsurprisingly the reasons match your examples.
In the first example you run DoThingsAndGetResults on a different thread (using Task.Run) and wait for it to complete asynchronously. The first thread is released, but the second one is not.
In the second one you only use 1 thread at a time, and you release it while waiting for IO (Entity Framework).
One of the benefits of using an async operation for long-running I/O on the server side, such as database calls, is that you free up the thread the request was being processed on, so the web server has more threads available to process new requests. Note that this is true when using the default task scheduler, which schedules tasks on an available thread from the thread pool. With another task scheduler the effect could be different.
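The difference between the two examples in the question can be made visible with a small console-style sketch. Everything in it is a stand-in, not the asker's code: Thread.Sleep plays the role of a blocking database call, Task.Delay plays the role of a truly asynchronous one, and the class and method names are hypothetical. The blocking variant counts how many thread-pool threads are tied up at the same time.

```csharp
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class OffloadVsAsyncDemo
{
    private static int _blockedNow;
    private static int _blockedPeak;

    // Stand-in for a synchronous database call: the calling thread sleeps,
    // so it cannot serve anything else until the "query" returns.
    private static int QueryBlocking()
    {
        int now = Interlocked.Increment(ref _blockedNow);
        int peak;
        // Record the highest number of simultaneously blocked threads.
        while (now > (peak = Volatile.Read(ref _blockedPeak)))
        {
            Interlocked.CompareExchange(ref _blockedPeak, now, peak);
        }
        Thread.Sleep(200);
        Interlocked.Decrement(ref _blockedNow);
        return 42;
    }

    // Stand-in for a truly asynchronous call: no thread is held during the delay.
    private static async Task<int> QueryAsync()
    {
        await Task.Delay(200);
        return 42;
    }

    // First pattern: Task.Run around blocking work.
    // Returns the peak number of pool threads that were blocked at once.
    public static async Task<int> RunOffloadedAsync(int requests)
    {
        _blockedNow = 0;
        _blockedPeak = 0;
        await Task.WhenAll(Enumerable.Range(0, requests)
            .Select(_ => Task.Run(() => QueryBlocking())));
        return _blockedPeak;
    }

    // Second pattern: async all the way down.
    // Returns the number of completed "requests".
    public static async Task<int> RunTrulyAsync(int requests)
    {
        int[] results = await Task.WhenAll(
            Enumerable.Range(0, requests).Select(_ => QueryAsync()));
        return results.Length;
    }
}
```

Calling RunOffloadedAsync(8) typically reports several blocked threads (the exact number depends on the machine and the thread pool), while RunTrulyAsync(8) completes the same number of simulated requests without holding any thread during the wait.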
Related
Given an external API method signature like the following:
Task<int> ISomething.GetValueAsync(int x)
I often see code such as the following:
public async Task Go(ISomething i)
{
int a = await i.GetValueAsync(1);
int b = await i.GetValueAsync(2);
int c = await i.GetValueAsync(3);
int d = await i.GetValueAsync(4);
Console.WriteLine($"{a} - {b} - {c} - {d}");
}
In code reviews it is sometimes suggested this is inefficient and should be rewritten:
public async Task Go(ISomething i)
{
Task<int> ta = i.GetValueAsync(1);
Task<int> tb = i.GetValueAsync(2);
Task<int> tc = i.GetValueAsync(3);
Task<int> td = i.GetValueAsync(4);
await Task.WhenAll(ta,tb,tc,td);
int a = ta.Result, b= tb.Result, c=tc.Result, d = td.Result;
Console.WriteLine($"{a} - {b} - {c} - {d}");
}
I can see the logic behind allowing parallelisation but in reality is this worthwhile? It presumably adds some overhead to the scheduler and if the methods themselves are very lightweight, it seems likely that thread parallelisation would be more costly than the time saved. Further, on a busy server running many applications, it seems unlikely there would be lots of cores sitting idle.
I can't tell if this is always a good pattern to follow, or if it's an optimisation to make on a case-by-case basis? Do Microsoft (or anyone else) give good best practice advice? Should we always write Task - based code in this way as a matter of course?
if the methods themselves are very lightweight, it seems likely that thread parallelisation would be more costly than the time saved.
This is definitely an issue with parallel code (the parallel form of concurrency), but this is asynchronous concurrency. Presumably, GetValueAsync is a true asynchronous method, which generally implies I/O operations. And I/O operations tend to dwarf many local considerations. (Side note: the WhenAll approach actually causes fewer thread switches and less scheduling overhead, but it does increase memory overhead slightly).
So, this is a win in the general case. However, that does assume that the statements above are correct (i.e., GetValueAsync performs I/O).
As a contrary point, however, I have seen this sometimes used as a band-aid for an inadequate data access layer. If you're hitting a SQL database with four queries, then the best solution is usually to combine that data access into a single query rather than do four calls to the same SQL database concurrently.
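To make the comparison concrete, here is a minimal sketch of both shapes. GetValueAsync below is a hypothetical stand-in for ISomething.GetValueAsync that simulates I/O latency with Task.Delay; the class and method names are assumptions made for illustration.

```csharp
using System.Linq;
using System.Threading.Tasks;

public static class SequentialVsConcurrent
{
    // Hypothetical stand-in for ISomething.GetValueAsync: simulated I/O latency.
    public static async Task<int> GetValueAsync(int x)
    {
        await Task.Delay(100);
        return x * 10;
    }

    // Each call starts only after the previous one has completed,
    // so the total wait is roughly four delays.
    public static async Task<int> SumSequentialAsync()
    {
        int a = await GetValueAsync(1);
        int b = await GetValueAsync(2);
        int c = await GetValueAsync(3);
        int d = await GetValueAsync(4);
        return a + b + c + d;
    }

    // All four operations are started before any of them is awaited,
    // so the total wait is roughly one delay.
    public static async Task<int> SumConcurrentAsync()
    {
        int[] values = await Task.WhenAll(
            GetValueAsync(1), GetValueAsync(2), GetValueAsync(3), GetValueAsync(4));
        return values.Sum();
    }
}
```

Because Task.WhenAll over Task<int> arguments yields an int[], the results can be read from the awaited array instead of from each task's Result property. Timing both methods with a Stopwatch should show the gap, although the exact numbers depend on the environment.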
First, what does await actually do?
public async Task M() {
await Task.Yield();
}
If the awaitable object has already completed, then execution continues immediately. Otherwise a callback delegate is added to the awaitable object. This callback will be invoked immediately when the task result is made available.
So what about Task.WhenAll, how does that work? The current implementation adds a callback to every incomplete task. That callback will decrement a counter atomically. When the counter reaches zero, the result will be made available.
No new I/O is scheduled, and no continuations are added to the thread pool. Just a small counter update is added to the end of each task's processing. Your continuation will resume on whichever thread executed the last task.
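The counter idea can be sketched in a few lines. This is a deliberately simplified illustration, not the framework's actual implementation: it ignores faulted and cancelled tasks, which the real Task.WhenAll propagates, and the name WhenAllSimplified is made up.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    // Simplified illustration only: completes when every input task has finished,
    // without propagating exceptions or cancellation.
    public static Task WhenAllSimplified(params Task[] tasks)
    {
        var tcs = new TaskCompletionSource<bool>();
        int remaining = tasks.Length;

        if (remaining == 0)
        {
            tcs.SetResult(true);
            return tcs.Task;
        }

        foreach (Task task in tasks)
        {
            // Register a callback on each task; it decrements the shared counter atomically.
            task.ContinueWith(
                _ =>
                {
                    if (Interlocked.Decrement(ref remaining) == 0)
                    {
                        // The last task to finish makes the combined result available.
                        tcs.TrySetResult(true);
                    }
                },
                CancellationToken.None,
                TaskContinuationOptions.ExecuteSynchronously,
                TaskScheduler.Default);
        }

        return tcs.Task;
    }
}
```

The only per-task cost in this sketch is one callback registration and one atomic decrement, which is why the overhead is small next to any real I/O.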
Unless you have measured a performance problem, I wouldn't worry about the overhead of Task.WhenAll.
I have a similar question to Running async methods in parallel, in that I wish to run a number of functions from a list of functions in parallel.
I have noted in a number of comments online it is mentioned that if you have another await in your methods, Task.WhenAll() will not help as Async methods are not parallel.
I then went ahead and created a thread for each function call with the code below (the number of parallel functions will be small, typically 1 to 5):
public interface IChannel
{
Task SendAsync(IMessage message);
}
public class SendingChannelCollection
{
protected List<IChannel> _channels = new List<IChannel>();
/* snip methods to add channels to list etc */
public async Task SendAsync(IMessage message)
{
var tasks = SendAll(message);
await Task.WhenAll(tasks.AsParallel().Select(async task => await task));
}
private IEnumerable<Task> SendAll(IMessage message)
{
foreach (var channel in _channels)
yield return channel.SendAsync(message);
}
}
I would like to double-check that I am not doing anything horrendous in terms of code smells or bugs as I get to grips with what I have patched together from what I have found online. Many thanks in advance.
Let's compare the behaviour of your line:
await Task.WhenAll(tasks.AsParallel().Select(async task => await task));
in contrast with:
await Task.WhenAll(tasks);
What are you delegating to PLINQ in the first case? Only the await operation, which does basically nothing - it invokes the async/await machinery to wait for one task. So you're setting up a PLINQ query that does all the heavy work of partitioning and merging the results of an operation that amounts to "do nothing until this task completes". I doubt that is what you want.
If you have another await in your methods, Task.WhenAll() will not help as Async methods are not parallel.
I couldn't find that in any of the answers to the linked questions, except for one comment under the question itself. I'd say that it's probably a misconception, stemming from the fact that async/await doesn't magically turn your code into concurrent code. But, assuming you're in an environment without a custom SynchronizationContext (so not an ASP.NET or WPF app), continuations of async functions will be scheduled on the thread pool and may run in parallel. I'll refer you to this answer to shed some light on that. That basically means that if your SendAsync looks something like this:
async Task SendAsync(IMessage message)
{
// Synchronous initialization code.
await something;
// Continuation code.
}
Then:
The first part before await runs synchronously. If this part is heavyweight, you should introduce parallelism in SendAll so that the initialization code is run in parallel.
await works as usual, waiting for work to complete without using up any threads.
The continuation code will be scheduled on the thread pool, so if a few awaits finish at the same time their continuations might run in parallel, provided there are enough threads in the thread pool.
All of the above is assuming that await something actually awaits asynchronously. If there's a chance that await something completes synchronously, then the continuation code will also run synchronously.
Now there is a catch. In the question you linked one of the answers states:
Task.WhenAll() has a tendency to become unperformant with large scale/amount of tasks firing simultaneously - without moderation/throttling.
Now I don't know if that's true, since I wasn't able to find any other source claiming that. I guess it's possible, and in that case it might actually be beneficial to invoke PLINQ to deal with partitioning and throttling for you. However, you said you typically handle 1-5 functions, so you shouldn't worry about this.
So to summarize, parallelism is hard and the correct approach depends on what exactly your SendAsync method looks like. If it has heavyweight initialization code and that's what you want to parallelise, you should run all the calls to SendAsync in parallel. Otherwise, async/await will be implicitly using the thread pool anyway, so your call to PLINQ is redundant.
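Both options from the summary can be sketched as follows. IChannel and SendingChannelCollection mirror the question; IMessage's empty definition, the Add helper, Message, DelayChannel and SendWithOffloadAsync are hypothetical additions so that the sketch is self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public interface IMessage { }

public interface IChannel
{
    Task SendAsync(IMessage message);
}

public class SendingChannelCollection
{
    private readonly List<IChannel> _channels = new List<IChannel>();

    // Hypothetical helper standing in for the snipped "add channel" methods.
    public void Add(IChannel channel) => _channels.Add(channel);

    // Usual case: start every send, then wait for all of them. No PLINQ needed.
    public Task SendAsync(IMessage message) =>
        Task.WhenAll(_channels.Select(channel => channel.SendAsync(message)));

    // Only if the synchronous part before the first await is heavyweight:
    // push each call onto the thread pool so those parts run in parallel.
    public Task SendWithOffloadAsync(IMessage message) =>
        Task.WhenAll(_channels.Select(channel => Task.Run(() => channel.SendAsync(message))));
}

// Hypothetical message and channel used only to exercise the sketch.
public class Message : IMessage { }

public class DelayChannel : IChannel
{
    public int Sent;

    public async Task SendAsync(IMessage message)
    {
        await Task.Delay(50);              // simulated network I/O
        Interlocked.Increment(ref Sent);
    }
}
```

With one to five channels, the first variant is likely all that is needed. The second trades extra thread-pool work for parallel initialization, so it only pays off when that synchronous part has been shown to be expensive.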
I am using asp.net web api 2 and Entity Framework 6.
Original pseudo code
public IHttpActionResult GetProductLabel(int productId)
{
var productDetails = repository.GetProductDetails(productId);
var label = labelCalculator.Render(productDetails);
return Ok(label);
}
Modified code
public async Task<IHttpActionResult> GetProductLabel(int productId)
{
var productDetails = await repository.GetProductDetailsAsync(productId); // 1 long second as this call goes into sub services
var label = labelCalculator.Render(productDetails); // 1.5 seconds synchronous code
return Ok(label);
}
Before my change everything ran synchronously.
After my change the call to a remote service which is calling again a database is done the async-await way.
Then I do a sync call to a rendering library which offers only sync methods. The calculation takes 1.5 seconds.
Is there still a benefit that I did the remote database_service call the async-await way but the 2nd call not? And is there anything I could still improve?
Note
The reason why I ask this is because:
"With async controllers when a process is waiting for I/O to complete, its thread is freed up for the server to use for processing other requests."
So when the first remote database_service call is processing and awaiting for that 1 second the thread is returned to IIS??!!
But what about the 2nd label calculation taking 1.5 seconds, which will block the current thread again for 1.5 seconds?
So I release and block the thread, that does not make sense or what do you think?
The rendering library is not simply "blocking a thread", it is doing work performing your rendering. There is nothing better you can do.
Is there still a benefit that I did the remote database_service call
the async-await way but the 2nd call not?
Yes, this call is now non-blocking and can run alongside other code, even though it is only for 1 second.
And is there anything I could still improve?
If you can have the second call run asynchronously too, the whole method can run asynchronously and won't block at all.
Asynchronous code creates a continuation under the hood; it is a sort of syntactic sugar that makes asynchronous programming feel more synchronous.
In general, depending on the operations themselves, it could help to make both long running tasks asynchronous. It would use different tasks under the hood for each different long running task and run them asynchronously.
At the moment, these tasks run completely sequentially within GetProductLabel, meaning that if it is the only method you are calling, you would not be able to tell the difference from synchronous code.
If it is possible I would make the second method async, since I am not familiar with any significant drawbacks of using tasks and async/await.
In your case there is nothing better you can do, and it won't make much difference, since you have to run it synchronously, as you are using the result from the first method.
This code hangs (does not return a response) when I make a request to it:
public class MyController : ApiController
{
public async Task<IQueryable<int>> Get()
{
return await new Task<IQueryable<int>>(() => new List<int>().AsQueryable());
}
}
But this method works fine:
public IQueryable<int> Get()
{
return new List<int>().AsQueryable();
}
What fundamental knowledge am I missing??!
As the other answer noted, the reason your controller is not finishing is because the task is not started. However, using Task.Start, Task.Factory.StartNew, or Task.Run is not a good solution on ASP.NET.
The entire point of using async/await on ASP.NET is to free up a thread. However, if you await a task that is started via Start/StartNew/Run, then you're freeing up one thread by taking up another thread from the same thread pool. In this case, you not only lose all the benefits of async/await, but you actually decrease your scalability by regularly throwing off the ASP.NET thread pool heuristics.
There are two types of tasks, as I describe on my blog: Delegate Tasks (which represent some work executed on a thread) and Promise Tasks (which represent an event). You should avoid Delegate Tasks on ASP.NET, including any tasks that are "started" (Start/StartNew/Run).
Since you're returning an IQueryable<T>, I'm assuming that your actual underlying operation is a database query. If you're using EF6, then you have full support for asynchronous queries, which are properly implemented with Promise Tasks.
You're not actually starting your Task so it will wait for something that will never begin.
Instead use Task.Factory.StartNew, which creates and starts the task in one step, or call Start() on the task and then await the task.
An overview of ways to start a task: http://dotnetcodr.com/2014/01/01/5-ways-to-start-a-task-in-net-c/
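The "never started" state is easy to observe directly. The following is a small sketch with hypothetical class and method names: a task created with the Task constructor stays in the Created state until something starts it, so awaiting it never finishes.

```csharp
using System.Threading.Tasks;

public static class ColdTaskDemo
{
    // A task built with the constructor is "cold": created but not scheduled.
    public static TaskStatus ColdStatus()
    {
        var cold = new Task<int>(() => 42);
        return cold.Status;               // TaskStatus.Created
    }

    // Task.FromResult returns a task that is already completed.
    public static TaskStatus CompletedStatus()
    {
        return Task.FromResult(42).Status; // TaskStatus.RanToCompletion
    }

    // Waiting on a cold task only ends because the timeout task finishes first.
    public static async Task<bool> AwaitingColdTaskTimesOutAsync()
    {
        var cold = new Task<int>(() => 42);
        Task finished = await Task.WhenAny(cold, Task.Delay(200));
        return finished != cold;           // true: the cold task never completed
    }
}
```

This is the same situation as the hanging controller action: the framework awaits a task that nothing will ever run.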
There is absolutely no need in async/await there, the method can look like:
public Task<IQueryable<int>> Get()
{
return Task.FromResult(new List<int>().AsQueryable());
}
If you really need it to be async, ok, you can always write something like:
public async Task<IQueryable<int>> Get()
{
return await Task.FromResult(new List<int>().AsQueryable());
}
which will introduce a little overhead (a whole state machine will be generated by the compiler).
Also, as others have already stated, tasks returned from async methods should be hot (started).
Keep in mind that Task.FromResult returns a completed task, and this case can be optimized by the code that async/await generates. Writing Task.Run in this case is weird at best.
Read Task-based Asynchronous Pattern for more details
I inherited a large web application that uses MVC5 and C#. Some of our controllers make several slow database calls and I want to make them asynchronous in an effort to allow the worker threads to service other requests while waiting for the database calls to complete. I want to do this with the least amount of refactoring. Say I have the following controller
public string JsonData()
{
var a = this.servicelayer.getA();
var b = this.servicelayer.getB();
return SerializeObject(new {a, b});
}
I have made the two expensive calls a, b asynchronous by leaving the service layer unchanged and rewriting the controller as
public async Task<string> JsonData()
{
var task1 = Task<something>.Run(() => this.servicelayer.getA());
var task2 = Task<somethingelse>.Run(() => this.servicelayer.getB());
await Task.WhenAll(task1, task2);
var a = await task1;
var b = await task2;
return SerializeObject(new {a, b});
}
The above code runs without any issues, but I can't tell using Visual Studio whether the worker threads are now available to service other requests, or whether using Task.Run() in an ASP.NET controller doesn't do what I think it does. Can anyone comment on the correctness of my code and whether it can be improved in any way? Also, I read that using async in a controller has additional overhead and should be used only for long-running code. What are the minimum criteria that I can use to decide if the controller needs async? I understand that every use case is different, but I am wondering if there is a baseline that I can use as a starting point. 2 database calls? Anything over 2 seconds to return?
The guideline is that you should use async whenever you have I/O, e.g., a database. The overhead is minuscule compared to any kind of I/O.
That said, blocking a thread pool thread via Task.Run is what I call "fake asynchrony". It's exactly what you don't want to do on ASP.NET.
Instead, start at your "lowest-level" code and make that truly asynchronous. E.g., EF6 supports asynchronous database queries. Then let the async code grow naturally from there towards your controller.
The only improvement the new code has is it runs both A and B concurrently and not one at a time. There's actually no real asynchrony in this code.
When you use Task.Run you are offloading work to another thread, so basically you occupy 2 thread-pool threads and release the current thread while awaiting both tasks (each of them running completely synchronously).
That means that the operation will finish faster (because of the parallelism) but will be using twice the threads and so will be less scalable.
What you do want to do is make sure all your operations are truly asynchronous. That will mean having a servicelayer.getAAsync() and servicelayer.getBAsync() so you could truly release the threads while IO is being processed:
public async Task<string> JsonData()
{
return SerializeObject(new { a = await servicelayer.getAAsync(), b = await servicelayer.getBAsync() });
}
If you can't make sure your actual IO operations are truly async, it would be better to keep the old code.
More on why to avoid Task.Run: Task.Run Etiquette Examples: Don't Use Task.Run in the Implementation
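If the concurrency from the Task.Run version is still wanted once the service layer is truly asynchronous, both operations can be started before either is awaited, with no extra threads involved. The sketch below is built on assumptions: ServiceLayer, its return types and the SerializeObject placeholder are hypothetical, and Task.Delay stands in for real database I/O. Note also that a single EF6 DbContext instance does not support more than one operation at a time, so this shape needs a separate context per concurrent query.

```csharp
using System.Threading.Tasks;

// Hypothetical async service layer; Task.Delay stands in for real database I/O.
public class ServiceLayer
{
    public async Task<string> getAAsync()
    {
        await Task.Delay(100);
        return "A";
    }

    public async Task<int> getBAsync()
    {
        await Task.Delay(100);
        return 2;
    }
}

public class DataController
{
    private readonly ServiceLayer servicelayer = new ServiceLayer();

    public async Task<string> JsonData()
    {
        // Start both I/O operations; no thread is held while they are in flight.
        Task<string> taskA = servicelayer.getAAsync();
        Task<int> taskB = servicelayer.getBAsync();

        await Task.WhenAll(taskA, taskB);

        // Both tasks have completed, so these awaits return immediately.
        var a = await taskA;
        var b = await taskB;
        return SerializeObject(new { a, b });
    }

    // Placeholder for whatever serializer the original code uses.
    private static string SerializeObject(object value) => value.ToString();
}
```

Whether the concurrent form is worth it depends on the data access layer. If both values come from the same database, combining them into one query may be the better fix.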