I've created a sample .NET Core WebApi application to test how async methods can increase throughput. The app is hosted on IIS 10.
Here is the code of my controller:
[HttpGet("sync")]
public IEnumerable<string> Get()
{
return this.GetValues().Result;
}
[HttpGet("async")]
public async Task<IEnumerable<string>> GetAsync()
{
return await this.GetValues();
}
[HttpGet("isresponding")]
public Task<bool> IsResponding()
{
return Task.FromResult(true);
}
private async Task<IEnumerable<string>> GetValues()
{
await Task.Delay(TimeSpan.FromSeconds(10)).ConfigureAwait(false);
return new string[] { "value1", "value2" };
}
There are three methods:
Get() - gets the result synchronously.
GetAsync() - gets the result asynchronously.
IsResponding() - checks that the server can still serve requests.
Then I created a sample console app which fires 100 requests at the sync or async method of the controller (without waiting for the results). Then I call IsResponding() to check whether the server is available.
Console app code is:
using (var httpClient = new HttpClient())
{
var methodUrl = $"http://localhost:50001/api/values/{apiMethod}";
Console.WriteLine($"Executing {methodUrl}");
//var result1 = httpClient.GetAsync($"http://localhost:50001/api/values/{apiMethod}").Result.Content.ReadAsStringAsync().Result;
Parallel.For(0, 100, ((i, state) =>
{
httpClient.GetAsync(methodUrl);
}));
var sw = Stopwatch.StartNew();
var isAlive = httpClient.GetAsync($"http://localhost:50001/api/values/isresponding").Result.Content;
Console.WriteLine($"{sw.Elapsed.TotalSeconds} sec.");
Console.ReadKey();
}
where {apiMethod} is "sync" or "async", depending on user input.
In both cases the server does not respond for a long time (about 40 sec).
I expected that in the async case the server would continue serving requests quickly, but it doesn't.
UPDATE 1:
I've changed client code like this:
Parallel.For(0, 10000, ((i, state) =>
{
var httpClient = new HttpClient();
httpClient.GetAsync($"http://localhost:50001/api/values/{apiMethod}");
}));
using (var httpClient = new HttpClient())
{
var sw = Stopwatch.StartNew();
// this method should evaluate fast when we called async version and should evaluate slowly when we called sync method (due to busy threads ThreadPool)
var isAlive = httpClient.GetAsync($"http://localhost:50001/api/values/isresponding").Result.Content;
Console.WriteLine($"{sw.Elapsed.TotalSeconds} sec.");
}
and the call to IsResponding() executes for a very long time.
UPDATE 2
Yes, I know how async methods work. Yes, I know how to use HttpClient. It's just a sample to prove theory.
UPDATE 3
As mentioned by StuartLC in one of the comments, IIS is somehow throttling or blocking the requests. When I started my WebApi self-hosted, it worked as expected:
Execution time of the "isresponding" method after a bunch of requests to the ASYNC method is very fast, about 0.02 sec.
Execution time of the "isresponding" method after a bunch of requests to the SYNC method is very slow, about 35 sec.
You don't seem to understand async. It doesn't make the response return faster. The response cannot be returned until everything the action is doing is complete, async or not. If anything, async is actually slower, if only slightly, because there's additional overhead involved in asynchronous processing not necessary for synchronous processing.
What async does do is potentially allow the active thread servicing the request to be returned to the pool to service other requests. In other words, async is about scale, not performance. You'll only see benefits when your server is slammed with requests. Then, when incoming requests would normally have been queued sync, you'll process additional requests from some of the async tasks forfeiting their threads to the cause. Additionally, there is no guarantee that the thread will be freed at all. If the async task completes immediately or near immediately, the thread will be held, just as with sync.
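The distinction can be sketched like this (an illustrative pair of actions in the style of the question's controller; both return after the same delay, and only the thread usage differs):

```csharp
// Blocking: a thread-pool thread is held for the full 10 seconds.
[HttpGet("sync")]
public IEnumerable<string> Get()
{
    Thread.Sleep(TimeSpan.FromSeconds(10));
    return new[] { "value1", "value2" };
}

// Non-blocking: the thread goes back to the pool during the delay,
// and a (possibly different) pool thread resumes the method afterwards.
[HttpGet("async")]
public async Task<IEnumerable<string>> GetAsync()
{
    await Task.Delay(TimeSpan.FromSeconds(10));
    return new[] { "value1", "value2" };
}
```

An individual caller sees roughly 10 seconds either way; the difference only appears under load, when the blocking version exhausts the pool.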
EDIT
You should also realize that IIS Express is single-threaded. As such, it's not a good gauge for performance tuning. If you're running 1000 simultaneous requests, 999 are instantly queued. Then, you're not doing any asynchronous work - just returning a completed task. As such, the thread will never be released, so there is literally no difference between sync and async in this scenario. Therefore, you're down to just how long it takes to process through the queue of 999 requests (plus your status check at the end). You might have better luck teasing out a difference if you do something like:
await Task.Delay(500);
Instead of just return Task.FromResult. That way, there's actual idle time on the thread that may allow it to be returned to the pool.
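Applied to the sample controller, that suggestion would look something like this (a sketch; the 500 ms value is arbitrary):

```csharp
[HttpGet("isresponding")]
public async Task<bool> IsResponding()
{
    // Real asynchronous idle time, unlike Task.FromResult(true),
    // which completes synchronously and never releases the thread.
    await Task.Delay(500);
    return true;
}
```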
I'm not sure this will yield any major improvement, but you should call ConfigureAwait(false) on every awaitable in the server, including in GetAsync.
It should make better use of the available threads.
Related
I have this function:
async Task RefreshProfileInfo(List<string> listOfPlayers)
// For each player in the listOfPlayers, checks an in-memory cache if we have an entry.
// If we have a cached entry, do nothing.
// If we don't have a cached entry, fetch from backend via an API call.
This function is called very frequently, like:
await RefreshProfileInfo(playerA, playerB, playerC)
or
await RefreshProfileInfo(playerB, playerC, playerD)
or
await RefreshProfileInfo(playerE, playerF)
Ideally, if the players do not overlap each other, the calls should not affect each other (requesting PlayerE and PlayerF should not block the request for PlayerA, PlayerB, PlayerC). However, if the players DO overlap each other, the second call should wait for the first (requesting PlayerB, PlayerC, PlayerD, should wait for PlayerA, PlayerB, PlayerC to finish).
However, if that isn't possible, at the very least I'd like all calls to be sequential. (I think they should still be async, so they don't block other unrelated parts of the code).
Currently, what happens is that each RefreshProfileInfo call runs in parallel, which results in hitting the backend every time (8 fetches in this example).
Instead, I want to execute them sequentially, so that only the first call hits the backend, and subsequent calls just hit cache.
What data structure/approach should I use? I'm having trouble figuring out how to "connect" the separate calls to each other. I've been playing around with Task.WhenAll() as well as SemaphoreSlim, but I can't figure out how to use them properly.
Failed attempt
The idea behind my failed attempt was to have a helper class where I could call a function, SequentialRequest(Task), and it would sequentially run all tasks invoked in this manner.
List<Task> waitingTasks = new List<Task>();
object _lock = new object();
public async Task SequentialRequest(Task func)
{
var waitingTasksCopy = new List<Task>();
lock (_lock)
{
waitingTasksCopy = new List<Task>(waitingTasks);
waitingTasks.Add(func); // Add this task to the waitingTasks (for future SequentialRequests)
}
// Wait for everything before this to finish
if (waitingTasksCopy.Count > 0)
{
await Task.WhenAll(waitingTasksCopy);
}
// Run this task
await func;
}
I thought this would work, but "func" either runs instantly (instead of waiting for earlier tasks to finish) or never runs at all, depending on how I call it.
If I call it using this, it runs instantly:
async Task testTask()
{
await Task.Delay(4000);
}
If I call it using this, it never runs:
Task testTask = new Task(async () =>
{
await Task.Delay(4000);
});
Here's why your current attempt doesn't work:
// Run this task
await func;
The comment above does not describe what the code is doing. In the asynchronous world, a Task represents an operation that is already in progress. Tasks are not "run" by using await; await is a way for the current code to asynchronously wait for a task to complete. So no function signature taking a Task is going to work; the task is already in progress before it's even passed to that function.
Your question is actually about caching asynchronous operations. One way to do this is to cache the Task<T> itself. Currently, your cache holds the results (T); you can change your cache to hold the asynchronous operations that retrieve those results (Task<T>). For example, if your current cache type is ConcurrentDictionary<PlayerId, Player>, you could change it to ConcurrentDictionary<PlayerId, Task<Player>>.
With a cache of tasks, when your code checks for a cache entry, it will find an existing entry if the player data is loaded or has started loading, because the Task<T> represents an asynchronous operation that is already in progress (or has already completed).
A couple of notes for this approach:
This only works for in-memory caches.
Think about how you want to handle errors. A naive cache of Task<T> will also cache error results, which is usually not desired.
The second point above is the trickier part. When an error happens, you'd probably want some additional logic to remove the errored task from the cache. Bonus points (and additional complexity) if the error handling code prevents an errored task from getting into the cache in the first place.
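One possible shape for such a cache (a sketch assuming string player IDs; FetchPlayerAsync and Player are hypothetical stand-ins for the real backend call and profile type):

```csharp
private readonly ConcurrentDictionary<string, Task<Player>> _cache =
    new ConcurrentDictionary<string, Task<Player>>();

public Task<Player> GetPlayerAsync(string playerId)
{
    // Concurrent callers for the same id share one in-flight task.
    return _cache.GetOrAdd(playerId, id => FetchOrEvictAsync(id));
}

private async Task<Player> FetchOrEvictAsync(string id)
{
    try
    {
        return await FetchPlayerAsync(id); // the actual backend call
    }
    catch
    {
        // Evict the faulted task so the next caller retries
        // instead of observing a cached exception forever.
        _cache.TryRemove(id, out _);
        throw;
    }
}
```

Note that GetOrAdd's factory can run more than once under a race; wrapping the values in Lazy<Task<Player>> closes that gap if it matters.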
at the very least I'd like all calls to be sequential
Well, that's much easier. SemaphoreSlim is the asynchronous replacement for lock, so you can use a shared SemaphoreSlim. Call await mySemaphoreSlim.WaitAsync(); at the beginning of RefreshProfileInfo, put the body in a try, and in the finally block at the end of RefreshProfileInfo, call mySemaphoreSlim.Release();. That will limit all calls to RefreshProfileInfo to running sequentially.
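Concretely, the described pattern is (a sketch around the existing method body):

```csharp
private static readonly SemaphoreSlim _mutex = new SemaphoreSlim(1, 1);

async Task RefreshProfileInfo(List<string> listOfPlayers)
{
    await _mutex.WaitAsync();
    try
    {
        // existing body: check the cache, fetch missing players from the backend
    }
    finally
    {
        _mutex.Release();
    }
}
```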
I had the same issue in one of my projects. I had multiple threads call a single method and they all made IO calls when not found in cache. What you want to do is to add the Task to your cache and then await it. Subsequent calls will then just read the result once the task completes.
Example:
private Task RefreshProfile(Player player)
{
// cache is of type IMemoryCache
return _cache.GetOrCreate(player, entry =>
{
// expire in 30 seconds
entry.AbsoluteExpiration = DateTimeOffset.UtcNow.AddSeconds(30);
return ActualRefreshCodeThatReturnsTask(player);
});
}
Then just await in your calling code
await Task.WhenAll(RefreshProfile(Player a), RefreshProfile(Player b), RefreshProfile(Player c));
I am trying to make my WebApi async so that the ASP.NET threads handling requests are not blocked while accessing the database. So I created the following code. To my understanding, when this action is called, a new thread away from the ASP.NET thread pool is created to handle the GetBalance method, and the thread that handled this action request is freed and returned to the pool, to be used by other requests until the GetBalance method finishes its IO. Is this correct?
An article I have read suggests that my async calls have to go all the way down the call chain until they reach the lowest-level async call, in this example an Entity Framework async call. Otherwise the new thread created by the code below will still be created in the ASP.NET thread pool, and I will just be freeing one thread to occupy another, which undermines the whole effort of using async/await to increase the scalability of this WebApi.
Could anybody please explain how this works, and whether my understanding is correct?
public async Task<Account> Balance(int number)
{
Task<Account> task = GetBalanceAsync(number);
await task;
return task.Result;
}
Task<Account> GetBalanceAsync(int number)
{
return Task.Factory.StartNew(() => GetBalance(number));
}
Account GetBalance(int number)
{
using (AccountServices accountService = new AccountServices())
{
Account account = accountService.Find(number);
return account;
}
}
There's only one thread pool. ASP.NET requests just run on a regular thread pool thread.
To my understanding when this action is called a new thread away from the ASP.net thread pool is created to handle the GetBalance method and the thread that handled this action request in the past will get freed and returned to the pool until to be used by other requests till the GetBalance method finishes its IO. Is this correct?
Yes; your code is taking one thread from the thread pool (StartNew), and then returning a thread to the thread pool (await).
Some article I have read suggests that my async calls has to go all the way through all the call chains till it reaches the lowest level async call, in this example an Entity Framework async call. Other wise the new thread created by the code below will still be created in the ASP.net thread pool and I will just be freeing a thread to occupy another, which undermines the whole effort done in the async wait to increase scalability of this WebApi.
Yes, that's exactly correct. The code posted adds complexity and overhead, and will have worse performance than synchronous code:
public Account Balance(int number)
{
return GetBalance(number);
}
Account GetBalance(int number)
{
using (AccountServices accountService = new AccountServices())
{
return accountService.Find(number);
}
}
A fully-asynchronous solution will have better scalability:
public async Task<Account> Balance(int number)
{
return await GetBalanceAsync(number);
}
async Task<Account> GetBalanceAsync(int number)
{
using (AccountServices accountService = new AccountServices())
{
return await accountService.FindAsync(number);
}
}
I use the standard HttpListener from the System.Net namespace. When I use Task.Delay to simulate some work on the server side (without await) and test the server with Apache Benchmark, it gives good results (2000 rps). But when I await the delay, throughput is about 9 rps (according to Apache Benchmark). Why does it behave like this? I appreciate any answers.
private async Task Listen()
{
while (listener.IsListening)
{
try
{
var context = await listener.GetContextAsync().ConfigureAwait(false);
context.Response.StatusCode = (int)HttpStatusCode.OK;
// good without await
await Task.Delay(100).ConfigureAwait(false);
using (var sw = new StreamWriter(context.Response.OutputStream))
{
await sw.WriteAsync("<html><body>Hello</body></html>");
await sw.FlushAsync();
}
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
}
}
public Task Run()
{
listener.Start();
// Start the accept loop without awaiting it here.
return Task.Run(() => Listen());
}
This is because the pause you've inserted delays the loop.
You only accept requests on the line:
await listener.GetContextAsync()
which is in this same loop.
This means that you have to wait for the previous request to complete before you accept a new context.
Most likely, you want to handle the accepted contexts concurrently. To achieve this, you should accept contexts in a tight loop, then spin off the handling of the request to a different place which doesn't "hang" the accept loop.
At its simplest, you could "fire and forget"...
while (listener.IsListening)
{
var context = await listener.GetContextAsync().ConfigureAwait(false);
HandleContextAsync(context); //note: not awaited
}
...in reality, you'll need to think a bit harder about how to track exceptions in the HandleContextAsync method.
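A minimal shape for it (HandleContextAsync is a name assumed here, not part of HttpListener; the body mirrors the work done in the original loop):

```csharp
private async Task HandleContextAsync(HttpListenerContext context)
{
    try
    {
        context.Response.StatusCode = (int)HttpStatusCode.OK;
        await Task.Delay(100).ConfigureAwait(false); // simulated work
        using (var sw = new StreamWriter(context.Response.OutputStream))
        {
            await sw.WriteAsync("<html><body>Hello</body></html>");
        }
    }
    catch (Exception e)
    {
        // With fire-and-forget, exceptions must be observed here,
        // otherwise they are silently lost.
        Console.WriteLine(e.Message);
    }
}
```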
The Delay() Task is created and allowed to run in its own time, without influencing the creating method any further. If you neither await (asynchronous) nor Wait() (synchronous) on that Task, the creating method carries on immediately, making it appear 'fast'. await actually waits for the delay Task to complete, but does so in a way that means the thread does not block and can execute other work concurrently. Your other version doesn't wait at all, synchronously or otherwise: the delay Task is created and runs, but since nothing cares when it finishes, it just gets executed and garbage-collected, playing no further part in the method.
That's because you wait 100 ms before processing the next request, so you can process at most 10 requests per second. Task.Delay without await doesn't affect your application at all; you could remove that line and still handle 2000 requests per second.
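The difference is easy to see in isolation (a standalone sketch inside some async method; Stopwatch comes from System.Diagnostics):

```csharp
var sw = Stopwatch.StartNew();
_ = Task.Delay(100);   // fire-and-forget: the task runs, but nothing waits for it
Console.WriteLine(sw.ElapsedMilliseconds);   // ~0 ms

sw.Restart();
await Task.Delay(100); // actually suspends this method for ~100 ms
Console.WriteLine(sw.ElapsedMilliseconds);   // ~100 ms
```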
I would need your help with the following. For nearly a month, I have been reading about Tasks and async.
I wanted to try out my newly acquired knowledge in a simple web API project. I have the following methods, and both of them work as expected:
public HttpResponseMessage Get()
{
var data = _userServices.GetUsers();
return Request.CreateResponse(HttpStatusCode.OK, data);
}
public async Task<HttpResponseMessage> Get()
{
var data = _userServices.GetUsers();
return await Task<HttpResponseMessage>.Factory.StartNew(() =>
{
return Request.CreateResponse(HttpStatusCode.OK, data);
});
}
So, the question: I have used Fiddler to see the difference between these two. The async one is a little faster, but apart from that, what is the real benefit of implementing something like this in a web API?
As others have pointed out, the point of async on ASP.NET is that it frees up one of the ASP.NET thread pool threads. This works great for naturally-asynchronous operations such as I/O-bound operations because that's one less thread on the server (there is no thread that is "processing" the async operation, as I explain on my blog). Thus, the primary benefit of async on the server side is scalability.
However, you want to avoid Task.Run (and, even worse, Task.Factory.StartNew) on ASP.NET. I call this "fake asynchrony" because they're just doing synchronous/blocking work on a thread pool thread. They're useful in UI apps where you want to push work off the UI thread so the UI remains responsive, but they should (almost) never be used on ASP.NET or other server apps.
Using Task.Run or Task.Factory.StartNew on ASP.NET will actually decrease your scalability. They will cause some unnecessary thread switches. For longer-running operations, you could end up throwing off the ASP.NET thread pool heuristics, causing additional threads to be created and later destroyed needlessly. I explore these performance problems step-by-step in another blog post.
So, you need to think about what each action is doing, and whether any of that should be async. If it should, then that action should be async. In your case:
public HttpResponseMessage Get()
{
var data = _userServices.GetUsers();
return Request.CreateResponse(HttpStatusCode.OK, data);
}
What exactly is Request.CreateResponse doing? It's just creating response object. That's it - just a fancy new. There's no I/O going on there, and it certainly isn't something that needs to be pushed off to a background thread.
However, GetUsers is much more interesting. That sounds more like a data read, which is I/O-based. If your backend can scale (e.g., Azure SQL / Tables / etc), then you should look at making that async first, and once your service is exposing a GetUsersAsync, then this action could become async too:
public async Task<HttpResponseMessage> Get()
{
var data = await _userServices.GetUsersAsync();
return Request.CreateResponse(HttpStatusCode.OK, data);
}
Using async on your server can dramatically improve scalability, as it frees up the thread serving the request to handle other requests while the async operation is in progress. For example, in a synchronous IO operation the thread would be suspended, doing nothing until the operation completes, and would not be available to serve another request.
That being said, using Task.Factory.StartNew starts another thread, so you don't get the scalability benefits at all. Your original thread can be reused, but you have offloaded the work to another thread, so there is no net benefit. In fact there is a cost of switching to another thread, though it is minimal.
Truly asynchronous operations do not start a thread and I would look to see if such an operation exists, or if one can be written for Request.CreateResponse. Then your code would be much more scalable. If not, you are better off sticking with the synchronous approach.
It makes the most sense where the call performs major IO operations.
Yes, async is faster because it frees up the request thread while the operation is being performed. Thus, from the web server's point of view, you are giving a thread back to the pool that the server can use for future incoming calls.
So, for example, when you perform a search operation on SQL Server, you might want to use async and see the performance benefit.
It is good for scalability, including setups that involve multiple servers.
So, for example, when SearchRecordAsync sends its SQL to the database, it returns an incomplete task, and when the request hits the await, it returns the request thread to the thread pool. Later, when the DB operation completes, a request thread is taken from the thread pool and used to continue the request.
Even if you are not performing SQL operations - say you want to send an email to 10 people - async also makes sense.
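For the mail example, a truly async mail API lets the ten sends overlap without occupying ten threads (SendMailAsync here is a hypothetical helper, not a specific library call):

```csharp
public async Task NotifyAsync(IEnumerable<string> recipients)
{
    // Start all sends, then asynchronously wait for all of them;
    // no thread is blocked while the mail server does its work.
    var sends = recipients.Select(r => SendMailAsync(r));
    await Task.WhenAll(sends);
}
```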
Async is also very handy for showing the progress of a long-running operation, so the user still gets a responsive GUI while the task runs in the background.
To understand, please have a look at this sample.
Here I initiate a send-mail task. In the meantime I update the database, while the send-mail task runs in the background.
Once the database update has happened, the code waits for the send-mail task to complete. This approach makes it clear that I can run a task in the background and still proceed on the original (main) thread.
using System;
using System.Threading;
using System.Threading.Tasks;
public class Program
{
public static void Main()
{
Console.WriteLine("Starting Send Mail Async Task");
Task task = new Task(SendMessage);
task.Start();
Console.WriteLine("Update Database");
UpdateDatabase();
while (true)
{
// dummy wait for background send mail.
if (task.Status == TaskStatus.RanToCompletion)
{
break;
}
}
}
public static async void SendMessage()
{
// Calls to TaskOfTResult_MethodAsync
Task<bool> returnedTaskTResult = MailSenderAsync();
bool result = await returnedTaskTResult;
if (result)
{
UpdateDatabase();
}
Console.WriteLine("Mail Sent!");
}
private static void UpdateDatabase()
{
for (var i = 1; i < 1000; i++) ;
Console.WriteLine("Database Updated!");
}
private static async Task<bool> MailSenderAsync()
{
Console.WriteLine("Send Mail Start.");
// Simulated CPU-bound work. Note: there is no await here, so this
// method actually runs synchronously (compiler warning CS1998).
for (var i = 1; i < 1000000000; i++) ;
return true;
}
}
I've got an NServiceBus host that goes and downloads a whole bunch of data once a message comes through about a particular user's account. One data file is about 3 MB (MYOB, via a web service call) and another is about 2 MB (a RESTful endpoint, quite fast!). To avoid waiting around for long, I've wrapped the two download calls like this:
var myobBlock = Task.Factory.StartNew(() => myobService.GetDataForUser(accountId, datablockId, CurrencyFormat.IgnoreValidator));
var account = Task.Factory.StartNew(() => accountService.DownloadMetaAccount(accountId, securityContext));
Task.WaitAll(myobBlock, account);
var myobData = myobBlock.Result;
var accountData = account.Result;
//...Process AccountData Object using myobData object
I'm wondering what the benefits of the new async/await patterns are here, compared to the TPL-esque method above. Reading Stephen Cleary's notes, it seems that the above would cause the thread to sit there waiting, whereas async/await would continue and release the thread for other work.
How would you rewrite this in terms of async/await, and would it be beneficial? We have lots of accounts to process, but it's one MSMQ message per account (end-of-FY reporting) or per request (ad hoc, when a customer calls up and wants their report).
The benefit of using async/await is that, given a truly async API (one which doesn't wrap sync methods with Task.Run and the like, but does real async I/O work), you can avoid allocating unnecessary threads that waste resources just waiting on blocking I/O operations.
Let's imagine both your service methods exposed an async API; you could then do the following instead of using two ThreadPool threads:
var myobBlock = myobService.GetDataForUserAsync(accountId, datablockId, CurrencyFormat.IgnoreValidator);
var account = accountService.DownloadMetaAccountAsync(accountId, securityContext);
// await till both async operations complete
await Task.WhenAll(myobBlock, account);
What will happen is that execution will yield back to the calling method until both tasks complete. When they do, continuation will resume via IOCP onto the assigned SynchronizationContext if needed.
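Once Task.WhenAll has completed, both tasks have finished, so reading their results no longer blocks (continuing the sketch with the answer's hypothetical async APIs):

```csharp
await Task.WhenAll(myobBlock, account);

// Safe: both tasks are already complete, so .Result does not block here.
var myobData = myobBlock.Result;
var accountData = account.Result;
//...Process accountData using myobData
```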