Client-side request rate-limiting - C#

I'm designing a .NET client application for an external API. It's going to have two main responsibilities:
Synchronization - periodically making a batch of requests to the API and saving the responses to my database.
Client - a pass-through for requests to the API made by users of my client.
The service's documentation specifies the following rules on the maximum number of requests that can be issued in a given period of time:
During the day:
Maximum of 6000 requests per hour (~1.67 per second)
Maximum of 120 requests per minute (2 per second)
Maximum of 3 requests per second
At night:
Maximum of 8000 requests per hour (~2.22 per second)
Maximum of 150 requests per minute (2.5 per second)
Maximum of 3 requests per second
Exceeding these limits won't result in an immediate lockdown - no exception will be thrown. But the provider can get annoyed, contact us, and then ban us from using the service. So I need to have some request-delaying mechanism in place to prevent that. Here's how I see it:
public async Task MyMethod(Request request)
{
await _rateLimiter.WaitForNextRequest(); // awaitable Task with calculated Delay
await _api.DoAsync(request);
_rateLimiter.AppendRequestCounters();
}
The safest and simplest option would be to respect only the lowest rate limit, i.e. a maximum of 3 requests per 2 seconds. But because of the "Synchronization" responsibility, there is a need to use as much of these limits as possible.
So the next option would be to add a delay based on the current request count. I've tried to roll something of my own, and I've also used RateLimiter by David Desmaisons, and it would've been fine, but here's the problem:
Assuming my client sends 3 requests per second to the API during the day, we're going to see:
A 20 second delay every 120th request
A ~15 minute delay every 6000th request
This would've been acceptable if my application were only about "Synchronization", but "Client" requests can't wait that long.
I've searched the web and read about token/leaky bucket and sliding window algorithms, but I couldn't translate them to my case and to .NET, since they mainly cover rejecting requests that exceed a limit. I've found this repo and that repo, but they are both server-side-only solutions.
QoS-like splitting of the rates, so that "Synchronization" would get the slower rate and "Client" the faster one, is not an option.
Assuming that current request rates will be measured, how do I calculate the delay for the next request so that it adapts to the current situation, respects all maximum rates, and is never longer than 5 seconds? Something like gradually slowing down when approaching a limit.

This is achievable by using the library you linked on GitHub. We need a composed TimeLimiter made out of three CountByIntervalAwaitableConstraint instances, like so:
var hourConstraint = new CountByIntervalAwaitableConstraint(6000, TimeSpan.FromHours(1));
var minuteConstraint = new CountByIntervalAwaitableConstraint(120, TimeSpan.FromMinutes(1));
var secondConstraint = new CountByIntervalAwaitableConstraint(3, TimeSpan.FromSeconds(1));
var timeLimiter = TimeLimiter.Compose(hourConstraint, minuteConstraint, secondConstraint);
We can test to see if this works by doing this:
for (int i = 0; i < 1000; i++)
{
await timeLimiter;
Console.WriteLine($"Iteration {i} at {DateTime.Now:T}");
}
This will run 3 times every second until we reach 120 iterations (iteration 119), then wait until the minute is over, and then continue running 3 times every second. We can also (again using the library) easily use the TimeLimiter with an HttpClient via the AsDelegatingHandler() extension method provided, like so:
var handler = TimeLimiter.Compose(hourConstraint, minuteConstraint, secondConstraint).AsDelegatingHandler();
var client = new HttpClient(handler);
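With this in place, every request sent through client is automatically held back until all three constraints allow it, for example:
var response = await client.GetAsync("https://hacker-news.firebaseio.com/v0/item/8863.json?print=pretty");
Console.WriteLine(response.StatusCode);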
We can also use CancellationTokens, but as far as I can tell not at the same time as using it as the handler for the HttpClient. Here is how you can use it with an HttpClient anyway:
var timeLimiter = TimeLimiter.Compose(hourConstraint, minuteConstraint, secondConstraint);
var client = new HttpClient();
for (int i = 0; i < 100; i++)
{
await timeLimiter.Enqueue(async () =>
{
var response = await client.GetAsync("https://hacker-news.firebaseio.com/v0/item/8863.json?print=pretty");
if (response.IsSuccessStatusCode)
Console.WriteLine(await response.Content.ReadAsStringAsync());
else
Console.WriteLine($"Error code {response.StatusCode} reason: {response.ReasonPhrase}");
}, new CancellationTokenSource(TimeSpan.FromSeconds(10)).Token);
}
Edit to address the OP's question further:
If you want to make sure a user can send a request without having to wait for the limit to reset, we need to dedicate a certain number of requests every second/minute/hour to our users. So we need a new TimeLimiter for this and also have to adjust our API TimeLimiter. Here are the two new ones:
var apiHourConstraint = new CountByIntervalAwaitableConstraint(5500, TimeSpan.FromHours(1));
var apiMinuteConstraint = new CountByIntervalAwaitableConstraint(100, TimeSpan.FromMinutes(1));
var apiSecondConstraint = new CountByIntervalAwaitableConstraint(2, TimeSpan.FromSeconds(1));
// TimeLimiter for calls automatically to the API
var apiTimeLimiter = TimeLimiter.Compose(apiHourConstraint, apiMinuteConstraint, apiSecondConstraint);
var userHourConstraint = new CountByIntervalAwaitableConstraint(500, TimeSpan.FromHours(1));
var userMinuteConstraint = new CountByIntervalAwaitableConstraint(20, TimeSpan.FromMinutes(1));
var userSecondConstraint = new CountByIntervalAwaitableConstraint(1, TimeSpan.FromSeconds(1));
// TimeLimiter for calls made manually by a user to the API
var userTimeLimiter = TimeLimiter.Compose(userHourConstraint, userMinuteConstraint, userSecondConstraint);
You can play around with the numbers to suit your needs.
Now to use it:
I saw you're using a central method to execute your requests, which makes this easier. I'll just add an optional boolean parameter that determines whether it's an automatically executed request or one made by a user. (You could replace this parameter with an enum if you want more than just automatic and manual requests.)
public static async Task DoRequest(Request request, bool manual = false)
{
TimeLimiter limiter;
if (manual)
limiter = TimeLimiterManager.UserLimiter;
else
limiter = TimeLimiterManager.ApiLimiter;
await limiter;
await _api.DoAsync(request);
}
static class TimeLimiterManager
{
public static TimeLimiter ApiLimiter { get; }
public static TimeLimiter UserLimiter { get; }
static TimeLimiterManager()
{
var apiHourConstraint = new CountByIntervalAwaitableConstraint(5500, TimeSpan.FromHours(1));
var apiMinuteConstraint = new CountByIntervalAwaitableConstraint(100, TimeSpan.FromMinutes(1));
var apiSecondConstraint = new CountByIntervalAwaitableConstraint(2, TimeSpan.FromSeconds(1));
// TimeLimiter to control access to the API for automatically executed requests
ApiLimiter = TimeLimiter.Compose(apiHourConstraint, apiMinuteConstraint, apiSecondConstraint);
var userHourConstraint = new CountByIntervalAwaitableConstraint(500, TimeSpan.FromHours(1));
var userMinuteConstraint = new CountByIntervalAwaitableConstraint(20, TimeSpan.FromMinutes(1));
var userSecondConstraint = new CountByIntervalAwaitableConstraint(1, TimeSpan.FromSeconds(1));
// TimeLimiter to control access to the API for manually executed requests
UserLimiter = TimeLimiter.Compose(userHourConstraint, userMinuteConstraint, userSecondConstraint);
}
}
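Calling it from the two paths is then just a sketch like this:
// Automated synchronization requests go through the ApiLimiter
await DoRequest(request);
// User-initiated pass-through requests go through the UserLimiter
await DoRequest(request, manual: true);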
This isn't perfect: when the user doesn't use their 20 API calls per minute but your automated system needs to execute more than its 100, the automated requests will have to wait even though capacity is technically still available.
And regarding day/night differences: you can use 2 backing fields for ApiLimiter/UserLimiter and return the appropriate one in the { get { ... } } of each property.
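Here is a rough sketch of that day/night idea for the ApiLimiter; the 22:00-06:00 night window and the night-time numbers (the 8000/150 night limits minus the user share) are assumptions you'd adjust to the provider's actual rules:
static class TimeLimiterManager
{
    private static readonly TimeLimiter _dayApiLimiter = TimeLimiter.Compose(
        new CountByIntervalAwaitableConstraint(5500, TimeSpan.FromHours(1)),
        new CountByIntervalAwaitableConstraint(100, TimeSpan.FromMinutes(1)),
        new CountByIntervalAwaitableConstraint(2, TimeSpan.FromSeconds(1)));
    private static readonly TimeLimiter _nightApiLimiter = TimeLimiter.Compose(
        new CountByIntervalAwaitableConstraint(7500, TimeSpan.FromHours(1)),
        new CountByIntervalAwaitableConstraint(130, TimeSpan.FromMinutes(1)),
        new CountByIntervalAwaitableConstraint(2, TimeSpan.FromSeconds(1)));

    // Assumed night window: 22:00 - 06:00 local time
    private static bool IsNight => DateTime.Now.Hour >= 22 || DateTime.Now.Hour < 6;

    public static TimeLimiter ApiLimiter => IsNight ? _nightApiLimiter : _dayApiLimiter;
    // UserLimiter would get the same treatment with its own pair of backing fields
}
One caveat: each limiter keeps its own history, so right at the boundary the combined rate over a rolling hour can briefly exceed a single limit; in practice that is usually acceptable, but it's worth knowing.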

Related

Orleans slow with minimalistic use case

I'm evaluating Orleans for a new project we are starting soon.
Eventually we want to run a bunch of persistent actors, but I'm currently struggling to get a baseline in-memory version of Orleans to be performant.
Given the following grain
using Common.UserWallet;
using Common.UserWallet.Messages;
using Microsoft.Extensions.Logging;
namespace Grains;
public class UserWalletGrain : Orleans.Grain, IUserWalletGrain
{
private readonly ILogger _logger;
public UserWalletGrain(ILogger<UserWalletGrain> logger)
{
_logger = logger;
}
public async Task<CreateOrderResponse> CreateOrder(CreateOrderCommand command)
{
return new CreateOrderResponse(Guid.NewGuid());
}
public Task Ping()
{
return Task.CompletedTask;
}
}
The following silo config:
static async Task<IHost> StartSiloAsync()
{
ServicePointManager.UseNagleAlgorithm = false;
var builder = new HostBuilder()
.UseOrleans(c =>
{
c.UseLocalhostClustering()
.Configure<ClusterOptions>(options =>
{
options.ClusterId = "dev";
options.ServiceId = "OrleansBasics";
})
.ConfigureApplicationParts(
parts => parts.AddApplicationPart(typeof(HelloGrain).Assembly).WithReferences())
.AddMemoryGrainStorage("OrleansMemoryProvider");
});
var host = builder.Build();
await host.StartAsync();
return host;
}
And the following client code:
static async Task<IClusterClient> ConnectClientAsync()
{
var client = new ClientBuilder()
.UseLocalhostClustering()
.Configure<ClusterOptions>(options =>
{
options.ClusterId = "dev";
options.ServiceId = "OrleansBasics";
})
//.ConfigureLogging(logging => logging.AddConsole())
.Build();
await client.Connect();
Console.WriteLine("Client successfully connected to silo host \n");
return client;
}
static async Task DoClientWorkAsync(IClusterClient client)
{
List<IUserWalletGrain> grains = new List<IUserWalletGrain>();
foreach (var _ in Enumerable.Range(1, 100))
{
var walletGrain = client.GetGrain<IUserWalletGrain>(Guid.NewGuid());
await walletGrain.Ping(); //make sure grain is loaded
grains.Add(walletGrain);
}
var sw = Stopwatch.StartNew();
await Parallel.ForEachAsync(Enumerable.Range(1, 100000), async (o, token) =>
{
var command = new Common.UserWallet.Messages.CreateOrderCommand(Guid.NewGuid(), 4, 5, new List<Guid> { Guid.NewGuid(), Guid.NewGuid() });
var response = await grains[o % 100].CreateOrder(command);
Console.WriteLine($"{o%10}:{o}");
});
Console.WriteLine($"\nElapsed:{sw.ElapsedMilliseconds}\n\n");
}
I'm able to send 100,000 messages in 30 seconds, which amounts to about 3333 msgs per second. This is way less than I would expect when looking at https://github.com/yevhen/Orleans.PingPong
It also does not seem to matter whether I start off with 10 grains, 100 grains, or 1000 grains.
When I then add persistence with table storage configured
.AddAzureTableGrainStorage(
name: "OrleansMemoryProvider",
configureOptions: options =>
{
options.UseJson = true;
options.ConfigureTableServiceClient(
"secret);
})
And with a single await WriteStateAsync(); in CreateOrder, things get drastically worse, at about 280 msgs/s.
When I go a bit further and implement some basic domain logic, calling other actors etc., we essentially grind to a snail's pace at 1.2 msgs/s.
What gives?
EDIT:
My CPU is at about 50%.
Building high-performance applications can be tricky and nuanced. The general solution in Orleans is to have many grains and many callers, so you can achieve a high degree of concurrency and thus throughput. In your case, you have many grains (100), but you have few callers (I believe it's one per core by default with Parallel.ForEachAsync), and each caller is writing to the console after every call, which will slow things down substantially.
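For illustration, dropping the Console.WriteLine from the hot loop and explicitly raising the loop's parallelism (the value of 100 below is only an example) lets many more calls be in flight at once:
var sw = Stopwatch.StartNew();
await Parallel.ForEachAsync(
    Enumerable.Range(1, 100000),
    new ParallelOptions { MaxDegreeOfParallelism = 100 }, // default is roughly one worker per core
    async (o, token) =>
    {
        var command = new Common.UserWallet.Messages.CreateOrderCommand(Guid.NewGuid(), 4, 5, new List<Guid> { Guid.NewGuid(), Guid.NewGuid() });
        await grains[o % 100].CreateOrder(command); // no Console.WriteLine in the hot loop
    });
Console.WriteLine($"Elapsed: {sw.ElapsedMilliseconds} ms");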
If I remove the Console.WriteLine and run your code on my machine using Orleans 7.0-rc2, the 100K calls to 100 grains finish in about 850ms. If I change the CreateOrderRequest & CreateOrderResponse types from classes to structs, the duration decreases to 750ms.
If I run a more optimized ping test (the one from the Orleans repository), I see approximately 550K requests per second on my machine with one client and one silo process sharing the same CPU. The numbers are approximately half this for Orleans 3.x. If I co-host the client within the silo process (i.e., pull IClusterClient from the silo's IServiceProvider), then I see over 5M requests per second.
Once you start doing non-trivial amounts of work in each of your grains, you're going to start running up against other limits. I tested calling a single grain from within the same process recently and found that one grain can handle 500K RPS if it is doing trivial work (ping-pong). If the grain has to write to storage on every request and each storage write takes 1ms then it will not be able to handle more than 1000 RPS, since each call waits for the previous call to finish by default. If you want to opt out of that behavior, you can do so by enabling reentrancy on your grain as described in the documentation here: https://learn.microsoft.com/en-us/dotnet/orleans/grains/reentrancy. The Chirper example has more details on how to implement reentrancy with storage updates: https://github.com/dotnet/orleans/tree/main/samples/Chirper.
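For reference, opting a grain into reentrancy is just an attribute on the grain class; whether interleaving is safe for your state is something you have to decide per grain:
using Orleans.Concurrency;

[Reentrant]
public class UserWalletGrain : Orleans.Grain, IUserWalletGrain
{
    // Calls to this grain may now interleave at await points (e.g. while a
    // storage write is in flight) instead of queuing strictly one at a time.
}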
When grain methods become more complex and grains need to perform significant amounts of I/O to serve each request (for example, storage updates and subsequent grain calls), the throughput of each individual grain will decrease since each request involves more work. Hopefully, the above numbers give you an approximate guide.

Implementation of HttpClient request limiter and buffer

In our project, we have a few services that make requests to a 3rd party API, using a key.
This API has a shared rate limit across all endpoints (meaning a request to one endpoint requires a 2-second cooldown before we can use a different endpoint).
We've handled this using timed background jobs, making requests to only one of the endpoints at any time.
After some architectural redesign, we've come to a spot where we don't rely as much on the timed background jobs, and now the HTTP requests can no longer be moderated centrally, since multiple service instances are making requests to the API.
So, in our current example:
We have a few HttpClients set up for all the needed API endpoints, e.g.:
services.AddHttpClient<Endpoint1Service>(client =>
{
client.BaseAddress = new Uri(configOptions.Services.Endpoint1.Url);
});
services.AddHttpClient<Endpoint2Service>(client =>
{
client.BaseAddress = new Uri(configOptions.Services.Endpoint2.Url);
});
Endpoint1Service and Endpoint2Service were previously accessed by background job services:
public async Task DoJob()
{
var items = await _repository.GetItems();
foreach (var item in items)
{
var processedResult = await _endpoint1Service.DoRequest(item);
await Task.Delay(2000);
//...
}
// save all results
}
But now these "endpoint" services are accessed concurrently, and a new instance is created every time, so there is no way to moderate the request rates.
One possible solution is to create some sort of singleton request buffer that is injected into all services that use this API and moderates these requests so they go out at a given rate. The problem I see with this is that it seems dangerous to store requests in an in-memory buffer, in case something goes wrong.
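To make the idea concrete, the singleton gate I have in mind would be roughly this (the class name, the 2-second figure, and the DI registration are only illustrative):
public sealed class ApiRequestGate
{
    private readonly SemaphoreSlim _lock = new SemaphoreSlim(1, 1);
    private DateTime _lastRequestUtc = DateTime.MinValue;

    public async Task<T> RunAsync<T>(Func<Task<T>> apiCall)
    {
        await _lock.WaitAsync();
        try
        {
            // Wait out whatever is left of the shared cooldown before the next call
            var wait = TimeSpan.FromSeconds(2) - (DateTime.UtcNow - _lastRequestUtc);
            if (wait > TimeSpan.Zero)
                await Task.Delay(wait);
            return await apiCall();
        }
        finally
        {
            _lastRequestUtc = DateTime.UtcNow;
            _lock.Release();
        }
    }
}
// Registered once and injected into Endpoint1Service, Endpoint2Service, etc.:
// services.AddSingleton<ApiRequestGate>();
Nothing is persisted; pending calls simply await their turn, so a crash would only lose requests that had not been sent yet.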
Is this a direction I should be looking towards, or is there anything else I can try?
Hope this helps:
I created the following for similar scenarios. Its objective is concurrency-throttled multithreading. However, it also gives you a convenient wrapper over your request-processing pipeline. Additionally, it provides a per-client limit on the maximum number of concurrent requests (if you want to use that).
Create one instance per endpoint service and set its number of threads to 1 if you want a throttle of 1, or to 4 if you want 4 concurrent requests to the given endpoint.
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Threading/APIProcessor/AsyncThreadedWorkItemProcessor.cs
or
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Threading/APIProcessor/ThreadedWorkItemProcessor.cs
The two implementations are interchangeable. If used in a web server context, the former is probably better, as it offloads to the background thread pool instead of using foreground threads.
Example Usage
In your case, probably set _maxWorkerThreads to 1 if you want to limit it to 1 concurrent request, or to 4 if you want to allow 4 concurrent requests.
//Example Usage for WebAPI controller
class Example
{
private static ThreadedWorkItemProcessor<DummyRequest, DummyResponse, int, WorkItemPriority> ThreadedProcessorExample = new ThreadedWorkItemProcessor<DummyRequest, DummyResponse, int, WorkItemPriority>(
_maxWorkItemLimitPerClient: 100 // Maximum number of concurrent requests in the processing queue per client. Set to int.MaxValue to disable concurrent request caps
, _maxWorkerThreads: 16 // Maximum number of threads to scale upto
, _threadStartupPerWorkItems: 4 // Consider starting a new processing thread ever X requests
, _threadStartupMinQueueSize: 4 // Do NOT start a new processing thread if work item queue is below this size
, _idleWorkerThreadExitSeconds: 10 // Idle threads will exit after X seconds
, _abandonedResponseExpirySeconds: 60 // Expire processed work items after X seconds (Maybe the client terminated or the web request thread died)
, _processRequestMethod: ProcessRequestMethod // Your Do Work method for processing the request
, _logErrorMethod: Handler_LogError
, _logMessageMethod: Handler_LogMessage
);
public async Task<DummyResponse> GetResponse([FromBody] DummyRequest _request)
{
int clientID = 1; //Replace with the client ID from your authentication mechanism if using per client request caps. Otherwise just hardcode to maybe 0 or whatever
WorkItemPriority _priority;
_priority = WorkItemPriority.Medium; //Assign the priority based on whatever prioritization rules.
int RequestID = ThreadedProcessorExample.ScheduleWorkItem(_priority, _request, clientID);
if (RequestID < 0)
{
//Client has exceeded maximum number of concurrent requests or Application Pool is shutting down
//return a suitable error message here
return new DummyResponse() { ErrorMessage = @"Maximum number of concurrent requests exceeded or service is restarting. Please retry request later." };
}
//If you need the result (Like in a webapi controller) then do this
//Otherwise if it is say a backend processing sink where there is no client waiting for a response then we are done here. just return.
KeyValuePair<bool, ThreadWorkItem<DummyRequest, DummyResponse, int>> workItemResult;
workItemResult = await ThreadedProcessorExample.TryGetProcessedWorkItemAsync(RequestID,
_timeoutMS: 1000, //Timeout of 1 second
_taskWaitType: ThreadProcessorAsyncTaskWaitType.Delay_Specific,
_delayMS: 10);
if (!workItemResult.Key)
{
//Processing timeout or Application Pool is shutting down
//return a suitable error message here
return new DummyResponse() { ErrorMessage = @"Internal system timeout or service is restarting. Please retry request later." };
}
return workItemResult.Value.Response;
}
public static DummyResponse ProcessRequestMethod(DummyRequest request)
{
// Process the request and return the response
return new DummyResponse() { orderID = request.orderID };
}
public static void Handler_LogError(Exception ex)
{
//Log unhandled exception here
}
public static void Handler_LogMessage(string Message)
{
//Log message here
}
}

Parallelize C# Graph API SDK methods

I'm connecting to the MS Graph API and fetching transitive group data via the following logic:
var queryOptions = new List<QueryOption>()
{
new QueryOption("$count", "true")
};
var lstTemp = graphClient.Groups[$"{groupID}"].TransitiveMembers
.Request(queryOptions)
.Header("ConsistencyLevel", "eventual")
.Select("id,mail,onPremisesSecurityIdentifier").Top(999)
.GetAsync().GetAwaiter().GetResult();
var lstGroups = lstTemp.CurrentPage.Where(x => x.ODataType.Contains("group")).ToList();
while (lstTemp.NextPageRequest != null)
{
lstTemp = lstTemp.NextPageRequest.GetAsync().GetAwaiter().GetResult();
lstGroups.AddRange(lstTemp.CurrentPage.Where(x => x.ODataType.Contains("group")).ToList());
}
Although the above logic works fine, for larger data sets where the result count could be around 10K records or more, I've noticed the time required to fetch all of the results is around 10-12 seconds.
I'm looking for a solution where the API calls can be parallelized (or multi-threaded/multi-tasked) in such a way that the overall time to get the complete results is further reduced.
In C# we have Parallel.For etc.; can I use it in this scenario to replace the regular while loop mentioned above?
Any suggestions?
Not really with the Parallel.For API, but you can execute a bunch of asynchronous tasks concurrently by throwing them into a List<Task<T>> and awaiting the whole list with Task.WhenAll. Your code may look something like this:
var queryOptions = new List<QueryOption>()
{
new QueryOption("$count", "true")
};
// Creating the first request
var firstRequest = graphClient.Groups[$"{groupID}"].TransitiveMembers
.Request(queryOptions)
.Header("ConsistencyLevel", "eventual")
.Select("id,mail,onPremisesSecurityIdentifier").Top(999)
.GetAsync();
// Creating a list of all requests (starting with the first one)
var requests = new List<Task<IGroupTransitiveMembersCollectionWithReferencesPage>>() { firstRequest };
// Awaiting the first response
var firstResponse = await firstRequest;
// Getting the total count from the request
var count = (int) firstResponse.AdditionalData["@odata.count"];
// Setting offset to the amount of data you already pulled
var offset = 999;
while (offset < count)
{
// Creating the next request
var nextRequest = graphClient.Groups[$"{groupID}"].TransitiveMembers
.Request() // Notice no $count=true (may potentially hurt performance and we don't need it anymore anyways)
.Header("ConsistencyLevel", "eventual")
.Select("id,mail,onPremisesSecurityIdentifier")
.Skip(offset).Top(999) // Skipping the data you already pulled
.GetAsync();
// Adding it to the list
requests.Add(nextRequest);
// Increasing the offset
offset += 999;
}
// Waiting for all the requests to finish
var allResponses = await Task.WhenAll(requests);
// This flattens the list while filtering as you did
var allGroups = allResponses
.Select(response => response.CurrentPage)
.SelectMany(page => page.Where(m => m.ODataType.Contains("group")))
.ToList();
Couldn't check if this code works without a Graph tenant, so you might need to modify a bit, but I hope you can see the general idea.
Also, I allowed myself to refactor the code to use proper async/await, since that's good and standard practice, but it should work with .GetAwaiter().GetResult() if you can't use await in your context for some reason (please consider switching, though).

.Net5 HttpClient concurrency - performance

I created an HttpClient using IHttpClientFactory and sent 1000 GET calls in parallel to a WebApi, and observed a delay of about 3-5 minutes for each request. Once this completed, I again sent 1000 GET requests in parallel, and this time there was no delay.
Then I increased the parallel requests to 2000. For the first batch, each request's delay was about 9-11 minutes, and for the second batch of 2000 parallel requests each request's delay was ~5 minutes (whereas with 1000 requests the second batch had no delay).
var client = _clientFactory.CreateClient();
client.BaseAddress = new Uri("http://localhost:5000");
client.Timeout = TimeSpan.FromMinutes(20);
List<Task> _task = new List<Task>();
for (int i = 1; i <= 4000; i++)
{
_task.Add(ExecuteRequest(client, i));
if (i % 2000 == 0)
{
await Task.WhenAll(_task);
_task.Clear();
}
}
private async Task ExecuteRequest(HttpClient client, int requestId)
{
var result = await client.GetAsync($"Performance/{requestId}");
var response = await result.Content.ReadAsStringAsync();
var data = JsonConvert.DeserializeObject<Response>(response);
}
Trying to understand:
How many parallel requests does HttpClient support without delay?
How to improve the performance of HttpClient for 2000 or more parallel requests?
How many parallel requests does HttpClient support without delay?
On modern .NET Core platforms, you're limited only by available memory. There's no built-in throttling that's on by default.
How to improve the performance of HttpClient for 2000 or more parallel requests?
It sounds like you're being throttled by your server. If you want to test a more scalable server, try running this in your server's startup:
var desiredThreads = 2000;
ThreadPool.GetMaxThreads(out _, out var maxIoThreads);
ThreadPool.SetMaxThreads(desiredThreads, maxIoThreads);
ThreadPool.GetMinThreads(out _, out var minIoThreads);
ThreadPool.SetMinThreads(desiredThreads, minIoThreads);
What you're doing is causing worst-case performance for a "cold" HttpClient (one that was just newed up, or whose connection pool is empty).
When you make a new request, it looks for an open connection in the connection pool. When it doesn't find one, it tries to open up a new connection. By throwing a sudden burst at a cold client, most calls to SendAsync will end up trying to open a new connection.
This is a problem because a request that needs a new connection will require multiple round-trips to the server, whereas a request on an existing connection will only require a single round-trip. It gets even worse if you use HTTPS. You're heavily dependent on your network latency in this case.
If you are just benchmarking, then you'll want to benchmark steady-state performance, not warmup performance. Benchmark.NET should more or less do this for you.
When you have requests that complete reasonably quickly, it can be a lot faster to instead limit your initial concurrency to a smaller percentage of your total requests, and slowly ramp up your connection pool size from there. This allows subsequent requests to re-use connections. What you might try is something like below, which will only allow (rough behavior, not a guarantee) 10 new connections to be opened at once:
var sem = new SemaphoreSlim(10);
var client = new HttpClient();
async Task<HttpResponseMessage> MakeRequestAsync(HttpRequestMessage req)
{
Task t = sem.WaitAsync();
bool openNew = t.IsCompleted;
await t;
try
{
return await client.SendAsync(req);
}
finally
{
sem.Release(openNew ? 2 : 1);
}
}
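Driving the original batch through that wrapper would look roughly like this (mirroring the endpoint from the question):
var tasks = new List<Task<HttpResponseMessage>>();
for (int i = 1; i <= 2000; i++)
{
    var req = new HttpRequestMessage(HttpMethod.Get, $"http://localhost:5000/Performance/{i}");
    tasks.Add(MakeRequestAsync(req)); // concurrency ramps up as connections get established
}
var responses = await Task.WhenAll(tasks);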

HttpClient.SendAsync takes a lot longer to receive response from the api than expected

I have a simple program that uses System.Net.Http.HttpClient which works like this:
// Get the existing HttpClient, or create a new one if there isn't an instance already
mClient = GetHttpClient();
var startTime = DateTime.Now;
var res = await mClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
var executeMethodTime = DateTime.Now.Subtract(startTime).TotalMilliseconds;
mLog.Info("restPostExecuteMethodTime : {0}", executeMethodTime);
It makes a simple request to the API and waits for the response. The content of the request is very small, just some simple objects, nothing too big.
The API side looks like this:
public async Task<IHttpActionResult> DoWork([FromBody]TemplateInput templateInput)
{
IHttpActionResult res;
var startTime = DateTime.Now;
// do logic
var auditTime = DateTime.Now.Subtract(startTime).TotalMilliseconds;
mLog.Info($"auditTime {auditTime}");
return res;
}
Here is the logging generated by that code :
2020-07-15 09:12:19.7472|auditTime 204.9607|
2020-07-15 09:12:24.8172|executeMethodTime : 5321.9861|
What surprised me is that the API took over 5 seconds to return the result, when in fact all the logic in the API took only about 200 milliseconds to execute. It took a whole 5 seconds to return the result to the requester.
I checked the network and there were no problems.
It made no sense to me. Is there anything in HttpClient that could potentially affect the response speed, to the point that it would cost a request 5 seconds?
I first thought it was caused by the proxy settings inside the HttpClient, so I tried disabling the proxy when creating the HttpClient, but it turned out to make no difference to the speed.
If anyone has any idea - please let me know.
