I created an HttpClient using IHttpClientFactory and sent 1000 GET requests in parallel to a Web API. Each request was delayed by about 3-5 minutes. Once these completed, I sent another 1000 GET requests in parallel, and this time there was no delay.
I then increased the batch to 2000 parallel requests. In the first batch, each request was delayed by about 9-11 minutes. In the second batch of 2000, each request was still delayed by about 5 minutes (whereas with 1000 requests the second batch had no delay).
var client = _clientFactory.CreateClient();
client.BaseAddress = new Uri("http://localhost:5000");
client.Timeout = TimeSpan.FromMinutes(20);
List<Task> _task = new List<Task>();
for (int i = 1; i <= 4000; i++)
{
_task.Add(ExecuteRequest(client, i));
if (i % 2000 == 0)
{
await Task.WhenAll(_task);
_task.Clear();
}
}
private async Task ExecuteRequest(HttpClient client, int requestId)
{
var result = await client.GetAsync($"Performance/{requestId}");
var response = await result.Content.ReadAsStringAsync();
var data = JsonConvert.DeserializeObject<Response>(response);
}
I'm trying to understand:
How many parallel requests does HttpClient support without delay?
How can I improve the performance of HttpClient for 2000 or more parallel requests?
How many parallel requests does HttpClient support without delay?
On modern .NET Core platforms, you're limited only by available memory. There's no built-in throttling that's on by default.
How can I improve the performance of HttpClient for 2000 or more parallel requests?
It sounds like you're being throttled by your server. If you want to test a more scalable server, try running this in your server's startup:
var desiredThreads = 2000;
ThreadPool.GetMaxThreads(out _, out var maxIoThreads);
ThreadPool.SetMaxThreads(desiredThreads, maxIoThreads);
ThreadPool.GetMinThreads(out _, out var minIoThreads);
ThreadPool.SetMinThreads(desiredThreads, minIoThreads);
What you're doing is causing worst-case perf for a "cold" (just newed up or empty connection pool) HttpClient.
When you make a new request, it looks for an open connection in the connection pool. When it doesn't find one, it tries to open up a new connection. By throwing a sudden burst at a cold client, most calls to SendAsync will end up trying to open a new connection.
This is a problem because a request that needs a new connection will require multiple round-trips to the server, whereas a request on an existing connection will only require a single round-trip. It gets even worse if you use HTTPS. You're heavily dependent on your network latency in this case.
If you are just benchmarking, then you'll want to benchmark steady-state performance, not warmup performance. BenchmarkDotNet should more or less do this for you.
When you have requests that complete reasonably quickly, it can be a lot faster to limit your initial concurrency to a small percentage of your total requests and slowly ramp up your connection pool size from there. This allows subsequent requests to re-use connections. What you might try is something like the code below, which will only allow (rough behavior, not a guarantee) 10 new connections to be opened at once:
var sem = new SemaphoreSlim(10);
var client = new HttpClient();
async Task<HttpResponseMessage> MakeRequestAsync(HttpRequestMessage req)
{
Task t = sem.WaitAsync();
bool openNew = t.IsCompleted;
await t;
try
{
return await client.SendAsync(req);
}
finally
{
sem.Release(openNew ? 2 : 1);
}
}
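If the ramp-up logic above is more than you need, a fixed concurrency cap is a simpler variation on the same idea (it is not the ramp-up technique itself). Below is a minimal sketch. BoundedRunner and RunAsync are hypothetical names, not part of HttpClient or of the code above, and the right cap for your server is something you would have to measure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not a framework type): runs all work items,
// but never more than maxConcurrency of them at the same time.
public static class BoundedRunner
{
    public static async Task<TResult[]> RunAsync<TResult>(
        IEnumerable<Func<Task<TResult>>> work, int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            // ToList() starts every wrapper now; each one waits at the gate
            // before it is allowed to start its actual work item.
            var tasks = work.Select(async item =>
            {
                await gate.WaitAsync();
                try
                {
                    return await item();
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            // Results come back in the same order as the input sequence.
            return await Task.WhenAll(tasks);
        }
    }
}
```

With the question's code, each work item would be something like () => client.GetAsync($"Performance/{id}"), and a cap well below 2000 lets later requests re-use connections opened by earlier ones.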
I'm evaluating Orleans for a new project we are starting soon.
Eventually we want to run a bunch of persistent actors, but I'm currently struggling just to get a baseline in-memory version of Orleans to be performant.
Given the following grain
using Common.UserWallet;
using Common.UserWallet.Messages;
using Microsoft.Extensions.Logging;
namespace Grains;
public class UserWalletGrain : Orleans.Grain, IUserWalletGrain
{
private readonly ILogger _logger;
public UserWalletGrain(ILogger<UserWalletGrain> logger)
{
_logger = logger;
}
public async Task<CreateOrderResponse> CreateOrder(CreateOrderCommand command)
{
return new CreateOrderResponse(Guid.NewGuid());
}
public Task Ping()
{
return Task.CompletedTask;
}
}
The following silo config:
static async Task<IHost> StartSiloAsync()
{
ServicePointManager.UseNagleAlgorithm = false;
var builder = new HostBuilder()
.UseOrleans(c =>
{
c.UseLocalhostClustering()
.Configure<ClusterOptions>(options =>
{
options.ClusterId = "dev";
options.ServiceId = "OrleansBasics";
})
.ConfigureApplicationParts(
parts => parts.AddApplicationPart(typeof(HelloGrain).Assembly).WithReferences())
.AddMemoryGrainStorage("OrleansMemoryProvider");
});
var host = builder.Build();
await host.StartAsync();
return host;
}
And the following client code:
static async Task<IClusterClient> ConnectClientAsync()
{
var client = new ClientBuilder()
.UseLocalhostClustering()
.Configure<ClusterOptions>(options =>
{
options.ClusterId = "dev";
options.ServiceId = "OrleansBasics";
})
//.ConfigureLogging(logging => logging.AddConsole())
.Build();
await client.Connect();
Console.WriteLine("Client successfully connected to silo host \n");
return client;
}
static async Task DoClientWorkAsync(IClusterClient client)
{
List<IUserWalletGrain> grains = new List<IUserWalletGrain>();
foreach (var _ in Enumerable.Range(1, 100))
{
var walletGrain = client.GetGrain<IUserWalletGrain>(Guid.NewGuid());
await walletGrain.Ping(); //make sure grain is loaded
grains.Add(walletGrain);
}
var sw = Stopwatch.StartNew();
await Parallel.ForEachAsync(Enumerable.Range(1, 100000), async (o, token) =>
{
var command = new Common.UserWallet.Messages.CreateOrderCommand(Guid.NewGuid(), 4, 5, new List<Guid> { Guid.NewGuid(), Guid.NewGuid() });
var response = await grains[o % 100].CreateOrder(command);
Console.WriteLine($"{o%10}:{o}");
});
Console.WriteLine($"\nElapsed:{sw.ElapsedMilliseconds}\n\n");
}
I'm able to send 100,000 messages in 30 seconds, which amounts to about 3333 msgs per second. This is way less than I would expect when looking at https://github.com/yevhen/Orleans.PingPong.
It also does not seem to matter whether I start off with 10 grains, 100 grains, or 1000 grains.
When I then add persistence with table storage configured
.AddAzureTableGrainStorage(
name: "OrleansMemoryProvider",
configureOptions: options =>
{
options.UseJson = true;
options.ConfigureTableServiceClient(
"secret");
})
and add a single
await WriteStateAsync(); call in CreateOrder, things get drastically worse, at about 280 msgs/s.
When I go a bit further and implement some basic domain logic (calling other actors, etc.), we essentially grind to a snail's pace at 1.2 msgs/s.
What gives?
EDIT:
My cpu is at about 50%.
Building high performance applications can be tricky and nuanced. The general solution in Orleans is that you have many grains and many callers, so you can achieve a high degree of concurrency and thus throughput. In your case, you have many grains (100), but you have few callers (I believe it's one per core by default with Parallel.ForEachAsync), and each caller is writing to the console after every call, which will slow things down substantially.
If I remove the Console.WriteLine and run your code on my machine using Orleans 7.0-rc2, the 100K calls to 100 grains finish in about 850ms. If I change the CreateOrderCommand and CreateOrderResponse types from classes to structs, the duration decreases to 750ms.
If I run a more optimized ping test (the one from the Orleans repository), I see approximately 550K requests per second on my machine with one client and one silo process sharing the same CPU. The numbers are approximately half this for Orleans 3.x. If I co-host the client within the silo process (i.e, pull IClusterClient from the silo's IServiceProvider) then I see over 5M requests per second.
Once you start doing non-trivial amounts of work in each of your grains, you're going to start running up against other limits. I tested calling a single grain from within the same process recently and found that one grain can handle 500K RPS if it is doing trivial work (ping-pong). If the grain has to write to storage on every request and each storage write takes 1ms then it will not be able to handle more than 1000 RPS, since each call waits for the previous call to finish by default. If you want to opt out of that behavior, you can do so by enabling reentrancy on your grain as described in the documentation here: https://learn.microsoft.com/en-us/dotnet/orleans/grains/reentrancy. The Chirper example has more details on how to implement reentrancy with storage updates: https://github.com/dotnet/orleans/tree/main/samples/Chirper.
When grain methods become more complex and grains need to perform significant amounts of I/O to serve each request (for example, storage updates and subsequent grain calls), the throughput of each individual grain will decrease since each request involves more work. Hopefully, the above numbers give you an approximate guide.
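The 1ms-per-write ceiling mentioned above is plain arithmetic: a grain that handles one call at a time cannot complete more calls per second than fit back to back into one second. A tiny sketch of that calculation follows. GrainThroughput and MaxSerialRps are hypothetical names used only for illustration, not Orleans APIs.

```csharp
using System;

// Hypothetical helper, not part of Orleans: upper bound on requests per second
// for a non-reentrant grain, where each call waits for the previous one to finish.
public static class GrainThroughput
{
    public static double MaxSerialRps(double perCallMilliseconds)
    {
        if (perCallMilliseconds <= 0)
            throw new ArgumentOutOfRangeException(nameof(perCallMilliseconds));

        // 1000 ms in a second, divided by the time one call occupies the grain.
        return 1000.0 / perCallMilliseconds;
    }
}
```

For example, MaxSerialRps(1.0) gives 1000, matching the storage-write case above. The bound is per grain, so spreading the load over more grains or enabling reentrancy is what raises total throughput.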
In our project, we have a few services that make requests to a 3rd party API, using a key.
This API has a shared rate limit between all endpoints (meaning request to one endpoint will require 2 seconds cooldown before we can use a different endpoint).
We've handled this using timed background jobs, only making requests to only one of the endpoints at any time.
After some architectural redesign, we no longer rely as much on the timed background jobs, and now the HTTP requests can no longer be moderated, since multiple service instances are making requests to the API.
So, in our current example:
We have a few HttpClients set up to all needed API endpoints, i.e.:
services.AddHttpClient<Endpoint1Service>(client =>
{
client.BaseAddress = new Uri(configOptions.Services.Endpoint1.Url);
});
services.AddHttpClient<Endpoint2Service>(client =>
{
client.BaseAddress = new Uri(configOptions.Services.Endpoint2.Url);
});
Endpoint1Service and Endpoint2Service were before accessed by background job services:
public async Task DoJob()
{
var items = await _repository.GetItems();
foreach (var item in items)
{
var processedResult = await _endpoint1Service.DoRequest(item);
await Task.Delay(2000);
//...
}
// save all results
}
But now these "endpoint" services are accessed concurrently, and a new instance is created every time, so there is no way to moderate the request rate.
One possible solution is to create some sort of singleton request buffer that is injected into all services that use this API and moderates these requests so that they go out at a given rate. The problem I see with this is that it seems dangerous to store requests in an in-memory buffer, in case something goes wrong.
Is this a direction I should be looking towards, or is there anything else I can try?
Hope this helps:
I created the following for similar scenarios. Its objective is concurrency-throttled multithreading, but it also gives you a convenient wrapper over your request processing pipeline. Additionally, it provides a limit on the maximum number of concurrent requests per client (if you want to use that).
Create one instance per endpoint service and set its number of threads to 1 if you want a throttle of 1, or to 4 if you want 4 concurrent requests to the given endpoint.
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Threading/APIProcessor/AsyncThreadedWorkItemProcessor.cs
or
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Threading/APIProcessor/ThreadedWorkItemProcessor.cs
The two implementations are interchangeable. If you are using it in a web server context, the former is probably better, as it offloads to the background thread pool instead of using foreground threads.
Example Usage
In your case, you would probably set _maxWorkerThreads to 1 to limit it to 1 concurrent request, or to 4 to limit it to 4 concurrent requests.
//Example Usage for WebAPI controller
class Example
{
private static ThreadedWorkItemProcessor<DummyRequest, DummyResponse, int, WorkItemPriority> ThreadedProcessorExample = new ThreadedWorkItemProcessor<DummyRequest, DummyResponse, int, WorkItemPriority>(
_maxWorkItemLimitPerClient: 100 // Maximum number of concurrent requests in the processing queue per client. Set to int.MaxValue to disable concurrent request caps
, _maxWorkerThreads: 16 // Maximum number of threads to scale up to
, _threadStartupPerWorkItems: 4 // Consider starting a new processing thread every X requests
, _threadStartupMinQueueSize: 4 // Do NOT start a new processing thread if work item queue is below this size
, _idleWorkerThreadExitSeconds: 10 // Idle threads will exit after X seconds
, _abandonedResponseExpirySeconds: 60 // Expire processed work items after X seconds (Maybe the client terminated or the web request thread died)
, _processRequestMethod: ProcessRequestMethod // Your Do Work method for processing the request
, _logErrorMethod: Handler_LogError
, _logMessageMethod: Handler_LogMessage
);
public async Task<DummyResponse> GetResponse([FromBody] DummyRequest _request)
{
int clientID = 1; //Replace with the client ID from your authentication mechanism if using per client request caps. Otherwise just hardcode to maybe 0 or whatever
WorkItemPriority _priority;
_priority = WorkItemPriority.Medium; //Assign the priority based on whatever prioritization rules.
int RequestID = ThreadedProcessorExample.ScheduleWorkItem(_priority, _request, clientID);
if (RequestID < 0)
{
//Client has exceeded maximum number of concurrent requests or Application Pool is shutting down
//return a suitable error message here
return new DummyResponse() { ErrorMessage = "Maximum number of concurrent requests exceeded or service is restarting. Please retry request later." };
}
//If you need the result (Like in a webapi controller) then do this
//Otherwise if it is say a backend processing sink where there is no client waiting for a response then we are done here. just return.
KeyValuePair<bool, ThreadWorkItem<DummyRequest, DummyResponse, int>> workItemResult;
workItemResult = await ThreadedProcessorExample.TryGetProcessedWorkItemAsync(RequestID,
_timeoutMS: 1000, //Timeout of 1 second
_taskWaitType: ThreadProcessorAsyncTaskWaitType.Delay_Specific,
_delayMS: 10);
if (!workItemResult.Key)
{
//Processing timeout or Application Pool is shutting down
//return a suitable error message here
return new DummyResponse() { ErrorMessage = "Internal system timeout or service is restarting. Please retry request later." };
}
return workItemResult.Value.Response;
}
public static DummyResponse ProcessRequestMethod(DummyRequest request)
{
// Process the request and return the response
return new DummyResponse() { orderID = request.orderID };
}
public static void Handler_LogError(Exception ex)
{
//Log unhandled exception here
}
public static void Handler_LogMessage(string Message)
{
//Log message here
}
}
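For the case in the question, where a cooldown between requests matters more than a concurrency cap, a lighter alternative might be a singleton gate that hands out time slots instead of buffering requests. This is only a sketch under assumptions: SharedRequestGate, Reserve and WaitForTurnAsync are hypothetical names, the spacing (for example 2 seconds) comes from the question, and it only coordinates callers inside one process. Separate service processes would need a shared store, which is not shown.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical singleton gate: spaces out calls by a minimum interval.
// It stores only the next free time slot, never the requests themselves.
public sealed class SharedRequestGate
{
    private readonly TimeSpan _minSpacing;
    private readonly object _lock = new object();
    private DateTime _nextSlotUtc = DateTime.MinValue;

    public SharedRequestGate(TimeSpan minSpacing)
    {
        _minSpacing = minSpacing;
    }

    // Reserves the next free slot and returns how long the caller must wait for it.
    public TimeSpan Reserve(DateTime nowUtc)
    {
        lock (_lock)
        {
            DateTime slot = nowUtc > _nextSlotUtc ? nowUtc : _nextSlotUtc;
            _nextSlotUtc = slot + _minSpacing;
            return slot - nowUtc;
        }
    }

    // Call this right before each request to the rate-limited API.
    public async Task WaitForTurnAsync(CancellationToken ct = default(CancellationToken))
    {
        TimeSpan wait = Reserve(DateTime.UtcNow);
        if (wait > TimeSpan.Zero)
        {
            await Task.Delay(wait, ct);
        }
    }
}
```

Registered once (for example with services.AddSingleton(new SharedRequestGate(TimeSpan.FromSeconds(2)))) and injected into both endpoint services, each DoRequest would await WaitForTurnAsync() before sending. If the process dies, only waiting callers are lost, not a queue of stored requests.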
I'm designing a .NET client application for an external API. It's going to have two main responsibilities:
Synchronization - making a batch of requests to API and saving responses to my database periodically.
Client - a pass-through for requests to API from users of my client.
Service's documentation specifies following rules on maximum number of requests that can be issued in given period of time:
During a day:
Maximum of 6000 requests per hour (~1.67 per second)
Maximum of 120 requests per minute (2 per second)
Maximum of 3 requests per second
At night:
Maximum of 8000 requests per hour (~2.22 per second)
Maximum of 150 requests per minute (2.5 per second)
Maximum of 3 requests per second
Exceeding these limits won't result in immediate lockdown - no exception will be thrown. But provider can get annoyed, contact us and then ban us from using his service. So I need to have some request delaying mechanism in place to prevent that. Here's how I see it:
public async Task MyMethod(Request request)
{
await _rateLimiter.WaitForNextRequest(); // awaitable Task with calculated Delay
await _api.DoAsync(request);
_rateLimiter.AppendRequestCounters();
}
The safest and simplest option would be to respect only the lowest rate limit, that is, a maximum of 3 requests per 2 seconds. But because of the "Synchronization" responsibility, I need to use as much of these limits as possible.
So the next option would be to add a delay based on the current request count. I've tried to do something on my own and I have also used RateLimiter by David Desmaisons, and it would've been fine, but here's the problem:
Assuming there will be 3 requests per second sent by my client to the API at day, we're going to see:
A 20 second delay every 120th request
A ~15 minute delay every 6000th request
This would've been acceptable if my application was only about "Synchronization", but "Client" requests can't wait that long.
I've searched the Web, and I've read about token/leaky bucket and sliding window algorithms, but I couldn't translate them to my case and .NET, since they mainly cover the rejecting of requests that exceed a limit. I've found this repo and that repo, but they are both only service-side solutions.
QoS-like splitting of rates, so that "Synchronization" would have the slower rate and "Client" the faster one, is not an option.
Assuming that current request rates will be measured, how to calculate the delay for next request so that it could be adaptive to current situation, respect all maximum rates and wouldn't be longer than 5 seconds? Something like gradually slowing down when approaching a limit.
This is achievable by using the Library you linked on GitHub. We need to use a composed TimeLimiter made out of 3 CountByIntervalAwaitableConstraint like so:
var hourConstraint = new CountByIntervalAwaitableConstraint(6000, TimeSpan.FromHours(1));
var minuteConstraint = new CountByIntervalAwaitableConstraint(120, TimeSpan.FromMinutes(1));
var secondConstraint = new CountByIntervalAwaitableConstraint(3, TimeSpan.FromSeconds(1));
var timeLimiter = TimeLimiter.Compose(hourConstraint, minuteConstraint, secondConstraint);
We can test to see if this works by doing this:
for (int i = 0; i < 1000; i++)
{
await timeLimiter;
Console.WriteLine($"Iteration {i} at {DateTime.Now:T}");
}
This will run 3 times every second until we reach 120 iterations (iteration 119), then wait until the minute is over, and then continue running 3 times every second. We can also (again using the library) easily use the TimeLimiter with an HttpClient by using the AsDelegatingHandler() extension method, like so:
var handler = TimeLimiter.Compose(hourConstraint, minuteConstraint, secondConstraint).AsDelegatingHandler();
var client = new HttpClient(handler);
We can also use CancellationTokens, but as far as I can tell not at the same time as using it as the handler for the HttpClient. Here is how you can use it with an HttpClient anyway:
var timeLimiter = TimeLimiter.Compose(hourConstraint, minuteConstraint, secondConstraint);
var client = new HttpClient(); // one shared client, re-used by every request
for (int i = 0; i < 100; i++)
{
await timeLimiter.Enqueue(async () =>
{
var response = await client.GetAsync("https://hacker-news.firebaseio.com/v0/item/8863.json?print=pretty");
if (response.IsSuccessStatusCode)
Console.WriteLine(await response.Content.ReadAsStringAsync());
else
Console.WriteLine($"Error code {response.StatusCode} reason: {response.ReasonPhrase}");
}, new CancellationTokenSource(TimeSpan.FromSeconds(10)).Token);
}
Edit to address OPs question more:
If you want to make sure a user can send a request without having to wait for the limit to reset, we need to dedicate a certain number of requests every second/minute/hour to the user. So we need a new TimeLimiter for this, and we also need to adjust our API TimeLimiter. Here are the two new ones:
var apiHourConstraint = new CountByIntervalAwaitableConstraint(5500, TimeSpan.FromHours(1));
var apiMinuteConstraint = new CountByIntervalAwaitableConstraint(100, TimeSpan.FromMinutes(1));
var apiSecondConstraint = new CountByIntervalAwaitableConstraint(2, TimeSpan.FromSeconds(1));
// TimeLimiter for calls automatically to the API
var apiTimeLimiter = TimeLimiter.Compose(apiHourConstraint, apiMinuteConstraint, apiSecondConstraint);
var userHourConstraint = new CountByIntervalAwaitableConstraint(500, TimeSpan.FromHours(1));
var userMinuteConstraint = new CountByIntervalAwaitableConstraint(20, TimeSpan.FromMinutes(1));
var userSecondConstraint = new CountByIntervalAwaitableConstraint(1, TimeSpan.FromSeconds(1));
// TimeLimiter for calls made manually by a user to the API
var userTimeLimiter = TimeLimiter.Compose(userHourConstraint, userMinuteConstraint, userSecondConstraint);
You can play around with the numbers to suit your need.
Now to use it:
I saw you're using a central Method to execute your Requests, this makes it easier. I'll just add an optional boolean parameter that determines if it's an automatically executed request or one made from a user. (You could replace this parameter with an Enum if you want more than just automatic and manual requests)
public static async Task DoRequest(Request request, bool manual = false)
{
TimeLimiter limiter;
if (manual)
limiter = TimeLimiterManager.UserLimiter;
else
limiter = TimeLimiterManager.ApiLimiter;
await limiter;
await _api.DoAsync(request); // await so that failures surface to the caller
}
static class TimeLimiterManager
{
public static TimeLimiter ApiLimiter { get; }
public static TimeLimiter UserLimiter { get; }
static TimeLimiterManager()
{
var apiHourConstraint = new CountByIntervalAwaitableConstraint(5500, TimeSpan.FromHours(1));
var apiMinuteConstraint = new CountByIntervalAwaitableConstraint(100, TimeSpan.FromMinutes(1));
var apiSecondConstraint = new CountByIntervalAwaitableConstraint(2, TimeSpan.FromSeconds(1));
// TimeLimiter to control access to the API for automatically executed requests
ApiLimiter = TimeLimiter.Compose(apiHourConstraint, apiMinuteConstraint, apiSecondConstraint);
var userHourConstraint = new CountByIntervalAwaitableConstraint(500, TimeSpan.FromHours(1));
var userMinuteConstraint = new CountByIntervalAwaitableConstraint(20, TimeSpan.FromMinutes(1));
var userSecondConstraint = new CountByIntervalAwaitableConstraint(1, TimeSpan.FromSeconds(1));
// TimeLimiter to control access to the API for manually executed requests
UserLimiter = TimeLimiter.Compose(userHourConstraint, userMinuteConstraint, userSecondConstraint);
}
}
This isn't perfect: when the user doesn't use all 20 of their API calls in a minute but your automated system needs more than 100 in that minute, the automated system still has to wait.
And regarding day/ night differences: You can use 2 backing fields for the Api/UserLimiter and return the appropriate ones in the { get {...} } of the property
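The day/night switch could look roughly like the sketch below. It only shows the selection logic, using the limits quoted in the question. RateLimits and LimitSchedule are hypothetical names, and the 06:00 and 22:00 boundaries are pure assumptions, since the question does not say when "day" and "night" start. In the real code the property getter would return a pre-built TimeLimiter instead of a RateLimits object.

```csharp
using System;

// Hypothetical value holder for one set of limits.
public sealed class RateLimits
{
    public RateLimits(int perHour, int perMinute, int perSecond)
    {
        PerHour = perHour;
        PerMinute = perMinute;
        PerSecond = perSecond;
    }

    public int PerHour { get; }
    public int PerMinute { get; }
    public int PerSecond { get; }
}

public static class LimitSchedule
{
    // Limits as quoted in the question.
    public static readonly RateLimits Day = new RateLimits(6000, 120, 3);
    public static readonly RateLimits Night = new RateLimits(8000, 150, 3);

    // ASSUMPTION: the provider's "day" runs from 06:00 to 22:00 local time.
    public static readonly TimeSpan DayStart = TimeSpan.FromHours(6);
    public static readonly TimeSpan NightStart = TimeSpan.FromHours(22);

    // Picks the set of limits that applies at the given local time.
    public static RateLimits For(DateTime localTime)
    {
        TimeSpan t = localTime.TimeOfDay;
        return (t >= DayStart && t < NightStart) ? Day : Night;
    }
}
```

One thing to watch: the limiter you switch to starts with empty counters, so right after the boundary it may briefly allow more than the hourly window should.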
I'm building a SOCKS proxy checker using .NET 4.5 and everything works fine, except when one of the SOCKS proxies is really slow and takes over 100 seconds to respond. I'd like to time out those proxies at a few stages (ConnectAsync, ReadToEndAsync), especially at ReadToEndAsync, because it hangs if the proxy is slow.
I've tried everything I was able to find about this: cancellation tokens, Task.Wait, NetworkStream.ReadTimeout (which doesn't work, strangely).
And if I use Task.Wait then I can't use the await keyword, which makes the code synchronous instead of async and defeats the whole idea of my tool.
var socksClient = new Socks5ProxyClient(IP,Port);
var googleAddress = await Dns.GetHostAddressesAsync("google.com");
var speedStopwatch = Stopwatch.StartNew();
using(var socksTcpClient = await socksClient.CreateConnection(googleAddress[0].ToString(),80))
{
if(socksTcpClient.Connected)
{
using(var socksTcpStream = socksTcpClient.GetStream())
{
socksTcpStream.ReadTimeout = 5000;
socksTcpStream.WriteTimeout = 5000; //these don't work..
using (var writer = new StreamWriter(socksTcpStream))
{
await writer.WriteAsync("GET / HTTP/1.1\r\nHost: google.com\r\n\r\n");
await writer.FlushAsync();
using (var reader = new StreamReader(socksTcpStream))
{
var result = await reader.ReadToEndAsync(); // up to 250 seconds hang on thread that is checking current proxy..
reader.Close();
writer.Close();
socksTcpStream.Close();
}
}
}
}
}
Shamefully, async socket IO does not support timeouts. You need to build that yourself. Here is the best approach I know:
Make your entire function not care about timeouts. Disable all of them. Then, start a delay task and when it completes dispose of the socket. This kills all IO that is in flight and effects immediate cancellation.
So you could do:
Task.Delay(TimeSpan.FromSeconds(100)).ContinueWith(_ => socksTcpClient.Dispose());
This leads to an ugly ObjectDisposedException. This is unavoidable.
Probably, you need to cancel the delay in case of success. Otherwise you keep a ton of delay tasks for 100 seconds and they might amount to millions depending on load.
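Cancelling the delay on success could be sketched as below. This is one possible shape, not the only one: TimeoutGuard and RunAsync are hypothetical names, and the resource is any IDisposable (in the question it would be the TcpClient returned by CreateConnection).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: disposes the resource if the operation overruns the
// timeout, and cancels the timer as soon as the operation finishes.
public static class TimeoutGuard
{
    public static async Task<T> RunAsync<T>(
        IDisposable resource, TimeSpan timeout, Func<Task<T>> operation)
    {
        using (var cts = new CancellationTokenSource())
        {
            // The continuation runs only if the delay ran to completion,
            // that is, if it was NOT cancelled by the finally block below.
            Task.Delay(timeout, cts.Token).ContinueWith(
                t => resource.Dispose(),
                CancellationToken.None,
                TaskContinuationOptions.OnlyOnRanToCompletion,
                TaskScheduler.Default);

            try
            {
                // On timeout the dispose above makes in-flight I/O throw
                // (for sockets typically ObjectDisposedException or IOException).
                return await operation();
            }
            finally
            {
                // Success or failure: stop the timer so delay tasks do not pile up.
                cts.Cancel();
            }
        }
    }
}
```

In the proxy checker, the read would then look something like await TimeoutGuard.RunAsync(socksTcpClient, TimeSpan.FromSeconds(100), () => reader.ReadToEndAsync()), with a catch around it to treat the resulting exception as a timed-out proxy.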
I am using the .NET 4.5 HttpClient class to make a POST request to a server a number of times. The first 3 calls run quickly, but the fourth time a call to await client.PostAsync(...) is made, it hangs for several seconds before returning the expected response.
using (HttpClient client = new HttpClient())
{
// Prepare query
StringBuilder queryBuilder = new StringBuilder();
queryBuilder.Append("?arg=value");
// Send query
using (var result = await client.PostAsync(BaseUrl + queryBuilder.ToString(),
new StreamContent(streamData)))
{
Stream stream = await result.Content.ReadAsStreamAsync();
return new MyResult(stream);
}
}
The server code is shown below:
HttpListener listener;
void Run()
{
listener.Start();
ThreadPool.QueueUserWorkItem((o) =>
{
while (listener.IsListening)
{
ThreadPool.QueueUserWorkItem((c) =>
{
var context = c as HttpListenerContext;
try
{
// Handle request
}
finally
{
// Always close the stream
context.Response.OutputStream.Close();
}
}, listener.GetContext());
}
});
}
Inserting a debug statement at // Handle request shows that the server code doesn't seem to receive the request as soon as it is sent.
I have already investigated whether it could be a problem with the client not closing the response, meaning that the number of connections the ServicePoint provider allows could be reached. However, I have tried increasing ServicePointManager.MaxServicePoints but this has no effect at all.
I also found this similar question:
.NET HttpClient hangs after several requests (unless Fiddler is active)
I don't believe this is the problem with my code - even changing my code to exactly what is given there didn't fix the problem.
The problem was that there were too many Task instances scheduled to run.
Changing some of the Task.Factory.StartNew calls in my program for tasks which ran for a long time to use the TaskCreationOptions.LongRunning option fixed this. It appears that the task scheduler was waiting for other tasks to finish before it scheduled the request to the server.
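For reference, the change looks roughly like the sketch below. LongRunningWork and Start are hypothetical wrapper names. The StartNew overload and the TaskCreationOptions.LongRunning flag are the real framework API. The flag is only a hint to the scheduler, but in current implementations the default scheduler runs such tasks on a dedicated thread instead of a thread pool thread, so they no longer hold up short tasks queued behind them.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper around Task.Factory.StartNew for blocking, long-lived work.
public static class LongRunningWork
{
    public static Task<T> Start<T>(Func<T> blockingWork)
    {
        return Task.Factory.StartNew(
            blockingWork,
            CancellationToken.None,
            TaskCreationOptions.LongRunning, // hint: do not occupy a pool thread
            TaskScheduler.Default);
    }
}
```

This is only worth doing for work that blocks for a long time. Short or truly asynchronous work is better left on the thread pool.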