Implementation of an HttpClient request limiter and buffer - C#

In our project, we have a few services that make requests to a 3rd party API, using a key.
This API has a rate limit shared across all endpoints (meaning a request to one endpoint requires a 2-second cooldown before we can call a different endpoint).
We've handled this with timed background jobs, making requests to only one of the endpoints at any time.
After some architectural redesign, we've reached a point where we no longer rely as much on the timed background jobs, and the outgoing HttpRequests can no longer be moderated, since multiple service instances are making requests to the API.
So, in our current example:
We have a few HttpClients set up to all needed API endpoints, i.e.:
services.AddHttpClient<Endpoint1Service>(client =>
{
    client.BaseAddress = new Uri(configOptions.Services.Endpoint1.Url);
});
services.AddHttpClient<Endpoint2Service>(client =>
{
    client.BaseAddress = new Uri(configOptions.Services.Endpoint2.Url);
});
Endpoint1Service and Endpoint2Service were previously accessed by background job services:
public async Task DoJob()
{
    var items = await _repository.GetItems();
    foreach (var item in items)
    {
        var processedResult = await _endpoint1Service.DoRequest(item);
        await Task.Delay(2000);
        //...
    }
    // save all results
}
But now these "endpoint" services are accessed concurrently, and a new instance is created every time, so there is no way to moderate the request rate.
One possible solution would be to create some sort of singleton request buffer that is injected into every service using this API and moderates the requests so they go out at a given rate. The problem I see with this is that storing requests in an in-memory buffer seems dangerous, in case something goes wrong.
Is this a direction I should be looking towards, or is there anything else I can try?
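For reference, here is a minimal sketch of what I imagine such a singleton gate could look like (names like ApiRequestGate are placeholders, and the 2-second spacing is our API's cooldown):

public sealed class ApiRequestGate
{
    private readonly SemaphoreSlim _lock = new SemaphoreSlim(1, 1);
    private readonly TimeSpan _minInterval = TimeSpan.FromSeconds(2);
    private DateTime _lastRequestUtc = DateTime.MinValue;

    // Await this before every outgoing request, from any endpoint service.
    public async Task WaitAsync(CancellationToken ct = default)
    {
        await _lock.WaitAsync(ct);
        try
        {
            var wait = _lastRequestUtc + _minInterval - DateTime.UtcNow;
            if (wait > TimeSpan.Zero)
                await Task.Delay(wait, ct);
            _lastRequestUtc = DateTime.UtcNow;
        }
        finally
        {
            _lock.Release();
        }
    }
}

// services.AddSingleton<ApiRequestGate>();
// then inside Endpoint1Service/Endpoint2Service: await _gate.WaitAsync(); before each call

Note that nothing is actually buffered here: callers just await their turn, so a crash loses only the requests that were still waiting in memory anyway.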

Hope this helps:
I created the following for similar scenarios. Its objective is concurrency-throttled multithreading. However, it also gives you a convenient wrapper over your request processing pipeline, and it provides an optional cap on the maximum number of concurrent requests per client.
Create one instance per endpoint service and set its number of threads to 1 if you want a throttle of 1, or to 4 if you want up to 4 concurrent requests to the given endpoint.
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Threading/APIProcessor/AsyncThreadedWorkItemProcessor.cs
or
https://github.com/tcwicks/ChillX/blob/master/src/ChillX.Threading/APIProcessor/ThreadedWorkItemProcessor.cs
The two implementations are interchangeable. In a web server context the former is probably better, as it offloads to the background thread pool instead of using foreground threads.
Example Usage
In your case, set _maxWorkerThreads to 1 if you want to rate limit to a single concurrent request, or to 4 if you want to rate limit to 4 concurrent requests.
//Example Usage for WebAPI controller
class Example
{
    private static ThreadedWorkItemProcessor<DummyRequest, DummyResponse, int, WorkItemPriority> ThreadedProcessorExample = new ThreadedWorkItemProcessor<DummyRequest, DummyResponse, int, WorkItemPriority>(
        _maxWorkItemLimitPerClient: 100 // Maximum number of concurrent requests in the processing queue per client. Set to int.MaxValue to disable concurrent request caps
        , _maxWorkerThreads: 16 // Maximum number of threads to scale up to
        , _threadStartupPerWorkItems: 4 // Consider starting a new processing thread every X requests
        , _threadStartupMinQueueSize: 4 // Do NOT start a new processing thread if work item queue is below this size
        , _idleWorkerThreadExitSeconds: 10 // Idle threads will exit after X seconds
        , _abandonedResponseExpirySeconds: 60 // Expire processed work items after X seconds (maybe the client terminated or the web request thread died)
        , _processRequestMethod: ProcessRequestMethod // Your do-work method for processing the request
        , _logErrorMethod: Handler_LogError
        , _logMessageMethod: Handler_LogMessage
        );

    public async Task<DummyResponse> GetResponse([FromBody] DummyRequest _request)
    {
        int clientID = 1; // Replace with the client ID from your authentication mechanism if using per-client request caps. Otherwise just hardcode to 0 or whatever
        WorkItemPriority _priority;
        _priority = WorkItemPriority.Medium; // Assign the priority based on whatever prioritization rules
        int RequestID = ThreadedProcessorExample.ScheduleWorkItem(_priority, _request, clientID);
        if (RequestID < 0)
        {
            // Client has exceeded the maximum number of concurrent requests, or the application pool is shutting down
            // Return a suitable error message here
            return new DummyResponse() { ErrorMessage = @"Maximum number of concurrent requests exceeded or service is restarting. Please retry request later." };
        }

        // If you need the result (like in a WebAPI controller) then do this
        // Otherwise, if it is, say, a backend processing sink where no client is waiting for a response, we are done here; just return
        KeyValuePair<bool, ThreadWorkItem<DummyRequest, DummyResponse, int>> workItemResult;
        workItemResult = await ThreadedProcessorExample.TryGetProcessedWorkItemAsync(RequestID,
            _timeoutMS: 1000, // Timeout of 1 second
            _taskWaitType: ThreadProcessorAsyncTaskWaitType.Delay_Specific,
            _delayMS: 10);
        if (!workItemResult.Key)
        {
            // Processing timeout, or the application pool is shutting down
            // Return a suitable error message here
            return new DummyResponse() { ErrorMessage = @"Internal system timeout or service is restarting. Please retry request later." };
        }
        return workItemResult.Value.Response;
    }

    public static DummyResponse ProcessRequestMethod(DummyRequest request)
    {
        // Process the request and return the response
        return new DummyResponse() { orderID = request.orderID };
    }

    public static void Handler_LogError(Exception ex)
    {
        // Log unhandled exception here
    }

    public static void Handler_LogMessage(string Message)
    {
        // Log message here
    }
}

Related

Task.WhenAll timing out after x number of Tasks

I have the following code that gets called from a Controller.
public async Task Execute()
{
    var collections = await _repo.GetCollections(); // This gets 500+ items
    List<Object1> coolCollections = new List<Object1>();
    List<Object2> uncoolCollections = new List<Object2>();
    foreach (var collection in collections)
    {
        if (collection == "Something")
        {
            var specialObject = TurnObjectIntoSpecialObject(collection);
            uncoolCollections.Add(specialObject);
        }
        else
        {
            var anotherObject = TurnObjectIntoAnotherObject(collection);
            coolCollections.Add(anotherObject);
        }
    }
    var list1Async = coolCollections.Select(async obj => await restService.PostObject1(obj)); // each call takes 200 -> 2000ms
    var list2Async = uncoolCollections.Select(async obj => await restService.PostObject2(obj)); // each call takes 300 -> 3000ms
    var asyncTasks = list1Async.Concat<Task>(list2Async);
    await Task.WhenAll(asyncTasks); // Has 500+ 'tasks'
}
Unfortunately, I'm getting a 504 error after around 300 or so requests. I can't change the API the RestService calls, so I'm stuck trying to make the above code more performant.
Changing Task.WhenAll to a foreach loop does work and does resolve the timeout, but it's very slow.
My question is how can I make sure the above code does not timeout after x number of requests?
Making more concurrent calls to a remote site or database doesn't improve throughput, quite the opposite. Conflicts between the concurrent operations mean that beyond a certain point everything will start taking more time, until the service crashes. Right now you have a possible 300-way blocking problem.
The way all services handle this is by restricting the number of concurrent connections. In fact, many services will throttle clients so they don't crash if someone ... sends 500 concurrent requests. Some may even tell you they're throttling you with a 429 response.
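If the service does tell you that, a polite client waits however long the server asks before retrying. A minimal sketch of honoring a 429 with its Retry-After header (SendRespectingThrottleAsync is an illustrative name, not a library API):

static async Task<HttpResponseMessage> SendRespectingThrottleAsync(
    HttpClient client, Func<HttpRequestMessage> requestFactory)
{
    while (true)
    {
        var response = await client.SendAsync(requestFactory());
        if ((int)response.StatusCode != 429)
            return response;

        // Prefer the server's Retry-After hint; fall back to a fixed delay.
        var delay = response.Headers.RetryAfter?.Delta ?? TimeSpan.FromSeconds(5);
        response.Dispose();
        await Task.Delay(delay);
    }
}

A request message can't be sent twice, hence the factory instead of a single HttpRequestMessage.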
Another way is to use batch requests, so instead of making 300 or 500 calls you can send a batch of 500 operations, allowing the service to handle them in an efficient way.
You can use Parallel.ForEachAsync to execute multiple calls with a specified degree of parallelism:
ParallelOptions parallelOptions = new()
{
    MaxDegreeOfParallelism = 30
};
await Parallel.ForEachAsync(coolCollections, parallelOptions, async (obj, ct) =>
{
    await restService.PostObject1(obj);
});
await Parallel.ForEachAsync(uncoolCollections, parallelOptions, async (obj, ct) =>
{
    await restService.PostObject2(obj);
});
You can adjust the DOP to find what works best without slowing down the remote service.
If the service can handle it, you could start both pipelines concurrently and await both of them to complete:
ParallelOptions parallelOptions = new()
{
    MaxDegreeOfParallelism = 30
};
var task1 = Parallel.ForEachAsync(coolCollections, parallelOptions, async (obj, ct) =>
{
    await restService.PostObject1(obj);
});
var task2 = Parallel.ForEachAsync(uncoolCollections, parallelOptions, async (obj, ct) =>
{
    await restService.PostObject2(obj);
});
await Task.WhenAll(task1, task2);
It doesn't make sense to retry those operations before you limit the DOP. Retrying 500 failed requests will only lead to another failure. Retrying with a random or staggered delay is essentially the same as limiting the DOP from the start, except it takes far longer to complete.
Since you are unable to change the REST service, the 504 gateway timeout will remain.
A better solution would be to use a retry mechanism: if you receive a 504 error code, you retry after x seconds.
Why retry after x seconds?
The reasons for the 504 could be many; it could be that the server cannot handle more requests because it is at its maximum workload at the moment.
A good and battle-tested library for retry mechanisms is Polly.
You could also write your own function or action, depending on the return type.
Depending on the happy-flow use case, other methods could be used, but in this situation I went with the assumption that you want to upload all the data even if an exception occurs.
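For instance, a rough sketch with Polly v7 (assuming the REST calls return HttpResponseMessage; the handled status code, retry count, and backoff are placeholders to adapt to your service):

// using Polly; using System.Net;
var retryPolicy = Policy<HttpResponseMessage>
    .Handle<HttpRequestException>()
    .OrResult(r => r.StatusCode == HttpStatusCode.GatewayTimeout)
    .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt))); // waits 2s, 4s, 8s

var response = await retryPolicy.ExecuteAsync(() => restService.PostObject1(obj));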
Something to think about, if this service is provided by a third party vendor then look into the documentation. It will most likely have a section on max concurrent connections to the service.
This answer assumes that you are using .NET 6 or later. You could project the objects to an enumerable of object elements, and then parallelize the processing of the projected objects with the Parallel.ForEachAsync method. This method lets you configure the MaxDegreeOfParallelism, so that not all projected objects are processed at once. As a result, the remote server will not be bombarded with more requests than it can handle, nor will the network bandwidth be saturated. The optimal MaxDegreeOfParallelism can be found by experimentation: start with a small number, like 5, and gradually increase it until you find the sweet spot that offers the best performance.
public async Task Execute()
{
    var objects = await _repo.GetObjects();
    IEnumerable<object> projected = objects.Select(obj =>
    {
        if (obj == "Something")
        {
            return (object)TurnObjectIntoSpecialObject(obj);
        }
        else
        {
            return (object)TurnObjectIntoAnotherObject(obj);
        }
    });
    ParallelOptions options = new()
    {
        MaxDegreeOfParallelism = 5
    };
    await Parallel.ForEachAsync(projected, options, async (item, ct) =>
    {
        switch (item)
        {
            case Object1 obj1: await restService.PostObject1(obj1); break;
            case Object2 obj2: await restService.PostObject2(obj2); break;
            default: throw new NotImplementedException();
        }
    });
}
The above code processes the objects in the same order that appear in the objects sequence. The PostObject1/PostObject2 operations are parallelized, but the TurnObjectIntoSpecialObject/TurnObjectIntoAnotherObject operations are not. If you want to parallelize these too, then you can feed the Parallel.ForEachAsync with the objects, and do the projection inside the parallel loop.
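As a sketch, the fully parallelized variant could look like this (keeping the names from the question):

await Parallel.ForEachAsync(collections, options, async (collection, ct) =>
{
    if (collection == "Something")
    {
        var specialObject = TurnObjectIntoSpecialObject(collection);
        await restService.PostObject2(specialObject);
    }
    else
    {
        var anotherObject = TurnObjectIntoAnotherObject(collection);
        await restService.PostObject1(anotherObject);
    }
});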
In case of errors, only the first exception will be propagated. If you want to
propagate all the errors, you can find solutions here.

.NET 5 HttpClient concurrency - performance

I created an HttpClient using IHttpClientFactory and sent 1000 GET calls in parallel to a WebApi, and observed a delay of about 3-5 minutes for each request. Once this completed, I sent another 1000 GET requests in parallel, and this time there was no delay.
I then increased the parallel requests to 2000. For the first batch, each request's delay was about 9-11 minutes, and for the second batch of 2000 parallel requests, each request's delay was ~5 minutes (whereas with 1000 requests the second batch had no delay).
var client = _clientFactory.CreateClient();
client.BaseAddress = new Uri("http://localhost:5000");
client.Timeout = TimeSpan.FromMinutes(20);
List<Task> _task = new List<Task>();
for (int i = 1; i <= 4000; i++)
{
    _task.Add(ExecuteRequest(client, i));
    if (i % 2000 == 0)
    {
        await Task.WhenAll(_task);
        _task.Clear();
    }
}

private async Task ExecuteRequest(HttpClient client, int requestId)
{
    var result = await client.GetAsync($"Performance/{requestId}");
    var response = await result.Content.ReadAsStringAsync();
    var data = JsonConvert.DeserializeObject<Response>(response);
}
Trying to understand:
How many parallel requests does HttpClient support without delay?
How to improve the performance of HttpClient for 2000 or more parallel requests?
How many parallel requests does HttpClient support without delay?
On modern .NET Core platforms, you're limited only by available memory. There's no built-in throttling that's on by default.
How to improve the performance of HttpClient for 2000 or more parallel requests?
It sounds like you're being throttled by your server. If you want to test a more scalable server, try running this in your server's startup:
var desiredThreads = 2000;
ThreadPool.GetMaxThreads(out _, out var maxIoThreads);
ThreadPool.SetMaxThreads(desiredThreads, maxIoThreads);
ThreadPool.GetMinThreads(out _, out var minIoThreads);
ThreadPool.SetMinThreads(desiredThreads, minIoThreads);
What you're doing is causing worst-case perf for a "cold" HttpClient (one that was just newed up, or whose connection pool is empty).
When you make a new request, it looks for an open connection in the connection pool. When it doesn't find one, it tries to open up a new connection. By throwing a sudden burst at a cold client, most calls to SendAsync will end up trying to open a new connection.
This is a problem because a request that needs a new connection will require multiple round-trips to the server, whereas a request on an existing connection will only require a single round-trip. It gets even worse if you use HTTPS. You're heavily dependent on your network latency in this case.
If you are just benchmarking, then you'll want to benchmark steady-state performance, not warmup performance. BenchmarkDotNet should more or less do this for you.
When you have requests that complete reasonably quickly, it can be a lot faster to limit your initial concurrency to a smaller percentage of your total requests and slowly ramp up your connection pool size from there. This allows subsequent requests to re-use connections. What you might try is something like below, which will only allow (rough behavior, not a guarantee) 10 new connections to be opened at once:
var sem = new SemaphoreSlim(10);
var client = new HttpClient();

async Task<HttpResponseMessage> MakeRequestAsync(HttpRequestMessage req)
{
    Task t = sem.WaitAsync();
    bool openNew = t.IsCompleted; // the semaphore had spare capacity, so this request may open a new connection
    await t;
    try
    {
        return await client.SendAsync(req);
    }
    finally
    {
        // Release an extra slot when we got in without waiting, slowly ramping up the allowed concurrency
        sem.Release(openNew ? 2 : 1);
    }
}

Client-side request rate-limiting

I'm designing a .NET client application for an external API. It's going to have two main responsibilities:
Synchronization - making a batch of requests to API and saving responses to my database periodically.
Client - a pass-through for requests to API from users of my client.
The service's documentation specifies the following rules on the maximum number of requests that can be issued in a given period of time:
During a day:
Maximum of 6000 requests per hour (~1.67 per second)
Maximum of 120 requests per minute (2 per second)
Maximum of 3 requests per second
At night:
Maximum of 8000 requests per hour (~2.22 per second)
Maximum of 150 requests per minute (2.5 per second)
Maximum of 3 requests per second
Exceeding these limits won't result in an immediate lockdown - no exception will be thrown. But the provider can get annoyed, contact us, and then ban us from using the service. So I need some request-delaying mechanism in place to prevent that. Here's how I see it:
public async Task MyMethod(Request request)
{
    await _rateLimiter.WaitForNextRequest(); // awaitable Task with calculated Delay
    await _api.DoAsync(request);
    _rateLimiter.AppendRequestCounters();
}
The safest and simplest option would be to respect only the lowest rate limit, that is, a maximum of 3 requests per 2 seconds. But because of the "Synchronization" responsibility, there is a need to use as much of these limits as possible.
So the next option would be to add a delay based on the current request count. I've tried to do something on my own, and I've also used RateLimiter by David Desmaisons, which would have been fine, but here's the problem:
Assuming my client sends 3 requests per second to the API during the day, we're going to see:
A 20-second delay every 120th request
A ~15 minute delay every 6000th request
This would've been acceptable if my application were only about "Synchronization", but "Client" requests can't wait that long.
I've searched the Web and read about token/leaky bucket and sliding window algorithms, but I couldn't translate them to my case and .NET, since they mainly cover rejecting requests that exceed a limit. I've found this repo and that repo, but they are both service-side-only solutions.
QoS-like splitting of rates, so that "Synchronization" gets the slower rate and "Client" the faster one, is not an option.
Assuming the current request rates are measured, how do I calculate the delay for the next request so that it adapts to the current situation, respects all maximum rates, and is never longer than 5 seconds? Something like gradually slowing down when approaching a limit.
This is achievable by using the library you linked on GitHub. We need a composed TimeLimiter made out of 3 CountByIntervalAwaitableConstraint instances, like so:
var hourConstraint = new CountByIntervalAwaitableConstraint(6000, TimeSpan.FromHours(1));
var minuteConstraint = new CountByIntervalAwaitableConstraint(120, TimeSpan.FromMinutes(1));
var secondConstraint = new CountByIntervalAwaitableConstraint(3, TimeSpan.FromSeconds(1));
var timeLimiter = TimeLimiter.Compose(hourConstraint, minuteConstraint, secondConstraint);
We can test to see if this works by doing this:
for (int i = 0; i < 1000; i++)
{
    await timeLimiter;
    Console.WriteLine($"Iteration {i} at {DateTime.Now:T}");
}
This will run 3 times every second until we reach 120 iterations (iteration 119), then wait until the minute is over, and then continue running 3 times every second. We can also (again using the library) easily use the TimeLimiter with an HttpClient via the provided AsDelegatingHandler() extension method, like so:
var handler = TimeLimiter.Compose(hourConstraint, minuteConstraint, secondConstraint).AsDelegatingHandler();
var client = new HttpClient(handler);
We can also use CancellationTokens, but as far as I can tell, not at the same time as using it as the handler for the HttpClient. Here is how you can use it with an HttpClient anyway:
var timeLimiter = TimeLimiter.Compose(hourConstraint, minuteConstraint, secondConstraint);
var client = new HttpClient();
for (int i = 0; i < 100; i++)
{
    await timeLimiter.Enqueue(async () =>
    {
        var response = await client.GetAsync("https://hacker-news.firebaseio.com/v0/item/8863.json?print=pretty");
        if (response.IsSuccessStatusCode)
            Console.WriteLine(await response.Content.ReadAsStringAsync());
        else
            Console.WriteLine($"Error code {response.StatusCode} reason: {response.ReasonPhrase}");
    }, new CancellationTokenSource(TimeSpan.FromSeconds(10)).Token);
}
Edit to address the OP's question more:
If you want to make sure a user can send a request without having to wait for the limit to reset, we need to dedicate a certain number of requests per second/minute/hour to our user. So we need a new TimeLimiter for this, and we also need to adjust our API TimeLimiter. Here are the two new ones:
var apiHourConstraint = new CountByIntervalAwaitableConstraint(5500, TimeSpan.FromHours(1));
var apiMinuteConstraint = new CountByIntervalAwaitableConstraint(100, TimeSpan.FromMinutes(1));
var apiSecondConstraint = new CountByIntervalAwaitableConstraint(2, TimeSpan.FromSeconds(1));
// TimeLimiter for calls automatically to the API
var apiTimeLimiter = TimeLimiter.Compose(apiHourConstraint, apiMinuteConstraint, apiSecondConstraint);
var userHourConstraint = new CountByIntervalAwaitableConstraint(500, TimeSpan.FromHours(1));
var userMinuteConstraint = new CountByIntervalAwaitableConstraint(20, TimeSpan.FromMinutes(1));
var userSecondConstraint = new CountByIntervalAwaitableConstraint(1, TimeSpan.FromSeconds(1));
// TimeLimiter for calls made manually by a user to the API
var userTimeLimiter = TimeLimiter.Compose(userHourConstraint, userMinuteConstraint, userSecondConstraint);
You can play around with the numbers to suit your need.
Now to use it:
I saw that you're using a central method to execute your requests, which makes this easier. I'll just add an optional boolean parameter that determines whether it's an automatically executed request or one made by a user. (You could replace this parameter with an enum if you want more than just automatic and manual requests.)
public static async Task DoRequest(Request request, bool manual = false)
{
    TimeLimiter limiter;
    if (manual)
        limiter = TimeLimiterManager.UserLimiter;
    else
        limiter = TimeLimiterManager.ApiLimiter;
    await limiter;
    await _api.DoAsync(request);
}
static class TimeLimiterManager
{
    public static TimeLimiter ApiLimiter { get; }
    public static TimeLimiter UserLimiter { get; }

    static TimeLimiterManager()
    {
        var apiHourConstraint = new CountByIntervalAwaitableConstraint(5500, TimeSpan.FromHours(1));
        var apiMinuteConstraint = new CountByIntervalAwaitableConstraint(100, TimeSpan.FromMinutes(1));
        var apiSecondConstraint = new CountByIntervalAwaitableConstraint(2, TimeSpan.FromSeconds(1));
        // TimeLimiter to control access to the API for automatically executed requests
        ApiLimiter = TimeLimiter.Compose(apiHourConstraint, apiMinuteConstraint, apiSecondConstraint);

        var userHourConstraint = new CountByIntervalAwaitableConstraint(500, TimeSpan.FromHours(1));
        var userMinuteConstraint = new CountByIntervalAwaitableConstraint(20, TimeSpan.FromMinutes(1));
        var userSecondConstraint = new CountByIntervalAwaitableConstraint(1, TimeSpan.FromSeconds(1));
        // TimeLimiter to control access to the API for manually executed requests
        UserLimiter = TimeLimiter.Compose(userHourConstraint, userMinuteConstraint, userSecondConstraint);
    }
}
This isn't perfect: when the user doesn't execute 20 API calls every minute but your automated system needs to execute more than 100 per minute, the automated system will have to wait.
And regarding day/night differences: you can use two backing fields for the Api/UserLimiter and return the appropriate one in the { get {...} } of the property.
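For illustration, a sketch of that switch (the 22:00-06:00 night window is an assumption; use whatever hours the provider's documentation defines):

static class TimeLimiterManager
{
    // Day limits from the documentation: 6000/h, 120/min, 3/s
    private static readonly TimeLimiter _dayApiLimiter = TimeLimiter.Compose(
        new CountByIntervalAwaitableConstraint(6000, TimeSpan.FromHours(1)),
        new CountByIntervalAwaitableConstraint(120, TimeSpan.FromMinutes(1)),
        new CountByIntervalAwaitableConstraint(3, TimeSpan.FromSeconds(1)));

    // Night limits: 8000/h, 150/min, 3/s
    private static readonly TimeLimiter _nightApiLimiter = TimeLimiter.Compose(
        new CountByIntervalAwaitableConstraint(8000, TimeSpan.FromHours(1)),
        new CountByIntervalAwaitableConstraint(150, TimeSpan.FromMinutes(1)),
        new CountByIntervalAwaitableConstraint(3, TimeSpan.FromSeconds(1)));

    public static TimeLimiter ApiLimiter
    {
        get
        {
            var hour = DateTime.Now.Hour;
            return (hour >= 22 || hour < 6) ? _nightApiLimiter : _dayApiLimiter;
        }
    }
}

Note that each TimeLimiter keeps its own counters, so requests made during the day don't count against the night limits; if the provider counts them together, you'd need a single limiter whose constraints cover the stricter day values.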

Advice on processing a giant text file and processing URLs

I'm currently trying to loop through a text file that is about 1.5 GB in size, and then use the URLs grabbed from it to pull down the HTML from each site.
For speed, I'm trying to process all the HTTP requests on new threads, but since C# is not my strongest language (though it's a requirement for what I'm doing), I'm a bit confused about good threading practice.
This is how I'm processing the list
private static void Main()
{
    const Int32 BufferSize = 128;
    using (var fileStream = File.OpenRead("dump.txt"))
    using (var streamReader = new StreamReader(fileStream, Encoding.UTF8, true, BufferSize))
    {
        String line;
        var progress = 0;
        while ((line = streamReader.ReadLine()) != null)
        {
            var stuff = line.Split('|');
            getHTML(stuff[3]);
            progress += 1;
            Console.WriteLine(progress);
        }
    }
}
And I'm pulling down the HTML as so
private static void getHTML(String url)
{
    new Thread(() =>
    {
        var client = new DecompressGzipResponse();
        var html = client.DownloadString(url);
    }).Start();
}
Though the speeds are fast initially, after about 20 thousand requests they slow down, and eventually after 32 thousand the application hangs and crashes. I was under the impression that C# threads terminated when the function completed?
Can anyone give any examples/ suggestions on how to do this better?
One very reliable way to do this is by using the producer-consumer pattern. You create a thread-safe queue of URLs (for example, BlockingCollection<Uri>). Your main thread is the producer, which adds items to the queue. You then have multiple consumer threads, each of which reads URLs from the queue and makes the HTTP requests. See BlockingCollection.
Setting it up isn't terribly difficult:
BlockingCollection<Uri> UrlQueue = new BlockingCollection<Uri>();

// Main thread starts the consumer threads
Task t1 = Task.Factory.StartNew(() => ProcessUrls(), TaskCreationOptions.LongRunning);
Task t2 = Task.Factory.StartNew(() => ProcessUrls(), TaskCreationOptions.LongRunning);
// create more tasks if you think necessary.

// Now read your file
foreach (var line in File.ReadLines(inputFileName))
{
    var theUri = ExtractUriFromLine(line);
    UrlQueue.Add(theUri);
}

// when done adding lines to the queue, mark the queue as complete
UrlQueue.CompleteAdding();

// now wait for the tasks to complete.
t1.Wait();
t2.Wait();
// You could also use Task.WaitAll if you have an array of tasks
The individual threads process the urls with this method:
void ProcessUrls()
{
    foreach (var uri in UrlQueue.GetConsumingEnumerable())
    {
        // code here to do a web request on that url
    }
}
That's a simple and reliable way to do things, but it's not especially quick. You can do much better by using a second queue of WebClient objects that make asynchronous requests. For example, say you want to have 15 asynchronous requests in flight. You start the same way with a BlockingCollection, but you only have one persistent consumer thread.
const int MaxRequests = 15;
BlockingCollection<WebClient> Clients = new BlockingCollection<WebClient>();

// start a single consumer thread
var ProcessingThread = Task.Factory.StartNew(() => ProcessUrls(), TaskCreationOptions.LongRunning);

// Create the WebClient objects and add them to the queue
for (var i = 0; i < MaxRequests; ++i)
{
    var client = new WebClient();
    // Add an event handler for the DownloadDataCompleted event
    client.DownloadDataCompleted += DownloadDataCompletedHandler;
    // And add this client to the queue
    Clients.Add(client);
}
// add the code from above that reads the file and populates the queue
// add the code from above that reads the file and populates the queue
Your processing function is somewhat different:
void ProcessUrls()
{
    foreach (var uri in UrlQueue.GetConsumingEnumerable())
    {
        // Wait for an available client
        var client = Clients.Take();
        // and make an asynchronous request
        client.DownloadDataAsync(uri, client);
    }
    // When the queue is empty, you need to wait for all of the
    // clients to complete their requests.
    // You know they're all done when you dequeue all of them.
    for (int i = 0; i < MaxRequests; ++i)
    {
        var client = Clients.Take();
        client.Dispose();
    }
}
Your DownloadDataCompleted event handler does something with the data that was downloaded, and then adds the WebClient instance back to the queue of clients.
void DownloadDataCompletedHandler(Object sender, DownloadDataCompletedEventArgs e)
{
    // The data downloaded is in e.Result
    // be sure to check the e.Error and e.Cancelled values to determine if an error occurred
    // do something with the data

    // And then add the client back to the queue
    WebClient client = (WebClient)e.UserState;
    Clients.Add(client);
}
This should keep you going with 15 concurrent requests, which is about all you can do without getting a bit more complicated. Your system can likely handle many more concurrent requests, but the way that WebClient starts asynchronous requests requires some synchronous work up front, and that overhead makes 15 about the maximum number you can handle.
You might be able to have multiple threads initiating the asynchronous requests. In that case, you could potentially have as many threads as you have processor cores. So on a quad core machine, you could have the main thread and three consumer threads. With three consumer threads this technique could give you 45 concurrent requests. I'm not certain that it scales that well, but it might be worth a try.
There are ways to have hundreds of concurrent requests, but they're quite a bit more complicated to implement.
You need thread management.
My advice is to use Tasks instead of creating your own Threads.
By using the Task Parallel Library, you let the runtime deal with the thread management. By default, it will allocate your tasks on threads from the ThreadPool, and will allow a level of concurrency which is contingent on the number of CPU cores you have. It will also reuse existing Threads when they become available instead of wasting time creating new ones.
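For illustration, a minimal Task-based sketch of the same job (the cap of 15 concurrent downloads is an assumption; WebClient is kept to match the question's era, and DownloadStringTaskAsync gives an awaitable download):

private static async Task ProcessFileAsync(string path)
{
    var throttle = new SemaphoreSlim(15); // at most 15 downloads in flight
    var tasks = new List<Task>();
    foreach (var line in File.ReadLines(path))
    {
        var url = line.Split('|')[3];
        tasks.Add(DownloadAsync(url, throttle));
    }
    await Task.WhenAll(tasks);
}

private static async Task DownloadAsync(string url, SemaphoreSlim throttle)
{
    await throttle.WaitAsync();
    try
    {
        using (var client = new WebClient())
        {
            var html = await client.DownloadStringTaskAsync(url);
            // process html here
        }
    }
    finally
    {
        throttle.Release();
    }
}

For a 1.5 GB input you'd still accumulate one Task per line, so for truly huge files combine this with the producer-consumer queue from the other answer.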
If you want to get more advanced, you can create your own task scheduler to manage the scheduling aspect yourself.
See also What is difference between Task and Thread?

Placing a global HTTP requests per second limit

Due to server limitations, I cannot make more than one request per 3 seconds, and I am using Thread.Sleep() to limit the rate of requests I make. Is there a better way that doesn't involve pausing the thread? Thanks.
static void Main(string[] args)
{
    // get ids
    List<string> requestIds = GetMyRequestIds();
    foreach (string requestId in requestIds)
    {
        Thread.Sleep(3000);
        // one request for each id
        var result = FetchStatus(requestId);
    }
}

public Dictionary<string, object> FetchStatus(string requestId)
{
    // build http request and query the server
    // ... requestId... http... etc... read to end
    return results;
}
If your limitation is just one request per 3 seconds, you could set up a timer which fires a callback every 3 seconds. As the callback is executed on a separate thread, it is possible that two long-running requests execute simultaneously.
System.Threading.Timer
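For illustration, a minimal sketch (reusing the question's GetMyRequestIds/FetchStatus; the queue just feeds the timer callback, and the event keeps Main alive until the IDs are drained):

var pending = new ConcurrentQueue<string>(GetMyRequestIds());
var done = new ManualResetEventSlim();

var timer = new System.Threading.Timer(_ =>
{
    if (pending.TryDequeue(out var requestId))
        FetchStatus(requestId); // runs on a thread-pool thread; a slow request can overlap the next tick
    else
        done.Set();
}, null, dueTime: 0, period: 3000);

done.Wait(); // block Main until the queue is drained
timer.Dispose();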
