I have a problem with C# async and await when multiple requests hit my controller at the same time.
I need the requests to execute one by one, not all at the same time.
Once the first request has saved its data to the DB, the next request should be taken from the queue.
I think it's enough to add some synchronization there. Let's assume the following code:
public async Task<Data> GetData() // Data is a placeholder for the type GetDataFromDatabase returns
{
var result = await db.GetDataFromDatabase();
return result;
}
You can declare a private field of type SemaphoreSlim, for example:
private readonly SemaphoreSlim _getDataSem = new SemaphoreSlim(1, 1);
and then modify body of the method:
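public async Task<Data> GetData() // Data is a placeholder for the type GetDataFromDatabase returns
{
// WaitAsync avoids blocking the thread while another request holds the semaphore
await _getDataSem.WaitAsync();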
try
{
var result = await db.GetDataFromDatabase();
return result;
}
finally { _getDataSem.Release(); }
}
There is a wide range of ways to synchronize such code, such as semaphores and locks (note that a C# lock statement cannot contain an await).
I'm coding my own HttpClient that should handle HTTP 429 (TooManyRequests) responses. I'm executing a single method in the client in parallel. As soon as I get a 429 status code as a response, I would like to pause the execution of all tasks that are currently calling the method.
Currently, I'm using very old code from an old MS DevBlog: PauseToken/Source
private readonly HttpClient _client;
private readonly PauseTokenSource PauseSource;
private readonly PauseToken PauseToken;
public MyHttpClient(HttpClient client)
{
_client = client;
PauseSource = new();
PauseToken = PauseSource.Token;
}
public async Task<HttpResponseMessage> PostAsJsonAsync<TValue>(string? requestUri, TValue value, CancellationToken cancellationToken = default)
{
try
{
await PauseToken.WaitWhilePausedAsync(); // I'd really like to pass the cancellationToken as well
HttpResponseMessage result = await _client.PostAsJsonAsync(requestUri, value, cancellationToken).ConfigureAwait(false);
if (result.StatusCode == HttpStatusCode.TooManyRequests)
{
PauseSource.IsPaused = true;
TimeSpan delay = (result.Headers.RetryAfter?.Date - DateTimeOffset.UtcNow) ?? TimeSpan.Zero;
await Task.Delay(delay, cancellationToken);
PauseSource.IsPaused = false;
return await PostAsJsonAsync(requestUri, value, cancellationToken);
}
return result;
}
finally
{
PauseSource.IsPaused = false;
}
}
MyHttpClient.PostAsJsonAsync is called like this:
private readonly MyHttpClient _client; // This gets injected by the constructor DI
private string ApiUrl; // This as well
public async Task SendToAPIAsync<T>(IEnumerable<T> items, CancellationToken cancellationToken = default)
{
IEnumerable<Task<HttpResponseMessage>> tasks = items.Select(item =>
_client.PostAsJsonAsync(ApiUrl, item, cancellationToken));
await Task.WhenAll(tasks).ConfigureAwait(false);
}
The items collection will contain 15'000 - 25'000 items. The API is unfortunately built so I have to make 1 request for each item.
I really dislike using old code like this, since I honestly don't even know what it does under the hood (the entire source code can be looked at in the linked article above). Also, I'd like to pass my cancellationToken to the WaitWhilePausedAsync() method since execution should be able to be cancelled at any time.
Is there really no easy way to "pause an async method"?
I've tried to store the DateTimeOffset I get from the result->RetryAfter in a local field, then just simply Task.Delay() the delta to DateTimeOffset.UtcNow, but that didn't seem to work and I also don't think it's very performant.
I like the idea of having a PauseToken but I think there might be better ways to do this nowadays.
> I really dislike using old code like this
Just because code is old does not necessarily mean it is bad.
> Also, I'd like to pass my cancellationToken to the WaitWhilePausedAsync() method since execution should be able to be cancelled at any time
As far as I can tell, WaitWhilePausedAsync just returns a task. If you want to abort as soon as the cancellation token is cancelled, you could use the WaitOrCancel extension from this answer, used like:
try
{
await PauseToken.WaitWhilePausedAsync().WaitOrCancel(cancellationToken);
}
catch (OperationCanceledException)
{
// handle cancel
}
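The WaitOrCancel extension itself is not shown here. A minimal sketch of what such a helper could look like follows; the class name and implementation are assumptions for illustration, not the code from the referenced answer. On .NET 6 or later, the built-in task.WaitAsync(cancellationToken) covers the same need.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskCancellationExtensions
{
    // Hypothetical helper: completes when `task` completes, or throws
    // OperationCanceledException as soon as `token` is cancelled.
    // The underlying task is not stopped; only the wait is abandoned.
    public static async Task WaitOrCancel(this Task task, CancellationToken token)
    {
        token.ThrowIfCancellationRequested();

        var cancelled = new TaskCompletionSource<bool>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        // Complete the helper task when the token fires; unregister afterwards.
        using (token.Register(() => cancelled.TrySetResult(true)))
        {
            Task finished = await Task.WhenAny(task, cancelled.Task).ConfigureAwait(false);
            if (finished != task)
                throw new OperationCanceledException(token);
        }

        // Propagate the original task's exception, if any.
        await task.ConfigureAwait(false);
    }
}
```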
> Is there really no easy way to "pause an async method"?
To 'pause an async method' means we need to await something, since we probably want to avoid blocking. That something needs to be a Task, so such a method would probably involve creating a TaskCompletionSource that can be awaited and that completes when unpaused. That seems to be more or less what your PauseToken does.
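To make the idea concrete, here is a minimal sketch of a pause source built on TaskCompletionSource. SimplePauseSource is a hypothetical name, and this is neither the DevBlog nor the AsyncEx implementation. It assumes .NET 6 or later for Task.WaitAsync, and a caller that has already been released does not re-check the flag if the source is paused again immediately.

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified pause source for illustration only.
public sealed class SimplePauseSource
{
    private readonly object _gate = new object();
    private TaskCompletionSource<bool>? _resume; // non-null while paused

    public bool IsPaused
    {
        get { lock (_gate) return _resume != null; }
        set
        {
            TaskCompletionSource<bool>? toRelease = null;
            lock (_gate)
            {
                if (value && _resume == null)
                {
                    // Pausing: create the task that waiters will await.
                    _resume = new TaskCompletionSource<bool>(
                        TaskCreationOptions.RunContinuationsAsynchronously);
                }
                else if (!value && _resume != null)
                {
                    // Resuming: detach the task and complete it outside the lock.
                    toRelease = _resume;
                    _resume = null;
                }
            }
            toRelease?.TrySetResult(true);
        }
    }

    // Completes immediately when not paused; otherwise completes on resume
    // or throws OperationCanceledException when the token is cancelled.
    public Task WaitWhilePausedAsync(CancellationToken cancellationToken = default)
    {
        Task waiter;
        lock (_gate)
        {
            if (_resume == null) return Task.CompletedTask;
            waiter = _resume.Task;
        }
        return waiter.WaitAsync(cancellationToken);
    }
}
```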
Note that any type of 'pausing' or 'cancellation' needs to be done cooperatively, so any pause feature needs to be built in, and probably needs to be built by you if you are implementing your own client.
But there might be alternative solutions. Maybe use a SemaphoreSlim for rate-limiting? Maybe just delay the request a bit if you get a TooManyRequests error? Maybe use a central queue of requests that can be throttled?
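As a rough illustration of the SemaphoreSlim idea, the sketch below caps how many calls are in flight at once. The Throttled class, the ForEachAsync name and the maxConcurrency parameter are assumptions made for this example. It limits concurrency only; it does not react to Retry-After headers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttled
{
    // Runs `action` for every item, with at most `maxConcurrency` calls in flight.
    public static async Task ForEachAsync<T>(
        IEnumerable<T> items,
        int maxConcurrency,
        Func<T, CancellationToken, Task> action,
        CancellationToken cancellationToken = default)
    {
        using var gate = new SemaphoreSlim(maxConcurrency, maxConcurrency);

        List<Task> tasks = items.Select(async item =>
        {
            // Wait for a free slot without blocking a thread.
            await gate.WaitAsync(cancellationToken).ConfigureAwait(false);
            try
            {
                await action(item, cancellationToken).ConfigureAwait(false);
            }
            finally
            {
                gate.Release();
            }
        }).ToList();

        await Task.WhenAll(tasks).ConfigureAwait(false);
    }
}
```

With the client from the question, a call might look like Throttled.ForEachAsync(items, 20, (item, ct) => _client.PostAsJsonAsync(ApiUrl, item, ct), cancellationToken), where 20 is an arbitrary limit to tune against the API.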
I ultimately created a library that contains an HttpClientHandler which handles these results for me. For anyone interested, here's the repo: github.com/baltermia/too-many-requests-handler (the NuGet package is linked in the readme).
A comment above led me to the solution below. I used the github.com/StephenCleary/AsyncEx library, which has both the PauseTokenSource and AsyncLock types that provided the functionality I was searching for.
private readonly AsyncLock _asyncLock = new();
private readonly HttpClient _client;
private readonly PauseTokenSource _pauseSource = new();
public PauseToken PauseToken { get; }
public MyHttpClient(HttpClient client)
{
_client = client;
PauseToken = _pauseSource.Token;
}
public async Task<HttpResponseMessage> PostAsJsonAsync<TValue>(string? requestUri, TValue value, CancellationToken cancellationToken = default)
{
{
// check if requests are paused and wait
await PauseToken.WaitWhilePausedAsync(cancellationToken).ConfigureAwait(false);
HttpResponseMessage result = await _client.PostAsJsonAsync(requestUri, value, cancellationToken).ConfigureAwait(false);
// if result is anything but 429, return (even if it may be an error)
if (result.StatusCode != HttpStatusCode.TooManyRequests)
return result;
// create a locker which will unlock at the end of the stack
using IDisposable locker = await _asyncLock.LockAsync(cancellationToken).ConfigureAwait(false);
// calculate delay
DateTimeOffset? time = result.Headers.RetryAfter?.Date;
TimeSpan delay = time - DateTimeOffset.UtcNow ?? TimeSpan.Zero;
// if delay is 0 or below, return new requests
if (delay <= TimeSpan.Zero)
{
// very important to unlock
locker.Dispose();
// recursively recall itself
return await PostAsJsonAsync(requestUri, value, cancellationToken).ConfigureAwait(false);
}
try
{
// otherwise pause requests
_pauseSource.IsPaused = true;
// then wait the calculated delay
await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
}
finally
{
_pauseSource.IsPaused = false;
}
// make sure to unlock again (otherwise the method would lock itself because of recursion)
locker.Dispose();
// recursively recall itself
return await PostAsJsonAsync(requestUri, value, cancellationToken).ConfigureAwait(false);
}
}
I have a few methods that report some data to a database. We want to invoke all calls to the data service asynchronously. These calls to the data service are spread all over the code, so we want to make sure that they are executed one after another, in order, at any given time. Initially, I was using async/await on each of these methods and each of the calls was executed asynchronously, but we found out that if they run out of sequence there is room for errors.
So I thought we should queue all these asynchronous tasks and send them on a separate thread, but I want to know what options we have. I came across SemaphoreSlim. Will this be appropriate in my use case?
Or what other options would suit my use case? Please guide me.
So, this is what I currently have in my code:
public static SemaphoreSlim mutex = new SemaphoreSlim(1);
//first DS call
public async Task SendModuleDataToDSAsync(Module parameters)
{
var tasks1 = new List<Task>();
var tasks2 = new List<Task>();
//await mutex.WaitAsync(); **//is this correct way to use SemaphoreSlim ?**
foreach (var setting in Module.param)
{
Task job1 = SaveModule(setting);
tasks1.Add(job1);
Task job2= SaveModule(GetAdvancedData(setting));
tasks2.Add(job2);
}
await Task.WhenAll(tasks1);
await Task.WhenAll(tasks2);
//mutex.Release(); // **is this correct?**
}
private async Task SaveModule(Module setting)
{
await Task.Run(() =>
{
// Invokes Calls to DS
...
});
}
//somewhere down the main thread, invoking second call to DS
//Second DS Call
private async Task SendInstrumentSettingsToDS(<param1>, <param2>)
{
//await mutex.WaitAsync();// **is this correct?**
await Task.Run(() =>
{
//TrackInstrumentInfoToDS
//mutex.Release();// **is this correct?**
});
if(param2)
{
await Task.Run(() =>
{
//TrackParam2InstrumentInfoToDS
});
}
}
> Initially, I was using async/await on each of these methods and each of the calls was executed asynchronously, but we found out that if they run out of sequence there is room for errors.
> So I thought we should queue all these asynchronous tasks and send them on a separate thread, but I want to know what options we have. I came across SemaphoreSlim.
SemaphoreSlim does restrict asynchronous code to running one at a time, and is a valid form of mutual exclusion. However, since "out of sequence" calls can cause errors, SemaphoreSlim is not an appropriate solution, because it does not guarantee FIFO ordering.
In a more general sense, no synchronization primitive guarantees FIFO because that can cause problems due to side effects like lock convoys. On the other hand, it is natural for data structures to be strictly FIFO.
So, you'll need to use your own FIFO queue, rather than having an implicit execution queue. Channels is a nice, performant, async-compatible queue, but since you're on an older version of C#/.NET, BlockingCollection<T> would work:
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
public sealed class ExecutionQueue
{
private readonly BlockingCollection<Func<Task>> _queue = new BlockingCollection<Func<Task>>();
public ExecutionQueue() => Completion = Task.Run(() => ProcessQueueAsync());
public Task Completion { get; }
public void Complete() => _queue.CompleteAdding();
private async Task ProcessQueueAsync()
{
foreach (var value in _queue.GetConsumingEnumerable())
await value();
}
}
The only tricky part with this setup is how to queue work. From the perspective of the code queueing the work, they want to know when the lambda is executed, not when the lambda is queued. From the perspective of the queue method (which I'm calling Run), the method needs to complete its returned task only after the lambda is executed. So, you can write the queue method something like this:
public Task Run(Func<Task> lambda)
{
var tcs = new TaskCompletionSource<object>();
_queue.Add(async () =>
{
// Execute the lambda and propagate the results to the Task returned from Run
try
{
await lambda();
tcs.TrySetResult(null);
}
catch (OperationCanceledException ex)
{
tcs.TrySetCanceled(ex.CancellationToken);
}
catch (Exception ex)
{
tcs.TrySetException(ex);
}
});
return tcs.Task;
}
This queueing method isn't as perfect as it could be. If a task completes with more than one exception (this is normal for parallel code), only the first one is retained (this is normal for async code). There's also an edge case around OperationCanceledException handling. But this code is good enough for most cases.
Now you can use it like this:
public static ExecutionQueue _queue = new ExecutionQueue();
public async Task SendModuleDataToDSAsync(Module parameters)
{
var tasks1 = new List<Task>();
var tasks2 = new List<Task>();
foreach (var setting in Module.param)
{
Task job1 = _queue.Run(() => SaveModule(setting));
tasks1.Add(job1);
Task job2 = _queue.Run(() => SaveModule(GetAdvancedData(setting)));
tasks2.Add(job2);
}
await Task.WhenAll(tasks1);
await Task.WhenAll(tasks2);
}
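On newer runtimes, the Channels option mentioned above could replace BlockingCollection<T>. The following is a sketch under that assumption, not part of the original answer: ChannelExecutionQueue is a hypothetical name, and it relies on System.Threading.Channels plus C# 8 await foreach. It keeps the same Run/Complete/Completion shape and the same exception handling as the BlockingCollection version.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class ChannelExecutionQueue
{
    // Unbounded FIFO queue of work items, consumed by a single reader.
    private readonly Channel<Func<Task>> _channel =
        Channel.CreateUnbounded<Func<Task>>(
            new UnboundedChannelOptions { SingleReader = true });

    public ChannelExecutionQueue() => Completion = ProcessQueueAsync();

    public Task Completion { get; }

    public void Complete() => _channel.Writer.Complete();

    public Task Run(Func<Task> lambda)
    {
        var tcs = new TaskCompletionSource<object?>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        bool queued = _channel.Writer.TryWrite(async () =>
        {
            // Execute the lambda and propagate the outcome to the Task returned from Run.
            try
            {
                await lambda().ConfigureAwait(false);
                tcs.TrySetResult(null);
            }
            catch (OperationCanceledException ex)
            {
                tcs.TrySetCanceled(ex.CancellationToken);
            }
            catch (Exception ex)
            {
                tcs.TrySetException(ex);
            }
        });

        if (!queued)
            throw new InvalidOperationException("The queue has been completed.");

        return tcs.Task;
    }

    private async Task ProcessQueueAsync()
    {
        // Items are read in the order they were written; each one is awaited
        // before the next starts, without blocking a thread while idle.
        await foreach (Func<Task> work in _channel.Reader.ReadAllAsync().ConfigureAwait(false))
            await work().ConfigureAwait(false);
    }
}
```

One difference from the BlockingCollection version is that the consumer does not hold a thread-pool thread while the queue is empty.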
Here's a compact solution that has the least amount of moving parts but still guarantees FIFO ordering (unlike some of the suggested SemaphoreSlim solutions). There are two overloads for Enqueue so you can enqueue tasks with and without return values.
using System;
using System.Threading;
using System.Threading.Tasks;
public class TaskQueue
{
private Task _previousTask = Task.CompletedTask;
public Task Enqueue(Func<Task> asyncAction)
{
return Enqueue(async () => {
await asyncAction().ConfigureAwait(false);
return true;
});
}
public async Task<T> Enqueue<T>(Func<Task<T>> asyncFunction)
{
var tcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
// get predecessor and wait until it's done. Also atomically swap in our own completion task.
await Interlocked.Exchange(ref _previousTask, tcs.Task).ConfigureAwait(false);
try
{
return await asyncFunction().ConfigureAwait(false);
}
finally
{
tcs.SetResult();
}
}
}
Please keep in mind that your first solution, which adds all tasks to lists, doesn't ensure that the tasks are executed one after another. They all run in parallel because each task is started before the previous one has been awaited.
So yes, you have to use a SemaphoreSlim for async locking together with await. A simple implementation might be:
private readonly SemaphoreSlim _syncRoot = new SemaphoreSlim(1);
public async Task SendModuleDataToDSAsync(Module parameters)
{
await this._syncRoot.WaitAsync();
try
{
foreach (var setting in Module.param)
{
await SaveModule(setting);
await SaveModule(GetAdvancedData(setting));
}
}
finally
{
this._syncRoot.Release();
}
}
If you can use Nito.AsyncEx the code can be simplified to:
public async Task SendModuleDataToDSAsync(Module parameters)
{
using var lockHandle = await this._syncRoot.LockAsync();
foreach (var setting in Module.param)
{
await SaveModule(setting);
await SaveModule(GetAdvancedData(setting));
}
}
One option is to queue operations that will create tasks instead of queuing already running tasks as the code in the question does.
PseudoCode without locking:
Queue<Func<Task>> tasksQueue = new Queue<Func<Task>>();
async Task RunAllTasks()
{
while (tasksQueue.Count > 0)
{
var taskCreator = tasksQueue.Dequeue(); // get creator
var task = taskCreator(); // starting one task at a time here
await task; // wait till task completes
}
}
// note that declaring createSaveModuleTask does not
// start SaveModule task - it will only happen after this func is invoked
// inside RunAllTasks
Func<Task> createSaveModuleTask = () => SaveModule(setting);
tasksQueue.Enqueue(createSaveModuleTask);
tasksQueue.Enqueue(() => SaveModule(GetAdvancedData(setting)));
// no DB operations started at this point
// this will start tasks from the queue one by one.
await RunAllTasks();
Using ConcurrentQueue would likely be the right thing in actual code. You would also need to know the total number of expected operations, so you can stop once all of them have been started and awaited one after another.
Building on your comment under Alexei's answer: your approach with SemaphoreSlim is correct.
Assuming that the methods SendInstrumentSettingsToDS and SendModuleDataToDSAsync are members of the same class, you simply need an instance field holding a SemaphoreSlim. At the start of each method that needs synchronization, call await _semaphore.WaitAsync(), and call _semaphore.Release() in the finally block. (The field cannot be named lock, because lock is a reserved keyword in C#.)
private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1, 1);
public async Task SendModuleDataToDSAsync(Module parameters)
{
await _semaphore.WaitAsync();
try
{
...
}
finally
{
_semaphore.Release();
}
}
private async Task SendInstrumentSettingsToDS(<param1>, <param2>)
{
await _semaphore.WaitAsync();
try
{
...
}
finally
{
_semaphore.Release();
}
}
It is important that the call to _semaphore.Release() is in the finally block, so that the semaphore is released even if an exception is thrown somewhere in the try block.
How can I get an iterator to use in Task.WaitAll, to wait for all the tasks in customList with the lowest overhead or fewest lines of code?
public class Custom
{
public Task task;
public int result;
}
public class Main
{
void doSomething()
{
List<Custom> customList = new List<Custom>();
//customList.Add(custom1,2,3,4,5.....);
Task.WaitAll(????)
}
}
Tasks aren't threads. They represent the execution of a function, so there is no point in storing them and their results as classes. Getting the results of multiple tasks is very easy if you use Task.WhenAll.
There is no reason to wrap the task in a class. You can think of Task itself as the wrapper over a thread invocation and its result.
If you want to retrieve the contents from multiple pages, you can use LINQ and a single HttpClient to request the content from many pages at once. await Task.WhenAll will return the results from all tasks:
string[] myUrls=...;
HttpClient client=new HttpClient();
var tasks=myUrls.Select(url=>client.GetStringAsync(url));
string[] pages=await Task.WhenAll(tasks);
If you want to return more data, eg check the contents, you can do so inside the Select lambda:
var tasks=myUrls.Select(async url=>
{
var response=await client.GetStringAsync(url);
return new {
Url=url,
IsSensitive=response.Contains("sensitive"),
Response=response
};
});
var results =await Task.WhenAll(tasks);
Results contains the anonymous objects generated inside Select. You can iterate or query them just like any other array, eg:
foreach(var result in results)
{
Console.WriteLine(result.Url);
}
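If the wrapper class has to stay, the tasks can still be projected out of the list with LINQ. This is a small sketch that reuses the Custom class from the question (with the task field given a default value so it is never null); CustomListExample and its method names are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class Custom
{
    public Task task = Task.CompletedTask;
    public int result;
}

public static class CustomListExample
{
    // Asynchronous wait: Task.WhenAll accepts an IEnumerable<Task> directly.
    public static Task WhenAllAsync(IEnumerable<Custom> customList) =>
        Task.WhenAll(customList.Select(c => c.task));

    // Blocking wait: Task.WaitAll needs an array, hence ToArray().
    public static void WaitAll(IEnumerable<Custom> customList) =>
        Task.WaitAll(customList.Select(c => c.task).ToArray());
}
```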
I have made a class to handle multiple HTTP GET requests. It looks something like this:
public partial class MyHttpClass : IDisposable
{
private HttpClient theClient;
private string ApiBaseUrl = "https://example.com/";
public MyHttpClass()
{
this.theClient = new HttpClient();
this.theClient.BaseAddress = new Uri(ApiBaseUrl);
this.theClient.DefaultRequestHeaders.Accept.Clear();
this.theClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
}
public async Task<JObject> GetAsync(string reqUrl)
{
var returnObj = new JObject();
var response = await this.theClient.GetAsync(reqUrl);
if (response.IsSuccessStatusCode)
{
returnObj = await response.Content.ReadAsAsync<JObject>();
Console.WriteLine("GET successful");
}
else
{
Console.WriteLine("GET failed");
}
return returnObj;
}
public void Dispose()
{
theClient.Dispose();
}
}
I am then queueing multiple requests by looping over Task.Run() and then calling Task.WaitAll(), like this:
public async Task Start()
{
foreach(var item in list)
{
taskList.Add(Task.Run(() => this.GetThing(item)));
}
Task.WaitAll(taskList.ToArray());
}
public async Task GetThing(string url)
{
var response = await this.theClient.GetAsync(url);
// some code to process and save response
}
It definitely works faster than synchronous operation, but it is not as fast as I expected. Based on other advice, I think the local thread pool is slowing me down. MSDN suggests I should specify it as a long-running task, but I can't see a way to do that when calling it like this.
Right now I haven't got into limiting threads, I am just doing batches and testing speed to discover the right approach.
Can anyone suggest some areas for me to look at to increase the speed?
So, after you've set your DefaultConnectionLimit to a nice high number, or just the ConnectionLimit of the ServicePoint that manages connections to the host you are hitting:
ServicePointManager
.FindServicePoint(new Uri("https://example.com/"))
.ConnectionLimit = 1000;
the only suspect bit of code is where you start everything...
public async Task Start()
{
foreach(var item in list)
{
taskList.Add(Task.Run(() => this.GetThing(item)));
}
Task.WaitAll(taskList.ToArray());
}
This can be reduced to
var tasks = list.Select(this.GetThing);
to create the tasks (your async methods return hot (running) tasks... no need to double wrap with Task.Run)
Then, rather than blocking while waiting for them to complete, wait asynchronously instead:
await Task.WhenAll(tasks);
You are probably hitting some overhead in creating multiple instance-based HttpClient vs using a static instance. Your implementation will not scale. Using a shared HttpClient is actually recommended.
See my answer for why: What is the overhead of creating a new HttpClient per call in a WebAPI client?
I have a process where I need to make ~100 HTTP API calls to a server and process the results. I've put together this CommandExecutor, which builds a list of commands and then runs them asynchronously. Making about 100 calls and parsing the results takes over 1 minute. One request using a browser gives me a response in ~100 ms. You would think that ~100 calls would take around 10 seconds. I believe that I am doing something wrong and that this should go much faster.
public static class CommandExecutor
{
private static readonly ThreadLocal<List<Command>> CommandsToExecute =
new ThreadLocal<List<Command>>(() => new List<Command>());
private static readonly ThreadLocal<List<Task<List<Candidate>>>> Tasks =
new ThreadLocal<List<Task<List<Candidate>>>>(() => new List<Task<List<Candidate>>>());
public static void ExecuteLater(Command command)
{
CommandsToExecute.Value.Add(command);
}
public static void StartExecuting()
{
foreach (var command in CommandsToExecute.Value)
{
Tasks.Value.Add(Task.Factory.StartNew<List<Candidate>>(command.GetResult));
}
Task.WaitAll(Tasks.Value.ToArray());
}
public static List<Candidate> Result()
{
return Tasks.Value.Where(x => x.Result != null)
.SelectMany(x => x.Result)
.ToList();
}
}
The Command that I am passing into this list creates a new HttpClient, calls GetAsync on that client with a URL, converts the string response to an object, then hydrates a field.
protected void Initialize()
{
_httpClient = new HttpClient();
_httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("text/plain"));
}
protected override void Execute()
{
Initialize();
var task = _httpClient.GetAsync(string.Format(Url, Input));
Result = ConvertResponseToObjectAsync(task).Result;
Result.ForEach(x => x.prop = value);
}
private static Task<Model> ConvertResponseToObjectAsync(Task<HttpResponseMessage> task)
{
return task.Result.Content.ReadAsAsync<Model>(
new MediaTypeFormatter[]
{
new Formatter()
});
}
Can you spot my bottleneck, or do you have any suggestions on how to speed this up?
EDIT
Making these changes brought it down to 4 seconds.
protected override void Execute()
{
Initialize();
_httpClient.GetAsync(string.Format(Url, Input))
.ContinueWith(httpResponse => ConvertResponseToObjectAsync(httpResponse)
.ContinueWith(ProcessResult));
}
protected void ProcessResult(Task<Model> model)
{
Result = model.Result;
Result.ForEach(x => x.prop = value);
}
Stop creating new HttpClient instances. Every time you dispose an HttpClient instance, it closes the TCP/IP connection. Create one HttpClient instance and reuse it for every request. HttpClient can make multiple requests on multiple different threads at the same time.
Avoid the use of task.Result in ConvertResponseToObjectAsync and then again in Execute. Instead chain these on to the original GetAsync task with ContinueWith.
As it stands today, Result will block execution of the current thread until the other task finishes. However, your threadpool will quickly get backed up by tasks waiting on other tasks that have nowhere to run. Eventually (after waiting for a second), the threadpool will add an additional thread to run and so this will eventually finish, but it's hardly efficient.
As a general principle, you should avoid ever accessing Task.Result except in a task continuation.
As a bonus, you probably don't want to be using ThreadLocal<T>. ThreadLocal<T> stores a separate instance of the item for each thread that accesses it. In this case, it looks like you want a thread-safe but shared form of storage. I would recommend ConcurrentQueue for this sort of thing.
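Putting these points together, here is a sketch of how the commands could run with one shared HttpClient and no blocking on Result. It uses async/await rather than the ContinueWith chaining suggested above, which gives the same non-blocking behaviour in a more readable form. CommandRunner, RunAllAsync and the convert delegate are hypothetical names standing in for the question's Command and ConvertResponseToObjectAsync.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

public static class CommandRunner
{
    // One HttpClient for the whole process, reused for every request.
    private static readonly HttpClient Client = new HttpClient();

    // Starts all requests, converts each response, and returns the results
    // in the same order as the input URLs. No thread blocks on Task.Result.
    public static async Task<List<TResult>> RunAllAsync<TResult>(
        IEnumerable<string> urls,
        Func<HttpResponseMessage, Task<TResult>> convert)
    {
        IEnumerable<Task<TResult>> tasks = urls.Select(async url =>
        {
            using HttpResponseMessage response =
                await Client.GetAsync(url).ConfigureAwait(false);
            return await convert(response).ConfigureAwait(false);
        });

        TResult[] results = await Task.WhenAll(tasks).ConfigureAwait(false);
        return results.ToList();
    }
}
```

Because Task.WhenAll hands back the results as an array, no shared collection is needed to gather them in this shape.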