How to handle multiple request batch processing using Task in ASP.NET? - c#

I have a list of selected contentIds and for each content id I need to call an api, get the response and then save the received response for each content in DB.
At a time a user can select any number of content ranging from 1-1000 and can pass on this to update the content db details after getting the response from api.
In this situation I end up creating multiple requests for each content.
I thought to go ahead with asp.net async Task operation and then wrote this following method.
The code I wrote currently creates one one task for each contentId and atlast I am waiting from all task to get the response.
Task.WaitAll(allTasks);
public static Task<KeyValuePair<int, string>> GetXXXApiResponse(string url, int contentId)
{
var client = new HttpClient();
return client.GetAsync(url).ContinueWith(task =>
{
var response = task.Result;
var strTask = response.Content.ReadAsStringAsync();
strTask.Wait();
var strResponse = strTask.Result;
return new KeyValuePair<int, string>(contentId, strResponse);
});
}
I am now thinking for each task I create it will create one thread and in turn with the limited no of worker thread this approach will end up taking all the threads, which I don't want to happen.
Can any one help/guide me how to handle this situation effectively i.e handling multiple api requests or kind of batch processing using async tasks etc?
FYI: I'm using .NET framework 4.5

A task is just a representation of an asynchronous operation that can be waited on and cancelled. Creating new tasks doesn't necessarily create new threads (it mostly doesn't). If you use async-await correctly you don't even have a thread during most of the asynchronous operation.
But, making many requests concurrently can still be problematic (e.g. burdening the content server too much). So you may still want to limit the amount of concurrent calls using SemaphoreSlim or TPL Dataflow:
private static readonly SemaphoreSlim _semaphore = new SemaphoreSlim(100);
public static async Task<KeyValuePair<int, string>> GetXXXApiResponse(string url, int contentId)
{
await _semaphore.WaitAsync();
try
{
var client = new HttpClient();
var response = await client.GetAsync(url);
var strResponse = await response.Content.ReadAsStringAsync();
return new KeyValuePair<int, string>(contentId, strResponse);
}
finally
{
_semaphore.Release();
}
}
To wait for these tasks to complete you should use Task.WhenAll to asynchronously wait instead of Task.WaitAll which blocks the calling thread:
await Task.WhenAll(apiResponses);

Related

WebAPI HTTP request not completing until queued work kicks off on background task

In my .Net 6 WebPI service, I am queueing work to a background task queue, very closely based on the example here, but I could post parts of my implementation if that would help:
https://learn.microsoft.com/en-us/aspnet/core/fundamentals/host/hosted-services?view=aspnetcore-6.0&tabs=visual-studio#queued-background-tasks
I am running into unexpected behavior where control is not returned to the caller, even after the return Ok(..) completes in the controller. Instead the request only completes after the await Task.Delay(1000); line is reached on the queued work item. The request returns to the client as soon as this line is reached, and does not need to wait for the Delay to finish.
I'm guessing this is because of the await either starting a new async context, or someone un-sticking the async context of the original request. My intention is for the request to complete immediately after queuing the work item.
Any insight into what is happening here would be greatly appreciated.
Controller:
public async Task<ActionResult> Test()
{
var resultMessage = await _testQueue.DoIt();
return Ok(new { Message = resultMessage });
}
Queueing Work:
public TestAdapter(ITaskQueue taskQueue)
{
_taskQueue = taskQueue;
}
public async Task<string> DoIt()
{
await _taskQueue.QueueBackgroundWorkItemAsync(async (_cancellationToken) =>
{
await Task.Delay(1000);
var y = 12;
});
return "cool";
}
IoC:
services.AddHostedService<QueueHostedService>();
services.AddSingleton<ITTaskQueue>(ctx =>
{
return new TaskQueue(MaxQueueCount);
});
TaskQueue:
private readonly Channel<BackgroundTaskContext> _queue;
public TaskQueue(int capacity)
{
var options = new BoundedChannelOptions(capacity)
{
FullMode = BoundedChannelFullMode.Wait
};
_queue = Channel.CreateBounded<BackgroundTaskContext>(options);
}
public async ValueTask QueueBackgroundWorkItemAsync(
Func<CancellationToken, ValueTask> workItem)
{
if (workItem == null)
{
throw new ArgumentNullException(nameof(workItem));
}
await _queue.Writer.WriteAsync(new BackgroundTaskContext(workItem, ...));
}
Not sure what you expect here. I'm assuming you want the async method to return the cool in the api response. That's fine but because your also awaiting the async call with in DoIt(), then it pauses until QueueBackgroundWorkItemAsync finishes. You could remove the await and it will run and return as you expect.
I can't say that I'm a big fan of that design as you lose contact with it with exception of the cancellation token. Another thought would be to Send that work off to a console job or function app using message bus or even another http call.
Additional Notes:
Async can be complicated to explain because in reality, it wraps up code and executes on a thread of it's choosing. The await simulates the synchronous behavior.
await Task.Delay(1000); // Runs on it's own thread but still halts code execution for 1 second.
await _taskQueue.QueueBackgroundWorkItemAsync(async (_cancellationToken) // Waits for control to be returned from the code inside.
var resultMessage = await _testQueue.DoIt(); // Always waits for the code inside to complete.
If your wanting something to run without pausing code execution, you can either remove the await or add a Task.Run(() => { }); pattern. Is it a good idea is a whole other question. It also matters whether you need information back from the async method. If you don't await it then you'll get null back as it doesn't wait around for the answer to be computed.
This appears just to have been user error using the debugger. The debugger is switching to the background task thread and hitting breakpoints there before the response fully returns giving the appearance that control was not being returned to the client and being carried into the background task.
Even after adding some synchronous steps in QueueBackgroundWorkItemAsync and putting breakpoints on them, control was not immediately returned to the original http call. Only after I tried adding a long running task like await Task.Delay(1000); did enough time/ticks pass for the http response to return. I had conflated this with just the await somehow freeing up the original http context.

C# Add to a List Asynchronously in API

I have an API which needs to be run in a loop for Mass processing.
Current single API is:
public async Task<ActionResult<CombinedAddressResponse>> GetCombinedAddress(AddressRequestDto request)
We are not allowed to touch/modify the original single API. However can be run in bulk, using foreach statement. What is the best way to run this asychronously without locks?
Current Solution below is just providing a list, would this be it?
public async Task<ActionResult<List<CombinedAddressResponse>>> GetCombinedAddress(List<AddressRequestDto> requests)
{
var combinedAddressResponses = new List<CombinedAddressResponse>();
foreach(AddressRequestDto request in requests)
{
var newCombinedAddress = (await GetCombinedAddress(request)).Value;
combinedAddressResponses.Add(newCombinedAddress);
}
return combinedAddressResponses;
}
Update:
In debugger, it has to go to combinedAddressResponse.Result.Value
combinedAddressResponse.Value = null
and Also strangely, writing combinedAddressResponse.Result.Value gives error below "Action Result does not contain a definition for for 'Value' and no accessible extension method
I'm writing this code off the top of my head without an IDE or sleep, so please comment if I'm missing something or there's a better way.
But effectively I think you want to run all your requests at once (not sequentially) doing something like this:
public async Task<ActionResult<List<CombinedAddressResponse>>> GetCombinedAddress(List<AddressRequestDto> requests)
{
var combinedAddressResponses = new List<CombinedAddressResponse>(requests.Count);
var tasks = new List<Task<ActionResult<CombinedAddressResponse>>(requests.Count);
foreach (var request in requests)
{
tasks.Add(Task.Run(async () => await GetCombinedAddress(request));
}
//This waits for all the tasks to complete
await tasks.WhenAll(tasks.ToArray());
combinedAddressResponses.AddRange(tasks.Select(x => x.Result.Value));
return combinedAddressResponses;
}
looking for a way to speed things up and run in parallel thanks
What you need is "asynchronous concurrency". I use the term "concurrency" to mean "doing more than one thing at a time", and "parallel" to mean "doing more than one thing at a time using threads". Since you're on ASP.NET, you don't want to use additional threads; you'd want to use a form of concurrency that works asynchronously (which uses fewer threads). So, Parallel and Task.Run should not be parts of your solution.
The way to do asynchronous concurrency is to build a collection of tasks, and then use await Task.WhenAll. E.g.:
public async Task<ActionResult<IReadOnlyList<CombinedAddressResponse>>> GetCombinedAddress(List<AddressRequestDto> requests)
{
// Build the collection of tasks by doing an asynchronous operation for each request.
var tasks = requests.Select(async request =>
{
var combinedAddressResponse = await GetCombinedAdress(request);
return combinedAddressResponse.Value;
}).ToList();
// Wait for all the tasks to complete and get the results.
var results = await Task.WhenAll(tasks);
return results;
}

Getting deadlock with foreach and await

I am trying to call a service, but the service has a max length per request, so I am splitting my request into multiple smaller requests.
I am then trying to to use HttpClient together with Await as its async
public async Task<string> CallGenoskanAsync(List<string> requestList)
{
ServicePointManager.ServerCertificateValidationCallback = delegate { return true; };
var credentials = new NetworkCredential(userId, password);
var tasks = new List<Task<string>>();
foreach (string requestString in requestList)
{
using (HttpClientHandler handler = new HttpClientHandler { Credentials = credentials })
{
using (HttpClient client = new HttpClient(handler))
{
client.BaseAddress = new Uri(baseAddress);
using (HttpResponseMessage response = client.GetAsync(requestString).Result)
using (HttpContent content = response.Content)
{
tasks.Add(content.ReadAsStringAsync());
}
}
}
}
Task.WaitAll(tasks.ToArray());
var result = "";
foreach (var task in tasks)
{
result += task.Result;
}
return result;
}
This code deadlocks when await client.GetAsync is called, it never finishes.
If I change that line to
using (HttpResponseMessage response = client.GetAsync(requestString).Result)
Then I dont get any deadlocks, I suppose I am using await improperly together with foreach, but I can not figure out how
Edit: Changed example code
The compiler will give you a warning for the code you posted; in particular, it will point out that CallGenoskanAsync is synchronous, not asynchronous.
The core problem is this line:
Task.WaitAll(tasks.ToArray());
When you're writing asynchronous code, you shouldn't block on it. To do so can cause deadlocks, as I explain in my blog post. The deadlock occurs because await will capture a "context" that it uses to resume execution of the async method. This "context" is SynchronizationContext.Current or TaskScheduler.Current, and many contexts (UI and ASP.NET request contexts in particular) only allow one thread. So, when your code blocks a thread (Task.WaitAll), it's blocking a thread in that context, and this prevents the await from continuing since it's waiting for that context.
To fix, make your code asynchronous all the way. As I explain in my async intro post, the asynchronous equivalent of Task.WaitAll is await Task.WhenAll:
await Task.WhenAll(tasks);
WhenAll also has the nice property that it unwraps the results for you, so you don't have to use the problematic Result property:
var results = await Task.WhenAll(tasks);
return string.Join("", results);
Relevant context is not provided:
If
Your code sample is part of a callstack that returns a Task
There is a blocking wait at the root of the callstack (e.g.,
DoSomething().Result or DoSomething.Wait())
You have a SynchronizationContext that's thread affinitized (IE, this is running pretty much anywhere but a Console application)
Then
Your synchronization context's thread may be blocked on the root Result/Wait(), so it can never process the continuation that the completed call to GetAsync() scheduled to it. Deadlock.
If you meet the above conditions, try applying .ConfigureAwait(false) to your awaited GetAsync. This will let the continuation get scheduled to a Threadpool thread. Be aware that you are potentially getting scheduled to a different thread at that point.

Confusion with async/await web calls in order

I have a Web API service that is used to retrieve and update a specific set of data (MyDataSet objects), and I am running into some confusion in using async/await when performing the following events:
Update MyDataSet with new values
Get MyDataSet (but only after the new values have been updated)
In my client, I have something similar to the following:
Harmony.cs
private async Task<string> GetDataSet(string request)
{
using(var httpClient = new HttpClient())
{
httpClient.baseAddress = theBaseAddress;
HttpResponseMessage response = await httpClient.GetAsync(request);
response.EnsureSuccessStatusCode();
return response.Content.ReadAsStringAsync().Result;
}
}
private async Task PostDataSet<T>(string request, T data)
{
using (var httpClient = new HttpClient())
{
client.BaseAddress = new Uri(theBaseAddress);
HttpResponseMessage response = await client.PostAsJsonAsync<T>(request, data);
response.EnsureSuccessStatusCode();
}
}
internal MyDataSet GetMyDataSetById(int id)
{
string request = String.Format("api/MyDataSet/GetById/{0}", id);
return JsonConvert.DeserializeObject<MyDataSet>(GetDataSet(request).Result);
}
internal void UpdateDataSet(MyDataSet data)
{
PostDataSet("api/MyDataSet/Update", data);
}
HarmonyScheduler.cs
internal void ScheduleDataSet()
{
MyDataSet data = ...
harmony.UpdateDataSet(data);
MyDataSet data2 = harmony.GetMyDataSetById(data.Id);
}
There is a compiler warning in Harmony.cs UpdateDataSet because the call is not awaited and execution will continue before the call is completed. This is affecting the program execution, because data2 is being retrieved before the update is taking place.
If I were to make UpdateDataSet async and add an await to it, then it just moves things up the stack a level, and now HarmonyScheduler gets the warning about not being awaited.
How do I wait for the update to be complete before retrieving data2, so that I will have the updated values in the data2 object?
How do I wait for the update to be complete before retrieving data2,
so that I will have the updated values in the data2 object?
The thing I see many people don't comprehend is the fact that using the TAP with async-await will infect your code like a plague.
What do I mean by that?
Using the TAP will cause async to bubble up all the way to the top of your call stack, that is why they say async method go "all the way". That is the recommendation for using the pattern. Usually, that means that if you want to introduce an asynchronous API, you'll have to provide it along side a separate synchronous API. Most people try to mix and match between the two, but that causes a whole lot of trouble (and many SO questions).
In order to make things work properly, you'll have to turn UpdateDataSet and GetMyDataSetById to be async as well. The outcome should look like this:
private readonly HttpClient httpClient = new HttpClient();
private async Task<string> GetDataSetAsync(string request)
{
httpClient.BaseAddress = theBaseAddress;
HttpResponseMessage response = await httpClient.GetAsync(request);
response.EnsureSuccessStatusCode();
return await response.Content.ReadAsStringAsync();
}
private async Task PostDataSetAsync<T>(string request, T data)
{
client.BaseAddress = new Uri(theBaseAddress);
HttpResponseMessage response = await client.PostAsJsonAsync<T>(request, data);
response.EnsureSuccessStatusCode();
}
internal async Task<MyDataSet> GetMyDataSetByIdAsync(int id)
{
string request = String.Format("api/MyDataSet/GetById/{0}", id);
return JsonConvert.DeserializeObject<MyDataSet>(await GetDataSetAsync(request));
}
internal Task UpdateDataSetAsync(MyDataSet data)
{
return PostDataSetAsync("api/MyDataSet/Update", data);
}
Note - HttpClient is meant to be reused instead of a disposable, single call object. I would encapsulate as a class level field and reuse it.
If you want to expose a synchronous API, do so using a HTTP library that exposes a synchronous API, such as WebClient.
Always wait on the tasks right before you need its results.
Its always better to wait on a task if it contains an await in its body.
Warning : Do not use .Wait() or .Result
You would understand the concept better if you go through the control flow as explained Control Flow in Async Programs.
So in your case I would make all the functions accessing the async methods GetDataSet(string request) and PostDataSet<T>(string request, T data) with return type Task as awaitable.
Since you dont seem to expect any result back from the PostDataSet function you could just wait for it to complete.
PostDataSet("api/MyDataSet/Update", data).Wait();

How to create HttpWebRequest without interrupting async/await?

I have a bunch of slow functions that are essentially this:
private async Task<List<string>> DownloadSomething()
{
var request = System.Net.WebRequest.Create("https://valid.url");
...
using (var ss = await request.GetRequestStreamAsync())
{
await ss.WriteAsync(...);
}
using (var rr = await request.GetResponseAsync())
using (var ss = rr.GetResponseStream())
{
//read stream and return data
}
}
This works nicely and asynchronously except for the call to WebRequest.Create - this single line freezes the UI thread for several seconds which sort of ruins the purpose of async/await.
I already have this code written using BackgroundWorkers, which works perfectly and never freezes the UI.
Still, what is the correct, idiomatic way to create a web request with respect to async/await? Or maybe there is another class that should be used?
I've seen this nice answer about asyncifying a WebRequest, but even there the object itself is created synchronously.
Interestingly, I'm not seeing a blocking delay with WebRequest.Create or HttpClient.PostAsync. It might be something to do with DNS resolution or proxy configuration, although I'd expect these operations to be implemented internally as asynchronous, too.
Anyway, as a workaround you can start the request on a pool thread, although this is not something I'd normally do:
private async Task<List<string>> DownloadSomething()
{
var request = await Task.Run(() => {
// WebRequest.Create freezes??
return System.Net.WebRequest.Create("https://valid.url");
});
// ...
using (var ss = await request.GetRequestStreamAsync())
{
await ss.WriteAsync(...);
}
using (var rr = await request.GetResponseAsync())
using (var ss = rr.GetResponseStream())
{
//read stream and return data
}
}
That would keep the UI responsive, but it might be difficult to cancel it if user wants to stop the operation. That's because you need to already have a WebRequest instance to be able to call Abort on it.
Using HttpClient, cancellation would be possible, something like this:
private async Task<List<string>> DownloadSomething(CancellationToken token)
{
var httpClient = new HttpClient();
var response = await Task.Run(async () => {
return await httpClient.PostAsync("https://valid.url", token);
}, token);
// ...
}
With HttpClient, you can also register a httpClient.CancelPendingRequests() callback on the cancellation token, like this.
[UPDATE] Based on the comments: in your original case (before introducing Task.Run) you probably did not need the IProgress<I> pattern. As long as DownloadSomething() was called on the UI thread, every execution step after each await inside DownloadSomething would be resumed on the same UI thread, so you could just update the UI directly in between awaits.
Now, to run the whole DownloadSomething() via Task.Run on a pool thread, you would have to pass an instance of IProgress<I> into it, e.g.:
private async Task<List<string>> DownloadSomething(
string url,
IProgress<int> progress,
CancellationToken token)
{
var request = System.Net.WebRequest.Create(url);
// ...
using (var ss = await request.GetRequestStreamAsync())
{
await ss.WriteAsync(...);
}
using (var rr = await request.GetResponseAsync())
using (var ss = rr.GetResponseStream())
{
// read stream and return data
progress.Report(...); // report progress
}
}
// ...
// Calling DownloadSomething from the UI thread via Task.Run:
var progressIndicator = new Progress<int>(ReportProgress);
var cts = new CancellationTokenSource(30000); // cancel in 30s (optional)
var url = "https://valid.url";
var result = await Task.Run(() =>
DownloadSomething(url, progressIndicator, cts.Token), cts.Token);
// the "result" type is deduced to "List<string>" by the compiler
Note, because DownloadSomething is an async method itself, it is now run as a nested task, which Task.Run transparently unwraps for you. More on this: Task.Run vs Task.Factory.StartNew.
Also check out: Enabling Progress and Cancellation in Async APIs.
I think you need to use HttpClient.GetAsync() which returns a task from an HTTP request.
http://msdn.microsoft.com/en-us/library/hh158912(v=vs.110).aspx
It may depend a bit on what you want to return, but the HttpClient has a whole bunch of async methods for requests.

Categories