I'm not sure if I'm misunderstanding the usage of Task.WhenAny, but in the following code only "0" gets printed, when it should print "1" and "2" and then mx.Name every time a task finishes:
public async void PopulateMxRecords(List<string> nsRecords, int threads)
{
ThreadPool.SetMinThreads(threads, threads);
var resolver = new DnsStubResolver();
var tasks = nsRecords.Select(ns => resolver.ResolveAsync<MxRecord>(ns, RecordType.Mx));
Console.WriteLine("0");
var finished = Task.WhenAny(tasks);
Console.WriteLine("1");
while (mxNsRecords.Count < nsRecords.Count)
{
Console.WriteLine("2");
var task = await finished;
var mxRecords = await task;
foreach(var mx in mxRecords)
Console.WriteLine(mx.Name);
}
}
The DnsStubResolver is part of ARSoft.Tools.Net.Dns. The nsRecords list contains up to 2 million strings.
I'm not sure if I'm misunderstanding the usage of Task.WhenAny
You might be. The pattern you seem to be looking for is interleaving. In the following example, notice the important changes that I have made:
use ToList() to materialize the LINQ query results,
move WhenAny() into the loop,
use Remove(task) as each task completes, and
run the while loop as long as tasks.Count() > 0.
Those are the important changes. The other changes are there to make your listing into a runnable demo of interleaving, the full listing of which is here: https://dotnetfiddle.net/nr1gQ7
public static async Task PopulateMxRecords(List<string> nsRecords)
{
var tasks = nsRecords.Select(ns => ResolveAsync(ns)).ToList();
while (tasks.Count() > 0)
{
var task = await Task.WhenAny(tasks);
tasks.Remove(task);
var mxRecords = await task;
Console.WriteLine(mxRecords);
}
}
I'm trying to make this code work in an async way, but I have some doubts.
public async Task<string> GetAsync(string inputA, string inputB)
{
var result = await AnotherGetAsync(inputA, inputB)
.ConfigureAwait(false);
return result.Enabled
? inputB
: string.Empty;
}
I have a collection of string inputs, and I would like to run this method on each of them.
Then I would do Task.WhenAll and filter for the non-empty strings.
But it won't be async, because inside the method I'm already awaiting the result, right?
I assumed the real question is:
If a single item is awaited inside method A, will this run sequentially if I use Task.WhenAll for a range of items calling method A?
They will run concurrently with Task.WhenAll.
Perhaps it is best explained with an example:
void Main()
{
Test().GetAwaiter().GetResult();
}
private async Task<bool> CheckEmpty(string input)
{
await Task.Delay(200);
return String.IsNullOrEmpty(input);
}
private async Task Test()
{
var list = new List<string>
{
"1",
null,
"2",
""
};
var stopwatch = new Stopwatch();
stopwatch.Start();
// This takes approx 4 * 200ms == 800ms to complete.
foreach (var itm in list)
{
Console.WriteLine(await CheckEmpty(itm));
}
Console.WriteLine(stopwatch.Elapsed);
Console.WriteLine();
stopwatch.Reset();
stopwatch.Start();
// This takes approx 200ms to complete.
var tasks = list.Select(itm => CheckEmpty(itm));
var results = await Task.WhenAll(tasks); // Runs all at once.
Console.WriteLine(String.Join(Environment.NewLine, results));
Console.WriteLine(stopwatch.Elapsed);
}
My test results:
False
True
False
True
00:00:00.8006182
False
True
False
True
00:00:00.2017568
If we break this down:
AnotherGetAsync(inputA, inputB)
Returns a Task. Using the await keyword implicitly returns another Task to the code calling GetAsync(string inputA, string inputB).
Using the await keyword will free up the current thread, which is what makes the code asynchronous.
Once the Task returned by AnotherGetAsync(inputA, inputB) is complete, result will be set to that task's Result.
The remainder of the method will then resume:
return result.Enabled
? inputB
: string.Empty;
This sets the Result of the Task returned from GetAsync.
If GetAsync is to be run multiple times, you can use Task.WhenAll to wait for each resulting Task to complete:
Task.WhenAll(yourStringCollection.Select(s => GetAsync(s, "another string")));
Task.WhenAll itself returns another Task that will complete when all the GetAsync tasks have completed:
var strings = await Task.WhenAll(yourStringCollection.Select(s => GetAsync(s, "another string")));
var nonEmptyStrings = strings.Where(s => s != string.Empty);
No, it will be async. When you await a task inside an async method, that task starts and is awaited as soon as the async method is invoked; it does not stop several invocations of the method from running concurrently. Here is your example simplified, using explicit variables for the created tasks:
public async Task<string> GetAsync()
{
var innerTask = InnerGetAsync();
var result = await innerTask;
var stringResult = result.ToString();
return stringResult;
}
Later you create two tasks by invoking the GetAsync() method:
var task1 = GetAsync();
var task2 = GetAsync();
At this point the two tasks are running concurrently. Each one has internally invoked the InnerGetAsync() method, which means that the two innerTasks are also up and running, and awaited. Any code that follows the await will not run before the innerTask has completed.
The completion of each outer task (task1 and task2) is dependent on the completion of its innerTask. Any code that will await task1, will also implicitly await innerTask as well.
After creating the two tasks, you combine them with Task.WhenAll, then do something else, then await the combined task, and finally process the results:
Task<string[]> whenAllTask = Task.WhenAll(task1, task2);
DoSomethingElse();
string[] results = await whenAllTask;
ProcessResults(results);
It is important to note that the DoSomethingElse() method runs before the two tasks are completed. So at this point three things are happening concurrently: the DoSomethingElse() and the two active tasks. The await inside the GetAsync() method is local to that method, and does not mean that the generated outer task will be automatically awaited upon creation. To process the results of task1 and task2 you must first await them as well.
I have an IEnumerable<Task>, where each Task will call the same endpoint. However, the endpoint can only handle so many calls per second. How can I put, say, a half second delay between each call?
I have tried adding Task.Delay(), but of course awaiting them simply means that the app waits a half second before sending all the calls at once.
Here is a code snippet:
var resultTasks = orders
.Select(async task =>
{
var result = new VendorTaskResult();
try
{
result.Response = await result.CallVendorAsync();
}
catch(Exception ex)
{
result.Exception = ex;
}
return result;
} );
var results = Task.WhenAll(resultTasks);
I feel like I should do something like
Task.WhenAll(resultTasks.EmitOverTime(500));
... but how exactly do I do that?
What you describe in your question is, in other words, rate limiting. You'd like to apply a rate-limiting policy on your client, because the API you use enforces such a policy on the server to protect itself from abuse.
While you could implement rate limiting yourself, I'd recommend going with some well-established solution. RateLimiter from David Desmaisons was the one that I picked at random and I instantly liked it. It had solid documentation, good test coverage and was easy to use. It is also available as a NuGet package.
Check out the simple snippet below that demonstrates running semi-overlapping tasks in sequence while deferring each task's start by half a second after the immediately preceding task started. Each task lasts at least 750 ms.
using ComposableAsync;
using RateLimiter;
using System;
using System.Threading.Tasks;
namespace RateLimiterTest
{
class Program
{
static void Main(string[] args)
{
Log("Starting tasks ...");
var constraint = TimeLimiter.GetFromMaxCountByInterval(1, TimeSpan.FromSeconds(0.5));
var tasks = new[]
{
DoWorkAsync("Task1", constraint),
DoWorkAsync("Task2", constraint),
DoWorkAsync("Task3", constraint),
DoWorkAsync("Task4", constraint)
};
Task.WaitAll(tasks);
Log("All tasks finished.");
Console.ReadLine();
}
static void Log(string message)
{
Console.WriteLine(DateTime.Now.ToString("HH:mm:ss.fff ") + message);
}
static async Task DoWorkAsync(string name, IDispatcher constraint)
{
await constraint;
Log(name + " started");
await Task.Delay(750);
Log(name + " finished");
}
}
}
Sample output:
10:03:27.121 Starting tasks ...
10:03:27.154 Task1 started
10:03:27.658 Task2 started
10:03:27.911 Task1 finished
10:03:28.160 Task3 started
10:03:28.410 Task2 finished
10:03:28.680 Task4 started
10:03:28.913 Task3 finished
10:03:29.443 Task4 finished
10:03:29.443 All tasks finished.
If you change the constraint to allow maximum two tasks per second (var constraint = TimeLimiter.GetFromMaxCountByInterval(2, TimeSpan.FromSeconds(1));), which is not the same as one per half a second, then the output could be like:
10:06:03.237 Starting tasks ...
10:06:03.264 Task1 started
10:06:03.268 Task2 started
10:06:04.026 Task2 finished
10:06:04.031 Task1 finished
10:06:04.275 Task3 started
10:06:04.276 Task4 started
10:06:05.032 Task4 finished
10:06:05.032 Task3 finished
10:06:05.033 All tasks finished.
Note that the current version of RateLimiter targets .NET Framework 4.7.2+ or .NET Standard 2.0+.
This is just a thought, but another approach could be to create a queue and add a background worker that polls the queue for calls that need to go out to your endpoint.
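A minimal sketch of that idea, assuming a fixed half-second gap between dequeues; the ThrottledSender and CallEndpointAsync names are placeholders, not anything from the original code:
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public class ThrottledSender
{
    private readonly ConcurrentQueue<string> _queue = new ConcurrentQueue<string>();

    public void Enqueue(string payload) => _queue.Enqueue(payload);

    // Single consumer loop: sends at most one call per half second.
    public async Task RunAsync(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            if (_queue.TryDequeue(out var payload))
            {
                await CallEndpointAsync(payload); // hypothetical endpoint call
            }
            await Task.Delay(500, token);
        }
    }

    private Task CallEndpointAsync(string payload) => Task.CompletedTask; // placeholder
}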
Have you considered just turning that into a foreach loop with a Task.Delay call? You seem to want to call them sequentially anyway, and it won't hurt if that is obvious from your code.
var results = new List<YourResultType>();
foreach (var order in orders)
{
    var result = new VendorTaskResult();
    try
    {
        result.Response = await result.CallVendorAsync();
        results.Add(result.Response);
    }
    catch (Exception ex)
    {
        result.Exception = ex;
    }
    await Task.Delay(500); // wait half a second between calls
}
Instead of selecting from orders you could loop over them, start each call inside the loop, add the resulting task to a list, and then call Task.WhenAll.
Would look something like:
var resultTasks = new List<Task<VendorTaskResult>>(orders.Count);
foreach (var order in orders)
{
    resultTasks.Add(Task.Run(async () =>
    {
        var result = new VendorTaskResult();
        try
        {
            result.Response = await result.CallVendorAsync();
        }
        catch (Exception ex)
        {
            result.Exception = ex;
        }
        return result;
    }));
    await Task.Delay(x); // pause between starting calls instead of blocking with Thread.Sleep
}
var results = await Task.WhenAll(resultTasks);
If you want to control the number of requests executed simultaneously, you have to use a semaphore.
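For completeness, here is a minimal sketch of that approach using SemaphoreSlim; the Throttler name and the callAsync delegate are illustrative, not from the original code. Note that this caps how many calls are in flight at once rather than spacing out their start times.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Throttler
{
    // Runs callAsync over all items with at most maxConcurrency calls in flight at any time.
    public static async Task<TResult[]> RunThrottledAsync<TItem, TResult>(
        IEnumerable<TItem> items,
        Func<TItem, Task<TResult>> callAsync,
        int maxConcurrency)
    {
        using (var semaphore = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                await semaphore.WaitAsync();
                try
                {
                    return await callAsync(item);
                }
                finally
                {
                    semaphore.Release();
                }
            });
            return await Task.WhenAll(tasks);
        }
    }
}
Usage for the question's scenario might look like await Throttler.RunThrottledAsync(orders, order => CallVendorAsync(order), 5), assuming a CallVendorAsync wrapper along the lines of the one in the question.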
I have something very similar, and it works fine for me. Please note that I call ToArray() after the LINQ query; that is what actually starts the tasks:
using (HttpClient client = new HttpClient()) {
IEnumerable<Task<string>> _downloads = _group
.Select(async job => {
await Task.Delay(300);
return await client.GetStringAsync(<url with variable job>);
});
Task<string>[] _downloadTasks = _downloads.ToArray();
_pages = await Task.WhenAll(_downloadTasks);
}
Now please note that this will create n tasks, all running in parallel, and the Task.Delay does essentially nothing to space them out. If you want to call the pages one after another (as it sounds from putting a delay between the calls), then this code may be better:
using (HttpClient client = new HttpClient()) {
foreach (string job in _group) {
await Task.Delay(300);
_pages.Add(await client.GetStringAsync(<url with variable job>));
}
}
The download of the pages is still asynchronous (other work can run while a page downloads), but the calls are made sequentially, ensuring that one download finishes before the next one starts.
The code can easily be changed to download the pages asynchronously in chunks; the sample below uses chunks of 20 with a 5-second pause between chunks:
IEnumerable<string[]> toParse = myData
.Select((v, i) => new { v.code, group = i / 20 })
.GroupBy(x => x.group)
.Select(g => g.Select(x => x.code).ToArray());
using (HttpClient client = new HttpClient()) {
foreach (string[] _group in toParse) {
string[] _pages = null;
IEnumerable<Task<string>> _downloads = _group
.Select(job => {
return client.GetStringAsync(<url with job>);
});
Task<string>[] _downloadTasks = _downloads.ToArray();
_pages = await Task.WhenAll(_downloadTasks);
await Task.Delay(5000);
}
}
All this does is group your pages in chunks of 20, iterate through the chunks, download all pages of the chunk asynchronously, wait 5 seconds, move on to the next chunk.
I hope that is what you were waiting for :)
The proposed method EmitOverTime is doable, but only by blocking the current thread:
public static IEnumerable<Task<TResult>> EmitOverTime<TResult>(
this IEnumerable<Task<TResult>> tasks, int delay)
{
foreach (var item in tasks)
{
Thread.Sleep(delay); // Delay by blocking
yield return item;
}
}
Usage:
var results = await Task.WhenAll(resultTasks.EmitOverTime(500));
Probably better is to create a variant of Task.WhenAll that accepts a delay argument and delays asynchronously:
public static async Task<TResult[]> WhenAllWithDelay<TResult>(
IEnumerable<Task<TResult>> tasks, int delay)
{
var tasksList = new List<Task<TResult>>();
foreach (var task in tasks)
{
await Task.Delay(delay).ConfigureAwait(false);
tasksList.Add(task);
}
return await Task.WhenAll(tasksList).ConfigureAwait(false);
}
Usage:
var results = await WhenAllWithDelay(resultTasks, 500);
This design implies that the enumerable of tasks should be enumerated only once. It is easy to forget this during development, and start enumerating it again, spawning a new set of tasks. For this reason I propose to make it an OnlyOnce enumerable, as it is shown in this question.
Update: I should mention why the above methods work, and under what premise. The premise is that the supplied IEnumerable<Task<TResult>> is deferred, in other words non-materialized. At the method's start there are no tasks created yet. The tasks are created one after the other during the enumeration of the enumerable, and the trick is that the enumeration is slow and controlled. The delay inside the loop ensures that the tasks are not created all at once. They are created hot (in other words already started), so at the time the last task has been created some of the first tasks may have already been completed. The materialized list of half-running/half-completed tasks is then passed to Task.WhenAll, that waits for all to complete asynchronously.
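Since the linked question isn't reproduced here, a minimal sketch of what such an only-once guard might look like (the OnlyOnceEnumerable name is illustrative): it simply throws if anyone tries to enumerate the sequence a second time.
using System;
using System.Collections;
using System.Collections.Generic;

public class OnlyOnceEnumerable<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> _source;
    private bool _enumerated;

    public OnlyOnceEnumerable(IEnumerable<T> source) => _source = source;

    public IEnumerator<T> GetEnumerator()
    {
        if (_enumerated)
            throw new InvalidOperationException("This sequence can only be enumerated once.");
        _enumerated = true;
        return _source.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}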
I have two services that ultimately both update the same object, so we have a test to ensure that the writes to that object complete (under the hood we have retry policies on each).
9 times out of 10, one or more of the theories will fail, with task.ShouldNotBeNull(); always being the assertion that fails. What am I getting wrong with the async code in this sample? Why would the task be null?
[Theory]
[InlineData(1)]
[InlineData(5)]
[InlineData(10)]
[InlineData(20)]
public async Task ConcurrencyIssueTest(int iterations)
{
var orderResult = await _driver.PlaceOrder();
var tasksA = new List<Task<ApiResponse<string>>>();
var tasksB = new List<Task<ApiResponse<string>>>();
await Task.Run(() => Parallel.For(1, iterations,
x =>
{
tasksA.Add(_Api.TaskA(orderResult.OrderId));
tasksB.Add(_Api.TaskB(orderResult.OrderId));
}));
//Check all tasks return successful
foreach (var task in tasksA)
{
task.ShouldNotBeNull();
var result = task.GetAwaiter().GetResult();
result.ShouldNotBeNull();
result.StatusCode.ShouldBe(HttpStatusCode.OK);
}
foreach (var task in tasksB)
{
task.ShouldNotBeNull();
var result = task.GetAwaiter().GetResult();
result.ShouldNotBeNull();
result.StatusCode.ShouldBe(HttpStatusCode.OK);
}
}
}
There's no need for Task.Run and Parallel looping here. I'm presuming that your _Api calls are IO-bound? You want something more like this:
var tasksA = new List<Task<ApiResponse<string>>>();
var tasksB = new List<Task<ApiResponse<string>>>();
//fire off all the async tasks
for (var i = 0; i < iterations; i++)
{
    tasksA.Add(_Api.TaskA(orderResult.OrderId));
    tasksB.Add(_Api.TaskB(orderResult.OrderId));
}
//await the results
await Task.WhenAll(tasksA).ConfigureAwait(false);
foreach (var task in tasksA)
{
    //no need for GetAwaiter().GetResult(), you've already awaited above
    var result = task.Result;
    result.ShouldNotBeNull();
}
//to get the most out of the async, only await them just before you need them
await Task.WhenAll(tasksB).ConfigureAwait(false);
foreach (var task in tasksB)
{
    var result = task.Result;
    result.ShouldNotBeNull();
}
This will fire off all your API calls asynchronously and then asynchronously wait while the results come back. Your Parallel.For and Task.Run are just using additional thread pool threads for zero benefit.
If _Api were CPU bound you could get some benefit from Task.Run, but I'm guessing these are web API calls or something similar, so the Task.Run is doing nothing but using an additional thread.
As others have suggested, remove the Parallel.For, and wait for all the tasks to finish before asserting on them.
I would also recommend removing .Result from each task, and awaiting them instead.
public async Task ConcurrencyIssueTest(int iterations)
{
var orderResult = await _driver.PlaceOrder();
var taskA = _Api.TaskA(orderResult.OrderId);
var taskB = _Api.TaskB(orderResult.OrderId);
await Task.WhenAll(taskA, taskB);
var taskAResult = await taskA;
taskAResult.ShouldNotBeNull();
taskAResult.StatusCode.ShouldBe(HttpStatusCode.OK);
var taskBResult = await taskB;
taskBResult.ShouldNotBeNull();
taskBResult.StatusCode.ShouldBe(HttpStatusCode.OK);
}
Say I had a list of algorithms, each containing an async call somewhere in the body of that algorithm. The order in which I execute the algorithms is the order in which I want to receive the results. That is, I want the AlgorithmResult list to look like {Algorithm1Result, Algorithm2Result, Algorithm3Result} after all the algorithms have executed. Would I be right in saying that if Algorithm1 and 3 finished before 2, my results would actually be in the order {Algorithm1Result, Algorithm3Result, Algorithm2Result}?
var algorithms = new List<Algorithm>(){Algorithm1, Algorithm2, Algorithm3};
var algorithmResults = new List<AlgorithmResults>();
foreach (var algorithm in algorithms)
{
algorithmResults.Add(await algorithm.Execute());
}
No, the results would be in the same order you added them to the list, since each operation is awaited separately.
class Program
{
public static async Task<int> GetResult(int timing)
{
return await Task<int>.Run(() =>
{
Thread.Sleep(timing * 1000);
return timing;
});
}
public static async Task<List<int>> GetAll()
{
List<int> tasks = new List<int>();
tasks.Add(await GetResult(3));
tasks.Add(await GetResult(2));
tasks.Add(await GetResult(1));
return tasks;
}
static void Main(string[] args)
{
var res = GetAll().Result;
}
}
res will contain the results in the order they were added; note also that this is not parallel execution.
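For comparison, a parallel variant of the same GetAll is sketched below (not part of the original answer): it starts all three operations before awaiting any of them, so it finishes in roughly the time of the slowest call (about 3 seconds instead of 6), and Task.WhenAll still returns the results in the order the tasks were passed in.
public static async Task<List<int>> GetAllParallel()
{
    // Start all three operations before awaiting any of them.
    Task<int> first = GetResult(3);
    Task<int> second = GetResult(2);
    Task<int> third = GetResult(1);

    // WhenAll preserves the order of the tasks passed in: { 3, 2, 1 }.
    int[] results = await Task.WhenAll(first, second, third);
    return new List<int>(results);
}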
Since you don't add the tasks, but the results of the tasks after awaiting them, they will be in the order you want.
Even if you did not await each result before adding the next task, you could still get the order you want:
In small steps, showing the types explicitly:
List<Task<AlgorithmResult>> tasks = new List<Task<AlgorithmResult>>();
foreach (Algorithm algorithm in algorithms)
{
Task<AlgorithmResult> task = algorithm.Execute();
// don't wait until task Completes, just remember the Task
// and continue executing the next Algorithm
tasks.Add(task);
}
Now some Tasks may be running, some may already have completed. Let's wait until they are all complete, and fetch the results:
await Task.WhenAll(tasks);
List<AlgorithmResult> results = tasks.Select(task => task.Result).ToList();
I have a list of objects that I need to run a long-running process on, and I would like to kick them off asynchronously, then return them as a list to the calling method once they have all finished. I've been trying different approaches that I have found; however, it appears that the processes are still running synchronously, in the order they appear in the list. So I am sure that I am missing something in how to execute a list of tasks.
Here is my code:
public async Task<List<ShipmentOverview>> GetShipmentByStatus(ShipmentFilterModel filter)
{
if (string.IsNullOrEmpty(filter.Status))
{
throw new InvalidShipmentStatusException(filter.Status);
}
var lookups = GetLookups(false, Brownells.ConsolidatedShipping.Constants.ShipmentStatusType);
var lookup = lookups.SingleOrDefault(sd => sd.Name.ToLower() == filter.Status.ToLower());
if (lookup != null)
{
filter.StatusId = lookup.Id;
var shipments = Shipments.GetShipments(filter);
var tasks = shipments.Select(async model => await GetOverview(model)).ToList();
ShipmentOverview[] finishedTask = await Task.WhenAll(tasks);
return finishedTask.ToList();
}
else
{
throw new InvalidShipmentStatusException(filter.Status);
}
}
private async Task<ShipmentOverview> GetOverview(ShipmentModel model)
{
String version;
var user = AuthContext.GetUserSecurityModel(Identity.Token, out version) as UserSecurityModel;
var profile = AuthContext.GetProfileSecurityModel(user.Profiles.First());
var overview = new ShipmentOverview
{
Id = model.Id,
CanView = true,
CanClose = profile.HasFeatureAction("Shipments", "Close", "POST"),
CanClear = profile.HasFeatureAction("Shipments", "Clear", "POST"),
CanEdit = profile.HasFeatureAction("Shipments", "Get", "PUT"),
ShipmentNumber = model.ShipmentNumber.ToString(),
ShipmentName = model.Name,
};
var parcels = Shipments.GetParcelsInShipment(model.Id);
overview.NumberParcels = parcels.Count;
var orders = parcels.Select(s => WareHouseClient.GetOrderNumberFromParcelId(s.ParcelNumber)).ToList();
overview.NumberOrders = orders.Distinct().Count();
//check validations
var vals = Shipments.GetShipmentValidations(model.Id);
if (model.ValidationTypeId == Constants.OrderValidationType)
{
if (vals.Count > 0)
{
overview.NumberOrdersTotal = vals.Count();
overview.NumberParcelsTotal = vals.Sum(s => WareHouseClient.GetParcelsPerOrder(s.ValidateReference));
}
}
return overview;
}
It looks like you're using asynchronous methods while you really want threads.
Asynchronous methods yield control back to the calling method when an async method is called, then wait at the await until the called method has completed. You can see how it works here.
Basically, the only usefulness of async/await methods is to avoid locking the UI, so that it stays responsive.
If you want to fire multiple processings in parallel, you will want to use threads, like such:
using System.Threading.Tasks;
public void MainMethod() {
// Parallel.ForEach will automagically run the "right" number of threads in parallel
Parallel.ForEach(shipments, shipment => ProcessShipment(shipment));
// do something when all shipments have been processed
}
public void ProcessShipment(Shipment shipment) { ... }
Marking the method as async doesn't auto-magically make it execute in parallel. Since you're not using await at all, it will in fact execute completely synchronously as if it wasn't async. You might have read somewhere that async makes functions execute asynchronously, but this simply isn't true - forget it. The only thing it does is build a state machine to handle task continuations for you when you use await and actually build all the code to manage those tasks and their error handling.
If your code is mostly I/O bound, use the asynchronous APIs with await to make sure the methods actually execute in parallel. If they are CPU bound, a Task.Run (or Parallel.ForEach) will work best.
Also, there's no point in doing .Select(async model => await GetOverview(model)). It's almost equivalent to .Select(model => GetOverview(model)). In any case, since the method actually returns an already-completed task (it never truly awaits anything), it will be executed while doing the Select, long before you get to the Task.WhenAll.
Given this, even the GetShipmentByStatus's async is pretty much useless - you only use await to await the Task.WhenAll, but since all the tasks are already completed by that point, it will simply complete synchronously.
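If GetOverview stays as it is (blocking work with no real awaits), a hedged sketch of the Task.Run route for this specific case could look like the following; this is an illustration, not the original code:
// Offload each GetOverview call to the thread pool so the shipments
// are processed in parallel, then wait for all of them to finish.
var tasks = shipments
    .Select(model => Task.Run(() => GetOverview(model)))
    .ToList();

ShipmentOverview[] finished = await Task.WhenAll(tasks);
return finished.ToList();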
If your tasks are CPU bound and not I/O bound, then here is the pattern I believe you're looking for:
static void Main(string[] args) {
Task firstStepTask = Task.Run(() => firstStep());
Task secondStepTask = Task.Run(() => secondStep());
//...
Task finalStepTask = Task.Factory.ContinueWhenAll(
new Task[] { firstStepTask, secondStepTask }, //more if more than two steps...
(previousTasks) => finalStep());
finalStepTask.Wait();
}