I have an API which needs to be run in a loop for Mass processing.
Current single API is:
public async Task<ActionResult<CombinedAddressResponse>> GetCombinedAddress(AddressRequestDto request)
We are not allowed to touch/modify the original single API. However can be run in bulk, using foreach statement. What is the best way to run this asychronously without locks?
Current Solution below is just providing a list, would this be it?
public async Task<ActionResult<List<CombinedAddressResponse>>> GetCombinedAddress(List<AddressRequestDto> requests)
{
var combinedAddressResponses = new List<CombinedAddressResponse>();
foreach(AddressRequestDto request in requests)
{
var newCombinedAddress = (await GetCombinedAddress(request)).Value;
combinedAddressResponses.Add(newCombinedAddress);
}
return combinedAddressResponses;
}
Update:
In debugger, it has to go to combinedAddressResponse.Result.Value
combinedAddressResponse.Value = null
and Also strangely, writing combinedAddressResponse.Result.Value gives error below "Action Result does not contain a definition for for 'Value' and no accessible extension method
I'm writing this code off the top of my head without an IDE or sleep, so please comment if I'm missing something or there's a better way.
But effectively I think you want to run all your requests at once (not sequentially) doing something like this:
public async Task<ActionResult<List<CombinedAddressResponse>>> GetCombinedAddress(List<AddressRequestDto> requests)
{
var combinedAddressResponses = new List<CombinedAddressResponse>(requests.Count);
var tasks = new List<Task<ActionResult<CombinedAddressResponse>>(requests.Count);
foreach (var request in requests)
{
tasks.Add(Task.Run(async () => await GetCombinedAddress(request));
}
//This waits for all the tasks to complete
await tasks.WhenAll(tasks.ToArray());
combinedAddressResponses.AddRange(tasks.Select(x => x.Result.Value));
return combinedAddressResponses;
}
looking for a way to speed things up and run in parallel thanks
What you need is "asynchronous concurrency". I use the term "concurrency" to mean "doing more than one thing at a time", and "parallel" to mean "doing more than one thing at a time using threads". Since you're on ASP.NET, you don't want to use additional threads; you'd want to use a form of concurrency that works asynchronously (which uses fewer threads). So, Parallel and Task.Run should not be parts of your solution.
The way to do asynchronous concurrency is to build a collection of tasks, and then use await Task.WhenAll. E.g.:
public async Task<ActionResult<IReadOnlyList<CombinedAddressResponse>>> GetCombinedAddress(List<AddressRequestDto> requests)
{
// Build the collection of tasks by doing an asynchronous operation for each request.
var tasks = requests.Select(async request =>
{
var combinedAddressResponse = await GetCombinedAdress(request);
return combinedAddressResponse.Value;
}).ToList();
// Wait for all the tasks to complete and get the results.
var results = await Task.WhenAll(tasks);
return results;
}
Related
I have the following code, that I intend to run asynchronously. My goal is that GetPictureForEmployeeAsync() is called in parallel as many times as needed. I'd like to make sure that 'await' on CreatePicture does not prevent me from doing so.
public Task<Picture[]> GetPictures(IDictionary<string, string> tags)
{
var query = documentRepository.GetRepositoryQuery();
var employees = query.Where(doc => doc.Gender == tags["gender"]);
return Task.WhenAll(employees.Select(employee => GetPictureForEmployeeAsync(employee, tags)));
}
private Task<Picture> GetPictureForEmployeeAsync(Employee employee, IDictionary<string, string> tags)
{
var base64PictureTask = blobRepository.GetBase64PictureAsync(employee.ID.ToString());
var documentTask = documentRepository.GetItemAsync(employee.ID.ToString());
return CreatePicture(tags, base64PictureTask, documentTask);
}
private static async Task<Picture> CreatePicture(IDictionary<string, string> tags, Task<string> base64PictureTask, Task<Employee> documentTask)
{
var document = await documentTask;
return new Picture
{
EmployeeID = document.ID,
Data = await base64PictureTask,
ID = document.ID.ToString(),
Tags = tags,
};
}
If I understand it correctly, Task.WhenAll() is not affected by the two awaited tasks inside CreatePicture() because GetPictureForEmployeeAsync() is not awaited. Am I right about this? If not, how should I restructure the code to achieve what I want?
I'd like to make sure that 'await' on CreatePicture does not prevent me from doing so.
It doesn't.
If I understand it correctly, Task.WhenAll() is not affected by the two awaited tasks inside CreatePicture() because GetPictureForEmployeeAsync() is not awaited. Am I right about this?
Yes and no. The WhenAll isn't limited in any way by the awaited tasks in CreatePicture, but that has nothing to do with whether GetPictureForEmployeeAsync is awaited or not. These two lines of code are equivalent in terms of behavior:
return Task.WhenAll(employees.Select(employee => GetPictureForEmployeeAsync(employee, tags)));
return Task.WhenAll(employees.Select(async employee => await GetPictureForEmployeeAsync(employee, tags)));
I recommend reading my async intro to get a good understanding of how async and await work with tasks.
Also, since GetPictures has non-trivial logic (GetRepositoryQuery and evaluating tags["gender"]), I recommend using async and await for GetPictures, as such:
public async Task<Picture[]> GetPictures(IDictionary<string, string> tags)
{
var query = documentRepository.GetRepositoryQuery();
var employees = query.Where(doc => doc.Gender == tags["gender"]);
var tasks = employees.Select(employee => GetPictureForEmployeeAsync(employee, tags)).ToList();
return await Task.WhenAll(tasks);
}
As a final note, you may find your code cleaner if you don't pass around "tasks meant to be awaited" - instead, await them first and pass their result values:
async Task<Picture> GetPictureForEmployeeAsync(Employee employee, IDictionary<string, string> tags)
{
var base64PictureTask = blobRepository.GetBase64PictureAsync(employee.ID.ToString());
var documentTask = documentRepository.GetItemAsync(employee.ID.ToString());
await Task.WhenAll(base64PictureTask, documentTask);
return CreatePicture(tags, await base64PictureTask, await documentTask);
}
static Picture CreatePicture(IDictionary<string, string> tags, string base64Picture, Employee document)
{
return new Picture
{
EmployeeID = document.ID,
Data = base64Picture,
ID = document.ID.ToString(),
Tags = tags,
};
}
The thing to keep in mind about calling an async method is that, as soon as an await statement is reached inside that method, control immediately goes back to the code that invoked the async method -- no matter where the await statement happens to be in the method. With a 'normal' method, control doesn't go back to the code that invokes that method until the end of that method is reached.
So in your case, you can do the following:
private async Task<Picture> GetPictureForEmployeeAsync(Employee employee, IDictionary<string, string> tags)
{
// As soon as we get here, control immediately goes back to the GetPictures
// method -- no need to store the task in a variable and await it within
// CreatePicture as you were doing
var picture = await blobRepository.GetBase64PictureAsync(employee.ID.ToString());
var document = await documentRepository.GetItemAsync(employee.ID.ToString());
return CreatePicture(tags, picture, document);
}
Because the first line of code in GetPictureForEmployeeAsync has an await, control will immediately go right back to this line...
return Task.WhenAll(employees.Select(employee => GetPictureForEmployeeAsync(employee, tags)));
...as soon as it is invoked. This will have the effect of all of the employee items getting processed in parallel (well, sort of -- the number of threads that will be allotted to your application will be limited).
As an additional word of advice, if this application is hitting a database or web service to get the pictures or documents, this code will likely cause you issues with running out of available connections. If this is the case, consider using System.Threading.Tasks.Parallel and setting the maximum degree of parallelism, or use SemaphoreSlim to control the number of connections used simultaneously.
Consider this situation:
class Product { }
interface IWorker
{
Task<Product> CreateProductAsync();
}
I am now given an IEnumerable<IWorker> workers and am supposed to create an IEnumerable<Product> from it that I have to pass to some other function that I cannot alter:
void CheckProducts(IEnumerable<Product> products);
This methods needs to have access to the entire IEnumerable<Product>. It is not possible to subdivide it and call CheckProducts on multiple subsets.
One obvious solution is this:
CheckProducts(workers.Select(worker => worker.CreateProductAsync().Result));
But this is blocking, of course, and hence it would only be my last resort.
Syntactically, I need precisely this, just without blocking.
I cannot use await inside of the function I'm passing to Select() as I would have to mark it as async and that would require it to return a Task itself and I would have gained nothing. In the end I need an IEnumerable<Product> and not an IEnumerable<Task<Product>>.
It is important to know that the order of the workers creating their products does matter, their work must not overlap. Otherwise, I would do this:
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
var tasks = workers.Select(worker => worker.CreateProductAsync());
return await Task.WhenAll(tasks);
}
But unfortunately, Task.WhenAll() executes some tasks in parallel while I need them executed sequentially.
Here is one possibility to implement it if I had an IReadOnlyList<IWorker> instead of an IEnumerable<IWorker>:
async Task<IEnumerable<Product>> CreateProductsAsync(IReadOnlyList<IWorker> workers)
{
var resultList = new Product[workers.Count];
for (int i = 0; i < resultList.Length; ++i)
resultList[i] = await workers[i].CreateProductAsync();
return resultList;
}
But I must deal with an IEnumerable and, even worse, it is usually quite huge, sometimes it is even unlimited, yielding workers forever. If I knew that its size was decent, I would just call ToArray() on it and use the method above.
The ultimate solution would be this:
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
foreach (var worker in workers)
yield return await worker.CreateProductAsync();
}
But yield and await are incompatible as described in this answer. Looking at that answer, would that hypothetical IAsyncEnumerator help me here? Does something similar meanwhile exist in C#?
A summary of the issues I'm facing:
I have a potentially endless IEnumerable<IWorker>
I want to asynchronously call CreateProductAsync() on each of them in the same order as they are coming in
In the end I need an IEnumerable<Product>
A summary of what I already tried, but doesn't work:
I cannot use Task.WhenAll() because it executes tasks in parallel.
I cannot use ToArray() and process that array manually in a loop because my sequence is sometimes endless.
I cannot use yield return because it's incompatible with await.
Does anybody have a solution or workaround for me?
Otherwise I will have to use that blocking code...
IEnumerator<T> is a synchronous interface, so blocking is unavoidable if CheckProducts enumerates the next product before the next worker has finished creating the product.
Nevertheless, you can achieve parallelism by creating products on another thread, adding them to a BlockingCollection<T>, and yielding them on the main thread:
static IEnumerable<Product> CreateProducts(IEnumerable<IWorker> workers)
{
var products = new BlockingCollection<Product>(3);
Task.Run(async () => // On the thread pool...
{
foreach (IWorker worker in workers)
{
Product product = await worker.CreateProductAsync(); // Create products serially.
products.Add(product); // Enqueue the product, blocking if the queue is full.
}
products.CompleteAdding(); // Notify GetConsumingEnumerable that we're done.
});
return products.GetConsumingEnumerable();
}
To avoid unbounded memory consumption, you can optionally specify the capacity of the queue as a constructor argument to BlockingCollection<T>. I used 3 in the code above.
The Situation:
Here you're saying you need to do this synchronously, because IEnumerable doesn't support async and the requirements are you need an IEnumerable<Product>.
I am now given an IEnumerable workers and am supposed to
create an IEnumerable from it that I have to pass to some
other function that I cannot alter:
Here you say the entire product set needs to be processed at the same time, presumably making a single call to void CheckProducts(IEnumerable<Product> products).
This methods needs to check the entire Product set as a whole. It is
not possible to subdivide the result.
And here you say the enumerable can yield an indefinite number of items
But I must deal with an IEnumerable and, even worse, it is usually
quite huge, sometimes it is even unlimited, yielding workers forever.
If I knew that its size was decent, I would just call ToArray() on it
and use the method above.
So lets put these together. You need to do asynchronous processing of an indefinite number of items within a synchronous environment and then evaluate the entire set as a whole... synchronously.
The Underlying Problems:
1: To evaluate a set as a whole, it must be completely enumerated. To completely enumerate a set, it must be finite. Therefore it is impossible to evaluate an infinite set as a whole.
2: Switching back and forth between sync and async forces the async code to run synchronously. that might be ok from a requirements perspective, but from a technical perspective it can cause deadlocks (maybe unavoidable, I don't know. Look that up. I'm not the expert).
Possible Solutions to Problem 1:
1: Force the source to be an ICollection<T> instead of IEnumerable<T>. This enforces finiteness.
2: Alter the CheckProducts algorithm to process iteratively, potentially yielding intermediary results while still maintaining an ongoing aggregation internally.
Possible Solutions to Problem 2:
1: Make the CheckProducts method asynchronous.
2: Make the CreateProduct... method synchronous.
Bottom Line
You can't do what you're asking how you're asking, and it sounds like someone else is dictating your requirements. They need to change some of the requirements, because what they're asking for is (and I really hate using this word) impossible. Is it possible you have misinterpreted some of the requirements?
Two ideas for you OP
Multiple call solution
If you are allowed to call CheckProducts more than once, you could simply do this:
foreach (var worker in workers)
{
var product = await worker.CreateProductAsync();
CheckProducts(new [] { product } );
}
If it adds value, I'm pretty sure you could work out a way to do it in batches of, say, 100 at a time, too.
Thread pool solution
If you are not allowed to call CheckProducts more than once, and not allowed to modify CheckProducts, there is no way to force it to yield control and allow other continuations to run. So no matter what you do, you cannot force asynchronousness into the IEnumerable that you pass to it, not just because of the compiler checking, but because it would probably deadlock.
So here is a thread pool solution. The idea is to create one separate thread to process the products in series; the processor is async, so a call to CreateProductAsync() will still yield control to anything else that has been posted to the synchronization context, as needed. However it can't magically force CheckProduct to give up control, so there is still some possibility that it will block occasionally if it is able to check products faster than they are created. In my example I'm using Monitor.Wait() so the O/S won't schedule the thread until there is something waiting for it. You'll still be using up a thread resource while it blocks, but at least you won't be wasting CPU time in a busy-wait loop.
public static IEnumerable<Product> CreateProducts(IEnumerable<Worker> workers)
{
var queue = new ConcurrentQueue<Product>();
var task = Task.Run(() => ConvertProducts(workers.GetEnumerator(), queue));
while (true)
{
while (queue.Count > 0)
{
Product product;
var ok = queue.TryDequeue(out product);
if (ok) yield return product;
}
if (task.IsCompleted && queue.Count == 0) yield break;
Monitor.Wait(queue, 1000);
}
}
private static async Task ConvertProducts(IEnumerator<Worker> input, ConcurrentQueue<Product> output)
{
while (input.MoveNext())
{
var current = input.Current;
var product = await current.CreateProductAsync();
output.Enqueue(product);
Monitor.Pulse(output);
}
}
From your requirements I can put together the following:
1) Workers processed in order
2) Open to receive new Workers at any time
So using the fact that a dataflow TransformBlock has a built in queue and processes items in order. Now we can accept Workers from the producer at any time.
Next we make the result of the TransformBlockobservale so that the consumer can consume Products on demand.
Made some quick changes and started the consumer portion. This simply takes the observable produced by the Transformer and maps it to an enumerable that yields each product. For background here is the ToEnumerable().
The ToEnumerator operator returns an enumerator from an observable sequence. The enumerator will yield each item in the sequence as it is produced
Source
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace ClassLibrary1
{
public class WorkerProducer
{
public async Task ProduceWorker()
{
//await ProductTransformer_Transformer.SendAsync(new Worker())
}
}
public class ProductTransformer
{
public IObservable<Product> Products { get; private set; }
public TransformBlock<Worker, Product> Transformer { get; private set; }
private Task<Product> CreateProductAsync(Worker worker) => Task.FromResult(new Product());
public ProductTransformer()
{
Transformer = new TransformBlock<Worker, Product>(wrk => CreateProductAsync(wrk));
Products = Transformer.AsObservable();
}
}
public class ProductConsumer
{
private ThirdParty ThirdParty { get; set; } = new ThirdParty();
private ProductTransformer Transformer { get; set; }
public ProductConsumer()
{
ThirdParty.CheckProducts(Transformer.Products.ToEnumerable());
}
public class Worker { }
public class Product { }
public class ThirdParty
{
public void CheckProducts(IEnumerable<Product> products)
{
}
}
}
Unless I misunterstood something, I don't see why you don't simply do it like this:
var productList = new List<Product>(workers.Count())
foreach(var worker in workers)
{
productList.Add(await worker.CreateProductAsync());
}
CheckProducts(productList);
What about if you simply keep clearing a List of size 1?
var productList = new List<Product>(1);
var checkTask = Task.CompletedTask;
foreach(var worker in workers)
{
await checkTask;
productList.Clear();
productList.Add(await worker.CreateProductAsync());
checkTask = Task.Run(CheckProducts(productList));
}
await checkTask;
You can use Task.WhenAll, but instead of returning result of Task.WhenAll, return collection of tasks transformed to the collection of results.
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
var tasks = workers.Select(worker => worker.CreateProductAsync()).ToList();
await Task.WhenAll(tasks);
return tasks.Select(task => task.Result);
}
Order of tasks will be persisted.
And seems like should be ok to go with just return await Task.WhenAll()
From docs of Task.WhenAll Method (IEnumerable>)
The Task.Result property of the returned task will be set to
an array containing all of the results of the supplied tasks in the
same order as they were provided...
If workers need to be executed one by one in the order they were created and based on requirement that another function need whole set of workers results
async Task<IEnumerable<Product>> CreateProductsAsync(IEnumerable<IWorker> workers)
{
var products = new List<product>();
foreach (var worker in workers)
{
product = await worker.CreateProductAsync();
products.Add(product);
}
return products;
}
You can do this now with async, IEnumerable and LINQ but every method in the chain after the async would be a Task<T>, and you need to use something like await Task.WhenAll at the end. You can use async lambdas in the LINQ methods, which return Task<T>. You don't need to wait synchronously in these.
The Select will start your tasks sequentially i.e. they won't even exist as tasks until the select enumerates each one, and won't keep going after you stop enumerating. You could also run your own foreach over the enumerable of tasks if you want to await them all individually.
You can break out of this like any other foreach without it starting all of them, so this will also work on an infinite enumerable.
public async Task Main()
{
// This async method call could also be an async lambda
foreach (var task in GetTasks())
{
var result = await task;
Console.WriteLine($"Result is {result}");
if (result > 5) break;
}
}
private IEnumerable<Task<int>> GetTasks()
{
return GetNumbers().Select(WaitAndDoubleAsync);
}
private async Task<int> WaitAndDoubleAsync(int i)
{
Console.WriteLine($"Waiting {i} seconds asynchronously");
await Task.Delay(TimeSpan.FromSeconds(i));
return i * 2;
}
/// Keeps yielding numbers
private IEnumerable<int> GetNumbers()
{
var i = 0;
while (true) yield return i++;
}
Outputs, the following, then stops:
Waiting 0 seconds asynchronously
Result is 0
Waiting 1 seconds asynchronously
Result is 2
Waiting 2 seconds asynchronously
Result is 4
Waiting 3 seconds asynchronously
Result is 6
The important thing is that you can't mix yield and await in the same method, but you can yield Tasks returned from a method that uses await absolutely fine, so you can use them together just by splitting them into separate methods. Select is already a method that uses yield, so you may not need to write your own method for this.
In your post you were looking for a Task<IEnumerable<Product>>, but what you can actually use is a IEnumerable<Task<Product>>.
You can go even further with this e.g. if you had something like a REST API where one resource can have links to other resources, like if you just wanted to get a list of users of a group, but stop when you found the user you were interested in:
public async Task<IEnumerable<Task<User>>> GetUserTasksAsync(int groupId)
{
var group = await GetGroupAsync(groupId);
return group.UserIds.Select(GetUserAsync);
}
foreach (var task in await GetUserTasksAsync(1))
{
var user = await task;
...
}
There is no solution to your problem. You can't transform a deferred IEnumerable<Task<Product>> to a deferred IEnumerable<Product>, such that the consuming thread will not get blocked while enumerating the IEnumerable<Product>. The IEnumerable<T> is a synchronous interface. It returns an enumerator with a synchronous MoveNext method. The MoveNext returns bool, which is not an awaitable type. An asynchronous interface IAsyncEnumerable<T> exists, whose enumerator has an asynchronous MoveNextAsync method, with a return type of ValueTask<bool>. But you have explicitly said that you can't change the consuming method, so you are stuck with the IEnumerable<T> interface. No solution then.
try
workers.ForEach(async wrkr =>
{
var prdlist = await wrkr.CreateProductAsync();
//Remaing tasks....
});
I want to run all tasks with WhenAll (not one by one).
But after that I need to update list (LastReport property) base on result.
I think I have solution but I would like to check if there is better way.
Idea is to:
Run all tasks
Remember relation between configuration and task
Update configuration
My solution is:
var lastReportAllTasks = new List<Task<Dictionary<string, string>>>();
var configurationTaskRelation = new Dictionary<int, Task<Dictionary<string, string>>>();
foreach (var configuration in MachineConfigurations)
{
var task = machineService.GetReports(configuration);
lastReportAllTasks.Add(task);
configurationTaskRelation.Add(configuration.Id, task);
}
await Task.WhenAll(lastReportAllTasks);
foreach (var configuration in MachineConfigurations)
{
var lastReportTask = configurationTaskRelation[configuration.Id];
configuration.LastReport = await lastReportTask;
}
The Select function can be asynchronous itself. You can await the report and return both the configuration and result in the same result object (anonymous type or tuple, whatever you prefer) :
var tasks=MachineConfigurations.Select(async conf=>{
var report= await machineService.GetReports(conf);
return new {conf,report});
var results=await Task.WhenAll(tasks);
foreach(var pair in results)
{
pair.conf.LastReport=pair.report;
}
EDIT - Loops and error handling
As Servy suggested, Task.WhenAll can be ommited and awaiting can be moved inside the loop :
foreach(var task in tasks)
{
var pair=await task;
pair.conf.LastReport=pair.report;
}
The tasks will still execute concurrently. In case of exception though, some configuration objects will be modified and some not.
In general, this would be an ugly situation, requiring extra exception handling code to clean up the modified objects. Exception handling is a lot easier when modifications are done on-the-side and finalized/applied when the happy path completes. That's one reason why updating the Configuration objects inside the Select() requires careful consideration.
In this particular case though it may be better to "skip" the failed reports, possibly move them to an error queue and reprocess them at a later time. It may be better to have partial results than no results at all, as long as this behaviour is expected:
foreach(var task in tasks)
{
try
{
var pair=await task;
pair.conf.LastReport=pair.report;
}
catch(Exception exc)
{
//Make sure the error is logged
Log.Error(exc);
ErrorQueue.Enqueue(new ProcessingError(conf,ex);
}
}
//Handle errors after the loop
EDIT 2 - Dataflow
For completeness, I do have several thousand ticket reports to generate each day, and each GDS call (the service through which every travel agency sells tickets) takes considerable time. I can't run all requests at the same time - I start getting server serialization errors if I try more than 10 concurrent requests. I can't retry everything either.
In this case I used TPL DataFlow combined with some Railway oriented programming tricks. An ActionBlock with a DOP of 8 processes the ticket requests. The results are wrapped in a Success class and sent to the next block. Failed requests and exceptions are wrapped in a Failure class and sent to another block. Both classes inherit from IFlowEnvelope which has a Successful flag. Yes, that's F# Discriminated Union envy.
This is combined with some retry logic for timeouts etc.
In pseudocode the pipeline looks like this :
var reportingBlock=new TransformBlock<Ticket,IFlowEnvelope<TicketReport>(reportFunc,dopOptions);
var happyBlock = new ActionBlock<IFlowEnvelope<TicketReport>>(storeToDb);
var errorBlock = new ActionBlock<IFlowEnvelope<TicketReport>>(logError);
reportingBlock.LinkTo(happyBlock,linkOptions,msg=>msg.Success);
reportingBlock.LinkTo(errorBlock,linkOptions,msg=>!msg.Success);
foreach(var ticket in tickets)
{
reportingBlock.Post(ticket);
}
reportFunc catches any exceptions and wraps them as Failure<T> objects:
async Task<IFlowEnvelope<Ticket,TicketReport>> reportFunc(Ticket ticket)
{
try
{
//Do the heavy processing
return new Success<TicketReport>(report);
}
catch(Exception exc)
{
//Construct an error message, msg
return new Failure<TicketReport>(report,msg);
}
}
The real pipeline includes steps that parse daily reports and individual tickets. Each call to the GDS takes 1-6 seconds so the complexity of the pipeline is justified.
I think you don't need Lists or Dictionaries. Why not simple loop which updates LastReport with results
foreach (var configuration in MachineConfigurations)
{
configuration.LastReport = await machineService.GetReports(configuration);
}
For executing all reports "in parallel"
Func<Configuration, Task> loadReport =
async config => config.LastReport = await machineService.GetReports(config);
await Task.WhenAll(MachineConfigurations.Select(loadReport));
And very poor try to be more functional.
Func<Configuration, Task<Configuration>> getConfigWithReportAsync =
async config =>
{
var report = await machineService.GetReports(config);
return new Configuration
{
Id = config.Id,
LastReport = report
};
}
var configsWithUpdatedReports =
await Task.WhenAll(MachineConfigurations.Select(getConfigWithReportAsync));
using System.Linq;
var taskResultsWithConfiguration = MachineConfigurations.Select(conf =>
new { Conf = conf, Task = machineService.GetReports(conf) }).ToList();
await Task.WhenAll(taskResultsWithConfiguration.Select(pair => pair.Task));
foreach (var pair in taskResultsWithConfiguration)
pair.Conf.LastReport = pair.Task.Result;
I have a situation where I am interested in the first successful response from an array of services that each support the method
Task<Try<SearchResponse>> PerformSearch(SearchRequest request);
The Try class is a container for a Good/Bad result (like error Monad)
The call to the list of services currently is this
var searchResponses = await Task.WhenAll(
_searchServices.Select(s => s.PerformSearch(request)));
return searchResponses.FirstOrBad(sr=>sr.IsGood);
Where FirstOrBad is an extension method that finds the first good result or returns a composite Bad Try with a concatenation of all the errors.
As far as I understand the problem with this is that due to the WhenAll the time to find the first good result is limited by the slowest response.
I want to continue execution as soon as I receive the first positive result but not the first (2nd ... etc) result if it is not successful, but also continue execution if all results return unsuccessfully, reporting the lack of success.
I would have thought this is a common problem but have found little when searching for examples. It maybe known by some other term than scatter gather.
Something like this should work for you
public static async Task<Try<T>> FirstOrBad<T>(this IEnumerable<Task<Try<T>>> tasks, Func<Try<T>, bool> predicate)
{
var taskList = tasks.ToList();
var completed = new List<Task<Try<T>>>();
Task<Try<T>> completedTask;
do
{
completedTask = await Task.WhenAny(taskList);
completed.Add(completedTask);
taskList.Remove(completedTask);
} while (!predicate(await completedTask) && taskList.Any());
return !predicate(await completedTask) ? new Try<T>(completed.ToString(",")) : await completedTask;
}
Adapter from this answer TPL wait for task to complete with a specific return value
I'm working on a C# console application that will be responsible for running an array of tasks. The basic structure for that is as follows:
var tasks = workItems.Select(x => Task.Factory.StartNew(() =>
{
DoSomeWork(x);
})).ToArray();
Task.WaitAll(tasks);
The problem is that DoSomeWork() is an async method which awaits the result of another task, which also needs to await a call to the Facebook API.
http://facebooksdk.net/docs/reference/SDK/Facebook.FacebookClient.html#GetTaskAsync(string)
public async void DoSomeWork(WorkItem item)
{
var results = await GetWorkData();
}
public async Task<List<WorkData>> GetWorkData()
{
var fbClient = new FacebookClient();
var task = fbClient.GetTaskAsync("something");
var fbResults = await task;
};
I thought I would be able to support this notion of nested tasks with the call to Task.WaitAll() but the execution of the parent tasks finishes almost immediately. Putting a Console.ReadLine() at the end of the application to prevent it from early execution shows that the result will indeed come back later from Facebook.
Am I missing something obvious or is there a better way to block for my Task collection that will allow for this kind of scenario?
DoSomeWork needs to not return void. The by not returning a Task the caller has no way of knowing when if finishes.
Additionally, it's already asynchronous, so there is no reason to use StartNew here. Just call DoSomeWork directly, since it should be returning a Task.