I am trying to run 3 tasks on different threads (there will be a few more added later). The tasks that are called then call other methods that are async/await.
Program execution continues past my command to wait, but it needs to block until all tasks are complete. My code is below (the null return is just there for testing; I still need to write the real return value).
public List<string> CopyFilesAsync(List<ModelIterationModel> model)
{
    var copyFileTaskParameters = GetCopyFileTaskParameters(model);

    Task<List<CopyFitDataResult>> fitDataResultsList = null;
    Task<List<CopyNMStoreResult>> nmStoreResultsList = null;
    Task<List<CopyDecompAnalyzerResult>> decompAnalyzerStoreResultsList = null;

    Task parent = Task.Factory.StartNew(() =>
    {
        var cancellationToken = new CancellationToken();
        TaskFactory factory = new TaskFactory(TaskCreationOptions.AttachedToParent, TaskContinuationOptions.ExecuteSynchronously);

        factory.StartNew(() => fitDataResultsList = CopyFitDataFiles(copyFileTaskParameters, cancellationToken));
        factory.StartNew(() => decompAnalyzerStoreResultsList = CopyDecompAnalyzerFiles(copyFileTaskParameters, cancellationToken));
        factory.StartNew(() => nmStoreResultsList = CopyNMStoreResultsFiles(copyFileTaskParameters, cancellationToken));
    });

    parent.Wait();

    return null;
}
The calling code is synchronous. Execution continues in this method before the tasks above complete.
public void CreateConfigFile(CreateConfigFileParameter parameter)
{
    try
    {
        // Data for this will need to come from the UI; return values will include local file paths.
        // All copy operations will be run on their own threads.
        var userFileListModel = _copyFilesToLocalDirectoryService.CopyFilesAsync(temp);

        // Will return an object graph of data read from spreadsheets and Excel files.
        _readLocalFilesToDataModelService.ReadAllFiles();

        // Will take the object graph and do date-period logic, event-type compression, and any other
        // business logic needed to extract an object model for creating the config file.
        _processDataModelsToCreateConfigService.Process();

        // Will take the extracted object model and use the config file template to create the config file workbook.
        _writeConfigFileService.WriteConfigFile();
    }
    catch (Exception ex)
    {
    }
}
This code is in a class library in a WPF application. I don't know if that is important, but this is the first time I have had to interact with WPF (15 years of web development only).
What do I need to do to stop execution until all tasks have completed? I have played around with a few other approaches, such as attaching the tasks as children, but nothing I do seems to work.
Edit - I keep trying approaches straight out of MSDN samples with no luck whatsoever. I just tried this:
var cancellationToken = new CancellationToken();
var tasks = new List<Task>();

tasks.Add(Task.Run(() =>
{
    fitDataResultsList = CopyFitDataFiles(copyFileTaskParameters, cancellationToken);
}));

Task t = Task.WhenAll(tasks.ToArray());
t.Wait();
This is exactly like the MSDN sample, and I also tried WaitAll, but execution runs right past it.
Could this have something to do with the Visual Studio debugger?
There are several questions about your code:
If you do not wait for the files to be copied, how should the next lines of code run?
Why do you create a TaskFactory to start background work that is already a Task?
Why do you create a CancellationToken directly? You need to create a CancellationTokenSource and use its Token in all the code you may need to cancel.
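For example, a minimal sketch of the CancellationTokenSource pattern (the variable names here are illustrative, not from your code):
// Hypothetical sketch: one CancellationTokenSource shared by all cancellable work.
var cts = new CancellationTokenSource();

// Pass cts.Token into every operation you may need to cancel.
var fitDataTask = CopyFitDataFiles(copyFileTaskParameters, cts.Token);

// Later, e.g. from a Cancel button or on shutdown:
cts.Cancel();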
Also, this code:
tasks.Add(Task.Run(() =>
{
    fitDataResultsList = CopyFitDataFiles(copyFileTaskParameters, cancellationToken);
}));
doesn't add the CopyFitDataFiles task to the list; it only assigns the task reference inside a wrapper task, so Task.WhenAll waits for the assignment, not for the copy to finish. You need to do this:
tasks.Add(CopyFitDataFiles(copyFileTaskParameters, cancellationToken));
Your code should be rewritten in this way:
public async Task<List<string>> CopyFilesAsync(List<ModelIterationModel> model)
{
    var copyFileTaskParameters = GetCopyFileTaskParameters(model);

    // do not await the tasks here, just get references to them
    var fitDataResultsList = CopyFitDataFiles(copyFileTaskParameters, cancellationToken);
    // ...

    // wait for all running tasks
    await Task.WhenAll(fitDataResultsList, ...);

    // now all of them are finished
}
// note signature change
public async Task CreateConfigFile(CreateConfigFileParameter parameter)
{
    // if you really need to wait for this task only after some later point, save the reference to it
    var userFileListModel = _copyFilesToLocalDirectoryService.CopyFilesAsync(temp);
    ...
    // now await it
    await userFileListModel;
    ...
}
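Putting the pieces together, a fuller sketch of the corrected method might look like this (assuming the three Copy* methods are genuinely asynchronous and that the caller supplies the CancellationToken; the mapping to the returned file paths is elided):
public async Task<List<string>> CopyFilesAsync(List<ModelIterationModel> model, CancellationToken cancellationToken)
{
    var copyFileTaskParameters = GetCopyFileTaskParameters(model);

    // Start all three operations without awaiting them individually.
    var fitDataTask = CopyFitDataFiles(copyFileTaskParameters, cancellationToken);
    var decompTask = CopyDecompAnalyzerFiles(copyFileTaskParameters, cancellationToken);
    var nmStoreTask = CopyNMStoreResultsFiles(copyFileTaskParameters, cancellationToken);

    // Asynchronously wait until every task has completed.
    await Task.WhenAll(fitDataTask, decompTask, nmStoreTask);

    // All results are now available on the completed tasks.
    var filePaths = new List<string>();
    // ... map fitDataTask.Result, decompTask.Result and nmStoreTask.Result to file paths ...
    return filePaths;
}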
There is a great article about async/await: Async/Await - Best Practices in Asynchronous Programming by Stephen Cleary.
Related
I have an IEnumerable<Task>, where each Task will call the same endpoint. However, the endpoint can only handle so many calls per second. How can I put, say, a half second delay between each call?
I have tried adding Task.Delay(), but of course awaiting them simply means that the app waits a half second before sending all the calls at once.
Here is a code snippet:
var resultTasks = orders
    .Select(async task =>
    {
        var result = new VendorTaskResult();
        try
        {
            result.Response = await task.CallVendorAsync();
        }
        catch (Exception ex)
        {
            result.Exception = ex;
        }
        return result;
    });
var results = await Task.WhenAll(resultTasks);
I feel like I should do something like
Task.WhenAll(resultTasks.EmitOverTime(500));
... but how exactly do I do that?
What you describe in your question is, in other words, rate limiting. You'd like to apply a rate-limiting policy in your client, because the API you use enforces such a policy on the server to protect itself from abuse.
While you could implement rate limiting yourself, I'd recommend you go with some well-established solution. Rate Limiter from David Desmaisons was the one that I picked at random, and I instantly liked it. It has solid documentation, good test coverage, and is easy to use. It is also available as a NuGet package.
Check out the simple snippet below that demonstrates running semi-overlapping tasks in sequence while deferring each task's start until half a second after the immediately preceding task started. Each task lasts at least 750 ms.
using ComposableAsync;
using RateLimiter;
using System;
using System.Threading.Tasks;
namespace RateLimiterTest
{
    class Program
    {
        static void Main(string[] args)
        {
            Log("Starting tasks ...");
            var constraint = TimeLimiter.GetFromMaxCountByInterval(1, TimeSpan.FromSeconds(0.5));
            var tasks = new[]
            {
                DoWorkAsync("Task1", constraint),
                DoWorkAsync("Task2", constraint),
                DoWorkAsync("Task3", constraint),
                DoWorkAsync("Task4", constraint)
            };
            Task.WaitAll(tasks);
            Log("All tasks finished.");
            Console.ReadLine();
        }

        static void Log(string message)
        {
            Console.WriteLine(DateTime.Now.ToString("HH:mm:ss.fff ") + message);
        }

        static async Task DoWorkAsync(string name, IDispatcher constraint)
        {
            await constraint;
            Log(name + " started");
            await Task.Delay(750);
            Log(name + " finished");
        }
    }
}
Sample output:
10:03:27.121 Starting tasks ...
10:03:27.154 Task1 started
10:03:27.658 Task2 started
10:03:27.911 Task1 finished
10:03:28.160 Task3 started
10:03:28.410 Task2 finished
10:03:28.680 Task4 started
10:03:28.913 Task3 finished
10:03:29.443 Task4 finished
10:03:29.443 All tasks finished.
If you change the constraint to allow maximum two tasks per second (var constraint = TimeLimiter.GetFromMaxCountByInterval(2, TimeSpan.FromSeconds(1));), which is not the same as one per half a second, then the output could be like:
10:06:03.237 Starting tasks ...
10:06:03.264 Task1 started
10:06:03.268 Task2 started
10:06:04.026 Task2 finished
10:06:04.031 Task1 finished
10:06:04.275 Task3 started
10:06:04.276 Task4 started
10:06:05.032 Task4 finished
10:06:05.032 Task3 finished
10:06:05.033 All tasks finished.
Note that the current version of Rate Limiter targets .NET Framework 4.7.2+ or .NET Standard 2.0+.
This is just a thought, but another approach could be to create a queue and add another thread that polls the queue for calls that need to go out to your endpoint.
Have you considered just turning that into a foreach loop with a Task.Delay call? You seem to want to call them sequentially, and it won't hurt if that is obvious from your code.
var results = new List<VendorTaskResult>();
foreach (var order in orders)
{
    var result = new VendorTaskResult();
    try
    {
        result.Response = await order.CallVendorAsync();
    }
    catch (Exception ex)
    {
        result.Exception = ex;
    }
    results.Add(result);
    await Task.Delay(500); // space the calls out
}
Instead of selecting from orders you could loop over them, add each result task to a list inside the loop, and then call Task.WhenAll.
It would look something like:
var resultTasks = new List<Task<VendorTaskResult>>(orders.Count);
foreach (var item in orders)
{
    resultTasks.Add(Task.Run(async () =>
    {
        var result = new VendorTaskResult();
        try
        {
            result.Response = await item.CallVendorAsync();
        }
        catch (Exception ex)
        {
            result.Exception = ex;
        }
        return result;
    }));
    await Task.Delay(x); // x = delay between task starts, e.g. 500 ms
}
var results = await Task.WhenAll(resultTasks);
If you want to control the number of requests executed simultaneously, you have to use a semaphore.
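A minimal sketch of that semaphore idea, reusing the orders/CallVendorAsync shape from the question (note that a semaphore caps how many calls are in flight at once, which is not quite the same as calls per second):
var semaphore = new SemaphoreSlim(5); // at most 5 concurrent calls; the limit is illustrative
var resultTasks = orders.Select(async order =>
{
    await semaphore.WaitAsync(); // wait for a free slot
    try
    {
        var result = new VendorTaskResult();
        try { result.Response = await order.CallVendorAsync(); }
        catch (Exception ex) { result.Exception = ex; }
        return result;
    }
    finally
    {
        semaphore.Release(); // free the slot for the next call
    }
});
var results = await Task.WhenAll(resultTasks);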
I have something very similar, and it works fine for me. Please note that I call ToArray() after the LINQ query, which triggers the tasks:
using (HttpClient client = new HttpClient()) {
    IEnumerable<Task<string>> _downloads = _group
        .Select(async job => {
            await Task.Delay(300);
            return await client.GetStringAsync(<url with variable job>);
        });
    Task<string>[] _downloadTasks = _downloads.ToArray();
    _pages = await Task.WhenAll(_downloadTasks);
}
Now please note that this will create n tasks, all in parallel, and the Task.Delay effectively does nothing to space them out (every task just waits 300 ms and then they all fire together). If you want to call the pages sequentially (as it sounds from wanting a delay between the calls), then this code may be better:
using (HttpClient client = new HttpClient()) {
    foreach (string job in _group) {
        await Task.Delay(300);
        _pages.Add(await client.GetStringAsync(<url with variable job>));
    }
}
The download of the pages is still asynchronous (other work can run while a page downloads), but the calls are made sequentially, ensuring that one download finishes before the next one starts.
The code can easily be changed to download the pages asynchronously in chunks, for example 20 pages at a time with a pause between chunks, as in this sample:
IEnumerable<string[]> toParse = myData
    .Select((v, i) => new { v.code, group = i / 20 })
    .GroupBy(x => x.group)
    .Select(g => g.Select(x => x.code).ToArray());

using (HttpClient client = new HttpClient()) {
    foreach (string[] _group in toParse) {
        string[] _pages = null;
        IEnumerable<Task<string>> _downloads = _group
            .Select(job => {
                return client.GetStringAsync(<url with job>);
            });
        Task<string>[] _downloadTasks = _downloads.ToArray();
        _pages = await Task.WhenAll(_downloadTasks);
        await Task.Delay(5000);
    }
}
All this does is group your pages in chunks of 20, iterate through the chunks, download all pages of the chunk asynchronously, wait 5 seconds, move on to the next chunk.
I hope that is what you were waiting for :)
The proposed method EmitOverTime is doable, but only by blocking the current thread:
public static IEnumerable<Task<TResult>> EmitOverTime<TResult>(
    this IEnumerable<Task<TResult>> tasks, int delay)
{
    foreach (var item in tasks)
    {
        Thread.Sleep(delay); // delay by blocking
        yield return item;
    }
}
Usage:
var results = await Task.WhenAll(resultTasks.EmitOverTime(500));
Probably better is to create a variant of Task.WhenAll that accepts a delay argument and delays asynchronously:
public static async Task<TResult[]> WhenAllWithDelay<TResult>(
    IEnumerable<Task<TResult>> tasks, int delay)
{
    var tasksList = new List<Task<TResult>>();
    foreach (var task in tasks)
    {
        await Task.Delay(delay).ConfigureAwait(false);
        tasksList.Add(task);
    }
    return await Task.WhenAll(tasksList).ConfigureAwait(false);
}
Usage:
var results = await WhenAllWithDelay(resultTasks, 500);
This design implies that the enumerable of tasks should be enumerated only once. It is easy to forget this during development, and start enumerating it again, spawning a new set of tasks. For this reason I propose to make it an OnlyOnce enumerable, as it is shown in this question.
Update: I should mention why the above methods work, and under what premise. The premise is that the supplied IEnumerable<Task<TResult>> is deferred, in other words non-materialized. At the method's start there are no tasks created yet. The tasks are created one after the other during the enumeration of the enumerable, and the trick is that the enumeration is slow and controlled. The delay inside the loop ensures that the tasks are not created all at once. They are created hot (in other words already started), so at the time the last task has been created some of the first tasks may have already been completed. The materialized list of half-running/half-completed tasks is then passed to Task.WhenAll, that waits for all to complete asynchronously.
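To make that premise concrete, here is a small sketch (DownloadAsync and urls are made up for illustration):
// Deferred: no tasks exist yet. Each step of the foreach inside
// WhenAllWithDelay creates (and starts) exactly one task.
IEnumerable<Task<string>> deferred = urls.Select(url => DownloadAsync(url));

// Materialized: ToList() enumerates immediately, starting ALL the tasks
// at once and defeating the delay inside the loop.
List<Task<string>> materialized = urls.Select(url => DownloadAsync(url)).ToList();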
[Updated 18-Apr-2018 with LinqPad example - see end]
My application receives a list of jobs:
var jobs = await myDB.GetWorkItems();
(NB: we use .ConfigureAwait(false) everywhere; I'm just not showing it in these pseudo-code snippets.)
For each job, we create a long running Task. However, we don't want to wait for this long running Task to complete.
jobs.ForEach(job =>
{
    var module = Factory.GetModule(job.Type);
    var task = Task.Run(() => module.ExecuteAsync(job.Data));
    this.NonAwaitedTasks.Add(task, module);
});
The task and its related module instance are both added to a ConcurrentDictionary so that they don't go out of scope.
Elsewhere, I have another method that is called occasionally which contains the following:
foreach (var entry in this.NonAwaitedTasks.Where(e => e.Key.IsCompleted).ToArray())
{
    var module = entry.Value as IDisposable;
    module?.Dispose();
    this.NonAwaitedTasks.Remove(entry.Key);
}
(NB the NonAwaitedTasks is additionally locked using a SemaphoreSlim...)
So, the idea is that this method will find all those Tasks which have completed and then dispose of their related module, and remove them from this Dictionary.
However....
Whilst debugging in Visual Studio 2017, I pull a single job from the DB, and whilst I'm taking my time debugging within the single Module that has been instantiated, Dispose is called on that module. Looking at the call stack, I can see Dispose has been called in the method above, and that is because the task has IsCompleted == true. But evidently it can't be completed, because I'm still debugging it.
Is the .IsCompleted property the wrong property to check?
Is this just an artifact of debugging in Visual Studio?
Am I going about this the wrong way?
Additional Information
In the comments below, I was asked to provide some additional information regarding the flow because what I described didn't seem possible (and indeed, my hope was that it couldn't be). Below is a cut-down version of my code (I've removed checks for the cancellation token and defensive coding, but nothing that affects the flow).
Application Entry Point
This is a Windows Service. In the OnStart() is the following line:
this.RunApplicationTask =
Task.Run(() => myApp.DoWorkAsync().ConfigureAwait(false), myService.CancelSource.Token);
"RunApplicationTask" is just a property to keep the executing task in scope during the lifetime of the Service.
DoWorkAsync()
public async Task DoWorkAsync()
{
    do
    {
        await this.ExecuteSingleIterationAsync().ConfigureAwait(false);
        await Task.Delay(TimeSpan.FromSeconds(5)).ConfigureAwait(false);
    }
    while (myApp.ServiceCancellationToken.IsCancellationRequested == false);

    await Task.WhenAll(this.NonAwaitedTasks.Keys).ConfigureAwait(false);
    await this.ClearCompletedTasksAsync().ConfigureAwait(false);

    this.WorkItemsTaskCompletionSource.SetResult(true);
    return;
}
So whilst I'm debugging, this is iterating the DO-LOOP, it does not get to the Task.WhenAll(....).
Note too that after the Cancellation request is called and all Tasks have completed, I call ClearCompletedTasksAsync(). More on that later....
ExecuteSingleIterationAsync
private async Task ExecuteSingleIterationAsync()
{
    var getJobsResponse = await DB.GetJobsAsync().ConfigureAwait(false);
    await this.ProcessWorkLoadAsync(getJobsResponse.Jobs).ConfigureAwait(false);
    await this.ClearCompletedTasksAsync().ConfigureAwait(false);
}
ProcessWorkLoadAsync
private async Task ProcessWorkLoadAsync(IList<Job> jobs)
{
    if (jobs.NoItems())
    {
        return;
    }

    jobs.ForEach(job =>
    {
        // The processor instance is disposed of when removed from the NonAwaitedTasks collection.
        IJobProcessor processor = ProcessorFactory.GetProcessor(job, myApp.ServiceCancellationToken);
        try
        {
            var task = Task.Run(() => processor.ExecuteAsync(job).ConfigureAwait(false), myApp.ServiceCancellationToken);
            this.NonAwaitedTasks.Add(task, processor);
        }
        catch (Exception e)
        {
            ...
        }
    });

    return;
}
Each processor implements the following interface method:
Task ExecuteAsync(Job job);
It's whilst I'm in the ExecuteAsync that .Dispose() gets called on the processor instance I'm using.
ProcessorFactory.GetProcessor()
public static IJobProcessor GetProcessor(Job job, CancellationToken token)
{
    .....

    switch (someParamCalculatedAbove)
    {
        case X:
        {
            return new XProcessor(...);
        }
        case Y:
        {
            return new YProcessor(...);
        }
        default:
        {
            return null;
        }
    }
}
So here we're getting a new instance.
ClearCompletedTasksAsync()
private async Task ClearCompletedTasksAsync()
{
    await myStatic.NonAwaitedTasksPadlock.WaitAsync().ConfigureAwait(false);

    try
    {
        foreach (var taskEntry in this.NonAwaitedTasks.Where(entry => entry.Key.IsCompleted).ToArray())
        {
            var processorInstance = taskEntry.Value as IDisposable;
            processorInstance?.Dispose();
            this.NonAwaitedTasks.Remove(taskEntry.Key);
        }
    }
    finally
    {
        myStatic.NonAwaitedTasksPadlock.Release();
    }
}
This is called every iteration of the do-loop. Its purpose is to ensure that the list of non-awaited tasks is kept small.
And that's it... Dispose only seems to get called when debugging.
LinqPad example
async Task Main()
{
    SetProcessorRunning();

    await Task.Delay(TimeSpan.FromSeconds(1)).ConfigureAwait(false);

    do
    {
        foreach (var entry in NonAwaitedTasks.Where(e => e.Key.IsCompleted).ToArray())
        {
            "Task is completed, so will dispose of the Task's processor...".Dump();
            var p = entry.Value as IDisposable;
            p?.Dispose();
            NonAwaitedTasks.Remove(entry.Key);
        }
    }
    while (NonAwaitedTasks.Count > 0);
}

// Define other methods and classes here
public void SetProcessorRunning()
{
    var p = new Processor();
    var task = Task.Run(() => p.DoWorkAsync().ConfigureAwait(false));
    NonAwaitedTasks.Add(task, p);
}

public interface IProcessor
{
    Task DoWorkAsync();
}

public static Dictionary<Task, IProcessor> NonAwaitedTasks = new Dictionary<Task, IProcessor>();

public class Processor : IProcessor, IDisposable
{
    bool isDisposed = false;

    public void Dispose()
    {
        this.isDisposed = true;
        "I have been disposed of".Dump();
    }

    public async Task DoWorkAsync()
    {
        await Task.Delay(TimeSpan.FromSeconds(5)).ConfigureAwait(false);

        if (this.isDisposed)
        {
            $"I have been disposed of (isDispose = {this.isDisposed}) but I've not finished work yet...".Dump();
        }

        await Task.Delay(TimeSpan.FromSeconds(5)).ConfigureAwait(false);
    }
}
Output:
Task is completed, so will dispose of the Task's processor...
I have been disposed of
I have been disposed of (isDispose = True) but I've not finished work yet...
Your problem is in this line:
var task = Task.Run(() => p.DoWorkAsync().ConfigureAwait(false));
Hover over the var and take a look at what type that is.
Task.Run understands async delegates by having special "task unwrapping" rules for Func<Task<Task>> and friends. But it won't have any special unwrapping for Func<ConfiguredTaskAwaitable>.
You can think of it this way; with the code above:
p.DoWorkAsync() returns a Task.
Task.ConfigureAwait(false) returns a ConfiguredTaskAwaitable.
So the Task.Run is being asked to run this function that creates a ConfiguredTaskAwaitable on a thread pool thread.
Thus, the return type of Task.Run is Task<ConfiguredTaskAwaitable> - a task that completes as soon as the ConfiguredTaskAwaitable is created. When it is created - not when it completes.
In this case, the ConfigureAwait(false) isn't doing anything anyway, because there's no await to configure. So you can remove it:
var task = Task.Run(() => p.DoWorkAsync());
Also, as Servy mentioned, if you don't need to run DoWorkAsync on a thread pool thread, you can also skip the Task.Run:
var task = p.DoWorkAsync();
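To see the difference side by side, a quick sketch (the inferred types are shown in comments):
// Unwrapped: this resolves to Task.Run(Func<Task>), so the returned task
// completes only when DoWorkAsync's inner task completes.
var good = Task.Run(() => p.DoWorkAsync()); // inferred type: Task

// Not unwrapped: TResult is ConfiguredTaskAwaitable, so the outer task
// completes as soon as that struct is created, not when the work finishes.
var tooSoon = Task.Run(() => p.DoWorkAsync().ConfigureAwait(false)); // Task<ConfiguredTaskAwaitable>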
I have a list of objects that I need to run a long-running process on, and I would like to kick those processes off asynchronously, then, when they are all finished, return the results as a list to the calling method. I've been trying different methods that I have found, but it appears the processes still run synchronously, in the order they appear in the list, so I'm sure I'm missing something about how to execute a list of tasks.
Here is my code:
public async Task<List<ShipmentOverview>> GetShipmentByStatus(ShipmentFilterModel filter)
{
    if (string.IsNullOrEmpty(filter.Status))
    {
        throw new InvalidShipmentStatusException(filter.Status);
    }

    var lookups = GetLookups(false, Brownells.ConsolidatedShipping.Constants.ShipmentStatusType);
    var lookup = lookups.SingleOrDefault(sd => sd.Name.ToLower() == filter.Status.ToLower());
    if (lookup != null)
    {
        filter.StatusId = lookup.Id;
        var shipments = Shipments.GetShipments(filter);

        var tasks = shipments.Select(async model => await GetOverview(model)).ToList();
        ShipmentOverview[] finishedTask = await Task.WhenAll(tasks);
        return finishedTask.ToList();
    }
    else
    {
        throw new InvalidShipmentStatusException(filter.Status);
    }
}
private async Task<ShipmentOverview> GetOverview(ShipmentModel model)
{
    String version;
    var user = AuthContext.GetUserSecurityModel(Identity.Token, out version) as UserSecurityModel;
    var profile = AuthContext.GetProfileSecurityModel(user.Profiles.First());

    var overview = new ShipmentOverview
    {
        Id = model.Id,
        CanView = true,
        CanClose = profile.HasFeatureAction("Shipments", "Close", "POST"),
        CanClear = profile.HasFeatureAction("Shipments", "Clear", "POST"),
        CanEdit = profile.HasFeatureAction("Shipments", "Get", "PUT"),
        ShipmentNumber = model.ShipmentNumber.ToString(),
        ShipmentName = model.Name,
    };

    var parcels = Shipments.GetParcelsInShipment(model.Id);
    overview.NumberParcels = parcels.Count;

    var orders = parcels.Select(s => WareHouseClient.GetOrderNumberFromParcelId(s.ParcelNumber)).ToList();
    overview.NumberOrders = orders.Distinct().Count();

    // check validations
    var vals = Shipments.GetShipmentValidations(model.Id);
    if (model.ValidationTypeId == Constants.OrderValidationType)
    {
        if (vals.Count > 0)
        {
            overview.NumberOrdersTotal = vals.Count();
            overview.NumberParcelsTotal = vals.Sum(s => WareHouseClient.GetParcelsPerOrder(s.ValidateReference));
        }
    }

    return overview;
}
It looks like you're using asynchronous methods while you really want threads.
Asynchronous methods yield control back to the calling method when an awaitable is awaited, then resume when the awaited operation has completed. You can see how it works here.
Basically, the main usefulness of async/await methods is to avoid blocking the UI thread, so that it stays responsive.
If you want to fire multiple processings in parallel, you will want to use threads, like such:
using System.Threading.Tasks;

public void MainMethod() {
    // Parallel.ForEach will automagically run the "right" number of threads in parallel
    Parallel.ForEach(shipments, shipment => ProcessShipment(shipment));

    // do something when all shipments have been processed
}

public void ProcessShipment(Shipment shipment) { ... }
Marking the method as async doesn't auto-magically make it execute in parallel. Since you're not using await at all, it will in fact execute completely synchronously, as if it weren't async. You might have read somewhere that async makes functions execute asynchronously, but this simply isn't true - forget it. The only thing it does is build a state machine to handle task continuations for you when you use await, and generate all the code needed to manage those tasks and their error handling.
If your code is mostly I/O bound, use the asynchronous APIs with await to make sure the methods actually execute in parallel. If they are CPU bound, a Task.Run (or Parallel.ForEach) will work best.
Also, there's no point in doing .Select(async model => await GetOverview(model)). It's almost equivalent to .Select(model => GetOverview(model)). In any case, since the method doesn't actually await anything asynchronous, it will be executed while doing the Select, long before you get to the Task.WhenAll.
Given this, even GetShipmentByStatus's async is pretty much useless - you only use await on the Task.WhenAll, but since all the tasks are already completed by that point, it will simply complete synchronously.
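For the I/O-bound case, a hedged sketch of what the pattern should look like once GetOverview performs real asynchronous work (for CPU-bound work, wrap each call in Task.Run instead):
// Start every overview up front, then await them all together.
List<Task<ShipmentOverview>> tasks = shipments
    .Select(model => GetOverview(model)) // each call returns a hot task
    .ToList();

ShipmentOverview[] finished = await Task.WhenAll(tasks);
return finished.ToList();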
If your tasks are CPU bound and not I/O bound, then here is the pattern I believe you're looking for:
static void Main(string[] args) {
    Task firstStepTask = Task.Run(() => firstStep());
    Task secondStepTask = Task.Run(() => secondStep());
    //...
    Task finalStepTask = Task.Factory.ContinueWhenAll(
        new Task[] { firstStepTask, secondStepTask }, // more if more than two steps...
        (previousTasks) => finalStep());
    finalStepTask.Wait();
}
I have an abstract class called VehicleInfoFetcher which returns information asynchronously from a WebClient via this method:
public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID);
I'd like to combine the results of two separate instances of this class, running each in parallel before combining the results. This is done within a third class, CombinedVehicleInfoFetcher (also itself a subclass of VehicleInfoFetcher)
Here's my code - but I'm not quite convinced that it's running the tasks in parallel; am I doing it right? Could it be optimized?
public class CombinedVehicleInfoFetcher : VehicleInfoFetcher
{
    public HashSet<VehicleInfoFetcher> VehicleInfoFetchers { get; set; }

    public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID)
    {
        // Create a list of parallel tasks to run
        var resultTasks = new List<Task<DTOrealtimeinfo>>();
        foreach (VehicleInfoFetcher fetcher in VehicleInfoFetchers)
            resultTasks.Add(fetcher.getVehicleInfo(stopID));

        // run each task
        foreach (var task in resultTasks)
            await task;

        // Wait for all the results to come in
        await Task.WhenAll(resultTasks.ToArray());

        // combine the results
        var allRealtimeResults = new List<DTOrealtimeinfo>(resultTasks.Select(t => t.Result));
        return combineTaskResults(allRealtimeResults);
    }

    DTOrealtimeinfo combineTaskResults(List<DTOrealtimeinfo> realtimeResults)
    {
        // ...
        return rtInfoOutput;
    }
}
Edit
Some very helpful answers; here is a re-written example to aid the discussion with usr below:
public override async Task<object> combineResults()
{
    // Create a list of parallel tasks to run
    var resultTasks = new List<Task<object>>();
    foreach (AnotherClass cls in this.OtherClasses)
        resultTasks.Add(cls.getResults());

    // Point A - have the cls.getResults() methods been called yet?

    // Wait for all the results to come in
    await Task.WhenAll(resultTasks.ToArray());

    // combine the results
    return new List<object>(resultTasks.Select(t => t.Result));
}
Almost all tasks start out already started. Probably, whatever fetcher.getVehicleInfo returns is already started. So you can remove:
// run each task
foreach (var task in resultTasks)
await task;
Task.WhenAll is faster and has better error behavior (you want all exceptions to be propagated, not just the first you happen to stumble upon).
Also, await does not start a task; it waits for completion. You have to arrange for the tasks to be started separately, but as I said, almost all tasks are already started when you get them. This is best practice as well.
To help our discussion in the comments:
Task Test1() { return new Task(() => {}); }
Task Test2() { return Task.Factory.StartNew(() => {}); }
Task Test3() { return new FileStream("").ReadAsync(...); }
Task Test4() { return new TaskCompletionSource<object>().Task; }
Does not "run" when returned from the method. Must be started. Bad practice.
Runs when returned. Does not matter what you do with it, it is already running. Not necessary to add it to a list or store it somewhere.
Already runs like (2).
The notion of running does not make sense here. This task will never complete although it cannot be explicitly started.
I have the following code to build an advanced data structure which is pulled from SQL Server; when the retrieval of that data is complete, I update the UI. The code used is:
private void BuildSelectedTreeViewSectionAsync(TreeNode selectedNode)
{
    // Initialise.
    SqlServer instance = null;
    SqlServer.Database database = null;

    // Build and expand the TreeNode.
    Task task = null;
    task = Task.Factory.StartNew(() => {
        string[] tmpStrArr = selectedNode.Text.Split(' ');
        string strDatabaseName = tmpStrArr[0];
        instance = SqlServer.Instance(this.conn);
        database = instance.GetDatabaseFromName(strDatabaseName);
    }).ContinueWith(cont => {
        instance.BuildTreeViewForSelectedDatabase(this.customTreeViewSql,
            selectedNode, database);
        selectedNode.Expand();
        task.Dispose();
    }, CancellationToken.None, TaskContinuationOptions.OnlyOnRanToCompletion,
        this.MainUiScheduler);
}
This works as it should on my main development machine; that is, it completes the build of the database object, then, in the continuation, updates the UI and disposes of the task (Task object).
However, I have been doing some testing on another machine and I get an InvalidOperationException. This is due to task.Dispose() being called on a task which is still in the Running state, but the continuation cont should never fire unless the task has run to completion.
I am aware that it is almost always unnecessary to call Dispose on tasks. This question is more about why the continuation is firing at all here.
The reason for this is simple: you are calling Dispose on the continuation itself, not on the first task.
Your code consists of:
Task task = null;
task = <task 1>.ContinueWith(t => {
    /* task 2 */
    task.Dispose();
});
In the above code, task is equal to the continuation (ContinueWith doesn't pass back the original Task, it passes the continuation) and that's what's getting captured in the closure you're passing to ContinueWith.
You can test this by comparing the references of the Task parameter passed into the ContinueWith method with task:
Task task = null;
task = <task 1>.ContinueWith(t => {
    /* task 2 */
    if (object.ReferenceEquals(t, task))
        throw new InvalidOperationException("Trying to dispose of myself!");
    task.Dispose();
});
In order to dispose of the first, you need to break it up into two Task variables and capture the first Task, like so:
var task1 = <task 1>;
var task2 = task1.ContinueWith(t => {
    // Dispose of task1 when done.
    using (task1)
    {
        // Do task 2.
    }
});
However, because the previous Task is passed to you as a parameter in the ContinueWith method, you don't need to capture task in the closure at all, you can simply call Dispose on the Task passed as a parameter to you:
var task = <task 1>.ContinueWith(t => {
    // t = task 1
    // task = task 2

    // Dispose of task 1 when done.
    using (t)
    {
        // Do task 2.
    }
});
I'm pretty sure what you are trying to do above is equivalent to:
task = Task.Factory.StartNew(() => ...);
task.ContinueWith(cont => { ... task.Dispose(); });
However, what gets assigned to the task variable in your code is the ContinueWith work item, not the original StartNew work item.
More importantly, you probably don't even need to worry about task.Dispose() in this scenario.
The only time there is any real value in doing task.Dispose() is when there is a task.Wait() involved somewhere, which allocates an OS wait handle resource under the covers.
More info:
http://social.msdn.microsoft.com/Forums/en/parallelextensions/thread/7b3a42e5-4ebf-405a-8ee6-bcd2f0214f85