Organizing task results using print statements - c#

I am sending URLs that are in a list using Task, BeginGetResponse, EndGetResponse, FromAsync, and ContinueWith. Using Console.WriteLine, is there a way to organize/schedule each URL's results when they come back? When I try to handle this, the print statements are out of sync.
Example:
1. url: google.com
   - ResponseStatus: up
   - Sent time
   - Received time
2. url: yahoo.com
etc.

If you want to print the results in a particular order, rather than in the order that they happen to complete, then don't add the print statements in ContinueWith calls to each task. Instead call WhenAll on a collection of all of the tasks and then add a continuation to that which prints all of the values.
public static void AddPrintStatements(IEnumerable<Task<string>> tasks)
{
    Task.WhenAll(tasks)
        .ContinueWith(t =>
        {
            foreach (var line in t.Result)
                PrintResults(line);
        });
}
Since you're using 4.0 and don't have a WhenAll, you can use this instead:
public static Task<IEnumerable<T>> WhenAll<T>(IEnumerable<Task<T>> tasks)
{
    return Task.Factory.ContinueWhenAll(tasks.ToArray(),
        results => results.Select(t => t.Result));
}
If you want the results to be printed as they come in, but still in their original order, you can do that too by going through each task and adding a continuation that is chained off both the previous continuation and the given task:
public static void AddPrintStatements2(IEnumerable<Task<string>> tasks)
{
    Task continuation = Task.FromResult(true);
    foreach (var task in tasks)
    {
        continuation = continuation.ContinueWith(t =>
                task.ContinueWith(t2 => PrintResults(t2.Result)))
            .Unwrap();
    }
}
Since you're using 4.0 you also won't have FromResult, so you can use this:
public static Task<T> FromResult<T>(T result)
{
    var tcs = new TaskCompletionSource<T>();
    tcs.SetResult(result);
    return tcs.Task;
}


Catching exceptions in Task.WhenAll

A class has async method MonitorAsync(), which starts a long-running parallel operation. I have a collection of these monitors; these are all kicked off as follows:
internal async Task RunAsync()
{
    var tasks = monitors.Select((p) => p.Value.MonitorAsync());
    await Task.WhenAll(tasks);
}
If a monitor falls over, I need to know (basically I will run it up again). I've looked into ContinueWith and so on but when running a bunch of async tasks in parallel, how can I ensure I definitely know when one ends?
For context, RunAsync is basically the core of my application.
If a monitor falls over, I need to know (basically I will run it up again).
The easiest way to do this is to define this logic in a separate method:
internal async Task RunAsync()
{
    var tasks = monitors.Select(p => MonitorAndRestart(p));
    await Task.WhenAll(tasks);

    async Task MonitorAndRestart(P p)
    {
        while (true)
        {
            try { await p.Value.MonitorAsync(); }
            catch { ... }
            p.Restart();
        }
    }
}
If you want to know when one ends (and that does not affect the others), ContinueWith() could be the way.
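A minimal sketch of that ContinueWith approach (RestartMonitor is a hypothetical helper for whatever "run it up again" means in your app):

internal Task RunAsync()
{
    var tasks = monitors.Select(p =>
        p.Value.MonitorAsync().ContinueWith(t =>
        {
            // Runs when that particular monitor's task finishes, faulted or not.
            if (t.IsFaulted)
                RestartMonitor(p); // hypothetical restart helper
        }));

    return Task.WhenAll(tasks);
}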
Alternatively, how about WhenAny in a loop? (Task.WaitAny blocks and returns an int, so the awaitable variant is Task.WhenAny.)

while (anyTaskUnfinished)
{
    await Task.WhenAny(tasks);
}
// Stuff you do after WhenAll() comes here

I am uncertain whether you have to remove already-finished tasks, or whether it waits for any newly finishing one.
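For what it's worth, with Task.WhenAny you do have to remove the completed task from the collection each time, otherwise the loop keeps returning immediately for the same finished task; a minimal sketch:

var pending = monitors.Select(p => p.Value.MonitorAsync()).ToList();
while (pending.Count > 0)
{
    // WhenAny returns the first task to finish (successfully or not).
    Task finished = await Task.WhenAny(pending);
    pending.Remove(finished);

    if (finished.IsFaulted)
    {
        // A monitor fell over; inspect finished.Exception and restart it here.
    }
}
// Everything you would do after WhenAll() goes here.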
You can try this:
If you do not want to call the Task.Wait method to wait for a task's completion, you can also retrieve the AggregateException from the task's Exception property:
internal async Task RunAsync()
{
    // Materialize the tasks so they are not re-created when the query is enumerated again below.
    var tasks = monitors.Select((p) => p.Value.MonitorAsync()).ToList();
    try
    {
        await Task.WhenAll(tasks);
    }
    catch (Exception)
    {
        foreach (var task in tasks.Where(x => x.IsFaulted))
            foreach (var exception in task.Exception.InnerExceptions)
            {
                // Do Something
            }
    }
}
Reference: Exception handling (Task Parallel Library)

Queuing asynchronous task in C#

I have a few methods that report some data to a database. We want to invoke all calls to the data service asynchronously. These calls to the data service are spread all over the code, so we want to make sure they are executed one after another, in order, at any given time. Initially, I was using async/await on each of these methods and each of the calls was executed asynchronously, but we found out that if they are out of sequence there is room for errors.
So, I thought we should queue all these asynchronous tasks and send them on a separate thread, but I want to know what options we have. I came across SemaphoreSlim. Will this be appropriate in my use case?
Or what other options will suit my use case? Please guide me.
So, this is what I currently have in my code:
public static SemaphoreSlim mutex = new SemaphoreSlim(1);

//first DS call
public async Task SendModuleDataToDSAsync(Module parameters)
{
    var tasks1 = new List<Task>();
    var tasks2 = new List<Task>();
    //await mutex.WaitAsync(); **is this correct way to use SemaphoreSlim ?**
    foreach (var setting in Module.param)
    {
        Task job1 = SaveModule(setting);
        tasks1.Add(job1);
        Task job2 = SaveModule(GetAdvancedData(setting));
        tasks2.Add(job2);
    }
    await Task.WhenAll(tasks1);
    await Task.WhenAll(tasks2);
    //mutex.Release(); // **is this correct?**
}

private async Task SaveModule(Module setting)
{
    await Task.Run(() =>
    {
        // Invokes Calls to DS
        ...
    });
}

//somewhere down the main thread, invoking second call to DS
//Second DS Call
private async Task SendInstrumentSettingsToDS(<param1>, <param2>)
{
    //await mutex.WaitAsync(); // **is this correct?**
    await Task.Run(() =>
    {
        //TrackInstrumentInfoToDS
        //mutex.Release(); // **is this correct?**
    });
    if (param2)
    {
        await Task.Run(() =>
        {
            //TrackParam2InstrumentInfoToDS
        });
    }
}
Initially, I was using async/await on each of these methods and each of the calls was executed asynchronously, but we found out that if they are out of sequence there is room for errors.
So, I thought we should queue all these asynchronous tasks and send them on a separate thread, but I want to know what options we have. I came across SemaphoreSlim.
SemaphoreSlim does restrict asynchronous code to running one at a time, and is a valid form of mutual exclusion. However, since "out of sequence" calls can cause errors, then SemaphoreSlim is not an appropriate solution since it does not guarantee FIFO.
In a more general sense, no synchronization primitive guarantees FIFO because that can cause problems due to side effects like lock convoys. On the other hand, it is natural for data structures to be strictly FIFO.
So, you'll need to use your own FIFO queue, rather than having an implicit execution queue. Channels is a nice, performant, async-compatible queue, but since you're on an older version of C#/.NET, BlockingCollection<T> would work:
public sealed class ExecutionQueue
{
    private readonly BlockingCollection<Func<Task>> _queue = new BlockingCollection<Func<Task>>();

    public ExecutionQueue() => Completion = Task.Run(() => ProcessQueueAsync());

    public Task Completion { get; }

    public void Complete() => _queue.CompleteAdding();

    private async Task ProcessQueueAsync()
    {
        foreach (var value in _queue.GetConsumingEnumerable())
            await value();
    }
}
The only tricky part with this setup is how to queue work. From the perspective of the code queueing the work, they want to know when the lambda is executed, not when the lambda is queued. From the perspective of the queue method (which I'm calling Run), the method needs to complete its returned task only after the lambda is executed. So, you can write the queue method something like this:
public Task Run(Func<Task> lambda)
{
    var tcs = new TaskCompletionSource<object>();
    _queue.Add(async () =>
    {
        // Execute the lambda and propagate the results to the Task returned from Run
        try
        {
            await lambda();
            tcs.TrySetResult(null);
        }
        catch (OperationCanceledException ex)
        {
            tcs.TrySetCanceled(ex.CancellationToken);
        }
        catch (Exception ex)
        {
            tcs.TrySetException(ex);
        }
    });
    return tcs.Task;
}
This queueing method isn't as perfect as it could be. If a task completes with more than one exception (this is normal for parallel code), only the first one is retained (this is normal for async code). There's also an edge case around OperationCanceledException handling. But this code is good enough for most cases.
Now you can use it like this:
public static ExecutionQueue _queue = new ExecutionQueue();

public async Task SendModuleDataToDSAsync(Module parameters)
{
    var tasks1 = new List<Task>();
    var tasks2 = new List<Task>();
    foreach (var setting in Module.param)
    {
        Task job1 = _queue.Run(() => SaveModule(setting));
        tasks1.Add(job1);
        Task job2 = _queue.Run(() => SaveModule(GetAdvancedData(setting)));
        tasks2.Add(job2);
    }
    await Task.WhenAll(tasks1);
    await Task.WhenAll(tasks2);
}
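For readers on a newer runtime, the Channels option mentioned above might look roughly like this (a minimal sketch, assuming the System.Threading.Channels package is available):

using System;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class ChannelExecutionQueue
{
    private readonly Channel<Func<Task>> _queue = Channel.CreateUnbounded<Func<Task>>();

    public ChannelExecutionQueue() => Completion = Task.Run(ProcessQueueAsync);

    public Task Completion { get; }

    public void Complete() => _queue.Writer.Complete();

    public void Enqueue(Func<Task> lambda) => _queue.Writer.TryWrite(lambda);

    private async Task ProcessQueueAsync()
    {
        // Items are read and awaited strictly in the order they were written.
        while (await _queue.Reader.WaitToReadAsync())
            while (_queue.Reader.TryRead(out var lambda))
                await lambda();
    }
}

The same Run-style wrapper with a TaskCompletionSource can be layered on top if callers need to observe when their particular lambda has finished.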
Here's a compact solution that has the least amount of moving parts but still guarantees FIFO ordering (unlike some of the suggested SemaphoreSlim solutions). There are two overloads for Enqueue so you can enqueue tasks with and without return values.
using System;
using System.Threading;
using System.Threading.Tasks;

public class TaskQueue
{
    private Task _previousTask = Task.CompletedTask;

    public Task Enqueue(Func<Task> asyncAction)
    {
        return Enqueue(async () =>
        {
            await asyncAction().ConfigureAwait(false);
            return true;
        });
    }

    public async Task<T> Enqueue<T>(Func<Task<T>> asyncFunction)
    {
        var tcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
        // Get the predecessor and wait until it's done. Also atomically swap in our own completion task.
        await Interlocked.Exchange(ref _previousTask, tcs.Task).ConfigureAwait(false);
        try
        {
            return await asyncFunction().ConfigureAwait(false);
        }
        finally
        {
            tcs.SetResult();
        }
    }
}
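Usage might look like this (a sketch reusing SaveModule and GetAdvancedData from the question):

private static readonly TaskQueue _dsQueue = new TaskQueue();

public async Task SendModuleDataToDSAsync(Module parameters)
{
    var tasks = new List<Task>();
    foreach (var setting in Module.param)
    {
        // Each lambda is queued behind the previous one and runs in FIFO order,
        // even though the caller only awaits the whole batch at the end.
        tasks.Add(_dsQueue.Enqueue(() => SaveModule(setting)));
        tasks.Add(_dsQueue.Enqueue(() => SaveModule(GetAdvancedData(setting))));
    }
    await Task.WhenAll(tasks);
}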
Please keep in mind that your first solution, queueing all tasks into lists, doesn't ensure that the tasks are executed one after another. They all run in parallel because they are not awaited before the next task is started.
So yes, you have to use a SemaphoreSlim for async locking and awaiting. A simple implementation might be:
private readonly SemaphoreSlim _syncRoot = new SemaphoreSlim(1);

public async Task SendModuleDataToDSAsync(Module parameters)
{
    await this._syncRoot.WaitAsync();
    try
    {
        foreach (var setting in Module.param)
        {
            await SaveModule(setting);
            await SaveModule(GetAdvancedData(setting));
        }
    }
    finally
    {
        this._syncRoot.Release();
    }
}
If you can use Nito.AsyncEx the code can be simplified to:
public async Task SendModuleDataToDSAsync(Module parameters)
{
    using var lockHandle = await this._syncRoot.LockAsync();

    foreach (var setting in Module.param)
    {
        await SaveModule(setting);
        await SaveModule(GetAdvancedData(setting));
    }
}
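Equivalently, Nito.AsyncEx's AsyncLock type can replace the SemaphoreSlim field entirely (a sketch):

private readonly AsyncLock _mutex = new AsyncLock();

public async Task SendModuleDataToDSAsync(Module parameters)
{
    // LockAsync returns a disposable that releases the lock when disposed,
    // so no try/finally is needed.
    using (await this._mutex.LockAsync())
    {
        foreach (var setting in Module.param)
        {
            await SaveModule(setting);
            await SaveModule(GetAdvancedData(setting));
        }
    }
}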
One option is to queue operations that will create tasks instead of queuing already running tasks as the code in the question does.
PseudoCode without locking:
Queue<Func<Task>> tasksQueue = new Queue<Func<Task>>();

async Task RunAllTasks()
{
    while (tasksQueue.Count > 0)
    {
        var taskCreator = tasksQueue.Dequeue(); // get creator
        var task = taskCreator();               // starting one task at a time here
        await task;                             // wait till the task completes
    }
}

// note that declaring createSaveModuleTask does not
// start the SaveModule task - it will only happen after this func is invoked
// inside RunAllTasks
Func<Task> createSaveModuleTask = () => SaveModule(setting);

tasksQueue.Enqueue(createSaveModuleTask);
tasksQueue.Enqueue(() => SaveModule(GetAdvancedData(setting)));

// no DB operations started at this point

// this will start tasks from the queue one by one.
await RunAllTasks();
Using ConcurrentQueue would likely be the right thing in actual code. You would also need to know the total number of expected operations, so you can stop once all of them have been started and awaited one after another.
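A minimal sketch of the same loop over a ConcurrentQueue (from System.Collections.Concurrent; the names are illustrative, not from the question):

ConcurrentQueue<Func<Task>> taskCreators = new ConcurrentQueue<Func<Task>>();

async Task RunAllTasksAsync()
{
    // TryDequeue is thread-safe, so producers can keep adding work
    // from other threads while this loop drains the queue.
    while (taskCreators.TryDequeue(out Func<Task> createTask))
    {
        await createTask(); // one DS call at a time, in FIFO order
    }
}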
Building on your comment under Alexei's answer, your approach with the SemaphoreSlim is correct.
Assuming that the methods SendInstrumentSettingsToDS and SendModuleDataToDSAsync are members of the same class, you simply need an instance field holding a SemaphoreSlim (named _semaphore below, since lock is a reserved keyword). Then, at the start of each method that needs synchronization, call await _semaphore.WaitAsync() and call _semaphore.Release() in the finally block.
private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(1);

public async Task SendModuleDataToDSAsync(Module parameters)
{
    await _semaphore.WaitAsync();
    try
    {
        ...
    }
    finally
    {
        _semaphore.Release();
    }
}

private async Task SendInstrumentSettingsToDS(<param1>, <param2>)
{
    await _semaphore.WaitAsync();
    try
    {
        ...
    }
    finally
    {
        _semaphore.Release();
    }
}
It is important that the call to _semaphore.Release() is in the finally block, so that if an exception is thrown anywhere in the try block the semaphore is still released.

Dependencies between async tasks

I call my async method several times, once for each unit:
var tasks = otherData.Select(async unit =>
    await OneUnitProcessor.ProcessOneUnitAsync(
        var1,
        authenticationResponse,
        unit,
        reservation: new Reservation()
    )
);
await Task.WhenAll(tasks);
This is my async method:
public static async Task ProcessOneUnitAsync(string var1, ServiceResponse authenticationResponse, ChannelManagerJsonHolder unit, IReservation reservation, CookieCollection cookieCollection = null)
{
    List<Task<HttpResponseMessage>> tasksDeleteReservations = reservation.DeleteReservations(allLinksToDelete, authenticationResponse, unit);

    var occupiedPricesItems = OccupationPriceGetter.GetOccupiedPeriodsWithPrices(fiksniTecaj, authenticationResponse, unit, cookieCollection);

    tasksOccupied = allOccupationsToInsert.Select(async price => await Api.CalendarPrices.SendRequestAsync(authenticationResponse, unit, price, unavailable: true));

    await Task.WhenAll(tasksOccupied.Union(tasksDeleteReservations));
}
Basically there are tasks that delete reservations (tasksDeleteReservations) and tasks that insert reservations (tasksOccupied). This solution is not good for me because I want all "old" reservations to be deleted before the new ones are inserted (all tasksDeleteReservations tasks finished before any tasksOccupied task starts).
One solution would be to have two awaits inside my async method, but I don't think it is a good solution to have multiple awaits (control will return to the caller and I think the program will exit before all the other tasks, related to inserting reservations, are finished).
The other solution would be to block on deleting before continuing to inserting, but then it is probably not async code anymore.
How do I achieve both asynchronicity and order of execution in a situation like this?
EDIT1: Here is the code that calls the tasks which process all units:
private async Task ProcessAllUnits(string var1, IEnumerable<ChannelManagerJsonHolder> otherData, ServiceResponse authenticationResponse)
{
    try
    {
        var tasks = otherData.Select(async unit => await OneUnitProcessor.ProcessOneUnitAsync(fiksniTecaj, authenticationResponse, unit, reservation: new Reservation()));
        await Task.WhenAll(tasks);
    }
    catch (AggregateException ex)
    {
        foreach (var innerException in ex.InnerExceptions)
        {
            new FileLogger().Log("Unit level exception:" + innerException.Message, ChannelManager.Core.Utilities.Logging.LogLevel.Error);
        }
    }
}
ProcessAllUnits is called by some other method. Here is the code of that method:
private async Task LoginAndProcessAllUnits(string var1, UserModel oneGroupedCMJsonHolder, IAuthentication authentication)
{
    var task = ProcessAllUnits(fiksniTecaj, otherData, authenticationResponse);
    await task;
}
This is the top-level method:
public void ProcessAllUsersAsync(List<ChannelManagerJsonHolder> CMJsonHolder, string var1)
{
    var tasks = new List<Task>();
    foreach (var group in groupedCMJsonHolder)
    {
        var task = LoginAndProcessAllUnits(fiksniTecaj, group, new Authentication());
        tasks.Add(task);
    }
    Task.WaitAll(tasks.ToArray());
}
There's nothing wrong with having multiple awaits in one method. In fact, that's why you're using await in the first place, really - to manage the state machine and continuations for you.
Yes, control will return to the caller. But the point is, if you're using await, you need to use await all the way - the behaviour you're describing will only occur when your await chain is broken by some method that doesn't await on some asynchronous method. That's where your problem is, not the async methods that await multiple times.
All you need is two awaits on those task collections. Control does return to the caller, but the task is not completed yet. If awaiting meant the task had already run to completion, what good would the feature be? It would amount to what return does.
The task that you start with async unit => ... will also complete only when the OneUnitProcessor.ProcessOneUnitAsync... task is done.
And await Task.WhenAll(tasks); completes only when all of these tasks are done.
await is for serializing the execution of tasks.
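In other words, ProcessOneUnitAsync can simply await the deletions before starting the inserts; a sketch of just that ordering change, using the names from the question (everything else in the method stays as it was):

public static async Task ProcessOneUnitAsync(string var1, ServiceResponse authenticationResponse, ChannelManagerJsonHolder unit, IReservation reservation, CookieCollection cookieCollection = null)
{
    // First wait for all "old" reservations to be deleted...
    List<Task<HttpResponseMessage>> tasksDeleteReservations =
        reservation.DeleteReservations(allLinksToDelete, authenticationResponse, unit);
    await Task.WhenAll(tasksDeleteReservations);

    // ...then start and await the inserts. The outer await Task.WhenAll(tasks)
    // in ProcessAllUnits still waits for this whole method to finish.
    var tasksOccupied = allOccupationsToInsert.Select(price =>
        Api.CalendarPrices.SendRequestAsync(authenticationResponse, unit, price, unavailable: true));
    await Task.WhenAll(tasksOccupied);
}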

Is it OK to do some async/await inside some .NET Parallel.ForEach() code?

Given the following code, is it OK to do async/await inside a Parallel.ForEach ?
eg.
Parallel.ForEach(names, name =>
{
    // Do some stuff...
    var foo = await GetStuffFrom3rdPartyAsync(name);
    // Do some more stuff, with the foo.
});
or are there some gotchas that I need to be made aware of?
EDIT: No idea if this compiles, btw. Just pseudo-code... thinking out loud.
No, it doesn't make sense to combine async with Parallel.ForEach.
Consider the following example:
private void DoSomething()
{
var names = Enumerable.Range(0,10).Select(x=> "Somename" + x);
Parallel.ForEach(names, async(name) =>
{
await Task.Delay(1000);
Console.WriteLine("Name {0} completed",name);
});
Console.WriteLine("Parallel ForEach completed");
}
What output would you expect?
Name Somename3 completed
Name Somename8 completed
Name Somename4 completed
...
Parallel ForEach completed
That's not what will happen. It will output:
Parallel ForEach completed
Name Somename3 completed
Name Somename8 completed
Name Somename4 completed
...
Why? Because when ForEach hits the first await, the method actually returns; Parallel.ForEach doesn't know the delegate is asynchronous, so as far as it is concerned the delegate ran to completion. The code after the await runs as a continuation on another thread, not on the "parallel processing" thread.
Stephen Toub addressed this here.
From the name, I'm assuming that GetStuffFrom3rdPartyAsync is I/O-bound. The Parallel class is specifically for CPU-bound code.
In the asynchronous world, you can start multiple tasks and then (asynchronously) wait for them all to complete using Task.WhenAll. Since you're starting with a sequence, it's probably easiest to project each element to an asynchronous operation, and then await all of those operations:
await Task.WhenAll(names.Select(async name =>
{
    // Do some stuff...
    var foo = await GetStuffFrom3rdPartyAsync(name);
    // Do some more stuff, with the foo.
}));
A close alternative might be this:
static void ForEach<T>(IEnumerable<T> data, Func<T, Task> func)
{
    var tasks = data.Select(item =>
        Task.Run(() => func(item)));
    Task.WaitAll(tasks.ToArray());
}

// ...

ForEach(names, name => GetStuffFrom3rdPartyAsync(name));
Ideally, you shouldn't be using a blocking call like Task.WaitAll if you can make the whole chain of method calls async, "all the way down" the current call stack:
var tasks = data.Select(item =>
    Task.Run(() => func(item)));
await Task.WhenAll(tasks.ToArray());
Furthermore, if you don't do any CPU-bound work inside GetStuffFrom3rdPartyAsync, Task.Run may be redundant:
var tasks = data.Select(item => func(item));
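Putting those pieces together, an all-async variant of the helper might look like this (a sketch):

static async Task ForEachAsync<T>(IEnumerable<T> data, Func<T, Task> func)
{
    // No Task.Run wrapper: each func(item) starts the asynchronous operation directly.
    var tasks = data.Select(item => func(item));
    await Task.WhenAll(tasks);
}

// ...

await ForEachAsync(names, name => GetStuffFrom3rdPartyAsync(name));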
As pointed out by @Sriram Sakthivel, there are some problems with using Parallel.ForEach with asynchronous lambdas. Stephen Toub's ForEachAsync can do the equivalent. He talks about it here, but here is the code:
public static class Extensions
{
    public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
    {
        return Task.WhenAll(
            from partition in Partitioner.Create(source).GetPartitions(dop)
            select Task.Run(async delegate
            {
                using (partition)
                    while (partition.MoveNext())
                        await body(partition.Current);
            }));
    }
}
It uses the Partitioner class to create a load-balancing partitioner (doco), and lets you specify how many concurrent workers to run via the dop parameter. To see the difference between it and Parallel.ForEach, try the following code:
class Program
{
    public static async Task GetStuffParallelForEach()
    {
        var data = Enumerable.Range(1, 10);
        Parallel.ForEach(data, async i =>
        {
            await Task.Delay(1000 * i);
            Console.WriteLine(i);
        });
    }

    public static async Task GetStuffForEachAsync()
    {
        var data = Enumerable.Range(1, 10);
        await data.ForEachAsync(5, async i =>
        {
            await Task.Delay(1000 * i);
            Console.WriteLine(i);
        });
    }

    static void Main(string[] args)
    {
        //GetStuffParallelForEach().Wait(); // Finished printed before work is complete
        GetStuffForEachAsync().Wait();      // Finished printed after all work is done
        Console.WriteLine("Finished");
        Console.ReadLine();
    }
}
If you run GetStuffForEachAsync, the program waits for all work to finish. If you run GetStuffParallelForEach, the line Finished is printed before the work is finished.

async / await - am I correctly running these methods in parallel?

I have an abstract class called VehicleInfoFetcher which returns information asynchronously from a WebClient via this method:
public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID);
I'd like to combine the results of two separate instances of this class, running each in parallel before combining the results. This is done within a third class, CombinedVehicleInfoFetcher (also itself a subclass of VehicleInfoFetcher)
Here's my code - but I'm not quite convinced that it's running the tasks in parallel; am I doing it right? Could it be optimized?
public class CombinedVehicleInfoFetcher : VehicleInfoFetcher
{
    public HashSet<VehicleInfoFetcher> VehicleInfoFetchers { get; set; }

    public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID)
    {
        // Create a list of parallel tasks to run
        var resultTasks = new List<Task<DTOrealtimeinfo>>();
        foreach (VehicleInfoFetcher fetcher in VehicleInfoFetchers)
            resultTasks.Add(fetcher.getVehicleInfo(stopID, stopID2, timePointLocal));

        // run each task
        foreach (var task in resultTasks)
            await task;

        // Wait for all the results to come in
        await Task.WhenAll(resultTasks.ToArray());

        // combine the results
        var allRealtimeResults = new List<DTOrealtimeinfo>(resultTasks.Select(t => t.Result));
        return combineTaskResults(allRealtimeResults);
    }

    DTOrealtimeinfo combineTaskResults(List<DTOrealtimeinfo> realtimeResults)
    {
        // ...
        return rtInfoOutput;
    }
}
Edit
Some very helpful answers, here is a re-written example to aid discussion with usr below:
public override async Task<object> combineResults()
{
    // Create a list of parallel tasks to run
    var resultTasks = new List<Task<object>>();
    foreach (AnotherClass cls in this.OtherClasses)
        resultTasks.Add(cls.getResults());

    // Point A - have the cls.getResults() methods been called yet?

    // Wait for all the results to come in
    await Task.WhenAll(resultTasks.ToArray());

    // combine the results
    return new List<object>(resultTasks.Select(t => t.Result));
}
Almost all tasks start out already started. Probably, whatever fetcher.getVehicleInfo returns is already started. So you can remove:
// run each task
foreach (var task in resultTasks)
    await task;
Task.WhenAll is faster and has better error behavior (you want all exceptions to be propagated, not just the first you happen to stumble upon).
Also, await does not start a task. It waits for completion. You have to arrange for the tasks to be started separately, but as I said, almost all tasks are already started when you get them. This is best-practice as well.
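Put together, the overridden method reduces to something like this (a sketch assuming the single-argument getVehicleInfo signature from the abstract class):

public override async Task<DTOrealtimeinfo> getVehicleInfo(string stopID)
{
    // Calling getVehicleInfo starts each fetch; the returned tasks are already "hot".
    var resultTasks = VehicleInfoFetchers
        .Select(fetcher => fetcher.getVehicleInfo(stopID))
        .ToList();

    // Wait for all of them to complete; this is where the fetchers run in parallel.
    var allRealtimeResults = await Task.WhenAll(resultTasks);

    // combine the results
    return combineTaskResults(allRealtimeResults.ToList());
}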
To help our discussion in the comments:
Task Test1() { return new Task(() => {}); }
Task Test2() { return Task.Factory.StartNew(() => {}); }
Task Test3() { return new FileStream("").ReadAsync(...); }
Task Test4() { return new TaskCompletionSource<object>().Task; }
Does not "run" when returned from the method. Must be started. Bad practice.
Runs when returned. Does not matter what you do with it, it is already running. Not necessary to add it to a list or store it somewhere.
Already runs like (2).
The notion of running does not make sense here. This task will never complete although it cannot be explicitly started.
