The documentation for the Task.WhenAny method says that it returns the first task to finish, even if that task faulted. Is there a way to change this behavior so that it returns the first successful task instead?
Something like this should do it (may need some tweaks - haven't tested):
private static async Task<Task> WaitForAnyNonFaultedTaskAsync(IEnumerable<Task> tasks)
{
IList<Task> customTasks = tasks.ToList();
Task completedTask;
do
{
completedTask = await Task.WhenAny(customTasks);
customTasks.Remove(completedTask);
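} while (completedTask.Status != TaskStatus.RanToCompletion && customTasks.Count > 0);
// null signals that no task completed successfully (all faulted or were canceled).
return completedTask.Status == TaskStatus.RanToCompletion ? completedTask : null;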
}
First off, as far as I can tell there is no direct way of doing this without waiting for all the tasks to complete and then finding the first one that ran successfully.
I have not tested the edge cases that might cause issues, and the source code around tasks and continuations would take more than an hour to review properly, so treat the following code as a starting point for discussion. My thoughts on it are at the bottom.
public static class TaskExtensions
{
public static async Task<Task> WhenFirst(params Task[] tasks)
{
if (tasks == null)
{
throw new ArgumentNullException(nameof(tasks), "Must be supplied");
}
else if (tasks.Length == 0)
{
throw new ArgumentException("Must supply at least one task", nameof(tasks));
}
int finishedTaskIndex = -1;
for (int i = 0, j = tasks.Length; i < j; i++)
{
var task = tasks[i];
if (task == null)
throw new ArgumentException($"Task at index {i} is null.", nameof(tasks));
if (finishedTaskIndex == -1 && task.IsCompleted && task.Status == TaskStatus.RanToCompletion)
{
finishedTaskIndex = i;
}
}
if (finishedTaskIndex == -1)
{
var promise = new TaskAwaitPromise(tasks.ToList());
for (int i = 0, j = tasks.Length; i < j; i++)
{
if (finishedTaskIndex == -1)
{
var taskId = i;
#pragma warning disable CS4014 // Because this call is not awaited, execution of the current method continues before the call is completed
//we dont want to await these tasks as we want to signal the first awaited task completed.
tasks[i].ContinueWith((t) =>
{
if (t.Status == TaskStatus.RanToCompletion)
{
if (finishedTaskIndex == -1)
{
finishedTaskIndex = taskId;
promise.InvokeCompleted(taskId);
}
}
else
promise.InvokeFailed();
});
#pragma warning restore CS4014 // Because this call is not awaited, execution of the current method continues before the call is completed
}
}
return await promise.WaitCompleted();
}
// Return the task itself; wrapping it in Task.FromResult would hand the caller a Task<Task> instead.
return finishedTaskIndex > -1 ? tasks[finishedTaskIndex] : null;
}
class TaskAwaitPromise
{
IList<Task> _tasks;
int _taskId = -1;
int _taskCount = 0;
int _failedCount = 0;
public TaskAwaitPromise(IList<Task> tasks)
{
_tasks = tasks;
_taskCount = tasks.Count;
GC.KeepAlive(_tasks);
}
public void InvokeFailed()
{
// Continuations may run on different threads, so increment atomically.
Interlocked.Increment(ref _failedCount);
}
public void InvokeCompleted(int taskId)
{
if (_taskId < 0)
{
_taskId = taskId;
}
}
public async Task<Task> WaitCompleted()
{
await Task.Delay(0);
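// Busy-wait; Volatile.Read stops the JIT from caching the fields across iterations.
while (Volatile.Read(ref _taskId) < 0 && _taskCount != Volatile.Read(ref _failedCount))
{
Thread.Yield();
}
// Index 0 is a valid task index, so compare with >= 0.
return _taskId >= 0 ? _tasks[_taskId] : null;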
}
}
}
I understand the code is lengthy and may have plenty of issues; the concept, however, is that you execute all the tasks in parallel and find the first one that completed successfully.
The idea is to attach a continuation to every task and have that continuation signal back to the original caller. My main concern (other than the fact that I can't remove the continuations) is the while() loop in the code. It is probably best to add some sort of CancellationToken and/or timeout to ensure we don't hang while waiting for a completed task. As written, if none of the tasks ever completes, we never leave this block.
Edit
I changed the code slightly to signal the promise on failure so that a failed task can be handled. I'm still not happy with the code, but it's a start.
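A minimal sketch of the timeout idea mentioned above, assuming a hypothetical helper named WhenFirstSuccessfulAsync (it is not part of the code in this answer). Instead of spinning in a while loop, it awaits Task.WhenAny over the remaining tasks plus a Task.Delay that serves as the timeout and cancellation signal. It returns null when nothing succeeds in time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class FirstSuccess
{
    // Hypothetical helper: returns the first task that ran to completion, or
    // null when every task failed, the timeout elapsed, or the token fired.
    public static async Task<Task> WhenFirstSuccessfulAsync(
        IEnumerable<Task> tasks,
        TimeSpan timeout,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        var pending = tasks.ToList();
        using (var cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken))
        {
            // The delay task completes on timeout and is canceled with the token.
            Task timeoutTask = Task.Delay(timeout, cts.Token);
            try
            {
                while (pending.Count > 0)
                {
                    Task finished = await Task.WhenAny(pending.Append(timeoutTask))
                        .ConfigureAwait(false);
                    if (finished == timeoutTask)
                        return null; // timed out or canceled
                    if (finished.Status == TaskStatus.RanToCompletion)
                        return finished;
                    pending.Remove(finished); // faulted or canceled: keep waiting
                }
                return null; // every task failed
            }
            finally
            {
                cts.Cancel(); // stop the timer behind Task.Delay
            }
        }
    }
}
```

The caller awaits the helper once and then checks the returned task for null before reading its result.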
I'm building a small application and I need help, because I can't figure out where the problem is.
I haven't been using C# for long and I'm learning little by little; all of this is just a hobby for me.
I have the following method, which returns a Tuple and works correctly:
private Tuple<int, int, int, int> CheckStatus()
{
// "out" is a reserved keyword in C#, so the counter is named "started" here.
int started = 0;
int stage = 0;
int retired = 0;
int stop = 0;
for (int i = 0; i < Dgv.Rows.Count; i++)
{
if (Dgv.Rows[i].Cells["Start"].Value != null)
{
started = started + 1;
}
if (Dgv.Rows[i].Cells["Start"].Value != null && Dgv.Rows[i].Cells["Finnish"].Value == null)
{
stage = stage + 1;
}
if (Dgv.Rows[i].Cells["Start"].Value != null && Dgv.Rows[i].Cells["Finnish"].Value != null)
{
stop = stop + 1;
}
}
retired = GetRetirements();
stage = stage - retired;
return new Tuple<int, int, int, int>(started, stage, retired, stop);
}
GetRetirements is now an asynchronous method, so I want to make CheckStatus asynchronous as well and await it. I changed the code to the following, but I have problems:
private async Task<Tuple<int, int, int, int>> CheckStatus()
{
// "out" is a reserved keyword in C#, so the counter is named "started" here.
int started = 0;
int stage = 0;
int retired = 0;
int stop = 0;
for (int i = 0; i < Dgv.Rows.Count; i++)
{
if (Dgv.Rows[i].Cells["Start"].Value != null)
{
started = started + 1;
}
if (Dgv.Rows[i].Cells["Start"].Value != null && Dgv.Rows[i].Cells["Finnish"].Value == null)
{
stage = stage + 1;
}
if (Dgv.Rows[i].Cells["Start"].Value != null && Dgv.Rows[i].Cells["Finnish"].Value != null)
{
stop = stop + 1;
}
}
retired = await GetRetirements();
stage = stage - retired;
return new Tuple<int, int, int, int>(started, stage, retired, stop);
}
But the compiler tells me that it cannot find any of the items (Item1, Item2, Item3, Item4). I don't know where the problem is.
private void GetCheckStatus()
{
LblOut.Text = CheckStatus().Item1.ToString();
LblStage.Text = CheckStatus().Item2.ToString();
LblRetired.Text = CheckStatus().Item3.ToString();
LblStop.Text = CheckStatus().Item4.ToString();
}
Am I doing something wrong? It's the first time I've worked with Tuple and I honestly don't know what could be wrong.
Thank you very much.
Best regards,
CheckStatus is now an async function. To get the result you need to await it, and you likely only want to invoke the function once. Note how async has also been added to GetCheckStatus; it will flow all the way up to an async void event handler, e.g. a button click.
private async Task GetCheckStatus()
{
var status = await CheckStatus();
LblOut.Text = status.Item1.ToString();
LblStage.Text = status.Item2.ToString();
LblRetired.Text = status.Item3.ToString();
LblStop.Text = status.Item4.ToString();
}
You changed CheckStatus() to return a Task<>. You should probably await that and use the result as before.
You could also handle it in different ways, depending on your UI framework, but it comes down to this: the method is now async, so handle it that way.
You've made the inner call async but the outer call is not waiting for it. Try something like:
private async Task GetCheckStatus()
{
var result = await CheckStatus();
LblOut.Text = result.Item1.ToString();
LblStage.Text = result.Item2.ToString();
LblRetired.Text = result.Item3.ToString();
LblStop.Text = result.Item4.ToString();
}
The cause is that you forgot to await the result of CheckStatus() before accessing it.
It is conventional to end the names of async methods with Async. This reminds callers that they are using async-await and that they should await the returned task before accessing the result.
This also has the advantage that you can offer both the normal version and the async version:
async Task<int> GetRetirementsAsync(){...}
async Task<Tuple<int, int, int, int>> CheckStatusAsync()
{
...
int retired = await GetRetirementsAsync();
return new Tuple...
}
async Task GetCheckStatusAsync()
{
var tuple = await CheckStatusAsync();
// process output:
LblOut.Text = tuple.Item1.ToString();
LblStage.Text = tuple.Item2.ToString();
LblRetired.Text = tuple.Item3.ToString();
LblStop.Text = tuple.Item4.ToString();
}
Possible performance improvement
The reason to use GetRetirementsAsync instead of the non-async GetRetirements is that you expect that, somewhere deep inside, the thread would otherwise have to wait idly for results from another process, such as a database query, a file read, or a fetch from the internet.
Instead of waiting idly, you can use async-await to do other things until you really need the results from the database.
You do this by starting the task without awaiting it. The thread won't wait idly for the database, but continues processing your statements until you need the result and await the task.
private async Task<Tuple<int, int, int, int>> CheckStatus()
{
// Get the retirements, do not await yet.
Task<int> taskGetRetirements = GetRetirementsAsync();
// instead of waiting idly, your thread is free to do the following:
// "out" is a reserved keyword in C#, so the counter is named "started" here.
int started = 0;
int stage = 0;
int stop = 0;
for (int i = 0; i < Dgv.Rows.Count; i++)
{
...
}
// now you need the retirements; await the task (retired is declared only once, here)
int retired = await taskGetRetirements;
stage = stage - retired;
return new Tuple<int, int, int, int>(started, stage, retired, stop);
}
How do I close down and wait for a semaphore to be fully released?
private SemaphoreSlim _processSemaphore = new SemaphoreSlim(10);
public async Task<Modification> Process(IList<Command> commands)
{
Assert.IsFalse(_shuttingDown, "Server is in shutdown phase");
await _processSemaphore.WaitAsync();
try
{
// threads that have reached this far must be allowed to complete
return _database.Process(commands);
}
finally
{
_processSemaphore.Release();
}
}
public async Task StopAsync()
{
_shuttingDown = true;
// how to wait for threads to complete without cancelling?
await ?
}
private SemaphoreSlim _processSemaphore = new SemaphoreSlim(10);
private int _concurrency;
private TaskCompletionSource<int> _source;
private ManualResetEvent _awaitor;
public void Start()
{
//solution 1
_concurrency = 0;
_source = new TaskCompletionSource<int>();
_shuttingDown = false;
//solution 2
_awaitor = new ManualResetEvent(false);
//your code
}
public async Task<Modification> Process(IList<Command> commands)
{
Interlocked.Increment(ref _concurrency);
Assert.IsFalse(_shuttingDown, "Server is in shutdown phase");
await _processSemaphore.WaitAsync();
try
{
// threads that have reached this far must be allowed to complete
return _database.Process(commands);
}
finally
{
_processSemaphore.Release();
//check and release
int concurrency = Interlocked.Decrement(ref _concurrency);
if (_shuttingDown && concurrency == 0)
{
//solution 1
_source.TrySetResult(0);
//solution 2
_awaitor.Set();
}
}
}
public async Task StopAsync()
{
_shuttingDown = true;
// how to wait for threads to complete without cancelling?
if (Interlocked.CompareExchange(ref _concurrency, 0, 0) != 0)
{
await _source.Task;//solution 1
_awaitor.WaitOne();//solution 2
}
}
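For comparison, a minimal sketch of another way to drain the semaphore, under the assumption that StopAsync is called only once: it acquires every slot, which can only succeed after all in-flight calls have released theirs. The class name DrainingGate and the RunAsync wrapper are hypothetical and stand in for the Process method above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class DrainingGate
{
    private const int MaxConcurrency = 10;
    private readonly SemaphoreSlim _semaphore =
        new SemaphoreSlim(MaxConcurrency, MaxConcurrency);
    private volatile bool _shuttingDown;

    // Hypothetical wrapper around the real work (Process in the question).
    public async Task<T> RunAsync<T>(Func<Task<T>> work)
    {
        if (_shuttingDown)
            throw new InvalidOperationException("Server is in shutdown phase");
        await _semaphore.WaitAsync().ConfigureAwait(false);
        try
        {
            // Re-check: shutdown may have started while this call was queued.
            if (_shuttingDown)
                throw new InvalidOperationException("Server is in shutdown phase");
            return await work().ConfigureAwait(false);
        }
        finally
        {
            _semaphore.Release();
        }
    }

    public async Task StopAsync()
    {
        _shuttingDown = true;
        // Taking every slot succeeds only once all in-flight calls have finished.
        for (int i = 0; i < MaxConcurrency; i++)
            await _semaphore.WaitAsync().ConfigureAwait(false);
        // Hand the slots back so late callers wake up and fail the re-check
        // instead of waiting forever.
        _semaphore.Release(MaxConcurrency);
    }
}
```

No work is canceled: calls that already hold a slot run to completion, and StopAsync simply finishes after the last of them releases.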
This might not be exactly what you need, but I had a similar case and I solved it with the CountdownEvent class:
private CountdownEvent _countdownEvent = new CountdownEvent(1);
process_method() {
//if the count is zero means that we already called finalize
if (_countdownEvent.IsSet)
return;
try
{
//this can throw an exception if we try to add when the countdown has already reached 0.
//this happens when process_method B has passed the _countdownEvent.IsSet check and is then context-switched
//to another process_method A that was the last one (after finalize waits for 0) and sets the countdown to 0, which
//triggers finalization and should not allow any new process_method, so process_method B not continuing is good (finalization is
//in progress).
_countdownEvent.AddCount(1);
} catch
{
return;
}
try
{
//your process
}
finally
{
_countdownEvent.Signal();
}
}
And then when you are ready to wait for the count to be zero:
finalize() {
_countdownEvent.Signal();
_countdownEvent.Wait(_finalizationSafetyTimeout, cancellationToken);
}
Let's say that I have a couple of tasks:
void Sample(IEnumerable<int> someInts)
{
var taskList = someInts.Select(x => DownloadSomeString(x));
}
async Task<string> DownloadSomeString(int x) {...}
I want to get the result of the first successful task. So, the basic solution is to write something like:
var taskList = someInts.Select(x => DownloadSomeString(x));
string content = string.Empty;
Task<string> firstOne = null;
while (string.IsNullOrWhiteSpace(content)){
try
{
firstOne = await Task.WhenAny(taskList);
if (firstOne.Status != TaskStatus.RanToCompletion)
{
taskList = taskList.Where(x => x != firstOne);
continue;
}
content = await firstOne;
}
catch(...){taskList = taskList.Where(x => x != firstOne);}
}
But this solution seems to run N+(N-1)+...+K tasks, where N is someInts.Count and K is the position of the first successful task, because each pass re-enumerates taskList and so reruns every task except the one captured by WhenAny.
So, is there any way to get the first task that finished successfully while running at most N tasks (the worst case being when the successful task is the last one)?
All you need to do is create a TaskCompletionSource, add a continuation to each of your tasks, and set it when the first one finished successfully:
public static Task<T> FirstSuccessfulTask<T>(IEnumerable<Task<T>> tasks)
{
var taskList = tasks.ToList();
var tcs = new TaskCompletionSource<T>();
int remainingTasks = taskList.Count;
foreach (var task in taskList)
{
task.ContinueWith(t =>
{
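if (t.Status == TaskStatus.RanToCompletion)
tcs.TrySetResult(t.Result);
else
if (Interlocked.Decrement(ref remainingTasks) == 0)
// Use the materialized list (not the original enumerable) and skip canceled tasks, whose Exception is null.
tcs.SetException(new AggregateException(taskList.Where(t1 => t1.Exception != null).SelectMany(t1 => t1.Exception.InnerExceptions)));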
});
}
return tcs.Task;
}
And a version for tasks without a result:
public static Task FirstSuccessfulTask(IEnumerable<Task> tasks)
{
var taskList = tasks.ToList();
var tcs = new TaskCompletionSource<bool>();
int remainingTasks = taskList.Count;
foreach (var task in taskList)
{
task.ContinueWith(t =>
{
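if (t.Status == TaskStatus.RanToCompletion)
tcs.TrySetResult(true);
else
if (Interlocked.Decrement(ref remainingTasks) == 0)
// Use the materialized list (not the original enumerable) and skip canceled tasks, whose Exception is null.
tcs.SetException(new AggregateException(
taskList.Where(t1 => t1.Exception != null).SelectMany(t1 => t1.Exception.InnerExceptions)));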
});
}
return tcs.Task;
}
The problem with "the first successful task" is what to do if all tasks fail? It's a really bad idea to have a task that never completes.
I assume you'd want to propagate the last task's exception if they all fail. With that in mind, I would say something like this would be appropriate:
async Task<Task<T>> FirstSuccessfulTask<T>(IEnumerable<Task<T>> tasks)
{
Task<T>[] ordered = tasks.OrderByCompletion();
for (int i = 0; i != ordered.Length; ++i)
{
var task = ordered[i];
try
{
await task.ConfigureAwait(false);
return task;
}
catch
{
if (i == ordered.Length - 1)
return task;
continue;
}
}
return null; // Never reached
}
This solution builds on the OrderByCompletion extension method that is part of my AsyncEx library; alternative implementations also exist by Jon Skeet and Stephen Toub.
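To illustrate the idea behind ordering tasks by completion, here is a rough sketch of the interleaving approach. It is not the AsyncEx implementation, and the name InCompletionOrder is made up for this example. Note that it returns proxy tasks that mirror the outcome of the originals rather than the original task objects.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class TaskOrdering
{
    // Hypothetical stand-in for OrderByCompletion: the i-th returned task
    // completes with the outcome of the i-th input task to finish.
    public static Task<T>[] InCompletionOrder<T>(IEnumerable<Task<T>> tasks)
    {
        var inputs = tasks.ToList();
        var sources = new TaskCompletionSource<T>[inputs.Count];
        for (int i = 0; i < sources.Length; i++)
            sources[i] = new TaskCompletionSource<T>();

        int next = -1; // index of the last proxy that was completed
        foreach (var input in inputs)
        {
            input.ContinueWith(completed =>
            {
                // Claim the next free proxy and copy the outcome into it.
                var source = sources[Interlocked.Increment(ref next)];
                if (completed.IsFaulted)
                    source.TrySetException(completed.Exception.InnerExceptions);
                else if (completed.IsCanceled)
                    source.TrySetCanceled();
                else
                    source.TrySetResult(completed.Result);
            },
            CancellationToken.None,
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
        }
        return sources.Select(s => s.Task).ToArray();
    }
}
```

Awaiting the returned tasks in array order then visits the results in the order they became available, with each input started only once.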
A straightforward solution is to wait for any task, check whether it is in the RanToCompletion state and, if not, wait again for any task except the ones that have already finished.
async Task<TResult> WaitForFirstCompleted<TResult>( IEnumerable<Task<TResult>> tasks )
{
var taskList = new List<Task<TResult>>( tasks );
while ( taskList.Count > 0 )
{
Task<TResult> firstCompleted = await Task.WhenAny( taskList ).ConfigureAwait(false);
if ( firstCompleted.Status == TaskStatus.RanToCompletion )
{
return firstCompleted.Result;
}
taskList.Remove( firstCompleted );
}
throw new InvalidOperationException( "No task completed successfully" );
}
I have found many methods of using the TaskFactory but I could not find anything about starting more tasks and watching when one ends and starting another one.
I always want to have 10 tasks working.
I want something like this
int nTotalTasks=10;
int nCurrentTask=0;
Task<bool>[] tasks=new Task<bool>[nTotalTasks];
for (int i=0; i<1000; i++)
{
string param1="test";
string param2="test";
if (nCurrentTask<10) // if there are less than 10 tasks then start another one
tasks[nCurrentTask++] = Task.Factory.StartNew<bool>(() =>
{
MyClass cls = new MyClass();
bool bRet = cls.Method1(param1, param2, i); // takes up to 2 minutes to finish
return bRet;
});
// How can I stop the for loop until a new task is finished and start a new one?
}
Check out the Task.WaitAny method:
Waits for any of the provided Task objects to complete execution.
Example from the documentation:
var t1 = Task.Factory.StartNew(() => DoOperation1());
var t2 = Task.Factory.StartNew(() => DoOperation2());
Task.WaitAny(t1, t2);
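Applied to the question's loop, this could look roughly like the sketch below. The helper name RunThrottled and its parameters are assumptions made for the example; the work delegate stands in for the call to MyClass.Method1. The loop blocks in Task.WaitAny whenever the maximum number of tasks is already running.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class WaitAnyThrottle
{
    // Hypothetical helper: runs work(0..itemCount-1) with at most
    // maxConcurrency tasks in flight at any time.
    public static bool[] RunThrottled(int itemCount, int maxConcurrency, Func<int, bool> work)
    {
        var results = new bool[itemCount];
        var running = new List<Task>();
        for (int i = 0; i < itemCount; i++)
        {
            if (running.Count == maxConcurrency)
            {
                // Block until one of the running tasks finishes.
                int finished = Task.WaitAny(running.ToArray());
                // Rethrows if that task failed; remaining tasks are not awaited then.
                running[finished].GetAwaiter().GetResult();
                running.RemoveAt(finished);
            }
            int index = i; // copy, so the lambda does not capture the loop variable
            running.Add(Task.Run(() => { results[index] = work(index); }));
        }
        Task.WaitAll(running.ToArray()); // wait for the last batch
        return results;
    }
}
```

Because Task.WaitAny blocks the calling thread, this variant suits a console or background thread better than a UI thread.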
I would use a combination of Microsoft's Reactive Framework (NuGet "Rx-Main") and TPL for this. It becomes very simple.
Here's the code:
int nTotalTasks=10;
string param1="test";
string param2="test";
IDisposable subscription =
Observable
.Range(0, 1000)
.Select(i => Observable.FromAsync(() => Task.Factory.StartNew<bool>(() =>
{
MyClass cls = new MyClass();
bool bRet = cls.Method1(param1, param2, i); // takes up to 2 minutes to finish
return bRet;
})))
.Merge(nTotalTasks)
.ToArray()
.Subscribe((bool[] results) =>
{
/* Do something with the results. */
});
The key part here is the .Merge(nTotalTasks) which limits the number of concurrent tasks.
If you need to stop the processing partway through, just call subscription.Dispose() and everything gets cleaned up for you.
If you want to process each result as they are produced you can change the code from the .Merge(...) like this:
.Merge(nTotalTasks)
.Subscribe((bool result) =>
{
/* Do something with each result. */
});
This should be all you need. It isn't complete, but all you have to do is wait for the first task to complete and then start the next one.
Task.WaitAny(task to wait on);
Task.Factory.StartNew()
Have you seen the BlockingCollection class? It allows you to have multiple threads running in parallel, and you can wait for results from one task before executing another.
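A minimal producer/consumer sketch along those lines, assuming a fixed pool of worker tasks that pull item indexes from a bounded BlockingCollection. The names QueueWorkers and Process are hypothetical, and the sketch assumes the work delegate does not throw.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class QueueWorkers
{
    // Hypothetical helper: workerCount long-running tasks consume indexes
    // from the queue until the producer marks it as complete.
    public static bool[] Process(int itemCount, int workerCount, Func<int, bool> work)
    {
        var results = new bool[itemCount];
        using (var queue = new BlockingCollection<int>(boundedCapacity: workerCount))
        {
            Task[] workers = Enumerable.Range(0, workerCount)
                .Select(_ => Task.Factory.StartNew(() =>
                {
                    // Blocks while the queue is empty; ends after CompleteAdding.
                    foreach (int index in queue.GetConsumingEnumerable())
                        results[index] = work(index);
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            for (int i = 0; i < itemCount; i++)
                queue.Add(i); // blocks while the bounded queue is full

            queue.CompleteAdding();
            Task.WaitAll(workers);
        }
        return results;
    }
}
```

The bounded capacity keeps the producer from racing far ahead of the workers, which gives the "always 10 working" behavior when workerCount is 10.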
The answer depends on whether the tasks to be scheduled are CPU or I/O bound.
For CPU-intensive work I would use the Parallel.For() API, setting the number of threads/tasks through the MaxDegreeOfParallelism property of ParallelOptions.
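A short sketch of the CPU-bound case; the class and method names are made up for the example, and the work delegate stands in for the call to MyClass.Method1.

```csharp
using System;
using System.Threading.Tasks;

public static class CpuBoundExample
{
    // Hypothetical helper: runs work(i) for i in [0, itemCount) with at
    // most maxParallelism iterations executing concurrently.
    public static bool[] Run(int itemCount, int maxParallelism, Func<int, bool> work)
    {
        var results = new bool[itemCount];
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };
        // Each iteration writes only to its own array slot, so no locking is needed.
        Parallel.For(0, itemCount, options, i => { results[i] = work(i); });
        return results;
    }
}
```

Parallel.For returns only after all iterations have finished, so the results array is complete when Run returns.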
For I/O bound work the number of concurrently executing tasks can be significantly larger than the number of available CPUs, so the strategy is to rely on async methods as much as possible, which reduces the total number of threads waiting for completion.
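As a simple illustration of the I/O-bound case, the following sketch throttles async work with a SemaphoreSlim. This is a commonly used alternative and not the scheduler presented in this answer; the names IoBoundExample and RunAsync are assumptions, and the work delegate is expected to be genuinely asynchronous.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class IoBoundExample
{
    // Hypothetical helper: starts all items, but lets at most
    // maxConcurrency of them pass the gate at the same time.
    public static async Task<bool[]> RunAsync(
        int itemCount, int maxConcurrency, Func<int, Task<bool>> work)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = Enumerable.Range(0, itemCount).Select(async i =>
            {
                await gate.WaitAsync().ConfigureAwait(false);
                try
                {
                    return await work(i).ConfigureAwait(false);
                }
                finally
                {
                    gate.Release(); // free the slot for the next waiting item
                }
            }).ToList(); // materialize so every item is started exactly once

            return await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

No thread is blocked while an item waits for its slot, which is the main difference from the Task.WaitAny style of throttling.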
How can I stop the for loop until a new task is finished and start a new one?
The loop can be throttled by using await:
static void Main(string[] args)
{
var task = DoWorkAsync();
task.Wait();
// handle results
// task.Result;
Console.WriteLine("Done.");
}
async static Task<bool> DoWorkAsync()
{
const int NUMBER_OF_SLOTS = 10;
string param1="test";
string param2="test";
// Start every slot at true so that the && accumulation below is meaningful.
var results = Enumerable.Repeat(true, NUMBER_OF_SLOTS).ToArray();
AsyncWorkScheduler ws = new AsyncWorkScheduler(NUMBER_OF_SLOTS);
for (int i = 0; i < 1000; ++i)
{
await ws.ScheduleAsync((slotNumber) => DoWorkAsync(i, slotNumber, param1, param2, results));
}
ws.Complete();
await ws.Completion;
// Overall success (an assumption about the intended result): every slot saw only successful runs.
return results.All(r => r);
}
async static Task DoWorkAsync(int index, int slotNumber, string param1, string param2, bool[] results)
{
// Always run the work first; putting the await on the right of && would skip it once a slot is false.
bool ok = await Task.Factory.StartNew<bool>(() =>
{
MyClass cls = new MyClass();
bool bRet = cls.Method1(param1, param2, index); // takes up to 2 minutes to finish
return bRet;
});
results[slotNumber] = results[slotNumber] && ok;
}
A helper class, AsyncWorkScheduler, uses TPL Dataflow components as well as Task.WhenAll():
class AsyncWorkScheduler
{
public AsyncWorkScheduler(int numberOfSlots)
{
m_slots = new Task[numberOfSlots];
m_availableSlots = new BufferBlock<int>();
m_errors = new List<Exception>();
m_tcs = new TaskCompletionSource<bool>();
m_completionPending = 0;
// Initial state: all slots are available
for(int i = 0; i < m_slots.Length; ++i)
{
m_slots[i] = Task.FromResult(false);
m_availableSlots.Post(i);
}
}
public async Task ScheduleAsync(Func<int, Task> action)
{
if (Volatile.Read(ref m_completionPending) != 0)
{
throw new InvalidOperationException("Unable to schedule new items.");
}
// Acquire a slot
int slotNumber = await m_availableSlots.ReceiveAsync().ConfigureAwait(false);
// Schedule a new task for a given slot
var task = action(slotNumber);
// Store a continuation on the task to handle completion events
m_slots[slotNumber] = task.ContinueWith(t => HandleCompletedTask(t, slotNumber), TaskContinuationOptions.ExecuteSynchronously);
}
public async void Complete()
{
if (Interlocked.CompareExchange(ref m_completionPending, 1, 0) != 0)
{
return;
}
// Signal the queue's completion
m_availableSlots.Complete();
await Task.WhenAll(m_slots).ConfigureAwait(false);
// Set completion
if (m_errors.Count != 0)
{
m_tcs.TrySetException(m_errors);
}
else
{
m_tcs.TrySetResult(true);
}
}
public Task Completion
{
get
{
return m_tcs.Task;
}
}
void SetFailed(Exception error)
{
lock(m_errors)
{
m_errors.Add(error);
}
}
void HandleCompletedTask(Task task, int slotNumber)
{
if (task.IsFaulted || task.IsCanceled)
{
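// A canceled task has a null Exception, so record a TaskCanceledException instead.
SetFailed(task.Exception ?? (Exception)new TaskCanceledException(task));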
return;
}
if (Volatile.Read(ref m_completionPending) == 1)
{
return;
}
// Release a slot
m_availableSlots.Post(slotNumber);
}
int m_completionPending;
List<Exception> m_errors;
BufferBlock<int> m_availableSlots;
TaskCompletionSource<bool> m_tcs;
Task[] m_slots;
}
So what I am trying to do here is:
Make the engine loop and work on an object if the queue is not empty.
If the queue is empty, I reset the ManualResetEvent to make the thread sleep.
When an item is added and the loop is not active, I set the ManualResetEvent.
To make it faster, I pick up at most 5 items from the list, perform the operation on them asynchronously, and wait for all of them to finish.
Problem:
The Clear methods on the two lists are called as soon as a new call to the AddToUpdateQueue method is made.
In my head, since I am awaiting Task.WhenAll(tasks), the thread should wait for its completion before moving ahead, so Clear on the lists should only be called after Task.WhenAll(tasks) returns.
What am I missing here, or what would be a better way to achieve this?
public async Task ThumbnailUpdaterEngine()
{
int count;
List<Task<bool>> tasks = new List<Task<bool>>();
List<Content> candidateContents = new List<Content>();
while (true)
{
for (int i = 0; i < 5; i++)
{
Content nextContent = GetNextFromInternalQueue();
if (nextContent == null)
break;
else
candidateContents.Add(nextContent);
}
foreach (var candidateContent in candidateContents)
{
foreach (var provider in interactionProviders)
{
if (provider.IsServiceSupported(candidateContent.ServiceType))
{
Task<bool> task = provider.UpdateThumbnail(candidateContent);
tasks.Add(task);
break;
}
}
}
var results = await Task.WhenAll(tasks);
tasks.Clear();
foreach (var candidateContent in candidateContents)
{
if (candidateContent.ThumbnailLink != null && !candidateContent.ThumbnailLink.Equals(candidateContent.FileIconLink, StringComparison.CurrentCultureIgnoreCase))
{
Task<bool> task = DownloadAndUpdateThumbnailCache(candidateContent);
tasks.Add(task);
}
}
await Task.WhenAll(tasks);
//Clean up for next time the loop comes in.
tasks.Clear();
candidateContents.Clear();
lock (syncObject)
{
count = internalQueue.Count;
if (count == 0)
{
isQueueControllerRunning = false;
monitorEvent.Reset();
}
}
await Task.Run(() => monitorEvent.WaitOne());
}
}
private Content GetNextFromInternalQueue()
{
lock (syncObject)
{
Content nextContent = null;
if (internalQueue.Count > 0)
{
nextContent = internalQueue[0];
internalQueue.Remove(nextContent);
}
return nextContent;
}
}
public void AddToUpdateQueue(Content content)
{
lock (syncObject)
{
internalQueue.Add(content);
if (!isQueueControllerRunning)
{
isQueueControllerRunning = true;
monitorEvent.Set();
}
}
}
You should simply use TPL Dataflow. It's an actor framework on top of the TPL with async support. Use an ActionBlock with an async action and a MaxDegreeOfParallelism of 5:
var block = new ActionBlock<Content>(
async content =>
{
var tasks = interactionProviders.
Where(provider => provider.IsServiceSupported(content.ServiceType)).
Select(provider => provider.UpdateThumbnail(content));
await Task.WhenAll(tasks);
if (content.ThumbnailLink != null && !content.ThumbnailLink.Equals(
content.FileIconLink,
StringComparison.CurrentCultureIgnoreCase))
{
await DownloadAndUpdateThumbnailCache(content);
}
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 5});
foreach (var content in GetContent())
{
block.Post(content);
}
block.Complete();
await block.Completion;