parallel.foreach - loopState.Stop() versus Cancellation

parallel.foreach - loopState.Stop() versus Cancellation - c#

What is the difference between the Cancellation operations versus the loopState operation (Break/Stop)?
private static CancellationTokenSource cts;
public static loopingMethod()
{
cts = new CancellationTokenSource();
try
{
ParallelOptions pOptions = new ParallelOptions();
pOptions.MaxDegreeOfParallelism = 4;
pOptions.CancellationToken = cts.Token;
Parallel.ForEach(dictObj, pOptions, (KVP, loopState) =>
{
pOptions.CancellationToken.ThrowIfCancellationRequested();
parallelDoWork(KVP.Key, KVP.Value, loopState);
}); //End of Parallel.ForEach loop
}
catch (OperationCanceledException e)
{
//Catestrophic Failure
return -99;
}
}
public static void parallelDoWork(string Id, string Value, ParallelLoopState loopState)
{
try{
throw new exception("kill loop");
}
catch(exception ex)
{
if(ex.message == "kill loop")
{
cts.Cancel();
//Or do I use loopState here?
}
}
}
Why would I want to use the ParallelOptions Cancellation operation versus the loopState.Break(); or loopState.Stop(); or vice versa?

See this article
"Setting a cancellation token allows you to abort Invoke (remember that when a delegate throws an exception, the exception is swallowed and only re-thrown by Invoke after all other delegates have been executed)."
Scenario 1. Imagine you have a user about to send messages to all ex-[girl|boy]friends. They click send and then they come to their senses and want to cancel it. By using the cancellation token they are able to stop further messages from going out. So if you have a long running process that is allowed to be cancelled, use the cancellation token.
Scenario 2 On the other hand, if you don't want a process to be interrupted, then use normal loop state exceptions so that the exceptions will be swallowed until all threads finish.
Scenario 3 If you have a process that is I/O intensive, then you probably want to be using async/await and not parallel.foreach. Check out Microsoft's task-based asynchronous pattern.

ParallelLoopState.Break/Stop have well defined semantics specific to the execution of the loop. I.e. by using these you can be very specific about how you want the loop to terminate. A CancellationToken on the other hand is the generic stop mechanism in TPL, so it does nothing special for parallel loops. The advantage of using the token is that it can be shared among other TPL features, so you could have a task and a loop that are controlled by the same token.

Related

How to force an ActionBlock to complete fast

According to the documentation:
A dataflow block is considered completed when it is not currently processing a message and when it has guaranteed that it will not process any more messages.
This behavior is not ideal in my case. I want to be able to cancel the job at any time, but the processing of each individual action takes a long time. So when I cancel the token, the effect is not immediate. I must wait for the currently processed item to complete. I have no way to cancel the actions directly, because the API I use is not cancelable. Can I do anything to make the block ignore the currently running action, and complete instantly?
Here is an example that demonstrates my problem. The token is canceled after 500 msec, and the duration of each action is 1000 msec:
static async Task Main()
{
var cts = new CancellationTokenSource(500);
var block = new ActionBlock<int>(async x =>
{
await Task.Delay(1000);
}, new ExecutionDataflowBlockOptions() { CancellationToken = cts.Token });
block.Post(1); // I must wait for this one to complete
block.Post(2); // This one is ignored
block.Complete();
var stopwatch = Stopwatch.StartNew();
try
{
await block.Completion;
}
catch (OperationCanceledException)
{
Console.WriteLine($"Canceled after {stopwatch.ElapsedMilliseconds} msec");
}
}
Output:
Canceled after 1035 msec
The desired output would be a cancellation after ~500 msec.

Based on this excerpt from your comment...:
What I want to happen in case of a cancellation request is to ignore the currently running workitem. I don't care about it any more, so why I have to wait for it?
...and assuming you are truly OK with leaving the Task running, you can simply wrap the job you wish to call inside another Task which will constantly poll for cancellation or completion, and cancel that Task instead. Take a look at the following "proof-of-concept" code that wraps a "long-running" task inside another Task "tasked" with constantly polling the wrapped task for completion, and a CancellationToken for cancellation (completely "spur-of-the-moment" status, you will want to re-adapt it a bit of course):
public class LongRunningTaskSource
{
public Task LongRunning(int milliseconds)
{
return Task.Run(() =>
{
Console.WriteLine("Starting long running task");
Thread.Sleep(3000);
Console.WriteLine("Finished long running task");
});
}
public Task LongRunningTaskWrapper(int milliseconds, CancellationToken token)
{
Task task = LongRunning(milliseconds);
Task wrapperTask = Task.Run(() =>
{
while (true)
{
//Check for completion (you could, of course, do different things
//depending on whether it is faulted or completed).
if (!(task.Status == TaskStatus.Running))
break;
//Check for cancellation.
if (token.IsCancellationRequested)
{
Console.WriteLine("Aborting Task.");
token.ThrowIfCancellationRequested();
}
}
}, token);
return wrapperTask;
}
}
Using the following code:
static void Main()
{
LongRunningTaskSource longRunning = new LongRunningTaskSource();
CancellationTokenSource cts = new CancellationTokenSource(1500);
Task task = longRunning.LongRunningTaskWrapper(3000, cts.Token);
//Sleep long enough to let things roll on their own.
Thread.Sleep(5000);
Console.WriteLine("Ended Main");
}
...produces the following output:
Starting long running task
Aborting Task.
Exception thrown: 'System.OperationCanceledException' in mscorlib.dll
Finished long running task
Ended Main
The wrapped Task obviously completes in its own good time. If you don't have a problem with that, which is often not the case, hopefully, this should fit your needs.
As a supplementary example, running the following code (letting the wrapped Task finish before time-out):
static void Main()
{
LongRunningTaskSource longRunning = new LongRunningTaskSource();
CancellationTokenSource cts = new CancellationTokenSource(3000);
Task task = longRunning.LongRunningTaskWrapper(1500, cts.Token);
//Sleep long enough to let things roll on their own.
Thread.Sleep(5000);
Console.WriteLine("Ended Main");
}
...produces the following output:
Starting long running task
Finished long running task
Ended Main
So the task started and finished before timeout and nothing had to be cancelled. Of course nothing is blocked while waiting. As you probably already know, of course, if you know what is being used behind the scenes in the long-running code, it would be good to clean up if necessary.
Hopefully, you can adapt this example to pass something like this to your ActionBlock.
Disclaimer & Notes
I am not familiar with the TPL Dataflow library, so this is just a workaround, of course. Also, if all you have is, for example, a synchronous method call that you do not have any influence on at all, then you will obviously need two tasks. One wrapper task to wrap the synchronous call and another one to wrap the wrapper task to include continuous status polling and cancellation checks.

Handle exception with CancellationToken.Register properly

I am trying to cancel a time consuming task after a certain milliseconds and for my case, I think CancellationToken.Register method best fits compared to other approaches with constant polling or WaitHandle. CancellationToken.Register method will help me define a delegate where I plan to take the task to cancelled state and stop further execution of the task; this delegate will be invoked when the task is cancelled (certain milliseconds later as per my goal). Here is the test code I have which I intend to extend for multiple tasks and with nested tasks later:
List<Task> tasks = new List<Task>();
CancellationTokenSource tokenSource = new CancellationTokenSource();
CancellationToken cancellationToken = tokenSource.Token;
Task t1 = Task.Factory.StartNew(() =>
{
// check cancellation token before task begins
if (cancellationToken.IsCancellationRequested) cancellationToken.ThrowIfCancellationRequested();
// register a callback to handle cancellation token anytime it occurs
cancellationToken.Register(() =>
{
Console.WriteLine("Task t1 cancelled");
cancellationToken.ThrowIfCancellationRequested();
});
// simulating a massive task; note, it is not a repeating task
Thread.Sleep(12000);
}, cancellationToken);
tasks.Add(t1);
try
{
// cancel token after 3000 ms + wait for all tasks
tokenSource.CancelAfter(3000);
Task.WaitAll(tasks.ToArray());
// OR wait for all tasks for 3000 ms and then cancel token immediately
//Task.WaitAll(tasks.ToArray(), 3000);
//tokenSource.Cancel();
}
catch (AggregateException e)
{
Console.WriteLine("\nAggregateException thrown with the following inner exceptions:");
// Display information about each exception.
foreach (var v in e.InnerExceptions)
{
if (v is TaskCanceledException)
Console.WriteLine(" TaskCanceledException: Task {0}", ((TaskCanceledException)v).Task.Id);
else
Console.WriteLine(" Exception: {0}", v.GetType().Name);
}
Console.WriteLine();
}
finally
{
tokenSource.Dispose();
}
I am, however, facing an issue with exception handling during execution of the callback method inside cancellationToken.Register. The call to cancellationToken.ThrowIfCancellationRequested() gives me exception "OperationCanceledException was unhandled by user code" followed by "AggregateException was unhandled". I have read about VS settings with unchecking User-unhandled exception for the first OperationCanceledException exception but my application terminates after second AggregateException exception; the try..catch block for Task.WaitAll does not seem to handle this.
I tried to enclose cancellationToken.ThrowIfCancellationRequested() in a try..catch block but the problem with this approach was that the task continued with the remaining steps which I do not desire. I do not see this behaviour with polling approach.
// poll continuously to check for cancellation instead of Register
// but I do not want my massive task inside this repeating block
while (true)
{
if (cancellationToken.IsCancellationRequested)
{
Console.WriteLine("Task t1 Canceled.");
cancellationToken.ThrowIfCancellationRequested();
}
}
What am I doing wrong with the CancellationToken.Register approach?

The error you see is exactly because you didn't wrap the ThrowIfCancellationRequested inside a try-catch block.
In my opinion it really depends on what you're doing in place of that Sleep().
Ending the task in a cooperative way like
while(!cancellationToken.IsCancellationRequested)
{
// Do stuff
// Also check before doing something that may block or take a while
if(!cancellationToken.IsCancellationRequested)
{
Stream.Read(buffer, 0, n);
}
}
should be the best way to stop it.
If you really need to stop it no matter what, I would do wrap it in another task
Task.Run(() =>
{
// Simulate a long running task
Thread.Sleep(12*1000);
}, cancellationToken);
(tested and working)
This way you should not see any Exception Unhandled thrown.
Also, you might want to take a look at this: How do I abort/cancel TPL Tasks?

How should I use DataflowBlockOptions.CancellationToken?

How could I use DataflowBlockOptions.CancellationToken?
If I create instance of BufferBlock like this:
var queue = new BufferBlock<int>(new DataflowBlockOptions { BoundedCapacity = 5, CancellationToken = _cts.Token });
then having consumer/producer methods that use queue, how can I use its CancellationToken to handle cancellation?
E.g. in producer method, how can I check the cancellation token - I haven't found any property to access the token..
EDIT:
Sample of produce/consume methods:
private static async Task Produce(BufferBlock<int> queue, IEnumerable<int> values)
{
foreach (var value in values)
{
await queue.SendAsync(value);
}
queue.Complete();
}
private static async Task<IEnumerable<int>> Consume(BufferBlock<int> queue)
{
var ret = new List<int>();
while (await queue.OutputAvailableAsync())
{
ret.Add(await queue.ReceiveAsync());
}
return ret;
}
Code to call it:
var queue = new BufferBlock<int>(new DataflowBlockOptions { BoundedCapacity = 5, CancellationToken = _cts.Token });
// Start the producer and consumer.
var values = Enumerable.Range(0, 10);
Produce(queue, values);
var consumer = Consume(queue);
// Wait for everything to complete.
await Task.WhenAll(consumer, queue.Completion);
EDIT2:
If I call _cts.Cancel(), the Produce method does not cancel and finishes without interruption.

If you want to cancel produce process you should pass token in it, like this:
private static async Task Produce(
BufferBlock<int> queue,
IEnumerable<int> values,
CancellationToken token
)
{
foreach (var value in values)
{
await queue.SendAsync(value, token);
Console.WriteLine(value);
}
queue.Complete();
}
private static async Task<IEnumerable<int>> Consume(BufferBlock<int> queue)
{
var ret = new List<int>();
while (await queue.OutputAvailableAsync())
{
ret.Add(await queue.ReceiveAsync());
}
return ret;
}
static void Main(string[] args)
{
var cts = new CancellationTokenSource();
var queue = new BufferBlock<int>(new DataflowBlockOptions { BoundedCapacity = 5, CancellationToken = cts.Token });
// Start the producer and consumer.
var values = Enumerable.Range(0, 100);
Produce(queue, values, cts.Token);
var consumer = Consume(queue);
cts.Cancel();
try
{
Task.WaitAll(consumer, queue.Completion);
}
catch (Exception e)
{
Console.WriteLine(e.ToString());
}
foreach (var i in consumer.Result)
{
Console.WriteLine(i);
}
Console.ReadKey();

Normally you use the CancellationToken option in order to control the cancellation of a dataflow block, using an external CancellationTokenSource. Canceling the block (assuming that its a TransformBlock) has the following immediate effects:
The block stops accepting incoming messages. Invoking its Post returns false, meaning that the offered message is rejected.
The messages that are currently stored in the block's internal input buffer are immediately discarded. These messages are lost. They will not be processed or propagated.
If the block is not currently processing any messages, the following effects will also follow immediately. Otherwise they will follow when the processing of all currently processed messages is completed:
All the processed messages that are currently stored in this block's output buffer are discarded. The last processed messages (the messages that were in the middle of processing when the cancellation occurred) will not be propagated to linked blocks downstream.
Any pending asynchronous SendAsync operations targeting the block, that were in-flight when the cancellation occurred, will complete with a result of false (meaning "non accepted").
The Task that represents the Completion of the block transitions to the Canceled state. In other words this task's IsCanceled property becomes true.
You can achieve all but the last effect directly, without using the CancellationToken option, by invoking the block's Fault method. This method is accessible through the IDataflowBlock interface that all blocks implement. You can use it like this:
((IDataflowBlock)block).Fault(new OperationCanceledException());
The difference is that the Completion task will now become Faulted instead of Canceled. This difference may or may not be important, depending on the situation. If you just await the Completion, which is how this property is normally used, in both cases a OperationCanceledException will be thrown. So if you don't need to do anything fancy with the Completion property, and you also want to avoid configuring the CancellationToken for some reason, you could consider this trick as an option.
Update: Behavior when the cancellation occurs after the Complete method has been invoked, in other words when the block is already in its completion phase, but has not completed yet:
If the block is a processing block, like a TransformBlock, all of the above will happen just the same. The block will transition soon to the Canceled state.
If the block is a non-processing block, like a BufferBlock<T>, the (3) from the list above will not happen. The output buffer of a BufferBlock<T> is not emptied, when the cancellation happen after the invocation of the Complete method. See this GitHib issue for a demonstration of this behavior. Please take into consideration that the Complete method may be invoked not only manually, but also automatically, if the block has been linked as the target of a source block, with the PropagateCompletion configuration enabled. You may want to check out this question, to understand fully the implications of this behavior. Long story short, canceling all the blocks of a dataflow pipeline that contains a BufferBlock<T>, does not guarantee that the pipeline will terminate.
Side note: When both the Complete and Fault methods are invoked, whatever was invoked first prevails regarding the final status of the block. If the Complete was invoked first, the block will complete with status RanToCompletion. If the Fault was invoked first, the block will complete with status Faulted. Faulting a Completed block has still an effect though: it empties its internal input buffer.

Immediately cancelling blocking operation with timeout

I have a blocking operation that reads from a queue, but it can take a timeout. I can easily convert this to an "async" operation:
public async Task<IMessage> ReceiveAsync(CancellationToken cancellationToken)
{
return await Task.Run(() =>
{
while (true)
{
cancellationToken.ThrowIfCancellationRequested();
// Try receiving for one second
IMessage message = consumer.Receive(TimeSpan.FromSeconds(1));
if (message != null)
{
return message;
}
}
}, cancellationToken).ConfigureAwait(false);
}
Aborting a thread is generally considered bad practice since you can leak resources, so the timeout seems like the only way to cleanly stop a thread. So I have three questions:
What is a generally accepted timeout value for "immediate" cancellation?
For libraries that provide built-in async methods, does immediate cancellation truly exist or do they also use timeouts and loops to simulate it? Maybe the question here is how would you make use of software interrupts and if these also have to do some sort of polling to check if there are interrupts, even if it's at the kernel/CPU level.
Is there some alternate way I should be approaching this?
Edit: So I may have found part of my answer with Thread.Interrupt() and then handling ThreadInterruptedException. Is this basically a kernel-level software interrupt and as close to "immediate" as we can get? Would the following be a better way of handling this?
public async Task<IMessage> ReceiveAsync(CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
var completionSource = new TaskCompletionSource<IMessage>();
var receiverThread = new Thread(() =>
{
try
{
completionSource.SetResult(consumer.Receive());
}
catch (ThreadInterruptedException)
{
completionSource.SetCanceled();
}
catch (Exception ex)
{
completionSource.SetException(ex);
}
});
cancellationToken.Register(receiverThread.Interrupt);
receiverThread.Name = "Queue Receive";
receiverThread.Start();
return await completionSource.Task.ConfigureAwait(false);
}

It depends on your specific needs. A second could be immediate for some and slow for others.
Libraries (good ones) which provide async API do so from the bottom up. They usually don't wrap blocking (synchronous) operations with a thread to make them seem asynchronous. They use TaskCompletionSource to create truly async methods.
I'm not sure what you mean by queue (the built-in Queue in .Net doesn't have a Receive method) but you should probably be using a truly async data structure like TPL Dataflow's BufferBlock.
About your specific code sample.
You are holding up a thread throughout the entire operation (that's async over sync) which is costly. You could instead try to consume quickly and then wait asynchronously for the timeout to end, or for the CancellationToken to be cancelled.
There's also no point in using another thread with Task.Run. You can simply have the async lambda be the content of ReceiveAsync:
public async Task<IMessage> ReceiveAsync(CancellationToken cancellationToken)
{
while (true)
{
cancellationToken.ThrowIfCancellationRequested();
// Try receiving for one second
IMessage message;
if (!consumer.TryReceive(out message))
{
await Task.Delay(TimeSpan.FromSeconds(1), cancellationToken);
}
if (message != null)
{
return message;
}
}
}
If your queue implements IDisposable a different (harsher) option would be to call Dispose on it when the CancellationToken is cancelled. Here's how.

Aborting a long running task in TPL

Our application uses the TPL to serialize (potentially) long running units of work. The creation of work (tasks) is user-driven and may be cancelled at any time. In order to have a responsive user interface, if the current piece of work is no longer required we would like to abandon what we were doing, and immediately start a different task.
Tasks are queued up something like this:
private Task workQueue;
private void DoWorkAsync
(Action<WorkCompletedEventArgs> callback, CancellationToken token)
{
if (workQueue == null)
{
workQueue = Task.Factory.StartWork
(() => DoWork(callback, token), token);
}
else
{
workQueue.ContinueWork(t => DoWork(callback, token), token);
}
}
The DoWork method contains a long running call, so it is not as simple as constantly checking the status of token.IsCancellationRequested and bailing if/when a cancel is detected. The long running work will block the Task continuations until it finishes, even if the task is cancelled.
I have come up with two sample methods to work around this issue, but am not convinced that either are proper. I created simple console applications to demonstrate how they work.
The important point to note is that the continuation fires before the original task completes.
Attempt #1: An inner task
static void Main(string[] args)
{
CancellationTokenSource cts = new CancellationTokenSource();
var token = cts.Token;
token.Register(() => Console.WriteLine("Token cancelled"));
// Initial work
var t = Task.Factory.StartNew(() =>
{
Console.WriteLine("Doing work");
// Wrap the long running work in a task, and then wait for it to complete
// or the token to be cancelled.
var innerT = Task.Factory.StartNew(() => Thread.Sleep(3000), token);
innerT.Wait(token);
token.ThrowIfCancellationRequested();
Console.WriteLine("Completed.");
}
, token);
// Second chunk of work which, in the real world, would be identical to the
// first chunk of work.
t.ContinueWith((lastTask) =>
{
Console.WriteLine("Continuation started");
});
// Give the user 3s to cancel the first batch of work
Console.ReadKey();
if (t.Status == TaskStatus.Running)
{
Console.WriteLine("Cancel requested");
cts.Cancel();
Console.ReadKey();
}
}
This works, but the "innerT" Task feels extremely kludgey to me. It also has the drawback of forcing me to refactor all parts of my code that queue up work in this manner, by necessitating the wrapping up of all long running calls in a new Task.
Attempt #2: TaskCompletionSource tinkering
static void Main(string[] args)
{ var tcs = new TaskCompletionSource<object>();
//Wire up the token's cancellation to trigger the TaskCompletionSource's cancellation
CancellationTokenSource cts = new CancellationTokenSource();
var token = cts.Token;
token.Register(() =>
{ Console.WriteLine("Token cancelled");
tcs.SetCanceled();
});
var innerT = Task.Factory.StartNew(() =>
{
Console.WriteLine("Doing work");
Thread.Sleep(3000);
Console.WriteLine("Completed.");
// When the work has complete, set the TaskCompletionSource so that the
// continuation will fire.
tcs.SetResult(null);
});
// Second chunk of work which, in the real world, would be identical to the
// first chunk of work.
// Note that we continue when the TaskCompletionSource's task finishes,
// not the above innerT task.
tcs.Task.ContinueWith((lastTask) =>
{
Console.WriteLine("Continuation started");
});
// Give the user 3s to cancel the first batch of work
Console.ReadKey();
if (innerT.Status == TaskStatus.Running)
{
Console.WriteLine("Cancel requested");
cts.Cancel();
Console.ReadKey();
}
}
Again this works, but now I have two problems:
a) It feels like I'm abusing TaskCompletionSource by never using it's result, and just setting null when I've finished my work.
b) In order to properly wire up continuations I need to keep a handle on the previous unit of work's unique TaskCompletionSource, and not the task that was created for it. This is technically possible, but again feels clunky and strange.
Where to go from here?
To reiterate, my question is: are either of these methods the "correct" way to tackle this problem, or is there a more correct/elegant solution that will allow me to prematurely abort a long running task and immediately starting a continuation? My preference is for a low-impact solution, but I'd be willing to undertake some huge refactoring if it's the right thing to do.
Alternately, is the TPL even the correct tool for the job, or am I missing a better task queuing mechanism. My target framework is .NET 4.0.

The real issue here is that the long-running call in DoWork is not cancellation-aware. If I understand correctly, what you're doing here is not really cancelling the long-running work, but merely allowing the continuation to execute and, when the work completes on the cancelled task, ignoring the result. For example, if you used the inner task pattern to call CrunchNumbers(), which takes several minutes, cancelling the outer task will allow continuation to occur, but CrunchNumbers() will continue to execute in the background until completion.
I don't think there's any real way around this other than making your long-running calls support cancellation. Often this isn't possible (they may be blocking API calls, with no API support for cancellation.) When this is the case, it's really a flaw in the API; you may check to see if there are alternate API calls that could be used to perform the operation in a way that can be cancelled. One hack approach to this is to capture a reference to the underlying Thread being used by the Task when the Task is started and then call Thread.Interrupt. This will wake up the thread from various sleep states and allow it to terminate, but in a potentially nasty way. Worst case, you can even call Thread.Abort, but that's even more problematic and not recommended.
Here is a stab at a delegate-based wrapper. It's untested, but I think it will do the trick; feel free to edit the answer if you make it work and have fixes/improvements.
public sealed class AbandonableTask
{
private readonly CancellationToken _token;
private readonly Action _beginWork;
private readonly Action _blockingWork;
private readonly Action<Task> _afterComplete;
private AbandonableTask(CancellationToken token,
Action beginWork,
Action blockingWork,
Action<Task> afterComplete)
{
if (blockingWork == null) throw new ArgumentNullException("blockingWork");
_token = token;
_beginWork = beginWork;
_blockingWork = blockingWork;
_afterComplete = afterComplete;
}
private void RunTask()
{
if (_beginWork != null)
_beginWork();
var innerTask = new Task(_blockingWork,
_token,
TaskCreationOptions.LongRunning);
innerTask.Start();
innerTask.Wait(_token);
if (innerTask.IsCompleted && _afterComplete != null)
{
_afterComplete(innerTask);
}
}
public static Task Start(CancellationToken token,
Action blockingWork,
Action beginWork = null,
Action<Task> afterComplete = null)
{
if (blockingWork == null) throw new ArgumentNullException("blockingWork");
var worker = new AbandonableTask(token, beginWork, blockingWork, afterComplete);
var outerTask = new Task(worker.RunTask, token);
outerTask.Start();
return outerTask;
}
}

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.