I have two operations - long running OperationA and much quicker OperationB. I was running them in parallel using TAP and returning results as they both finish :
var taskA = Task.Factory.StartNew(()=>OperationA());
var taskB = Task.Factory.StartNew(()=>OperationB());
var tasks = new Task[] { taskA, taskB };
Task.WaitAll(tasks);
// processing taskA.Result, taskB.Result
No magic here. Now I want to repeat OperationB every time it finishes, for as long as OperationA is still running. The whole procedure finishes when OperationA has finished and the last pass of OperationB has finished. I'm looking for an efficient pattern that does not involve polling OperationA's Status in a while loop, if that's possible. I'm looking toward improving the WaitAllOneByOne pattern proposed in this Pluralsight course, or something similar.
Try this
// Get cancellation support.
CancellationTokenSource source = new CancellationTokenSource();
CancellationToken token = source.Token;
// Start off A and set continuation to cancel B when finished.
bool taskAFinished = false;
var taskA = Task.Factory.StartNew(() => OperationA());
Task contA = taskA.ContinueWith(ant => source.Cancel());
// Set off B and in the method perform your loop. Cancellation will be thrown when
// A has completed.
var taskB = Task.Factory.StartNew(() => OperationB(token), token);
Task contB = taskB.ContinueWith(ant =>
{
switch (ant.Status)
{
// Handle any exceptions to prevent UnobservedTaskException.
case TaskStatus.RanToCompletion:
// Do stuff.
break;
case TaskStatus.Canceled:
// You know TaskA is finished.
break;
case TaskStatus.Faulted:
// Something bad.
break;
}
});
Then, in the OperationB method, you can perform your loop and include a cancellation check that fires upon TaskA's completion...
private void OperationB(CancellationToken token)
{
foreach (var v in items) // 'items' is a placeholder for whatever OperationB iterates over
{
...
token.ThrowIfCancellationRequested(); // This must be handled; it surfaces as an AggregateException.
}
}
Note: instead of complicating things with a cancellation, you can just set a bool from within the continuation of TaskA and check for it in TaskB's loop. This avoids any faffing about with cancellations.
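A minimal sketch of that flag-based variant follows. The bodies of OperationA and OperationB are hypothetical stand-ins (sleeps), and the FlagDemo class and its Run method are names made up for illustration; only the idea of setting a bool in TaskA's continuation and reading it in TaskB's loop comes from the note above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class FlagDemo
{
    // Set by the continuation of task A, read by the loop in task B.
    // volatile makes the write visible to the other thread without a lock.
    private static volatile bool _aFinished;

    // Hypothetical stand-in for the long-running OperationA.
    private static int OperationA()
    {
        Thread.Sleep(200);
        return 42;
    }

    // Hypothetical stand-in for OperationB: repeats a short pass until A is done.
    private static List<int> OperationB()
    {
        var passes = new List<int>();
        do
        {
            Thread.Sleep(20);          // one pass of the quick work
            passes.Add(passes.Count);  // record that the pass happened
        } while (!_aFinished);         // a pass that has started always finishes
        return passes;
    }

    public static (int A, List<int> B) Run()
    {
        _aFinished = false;
        var taskA = Task.Factory.StartNew(() => OperationA());
        var flagSet = taskA.ContinueWith(_ => { _aFinished = true; });
        var taskB = Task.Factory.StartNew(() => OperationB());
        Task.WaitAll(taskA, flagSet, taskB);
        return (taskA.Result, taskB.Result);
    }
}
```

Because the flag is only checked between passes, OperationB always runs at least once and never stops in the middle of a pass.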
I hope this helps
Took your approach as basis and adapted a bit :
var source = new CancellationTokenSource();
var token = source.Token;
var taskA = Task.Factory.StartNew(
() => OperationA()
);
var taskAFinished = taskA.ContinueWith(antecedent =>
{
source.Cancel();
return antecedent.Result;
});
var taskB = Task.Factory.StartNew(
() => OperationB(token), token
);
var taskBFinished = taskB.ContinueWith(antecedent =>
{
switch (antecedent.Status)
{
case TaskStatus.RanToCompletion:
case TaskStatus.Canceled:
try
{
return antecedent.Result;
}
catch (AggregateException ae)
{
// Operation was canceled before start if OperationA is short
return null;
}
case TaskStatus.Faulted:
return null;
}
return null;
});
I made two continuations that return the results of their respective operations, so that I can wait for both of them to finish (I tried to do that with only the second one and it didn't work).
var tasks = new Task[] { taskAFinished, taskBFinished };
Task.WaitAll(tasks);
The first one just passes the antecedent task's Result along; the second returns the aggregate results available at that point in OperationB (both the RanToCompletion and Canceled statuses are considered a correct end of the process). OperationB now looks like this :
public static List<Result> OperationB(CancellationToken token)
{
var resultsList = new List<Result>();
while (true)
{
foreach (var op in operations)
{
resultsList.Add(op.GetResult());
}
if (token.IsCancellationRequested)
{
return resultsList;
}
}
}
This changes the logic a bit: all loops inside OperationB are now treated as a single task, but that is easier than keeping them atomic and writing some sort of coordination primitive to gather the results of each run. As long as I don't care which loop produced which results, this seems to be a decent solution. I may improve it to a more flexible implementation later if needed (what I was actually looking for is a way to chain multiple operations recursively: OperationB itself may have smaller repeating OperationC's inside with the same behavior, OperationC may have multiple OperationD's that run while C is active, etc.).
edit
Added exception handling in taskBFinished for case when OperationA is quick and cancellation is issued before OperationB is even started.
Related
I have an IEnumerable<Task>, where each Task will call the same endpoint. However, the endpoint can only handle so many calls per second. How can I put, say, a half second delay between each call?
I have tried adding Task.Delay(), but of course awaiting them simply means that the app waits a half second before sending all the calls at once.
Here is a code snippet:
var resultTasks = orders
.Select(async task =>
{
var result = new VendorTaskResult();
try
{
result.Response = await result.CallVendorAsync();
}
catch(Exception ex)
{
result.Exception = ex;
}
return result;
} );
var results = Task.WhenAll(resultTasks);
I feel like I should do something like
Task.WhenAll(resultTasks.EmitOverTime(500));
... but how exactly do I do that?
What you describe in your question is in other words rate limiting. You'd like to apply rate limiting policy to your client, because the API you use enforces such a policy on the server to protect itself from abuse.
While you could implement rate limiting yourself, I'd recommend going with a well-established solution. Rate Limiter from David Desmaisons was the one that I picked at random, and I instantly liked it. It has solid documentation, good test coverage and is easy to use. It is also available as a NuGet package.
Check out the simple snippet below, which demonstrates running semi-overlapping tasks in sequence while deferring each task's start until half a second after the immediately preceding task started. Each task lasts at least 750 ms.
using ComposableAsync;
using RateLimiter;
using System;
using System.Threading.Tasks;
namespace RateLimiterTest
{
class Program
{
static void Main(string[] args)
{
Log("Starting tasks ...");
var constraint = TimeLimiter.GetFromMaxCountByInterval(1, TimeSpan.FromSeconds(0.5));
var tasks = new[]
{
DoWorkAsync("Task1", constraint),
DoWorkAsync("Task2", constraint),
DoWorkAsync("Task3", constraint),
DoWorkAsync("Task4", constraint)
};
Task.WaitAll(tasks);
Log("All tasks finished.");
Console.ReadLine();
}
static void Log(string message)
{
Console.WriteLine(DateTime.Now.ToString("HH:mm:ss.fff ") + message);
}
static async Task DoWorkAsync(string name, IDispatcher constraint)
{
await constraint;
Log(name + " started");
await Task.Delay(750);
Log(name + " finished");
}
}
}
Sample output:
10:03:27.121 Starting tasks ...
10:03:27.154 Task1 started
10:03:27.658 Task2 started
10:03:27.911 Task1 finished
10:03:28.160 Task3 started
10:03:28.410 Task2 finished
10:03:28.680 Task4 started
10:03:28.913 Task3 finished
10:03:29.443 Task4 finished
10:03:29.443 All tasks finished.
If you change the constraint to allow maximum two tasks per second (var constraint = TimeLimiter.GetFromMaxCountByInterval(2, TimeSpan.FromSeconds(1));), which is not the same as one per half a second, then the output could be like:
10:06:03.237 Starting tasks ...
10:06:03.264 Task1 started
10:06:03.268 Task2 started
10:06:04.026 Task2 finished
10:06:04.031 Task1 finished
10:06:04.275 Task3 started
10:06:04.276 Task4 started
10:06:05.032 Task4 finished
10:06:05.032 Task3 finished
10:06:05.033 All tasks finished.
Note that the current version of Rate Limiter targets .NETFramework 4.7.2+ or .NETStandard 2.0+.
This is just a thought, but another approach could be to create a queue and add another thread that runs polling the queue for calls that need to go out to your endpoint.
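A rough sketch of that queue idea, under a few assumptions: the ThrottledQueueDemo class and its Run method are hypothetical names, the endpoint call is replaced by recording a timestamp, and the worker uses BlockingCollection.GetConsumingEnumerable, which blocks until an item arrives instead of polling.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledQueueDemo
{
    // Queues `calls` work items and drains them on a single worker,
    // leaving at least `gap` between the start of consecutive calls.
    // Returns the elapsed time at which each call started.
    public static List<TimeSpan> Run(int calls, TimeSpan gap)
    {
        var queue = new BlockingCollection<int>();
        var startTimes = new List<TimeSpan>();
        var clock = Stopwatch.StartNew();

        var consumer = Task.Factory.StartNew(() =>
        {
            // Blocks while the queue is empty; ends after CompleteAdding().
            foreach (var order in queue.GetConsumingEnumerable())
            {
                startTimes.Add(clock.Elapsed); // stand-in for the real endpoint call
                Thread.Sleep(gap);             // hold the next call back
            }
        }, TaskCreationOptions.LongRunning);

        for (var i = 0; i < calls; i++)
            queue.Add(i);
        queue.CompleteAdding();

        consumer.Wait();
        return startTimes;
    }
}
```

Producers only ever call queue.Add, so the throttling policy lives in one place: the worker loop.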
Have you considered just turning that into a foreach-loop with a Task.Delay call? You seem to want to explicitly call them sequentially, it won't hurt if that is obvious from your code.
var results = new List<YourResultType>();
foreach(var order in orders){
var result = new VendorTaskResult();
try
{
result.Response = await result.CallVendorAsync();
results.Add(result.Response);
}
catch(Exception ex)
{
result.Exception = ex;
}
await Task.Delay(500); // half-second gap before the next call
}
Instead of selecting from orders you could loop over them, start each call inside the loop, put the resulting task into a list, wait a moment before starting the next one, and then call Task.WhenAll.
Would look something like:
var resultTasks = new List<Task<VendorTaskResult>>();
foreach (var item in orders)
{
// Start the call without awaiting it here, so that calls can overlap.
resultTasks.Add(Task.Run(async () =>
{
var result = new VendorTaskResult();
try
{
result.Response = await result.CallVendorAsync();
}
catch(Exception ex)
{
result.Exception = ex;
}
return result;
}));
await Task.Delay(x); // x is the gap in milliseconds between call starts
}
var results = await Task.WhenAll(resultTasks);
If you want to control the number of requests executed simultaneously, you have to use a semaphore.
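As a hedged illustration of the semaphore approach: the sketch below uses SemaphoreSlim to cap the number of calls in flight. SemaphoreDemo and RunAsync are hypothetical names, and Task.Delay stands in for the real request. Note that this limits concurrency, not calls per second, so it complements rather than replaces a delay between calls.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreDemo
{
    // Runs `count` simulated calls with at most `maxParallel` in flight.
    // Returns the highest number of calls observed running at the same time.
    public static async Task<int> RunAsync(int count, int maxParallel)
    {
        using (var gate = new SemaphoreSlim(maxParallel))
        {
            int inFlight = 0, peak = 0;
            var sync = new object();

            async Task CallAsync()
            {
                await gate.WaitAsync(); // waits here while maxParallel calls are running
                try
                {
                    lock (sync) { inFlight++; peak = Math.Max(peak, inFlight); }
                    await Task.Delay(50); // stand-in for the real request
                    lock (sync) { inFlight--; }
                }
                finally
                {
                    gate.Release();
                }
            }

            await Task.WhenAll(Enumerable.Range(0, count).Select(_ => CallAsync()));
            return peak;
        }
    }
}
```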
I have something very similar, and it works fine for me. Please note that I call ToArray() after the Linq query finishes; that is what triggers the tasks:
using (HttpClient client = new HttpClient()) {
IEnumerable<Task<string>> _downloads = _group
.Select(async job => {
await Task.Delay(300);
return await client.GetStringAsync(<url with variable job>);
});
Task<string>[] _downloadTasks = _downloads.ToArray();
_pages = await Task.WhenAll(_downloadTasks);
}
Now please note that this will create n tasks, all in parallel, so the Task.Delay does nothing to space the calls out. If you want to call the pages sequentially (as putting a delay between the calls suggests), then this code may be better:
using (HttpClient client = new HttpClient()) {
foreach (string job in _group) {
await Task.Delay(300);
_pages.Add(await client.GetStringAsync(<url with variable job>));
}
}
The download of each page is still asynchronous (other work can run while a page downloads), but the calls are made sequentially, so each download finishes before the next one starts.
The code can easily be changed to call the pages asynchronously in chunks, for example 20 pages at a time with a pause between chunks, like in this sample:
IEnumerable<string[]> toParse = myData
.Select((v, i) => new { v.code, group = i / 20 })
.GroupBy(x => x.group)
.Select(g => g.Select(x => x.code).ToArray());
using (HttpClient client = new HttpClient()) {
foreach (string[] _group in toParse) {
string[] _pages = null;
IEnumerable<Task<string>> _downloads = _group
.Select(job => {
return client.GetStringAsync(<url with job>);
});
Task<string>[] _downloadTasks = _downloads.ToArray();
_pages = await Task.WhenAll(_downloadTasks);
await Task.Delay(5000);
}
}
All this does is group your pages in chunks of 20, iterate through the chunks, download all pages of the chunk asynchronously, wait 5 seconds, move on to the next chunk.
I hope that is what you were waiting for :)
The proposed method EmitOverTime is doable, but only by blocking the current thread:
public static IEnumerable<Task<TResult>> EmitOverTime<TResult>(
this IEnumerable<Task<TResult>> tasks, int delay)
{
foreach (var item in tasks)
{
Thread.Sleep(delay); // Delay by blocking
yield return item;
}
}
Usage:
var results = await Task.WhenAll(resultTasks.EmitOverTime(500));
It is probably better to create a variant of Task.WhenAll that accepts a delay argument and delays asynchronously:
public static async Task<TResult[]> WhenAllWithDelay<TResult>(
IEnumerable<Task<TResult>> tasks, int delay)
{
var tasksList = new List<Task<TResult>>();
foreach (var task in tasks)
{
await Task.Delay(delay).ConfigureAwait(false);
tasksList.Add(task);
}
return await Task.WhenAll(tasksList).ConfigureAwait(false);
}
Usage:
var results = await WhenAllWithDelay(resultTasks, 500);
This design implies that the enumerable of tasks should be enumerated only once. It is easy to forget this during development, and start enumerating it again, spawning a new set of tasks. For this reason I propose to make it an OnlyOnce enumerable, as it is shown in this question.
Update: I should mention why the above methods work, and under what premise. The premise is that the supplied IEnumerable<Task<TResult>> is deferred, in other words non-materialized. At the method's start there are no tasks created yet. The tasks are created one after the other during the enumeration of the enumerable, and the trick is that the enumeration is slow and controlled. The delay inside the loop ensures that the tasks are not created all at once. They are created hot (in other words already started), so at the time the last task has been created some of the first tasks may have already been completed. The materialized list of half-running/half-completed tasks is then passed to Task.WhenAll, that waits for all to complete asynchronously.
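The premise can be checked with a small experiment. This is only a sketch: DeferredDemo and RunAsync are hypothetical names, and Task.FromResult stands in for a real hot task such as an HTTP call. It records when each task is created and shows that nothing is created until the enumerable is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class DeferredDemo
{
    public static async Task<(int CreatedBefore, long[] CreatedAtMs)> RunAsync(int delayMs)
    {
        var clock = Stopwatch.StartNew();
        var created = new List<long>();

        // Deferred query: the lambda (and so each task) runs only during enumeration.
        IEnumerable<Task<int>> tasks = Enumerable.Range(1, 3).Select(n =>
        {
            created.Add(clock.ElapsedMilliseconds);
            return Task.FromResult(n); // stand-in for a hot task
        });

        var createdBefore = created.Count; // nothing has been enumerated yet

        var started = new List<Task<int>>();
        foreach (var task in tasks) // each iteration creates exactly one task
        {
            started.Add(task);
            await Task.Delay(delayMs);
        }

        await Task.WhenAll(started);
        return (createdBefore, created.ToArray());
    }
}
```

If the query were materialized first (for example with ToList()), all three creation timestamps would be taken at once and the delay in the loop would no longer throttle anything.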
In the docs for TPL I found this line:
Invoke multiple continuations from the same antecedent
But this isn't explained any further. I naively assumed you could chain ContinueWiths in a pattern matching like manner until you hit the right TaskContinuationOptions.
TaskThatReturnsString()
.ContinueWith((s) => Console.Out.WriteLine(s.Result), TaskContinuationOptions.OnlyOnRanToCompletion)
.ContinueWith((f) => Console.Out.WriteLine(f.Exception.Message), TaskContinuationOptions.OnlyOnFaulted)
.ContinueWith((f) => Console.Out.WriteLine("Cancelled"), TaskContinuationOptions.OnlyOnCanceled)
.Wait();
But this doesn't work like I hoped for at least two reasons.
The continuations are properly chained, so the 2nd ContinueWith gets the result from the 1st, which is implemented as a new Task: basically the ContinueWith task itself. I realize that the String could be returned onwards, but won't that be a new task with the other info lost?
Since the first option is not met, the Task is just cancelled. Meaning that the second set will never be met and the exceptions are lost.
So what do they mean in the docs when they say multiple continuations from the same antecedent?
Is there a proper pattern for this, or do we just have to wrap the calls in try catch blocks?
EDIT
So I guess this was what I was hoping I could do, note this is a simplified example.
public void ProccessAllTheThings()
{
var theThings = util.GetAllTheThings();
var tasks = new List<Task>();
foreach (var thing in theThings)
{
var task = util.Process(thing)
.ContinueWith((t) => Console.Out.WriteLine($"Finished processing {thing.ThingId} with result {t.Result}"), TaskContinuationOptions.OnlyOnRanToCompletion)
.ContinueWith((t) => Console.Out.WriteLine($"Error on processing {thing.ThingId} with error {t.Exception.Message}"), TaskContinuationOptions.OnlyOnFaulted);
tasks.Add(task);
}
Task.WaitAll(tasks.ToArray());
}
Since this wasn't possible, I was thinking I would have to wrap each task call in a try catch inside the loop, so that I wouldn't stop the process but also wouldn't wait on it there. I wasn't sure what the correct way was.
Sometimes a solution is just staring you in the face, this would work wouldn't it?
public void ProccessAllTheThings()
{
var theThings = util.GetAllTheThings();
var tasks = new List<Task>();
foreach (var thing in theThings)
{
var task = util.Process(thing)
.ContinueWith((t) =>
{
if (t.Status == TaskStatus.RanToCompletion)
{
Console.Out.WriteLine($"Finished processing {thing.ThingId} with result {t.Result}");
}
else
{
Console.Out.WriteLine($"Error on processing {thing.ThingId} - {t.Exception.Message}");
}
});
tasks.Add(task);
}
Task.WaitAll(tasks.ToArray());
}
What you did is to create a sequential chain of multiple tasks.
What you need to do is attach all your continuation tasks to the first one:
var firstTask = TaskThatReturnsString();
var t1 = firstTask.ContinueWith (…);
var t2 = firstTask.ContinueWith (…);
var t3 = firstTask.ContinueWith (…);
Then you need to wait for all the continuation tasks:
Task.WaitAll (t1, t2, t3);
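Here is a fuller sketch of attaching several conditional continuations to the same antecedent. BranchingDemo and Run are hypothetical names. One detail worth knowing: a continuation whose condition is not met is canceled rather than run, so Task.WaitAll over all of them throws an AggregateException wrapping TaskCanceledException, which the sketch catches.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BranchingDemo
{
    // Attaches three conditional continuations to ONE antecedent
    // and returns a log showing which of them actually ran.
    public static List<string> Run(Func<string> work)
    {
        var log = new List<string>();
        var first = Task.Run(work);

        var onSuccess = first.ContinueWith(
            t => log.Add("ok: " + t.Result),
            TaskContinuationOptions.OnlyOnRanToCompletion);
        var onFault = first.ContinueWith(
            t => log.Add("fault: " + t.Exception.InnerException.Message),
            TaskContinuationOptions.OnlyOnFaulted);
        var onCancel = first.ContinueWith(
            t => log.Add("canceled"),
            TaskContinuationOptions.OnlyOnCanceled);

        try
        {
            Task.WaitAll(onSuccess, onFault, onCancel);
        }
        catch (AggregateException)
        {
            // Expected: the continuations whose condition was not met were canceled.
        }
        return log;
    }
}
```

Reading t.Exception in the fault branch also marks the antecedent's exception as observed.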
If I have code that abstracts a staged sequence of asynchronous operations by returning a Task representing each stage, how can I ensure that continuations execute in stage order (i.e. the order in which the Tasks are completed)?
Note that this is a different requirement from simply 'not wasting time waiting for slower tasks'. The order needs to be guaranteed, without race conditions in the scheduling. The looser requirement could be addressed by parts of the answers to the following questions:
Sort Tasks into order of completition
Is there default way to get first task that finished successfully?
I think the logical solution would be to attach the continuations using a custom TaskScheduler (such as one based on a SynchronizationContext). However, I can't find any assurance that the scheduling of continuations is performed synchronously upon task completion.
In code this could be something like
class StagedOperationSource
{
public TaskCompletionSource Connect = new TaskCompletionSource();
public TaskCompletionSource Accept = new TaskCompletionSource();
public TaskCompletionSource Complete = new TaskCompletionSource();
}
class StagedOperation
{
public Task Connect, Accept, Complete;
public StagedOperation(StagedOperationSource source)
{
Connect = source.Connect.Task;
Accept = source.Accept.Task;
Complete = source.Complete.Task;
}
}
...
private StagedOperation InitiateStagedOperation(int opId)
{
var source = new StagedOperationSource();
Task.Run(GetRunnerFromOpId(opId, source));
return new StagedOperation(source);
}
...
public void RunOperations()
{
for (int i=0; i<3; i++)
{
int id = i; // copy the loop variable so each continuation prints its own id
var op = InitiateStagedOperation(id);
op.Connect.ContinueWith(t => Console.WriteLine("{0}: Connected", id));
op.Accept.ContinueWith(t => Console.WriteLine("{0}: Accepted", id));
op.Complete.ContinueWith(t => Console.WriteLine("{0}: Completed", id));
}
}
which should produce output similar to
0: Connected
1: Connected
0: Accepted
2: Connected
0: Completed
1: Accepted
2: Accepted
2: Completed
1: Completed
Obviously the example is missing details like forwarding exceptions to (or cancelling) later stages if an earlier stage fails, but it's just an example.
Just await each stage before going onto the next...
public static async Task ProcessStagedOperation(StagedOperation operation, int i)
{
await operation.Connect;
Console.WriteLine("{0}: Connected", i);
await operation.Accept;
Console.WriteLine("{0}: Accepted", i);
await operation.Complete;
Console.WriteLine("{0}: Completed", i);
}
You can then call that method in your for loop.
If you use TAP (the Task-based Asynchronous Pattern), i.e. async and await, you can make the flow of processing a lot more apparent. In this case I would create a new method to encapsulate the order of operations:
public async Task ProcessStagedOperation(StagedOperation op, int i)
{
await op.Connect;
Console.WriteLine("{0}: Connected", i);
await op.Accept;
Console.WriteLine("{0}: Accepted", i);
await op.Complete;
Console.WriteLine("{0}: Completed", i);
}
Now your processing loop gets simplified a bit:
public async Task RunOperations()
{
List<Task> pendingOperations = new List<Task>();
for (int i=0; i<3; i++)
{
var op = InitiateStagedOperation(i);
pendingOperations.Add(ProcessStagedOperation(op, i));
}
await Task.WhenAll(pendingOperations); // finish
}
You now have a reference to a task object that you can explicitly wait on or simply await from another context (or you can simply ignore it). The way I modified the RunOperations() method allows you to create a large queue of pending tasks without blocking while you wait for them all to finish.
I am writing a set of async tasks that go away and download and parse data; however, I am drawing a bit of a blank on the next step, where I update a database.
The issue is that, for the sake of performance, I am using a TableLock to load rather large datasets, so I want my import service to wait for the first Task to return and then start the import. If another Task completes while the first import is running, it joins a queue and waits until the import service has finished with task 1.
eg.
Async
- Task1
- Task2
- Task3
Sync
- ImportService
RunAsync Tasks
Task3 returns first > ImportService.Import(Task3)
Task1 return, ImportService is still running. Wait()
ImportService.Complete() event
Task2 returns. Wait()
ImportService.Import(Task1)
ImportService.Complete() event
ImportService.Import(Task2)
ImportService.Complete() event
Hope this makes sense!
You can't really use await here, but you can wait on multiple tasks to complete:
var tasks = new List<Task>();
// start the tasks however
tasks.Add(Task.Run(Task1Function));
tasks.Add(Task.Run(Task2Function));
tasks.Add(Task.Run(Task3Function));
while (tasks.Count > 0)
{
var i = Task.WaitAny(tasks.ToArray()); // yes this is ugly but an array is required
var task = tasks[i];
tasks.RemoveAt(i);
ImportService.Import(task); // do you need to pass the task or the task.Result
}
Seems to me however that there should be a better option. You could let the tasks and the import run and add a lock on the ImportService part for instance:
// This is the task code doing whatever
....
// Task finishes and calls ImportService.Import
lock(typeof(ImportService)) // actually the lock should probably be inside the Import method
{
ImportService.Import(....);
}
There are several things bothering me with your requirements (including using a static ImportService, static classes are rarely a good idea), but without further details I can't provide better advice.
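One way to put the lock inside the import itself without blocking threads is a SemaphoreSlim with a count of one. The sketch below rests on assumptions: SerializedImporter, ImportAsync and ImportDemo are hypothetical names, the Task.Delay calls stand in for the download and the table-locked load, and SemaphoreSlim does not promise strict first-come-first-served ordering of waiters.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical import service: at most one import runs at any time.
public sealed class SerializedImporter
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private int _active;

    public int MaxConcurrent { get; private set; }
    public List<string> Imported { get; } = new List<string>();

    public async Task ImportAsync(string data)
    {
        await _gate.WaitAsync(); // later arrivals wait here until the gate is released
        try
        {
            _active++;
            MaxConcurrent = Math.Max(MaxConcurrent, _active);
            await Task.Delay(30); // stand-in for the table-locked bulk load
            Imported.Add(data);
            _active--;
        }
        finally
        {
            _gate.Release();
        }
    }
}

public static class ImportDemo
{
    public static async Task<SerializedImporter> RunAsync()
    {
        var importer = new SerializedImporter();

        // Each download hands its result to the importer as soon as it finishes.
        async Task DownloadThenImport(string name, int delayMs)
        {
            await Task.Delay(delayMs); // stand-in for download and parse
            await importer.ImportAsync(name);
        }

        await Task.WhenAll(
            DownloadThenImport("Task1", 40),
            DownloadThenImport("Task2", 60),
            DownloadThenImport("Task3", 10));
        return importer;
    }
}
```

The downloads stay fully parallel; only the import step is serialized, and no separate queue or Complete event is needed.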
While this is likely not the most graceful solution, I would try launching the work tasks and have them place their output in a ConcurrentQueue. You could check the queue for work on a timer until all tasks are completed.
var rand = new Random();
var importedData = new List<string>();
var results = new ConcurrentQueue<string>();
var tasks = new List<Task<string>>
{
new Task<string>(() =>
{
Thread.Sleep(rand.Next(1000, 5000));
Debug.WriteLine("Task 1 Completed");
return "ABC";
}),
new Task<string>(() =>
{
Thread.Sleep(rand.Next(1000, 5000));
Debug.WriteLine("Task 2 Completed");
return "FOO";
}),
new Task<string>(() =>
{
Thread.Sleep(rand.Next(1000, 5000));
Debug.WriteLine("Task 3 Completed");
return "BAR";
})
};
tasks.ForEach(t =>
{
t.ContinueWith(r => results.Enqueue(r.Result));
t.Start();
});
var allTasksCompleted = new AutoResetEvent(false);
new Timer(state =>
{
var timer = (Timer) state;
string item;
if (!results.TryDequeue(out item))
return;
importedData.Add(item);
Debug.WriteLine("Imported " + item);
if (importedData.Count == tasks.Count)
{
timer.Dispose();
Debug.WriteLine("Completed.");
allTasksCompleted.Set();
}
}).Change(1000, 100);
allTasksCompleted.WaitOne();
Let's consider the method:
Task Foo(IEnumerable items, CancellationToken token)
{
return Task.Run(() =>
{
foreach (var i in items)
token.ThrowIfCancellationRequested();
}, token);
}
Then I have a consumer:
var cts = new CancellationTokenSource();
var task = Foo(Items, cts.Token);
task.Wait();
And the example of Items:
IEnumerable Items
{
get
{
yield return 0;
Task.Delay(Timeout.InfiniteTimeSpan).Wait();
yield return 1;
}
}
What about task.Wait()? It hangs, because the enumeration of Items never returns.
I cannot pass my cancellation token into the collection of items.
How can I kill the unresponsive task, or work around this?
I found one solution that allows putting a cancellation token into Items originating from third parties:
public static IEnumerable<T> ToCancellable<T>(this IEnumerable<T> @this, CancellationToken token)
{
var enumerator = @this.GetEnumerator();
for (; ; )
{
var task = Task.Run(() => enumerator.MoveNext(), token);
task.Wait(token);
if (!task.Result)
yield break;
yield return enumerator.Current;
}
}
Now I need to use:
Items.ToCancellable(cts.Token)
And that will not hang after cancel request.
You can't really cancel a non-cancellable operation. Stephen Toub goes into details in "How do I cancel non-cancelable async operations?" on the Parallel FX Team's blog, but the essence is that you need to understand what you actually want to do:
Stop the asynchronous/long-running operation itself? Not doable in a cooperative way, if you don't have a way to signal the operation
Stop waiting for the operation to finish, ignoring any results? That's doable, but can lead to unreliability for obvious reasons. You can start a Task with the long operation passing a cancellation token, or use a TaskCompletionSource as Stephen Toub describes.
You need to decide which behavior you want in order to find the proper solution.
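For the second option (stop waiting, let the operation run on), a sketch in the spirit of the TaskCompletionSource approach could look like the following. The class name TaskCancellationExtensions is made up for illustration, and the underlying operation is not stopped; only the wait is abandoned.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskCancellationExtensions
{
    // Completes with the task's result, or throws OperationCanceledException
    // as soon as the token is canceled. The original task keeps running.
    public static async Task<T> WithCancellation<T>(this Task<T> task, CancellationToken token)
    {
        var tcs = new TaskCompletionSource<bool>();

        // The registration completes tcs.Task when the token is canceled.
        using (token.Register(s => ((TaskCompletionSource<bool>)s).TrySetResult(true), tcs))
        {
            if (task != await Task.WhenAny(task, tcs.Task))
                throw new OperationCanceledException(token);
        }

        return await task; // already completed; propagates its result or exception
    }
}
```

Usage would be along the lines of `await Foo(items, token).WithCancellation(token);`, keeping in mind that an abandoned operation may still hold a thread or other resources.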
Why can't you pass the CancellationToken to Items()?
IEnumerable Items(CancellationToken ct)
{
yield return 0;
Task.Delay(Timeout.InfiniteTimeSpan, ct).Wait();
yield return 1;
}
You would have to pass the same token to Items() as you pass to Foo(), of course.
Try using a TaskCompletionSource and returning that. You can then set the TaskCompletionSource to the result (or the error) of the inner task if it runs to completion (or faults). But you can set it to canceled immediately if the CancellationToken gets triggered.
Task<int> Foo(IEnumerable<int> items, CancellationToken token)
{
var tcs = new TaskCompletionSource<int>();
token.Register(() => tcs.TrySetCanceled());
var innerTask = Task.Factory.StartNew(() =>
{
foreach (var i in items)
token.ThrowIfCancellationRequested();
return 7;
}, token);
innerTask.ContinueWith(task => tcs.TrySetResult(task.Result), TaskContinuationOptions.OnlyOnRanToCompletion);
innerTask.ContinueWith(task => tcs.TrySetException(task.Exception), TaskContinuationOptions.OnlyOnFaulted);
return tcs.Task;
}
This won't actually kill the inner task, but it'll give you a task that you can continue from immediately on cancellation. To kill the inner task since it's hanging out in an infinite timeout, I believe the only thing you can do is to grab a reference to Thread.CurrentThread where you start the task, and then call taskThread.Abort() from within Foo, which of course is bad practice. But in this case your question really comes down to "how can I make a long running function terminate without having access to the code", which is only doable via Thread.Abort.
Can you have Items be IEnumerable<Task<int>> instead of IEnumerable<int>? Then you could do
return Task.Run(() =>
{
foreach (var task in tasks)
{
task.Wait(token);
token.ThrowIfCancellationRequested();
var i = task.Result;
}
}, token);
Although something like this may be more straightforward to do using Reactive Framework and doing items.ToObservable. That would look like this:
static Task<int> Foo(IEnumerable<int> items, CancellationToken token)
{
var sum = 0;
var tcs = new TaskCompletionSource<int>();
var obs = items.ToObservable(ThreadPoolScheduler.Instance);
token.Register(() => tcs.TrySetCanceled());
obs.Subscribe(i => sum += i, tcs.SetException, () => tcs.TrySetResult(sum), token);
return tcs.Task;
}
How about creating a wrapper around the enumerable that is itself cancellable between items?
IEnumerable<T> CancellableEnum<T>(IEnumerable<T> items, CancellationToken ct) {
foreach (var item in items) {
ct.ThrowIfCancellationRequested();
yield return item;
}
}
...though that seems to be kind of what Foo() already does. If you have some place where this enumerable blocks literally infinitely (and it's not just very slow), then what you would do is add a timeout and/or a cancellation token to the task.Wait() on the consumer side.
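On the consumer side, that could be sketched as below, using the Task.Wait(int, CancellationToken) overload. WaitDemo and WaitFor are hypothetical names. As with any abandoned wait, the task itself keeps running if the enumerable never returns.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitDemo
{
    // Waits for the task, but gives up after timeoutMs or when the token is canceled.
    // Returns "completed", "timed out" or "canceled" depending on how the wait ended.
    public static string WaitFor(Task task, int timeoutMs, CancellationToken token)
    {
        try
        {
            // Wait returns false if the timeout elapsed before the task completed.
            return task.Wait(timeoutMs, token) ? "completed" : "timed out";
        }
        catch (OperationCanceledException)
        {
            // The token was canceled while waiting.
            return "canceled";
        }
    }
}
```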
My previous solution was based on the optimistic assumption that the enumerable is unlikely to hang and is quite fast, so that we could occasionally sacrifice one thread of the system's thread pool. As Dax Fohl pointed out, the task will still be active even if its parent task has been killed by a cancel exception. That could choke the underlying ThreadPool, which is used by the default task scheduler, if several collections are frozen indefinitely.
Consequently I have refactored ToCancellable method:
public static IEnumerable<T> ToCancellable<T>(this IEnumerable<T> @this, CancellationToken token)
{
var enumerator = @this.GetEnumerator();
var state = new State();
for (; ; )
{
token.ThrowIfCancellationRequested();
var thread = new Thread(s => { ((State)s).Result = enumerator.MoveNext(); }) { IsBackground = true, Priority = ThreadPriority.Lowest };
thread.Start(state);
try
{
while (!thread.Join(10))
token.ThrowIfCancellationRequested();
}
catch (OperationCanceledException)
{
thread.Abort();
throw;
}
if (!state.Result)
yield break;
yield return enumerator.Current;
}
}
And a helping class to manage the result:
class State
{
public bool Result { get; set; }
}
It is safe to abort a detached thread.
The pain point I see here is thread creation, which is heavy. That could be solved by using a custom thread pool along with a producer-consumer pattern that is able to handle abort exceptions, in order to remove a broken thread from the pool.
Another problem is the Join line. What is the best pause here? Maybe that should be left to the user and passed in as a method argument.