Cancel async task if running - C#

I have the following method, called on several occasions (e.g. on the KeyUp of a textbox), which asynchronously filters items in a listbox.
private async void filterCats(string category, bool deselect)
{
    List<Category> tempList = new List<Category>();
    //Wait for categories
    var tokenSource = new CancellationTokenSource();
    var token = tokenSource.Token;
    //HERE, CANCEL TASK IF ALREADY RUNNING
    tempList = await _filterCats(category, token);
    //Show results
    CAT_lb_Cats.DataSource = tempList;
    CAT_lb_Cats.DisplayMember = "strCategory";
    CAT_lb_Cats.ValueMember = "idCategory";
}
and the following task
private async Task<List<Category>> _filterCats(string category, CancellationToken token)
{
    List<Category> result = await Task.Run(() =>
    {
        return getCatsByStr(category);
    }, token);
    return result;
}
I would like to test whether the task is already running and, if so, cancel it and start it again with the new value. I know how to cancel a task, but how can I check whether it is already running?

This is the code that I use to do this:
if (_tokenSource != null)
{
    _tokenSource.Cancel();
}
_tokenSource = new CancellationTokenSource();
try
{
    await loadPrestatieAsync(_bedrijfid, _projectid, _medewerkerid, _prestatieid, _startDate, _endDate, _tokenSource.Token);
}
catch (OperationCanceledException ex)
{
}
and the procedure that is called looks like this (simplified, of course):
private async Task loadPrestatieAsync(int bedrijfId, int projectid, int medewerkerid, int prestatieid,
    DateTime? startDate, DateTime? endDate, CancellationToken token)
{
    await Task.Delay(100, token).ConfigureAwait(true);
    try
    {
        //do stuff
        token.ThrowIfCancellationRequested();
    }
    catch (OperationCanceledException ex)
    {
        throw;
    }
    catch (Exception Ex)
    {
        throw;
    }
}
I am adding a delay of 100 ms because the same action is triggered quickly and repeatedly; a small 100 ms postponement actually makes the GUI feel more responsive.
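Applied to the filterCats method from the original question, the same pattern could look roughly like this. It is a minimal sketch only, assuming a _filterTokenSource field on the form, and has not been tested against the original code:
private CancellationTokenSource _filterTokenSource;
private async void filterCats(string category, bool deselect)
{
    // Cancel the previous filter run, if any, and start a new one.
    _filterTokenSource?.Cancel();
    _filterTokenSource = new CancellationTokenSource();
    var token = _filterTokenSource.Token;
    try
    {
        await Task.Delay(100, token); // small debounce while the user is still typing
        var tempList = await _filterCats(category, token);
        CAT_lb_Cats.DataSource = tempList;
        CAT_lb_Cats.DisplayMember = "strCategory";
        CAT_lb_Cats.ValueMember = "idCategory";
    }
    catch (OperationCanceledException)
    {
        // A newer keystroke superseded this run; nothing to do.
    }
}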

It appears you are looking for a way to get an "autocomplete list" from text entered in a text box, where an ongoing async search is canceled when the text has changed since the search was started.
As was mentioned in the comments, Rx (Reactive Extensions) provides very nice patterns for this, allowing you to easily connect your UI elements to cancellable asynchronous tasks, build in retry logic, and so on.
The program below, at fewer than 90 lines, shows a "full UI" sample (unfortunately excluding any cats ;-). It includes some reporting on the search status.
I have created this using a number of static methods in the RxAutoComplete class, to show how this is achieved in small documented steps, and how they can be combined to achieve a more complex task.
namespace TryOuts
{
using System;
using System.Linq;
using System.Threading.Tasks;
using System.Windows.Forms;
using System.Reactive.Linq;
using System.Threading;
// Simulated async search service, that can fail.
public class FakeWordSearchService
{
private static Random _rnd = new Random();
private static string[] _allWords = new[] {
"gideon", "gabby", "joan", "jessica", "bob", "bill", "sam", "johann"
};
public async Task<string[]> Search(string searchTerm, CancellationToken cancelToken)
{
await Task.Delay(_rnd.Next(600), cancelToken); // simulate async call.
if ((_rnd.Next() % 5) == 0) // every 5 times, we will cause a search failure
throw new Exception(string.Format("Search for '{0}' failed on purpose", searchTerm));
return _allWords.Where(w => w.StartsWith(searchTerm)).ToArray();
}
}
public static class RxAutoComplete
{
// Returns an observable that pushes the 'txt' TextBox text when it has changed.
static IObservable<string> TextChanged(TextBox txt)
{
return from evt in Observable.FromEventPattern<EventHandler, EventArgs>(
h => txt.TextChanged += h,
h => txt.TextChanged -= h)
select ((TextBox)evt.Sender).Text.Trim();
}
// Throttles the source.
static IObservable<string> ThrottleInput(IObservable<string> source, int minTextLength, TimeSpan throttle)
{
return source
.Where(t => t.Length >= minTextLength) // Wait until we have at least 'minTextLength' characters
.Throttle(throttle) // We don't start when the user is still typing
.DistinctUntilChanged(); // We only fire, if after throttling the text is different from before.
}
// Provides search results and performs asynchronous,
// cancellable search with automatic retries on errors
static IObservable<string[]> PerformSearch(IObservable<string> source, FakeWordSearchService searchService)
{
return from term in source // term from throttled input
from result in Observable.FromAsync(async token => await searchService.Search(term, token))
.Retry(3) // Perform up to 3 tries on failure
.TakeUntil(source) // Cancel pending request if new search request was made.
select result;
}
// Putting it all together.
public static void RunUI()
{
// Our simple search GUI.
var inputTextBox = new TextBox() { Width = 300 };
var searchResultLB = new ListBox { Top = inputTextBox.Height + 10, Width = inputTextBox.Width };
var searchStatus = new Label { Top = searchResultLB.Height + 30, Width = inputTextBox.Width };
var mainForm = new Form { Controls = { inputTextBox, searchResultLB, searchStatus }, Width = inputTextBox.Width + 20 };
// Our UI update handlers.
var syncContext = SynchronizationContext.Current;
Action<Action> onUITread = (x) => syncContext.Post(_ => x(), null);
Action<string> onSearchStarted = t => onUITread(() => searchStatus.Text = (string.Format("searching for '{0}'.", t)));
Action<string[]> onSearchResult = w => {
searchResultLB.Items.Clear();
searchResultLB.Items.AddRange(w);
searchStatus.Text += string.Format(" {0} matches found.", w.Length > 0 ? w.Length.ToString() : "No");
};
// Connecting input to search
var input = ThrottleInput(TextChanged(inputTextBox), 1, TimeSpan.FromSeconds(0.5)).Do(onSearchStarted);
var result = PerformSearch(input, new FakeWordSearchService());
// Running it
using (result.ObserveOn(syncContext).Subscribe(onSearchResult, ex => Console.WriteLine(ex)))
Application.Run(mainForm);
}
}
}


Handle exceptions with TPL Dataflow blocks

I have a simple TPL Dataflow pipeline which basically does some tasks.
I noticed that when there is an exception in any of the dataflow blocks, it wasn't getting caught in the initial parent block caller.
I have added some manual code to check for exceptions, but it doesn't seem like the right approach.
if (readBlock.Completion.Exception != null
|| saveBlockJoinedProcess.Completion.Exception != null
|| processBlock1.Completion.Exception != null
|| processBlock2.Completion.Exception != null)
{
throw readBlock.Completion.Exception;
}
I had a look online to see what the suggested approach is, but didn't see anything obvious.
So I created some sample code below and was hoping to get some guidance on a better solution:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace TPLDataflow
{
class Program
{
static void Main(string[] args)
{
try
{
//ProcessB();
ProcessA();
}
catch (Exception e)
{
Console.WriteLine("Exception in Process!");
throw new Exception($"exception:{e}");
}
Console.WriteLine("Processing complete!");
Console.ReadLine();
}
private static void ProcessB()
{
Task.WhenAll(Task.Run(() => DoSomething(1, "ProcessB"))).Wait();
}
private static void ProcessA()
{
var random = new Random();
var readBlock = new TransformBlock<int, int>(x =>
{
try { return DoSomething(x, "readBlock"); }
catch (Exception e) { throw e; }
}); //1
var braodcastBlock = new BroadcastBlock<int>(i => i); // ⬅ Here
var processBlock1 = new TransformBlock<int, int>(x =>
DoSomethingAsync(5, "processBlock1")); //2
var processBlock2 = new TransformBlock<int, int>(x =>
DoSomethingAsync(2, "processBlock2")); //3
//var saveBlock =
// new ActionBlock<int>(
// x => Save(x)); //4
var saveBlockJoinedProcess =
new ActionBlock<Tuple<int, int>>(
x => SaveJoined(x.Item1, x.Item2)); //4
var saveBlockJoin = new JoinBlock<int, int>();
readBlock.LinkTo(braodcastBlock, new DataflowLinkOptions
{ PropagateCompletion = true });
braodcastBlock.LinkTo(processBlock1,
new DataflowLinkOptions { PropagateCompletion = true }); //5
braodcastBlock.LinkTo(processBlock2,
new DataflowLinkOptions { PropagateCompletion = true }); //6
processBlock1.LinkTo(
saveBlockJoin.Target1); //7
processBlock2.LinkTo(
saveBlockJoin.Target2); //8
saveBlockJoin.LinkTo(saveBlockJoinedProcess,
new DataflowLinkOptions { PropagateCompletion = true });
readBlock.Post(1); //10
//readBlock.Post(2); //10
Task.WhenAll(processBlock1.Completion,processBlock2.Completion)
.ContinueWith(_ => saveBlockJoin.Complete());
readBlock.Complete(); //12
saveBlockJoinedProcess.Completion.Wait(); //13
if (readBlock.Completion.Exception != null
|| saveBlockJoinedProcess.Completion.Exception != null
|| processBlock1.Completion.Exception != null
|| processBlock2.Completion.Exception != null)
{
throw readBlock.Completion.Exception;
}
}
private static int DoSomething(int i, string method)
{
Console.WriteLine($"Do Something, callng method : { method}");
throw new Exception("Fake Exception!");
return i;
}
private static async Task<int> DoSomethingAsync(int i, string method)
{
Console.WriteLine($"Do SomethingAsync");
throw new Exception("Fake Exception!");
await Task.Delay(new TimeSpan(0, 0, i));
Console.WriteLine($"Do Something : {i}, callng method : { method}");
return i;
}
private static void Save(int x)
{
Console.WriteLine("Save!");
}
private static void SaveJoined(int x, int y)
{
Thread.Sleep(new TimeSpan(0, 0, 10));
Console.WriteLine("Save Joined!");
}
}
}
I had a look online to see what the suggested approach is, but didn't see anything obvious.
If you have a pipeline (more or less), then the common approach is to use PropagateCompletion to shut down the pipe. If you have more complex topologies, then you would need to complete blocks by hand.
In your case, you have an attempted propagation here:
Task.WhenAll(
    processBlock1.Completion,
    processBlock2.Completion)
    .ContinueWith(_ => saveBlockJoin.Complete());
But this code will not propagate exceptions: when both processBlock1.Completion and processBlock2.Completion complete, whether they faulted or not, saveBlockJoin is completed successfully.
A better solution would be to use await instead of ContinueWith:
async Task PropagateToSaveBlockJoin()
{
    try
    {
        await Task.WhenAll(processBlock1.Completion, processBlock2.Completion);
        saveBlockJoin.Complete();
    }
    catch (Exception ex)
    {
        ((IDataflowBlock)saveBlockJoin).Fault(ex);
    }
}
_ = PropagateToSaveBlockJoin();
Using await encourages you to handle exceptions, which you can do by passing them to Fault to propagate the exception.
Propagating errors backward in the pipeline is not supported in TPL Dataflow out of the box, which is especially annoying when the blocks have a bounded capacity: in that case an error in a downstream block may cause the blocks in front of it to block indefinitely. The only solution I know of is to use the cancellation feature and cancel all the blocks in case any one of them fails. Here is how it can be done. First, create a CancellationTokenSource:
var cts = new CancellationTokenSource();
Then create the blocks one by one, embedding the same CancellationToken in the options of all of them:
var options = new ExecutionDataflowBlockOptions()
{ BoundedCapacity = 10, CancellationToken = cts.Token };
var block1 = new TransformBlock<double, double>(Math.Sqrt, options);
var block2 = new ActionBlock<double>(Console.WriteLine, options);
Then link the blocks together, including the PropagateCompletion setting:
block1.LinkTo(block2, new DataflowLinkOptions { PropagateCompletion = true });
Finally use an extension method to trigger the cancellation of the CancellationTokenSource in case of an exception:
block1.OnFaultedCancel(cts);
block2.OnFaultedCancel(cts);
The OnFaultedCancel extension method is shown below:
public static class DataflowExtensions
{
    public static void OnFaultedCancel(this IDataflowBlock dataflowBlock,
        CancellationTokenSource cts)
    {
        dataflowBlock.Completion.ContinueWith(_ => cts.Cancel(), default,
            TaskContinuationOptions.OnlyOnFaulted |
            TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
    }
}
At first look, I have only some minor points (not looking at your architecture). It seems to me that you have mixed some newer and some older constructs, and there are some code parts which are unnecessary.
for example:
private static void ProcessB()
{
    Task.WhenAll(Task.Run(() => DoSomething(1, "ProcessB"))).Wait();
}
Using the Wait() method, if any exceptions happen, they will be wrapped in a System.AggregateException. In my opinion, this is better:
private static async Task ProcessBAsync()
{
    await Task.Run(() => DoSomething(1, "ProcessB"));
}
Using async/await, if an exception occurs, the await statement unwraps the System.AggregateException and rethrows the first inner exception. This allows you to catch concrete exception types and handle only the cases you really can handle.
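For illustration only, a minimal sketch of catching one concrete exception type after an await (InvalidOperationException is just a placeholder for whatever exception you can actually recover from):
try
{
    await Task.Run(() => DoSomething(1, "ProcessB"));
}
catch (InvalidOperationException ex)
{
    // Handle only this specific, recoverable case; anything else keeps propagating.
    Console.WriteLine($"Recovered from: {ex.Message}");
}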
Another thing is this part of your code:
private static void ProcessA()
{
    var random = new Random();
    var readBlock = new TransformBlock<int, int>(
        x =>
        {
            try { return DoSomething(x, "readBlock"); }
            catch (Exception e)
            {
                throw e;
            }
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 }); //1
Why catch an exception only to rethrow it? In this case, the try-catch is redundant. (Note also that throw e; resets the stack trace, whereas a bare throw; preserves it.)
And this here:
private static void SaveJoined(int x, int y)
{
    Thread.Sleep(new TimeSpan(0, 0, 10));
    Console.WriteLine("Save Joined!");
}
It is much better to use await Task.Delay(...). With Task.Delay(...) the thread is not blocked, so your application will not freeze.
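A minimal sketch of that async variant (the SaveJoinedAsync name is just illustrative; callers would await it instead of calling SaveJoined):
private static async Task SaveJoinedAsync(int x, int y)
{
    // Non-blocking wait: the thread is free to do other work during these 10 seconds.
    await Task.Delay(TimeSpan.FromSeconds(10));
    Console.WriteLine("Save Joined!");
}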

Async Producer / Consumer with throttled duration and batched consumption

I am trying to build a service that provides a queue for many asynchronous clients to make requests and await a response. I need to be able to throttle the queue processing by X requests per Y duration. For example: 50 web requests per second. It is for a 3rd party REST Service where I can only issue X requests per second.
I found many SO questions that led me down the path of using TPL Dataflow. I've used a TransformBlock to provide my custom throttling, and then X number of ActionBlocks to complete the tasks in parallel. The implementation of the Action seems a bit clunky, so I'm wondering if there is a better way to pass Tasks into the pipeline that notify the callers once completed.
Is there a better or simpler way to do what I want? Are there any glaring issues with my implementation? I know it is missing cancellation and exception handling and I'll be doing that next, but your comments are most welcome.
I've extended Stephen Cleary's example for my Dataflow pipeline and used svick's concept of a time-throttled TransformBlock. I am wondering if what I've built could be easily achieved with a pure SemaphoreSlim design; it's the time-based throttling with a maximum number of operations that I think will complicate things.
Here is the latest implementation: an async FIFO queue where I can pass in custom actions.
public class ThrottledProducerConsumer<T>
{
private class TimerState<T1>
{
public SemaphoreSlim Sem;
public T1 Value;
}
private BufferBlock<T> _queue;
private IPropagatorBlock<T, T> _throttleBlock;
private List<Task> _consumers;
private static IPropagatorBlock<T1, T1> CreateThrottleBlock<T1>(TimeSpan Interval, Int32 MaxPerInterval)
{
SemaphoreSlim _sem = new SemaphoreSlim(MaxPerInterval);
return new TransformBlock<T1, T1>(async (x) =>
{
var sw = new Stopwatch();
sw.Start();
//Console.WriteLine($"Current count: {_sem.CurrentCount}");
await _sem.WaitAsync();
sw.Stop();
var now = DateTime.UtcNow;
var releaseTime = now.Add(Interval) - now;
//-- Using timer as opposed to Task.Delay as I do not want to await or wait for it to complete
var tm = new Timer((s) => {
var state = (TimerState<T1>)s;
//Console.WriteLine($"RELEASE: {state.Value} was released {DateTime.UtcNow:mm:ss:ff} Reset Sem");
state.Sem.Release();
}, new TimerState<T1> { Sem = _sem, Value = x }, (int)Interval.TotalMilliseconds,
-1);
/*
Task.Delay(delay).ContinueWith((t)=>
{
Console.WriteLine($"RELEASE(FAKE): {x} was released {DateTime.UtcNow:mm:ss:ff} Reset Sem");
//_sem.Release();
});
*/
//Console.WriteLine($"{x} was tramsformed in {sw.ElapsedMilliseconds}ms. Will release {now.Add(Interval):mm:ss:ff}");
return x;
},
//new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
//
new ExecutionDataflowBlockOptions { BoundedCapacity = 5, MaxDegreeOfParallelism = 10 });
}
public ThrottledProducerConsumer(TimeSpan Interval, int MaxPerInterval, Int32 QueueBoundedMax = 5, Action<T> ConsumerAction = null, Int32 MaxConsumers = 1)
{
var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1, };
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true, };
//-- Create the Queue
_queue = new BufferBlock<T>(new DataflowBlockOptions { BoundedCapacity = QueueBoundedMax, });
//-- Create and link the throttle block
_throttleBlock = CreateThrottleBlock<T>(Interval, MaxPerInterval);
_queue.LinkTo(_throttleBlock, linkOptions);
//-- Create and link the consumer(s) to the throttle block
var consumerAction = (ConsumerAction != null) ? ConsumerAction : new Action<T>(ConsumeItem);
_consumers = new List<Task>();
for (int i = 0; i < MaxConsumers; i++)
{
var consumer = new ActionBlock<T>(consumerAction, consumerOptions);
_throttleBlock.LinkTo(consumer, linkOptions);
_consumers.Add(consumer.Completion);
}
//-- TODO: Add some cancellation tokens to shut this thing down
}
/// <summary>
/// Default Consumer Action, just prints to console
/// </summary>
/// <param name="ItemToConsume"></param>
private void ConsumeItem(T ItemToConsume)
{
Console.WriteLine($"Consumed {ItemToConsume} at {DateTime.UtcNow}");
}
public async Task EnqueueAsync(T ItemToEnqueue)
{
await this._queue.SendAsync(ItemToEnqueue);
}
public async Task EnqueueItemsAsync(IEnumerable<T> ItemsToEnqueue)
{
foreach (var item in ItemsToEnqueue)
{
await this._queue.SendAsync(item);
}
}
public async Task CompleteAsync()
{
this._queue.Complete();
await Task.WhenAll(_consumers);
Console.WriteLine($"All consumers completed {DateTime.UtcNow}");
}
}
The test method
public class WorkItem<T>
{
public TaskCompletionSource<T> tcs;
//public T respone;
public string url;
public WorkItem(string Url)
{
tcs = new TaskCompletionSource<T>();
url = Url;
}
public override string ToString()
{
return $"{url}";
}
}
public static void TestQueue()
{
Console.WriteLine("Created the queue");
var defaultAction = new Action<WorkItem<String>>(async i => {
var taskItem = ((WorkItem<String>)i);
Console.WriteLine($"Consuming: {taskItem.url} {DateTime.UtcNow:mm:ss:ff}");
//-- Assume calling another async method e.g. await httpClient.DownloadStringTaskAsync(url);
await Task.Delay(5000);
taskItem.tcs.SetResult($"{taskItem.url}");
//Console.WriteLine($"Consumed: {taskItem.url} {DateTime.UtcNow}");
});
var queue = new ThrottledProducerConsumer<WorkItem<String>>(TimeSpan.FromMilliseconds(2000), 5, 2, defaultAction);
var results = new List<Task>();
foreach (var no in Enumerable.Range(0, 20))
{
var workItem = new WorkItem<String>($"http://someurl{no}.com");
results.Add(queue.EnqueueAsync(workItem));
results.Add(workItem.tcs.Task);
results.Add(workItem.tcs.Task.ContinueWith(response =>
{
Console.WriteLine($"Received: {response.Result} {DateTime.UtcNow:mm:ss:ff}");
}));
}
Task.WhenAll(results).Wait();
Console.WriteLine("All Work Items Have Been Processed");
}
Since asking, I have created a ThrottledProducerConsumer class based on TPL Dataflow. It was tested over a number of days, including concurrent producers that were queued and completed in order, approximately 281k items without any problems; however, there may be bugs I've not discovered.
I am using a BufferBlock as an asynchronous queue, and this is linked to:
A TransformBlock which provides the throttling and blocking I need. It is used in conjunction with a SemaphoreSlim to control the max requests. As each item is passed through the block, it increments the semaphore and schedules a task to run X duration later to release the semaphore by one. This way I have a sliding window of X requests per duration; exactly what I wanted. Thanks to TPL I am also leveraging parallelism in the connected:
ActionBlock(s), which are responsible for performing the task I need.
The classes are generic, so they might be useful to others who need something similar. I have not written cancellation or error handling, but thought I should just mark this as answered to move it along. I would be quite happy to see some alternatives and feedback, rather than mark mine as the accepted answer. Thanks for reading.
NOTE: I removed the Timer from the original implementation, as it was doing weird stuff that caused the semaphore to release more than the maximum; I am assuming it was some kind of context error. It occurred when I started running concurrent requests. I worked around it by using Task.Delay to schedule the release of a semaphore lock.
Throttled Producer Consumer
public class ThrottledProducerConsumer<T>
{
private BufferBlock<T> _queue;
private IPropagatorBlock<T, T> _throttleBlock;
private List<Task> _consumers;
private static IPropagatorBlock<T1, T1> CreateThrottleBlock<T1>(TimeSpan Interval,
Int32 MaxPerInterval, Int32 BlockBoundedMax = 2, Int32 BlockMaxDegreeOfParallelism = 2)
{
SemaphoreSlim _sem = new SemaphoreSlim(MaxPerInterval, MaxPerInterval);
return new TransformBlock<T1, T1>(async (x) =>
{
//Log($"Transform blk: {x} {DateTime.UtcNow:mm:ss:ff} Semaphore Count: {_sem.CurrentCount}");
var sw = new Stopwatch();
sw.Start();
//Console.WriteLine($"Current count: {_sem.CurrentCount}");
await _sem.WaitAsync();
sw.Stop();
var delayTask = Task.Delay(Interval).ContinueWith((t) =>
{
//Log($"Pre-RELEASE: {x} {DateTime.UtcNow:mm:ss:ff} Semaphore Count {_sem.CurrentCount}");
_sem.Release();
//Log($"PostRELEASE: {x} {DateTime.UtcNow:mm:ss:ff} Semaphoere Count {_sem.CurrentCount}");
});
//},TaskScheduler.FromCurrentSynchronizationContext());
//Log($"Transformed: {x} in queue {sw.ElapsedMilliseconds}ms. {DateTime.Now:mm:ss:ff} will release {DateTime.Now.Add(Interval):mm:ss:ff} Semaphoere Count {_sem.CurrentCount}");
return x;
},
//-- Might be better to keep Bounded Capacity in sync with the semaphore
new ExecutionDataflowBlockOptions { BoundedCapacity = BlockBoundedMax,
MaxDegreeOfParallelism = BlockMaxDegreeOfParallelism });
}
public ThrottledProducerConsumer(TimeSpan Interval, int MaxPerInterval,
Int32 QueueBoundedMax = 5, Action<T> ConsumerAction = null, Int32 MaxConsumers = 1,
Int32 MaxThrottleBuffer = 20, Int32 MaxDegreeOfParallelism = 10)
{
//-- Probably best to link MaxPerInterval and MaxThrottleBuffer
// and MaxConsumers with MaxDegreeOfParallelism
var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1, };
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true, };
//-- Create the Queue
_queue = new BufferBlock<T>(new DataflowBlockOptions { BoundedCapacity = QueueBoundedMax, });
//-- Create and link the throttle block
_throttleBlock = CreateThrottleBlock<T>(Interval, MaxPerInterval);
_queue.LinkTo(_throttleBlock, linkOptions);
//-- Create and link the consumer(s) to the throttle block
var consumerAction = (ConsumerAction != null) ? ConsumerAction : new Action<T>(ConsumeItem);
_consumers = new List<Task>();
for (int i = 0; i < MaxConsumers; i++)
{
var consumer = new ActionBlock<T>(consumerAction, consumerOptions);
_throttleBlock.LinkTo(consumer, linkOptions);
_consumers.Add(consumer.Completion);
}
//-- TODO: Add some cancellation tokens to shut this thing down
}
/// <summary>
/// Default Consumer Action, just prints to console
/// </summary>
/// <param name="ItemToConsume"></param>
private void ConsumeItem(T ItemToConsume)
{
Log($"Consumed {ItemToConsume} at {DateTime.UtcNow}");
}
public async Task EnqueueAsync(T ItemToEnqueue)
{
await this._queue.SendAsync(ItemToEnqueue);
}
public async Task EnqueueItemsAsync(IEnumerable<T> ItemsToEnqueue)
{
foreach (var item in ItemsToEnqueue)
{
await this._queue.SendAsync(item);
}
}
public async Task CompleteAsync()
{
this._queue.Complete();
await Task.WhenAll(_consumers);
Console.WriteLine($"All consumers completed {DateTime.UtcNow}");
}
private static void Log(String messageToLog)
{
System.Diagnostics.Trace.WriteLine(messageToLog);
Console.WriteLine(messageToLog);
}
}
- Example Usage -
A Generic WorkItem
public class WorkItem<Toutput,Tinput>
{
private TaskCompletionSource<Toutput> _tcs;
public Task<Toutput> Task { get { return _tcs.Task; } }
public Tinput InputData { get; private set; }
public Toutput OutputData { get; private set; }
public WorkItem(Tinput inputData)
{
_tcs = new TaskCompletionSource<Toutput>();
InputData = inputData;
}
public void Complete(Toutput result)
{
_tcs.SetResult(result);
}
public void Failed(Exception ex)
{
_tcs.SetException(ex);
}
public override string ToString()
{
return InputData.ToString();
}
}
Creating the action block executed in the pipeline
private Action<WorkItem<Location,PointToLocation>> CreateProcessingAction()
{
return new Action<WorkItem<Location,PointToLocation>>(async i => {
var sw = new Stopwatch();
sw.Start();
var taskItem = ((WorkItem<Location,PointToLocation>)i);
var inputData = taskItem.InputData;
//Log($"Consuming: {inputData.Latitude},{inputData.Longitude} {DateTime.UtcNow:mm:ss:ff}");
//-- Assume calling another async method e.g. await httpClient.DownloadStringTaskAsync(url);
await Task.Delay(500);
sw.Stop();
Location outData = new Location()
{
Latitude = inputData.Latitude,
Longitude = inputData.Longitude,
StreetAddress = $"Consumed: {inputData.Latitude},{inputData.Longitude} Duration(ms): {sw.ElapsedMilliseconds}"
};
taskItem.Complete(outData);
//Console.WriteLine($"Consumed: {taskItem.url} {DateTime.UtcNow}");
});
}
Test Method
You'll need to provide your own implementation for PointToLocation and Location. Just an example of how you'd use it with your own classes.
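For reference, a minimal sketch of what those two classes might look like, inferred only from the properties used in the code below; your real implementations will differ:
public class PointToLocation
{
    public double Latitude { get; set; }
    public double Longitude { get; set; }
}
public class Location
{
    public double Latitude { get; set; }
    public double Longitude { get; set; }
    public string StreetAddress { get; set; }
}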
int startRange = 0;
int nextRange = 1000;
ThrottledProducerConsumer<WorkItem<Location,PointToLocation>> tpc;
private void cmdTestPipeline_Click(object sender, EventArgs e)
{
Log($"Pipeline test started {DateTime.Now:HH:mm:ss:ff}");
if(tpc == null)
{
tpc = new ThrottledProducerConsumer<WorkItem<Location, PointToLocation>>(
//1010, 2, 20000,
TimeSpan.FromMilliseconds(1010), 45, 100000,
CreateProcessingAction(),
2,45,10);
}
var workItems = new List<WorkItem<Models.Location, PointToLocation>>();
foreach (var i in Enumerable.Range(startRange, nextRange))
{
var ptToLoc = new PointToLocation() { Latitude = i + 101, Longitude = i + 100 };
var wrkItem = new WorkItem<Location, PointToLocation>(ptToLoc);
workItems.Add(wrkItem);
wrkItem.Task.ContinueWith(t =>
{
var loc = t.Result;
string line = $"[Simulated:{DateTime.Now:HH:mm:ss:ff}] - {loc.StreetAddress}";
//txtResponse.Text = String.Concat(txtResponse.Text, line, System.Environment.NewLine);
//var lines = txtResponse.Text.Split(new string[] { System.Environment.NewLine},
// StringSplitOptions.RemoveEmptyEntries).LongCount();
//lblLines.Text = lines.ToString();
//Log(line);
});
//}, TaskScheduler.FromCurrentSynchronizationContext());
}
startRange += nextRange;
tpc.EnqueueItemsAsync(workItems);
Log($"Pipeline test completed {DateTime.Now:HH:mm:ss:ff}");
}

C# Parallel Foreach + Async

I'm processing a list of items (200k-300k); each item's processing time is between 2 and 8 seconds. To save time, I can process this list in parallel. As I'm in an async context, I use something like this:
public async Task<List<Keyword>> DoWord(List<string> keyword)
{
ConcurrentBag<Keyword> keywordResults = new ConcurrentBag<Keyword>();
if (keyword.Count > 0)
{
try
{
var tasks = keyword.Select(async kw =>
{
return await Work(kw).ConfigureAwait(false);
});
keywordResults = new ConcurrentBag<Keyword>(await Task.WhenAll(tasks).ConfigureAwait(false));
}
catch (AggregateException ae)
{
foreach (Exception innerEx in ae.InnerExceptions)
{
log.ErrorFormat("Core threads exception: {0}", innerEx);
}
}
}
return keywordResults.ToList();
}
The keyword list always contains 8 elements (coming from above), so I process my list 8 by 8. But in this case, I guess that if 7 keywords are processed in 3 seconds and the 8th is processed in 10 seconds, the total time for the 8 keywords will be 10 seconds (correct me if I'm wrong).
How can I approach this with Parallel.ForEach then? I mean: launch 8 keywords and, as soon as 1 of them is done, launch 1 more, so that I always have 8 workers running. Any idea?
Another, easier way to do this is to use the AsyncEnumerator NuGet package:
using System.Collections.Async;
public async Task<List<Keyword>> DoWord(List<string> keywords)
{
var keywordResults = new ConcurrentBag<Keyword>();
await keywords.ParallelForEachAsync(async keyword =>
{
try
{
var result = await Work(keyword);
keywordResults.Add(result);
}
catch (AggregateException ae)
{
foreach (Exception innerEx in ae.InnerExceptions)
{
log.ErrorFormat("Core threads exception: {0}", innerEx);
}
}
}, maxDegreeOfParallelism: 8);
return keywordResults.ToList();
}
Here's some sample code showing how you could approach this using TPL Dataflow.
Note that in order to compile this, you will need to add TPL Dataflow to your project via NuGet.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace Demo
{
class Keyword // Dummy test class.
{
public string Name;
}
class Program
{
static void Main()
{
// Dummy test data.
var keywords = Enumerable.Range(1, 100).Select(n => n.ToString()).ToList();
var result = DoWork(keywords).Result;
Console.WriteLine("---------------------------------");
foreach (var item in result)
Console.WriteLine(item.Name);
}
public static async Task<List<Keyword>> DoWork(List<string> keywords)
{
var input = new TransformBlock<string, Keyword>
(
async s => await Work(s),
// This is where you specify the max number of threads to use.
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 8 }
);
var result = new List<Keyword>();
var output = new ActionBlock<Keyword>
(
item => result.Add(item), // Output only 1 item at a time, because 'result.Add()' is not threadsafe.
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 1 }
);
input.LinkTo(output, new DataflowLinkOptions { PropagateCompletion = true });
foreach (string s in keywords)
await input.SendAsync(s);
input.Complete();
await output.Completion;
return result;
}
public static async Task<Keyword> Work(string s) // Stubbed test method.
{
Console.WriteLine("Processing " + s);
int delay;
lock (rng) { delay = rng.Next(10, 1000); }
await Task.Delay(delay); // Simulate load.
Console.WriteLine("Completed " + s);
return await Task.Run( () => new Keyword { Name = s });
}
static Random rng = new Random();
}
}

TaskFactory, Starting a new Task when one ends

I have found many examples of using TaskFactory, but I could not find anything about starting several tasks, watching for when one ends, and starting another one.
I always want to have 10 tasks working.
I want something like this
int nTotalTasks = 10;
int nCurrentTask = 0;
Task<bool>[] tasks = new Task<bool>[nTotalTasks];
for (int i = 0; i < 1000; i++)
{
    string param1 = "test";
    string param2 = "test";
    if (nCurrentTask < 10) // if there are less than 10 tasks then start another one
        tasks[nCurrentTask++] = Task.Factory.StartNew<bool>(() =>
        {
            MyClass cls = new MyClass();
            bool bRet = cls.Method1(param1, param2, i); // takes up to 2 minutes to finish
            return bRet;
        });
    // How can I stop the for loop until a new task is finished and start a new one?
}
Check out the Task.WaitAny method:
Waits for any of the provided Task objects to complete execution.
Example from the documentation:
var t1 = Task.Factory.StartNew(() => DoOperation1());
var t2 = Task.Factory.StartNew(() => DoOperation2());
Task.WaitAny(t1, t2);
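A minimal sketch of how Task.WaitAny could drive the loop from the question so that 10 tasks are always in flight (the list bookkeeping here is illustrative and untested, reusing MyClass.Method1 from the question):
var running = new List<Task<bool>>();
for (int i = 0; i < 1000; i++)
{
    int captured = i; // capture the loop variable for the closure
    if (running.Count == 10)
    {
        // Wait until any of the 10 tasks finishes, then make room for a new one.
        int finished = Task.WaitAny(running.ToArray());
        running.RemoveAt(finished);
    }
    running.Add(Task.Factory.StartNew(() =>
    {
        MyClass cls = new MyClass();
        return cls.Method1("test", "test", captured); // takes up to 2 minutes to finish
    }));
}
Task.WaitAll(running.ToArray()); // wait for the remaining tasks to finish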
I would use a combination of Microsoft's Reactive Framework (NuGet "Rx-Main") and TPL for this. It becomes very simple.
Here's the code:
int nTotalTasks=10;
string param1="test";
string param2="test";
IDisposable subscription =
Observable
.Range(0, 1000)
.Select(i => Observable.FromAsync(() => Task.Factory.StartNew<bool>(() =>
{
MyClass cls = new MyClass();
bool bRet = cls.Method1(param1, param2, i); // takes up to 2 minutes to finish
return bRet;
})))
.Merge(nTotalTasks)
.ToArray()
.Subscribe((bool[] results) =>
{
/* Do something with the results. */
});
The key part here is the .Merge(nTotalTasks) which limits the number of concurrent tasks.
If you need to stop the processing part way thru just call subscription.Dispose() and everything gets cleaned up for you.
If you want to process each result as they are produced you can change the code from the .Merge(...) like this:
.Merge(nTotalTasks)
.Subscribe((bool result) =>
{
/* Do something with each result. */
});
This isn't complete, but it should be all you need: wait for the first task to complete and then start the next one.
Task.WaitAny(task to wait on);
Task.Factory.StartNew()
Have you seen the BlockingCollection class? It allows you to have multiple threads running in parallel, and you can wait for the results of one task before executing another. See more information here.
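A minimal sketch of that idea, reusing MyClass.Method1 from the question (the capacities and counts are illustrative, not a drop-in implementation):
var work = new BlockingCollection<int>(boundedCapacity: 100);
// 10 long-running consumers, so 10 work items are always being processed.
var consumers = Enumerable.Range(0, 10).Select(_ => Task.Run(() =>
{
    foreach (int i in work.GetConsumingEnumerable())
    {
        MyClass cls = new MyClass();
        cls.Method1("test", "test", i); // takes up to 2 minutes to finish
    }
})).ToArray();
for (int i = 0; i < 1000; i++)
    work.Add(i); // blocks when the bounded capacity is reached
work.CompleteAdding();
Task.WaitAll(consumers);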
The answer depends on whether the tasks to be scheduled are CPU-bound or I/O-bound.
For CPU-intensive work I would use the Parallel.For() API, setting the number of threads/tasks through the MaxDegreeOfParallelism property of ParallelOptions (a quick sketch follows below).
For I/O-bound work the number of concurrently executing tasks can be significantly larger than the number of available CPUs, so the strategy is to rely on async methods as much as possible, which reduces the total number of threads waiting for completion.
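A minimal sketch of the CPU-bound variant, reusing MyClass.Method1 from the question (the degree of parallelism shown is illustrative):
var options = new ParallelOptions { MaxDegreeOfParallelism = 10 };
Parallel.For(0, 1000, options, i =>
{
    MyClass cls = new MyClass();
    cls.Method1("test", "test", i); // CPU-bound work; at most 10 iterations run at once
});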
How can I stop the for loop until a new task is finished and start a
new one?
The loop can be throttled by using await:
static void Main(string[] args)
{
    var task = DoWorkAsync();
    task.Wait();
    // handle results
    // task.Result;
    Console.WriteLine("Done.");
}
async static Task<bool> DoWorkAsync()
{
    const int NUMBER_OF_SLOTS = 10;
    string param1 = "test";
    string param2 = "test";
    var results = new bool[NUMBER_OF_SLOTS];
    AsyncWorkScheduler ws = new AsyncWorkScheduler(NUMBER_OF_SLOTS);
    for (int i = 0; i < 1000; ++i)
    {
        await ws.ScheduleAsync((slotNumber) => DoWorkAsync(i, slotNumber, param1, param2, results));
    }
    ws.Complete();
    await ws.Completion;
    return results.All(r => r); // summarize the per-slot results (requires System.Linq)
}
async static Task DoWorkAsync(int index, int slotNumber, string param1, string param2, bool[] results)
{
    results[slotNumber] = await Task.Factory.StartNew<bool>(() =>
    {
        MyClass cls = new MyClass();
        bool bRet = cls.Method1(param1, param2, index); // takes up to 2 minutes to finish
        return bRet;
    });
}
A helper class AsyncWorkScheduler uses TPL.DataFlow components as well as Task.WhenAll():
class AsyncWorkScheduler
{
public AsyncWorkScheduler(int numberOfSlots)
{
m_slots = new Task[numberOfSlots];
m_availableSlots = new BufferBlock<int>();
m_errors = new List<Exception>();
m_tcs = new TaskCompletionSource<bool>();
m_completionPending = 0;
// Initial state: all slots are available
for(int i = 0; i < m_slots.Length; ++i)
{
m_slots[i] = Task.FromResult(false);
m_availableSlots.Post(i);
}
}
public async Task ScheduleAsync(Func<int, Task> action)
{
if (Volatile.Read(ref m_completionPending) != 0)
{
throw new InvalidOperationException("Unable to schedule new items.");
}
// Acquire a slot
int slotNumber = await m_availableSlots.ReceiveAsync().ConfigureAwait(false);
// Schedule a new task for a given slot
var task = action(slotNumber);
// Store a continuation on the task to handle completion events
m_slots[slotNumber] = task.ContinueWith(t => HandleCompletedTask(t, slotNumber), TaskContinuationOptions.ExecuteSynchronously);
}
public async void Complete()
{
if (Interlocked.CompareExchange(ref m_completionPending, 1, 0) != 0)
{
return;
}
// Signal the queue's completion
m_availableSlots.Complete();
await Task.WhenAll(m_slots).ConfigureAwait(false);
// Set completion
if (m_errors.Count != 0)
{
m_tcs.TrySetException(m_errors);
}
else
{
m_tcs.TrySetResult(true);
}
}
public Task Completion
{
get
{
return m_tcs.Task;
}
}
void SetFailed(Exception error)
{
lock(m_errors)
{
m_errors.Add(error);
}
}
void HandleCompletedTask(Task task, int slotNumber)
{
if (task.IsFaulted || task.IsCanceled)
{
SetFailed(task.Exception);
return;
}
if (Volatile.Read(ref m_completionPending) == 1)
{
return;
}
// Release a slot
m_availableSlots.Post(slotNumber);
}
int m_completionPending;
List<Exception> m_errors;
BufferBlock<int> m_availableSlots;
TaskCompletionSource<bool> m_tcs;
Task[] m_slots;
}

C# infinite task loop using Task<> class + cancellation

I'm trying to make a small class for multithreading usage in my WinForms projects.
I tried Threads (problems with the UI) and BackgroundWorker (something went wrong with the UI there too, let's leave that for now :)), and now I'm trying to do it with the Task class. But I can't understand how to make an infinite loop and a cancel method (in the class) for all running tasks.
The examples I found are meant to be used inside a single method.
So, here is the structure & code of the currently working part (Worker.cs and the methods used in the WinForm code).
Worker.cs
class Worker
{
    public static int threadCount { get; set; }
    public void doWork(ParameterizedThreadStart method)
    {
        Task[] tasks = Enumerable.Range(0, 4).Select(i => Task.Factory.StartNew(() => method(i))).ToArray();
    }
}
Usage in Form1.cs:
private void Start_btn_Click(object sender, EventArgs e)
{
Worker.threadCount = 1; // not actually used right now; the number of tasks is temporarily hard-coded in the class
Worker worker = new Worker();
worker.doWork(Job);
string logString_1 = string.Format("Starting {0} threads...", Worker.threadCount);
log(logString_1);
}
public static int j = 0;
private void Job(object sender)
{
Worker worker = new Worker();
Random r = new Random();
log("Thread "+Thread.CurrentThread.ManagedThreadId +" is working...");
for (int i = 0; i < 5; i++)
{
j++;
log("J==" + j);
if (j == 50)
{
//worker.Stop();
log("STOP");
}
}
Thread.Sleep(r.Next(500, 1000));
}
So, it runs 4 example tasks, they execute, I get J==20 in my log; that's OK.
My question is how to implement an infinite loop for the tasks created by the Worker.doWork() method.
And also how to make a .Stop() method for the Worker class (which should just stop all tasks when called). As I understand it these are related questions, so I put them in one.
I tried some solutions, but all of them are based on CancellationToken usage, where the token is created inside the Worker.doWork() method itself, so I can't use the same token to create a Worker.Stop() method.
Can someone help? The number of threads I have to use in this software ranges from about 5 to 200.
The J computation is just an example of an easy condition used to stop the work (stop the tasks/threads).
In reality, the stop conditions are mostly things like a Queue<> being finished or a List<> running out of elements.
Finally, got it working.
class Worker
{
public static int threadCount { get; set; }
Task[] tasks;
//ex data
public static string exception;
static CancellationTokenSource wtoken = new CancellationTokenSource();
CancellationToken cancellationToken = wtoken.Token;
public void doWork(ParameterizedThreadStart method)
{
try
{
tasks = Enumerable.Range(0, 4).Select(i => Task.Factory.StartNew(() =>
{
while (!cancellationToken.IsCancellationRequested)
{
method(i);
}
}, cancellationToken)).ToArray();
}
catch (Exception ex) { exception = ex.Message; }
}
public void HardStop()
{
try
{
using (wtoken)
{
wtoken.Cancel();
}
wtoken = null;
tasks = null;
}
catch (Exception ex) { exception = ex.Message; }
}
}
But if I use this method to quit: cancellationToken.ThrowIfCancellationRequested();
I get an error:
when the Job() method reaches J == 50 and the worker.HardStop() function is called, the program window crashes and I get an exception "OperationCanceledException was unhandled by user code"
on this line:
cancellationToken.ThrowIfCancellationRequested();
So, what's wrong? I already put it in a try {} catch () {}.
As I understood it, just some boolean properties should change on the Task (Task.IsCanceled == false, Task.IsFaulted == true) when wtoken.Cancel() is called.
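For context, a minimal sketch of one way the cancellation exception could be observed instead of crashing: the OperationCanceledException is thrown inside the task bodies, not in HardStop itself, so it has to be handled where the tasks are waited on. The Stop name and the waiting code below are illustrative only, assuming the tasks and wtoken fields from the class above.
public void Stop()
{
    wtoken.Cancel();
    try
    {
        // The cancellation exception surfaces here, wrapped in an AggregateException,
        // not in the try/catch inside HardStop.
        Task.WaitAll(tasks);
    }
    catch (AggregateException ae)
    {
        ae.Handle(ex => ex is OperationCanceledException); // swallow cancellations only
    }
}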
I'd avoid all of the mucking around with tasks and use Microsoft's Reactive Framework (NuGet "Rx-Main") for this.
Here's how:
var r = new Random();
var query =
Observable
.Range(0, 4, Scheduler.Default)
.Select(i =>
Observable
.Generate(0, x => true, x => x, x => x,
x => TimeSpan.FromMilliseconds(r.Next(500, 1000)),
Scheduler.Default)
.Select(x => i))
.Merge();
var subscription =
query
.Subscribe(i => method(i));
And when you want to cancel the calls to method just do this:
subscription.Dispose();
I've tested this and it works like a treat.
If I wrap this up in your worker class then it looks like this:
class Worker
{
private Random _r = new Random();
private IDisposable _subscription = null;
public void doWork(Action<int> method)
{
_subscription =
Observable
.Range(0, 4, Scheduler.Default)
.Select(n =>
Observable
.Generate(
0, x => true, x => x, x => x,
x => TimeSpan.FromMilliseconds(_r.Next(500, 1000)),
Scheduler.Default)
.Select(x => n))
.Merge()
.Subscribe(i => method(i));
}
public void HardStop()
{
_subscription.Dispose();
}
}
