Nested transactions in StatefulService to save async state of aborted transaction - c#

I have a ReliableQueue<MyTask> which is enqueued into in a different scope, and I'm dequeuing tasks in a transaction, then want to run some long running calculations on each task.
The problem here is that in case my queue transaction is aborted, I don't want to lose the instance of the long calculation. It will keep running in the background, independent, and I just want to check if its completed or not once I retry to process the tasks.
Code segment:
public void protected override async Task RunAsync(CancellationToken cancellationToken)
{
var queue = await StateManager.GetOrAddAsync<IReliableQueue<MyTask>>(...);
while(!cancellationToken.IsCancellationRequested)
{
using (var transaction = ...)
{
var myTaskConditional = await queue.TryDequeueAsync(transaction);
if (!myTaskConditional.HasValue)
{
break;
}
await DoLongProcessing(myTaskConditional)
await transaction.CommitAsync();
}
}
}
private async void DoLongProcessing(MyTask myTask) {
var dict = await StateManager.GetOrAddAsync<IReliableDictionary<Guid,Guid>>(...);
Conditional<Guid> guidConditional;
using (var transaction = ...)
{
guidConditional = await dict.TryGetValueAsync(myTask.TaskGuid);
if (guidConditional.HasValue) {
await transaction.CommitAsync();
// continue handling knowing we already started, continue to wait for
await WaitForClaulcationFinish(guidConditional.Value);
}
else {
// start handling knowing we never handled this task, create new guid and store it in dict
var runGuid = await StartRunningCalculation(runGuid);
await dict.AddAsync(myTask.TaskGuid, runGuid);
await transaction.CommitAsync();
await WaitForClaulcationFinish(runGuid);
}
}
}
My concern: I'm using nested transactions and that's not recommended.
Is there actually a risk of deadlock here if I'm using the transactions solely for the ReliableQueue or ReliableDictionary separately?
Is there a better intended design for what I'm trying to achieve?

You should not be doing anything long running within a transaction. Take a look at the priority queue service I published. Take the item out of the queue an place it into a collections while doing work, then when done, either put it back into the queue or complete the work.

Related

.NET5 Background Service Concurrency

I have a background service that will be started when the application performing startup. The background service will start to create multiple tasks based on how many workers are set. As I do various trials and monitor the open connection on DB. The open connection is always the same value as the worker I set. Let say I set 32 workers, then the connection will be always 32 open connections shown as I use query to check it. FYI I am using Postgres as the DB server. In order to check the open connection, I use the query below to check the connection when the application is running.
select * from pg_stat_activity where application_name = 'myapplication';
Below is the background service code.
public class MessagingService : BackgroundService {
private int worker = 32;
protected override async Task ExecuteAsync(CancellationToken cancellationToken) {
var tasks = new List<Task>();
for (int i=0; i<worker; i++) {
tasks.Add(DoJob(cancellationToken));
}
while (!cancellationToken.IsCancellationRequested) {
try {
var completed = await Task.WhenAny(tasks);
tasks.Remove(completed);
} catch (Exception) {
await Task.Delay(1000, cancellationToken);
}
if (!cancellationToken.IsCancellationRequested) {
tasks.Add(DoJob(cancellationToken));
}
}
}
private async Task DoJob(CancellationToken cancellationToken) {
using (var scope = _services.CreateScope()) {
var service = scope.ServiceProvider
.GetRequiredService<MessageService>();
try {
//do select and update query on db if null return false otherwise send mail
if (!await service.Run(cancellationToken)) {
await Task.Delay(1000, cancellationToken);
}
} catch (Exception) {
await Task.Delay(1000, cancellationToken);
}
}
}
}
The workflow is not right as it will keep creating the task and leave the connection open and idle. Also, the CPU and memory usage are high when running those tasks. How can I achieve like when there is no record found on DB only keep 1 worker running at the moment? If a record or more is found it will keep increasing until the preset max worker then decreasing the worker when the record is less than the max worker. If this question is too vague or opinion-based then please let me know and I will try my best to make it as specific as possible.
Update Purpose
The purpose of this service is to perform email delivery. There is another API that will be used to create a scheduled job. Once the job is added to the DB, this service will do the email delivery at the scheduled time. Eg, 5k schedule jobs are added to the DB and the scheduled time to perform the job is '2021-12-31 08:00:00' and the time when creating the scheduled job is 2021-12-31 00:00:00'. The service will keep on looping from 00:00:00 until 08:00:00 with 32 workers running at the same time then just start to do the email delivery. How can I improve it to more efficiency like normally when there is no job scheduled only 1 worker is running. When it checked there is 5k scheduled job it will fully utilise all the worker. After 5k job is completed, it will back to 1 workers.
My suggestion is to spare yourself from the burden of manually creating and maintaining worker tasks, by using an ActionBlock<T> from the TPL Dataflow library. This component is a combination of an input queue and an Action<T> delegate. You specify the delegate in its constructor, and you feed it with messages with its Post method. The component invokes the delegate for each message it receives, with the specified degree of parallelism. When there are no more messages to send, you notify it by invoking its Complete method, and then await its Completion so that you know that all work that was delegated to it has completed.
Below is a rough demonstration if how you could use this component:
protected override async Task ExecuteAsync(CancellationToken cancellationToken)
{
var processor = new ActionBlock<Job>(async job =>
{
await ProcessJob(job);
await MarkJobAsCompleted(job);
}, new ExecutionDataflowBlockOptions()
{
MaxDegreeOfParallelism = 32
});
try
{
while (true)
{
Task delayTask = Task.Delay(TimeSpan.FromSeconds(60), cancellationToken);
Job[] jobs = await FetchReadyToProcessJobs();
foreach (var job in jobs)
{
await MarkJobAsPending(job);
processor.Post(job);
}
await delayTask; // Will throw when the token is canceled
}
}
finally
{
processor.Complete();
await processor.Completion;
}
}
The FetchReadyToProcessJobs method is supposed to connect to the database, and fetch all the jobs whose time has come to be processed. In the above example this method is invoked every 60 seconds. The Task.Delay is created before invoking the method, and awaited after the returned jobs have been posted to the ActionBlock<T>. This way the interval between invocations will be stable and consistent.

Is there a better, maybe more reliable pattern than "fire and forget" to handle variable number of asynchronous tasks concurrently?

I have following code:
while (!cancellationToken.IsCancellationRequested)
{
var connection = await listener.AcceptAsync(cancellationToken);
HandleConnectionAsync(connection, cancellationToken)
.FireAndForget(HandleException);
}
The FireAndForget is an extension method:
public static async void FireAndForget(this ValueTask task, Action<Exception> exceptionHandler)
{
try
{
await task.ConfigureAwait(false);
}
catch (Exception e)
{
exceptionHandler.Invoke(e);
}
}
The while loop is the server lifecycle. When new connection is accepted then it starts some "background task" so it can handle this new connection and then while loop goes back to accepting new connections without awaiting anything - pausing the lifecycle.
I cannot await HandleConnectionAsync (pause the lifecycle) here, because I want to immediately accept another connection (if there is one) and be able to handle multiple connections concurrently. HandleConnectionAsync is I/O bound and handles one connection at time until closed (task completes after some time).
The connections have to be handled separately - I don't want to have a situation when some error while handling one connection have any influence on other connections.
The "fire and forget" solution I have here works, but the general rule is to always await asynchronous methods and never use async void.
It seems like I've broken the rules, so is there a better, maybe more reliable way to handle variable (number of tasks varies in time) number of asynchronous I/O bound tasks concurrently in a situation described here?
More information:
Each call to AcceptAsync allocates system resources even before returning the connection and I want to avoid that whenever possible (the connection may not be returned for hours (code may "await" for hours) - until some external client decides to connect to my server). It is better to assume that this is the method I don't want to be called concurrently/in parallel - just one AcceptAsync at time is enough
Please take into account that I can have millions of clients per day connecting and disconnecting to my server and server (while loop) can work for many many days
I don't know how many connections I will need to handle at a specific time
I do know the maximum number of connections my program will be able to handle concurrently
If I hit the maximum number of connections limit then AcceptAsync won't return new connection until some other active connection closes, so I don't need to worry about that, but any solution based on this limit have to take into account that the active connections may be closed and I still need to handle new connections - number of connections varies over time. "fire and forget" have no issues with that
The code for HandleConnectionAsync is not relevant - it just handles one connection at time until closed (task completes after some time) and is I/O bound (HandleConnectionAsync handles one connection at time, but of course we can start multiple HandleConnectionAsync tasks to handle multiple connections concurrently - which is what I did with "fire and forget")
I'm assuming that changing to something like SignalR isn't an acceptable solution. That would be my first recommendation.
Custom server sockets is a scenario where some kind of "fire and forget" is acceptable. I'm considering adding a "task manager" kind of type to AsyncEx to make this kind of solution easier, but haven't done it yet.
The bottom line is that you need to manage your list of connections yourself. The "connection" object can include a Task that represents the handling loop; that's fine. It's also useful (especially for debugging or management purposes) to have other properties on there as well, such as the remote IP.
So I would approach it something like this:
private readonly object _mutex = new object();
private readonly List<State> _connections = new List<State>();
private void Add(State state)
{
lock (_mutex)
_connections.Add(state);
}
private void Remove(State state)
{
lock (_mutex)
_connections.Remove(state);
}
public async Task RunAsync(CancellationToken cancellationToken)
{
while (true)
{
var connection = await listener.AcceptAsync(cancellationToken);
Add(new State(this, connection));
}
}
private sealed class State
{
private readonly Parent _parent;
public State(Parent parent, Connection connection, CancellationToken cancellationToken)
{
_parent = parent;
Task = ExecuteAsync(connection, cancellationToken);
}
private static async Task ExecuteAsync(Connection connection, CancellationToken cancellationToken)
{
try { await HandleConnectionAsync(connection, cancellationToken); }
finally { _parent.Remove(this); }
}
public Task Task { get; }
// other properties as desired, e.g., RemoteAddress
}
You now have a collection of connections. You can either ignore the tasks in the State objects (as the code above is doing), which is just like fire-and-forget. Or you can await them all at some point. E.g.:
public async Task RunAsync(CancellationToken cancellationToken)
{
try
{
while (true)
{
var connection = await listener.AcceptAsync(cancellationToken);
Add(new State(this, connection));
}
}
catch (OperationCanceledException)
{
// Wait for all connections to cancel.
// I'm not really sure why you would *want* to do this, though.
List<State> connections;
lock (_mutex) { connections = _connections.ToList(); }
await Task.WhenAll(connections.Select(x => x.Task));
}
}
Then it's easy to extend the State object so you can do things that are sometimes useful for a server app to do, e.g.:
List all remote addresses this server has connections to.
Wait until a specific connection is done.
...
Notes:
Use one pattern for cancellation. Passing the token will result in an OperationCanceledException, which is the normal cancellation pattern. The code also was formerly doing a while (!IsCancellationRequested), resulting in a successful completion on cancellation, which is not the normal cancellation pattern. So I removed that so the code is no longer using two cancellation patterns.
When working with raw sockets, in the general case, you need to be constantly reading (even when you're writing) and periodically writing (even if you have no data to send). So your HandleConnectionAsync should be starting an asynchronous reader and writer and then using Task.WhenAll.
I removed the call to HandleException because (probably) whatever it does should be handled by State.ExecuteAsync. It's not hard to add it back in if necessary.
If there is a limit to the maximum number of allowed concurrent tasks, you should use SemaphoreSlim:
int allowedConcurrent = //..
var semaphore = new SemaphoreSlim(allowedConcurrent);
var tasks = new List<Task>();
while (!cancellationToken.IsCancellationRequested)
{
Func<Task> func = async () =>
{
var connection = await listener.AcceptAsync(cancellationToken);
await HandleConnectionAsync(connection, cancellationToken);
semaphore.Release();
};
await semaphore.WaitAsync(); // Will return immediately if the number of concurrent tasks does not exceed allowed
tasks.Add(func());
}
await Task.WhenAll(tasks);
This will accumulate the tasks into a list, then Task.WhenAll can wait for them all to complete.
First things first:
Don't do async void...
Then you can implement a producer/consumer pattern for this, the below pseudocode is just to guide, you need to make sure your Consumer is a Singleton in your app
public class Data
{
public Uri Url { get; set; }
}
public class Producer
{
private Consumer _consumer = new Consumer();
public void DoStuff()
{
var data = new Data();
_consumer.Enqueue(data);
}
}
public class Consumer
{
private readonly List<Data> _toDo = new List<Data>();
private bool _stop = false;
public Consumer()
{
Task.Factory.StartNew(Loop);
}
private async Task Loop()
{
while (!_stop)
{
Data toDo = null;
lock (_toDo)
{
if (_toDo.Any())
{
toDo = _toDo.First();
_toDo.RemoveAt(0);
}
}
if (toDo != null)
{
await DoSomething(toDo);
}
Thread.Sleep(TimeSpan.FromSeconds(1));
}
}
private async Task DoSomething(Data toDo)
{
// YOUR ASYNC STUFF HERE
}
public void Enqueue(Data data)
{
lock (_toDo)
{
_toDo.Add(data);
}
}
}
So your calling method produces what you need to do the background task and the consumer performs that, that's another fire and forget.
You should consider too what happens if something goes wrong at an application level, should you store the Data in the Consumer.Enqueue() so if the app starts again can do the missing job...
Hope this helps

Queuing asynchronous task in C#

I have few methods that report some data to Data base. We want to invoke all calls to Data service asynchronously. These calls to data service are all over and so we want to make sure that these DS calls are executed one after another in order at any given time. Initially, i was using async await on each of these methods and each of the calls were executed asynchronously but we found out if they are out of sequence then there are room for errors.
So, i thought we should queue all these asynchronous tasks and send them in a separate thread but i want to know what options we have? I came across 'SemaphoreSlim' . Will this be appropriate in my use case?
Or what other options will suit my use case? Please, guide me.
So, what i have in my code currently
public static SemaphoreSlim mutex = new SemaphoreSlim(1);
//first DS call
public async Task SendModuleDataToDSAsync(Module parameters)
{
var tasks1 = new List<Task>();
var tasks2 = new List<Task>();
//await mutex.WaitAsync(); **//is this correct way to use SemaphoreSlim ?**
foreach (var setting in Module.param)
{
Task job1 = SaveModule(setting);
tasks1.Add(job1);
Task job2= SaveModule(GetAdvancedData(setting));
tasks2.Add(job2);
}
await Task.WhenAll(tasks1);
await Task.WhenAll(tasks2);
//mutex.Release(); // **is this correct?**
}
private async Task SaveModule(Module setting)
{
await Task.Run(() =>
{
// Invokes Calls to DS
...
});
}
//somewhere down the main thread, invoking second call to DS
//Second DS Call
private async Task SendInstrumentSettingsToDS(<param1>, <param2>)
{
//await mutex.WaitAsync();// **is this correct?**
await Task.Run(() =>
{
//TrackInstrumentInfoToDS
//mutex.Release();// **is this correct?**
});
if(param2)
{
await Task.Run(() =>
{
//TrackParam2InstrumentInfoToDS
});
}
}
Initially, i was using async await on each of these methods and each of the calls were executed asynchronously but we found out if they are out of sequence then there are room for errors.
So, i thought we should queue all these asynchronous tasks and send them in a separate thread but i want to know what options we have? I came across 'SemaphoreSlim' .
SemaphoreSlim does restrict asynchronous code to running one at a time, and is a valid form of mutual exclusion. However, since "out of sequence" calls can cause errors, then SemaphoreSlim is not an appropriate solution since it does not guarantee FIFO.
In a more general sense, no synchronization primitive guarantees FIFO because that can cause problems due to side effects like lock convoys. On the other hand, it is natural for data structures to be strictly FIFO.
So, you'll need to use your own FIFO queue, rather than having an implicit execution queue. Channels is a nice, performant, async-compatible queue, but since you're on an older version of C#/.NET, BlockingCollection<T> would work:
public sealed class ExecutionQueue
{
private readonly BlockingCollection<Func<Task>> _queue = new BlockingCollection<Func<Task>>();
public ExecutionQueue() => Completion = Task.Run(() => ProcessQueueAsync());
public Task Completion { get; }
public void Complete() => _queue.CompleteAdding();
private async Task ProcessQueueAsync()
{
foreach (var value in _queue.GetConsumingEnumerable())
await value();
}
}
The only tricky part with this setup is how to queue work. From the perspective of the code queueing the work, they want to know when the lambda is executed, not when the lambda is queued. From the perspective of the queue method (which I'm calling Run), the method needs to complete its returned task only after the lambda is executed. So, you can write the queue method something like this:
public Task Run(Func<Task> lambda)
{
var tcs = new TaskCompletionSource<object>();
_queue.Add(async () =>
{
// Execute the lambda and propagate the results to the Task returned from Run
try
{
await lambda();
tcs.TrySetResult(null);
}
catch (OperationCanceledException ex)
{
tcs.TrySetCanceled(ex.CancellationToken);
}
catch (Exception ex)
{
tcs.TrySetException(ex);
}
});
return tcs.Task;
}
This queueing method isn't as perfect as it could be. If a task completes with more than one exception (this is normal for parallel code), only the first one is retained (this is normal for async code). There's also an edge case around OperationCanceledException handling. But this code is good enough for most cases.
Now you can use it like this:
public static ExecutionQueue _queue = new ExecutionQueue();
public async Task SendModuleDataToDSAsync(Module parameters)
{
var tasks1 = new List<Task>();
var tasks2 = new List<Task>();
foreach (var setting in Module.param)
{
Task job1 = _queue.Run(() => SaveModule(setting));
tasks1.Add(job1);
Task job2 = _queue.Run(() => SaveModule(GetAdvancedData(setting)));
tasks2.Add(job2);
}
await Task.WhenAll(tasks1);
await Task.WhenAll(tasks2);
}
Here's a compact solution that has the least amount of moving parts but still guarantees FIFO ordering (unlike some of the suggested SemaphoreSlim solutions). There are two overloads for Enqueue so you can enqueue tasks with and without return values.
using System;
using System.Threading;
using System.Threading.Tasks;
public class TaskQueue
{
private Task _previousTask = Task.CompletedTask;
public Task Enqueue(Func<Task> asyncAction)
{
return Enqueue(async () => {
await asyncAction().ConfigureAwait(false);
return true;
});
}
public async Task<T> Enqueue<T>(Func<Task<T>> asyncFunction)
{
var tcs = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
// get predecessor and wait until it's done. Also atomically swap in our own completion task.
await Interlocked.Exchange(ref _previousTask, tcs.Task).ConfigureAwait(false);
try
{
return await asyncFunction().ConfigureAwait(false);
}
finally
{
tcs.SetResult();
}
}
}
Please keep in mind that your first solution queueing all tasks to lists doesn't ensure that the tasks are executed one after another. They're all running in parallel because they're not awaited until the next tasks is startet.
So yes you've to use a SemapohoreSlim to use async locking and await. A simple implementation might be:
private readonly SemaphoreSlim _syncRoot = new SemaphoreSlim(1);
public async Task SendModuleDataToDSAsync(Module parameters)
{
await this._syncRoot.WaitAsync();
try
{
foreach (var setting in Module.param)
{
await SaveModule(setting);
await SaveModule(GetAdvancedData(setting));
}
}
finally
{
this._syncRoot.Release();
}
}
If you can use Nito.AsyncEx the code can be simplified to:
public async Task SendModuleDataToDSAsync(Module parameters)
{
using var lockHandle = await this._syncRoot.LockAsync();
foreach (var setting in Module.param)
{
await SaveModule(setting);
await SaveModule(GetAdvancedData(setting));
}
}
One option is to queue operations that will create tasks instead of queuing already running tasks as the code in the question does.
PseudoCode without locking:
Queue<Func<Task>> tasksQueue = new Queue<Func<Task>>();
async Task RunAllTasks()
{
while (tasksQueue.Count > 0)
{
var taskCreator = tasksQueue.Dequeu(); // get creator
var task = taskCreator(); // staring one task at a time here
await task; // wait till task completes
}
}
// note that declaring createSaveModuleTask does not
// start SaveModule task - it will only happen after this func is invoked
// inside RunAllTasks
Func<Task> createSaveModuleTask = () => SaveModule(setting);
tasksQueue.Add(createSaveModuleTask);
tasksQueue.Add(() => SaveModule(GetAdvancedData(setting)));
// no DB operations started at this point
// this will start tasks from the queue one by one.
await RunAllTasks();
Using ConcurrentQueue would be likely be right thing in actual code. You also would need to know total number of expected operations to stop when all are started and awaited one after another.
Building on your comment under Alexeis answer, your approch with the SemaphoreSlim is correct.
Assumeing that the methods SendInstrumentSettingsToDS and SendModuleDataToDSAsync are members of the same class. You simplay need a instance variable for a SemaphoreSlim and then at the start of each methode that needs synchornization call await lock.WaitAsync() and call lock.Release() in the finally block.
public async Task SendModuleDataToDSAsync(Module parameters)
{
await lock.WaitAsync();
try
{
...
}
finally
{
lock.Release();
}
}
private async Task SendInstrumentSettingsToDS(<param1>, <param2>)
{
await lock.WaitAsync();
try
{
...
}
finally
{
lock.Release();
}
}
and it is importend that the call to lock.Release() is in the finally-block, so that if an exception is thrown somewhere in the code of the try-block the semaphore is released.

Contentious paralleled work between two collections of objects

I am trying to simulate work between two collections asynchronously and in parallel, I have a ConcurrentQueue of customers and a collection of workers. I need the workers to take a Customer from the queue perform work on the customer and once done take another customer right away.
I decided I'd use an event-based paradigm where the collection of workers would perform an action on a customer; who holds an event handler that fires off when the customer is done; which would hopefully fire off the DoWork Method once again, that way I can parallelize the workers to take customers from the queue. But I can't figure out how to pass a customer into DoWork in OnCustomerFinished()! The worker shouldn't depend on a queue of customers obviously
public class Worker
{
public async Task DoWork(ConcurrentQueue<Customer> cust)
{
await Task.Run(() =>
{
if (cust.TryDequeue(out Customer temp))
{
Task.Delay(5000);
temp.IsDone = true;
}
});
}
public void OnCustomerFinished()
{
// This is where I'm stuck
DoWork(~HOW TO PASS THE QUEUE OF CUSTOMER HERE?~);
}
}
// Edit - This is the Customer Class
public class Customer
{
private bool _isDone = false;
public EventHandler<EventArgs> CustomerFinished;
public bool IsDone
{
private get { return _isDone; }
set
{
_isDone = value;
if (_isDone)
{
OnCustomerFinished();
}
}
}
protected virtual void OnCustomerFinished()
{
if (CustomerFinished != null)
{
CustomerFinished(this, EventArgs.Empty);
}
}
}
.NET already has pub/sub and worker mechanisms in the form of DataFlow blocks and lately, Channels.
Dataflow
Dataflow blocks from the System.Threading.Tasks.Dataflow namespace are the "old" way (2012 and later) of building workers and pipelines of workers. Each block has an input and/or output buffer. Each message posted to the block is processed by one or more tasks in the background. For blocks with outputs, the output of each iteration is stored in the output buffer.
Blocks can be combined into pipelines similar to a CMD or Powershell pipeline, with each block running on its own task(s).
In the simplest case an ActionBlock can be used as a worker:
void ProcessCustomer(Customer customer)
{
....
}
var block =new ActionBlock<Customer>(cust=>ProcessCustomer(cust));
That's it. There's no need to manually dequeue or poll.
The producer method can start sending customer instances to the block. Each of them will be processed in the background, in the order they were posted :
foreach(var customer in bigCustomerList)
{
block.Post(customer);
}
When done, eg when the application terminates, the producer only needs to call Complete() on the block and wait for any remaining entries to complete.
block.Complete();
await block.Completion;
Blocks can work with asynchronous methods too.
Channels
Channels are a new mechanism, built into .NET Core 3 and available as a NuGet in previous .NET Framework and .NET Core version. The producer writes to a channel using a ChannelWriter and the consumer reads from the channel using a ChannelReader. This may seem a bit strange until you realize it allows some powerful patterns.
The producer could be something like this, eg a producer that "produces" all customers in a list with a 0.5 sec delay :
ChannelReader<Customer> Producer(IEnumerable<Customer> customers,CancellationToken token=default)
{
//Create a channel that can buffer an infinite number of entries
var channel=Channel.CreateUnbounded();
var writer=channel.Writer;
//Start a background task to produce the data
_ = Task.Run(async ()=>{
foreach(var customer in customers)
{
//Exit gracefully in case of cancellation
if (token.IsCancellationRequested)
{
return;
}
await writer.WriteAsync(customer,token);
await Task.Delay(500);
}
},token)
//Ensure we complete the writer no matter what
.ContinueWith(t=>writer.Complete(t.Exception);
return channel.Reader;
}
That's a bit more involved but notice that the only thing the function needs to return is the ChannelReader. The cancellation token is useful for terminating the producer early, eg after a timeout or if the application closes.
When the writer completes, all the channel's readers will also complete.
The consumer only needs that ChannelReader to work :
async Task Consumer(ChannelReader<Customer> reader,CancellationToken token=default)
{
while(await reader.WaitToReadAsync(token))
{
while(reader.TryRead(out var customer))
{
//Process the customer
}
}
}
Should the writer complete, WaitToReadAsync will return false and the loop will exit.
In .NET Core 3 the ChannelReader supports IAsyncEnumerable through the ReadAllAsync method, making the code even simpler :
async Task Consumer(ChannelReader<Customer> reader,CancellationToken token=default)
{
await foreach(var customer in reader.ReadAllAsync(token))
{
//Process the customer
}
}
The reader created by the producer can be passed directly to the consumer :
var customers=new []{......}
var reader=Producer(customers);
await Consumer(reader);
Intermediate steps can read from a previous channel reader and publish data to the next, eg an order generator :
ChannelReader<Order> ConsumerOrders(ChannelReader<Customer> reader,CancellationToken token=default)
{
var channel=Channel.CreateUnbounded();
var writer=channel.Writer;
//Start a background task to produce the data
_ = Task.Run(async ()=>{
await foreach(var customer in reader.ReadAllAsync(token))
{
//Somehow create an order for the customer
var order=new Order(...);
await writer.WriteAsync(order,token);
}
},token)
//Ensure we complete the writer no matter what
.ContinueWith(t=>writer.Complete(t.Exception);
return channel.Reader;
}
Again, all we need to do is pass the readers from one method to the next
var customers=new []{......}
var customerReader=Producer(customers);
var orderReader=CustomerOrders(customerReader);
await ConsumeOrders(orderReader);

Ensuring that an async task has fully cancelled before restarting that task

OK, here goes: I have a part of an application where I am querying rows from a database. I perform the query when the user enters text into a search box (or alters another filter setting).
The data that is returned from the database is going into an ObservableCollection which is bound to a DataGrid. Because I'm conscious of keeping the UI responsive, I'm using Async-Await to (attempt) to fill this ObservableCollection in the background.
So, in my mind, every time the user types something (or changes the filter settings) I want to cancel the ongoing task wait for it to confirm it's cancelled and then "restart" (or rather create a new task) with the new settings.
But I'm getting all sorts of weird results (especially when I slow down the task to simulate slow database access) such as the collection not getting cleared and being populated twice and when disposing the CancellationTokenSource (which I read is a good idea) sometimes when I get to the point of calling Cancel() it's been disposed in the meantime and I get an exception.
I suspect that the issue stems from a fundamental gap in my understanding of the pattern I'm meant to use here so any style/pattern pointers are as welcome as an actual technical solution.
The code basically goes like this:
ObservableCollection<Thing> _thingCollection;
Task _thingUpdaterTask;
CancellationTokenSource _thingUpdaterCancellationSource;
// initialisation etc. here
async void PopulateThings(ThingFilterSettings settings)
{
// try to cancel any ongoing task
if(_thingUpdaterTask?.IsCompleted ?? false){
_thingUpdaterCancellationSource.Cancel();
await _thingUpdaterTask;
}
// I'm hoping that any ongoing task is now done with,
// but in reality that isn't happening. I'm guessing
// that's because Tasks are getting dereferenced and
// orphaned in concurrent calls to this method?
_thingCollection.Clear();
_thingUpdaterCancellationSource = new CancellationTokenSource();
var cancellationToken = _thingUpdaterCancellationSource.Token;
var progressHandler = new Progress<Thing>(x => _thingCollection.add(x));
var progress = (IProgress<Thing>)progressHandler;
try{
_thingUpdaterTask = Task.Factory.StartNew(
() => GetThings(settings, progress, cancellationToken));
await _thingUpdaterTask;
}catch(AggregateException e){
//handle stuff etc.
}finally{
// should I be disposing the Token Source here?
}
}
void GetThings(ThingFilterSettings settings,
IProgress<Thing> progress,
CancellationToken ctok){
foreach(var thingy in SomeGetThingsMethod(settings)){
if(ctok.IsCancellationRequested){
break;
}
progress.Report(thingy);
}
}
You could add a wrapper class, that will wait for the previous task execution to stop (either by finishing or by canceling) before starting the new task.
public class ChainableTask
{
private readonly Task _task;
private readonly CancellationTokenSource _cts = new CancellationTokenSource();
public ChainableTask(Func<CancellationToken, Task> asyncAction,
ChainableTask previous = null)
{
_task = Execute(asyncAction, previous);
}
private async Task CancelAsync()
{
try
{
_cts.Cancel();
await _task;
}
catch (OperationCanceledException)
{ }
}
private async Task Execute(Func<CancellationToken, Task> asyncAction, ChainableTask previous)
{
if (previous != null)
await previous.CancelAsync();
if (_cts.IsCancellationRequested)
return;
await asyncAction(_cts.Token);
}
}
If used the class above in previous projects. The class takes a lambda, the asyncAction to create the next task. The task is only created after the previous has finished.
It will pass a CancellationToken to each task, to allow the task to stop before finishing. Before the starting the next task, the token of the previous is canceled and the previous task is awaited. This happens in CancelAsync.
Only after the previous Cancel was awaited, we call the lambda to create the next task.
A usage example:
var firstAction = new ChainableTask(async tcs => await Task.Delay(1000));
var secondAction = new ChainableTask(async tcs => await Task.Delay(1000), firstAction ); // pass the previous action
In this example, the created task does not support cancellation, so the second call to ChainableTask will wait until the first Task.Delay(1000) finishes, before calling the second.

Categories