I have this code:
var Options = new ParallelOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount * 10,
    CancellationToken = CTS.Token
};

while (!CTS.IsCancellationRequested)
{
    var TasksZ = new[]
    {
        "Task A",
        "Task B",
        "Task C"
    };
    await Parallel.ForEachAsync(TasksZ, Options, async (Comando, Token) =>
    {
        await MyFunction(Comando);
        await Task.Delay(1000, Token);
    });
}
Now, Task A, B and C start together, and the cycle finishes when ALL the tasks are completed. Let's suppose that Task A and B finish in 10 seconds, but Task C takes 2 minutes. In this case, A and B have to wait 2 minutes too before starting again. How can I make this independent? I mean, every task on its own thread, and considering that TasksZ is loaded dynamically and can change during execution, by adding or removing tasks.
Also, to stop/pause each individual task I need a separate TaskCompletionSource for each one, but MyFunction is an interface shared between the main app and every DLL. Do I need to declare each TCS separately in the DLL(s), or just one in the common interface?
Edit:
My idea is (using this code from Microsoft) to have an app that runs separate DLLs, using the same interface, but each one has its own job to do and can't wait for the others. They mainly have this sequence of work: read a file -> handle an online POST request -> save a file -> communicate the returned JSON to the main app via a custom class -> repeat.
There is no other code I can show you to help you understand, because right now 90% is the same as the link above; the other 10% is just the POST request with a JSON return in a custom class, plus loading/saving the file.
To be 101% clear, using the example before, the situation should be this:
AM 12:00:00 = start all
AM 12:00:10 = task_A end // 10s
AM 12:00:10 = task_B end // 10s
AM 12:00:20 = task_A end // 10s
AM 12:00:20 = task_B end // 10s
AM 12:00:30 = task_A end // 10s
AM 12:00:30 = task_B end // 10s
...
AM 12:01:50 = task_A end // 10s
AM 12:01:50 = task_B end // 10s
AM 12:02:00 = task_C end // 2 minutes
AM 12:02:10 = task_A end // 10s
AM 12:02:10 = task_B end // 10s
...
(This is because I don't need live data for task_C, so it can POST every 2 minutes or so, but for task_A and task_B I need it live)
About the cores, the important thing is that the PC doesn't freeze or sit at 100% CPU. The server where I run this is a dual core, so MaxDegreeOfParallelism = Environment.ProcessorCount * 10 was just to avoid stressing the server too much.
I don't think that Parallel.ForEachAsync is a suitable tool for solving your problem. My suggestion is to store the tasks in a dictionary that has the commando strings as keys, and (Task, CancellationTokenSource) tuples as values. Each time you add a commando to the dictionary, you start a Task associated with a CancellationTokenSource, after awaiting any previous Task that was stored for the same commando, in order to prevent concurrent executions of the same commando. To limit the concurrency of all the commandos you can use a SemaphoreSlim, and to limit the parallelism (the number of threads actively running code at any given moment) you can use a limited-concurrency TaskScheduler. Here is a demo:
const int maximumConcurrency = 10;
const int maximumParallelism = 2;

Dictionary<string, (Task, CancellationTokenSource)> commandos = new();
SemaphoreSlim semaphore = new(maximumConcurrency, maximumConcurrency);
TaskScheduler scheduler = new ConcurrentExclusiveSchedulerPair(
    TaskScheduler.Default, maximumParallelism).ConcurrentScheduler;

StartCommando("Task A");
StartCommando("Task B");
StartCommando("Task C");

void StartCommando(string commando)
{
    Task existingTask = null;
    CancellationTokenSource existingCts = null;
    if (commandos.TryGetValue(commando, out var entry))
    {
        // Cancel the previous run of the same commando, if any.
        (existingTask, existingCts) = entry;
        existingCts.Cancel();
    }
    CancellationTokenSource cts = new();
    CancellationToken token = cts.Token;
    Task task = Task.Factory.StartNew(async () =>
    {
        // Await the previous run first, so that the same commando
        // is never executed concurrently with itself.
        if (existingTask is not null) try { await existingTask; } catch { }
        while (true)
        {
            await semaphore.WaitAsync(token); // limits total concurrency
            try
            {
                await MyFunction(commando, token);
            }
            finally { semaphore.Release(); }
        }
    }, token, TaskCreationOptions.DenyChildAttach, scheduler).Unwrap();
    commandos[commando] = (task, cts);
    existingCts?.Dispose();
}

void StopCommando(string commando)
{
    if (commandos.TryGetValue(commando, out var entry))
    {
        (_, CancellationTokenSource cts) = entry;
        cts.Cancel();
    }
}

Task DisposeAllCommandos()
{
    List<Task> tasks = new(commandos.Count);
    foreach (var (commando, entry) in commandos)
    {
        (Task task, CancellationTokenSource cts) = entry;
        cts.Cancel();
        commandos.Remove(commando);
        cts.Dispose();
        tasks.Add(task);
    }
    return Task.WhenAll(tasks);
}
It is important that none of the awaits are configured with ConfigureAwait(false). Enforcing the maximumParallelism policy depends on always staying in the realm of our preferred scheduler, so capturing the TaskScheduler.Current at the await points and continuing on that same scheduler is the desirable behavior, which is also the default behavior of await.
The StartCommando, StopCommando and DisposeAllCommandos methods are intended to be called sequentially, not in parallel. In case you want to control the execution of the commandos from multiple threads in parallel, you'll have to synchronize these calls with a lock.
The DisposeAllCommandos method is intended to be used before terminating the application. For a clean termination, the returned Task should be awaited. No more commandos should be started after calling this method.
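If you do need to call them from multiple threads, a minimal sketch of that synchronization (the syncRoot field and the *Safe wrapper names are my own additions, not part of the demo above) could look like this:
object syncRoot = new();

void StartCommandoSafe(string commando)
{
    lock (syncRoot) StartCommando(commando);
}

void StopCommandoSafe(string commando)
{
    lock (syncRoot) StopCommando(commando);
}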
As I mentioned in my comment above, you can create your own wrapper around a queue that manages background processors and re-queues the tasks as they complete.
In addition, you mentioned the need to dynamically add or remove tasks at will, which the implementation below handles.
And finally, it takes an external CancellationToken so that you can either call Stop on the processor itself, or cancel the parent CancellationTokenSource.
public class QueueProcessor
{
    // could be replaced with a ref-count solution to ensure
    // all duplicated tasks are removed
    private readonly HashSet<string> _tasksToRemove = new();
    private readonly ConcurrentQueue<string> _taskQueue;
    private Task[] _processors;
    private Func<string, CancellationToken, Task> _processorCallback;
    private CancellationTokenSource _cts;

    public QueueProcessor(
        string[] tasks,
        Func<string, CancellationToken, Task> processorCallback)
    {
        _taskQueue = new(tasks);
        _processorCallback = processorCallback;
    }

    public async Task StartAsync(int numberOfProcessorThreads,
        CancellationToken cancellationToken = default)
    {
        _cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        _processors = new Task[numberOfProcessorThreads];
        for (int i = 0; i < _processors.Length; i++)
        {
            _processors[i] = Task.Run(async () => await ProcessQueueAsync());
        }
        await Task.WhenAll(_processors);
    }

    public void Stop()
    {
        // Cancel only; disposing the CTS here could race with processors
        // that still read _cts.Token. Dispose after StartAsync completes.
        _cts.Cancel();
    }

    public void RemoveTask(string task)
    {
        lock (_tasksToRemove)
        {
            _tasksToRemove.Add(task);
        }
    }

    public void AddTask(string task) => _taskQueue.Enqueue(task);

    private async Task ProcessQueueAsync()
    {
        while (!_cts.IsCancellationRequested)
        {
            if (_taskQueue.TryDequeue(out var task))
            {
                if (ShouldTaskBeRemoved(task))
                {
                    continue;
                }
                await _processorCallback(task, _cts.Token);
                if (!ShouldTaskBeRemoved(task))
                {
                    _taskQueue.Enqueue(task);
                }
            }
            else
            {
                // sleep for a bit before checking for more work;
                // swallow the cancellation so shutdown doesn't fault the processor
                try { await Task.Delay(1000, _cts.Token); }
                catch (OperationCanceledException) { }
            }
        }
    }

    private bool ShouldTaskBeRemoved(string task)
    {
        lock (_tasksToRemove)
        {
            if (_tasksToRemove.Contains(task))
            {
                Console.WriteLine($"Task {task} requested for removal");
                _tasksToRemove.Remove(task);
                return true;
            }
        }
        return false;
    }
}
You can test the above with the following:
public async Task MyFunction(string command, CancellationToken cancellationToken)
{
    await Task.Delay(50);
    if (!cancellationToken.IsCancellationRequested)
    {
        Console.WriteLine($"Execute command: {command}");
    }
    else
    {
        Console.WriteLine($"Terminating command: {command}");
    }
}

var cts = new CancellationTokenSource();
var processor = new QueueProcessor(
    new string[] { "Task1", "Task2", "Task3" },
    MyFunction);
var runningProcessorTask = processor.StartAsync(2, cts.Token);

await Task.Delay(100);
processor.RemoveTask("Task1");
await Task.Delay(500);

cts.Cancel();
await runningProcessorTask;
This results in the following output:
Execute command: Task2
Execute command: Task1
Execute command: Task3
Execute command: Task2
Task Task1 requested for removal
Execute command: Task3
Execute command: Task2
Execute command: Task2
Execute command: Task3
Execute command: Task3
Execute command: Task2
Execute command: Task2
Execute command: Task3
Execute command: Task3
Execute command: Task2
Execute command: Task2
Execute command: Task3
Execute command: Task2
Execute command: Task3
Terminating command: Task2
Terminating command: Task3
If you would prefer a Channel<T>-backed version that handles waiting for additional work gracefully, without a manual Task.Delay, the following version exposes the same public API without the internal ConcurrentQueue<T>.
public class QueueProcessor
{
    // could be replaced with a ref-count solution to ensure
    // all duplicated tasks are removed
    private readonly HashSet<string> _tasksToRemove = new();
    private readonly Channel<string> _taskQueue;
    private Task[] _processors;
    private Func<string, CancellationToken, Task> _processorCallback;
    private CancellationTokenSource _cts;

    public QueueProcessor(string[] tasks, Func<string, CancellationToken, Task> processorCallback)
    {
        _taskQueue = Channel.CreateUnbounded<string>();
        _processorCallback = processorCallback;
        for (int i = 0; i < tasks.Length; i++)
        {
            // The channel is unbounded, so TryWrite always succeeds synchronously.
            _taskQueue.Writer.TryWrite(tasks[i]);
        }
    }

    public async Task StartAsync(int numberOfProcessorThreads, CancellationToken cancellationToken = default)
    {
        _cts = CancellationTokenSource.CreateLinkedTokenSource(cancellationToken);
        _processors = new Task[numberOfProcessorThreads];
        for (int i = 0; i < _processors.Length; i++)
        {
            _processors[i] = Task.Run(async () => await ProcessQueueAsync());
        }
        await Task.WhenAll(_processors);
    }

    public void Stop()
    {
        _taskQueue.Writer.TryComplete();
        // Cancel only; dispose after StartAsync completes to avoid racing
        // with processors that still read _cts.Token.
        _cts.Cancel();
    }

    public void RemoveTask(string task)
    {
        lock (_tasksToRemove)
        {
            _tasksToRemove.Add(task);
        }
    }

    public ValueTask AddTask(string task) => _taskQueue.Writer.WriteAsync(task);

    private async Task ProcessQueueAsync()
    {
        try
        {
            while (!_cts.IsCancellationRequested && await _taskQueue.Reader.WaitToReadAsync(_cts.Token))
            {
                if (_taskQueue.Reader.TryRead(out var task))
                {
                    if (ShouldTaskBeRemoved(task))
                    {
                        continue;
                    }
                    await _processorCallback(task, _cts.Token);
                    if (!ShouldTaskBeRemoved(task))
                    {
                        // Re-queue; TryWrite fails only if the writer was completed by Stop().
                        _taskQueue.Writer.TryWrite(task);
                    }
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Normal shutdown via Stop() or the external token.
        }
    }

    private bool ShouldTaskBeRemoved(string task)
    {
        lock (_tasksToRemove)
        {
            if (_tasksToRemove.Contains(task))
            {
                Console.WriteLine($"Task {task} requested for removal");
                _tasksToRemove.Remove(task);
                return true;
            }
        }
        return false;
    }
}
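The test harness shown above for the queue-based version works unchanged here, with one difference worth noting: AddTask now returns a ValueTask (because Channel<T>.Writer.WriteAsync can suspend on bounded channels), so callers should await it.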
Let me take a stab at identifying your actual root problem: you have I/O-bound operations (network access, file I/O, database queries, etc.) running at the same time as CPU-bound operations (whatever processing you do on the results of the former), and because of the way you wrote your code (which you don't show), you have I/O-bound operations waiting for CPU-bound ones to even start.
I'm guessing that because, by reductio ad absurdum, if everything were CPU bound then your CPU cores would be equally used no matter the order of operations, and for I/O-bound operations the total time they take is equally independent of the order; they just have to get woken up when something finally finishes.
If I'm right, then the actual solution is to split your calls between two thread pools: one for CPU-bound operations (capped at the number of available cores) and one for I/O-bound operations (capped at some reasonable default, the maximum number of I/O connections that can be in flight at the same time). You can then schedule each operation on its own pool and await them as you normally would, and they will never step on each other's toes.
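This answer has no code, so here is a minimal sketch of the idea under my own assumptions (the limits, ComputeHeavy and the URL are placeholders, not anything from the question): CPU-bound work is funneled through a scheduler capped at the core count, while I/O-bound work is simply awaited, since an await in flight holds no thread.
// A scheduler capped at the core count, for CPU-bound work only.
TaskScheduler cpuScheduler = new ConcurrentExclusiveSchedulerPair(
    TaskScheduler.Default, Environment.ProcessorCount).ConcurrentScheduler;
HttpClient http = new();

// CPU-bound: can never occupy more threads than there are cores.
Task<long> cpuWork = Task.Factory.StartNew(
    () => ComputeHeavy(),
    CancellationToken.None, TaskCreationOptions.DenyChildAttach, cpuScheduler);

// I/O-bound: while the request is in flight, no thread is consumed.
Task<string> ioWork = http.GetStringAsync("https://example.com");

await Task.WhenAll(cpuWork, ioWork);

static long ComputeHeavy()
{
    long sum = 0;
    for (int i = 0; i < 100_000_000; i++) sum += i;
    return sum;
}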
You can use the Parallel.Invoke method to execute multiple operations at the same time.
var TasksZ = new Action[]
{
    () => MyFunction("Task A"),
    () => MyFunction("Task B"),
    () => MyFunction("Task C")
};

Parallel.Invoke(Options, TasksZ);

void MyFunction(string comando)
{
    Console.WriteLine(comando);
}
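Note that Parallel.Invoke returns only when all of the supplied actions have completed, so as with Parallel.ForEachAsync, a slow action still delays the next cycle.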
I have an ASP.NET Core Web API application which consumes messages from an AMQ queue. Currently I have the consuming code in a BackgroundService, with an event handler hooked up to the listener. The whole thing is in a while loop (checking the cancellation token) to ensure any errors are handled and we retry the subscription, but I also have an inner while loop to keep the service alive, even though it doesn't need to do anything.
My question is, what should I do inside that inner while loop to make sure I don't consume unnecessary CPU; e.g. Task.Yield(), Task.Delay(something)?
public class ReceiverService : BackgroundService
{
    ...

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        ...
        while (!stoppingToken.IsCancellationRequested)
        {
            ...
            IConnectionFactory factory =
                new NMSConnectionFactory(
                    $"activemq:ssl://{parsed?["message"]}:51513?wireFormat.maxInactivityDuration=0");
            connection = await factory.CreateConnectionAsync(Username, Password);
            var session = await connection.CreateSessionAsync();
            var destination = await session.GetQueueAsync("queuename/" + subscriptionId);
            var consumer = await session.CreateConsumerAsync(destination);
            consumer.Listener += async message =>
            {
                // do stuff with message
                message.Acknowledge();
            };

            while (!stoppingToken.IsCancellationRequested)
            {
                await Task.Delay(0, stoppingToken);
            }

            await connection?.CloseAsync()!;
            await Task.Delay(1000, stoppingToken);
        }
    }
}
If you have nothing but cleanup to do, then you can just do await Task.Delay(Timeout.InfiniteTimeSpan, stoppingToken). No need for an inner loop at all.
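Applied to the code in the question, the inner loop collapses into a single awaited delay. A rough sketch, eliding the setup the same way the question does (the try/catch is my own addition, so that cancellation unwinds cleanly into the cleanup code):
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested)
    {
        // ... create the connection, session and consumer, hook up the Listener ...

        try
        {
            // Parks here without consuming CPU until shutdown is requested.
            await Task.Delay(Timeout.InfiniteTimeSpan, stoppingToken);
        }
        catch (TaskCanceledException)
        {
            // Shutdown requested; fall through to the cleanup.
        }

        await connection?.CloseAsync()!;
    }
}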
I have a background service that is started when the application starts up. The background service creates multiple tasks based on how many workers are set. As I run various trials and monitor the open connections on the DB, the number of open connections always equals the number of workers I set. Let's say I set 32 workers; then 32 open connections are always shown when I query for them. FYI, I am using Postgres as the DB server. To check the open connections while the application is running, I use the query below.
select * from pg_stat_activity where application_name = 'myapplication';
Below is the background service code.
public class MessagingService : BackgroundService {
    private readonly IServiceProvider _services; // injected via constructor (elided)
    private int worker = 32;

    protected override async Task ExecuteAsync(CancellationToken cancellationToken) {
        var tasks = new List<Task>();
        for (int i = 0; i < worker; i++) {
            tasks.Add(DoJob(cancellationToken));
        }
        while (!cancellationToken.IsCancellationRequested) {
            try {
                var completed = await Task.WhenAny(tasks);
                tasks.Remove(completed);
            } catch (Exception) {
                await Task.Delay(1000, cancellationToken);
            }
            if (!cancellationToken.IsCancellationRequested) {
                tasks.Add(DoJob(cancellationToken));
            }
        }
    }

    private async Task DoJob(CancellationToken cancellationToken) {
        using (var scope = _services.CreateScope()) {
            var service = scope.ServiceProvider
                .GetRequiredService<MessageService>();
            try {
                // do select and update query on db; if null return false, otherwise send mail
                if (!await service.Run(cancellationToken)) {
                    await Task.Delay(1000, cancellationToken);
                }
            } catch (Exception) {
                await Task.Delay(1000, cancellationToken);
            }
        }
    }
}
The workflow is not right, as it keeps creating tasks and leaves the connections open and idle. Also, CPU and memory usage are high when running those tasks. How can I achieve the following: when no record is found in the DB, keep only 1 worker running; when one or more records are found, keep increasing the workers up to the preset maximum, then decrease them when there are fewer records than the maximum? If this question is too vague or opinion-based then please let me know and I will try my best to make it as specific as possible.
Update:
The purpose of this service is to perform email delivery. There is another API that is used to create scheduled jobs. Once a job is added to the DB, this service performs the email delivery at the scheduled time. E.g., 5k scheduled jobs are added to the DB with a scheduled time of '2021-12-31 08:00:00', created at '2021-12-31 00:00:00'. The service keeps looping from 00:00:00 until 08:00:00 with 32 workers running at the same time, and only then starts the email delivery. How can I make it more efficient, so that normally, when no job is scheduled, only 1 worker is running; when it finds 5k scheduled jobs it fully utilises all the workers; and after the 5k jobs are completed it goes back to 1 worker?
My suggestion is to spare yourself from the burden of manually creating and maintaining worker tasks, by using an ActionBlock<T> from the TPL Dataflow library. This component is a combination of an input queue and an Action<T> delegate. You specify the delegate in its constructor, and you feed it with messages with its Post method. The component invokes the delegate for each message it receives, with the specified degree of parallelism. When there are no more messages to send, you notify it by invoking its Complete method, and then await its Completion so that you know that all work that was delegated to it has completed.
Below is a rough demonstration of how you could use this component:
protected override async Task ExecuteAsync(CancellationToken cancellationToken)
{
    var processor = new ActionBlock<Job>(async job =>
    {
        await ProcessJob(job);
        await MarkJobAsCompleted(job);
    }, new ExecutionDataflowBlockOptions()
    {
        MaxDegreeOfParallelism = 32
    });

    try
    {
        while (true)
        {
            Task delayTask = Task.Delay(TimeSpan.FromSeconds(60), cancellationToken);
            Job[] jobs = await FetchReadyToProcessJobs();
            foreach (var job in jobs)
            {
                await MarkJobAsPending(job);
                processor.Post(job);
            }
            await delayTask; // Will throw when the token is canceled
        }
    }
    finally
    {
        processor.Complete();
        await processor.Completion;
    }
}
The FetchReadyToProcessJobs method is supposed to connect to the database, and fetch all the jobs whose time has come to be processed. In the above example this method is invoked every 60 seconds. The Task.Delay is created before invoking the method, and awaited after the returned jobs have been posted to the ActionBlock<T>. This way the interval between invocations will be stable and consistent.
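The sketch below shows one possible shape for FetchReadyToProcessJobs, since the question mentions Postgres. It assumes Npgsql, a _connectionString field, and a jobs table with id, scheduled_at and status columns; all of these names are my own assumptions, not part of the answer above:
private async Task<Job[]> FetchReadyToProcessJobs()
{
    var jobs = new List<Job>();
    await using var conn = new NpgsqlConnection(_connectionString);
    await conn.OpenAsync();
    await using var cmd = new NpgsqlCommand(
        "SELECT id FROM jobs WHERE status = 'scheduled' AND scheduled_at <= now()", conn);
    await using var reader = await cmd.ExecuteReaderAsync();
    while (await reader.ReadAsync())
    {
        jobs.Add(new Job { Id = reader.GetInt64(0) }); // Job.Id is assumed
    }
    return jobs.ToArray();
}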
I have an orchestrator that calls an activity function to process customer IDs.
The output of this activity returns error IDs (if there are any), and I wish to reprocess these IDs by executing the activity again until there are no error IDs (the output is null).
Is it good practice to have a do loop for an orchestrator?
How do I include a 5 min delay before each time the activity gets executed?
public static async Task<string> RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context, ILogger log)
{
    log = context.CreateReplaySafeLogger(log);
    dynamic errorOutput = null;
    dynamic processCustomers = context.GetInput<dynamic>();
    do
    {
        log.LogInformation("Calling activity");
        errorOutput = await context.CallActivityAsync<dynamic>("GetCSPCustomerLicenses_Activity", processCustomers);
        // Get customers to process from the error object
        processCustomers = errorOutput;
        // Wait 5 minutes - how do I achieve this?
    } while (errorOutput != null);
    return "Success";
}
Maybe you can use durable timers for delaying execution. Please refer to Timers in Durable Functions (Azure Functions) first:
Durable Functions provides durable timers for use in orchestrator functions to implement delays or to set up timeouts on async actions. Durable timers should be used in orchestrator functions instead of Thread.Sleep and Task.Delay (C#), or setTimeout() and setInterval() (JavaScript), or time.sleep() (Python).
This is a code sample for delay usage:
public static async Task<string> RunOrchestrator(
    [OrchestrationTrigger] IDurableOrchestrationContext context, ILogger log)
{
    log = context.CreateReplaySafeLogger(log);
    dynamic errorOutput = null;
    dynamic processCustomers = context.GetInput<dynamic>();
    do
    {
        log.LogInformation("Calling activity");
        errorOutput = await context.CallActivityAsync<dynamic>("GetCSPCustomerLicenses_Activity", processCustomers);
        // Get customers to process from the error object
        processCustomers = errorOutput;
        // Wait 5 minutes using a durable timer
        DateTime deadline = context.CurrentUtcDateTime.Add(TimeSpan.FromMinutes(5));
        await context.CreateTimer(deadline, CancellationToken.None);
    } while (errorOutput != null);
    return "Success";
}
In my scheduler, implemented with Quartz.NET v3, I'm trying to test the behaviour of the cancellation token:
....
IScheduler scheduler = await factory.GetScheduler();
....
var tokenSource = new CancellationTokenSource();
CancellationToken ct = tokenSource.Token;
// Start scheduler
await scheduler.Start(ct);
// some sleep
await Task.Delay(TimeSpan.FromSeconds(60));
// communicate cancellation
tokenSource.Cancel();
I have a test job that runs infinitely and checks the cancellation token in its Execute method:
public async Task Execute(IJobExecutionContext context)
{
    while (true)
    {
        if (context.CancellationToken.IsCancellationRequested)
        {
            context.CancellationToken.ThrowIfCancellationRequested();
        }
    }
}
I would expect that when tokenSource.Cancel() is fired, the job would enter the if and throw the exception. But it doesn't work.
According to the documentation, you should use the Interrupt method to cancel Quartz jobs.
NameValueCollection props = new NameValueCollection
{
    { "quartz.serializer.type", "binary" }
};
StdSchedulerFactory factory = new StdSchedulerFactory(props);
var scheduler = await factory.GetScheduler();
await scheduler.Start();

IJobDetail job = JobBuilder.Create<HelloJob>()
    .WithIdentity("myJob", "group1")
    .Build();

ITrigger trigger = TriggerBuilder.Create()
    .WithIdentity("myTrigger", "group1")
    .StartNow()
    .WithSimpleSchedule(x => x
        .WithRepeatCount(1)
        .WithIntervalInSeconds(40))
    .Build();

await scheduler.ScheduleJob(job, trigger);

// Configure the cancellation of the scheduled job via its JobKey
await Task.Delay(TimeSpan.FromSeconds(1));
await scheduler.Interrupt(job.Key);
Scheduled job class:
public class HelloJob : IJob
{
    public async Task Execute(IJobExecutionContext context)
    {
        while (true)
        {
            if (context.CancellationToken.IsCancellationRequested)
            {
                context.CancellationToken.ThrowIfCancellationRequested();
                // After the job is interrupted, the cancellation request is activated
            }
        }
    }
}
Call scheduler.Interrupt after the job has started executing, and Quartz will terminate the job.
EDIT
According to the source code (line 2151), the Interrupt method applies the cancellation tokens of the job execution contexts, so it is better to use this facility of the library.
Here is a unit test from the GitHub repo: https://github.com/quartznet/quartznet/blob/master/src/Quartz.Tests.Unit/InterrubtableJobTest.cs
I tried to implement the cancellation the same way, but it didn't work for me either.
@Stormcloak I have to check the cancellation request because I want to do some aborting operations for the job, e.g. writing status data to a database.
EDIT:
So, after multiple tests and implementations, I've got it running.
Some Pseudo code here:
this.scheduler = await StdSchedulerFactory.GetDefaultScheduler();
this.tokenSource = new CancellationTokenSource();
this.token = tokenSource.Token;

// Start scheduler.
await this.scheduler.Start(token);

// add some jobs here
// ...

// cancel running jobs.
IReadOnlyCollection<IJobExecutionContext> jobs = await this.scheduler.GetCurrentlyExecutingJobs();
foreach (IJobExecutionContext context in jobs)
{
    bool result = await this.scheduler.Interrupt(context.JobDetail.Key, this.token);
}

await this.scheduler.Shutdown(true);
So now you can use the CancellationToken in your Execute method.
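For the aborting operations mentioned earlier (e.g. writing status data to a database), a rough sketch of an Execute method that reacts to the interrupt could look like this; DoUnitOfWork and SaveStatusToDatabase are placeholders of mine, not Quartz APIs:
public class MyJob : IJob
{
    public async Task Execute(IJobExecutionContext context)
    {
        try
        {
            while (true)
            {
                // Throws OperationCanceledException once Interrupt is called.
                context.CancellationToken.ThrowIfCancellationRequested();
                await DoUnitOfWork();
            }
        }
        catch (OperationCanceledException)
        {
            // Aborting operations, e.g. persist the job status.
            await SaveStatusToDatabase("aborted");
            throw;
        }
    }

    private Task DoUnitOfWork() => Task.Delay(100);
    private Task SaveStatusToDatabase(string status) => Task.CompletedTask;
}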