Use Task.Run instead of Delegate.BeginInvoke - c#

I have recently upgraded my projects to ASP.NET 4.5 and I have been waiting a long time to use 4.5's asynchronous capabilities. After reading the documentation I'm not sure whether I can improve my code at all.
I want to execute a task asynchronously and then forget about it. The way that I'm currently doing this is by creating delegates and then using BeginInvoke.
Here's one of the filters in my project, which creates an audit in our database every time a user accesses a resource that must be audited:
public override void OnActionExecuting(ActionExecutingContext filterContext)
{
var request = filterContext.HttpContext.Request;
var id = WebSecurity.CurrentUserId;
var invoker = new MethodInvoker(delegate
{
var audit = new Audit
{
Id = Guid.NewGuid(),
IPAddress = request.UserHostAddress,
UserId = id,
Resource = request.RawUrl,
Timestamp = DateTime.UtcNow
};
var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
database.Audits.InsertOrUpdate(audit);
database.Save();
});
invoker.BeginInvoke(StopAsynchronousMethod, invoker);
base.OnActionExecuting(filterContext);
}
But in order to finish this asynchronous task, I need to always define a callback, which looks like this:
public void StopAsynchronousMethod(IAsyncResult result)
{
var state = (MethodInvoker)result.AsyncState;
try
{
state.EndInvoke(result);
}
catch (Exception e)
{
var username = WebSecurity.CurrentUserName;
Debugging.DispatchExceptionEmail(e, username);
}
}
I would rather not use the callback at all due to the fact that I do not need a result from the task that I am invoking asynchronously.
How can I improve this code with Task.Run() (or async and await)?

If I understood your requirements correctly, you want to kick off a task and then forget about it. When the task completes, and if an exception occurred, you want to log it.
I'd use Task.Run to create a task, followed by ContinueWith to attach a continuation task. This continuation task will log any exception that was thrown from the parent task. Also, use TaskContinuationOptions.OnlyOnFaulted to make sure the continuation only runs if an exception occurred.
Task.Run(() => {
var audit = new Audit
{
Id = Guid.NewGuid(),
IPAddress = request.UserHostAddress,
UserId = id,
Resource = request.RawUrl,
Timestamp = DateTime.UtcNow
};
var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
database.Audits.InsertOrUpdate(audit);
database.Save();
}).ContinueWith(task => {
task.Exception.Handle(ex => {
var username = WebSecurity.CurrentUserName;
Debugging.DispatchExceptionEmail(ex, username);
return true; // Handle expects a Func<Exception, bool>; true marks the exception as handled
});
}, TaskContinuationOptions.OnlyOnFaulted);
As a side-note, background tasks and fire-and-forget scenarios in ASP.NET are highly discouraged. See The Dangers of Implementing Recurring Background Tasks In ASP.NET
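As an alternative to attaching a ContinueWith continuation, the same "run it, and log if it fails" behavior can be sketched with a try/catch inside the delegate passed to Task.Run. The FireAndForget helper and its onError callback are hypothetical names used only for illustration; they are not part of the original answer or of any library.

```csharp
using System;
using System.Threading.Tasks;

public static class FireAndForget
{
    // Hypothetical helper: runs the work on the thread pool and routes any
    // exception it throws to the supplied handler instead of a continuation.
    public static Task Run(Action work, Action<Exception> onError)
    {
        return Task.Run(() =>
        {
            try
            {
                work();
            }
            catch (Exception ex)
            {
                onError(ex);
            }
        });
    }
}

public static class Program
{
    public static void Main()
    {
        // The returned task is waited on here only so the demo does not exit
        // before the work runs; a fire-and-forget caller would discard it.
        FireAndForget.Run(
            () => { throw new InvalidOperationException("audit failed"); },
            ex => Console.WriteLine("Logged: " + ex.Message)).Wait();
    }
}
```

Because the exception is caught inside the delegate, the task never faults, so no unobserved exception is left behind. The caveat about fire-and-forget work in ASP.NET applies to this variant just as much.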

It may sound a bit out of scope, but if you just want to forget about the work after you launch it, why not use the ThreadPool directly?
Something like:
ThreadPool.QueueUserWorkItem(
x =>
{
try
{
// Do something
...
}
catch (Exception e)
{
// Log something
...
}
});
I had to do some performance benchmarking for different async call methods and I found that (not surprisingly) ThreadPool works much better, but also that, actually, BeginInvoke is not that bad (I am on .NET 4.5). That's what I found out with the code at the end of the post. I did not find something like this online, so I took the time to check it myself. Each call is not exactly equal, but it is more or less functionally equivalent in terms of what it does:
ThreadPool: 70.80ms
Task: 90.88ms
BeginInvoke: 121.88ms
Thread: 4657.52ms
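using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
public class Program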
{
public delegate void ThisDoesSomething();
// Perform a very simple operation to see the overhead of
// different async calls types.
public static void Main(string[] args)
{
const int repetitions = 25;
const int calls = 1000;
var results = new List<Tuple<string, double>>();
Console.WriteLine(
"{0} parallel calls, {1} repetitions for better statistics\n",
calls,
repetitions);
// Threads
Console.Write("Running Threads");
results.Add(new Tuple<string, double>("Threads", RunOnThreads(repetitions, calls)));
Console.WriteLine();
// BeginInvoke
Console.Write("Running BeginInvoke");
results.Add(new Tuple<string, double>("BeginInvoke", RunOnBeginInvoke(repetitions, calls)));
Console.WriteLine();
// Tasks
Console.Write("Running Tasks");
results.Add(new Tuple<string, double>("Tasks", RunOnTasks(repetitions, calls)));
Console.WriteLine();
// Thread Pool
Console.Write("Running Thread pool");
results.Add(new Tuple<string, double>("ThreadPool", RunOnThreadPool(repetitions, calls)));
Console.WriteLine();
Console.WriteLine();
// Show results
results = results.OrderBy(rs => rs.Item2).ToList();
foreach (var result in results)
{
Console.WriteLine(
"{0}: Done in {1}ms avg",
result.Item1,
(result.Item2 / repetitions).ToString("0.00"));
}
Console.WriteLine("Press a key to exit");
Console.ReadKey();
}
/// <summary>
/// The do stuff.
/// </summary>
public static void DoStuff()
{
Console.Write("*");
}
public static double RunOnThreads(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var toProcess = calls;
var stopwatch = new Stopwatch();
var resetEvent = new ManualResetEvent(false);
var threadList = new List<Thread>();
for (var i = 0; i < calls; i++)
{
threadList.Add(new Thread(() =>
{
// Do something
DoStuff();
// Safely decrement the counter
if (Interlocked.Decrement(ref toProcess) == 0)
{
resetEvent.Set();
}
}));
}
stopwatch.Start();
foreach (var thread in threadList)
{
thread.Start();
}
resetEvent.WaitOne();
stopwatch.Stop();
totalMs += stopwatch.ElapsedMilliseconds;
}
return totalMs;
}
public static double RunOnThreadPool(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var toProcess = calls;
var resetEvent = new ManualResetEvent(false);
var stopwatch = new Stopwatch();
var list = new List<int>();
for (var i = 0; i < calls; i++)
{
list.Add(i);
}
stopwatch.Start();
for (var i = 0; i < calls; i++)
{
ThreadPool.QueueUserWorkItem(
x =>
{
// Do something
DoStuff();
// Safely decrement the counter
if (Interlocked.Decrement(ref toProcess) == 0)
{
resetEvent.Set();
}
},
list[i]);
}
resetEvent.WaitOne();
stopwatch.Stop();
totalMs += stopwatch.ElapsedMilliseconds;
}
return totalMs;
}
public static double RunOnBeginInvoke(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var beginInvokeStopwatch = new Stopwatch();
var delegateList = new List<ThisDoesSomething>();
var resultsList = new List<IAsyncResult>();
for (var i = 0; i < calls; i++)
{
delegateList.Add(DoStuff);
}
beginInvokeStopwatch.Start();
foreach (var delegateToCall in delegateList)
{
resultsList.Add(delegateToCall.BeginInvoke(null, null));
}
// We lose a bit of accuracy, but if the loop is big enough,
// it should not really matter
while (resultsList.Any(rs => !rs.IsCompleted))
{
Thread.Sleep(10);
}
beginInvokeStopwatch.Stop();
totalMs += beginInvokeStopwatch.ElapsedMilliseconds;
}
return totalMs;
}
public static double RunOnTasks(int repetitions, int calls)
{
var totalMs = 0.0;
for (var j = 0; j < repetitions; j++)
{
Console.Write(".");
var resultsList = new List<Task>();
var stopwatch = new Stopwatch();
stopwatch.Start();
for (var i = 0; i < calls; i++)
{
resultsList.Add(Task.Factory.StartNew(DoStuff));
}
// We lose a bit of accuracy, but if the loop is big enough,
// it should not really matter
while (resultsList.Any(task => !task.IsCompleted))
{
Thread.Sleep(10);
}
stopwatch.Stop();
totalMs += stopwatch.ElapsedMilliseconds;
}
return totalMs;
}
}

Here's one of the filters in my project, which creates an audit in our database every time a user accesses a resource that must be audited
Auditing is certainly not something I would call "fire and forget". Remember, on ASP.NET, "fire and forget" means "I don't care whether this code actually executes or not". So, if your desired semantics are that audits may occasionally be missing, then (and only then) you can use fire and forget for your audits.
If you want to ensure your audits are all correct, then either wait for the audit save to complete before sending the response, or queue the audit information to reliable storage (e.g., Azure queue or MSMQ) and have an independent backend (e.g., Azure worker role or Win32 service) process the audits in that queue.
But if you want to live dangerously (accepting that occasionally audits may be missing), you can mitigate the problems by registering the work with the ASP.NET runtime. Using the BackgroundTaskManager from my blog:
public override void OnActionExecuting(ActionExecutingContext filterContext)
{
var request = filterContext.HttpContext.Request;
var id = WebSecurity.CurrentUserId;
BackgroundTaskManager.Run(() =>
{
try
{
var audit = new Audit
{
Id = Guid.NewGuid(),
IPAddress = request.UserHostAddress,
UserId = id,
Resource = request.RawUrl,
Timestamp = DateTime.UtcNow
};
var database = (new NinjectBinder()).Kernel.Get<IDatabaseWorker>();
database.Audits.InsertOrUpdate(audit);
database.Save();
}
catch (Exception e)
{
var username = WebSecurity.CurrentUserName;
Debugging.DispatchExceptionEmail(e, username);
}
});
base.OnActionExecuting(filterContext);
}

Related

Why does my workers' work distribution count not total the number of produced items in this System.Threading.Channels sample?

Following this post, I have been playing with System.Threading.Channels to get confident enough to use it in my production code, replacing the Threads/Monitor.Pulse/Wait based approach I am currently using (described in the referred post).
Basically I created a sample with a bounded channel where I run a couple of producer tasks at the beginning and, without waiting, start my consumer tasks, which start pulling elements from the channel.
After waiting for the producer tasks to complete, I then signal the channel as complete, so the consumer tasks can stop listening for new channel elements.
My channel is a Channel<Action<string>>, and in each action I increment the count for the given worker in the WorkDistribution concurrent dictionary. At the end of the sample I print it so I can check that I consumed as many items as I expected, and also how the channel distributed the actions between the consumers.
For some reason this "Work Distribution footer" is not printing the same number of items as the total items produced by producer tasks.
What am I missing?
Some of the variables present were added for the sole purpose of helping troubleshoot.
Here's the full code:
public class ChannelSolution
{
object LockObject = new object();
Channel<Action<string>> channel;
int ItemsToProduce;
int WorkersCount;
int TotalItemsProduced;
ConcurrentDictionary<string, int> WorkDistribution;
CancellationToken Ct;
public ChannelSolution(int workersCount, int itemsToProduce, int maxAllowedItems,
CancellationToken ct)
{
WorkersCount = workersCount;
ItemsToProduce = itemsToProduce;
channel = Channel.CreateBounded<Action<string>>(maxAllowedItems);
Console.WriteLine($"Created channel with max {maxAllowedItems} items");
WorkDistribution = new ConcurrentDictionary<string, int>();
Ct = ct;
}
async Task ProduceItems(int cycle)
{
for (var i = 0; i < ItemsToProduce; i++)
{
var index = i + 1 + (ItemsToProduce * cycle);
bool queueHasRoom;
var stopwatch = new Stopwatch();
stopwatch.Start();
do
{
if (Ct.IsCancellationRequested)
{
Console.WriteLine("exiting read loop - cancellation requested !");
break;
}
queueHasRoom = await channel.Writer.WaitToWriteAsync();
if (!queueHasRoom)
{
if (Ct.IsCancellationRequested)
{
Console.WriteLine("exiting read loop - cancellation"
+ " requested !");
break;
}
if (stopwatch.Elapsed.Seconds % 3 == 0)
Console.WriteLine("Channel reached maximum capacity..."
+ " producer waiting for items to be freed...");
}
}
while (!queueHasRoom);
channel.Writer.TryWrite((workerName) => action($"A{index}", workerName));
Console.WriteLine($"Channel has room, item {index} added"
+ $" - channel items count: [{channel.Reader.Count}]");
Interlocked.Increment(ref TotalItemsProduced);
}
}
List<Task> GetConsumers()
{
var tasks = new List<Task>();
for (var i = 0; i < WorkersCount; i++)
{
var workerName = $"W{(i + 1).ToString("00")}";
tasks.Add(Task.Run(async () =>
{
while (await channel.Reader.WaitToReadAsync())
{
if (Ct.IsCancellationRequested)
{
Console.WriteLine("exiting write loop - cancellation"
+ "requested !");
break;
}
if (channel.Reader.TryRead(out var action))
{
Console.WriteLine($"dequed action in worker [{workerName}]");
action(workerName);
}
}
}));
}
return tasks;
}
void action(string actionNumber, string workerName)
{
Console.WriteLine($"processing {actionNumber} in worker {workerName}...");
var secondsToWait = new Random().Next(2, 5);
Thread.Sleep(TimeSpan.FromSeconds(secondsToWait));
Console.WriteLine($"action {actionNumber} completed by worker {workerName}"
+ $" after {secondsToWait} secs! channel items left:"
+ $" [{channel.Reader.Count}]");
if (WorkDistribution.ContainsKey(workerName))
{
lock (LockObject)
{
WorkDistribution[workerName]++;
}
}
else
{
var succeeded = WorkDistribution.TryAdd(workerName, 1);
if (!succeeded)
{
Console.WriteLine($"!!! failed incremeting dic value !!!");
}
}
}
public void Summarize(Stopwatch stopwatch)
{
Console.WriteLine("--------------------------- Thread Work Distribution "
+ "------------------------");
foreach (var kv in this.WorkDistribution)
Console.WriteLine($"thread: {kv.Key} items consumed: {kv.Value}");
Console.WriteLine($"Total actions consumed: "
+ $"{WorkDistribution.Sum(w => w.Value)} - Elapsed time: "
+ $"{stopwatch.Elapsed.Seconds} secs");
}
public void Run(int producerCycles)
{
var stopwatch = new Stopwatch();
stopwatch.Start();
var producerTasks = new List<Task>();
Console.WriteLine($"Started running at {DateTime.Now}...");
for (var i = 0; i < producerCycles; i++)
{
producerTasks.Add(ProduceItems(i));
}
var consumerTasks = GetConsumers();
Task.WaitAll(producerTasks.ToArray());
Console.WriteLine($"-------------- Completed waiting for PRODUCERS -"
+ $" total items produced: [{TotalItemsProduced}] ------------------");
channel.Writer.Complete(); //just so I can complete this demo
Task.WaitAll(consumerTasks.ToArray());
Console.WriteLine("----------------- Completed waiting for CONSUMERS "
+ "------------------");
//Task.WaitAll(GetConsumers().Union(producerTasks/*.Union(
// new List<Task> { taskKey })*/).ToArray());
//Console.WriteLine("Completed waiting for tasks");
Summarize(stopwatch);
}
}
And here is the calling code in Program.cs
var workersCount = 5;
var itemsToProduce = 10;
var maxItemsInQueue = 5;
var cts = new CancellationTokenSource();
var producerConsumerTests = new ChannelSolution(workersCount, itemsToProduce,
maxItemsInQueue, cts.Token);
producerConsumerTests.Run(2);
From a quick look, there is a race condition in the ProduceItems method, around the queueHasRoom variable. WaitToWriteAsync only reports that there was room at the moment it completed; with several producers running, another producer can take that slot before your TryWrite call, in which case TryWrite returns false and the item is silently dropped, yet TotalItemsProduced is still incremented. You don't need this variable. The channel.Writer.TryWrite method will tell you whether there is room in the channel's buffer or not. Alternatively you could simply await the WriteAsync method, instead of using the WaitToWriteAsync/TryWrite combo. AFAIK this combo is intended as a performance optimization of the former method. If you absolutely need to know whether there is available space before attempting to post a value, then the Channel<T> is probably not a suitable container for your use case. You'll need to find something that can be locked during the whole operation of "check-for-available-space -> create-the-value -> post-the-value", so that this operation can be made atomic.
As a side note, using a lock to protect the updating of the ConcurrentDictionary is redundant. The ConcurrentDictionary offers the AddOrUpdate method, which can atomically replace a value it contains with another value. You might have to lock if the dictionary contained mutable objects and you needed to mutate those objects with thread-safety. But in your case the values are of type Int32, which is an immutable struct. You don't change it, you just replace it with a new Int32, which is created based on the existing value:
WorkDistribution.AddOrUpdate(workerName, 1, (_, existing) => existing + 1);
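Putting both suggestions together, a minimal sketch might look like the following. It is a simplified stand-in for the question's ChannelSolution, not a drop-in replacement: the ChannelSketch class, its RunAsync method and the int payload are assumptions made for illustration, and the simulated work and logging are left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelSketch
{
    // Pushes producers * itemsPerProducer items through a bounded channel and
    // returns how many items each worker consumed.
    public static async Task<ConcurrentDictionary<string, int>> RunAsync(
        int producers, int itemsPerProducer, int workers, int capacity)
    {
        var channel = Channel.CreateBounded<int>(capacity);
        var distribution = new ConcurrentDictionary<string, int>();

        var producerTasks = Enumerable.Range(0, producers).Select(async p =>
        {
            for (var i = 0; i < itemsPerProducer; i++)
            {
                // WriteAsync waits for free space and then writes, so there is
                // no window in which another producer can steal the slot.
                await channel.Writer.WriteAsync(p * itemsPerProducer + i);
            }
        }).ToArray();

        var consumerTasks = Enumerable.Range(1, workers).Select(async w =>
        {
            var workerName = $"W{w:00}";
            while (await channel.Reader.WaitToReadAsync())
            {
                while (channel.Reader.TryRead(out var item))
                {
                    // Atomic insert-or-increment; no additional lock is needed.
                    distribution.AddOrUpdate(workerName, 1, (key, existing) => existing + 1);
                }
            }
        }).ToArray();

        await Task.WhenAll(producerTasks);
        channel.Writer.Complete(); // lets the consumers' WaitToReadAsync loops end
        await Task.WhenAll(consumerTasks);
        return distribution;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var distribution = await ChannelSketch.RunAsync(2, 10, 5, 5);
        Console.WriteLine($"Total consumed: {distribution.Values.Sum()}");
    }
}
```

With two producers of ten items each, the total consumed should match the twenty items produced, because every write is awaited rather than attempted once and ignored.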

How to span MaxDegreeOfParallelism across multiple TPL Dataflow blocks?

I want to limit the total number of queries that I submit to my database server across all Dataflow blocks to 30. In the following scenario, the throttling of 30 concurrent tasks is per block so it always hits 60 concurrent tasks during execution. Obviously I could limit my parallelism to 15 per block to achieve a system wide total of 30 but this wouldn't be optimal.
How do I make this work? Do I limit (and block) my awaits using SemaphoreSlim, etc, or is there an intrinsic Dataflow approach that works better?
public class TPLTest
{
private long AsyncCount = 0;
private long MaxAsyncCount = 0;
private long TaskId = 0;
private object MetricsLock = new object();
public async Task Start()
{
ExecutionDataflowBlockOptions execOption
= new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 30 };
DataflowLinkOptions linkOption = new DataflowLinkOptions()
{ PropagateCompletion = true };
var doFirstIOWorkAsync = new TransformBlock<Data, Data>(
async data => await DoIOBoundWorkAsync(data), execOption);
var doCPUWork = new TransformBlock<Data, Data>(
data => DoCPUBoundWork(data));
var doSecondIOWorkAsync = new TransformBlock<Data, Data>(
async data => await DoIOBoundWorkAsync(data), execOption);
var doProcess = new TransformBlock<Data, string>(
i => $"Task finished, ID = : {i.TaskId}");
var doPrint = new ActionBlock<string>(
s => Debug.WriteLine(s));
doFirstIOWorkAsync.LinkTo(doCPUWork, linkOption);
doCPUWork.LinkTo(doSecondIOWorkAsync, linkOption);
doSecondIOWorkAsync.LinkTo(doProcess, linkOption);
doProcess.LinkTo(doPrint, linkOption);
int taskCount = 150;
for (int i = 0; i < taskCount; i++)
{
await doFirstIOWorkAsync.SendAsync(new Data() { Delay = 2500 });
}
doFirstIOWorkAsync.Complete();
await doPrint.Completion;
Debug.WriteLine("Max concurrent tasks: " + MaxAsyncCount.ToString());
}
private async Task<Data> DoIOBoundWorkAsync(Data data)
{
lock(MetricsLock)
{
AsyncCount++;
if (AsyncCount > MaxAsyncCount)
MaxAsyncCount = AsyncCount;
}
if (data.TaskId <= 0)
data.TaskId = Interlocked.Increment(ref TaskId);
await Task.Delay(data.Delay);
lock (MetricsLock)
AsyncCount--;
return data;
}
private Data DoCPUBoundWork(Data data)
{
data.Step = 1;
return data;
}
}
Data Class:
public class Data
{
public int Delay { get; set; }
public long TaskId { get; set; }
public int Step { get; set; }
}
Starting point:
TPLTest tpl = new TPLTest();
await tpl.Start();
Why don't you marshal everything to an action block that has the actual limitation?
var count = 0;
var ab1 = new TransformBlock<int, string>(l => $"1:{l}");
var ab2 = new TransformBlock<int, string>(l => $"2:{l}");
var doPrint = new ActionBlock<string>(
async s =>
{
var c = Interlocked.Increment(ref count);
Console.WriteLine($"{c}:{s}");
await Task.Delay(5);
Interlocked.Decrement(ref count);
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 15 });
ab1.LinkTo(doPrint);
ab2.LinkTo(doPrint);
for (var i = 100; i > 0; i--)
{
if (i % 3 == 0) await ab1.SendAsync(i);
if (i % 5 == 0) await ab2.SendAsync(i);
}
ab1.Complete();
ab2.Complete();
await ab1.Completion;
await ab2.Completion;
This is the solution I ended up going with (unless I can figure out how to use a single generic DataFlow block for marshalling every type of database access):
I defined a SemaphoreSlim at the class level:
private SemaphoreSlim ThrottleDatabaseQuerySemaphore = new SemaphoreSlim(30, 30);
I modified the I/O method to call a throttling method:
private async Task<Data> DoIOBoundWorkAsync(Data data)
{
if (data.TaskId <= 0)
data.TaskId = Interlocked.Increment(ref TaskId);
Task t = Task.Delay(data.Delay);
await ThrottleDatabaseQueryAsync(t);
return data;
}
The throttling method: (I also have a generic version of the throttling routine because I couldn't figure out how to write one routine to handle both Task and Task<TResult>)
private async Task ThrottleDatabaseQueryAsync(Task task)
{
await ThrottleDatabaseQuerySemaphore.WaitAsync();
try
{
lock (MetricsLock)
{
AsyncCount++;
if (AsyncCount > MaxAsyncCount)
MaxAsyncCount = AsyncCount;
}
await task;
}
finally
{
ThrottleDatabaseQuerySemaphore.Release();
lock (MetricsLock)
AsyncCount--;
}
}
}
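One possible refinement of the semaphore approach, sketched under assumptions: in the code above the Task passed to ThrottleDatabaseQueryAsync has already been started before the semaphore is awaited, so only the await is throttled, not the operation itself. Passing a factory (Func<Task<T>>) defers the start until a slot is free, and a non-generic overload can delegate to the generic one. The DatabaseThrottle class and its member names are hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public sealed class DatabaseThrottle
{
    private readonly SemaphoreSlim _semaphore;
    private readonly object _gate = new object();
    private int _current;
    private int _max;

    public DatabaseThrottle(int maxConcurrency)
    {
        _semaphore = new SemaphoreSlim(maxConcurrency, maxConcurrency);
    }

    // Highest number of operations observed running at the same time.
    public int MaxObserved { get { lock (_gate) return _max; } }

    // Takes a factory rather than an already started task, so the operation
    // begins only after a semaphore slot has been acquired.
    public async Task<T> RunAsync<T>(Func<Task<T>> operation)
    {
        await _semaphore.WaitAsync();
        try
        {
            lock (_gate)
            {
                _current++;
                if (_current > _max) _max = _current;
            }
            return await operation();
        }
        finally
        {
            lock (_gate) _current--;
            _semaphore.Release();
        }
    }

    // Non-generic overload that reuses the generic one with a dummy result.
    public Task RunAsync(Func<Task> operation)
    {
        return RunAsync(async () => { await operation(); return true; });
    }
}

public static class Program
{
    public static async Task Main()
    {
        var throttle = new DatabaseThrottle(maxConcurrency: 3);
        var work = Enumerable.Range(0, 20)
            .Select(i => throttle.RunAsync(async () => { await Task.Delay(20); return i; }));
        var results = await Task.WhenAll(work);
        Console.WriteLine($"Completed {results.Length}, max concurrent: {throttle.MaxObserved}");
    }
}
```

A single instance of such a class could be shared by every Dataflow block that touches the database, which keeps the limit global while each block keeps its own MaxDegreeOfParallelism.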
The simplest solution to this problem is to configure all your blocks with a limited-concurrency TaskScheduler:
TaskScheduler scheduler = new ConcurrentExclusiveSchedulerPair(
TaskScheduler.Default, maxConcurrencyLevel: 30).ConcurrentScheduler;
ExecutionDataflowBlockOptions execOption = new()
{
TaskScheduler = scheduler,
MaxDegreeOfParallelism = scheduler.MaximumConcurrencyLevel,
};
TaskSchedulers can only limit the concurrency of work done on threads. They can't throttle asynchronous operations that are not running on threads. So in order to enforce the MaximumConcurrencyLevel policy, unfortunately you must pass synchronous delegates to all the Dataflow blocks. For example:
TransformBlock<Data, Data> doFirstIOWorkAsync = new(data =>
{
return DoIOBoundWorkAsync(data).GetAwaiter().GetResult();
}, execOption);
This change will increase the demand for ThreadPool threads, so you'd better increase the number of threads that the ThreadPool creates instantly on demand to a higher value than the default Environment.ProcessorCount:
ThreadPool.SetMinThreads(100, 100); // At the start of the program
I am proposing this solution not because it is optimal, but because it is easy to implement. My understanding is that wasting some RAM on ~30 threads that are going to be blocked most of the time, won't have any measurable negative effect on the type of application that you are working with.

Async Task: my app's memory consumption is slowly getting higher. Why?

I use 'Task.Factory.ContinueWhenAll()' to improve my app's performance, but I have a problem: memory usage slowly keeps climbing.
I then tried calling GC.Collect() and GC.WaitForPendingFinalizers(), but the situation stayed the same.
int taskCount = Settings.Default.TaskCount;
int monitorInterval = Settings.Default.MonitorInterval;
while (true)
{
using (TaskHandleSvr taskItem = new TaskHandleSvr())
{
List<Task> taskList = new List<Task>();
for (int i = 0; i < taskCount; i++)
taskList.Add(taskItem.RegisterRun());
await Task.Factory.ContinueWhenAll(taskList.ToArray(), wordCountTasks =>
{
int count = taskItem.ProcessMisseHandledRedisMsg();
string sLogTime = $"{DateTime.UtcNow.AddHours(8).ToString("yyyy-MM-dd HH:mm:ss.ffff")}";
string sLogName = "-";
string sLogMessage = "Service End!";
NLogHelper.Info($"{sLogName}*|*{sLogTime}*|*Loop{count}Msg");
NLogHelper.Info($"{sLogName}*|*{sLogTime}*|*{sLogMessage}");
});
taskItem.Dispose();
GC.Collect();
GC.WaitForPendingFinalizers();
}
//Console.ReadKey();
await Task.Delay(TimeSpan.FromSeconds(5));
}
//Below is the TaskHandleSvr
Here I am! I finally got my bug fixed.
Make the task methods static (they are now called as TaskHandleSvr.RegisterRun()).
The mistake was at "taskList.Add(taskItem.RegisterRun())"; it should be 'taskList.Add(Task.Factory.StartNew(fn =>{TaskHandleSvr.RegisterRun().Wait(); }, null))'. After I printed the thread ID in 'RegisterRun()', I found that the current managed thread ID was always the same number. That made me realize the work was not actually running on multiple threads.
To make sure memory held by the tasks is released, I call 'Task.Dispose()'.
Sincere thanks to the commenters above.
The code is below:
protected override async Task RunAsync(CancellationToken cancellationToken)
{
int taskCount= Settings.Default.TaskCount;
int monitorIntervalSeconds = Settings.Default.MonitorInterval;
List<Task> taskList = new List<Task>();
while (true)
{
for (int i = 0; i < taskCount; i++)
taskList.Add(Task.Factory.StartNew(fn => { TaskHandleSvr.RegisterRun().Wait(); }, null));
await Task.Factory.ContinueWhenAll(taskList.ToArray(), wordCountTasks =>
{
int count = TaskHandleSvr.ProcessMisseHandledRedisMsg().Result;
string sLogTime = $"{DateTime.UtcNow.AddHours(8).ToString("yyyy-MM-dd HH:mm:ss.ffff")}";
string sLogName = "--myTaskGroup--";
string sLogMessage = "Task End";
NLogHelper.Info($"{sLogName}*|*{sLogTime}*|*RollBack {count} Message");
NLogHelper.Info($"{sLogName}*|*{sLogTime}*|*{sLogMessage}");
});
//Release Resource
foreach (var item in taskList)
item.Dispose();
taskList.RemoveRange(0, taskList.Count);
await Task.Delay(TimeSpan.FromSeconds(monitorIntervalSeconds), cancellationToken);
}
}
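For comparison, the same polling loop can be sketched with Task.Run and Task.WhenAll, which avoids blocking a thread on .Wait() or .Result and creates a fresh task array for each round. This is only an illustration of the pattern, not a verified fix for the memory growth described above; MonitorLoop, its parameters and the worker delegate (standing in for TaskHandleSvr.RegisterRun()) are hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class MonitorLoop
{
    // Runs taskCount workers per round for the given number of rounds and
    // returns how many worker runs completed in total.
    public static async Task<int> RunAsync(
        int taskCount, int rounds, Func<Task> worker, TimeSpan interval,
        CancellationToken cancellationToken)
    {
        var completed = 0;
        for (var round = 0; round < rounds; round++)
        {
            // A new array per round: once the round ends, nothing references
            // the finished tasks any more, so they are eligible for collection.
            var tasks = Enumerable.Range(0, taskCount)
                .Select(_ => Task.Run(worker))
                .ToArray();
            await Task.WhenAll(tasks);
            completed += tasks.Length;
            await Task.Delay(interval, cancellationToken);
        }
        return completed;
    }
}

public static class Program
{
    public static async Task Main()
    {
        var total = await MonitorLoop.RunAsync(
            4, 3, () => Task.Delay(10), TimeSpan.FromMilliseconds(50), CancellationToken.None);
        Console.WriteLine($"Completed {total} worker runs");
    }
}
```

The real service would loop until cancellation rather than for a fixed number of rounds; the bound is there only so the sketch terminates.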

Async Producer / Consumer with throttled duration and batched consumption

I am trying to build a service that provides a queue for many asynchronous clients to make requests and await a response. I need to be able to throttle the queue processing by X requests per Y duration. For example: 50 web requests per second. It is for a 3rd party REST Service where I can only issue X requests per second.
I found many SO questions, which led me down the path of using TPL Dataflow. I've used a TransformBlock to provide my custom throttling and then X number of ActionBlocks to complete the tasks in parallel. The implementation of the Action seems a bit clunky, so I'm wondering if there is a better way for me to pass Tasks into the pipeline that notify the callers once completed.
I'm wondering if there is a better or simpler way to do what I want. Are there any glaring issues with my implementation? I know it is missing cancellation and exception handling, which I'll be adding next, but your comments are most welcome.
I've extended Stephen Cleary's example for my Dataflow pipeline and used svick's concept of a time-throttled TransformBlock. I am wondering if what I've built could be easily achieved with a pure SemaphoreSlim design; it's the time-based throttling with max operations that I think will complicate things.
Here is the latest implementation: a FIFO async queue where I can pass in custom actions.
public class ThrottledProducerConsumer<T>
{
private class TimerState<T1>
{
public SemaphoreSlim Sem;
public T1 Value;
}
private BufferBlock<T> _queue;
private IPropagatorBlock<T, T> _throttleBlock;
private List<Task> _consumers;
private static IPropagatorBlock<T1, T1> CreateThrottleBlock<T1>(TimeSpan Interval, Int32 MaxPerInterval)
{
SemaphoreSlim _sem = new SemaphoreSlim(MaxPerInterval);
return new TransformBlock<T1, T1>(async (x) =>
{
var sw = new Stopwatch();
sw.Start();
//Console.WriteLine($"Current count: {_sem.CurrentCount}");
await _sem.WaitAsync();
sw.Stop();
var now = DateTime.UtcNow;
var releaseTime = now.Add(Interval) - now;
//-- Using timer as opposed to Task.Delay as I do not want to await or wait for it to complete
var tm = new Timer((s) => {
var state = (TimerState<T1>)s;
//Console.WriteLine($"RELEASE: {state.Value} was released {DateTime.UtcNow:mm:ss:ff} Reset Sem");
state.Sem.Release();
}, new TimerState<T1> { Sem = _sem, Value = x }, (int)Interval.TotalMilliseconds,
-1);
/*
Task.Delay(delay).ContinueWith((t)=>
{
Console.WriteLine($"RELEASE(FAKE): {x} was released {DateTime.UtcNow:mm:ss:ff} Reset Sem");
//_sem.Release();
});
*/
//Console.WriteLine($"{x} was tramsformed in {sw.ElapsedMilliseconds}ms. Will release {now.Add(Interval):mm:ss:ff}");
return x;
},
//new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });
//
new ExecutionDataflowBlockOptions { BoundedCapacity = 5, MaxDegreeOfParallelism = 10 });
}
public ThrottledProducerConsumer(TimeSpan Interval, int MaxPerInterval, Int32 QueueBoundedMax = 5, Action<T> ConsumerAction = null, Int32 MaxConsumers = 1)
{
var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1, };
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true, };
//-- Create the Queue
_queue = new BufferBlock<T>(new DataflowBlockOptions { BoundedCapacity = QueueBoundedMax, });
//-- Create and link the throttle block
_throttleBlock = CreateThrottleBlock<T>(Interval, MaxPerInterval);
_queue.LinkTo(_throttleBlock, linkOptions);
//-- Create and link the consumer(s) to the throttle block
var consumerAction = (ConsumerAction != null) ? ConsumerAction : new Action<T>(ConsumeItem);
_consumers = new List<Task>();
for (int i = 0; i < MaxConsumers; i++)
{
var consumer = new ActionBlock<T>(consumerAction, consumerOptions);
_throttleBlock.LinkTo(consumer, linkOptions);
_consumers.Add(consumer.Completion);
}
//-- TODO: Add some cancellation tokens to shut this thing down
}
/// <summary>
/// Default Consumer Action, just prints to console
/// </summary>
/// <param name="ItemToConsume"></param>
private void ConsumeItem(T ItemToConsume)
{
Console.WriteLine($"Consumed {ItemToConsume} at {DateTime.UtcNow}");
}
public async Task EnqueueAsync(T ItemToEnqueue)
{
await this._queue.SendAsync(ItemToEnqueue);
}
public async Task EnqueueItemsAsync(IEnumerable<T> ItemsToEnqueue)
{
foreach (var item in ItemsToEnqueue)
{
await this._queue.SendAsync(item);
}
}
public async Task CompleteAsync()
{
this._queue.Complete();
await Task.WhenAll(_consumers);
Console.WriteLine($"All consumers completed {DateTime.UtcNow}");
}
}
The test method
public class WorkItem<T>
{
public TaskCompletionSource<T> tcs;
//public T respone;
public string url;
public WorkItem(string Url)
{
tcs = new TaskCompletionSource<T>();
url = Url;
}
public override string ToString()
{
return $"{url}";
}
}
public static void TestQueue()
{
Console.WriteLine("Created the queue");
var defaultAction = new Action<WorkItem<String>>(async i => {
var taskItem = ((WorkItem<String>)i);
Console.WriteLine($"Consuming: {taskItem.url} {DateTime.UtcNow:mm:ss:ff}");
//-- Assume calling another async method e.g. await httpClient.DownloadStringTaskAsync(url);
await Task.Delay(5000);
taskItem.tcs.SetResult($"{taskItem.url}");
//Console.WriteLine($"Consumed: {taskItem.url} {DateTime.UtcNow}");
});
var queue = new ThrottledProducerConsumer<WorkItem<String>>(TimeSpan.FromMilliseconds(2000), 5, 2, defaultAction);
var results = new List<Task>();
foreach (var no in Enumerable.Range(0, 20))
{
var workItem = new WorkItem<String>($"http://someurl{no}.com");
results.Add(queue.EnqueueAsync(workItem));
results.Add(workItem.tcs.Task);
results.Add(workItem.tcs.Task.ContinueWith(response =>
{
Console.WriteLine($"Received: {response.Result} {DateTime.UtcNow:mm:ss:ff}");
}));
}
Task.WhenAll(results).Wait();
Console.WriteLine("All Work Items Have Been Processed");
}
Since asking, I have created a ThrottledProducerConsumer class based on TPL Dataflow. It was tested over a number of days, including concurrent producers whose items were queued and completed in order (approx. 281k), without any problems; however, there may be bugs I've not discovered.
I am using a BufferBlock as an asynchronous queue, this is linked to:
A TransformBlock which provides the throttling and blocking I need. It is used in conjunction with a SemaphoreSlim to control the max requests. As each item passes through the block, it acquires the semaphore and schedules a task to run X duration later that releases the semaphore by one. This way I have a sliding window of X requests per duration, exactly what I wanted. Because of TPL I am also leveraging parallelism to the connected:
ActionBlock(s) which are responsible for performing the task I need.
The classes are generic, so it might be useful to others if they need something similar. I have not written cancellation or error handling, but thought I should just mark this as answered to move it along. I would be quite happy to see some alternatives and feedback, rather than mark mine as an accepted answer. Thanks for reading.
NOTE: I removed the Timer from the original implementation as it was doing weird stuff causing the semaphore to release more than the maximum, I am assuming it is dynamic context error, it occurred when I started running concurrent requests. I worked around it using Task.Delay to schedule a release of a semaphore lock.
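The sliding-window idea described above (acquire a semaphore slot, then hand it back one interval later via Task.Delay) can be sketched on its own, without the Dataflow blocks. This is a minimal illustration under assumptions, not the class from this answer: SlidingWindowThrottle and its members are hypothetical names, and cancellation and error handling are omitted.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public sealed class SlidingWindowThrottle
{
    private readonly SemaphoreSlim _semaphore;
    private readonly TimeSpan _interval;

    public SlidingWindowThrottle(int maxPerInterval, TimeSpan interval)
    {
        _semaphore = new SemaphoreSlim(maxPerInterval, maxPerInterval);
        _interval = interval;
    }

    // Completes once the caller may proceed. The slot taken here is released
    // one interval later, regardless of how long the caller's own work takes.
    public async Task WaitAsync()
    {
        await _semaphore.WaitAsync();
        var release = Task.Delay(_interval).ContinueWith(
            t => { _semaphore.Release(); }, TaskScheduler.Default);
    }
}

public static class Program
{
    public static async Task Main()
    {
        // At most 5 admissions per 200 ms window.
        var throttle = new SlidingWindowThrottle(5, TimeSpan.FromMilliseconds(200));
        var stopwatch = Stopwatch.StartNew();
        for (var i = 0; i < 10; i++)
        {
            await throttle.WaitAsync();
            Console.WriteLine($"request {i} admitted at {stopwatch.ElapsedMilliseconds}ms");
        }
    }
}
```

In the demo the first five requests are admitted immediately and the sixth has to wait until the first slot is released, roughly one interval after the start.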
Throttled Producer Consumer
public class ThrottledProducerConsumer<T>
{
private BufferBlock<T> _queue;
private IPropagatorBlock<T, T> _throttleBlock;
private List<Task> _consumers;
private static IPropagatorBlock<T1, T1> CreateThrottleBlock<T1>(TimeSpan Interval,
Int32 MaxPerInterval, Int32 BlockBoundedMax = 2, Int32 BlockMaxDegreeOfParallelism = 2)
{
SemaphoreSlim _sem = new SemaphoreSlim(MaxPerInterval, MaxPerInterval);
return new TransformBlock<T1, T1>(async (x) =>
{
//Log($"Transform blk: {x} {DateTime.UtcNow:mm:ss:ff} Semaphore Count: {_sem.CurrentCount}");
var sw = new Stopwatch();
sw.Start();
//Console.WriteLine($"Current count: {_sem.CurrentCount}");
await _sem.WaitAsync();
sw.Stop();
var delayTask = Task.Delay(Interval).ContinueWith((t) =>
{
//Log($"Pre-RELEASE: {x} {DateTime.UtcNow:mm:ss:ff} Semaphore Count {_sem.CurrentCount}");
_sem.Release();
//Log($"PostRELEASE: {x} {DateTime.UtcNow:mm:ss:ff} Semaphoere Count {_sem.CurrentCount}");
});
//},TaskScheduler.FromCurrentSynchronizationContext());
//Log($"Transformed: {x} in queue {sw.ElapsedMilliseconds}ms. {DateTime.Now:mm:ss:ff} will release {DateTime.Now.Add(Interval):mm:ss:ff} Semaphoere Count {_sem.CurrentCount}");
return x;
},
//-- Might be better to keep Bounded Capacity in sync with the semaphore
new ExecutionDataflowBlockOptions { BoundedCapacity = BlockBoundedMax,
MaxDegreeOfParallelism = BlockMaxDegreeOfParallelism });
}
public ThrottledProducerConsumer(TimeSpan Interval, int MaxPerInterval,
Int32 QueueBoundedMax = 5, Action<T> ConsumerAction = null, Int32 MaxConsumers = 1,
Int32 MaxThrottleBuffer = 20, Int32 MaxDegreeOfParallelism = 10)
{
//-- Probably best to link MaxPerInterval and MaxThrottleBuffer
// and MaxConsumers with MaxDegreeOfParallelism
var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1, };
var linkOptions = new DataflowLinkOptions { PropagateCompletion = true, };
//-- Create the Queue
_queue = new BufferBlock<T>(new DataflowBlockOptions { BoundedCapacity = QueueBoundedMax, });
//-- Create and link the throttle block
_throttleBlock = CreateThrottleBlock<T>(Interval, MaxPerInterval, MaxThrottleBuffer, MaxDegreeOfParallelism);
_queue.LinkTo(_throttleBlock, linkOptions);
//-- Create and link the consumer(s) to the throttle block
var consumerAction = (ConsumerAction != null) ? ConsumerAction : new Action<T>(ConsumeItem);
_consumers = new List<Task>();
for (int i = 0; i < MaxConsumers; i++)
{
var consumer = new ActionBlock<T>(consumerAction, consumerOptions);
_throttleBlock.LinkTo(consumer, linkOptions);
_consumers.Add(consumer.Completion);
}
//-- TODO: Add some cancellation tokens to shut this thing down
}
/// <summary>
/// Default Consumer Action, just prints to console
/// </summary>
/// <param name="ItemToConsume"></param>
private void ConsumeItem(T ItemToConsume)
{
Log($"Consumed {ItemToConsume} at {DateTime.UtcNow}");
}
public async Task EnqueueAsync(T ItemToEnqueue)
{
await this._queue.SendAsync(ItemToEnqueue);
}
public async Task EnqueueItemsAsync(IEnumerable<T> ItemsToEnqueue)
{
foreach (var item in ItemsToEnqueue)
{
await this._queue.SendAsync(item);
}
}
public async Task CompleteAsync()
{
this._queue.Complete();
await Task.WhenAll(_consumers);
Console.WriteLine($"All consumers completed {DateTime.UtcNow}");
}
private static void Log(String messageToLog)
{
System.Diagnostics.Trace.WriteLine(messageToLog);
Console.WriteLine(messageToLog);
}
}
- Example Usage -
A Generic WorkItem
public class WorkItem<Toutput,Tinput>
{
private TaskCompletionSource<Toutput> _tcs;
public Task<Toutput> Task { get { return _tcs.Task; } }
public Tinput InputData { get; private set; }
public Toutput OutputData { get; private set; }
public WorkItem(Tinput inputData)
{
_tcs = new TaskCompletionSource<Toutput>();
InputData = inputData;
}
public void Complete(Toutput result)
{
_tcs.SetResult(result);
}
public void Failed(Exception ex)
{
_tcs.SetException(ex);
}
public override string ToString()
{
return InputData.ToString();
}
}
Creating the action block executed in the pipeline
private Action<WorkItem<Location,PointToLocation>> CreateProcessingAction()
{
return new Action<WorkItem<Location,PointToLocation>>(async i => {
var sw = new Stopwatch();
sw.Start();
var taskItem = ((WorkItem<Location,PointToLocation>)i);
var inputData = taskItem.InputData;
//Log($"Consuming: {inputData.Latitude},{inputData.Longitude} {DateTime.UtcNow:mm:ss:ff}");
//-- Assume calling another async method e.g. await httpClient.DownloadStringTaskAsync(url);
await Task.Delay(500);
sw.Stop();
Location outData = new Location()
{
Latitude = inputData.Latitude,
Longitude = inputData.Longitude,
StreetAddress = $"Consumed: {inputData.Latitude},{inputData.Longitude} Duration(ms): {sw.ElapsedMilliseconds}"
};
taskItem.Complete(outData);
//Console.WriteLine($"Consumed: {taskItem.url} {DateTime.UtcNow}");
});
}
Test Method
You'll need to provide your own implementation for PointToLocation and Location. This is just an example of how you'd use it with your own classes.
int startRange = 0;
int nextRange = 1000;
ThrottledProducerConsumer<WorkItem<Location,PointToLocation>> tpc;
private void cmdTestPipeline_Click(object sender, EventArgs e)
{
Log($"Pipeline test started {DateTime.Now:HH:mm:ss:ff}");
if(tpc == null)
{
tpc = new ThrottledProducerConsumer<WorkItem<Location, PointToLocation>>(
//1010, 2, 20000,
TimeSpan.FromMilliseconds(1010), 45, 100000,
CreateProcessingAction(),
2,45,10);
}
var workItems = new List<WorkItem<Location, PointToLocation>>();
foreach (var i in Enumerable.Range(startRange, nextRange))
{
var ptToLoc = new PointToLocation() { Latitude = i + 101, Longitude = i + 100 };
var wrkItem = new WorkItem<Location, PointToLocation>(ptToLoc);
workItems.Add(wrkItem);
wrkItem.Task.ContinueWith(t =>
{
var loc = t.Result;
string line = $"[Simulated:{DateTime.Now:HH:mm:ss:ff}] - {loc.StreetAddress}";
//txtResponse.Text = String.Concat(txtResponse.Text, line, System.Environment.NewLine);
//var lines = txtResponse.Text.Split(new string[] { System.Environment.NewLine},
// StringSplitOptions.RemoveEmptyEntries).LongCount();
//lblLines.Text = lines.ToString();
//Log(line);
});
//}, TaskScheduler.FromCurrentSynchronizationContext());
}
startRange += nextRange;
tpc.EnqueueItemsAsync(workItems);
Log($"Pipeline test completed {DateTime.Now:HH:mm:ss:ff}");
}

TaskFactory, Starting a new Task when one ends

I have found many methods of using the TaskFactory but I could not find anything about starting more tasks and watching when one ends and starting another one.
I always want to have 10 tasks working.
I want something like this
int nTotalTasks=10;
int nCurrentTask=0;
Task<bool>[] tasks=new Task<bool>[nTotalTasks];
for (int i=0; i<1000; i++)
{
string param1="test";
string param2="test";
if (nCurrentTask<nTotalTasks) // if there are less than 10 tasks then start another one
tasks[nCurrentTask++] = Task.Factory.StartNew<bool>(() =>
{
MyClass cls = new MyClass();
bool bRet = cls.Method1(param1, param2, i); // takes up to 2 minutes to finish
return bRet;
});
// How can I stop the for loop until a new task is finished and start a new one?
}
Check out the Task.WaitAny method:
Waits for any of the provided Task objects to complete execution.
Example from the documentation:
var t1 = Task.Factory.StartNew(() => DoOperation1());
var t2 = Task.Factory.StartNew(() => DoOperation2());
Task.WaitAny(t1, t2);
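One way the WaitAny suggestion could be applied to the loop in the question is sketched below. This is only an illustration: the class name WaitAnyThrottle, the RunAll method, and the work delegate (standing in for cls.Method1) are hypothetical names, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class WaitAnyThrottle
{
    // Runs `work` for every index in [0, count), keeping at most
    // `maxConcurrent` tasks in flight at any moment.
    // `work` is a hypothetical stand-in for the question's cls.Method1 call.
    public static bool[] RunAll(int count, int maxConcurrent, Func<int, bool> work)
    {
        var results = new bool[count];
        var running = new List<Task>();

        for (int i = 0; i < count; i++)
        {
            int index = i; // copy the loop variable so each lambda sees its own value
            running.Add(Task.Run(() => { results[index] = work(index); }));

            if (running.Count >= maxConcurrent)
            {
                // Block until any one task finishes, then free its slot.
                int finished = Task.WaitAny(running.ToArray());
                running[finished].GetAwaiter().GetResult(); // rethrows if that task faulted
                running.RemoveAt(finished);
            }
        }

        Task.WaitAll(running.ToArray()); // drain whatever is still running
        return results;
    }
}
```

Calling `WaitAnyThrottle.RunAll(1000, 10, i => new MyClass().Method1(param1, param2, i))` would then mirror the question's loop. Note that Task.WaitAny blocks the calling thread, so this sketch suits a console or background context rather than a UI thread.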
I would use a combination of Microsoft's Reactive Framework (NuGet "Rx-Main") and TPL for this. It becomes very simple.
Here's the code:
int nTotalTasks=10;
string param1="test";
string param2="test";
IDisposable subscription =
Observable
.Range(0, 1000)
.Select(i => Observable.FromAsync(() => Task.Factory.StartNew<bool>(() =>
{
MyClass cls = new MyClass();
bool bRet = cls.Method1(param1, param2, i); // takes up to 2 minutes to finish
return bRet;
})))
.Merge(nTotalTasks)
.ToArray()
.Subscribe((bool[] results) =>
{
/* Do something with the results. */
});
The key part here is the .Merge(nTotalTasks) which limits the number of concurrent tasks.
If you need to stop the processing part way through, just call subscription.Dispose() and everything gets cleaned up for you.
If you want to process each result as it is produced, you can change the code from the .Merge(...) like this:
.Merge(nTotalTasks)
.Subscribe((bool result) =>
{
/* Do something with each result. */
});
This is not a complete solution, but all you need to do is wait for the first task to complete and then start the next one.
Task.WaitAny(tasks); // tasks: the tasks to wait on
Task.Factory.StartNew(...);
Have you seen the BlockingCollection class? It allows you to have multiple threads running in parallel, and you can wait for results from one task before executing another. See more information here.
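A rough sketch of how BlockingCollection might be used for this: a fixed number of consumer tasks pull indices from a bounded queue, so no more than that many items are processed at once. The class name, method name, and the work delegate are hypothetical, and the sketch assumes the delegate does not throw.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingCollectionWorkers
{
    // Processes indices [0, itemCount) with exactly `workerCount` consumers
    // and returns how many calls to `work` returned true.
    // Assumption: `work` does not throw; a faulted consumer would stop draining the queue.
    public static int ProcessAll(int itemCount, int workerCount, Func<int, bool> work)
    {
        int succeeded = 0;
        using (var queue = new BlockingCollection<int>(boundedCapacity: workerCount))
        {
            // Start a fixed number of long-running consumers.
            Task[] workers = Enumerable.Range(0, workerCount)
                .Select(_ => Task.Factory.StartNew(() =>
                {
                    // Blocks while the queue is empty; ends after CompleteAdding.
                    foreach (int item in queue.GetConsumingEnumerable())
                    {
                        if (work(item)) Interlocked.Increment(ref succeeded);
                    }
                }, TaskCreationOptions.LongRunning))
                .ToArray();

            for (int i = 0; i < itemCount; i++)
            {
                queue.Add(i); // blocks while the bounded queue is full
            }
            queue.CompleteAdding();
            Task.WaitAll(workers);
        }
        return succeeded;
    }
}
```

With `workerCount` set to 10, this keeps ten workers busy until the 1000 items from the question are exhausted.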
The answer depends on whether the tasks to be scheduled are CPU or I/O bound.
For CPU-intensive work I would use the Parallel.For() API, setting the number of threads/tasks through the MaxDegreeOfParallelism property of ParallelOptions.
For I/O bound work the number of concurrently executing tasks can be significantly larger than the number of available CPUs, so the strategy is to rely on async methods as much as possible, which reduces the total number of threads waiting for completion.
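For the CPU-bound case, a minimal sketch of the Parallel.For approach might look like the following. The class and method names and the work delegate (standing in for cls.Method1) are assumptions made for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelForThrottle
{
    // Runs `work` for every index in [0, count) with at most
    // `maxConcurrent` iterations executing at the same time.
    // `work` is a hypothetical stand-in for the question's cls.Method1 call.
    public static bool[] RunAll(int count, int maxConcurrent, Func<int, bool> work)
    {
        var results = new bool[count];
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxConcurrent };

        // Each iteration writes to its own array element, so no locking is needed.
        Parallel.For(0, count, options, i => { results[i] = work(i); });
        return results;
    }
}
```

Parallel.For returns only after every iteration has finished, so the caller gets all results at once; MaxDegreeOfParallelism is an upper bound, and the runtime may use fewer threads.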
How can I stop the for loop until a new task is finished and start a new one?
The loop can be throttled by using await:
static void Main(string[] args)
{
var task = DoWorkAsync();
task.Wait();
// handle results
// task.Result;
Console.WriteLine("Done.");
}
async static Task<bool> DoWorkAsync()
{
const int NUMBER_OF_SLOTS = 10;
string param1="test";
string param2="test";
// Each slot starts as true and turns false once any task run in that slot returns false
var results = Enumerable.Repeat(true, NUMBER_OF_SLOTS).ToArray();
AsyncWorkScheduler ws = new AsyncWorkScheduler(NUMBER_OF_SLOTS);
for (int i = 0; i < 1000; ++i)
{
await ws.ScheduleAsync((slotNumber) => DoWorkAsync(i, slotNumber, param1, param2, results));
}
ws.Complete();
await ws.Completion;
// true only if every call to Method1 returned true
return results.All(r => r);
}
async static Task DoWorkAsync(int index, int slotNumber, string param1, string param2, bool[] results)
{
// Await the work first: putting it on the right of && would skip it once the slot is false
bool bRet = await Task.Factory.StartNew<bool>(() =>
{
MyClass cls = new MyClass();
return cls.Method1(param1, param2, index); // takes up to 2 minutes to finish
});
results[slotNumber] = results[slotNumber] && bRet;
}
The helper class AsyncWorkScheduler uses TPL Dataflow components as well as Task.WhenAll():
class AsyncWorkScheduler
{
public AsyncWorkScheduler(int numberOfSlots)
{
m_slots = new Task[numberOfSlots];
m_availableSlots = new BufferBlock<int>();
m_errors = new List<Exception>();
m_tcs = new TaskCompletionSource<bool>();
m_completionPending = 0;
// Initial state: all slots are available
for(int i = 0; i < m_slots.Length; ++i)
{
m_slots[i] = Task.FromResult(false);
m_availableSlots.Post(i);
}
}
public async Task ScheduleAsync(Func<int, Task> action)
{
if (Volatile.Read(ref m_completionPending) != 0)
{
throw new InvalidOperationException("Unable to schedule new items.");
}
// Acquire a slot
int slotNumber = await m_availableSlots.ReceiveAsync().ConfigureAwait(false);
// Schedule a new task for a given slot
var task = action(slotNumber);
// Store a continuation on the task to handle completion events
m_slots[slotNumber] = task.ContinueWith(t => HandleCompletedTask(t, slotNumber), TaskContinuationOptions.ExecuteSynchronously);
}
public async void Complete()
{
if (Interlocked.CompareExchange(ref m_completionPending, 1, 0) != 0)
{
return;
}
// Signal the queue's completion
m_availableSlots.Complete();
await Task.WhenAll(m_slots).ConfigureAwait(false);
// Set completion
if (m_errors.Count != 0)
{
m_tcs.TrySetException(m_errors);
}
else
{
m_tcs.TrySetResult(true);
}
}
public Task Completion
{
get
{
return m_tcs.Task;
}
}
void SetFailed(Exception error)
{
lock(m_errors)
{
m_errors.Add(error);
}
}
void HandleCompletedTask(Task task, int slotNumber)
{
if (task.IsFaulted || task.IsCanceled)
{
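// task.Exception is null for a canceled task, so substitute a TaskCanceledException
SetFailed(task.Exception ?? (Exception)new TaskCanceledException(task));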
return;
}
if (Volatile.Read(ref m_completionPending) == 1)
{
return;
}
// Release a slot
m_availableSlots.Post(slotNumber);
}
int m_completionPending;
List<Exception> m_errors;
BufferBlock<int> m_availableSlots;
TaskCompletionSource<bool> m_tcs;
Task[] m_slots;
}
