How can I implement a concurrent execution queue class in C#?

I need a class that will execute actions on the thread pool, but these actions should be queued per method. For example:
method 1
method 2
method 3
When someone calls method 1 from his thread, he can also call method 2 or method 3, and all three methods can run concurrently. But when another call comes in for method 1, 2, or 3, that call should be blocked until the earlier execution of the same method has completed.
Should I use channels?

To "Should I use channels?" the answer is yes, but there are other options available too.
Dataflow
.NET already offers this feature through the TPL Dataflow classes. You can use an ActionBlock to pass messages (i.e., data) to a worker method that executes in the background with guaranteed order and a configurable degree of parallelism. Channels are a newer feature that does essentially the same job.
What you describe is actually the simplest way of using an ActionBlock: just post data messages to it and have it process them one by one:
void Method1(MyDataObject1 data) {...}

var block = new ActionBlock<MyDataObject1>(Method1);

// Start sending data to the block
foreach (var msg in someListOfItems)
{
    block.Post(msg); // or: await block.SendAsync(msg) from async code
}
By default, an ActionBlock has an infinite input queue. It will use only one task to process messages asynchronously, in the order they are posted.
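Both of those defaults are configurable through ExecutionDataflowBlockOptions. A minimal sketch (the option names are the real API; the values here are just illustrative):

var options = new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = 4, // process up to 4 messages concurrently
    BoundedCapacity = 100       // hold at most 100 queued messages
};
var parallelBlock = new ActionBlock<MyDataObject1>(Method1, options);
// With a bounded queue, Post returns false when the block is full;
// await SendAsync instead if you want backpressure.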
When you're done with it, you can call Complete() and asynchronously await for all remaining items to finish processing:
block.Complete();
await block.Completion;
To handle different methods, you can simply use multiple blocks, e.g.:
var block1 = new ActionBlock<MyDataObject1>(Method1);
var block2 = new ActionBlock<MyDataObject1>(Method2);
Channels
Channels are a lower-level feature than blocks. This means you have to write more code, but you get far better control over how the "processing blocks" work. In fact, you could probably rewrite the TPL Dataflow library on top of channels.
You could create a processing block similar to an ActionBlock with the following (slightly naive) method:
ChannelWriter<TIn> Work<TIn>(Action<TIn> action)
{
    var channel = Channel.CreateUnbounded<TIn>();
    var workerTask = Task.Run(async () =>
    {
        await foreach (var msg in channel.Reader.ReadAllAsync())
        {
            action(msg);
        }
    });
    var writer = channel.Writer;
    return writer;
}
This method creates a channel and runs a task in the background to read data asynchronously and process it. I'm cheating "a bit" here by using await foreach and ChannelReader.ReadAllAsync(), which are available in C# 8 and .NET Core 3.0.
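On earlier versions, the same read loop can be written with WaitToReadAsync/TryRead; a sketch of the worker body:

var workerTask = Task.Run(async () =>
{
    // WaitToReadAsync returns false once the writer calls Complete()
    while (await channel.Reader.WaitToReadAsync())
    {
        while (channel.Reader.TryRead(out var msg))
        {
            action(msg);
        }
    }
});

Either version of Work is used the same way from the writer side.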
This method can be used like a block:
// inside an async method
ChannelWriter<DataObject1> writer1 = Work(Method1);
foreach (var msg in someListOfItems)
{
    await writer1.WriteAsync(msg); // completes synchronously for an unbounded channel
}
writer1.Complete();
There's a lot more to Channels though. SignalR for example uses them to allow streaming of notifications to the clients.
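One example of that extra control is backpressure. A sketch (my addition, reusing the MyDataObject1 type from above): a bounded channel makes WriteAsync wait when the queue is full, so a fast producer can't outrun a slow consumer.

var channel = Channel.CreateBounded<MyDataObject1>(new BoundedChannelOptions(100)
{
    SingleReader = true, // exactly one consumer loop
    FullMode = BoundedChannelFullMode.Wait
});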

Here is my suggestion. For each synchronous method, an asynchronous method should be added. For example, the method FireTheGun is synchronous:
private static void FireTheGun(int bulletsCount)
{
    var ratata = Enumerable.Repeat("Ta", bulletsCount).Prepend("Ra");
    Console.WriteLine(String.Join("-", ratata));
}
The asynchronous counterpart FireTheGunAsync is very simple, because the complexity of queuing the synchronous action is delegated to a helper method QueueAsync.
public static Task FireTheGunAsync(int bulletsCount)
{
    return QueueAsync(FireTheGun, bulletsCount);
}
Here is the implementation of QueueAsync. Each action has its dedicated SemaphoreSlim, to prevent multiple concurrent executions:
private static ConcurrentDictionary<MethodInfo, SemaphoreSlim> semaphores =
    new ConcurrentDictionary<MethodInfo, SemaphoreSlim>();

public static Task QueueAsync<T1>(Action<T1> action, T1 param1)
{
    return Task.Run(async () =>
    {
        var semaphore = semaphores
            .GetOrAdd(action.Method, key => new SemaphoreSlim(1));
        await semaphore.WaitAsync();
        try
        {
            action(param1);
        }
        finally
        {
            semaphore.Release();
        }
    });
}
Usage example:
FireTheGunAsync(5);
FireTheGunAsync(8);
Output:
Ra-Ta-Ta-Ta-Ta-Ta
Ra-Ta-Ta-Ta-Ta-Ta-Ta-Ta-Ta
Implementing versions of QueueAsync with different number of parameters should be trivial.
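For example, a two-parameter version is a mechanical copy of the one above:

public static Task QueueAsync<T1, T2>(Action<T1, T2> action, T1 param1, T2 param2)
{
    return Task.Run(async () =>
    {
        var semaphore = semaphores
            .GetOrAdd(action.Method, key => new SemaphoreSlim(1));
        await semaphore.WaitAsync();
        try
        {
            action(param1, param2);
        }
        finally
        {
            semaphore.Release();
        }
    });
}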
Update: My previous implementation of QueueAsync has the probably undesirable behavior of executing the actions in random order. This happens because the second task may be the first one to acquire the semaphore. Below is an implementation that guarantees the correct order of execution. Performance could be bad under high contention, because each task loops until it takes the semaphore in the right order.
private class QueueInfo
{
    public SemaphoreSlim Semaphore = new SemaphoreSlim(1);
    public int TicketToRide = 0;
    public int Current = 0;
}

private static ConcurrentDictionary<MethodInfo, QueueInfo> queues =
    new ConcurrentDictionary<MethodInfo, QueueInfo>();

public static Task QueueAsync<T1>(Action<T1> action, T1 param1)
{
    var queue = queues.GetOrAdd(action.Method, key => new QueueInfo());
    var ticket = Interlocked.Increment(ref queue.TicketToRide);
    return Task.Run(async () =>
    {
        while (true) // Loop until our ticket becomes current
        {
            await queue.Semaphore.WaitAsync();
            try
            {
                if (Interlocked.CompareExchange(ref queue.Current,
                    ticket, ticket - 1) == ticket - 1)
                {
                    action(param1);
                    break;
                }
            }
            finally
            {
                queue.Semaphore.Release();
            }
        }
    });
}
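An alternative sketch (my addition, not part of the original answer) that gets strict FIFO without the spin loop: chain each action onto the previous task for the same method, so ContinueWith preserves submission order.

private static readonly Dictionary<MethodInfo, Task> chains =
    new Dictionary<MethodInfo, Task>();
private static readonly object chainsLock = new object();

public static Task QueueChainedAsync<T1>(Action<T1> action, T1 param1)
{
    lock (chainsLock)
    {
        chains.TryGetValue(action.Method, out Task tail);
        // Appending to the tail preserves order; by default the continuation
        // runs even if the previous action threw.
        Task next = (tail ?? Task.CompletedTask)
            .ContinueWith(_ => action(param1), TaskScheduler.Default);
        chains[action.Method] = next;
        return next;
    }
}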

What about this solution?
public class ConcurrentQueue
{
    private Dictionary<byte, PoolFiber> Actionsfiber;

    public ConcurrentQueue()
    {
        Actionsfiber = new Dictionary<byte, PoolFiber>()
        {
            { 1, new PoolFiber() },
            { 2, new PoolFiber() },
            { 3, new PoolFiber() },
        };
        foreach (var fiber in Actionsfiber.Values)
        {
            fiber.Start();
        }
    }

    public void ExecuteAction(Action action, byte code)
    {
        if (Actionsfiber.ContainsKey(code))
            Actionsfiber[code].Enqueue(() => { action.Invoke(); });
        else
            Console.WriteLine("invalid byte code");
    }
}
public static void SomeAction1()
{
    Console.WriteLine($"{DateTime.Now} Action 1 is working");
    for (long i = 0; i < 2400000000; i++) { } // busy loop to simulate work
    Console.WriteLine($"{DateTime.Now} Action 1 stopped");
}

public static void SomeAction2()
{
    Console.WriteLine($"{DateTime.Now} Action 2 is working");
    for (long i = 0; i < 5000000000; i++) { } // busy loop to simulate work
    Console.WriteLine($"{DateTime.Now} Action 2 stopped");
}

public static void SomeAction3()
{
    Console.WriteLine($"{DateTime.Now} Action 3 is working");
    for (long i = 0; i < 5000000000; i++) { } // busy loop to simulate work
    Console.WriteLine($"{DateTime.Now} Action 3 stopped");
}
public static void Main(string[] args)
{
    ConcurrentQueue concurrentQueue = new ConcurrentQueue();
    concurrentQueue.ExecuteAction(SomeAction1, 1);
    concurrentQueue.ExecuteAction(SomeAction2, 2);
    concurrentQueue.ExecuteAction(SomeAction3, 3);
    concurrentQueue.ExecuteAction(SomeAction1, 1);
    concurrentQueue.ExecuteAction(SomeAction2, 2);
    concurrentQueue.ExecuteAction(SomeAction3, 3);
    Console.WriteLine("press any key to exit the program");
    Console.ReadKey();
}
The output:
8/5/2019 7:56:57 AM Action 1 is working
8/5/2019 7:56:57 AM Action 3 is working
8/5/2019 7:56:57 AM Action 2 is working
8/5/2019 7:57:08 AM Action 1 stopped
8/5/2019 7:57:08 AM Action 1 is working
8/5/2019 7:57:15 AM Action 2 stopped
8/5/2019 7:57:15 AM Action 2 is working
8/5/2019 7:57:16 AM Action 3 stopped
8/5/2019 7:57:16 AM Action 3 is working
8/5/2019 7:57:18 AM Action 1 stopped
8/5/2019 7:57:33 AM Action 2 stopped
8/5/2019 7:57:33 AM Action 3 stopped
The PoolFiber class is in the ExitGames.Concurrency.Fibers namespace.
More info: How To Avoid Race Conditions And Other Multithreading Issues?


Creating a class that runs tasks sequentially [duplicate]

I know that asynchronous programming has seen a lot of changes over the years. I'm somewhat embarrassed that I let myself get this rusty at just 34 years old, but I'm counting on StackOverflow to bring me up to speed.
What I am trying to do is manage a queue of "work" on a separate thread, but in such a way that only one item is processed at a time. I want to post work on this thread and it doesn't need to pass anything back to the caller. Of course I could simply spin up a new Thread object and have it loop over a shared Queue object, using sleeps, interrupts, wait handles, etc. But I know things have gotten better since then. We have BlockingCollection, Task, async/await, not to mention NuGet packages that probably abstract a lot of that.
I know that "What's the best..." questions are generally frowned upon so I'll rephrase it by saying "What is the currently recommended..." way to accomplish something like this using built-in .NET mechanisms preferably. But if a third party NuGet package simplifies things a bunch, it's just as well.
I considered a TaskScheduler instance with a fixed maximum concurrency of 1, but it seems there is probably a much less clunky way to achieve that by now.
Background
Specifically, what I am trying to do in this case is queue an IP geolocation task during a web request. The same IP might wind up getting queued for geolocation multiple times, but the task will know how to detect that and skip out early if it's already been resolved. But the request handler is just going to throw these () => LocateAddress(context.Request.UserHostAddress) calls into a queue and let the LocateAddress method handle duplicate work detection. The geolocation API I am using doesn't like to be bombarded with requests which is why I want to limit it to a single concurrent task at a time. However, it would be nice if the approach was allowed to easily scale to more concurrent tasks with a simple parameter change.
To create an asynchronous single-degree-of-parallelism queue of work, you can simply create a SemaphoreSlim initialized to one, and then have the enqueuing method await the acquisition of that semaphore before starting the requested work.
public class TaskQueue
{
    private SemaphoreSlim semaphore;

    public TaskQueue()
    {
        semaphore = new SemaphoreSlim(1);
    }

    public async Task<T> Enqueue<T>(Func<Task<T>> taskGenerator)
    {
        await semaphore.WaitAsync();
        try
        {
            return await taskGenerator();
        }
        finally
        {
            semaphore.Release();
        }
    }

    public async Task Enqueue(Func<Task> taskGenerator)
    {
        await semaphore.WaitAsync();
        try
        {
            await taskGenerator();
        }
        finally
        {
            semaphore.Release();
        }
    }
}
Of course, to have a fixed degree of parallelism other than one, simply initialize the semaphore to some other number.
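A sketch of that variant (my addition; the class and parameter names are illustrative):

public class ThrottledTaskQueue
{
    private readonly SemaphoreSlim semaphore;

    public ThrottledTaskQueue(int degreeOfParallelism)
    {
        // up to degreeOfParallelism delegates run at once; the rest wait
        semaphore = new SemaphoreSlim(degreeOfParallelism);
    }

    public async Task Enqueue(Func<Task> taskGenerator)
    {
        await semaphore.WaitAsync();
        try
        {
            await taskGenerator();
        }
        finally
        {
            semaphore.Release();
        }
    }
}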
Your best option as I see it is using TPL Dataflow's ActionBlock:
var actionBlock = new ActionBlock<string>(address =>
{
    if (!IsDuplicate(address))
    {
        LocateAddress(address);
    }
});

actionBlock.Post(context.Request.UserHostAddress);
TPL Dataflow is a robust, thread-safe, async-ready and very configurable actor-based framework (available as a NuGet package).
Here's a simple example for a more complicated case. Let's assume you want to:
Enable concurrency (limited to the available cores).
Limit the queue size (so you won't run out of memory).
Have both LocateAddress and the queue insertion be async.
Cancel everything after an hour.
var actionBlock = new ActionBlock<string>(async address =>
{
    if (!IsDuplicate(address))
    {
        await LocateAddressAsync(address);
    }
}, new ExecutionDataflowBlockOptions
{
    BoundedCapacity = 10000,
    MaxDegreeOfParallelism = Environment.ProcessorCount,
    CancellationToken = new CancellationTokenSource(TimeSpan.FromHours(1)).Token
});

await actionBlock.SendAsync(context.Request.UserHostAddress);
Actually you don't need to run tasks on one thread; you need them to run serially (one after another) and FIFO. TPL doesn't have a class for that, but here is my very lightweight, non-blocking implementation with tests: https://github.com/Gentlee/SerialQueue
The repo also has Servy's implementation; tests show it is twice as slow as mine, and it doesn't guarantee FIFO.
Example:
private readonly SerialQueue queue = new SerialQueue();

async Task SomeAsyncMethod()
{
    var result = await queue.Enqueue(DoSomething);
}
Use BlockingCollection<Action> to create a producer/consumer pattern with one consumer (only one thing running at a time, like you want) and one or many producers.
First define a shared queue somewhere:
BlockingCollection<Action> queue = new BlockingCollection<Action>();
In your consumer Thread or Task you take from it:
// This will block until there's an item available
Action itemToRun = queue.Take();
Then from any number of producers on other threads, simply add to the queue:
queue.Add(() => LocateAddress(context.Request.UserHostAddress));
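Putting the consumer side together, a minimal sketch (my addition): a single background task drains the queue one action at a time until CompleteAdding() is called.

Task.Run(() =>
{
    // GetConsumingEnumerable blocks waiting for items and exits
    // once CompleteAdding() has been called and the queue is empty.
    foreach (Action itemToRun in queue.GetConsumingEnumerable())
    {
        itemToRun();
    }
});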
I'm posting a different solution here. To be honest I'm not sure whether this is a good solution.
I'm used to using BlockingCollection to implement a producer/consumer pattern, with a dedicated thread consuming those items. It's fine if there is always data coming in, so the consumer thread won't sit there doing nothing.
I encountered a scenario where one of the applications wanted to send emails on a different thread, but the total number of emails was not that big.
My initial solution was to have a dedicated consumer thread (created by Task.Run()), but a lot of the time it just sits there and does nothing.
Old solution:
private readonly BlockingCollection<EmailData> _Emails =
    new BlockingCollection<EmailData>(new ConcurrentQueue<EmailData>());

// producer can add data here
public void Add(EmailData emailData)
{
    _Emails.Add(emailData);
}

public void Run()
{
    // create a consumer thread
    Task.Run(() =>
    {
        foreach (var emailData in _Emails.GetConsumingEnumerable())
        {
            SendEmail(emailData);
        }
    });
}

// sending email implementation
private void SendEmail(EmailData emailData)
{
    throw new NotImplementedException();
}
As you can see, if there are not enough emails to be sent (as in my case), the consumer thread will spend most of its time sitting there doing nothing.
I changed my implementation to:
// create an empty task
private Task _SendEmailTask = Task.Run(() => { });

// caller will dispatch the email to here;
// ContinueWith will use a thread pool thread (different from the
// _SendEmailTask thread) to send this email
private void Add(EmailData emailData)
{
    _SendEmailTask = _SendEmailTask.ContinueWith(t =>
    {
        SendEmail(emailData);
    });
}

// actual implementation
private void SendEmail(EmailData emailData)
{
    throw new NotImplementedException();
}
It's no longer a producer/consumer pattern, but there won't be a thread sitting there doing nothing; instead, every time an email is to be sent, a thread pool thread is used to do it.
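One caveat worth adding (my note, not part of the original answer): if Add can be called from several threads at once, the read-modify-write of _SendEmailTask is a race, and two emails could chain onto the same antecedent. A lock makes the chaining safe:

private readonly object _chainLock = new object();

private void Add(EmailData emailData)
{
    lock (_chainLock)
    {
        // the read-modify-write of the task chain is now atomic across callers
        _SendEmailTask = _SendEmailTask.ContinueWith(t => SendEmail(emailData));
    }
}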
My library can:
Run queued items in random order
Manage multiple queues
Run prioritized items first
Re-queue items
Raise an event when all queues have completed
Cancel a running item, or cancel one waiting to run
Dispatch events to the UI thread
public interface IQueue
{
    bool IsPrioritize { get; }
    bool ReQueue { get; }
    /// <summary>
    /// Don't use async
    /// </summary>
    /// <returns></returns>
    Task DoWork();
    bool CheckEquals(IQueue queue);
    void Cancel();
}

public delegate void QueueComplete<T>(T queue) where T : IQueue;
public delegate void RunComplete();

public class TaskQueue<T> where T : IQueue
{
    readonly List<T> Queues = new List<T>();
    readonly List<T> Runnings = new List<T>();

    [Browsable(false), DefaultValue((string)null)]
    public Dispatcher Dispatcher { get; set; }

    public event RunComplete OnRunComplete;
    public event QueueComplete<T> OnQueueComplete;

    int _MaxRun = 1;
    public int MaxRun
    {
        get { return _MaxRun; }
        set
        {
            bool flag = value > _MaxRun;
            _MaxRun = value;
            if (flag && Queues.Count != 0) RunNewQueue();
        }
    }

    public int RunningCount
    {
        get { return Runnings.Count; }
    }

    public int QueueCount
    {
        get { return Queues.Count; }
    }

    public bool RunRandom { get; set; } = false;

    // need to lock Queues first
    void StartQueue(T queue)
    {
        if (null != queue)
        {
            Queues.Remove(queue);
            lock (Runnings) Runnings.Add(queue);
            queue.DoWork().ContinueWith(ContinueTaskResult, queue);
        }
    }

    void RunNewQueue()
    {
        lock (Queues) // prioritized items first
        {
            // ToList() avoids modifying Queues while enumerating it
            foreach (var q in Queues.Where(x => x.IsPrioritize).ToList()) StartQueue(q);
        }

        if (Runnings.Count >= MaxRun) return; // others
        else if (Queues.Count == 0)
        {
            if (Runnings.Count == 0 && OnRunComplete != null)
            {
                if (Dispatcher != null && !Dispatcher.CheckAccess()) Dispatcher.Invoke(OnRunComplete);
                else OnRunComplete.Invoke(); // all completed
            }
            else return;
        }
        else
        {
            lock (Queues)
            {
                T queue;
                if (RunRandom) queue = Queues.OrderBy(x => Guid.NewGuid()).FirstOrDefault();
                else queue = Queues.FirstOrDefault();
                StartQueue(queue);
            }
            if (Queues.Count > 0 && Runnings.Count < MaxRun) RunNewQueue();
        }
    }

    void ContinueTaskResult(Task result, object queueObj) => QueueCompleted((T)queueObj);

    void QueueCompleted(T queue)
    {
        lock (Runnings) Runnings.Remove(queue);
        if (queue.ReQueue) lock (Queues) Queues.Add(queue);
        if (OnQueueComplete != null)
        {
            if (Dispatcher != null && !Dispatcher.CheckAccess()) Dispatcher.Invoke(OnQueueComplete, queue);
            else OnQueueComplete.Invoke(queue);
        }
        RunNewQueue();
    }

    public void Add(T queue)
    {
        if (null == queue) throw new ArgumentNullException(nameof(queue));
        lock (Queues) Queues.Add(queue);
        RunNewQueue();
    }

    public void Cancel(T queue)
    {
        if (null == queue) throw new ArgumentNullException(nameof(queue));
        lock (Queues) Queues.RemoveAll(o => o.CheckEquals(queue));
        lock (Runnings) Runnings.ForEach(o => { if (o.CheckEquals(queue)) o.Cancel(); });
    }

    public void Reset(T queue)
    {
        if (null == queue) throw new ArgumentNullException(nameof(queue));
        Cancel(queue);
        Add(queue);
    }

    public void ShutDown()
    {
        MaxRun = 0;
        lock (Queues) Queues.Clear();
        lock (Runnings) Runnings.ForEach(o => o.Cancel());
    }
}
I know this thread is old, but it seems all the present solutions are extremely onerous. The simplest way I could find uses the LINQ Aggregate function to create a daisy-chained list of tasks.
var arr = new int[] { 1, 2, 3, 4, 5 };
var queue = arr.Aggregate(Task.CompletedTask,
    (prev, item) => prev.ContinueWith(antecedent => PerformWorkHere(item)));
The idea is to get your data into an IEnumerable (I'm using an int array), and then reduce that enumerable to a chain of tasks, starting with a default completed task.
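Worth adding (my note): the chain is itself a Task, so completion and any faults can be observed like any other task.

await queue; // resumes once PerformWorkHere has run for every item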

Parallel processing using TPL in windows service

I have a Windows service which consumes a messaging system to fetch messages. I have also created a callback mechanism with the help of the Timer class, which lets me check for a message after some fixed time and then process it. Previously, the service processed messages one by one. But I want the processing to happen in parallel as messages arrive: if the first message arrives it should be processed on one task, and even if that processing has not finished, when the configured callback interval elapses (the callback is working now) the next message should be picked up and processed on a different task.
Below is my code:
Task.Factory.StartNew(() =>
{
    Subscriber<Message> subscriber = new Subscriber<Message>()
    {
        Interval = 1000
    };
    subscriber.Callback(Process, m => m != null);
});

public static void Process(Message message)
{
    if (message != null)
    {
        // Processing logic
    }
    else
    {
    }
}
But using the task factory I am not able to control the number of tasks running in parallel, so in my case I want to configure the number of tasks across which messages will be processed, based on task availability.
Update:
I updated my code above to run on multiple tasks. Below is the code:
private static void Main()
{
    try
    {
        int taskCount = 5;
        Task.Factory.StartNewAsync(() =>
        {
            Subscriber<Message> consumer = new Subscriber<Message>()
            {
                Interval = 1000
            };
            consumer.CallBack(Process, msg => msg != null);
        }, taskCount);
        Console.ReadLine();
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
    }
}

// note: an extension method must be declared in a static class
public static void StartNewAsync(this TaskFactory target, Action action, int taskCount)
{
    var tasks = new Task[taskCount];
    for (int i = 0; i < taskCount; i++)
    {
        tasks[i] = target.StartNew(action);
    }
}

public static void Process(Message message)
{
    if (message != null)
    {
    }
    else
    {
    }
}
I think what you're looking for would result in quite a large sample, so I'm just trying to demonstrate how you would do this with ActionBlock<T>. There are still a lot of unknowns, so I left the sample as a skeleton you can build on. In the sample, the ActionBlock will handle and process your messages in parallel as they're received from your messaging system.
public class Processor
{
    private readonly IMessagingSystem _messagingSystem;
    private readonly ActionBlock<Message> _handler;
    private bool _pollForMessages;

    public Processor(IMessagingSystem messagingSystem)
    {
        _messagingSystem = messagingSystem;
        _handler = new ActionBlock<Message>(msg => Process(msg), new ExecutionDataflowBlockOptions()
        {
            MaxDegreeOfParallelism = 5 // or any configured value
        });
    }

    public async Task Start()
    {
        _pollForMessages = true;
        while (_pollForMessages)
        {
            var msg = await _messagingSystem.ReceiveMessageAsync();
            await _handler.SendAsync(msg);
        }
    }

    public void Stop()
    {
        _pollForMessages = false;
    }

    private void Process(Message message)
    {
        // handle message
    }
}
More examples and ideas:
Ok, sorry, I'm short on time, but here's the general idea/skeleton of what I was thinking as an alternative.
If I'm honest though, I think ActionBlock<T> is the better option as there's just so much done for you, the only limit being that you can't dynamically scale the amount of work it will do at once, although I think that limit can be quite high. If you go this alternative way you can have more control, or just a dynamic number of running tasks, but you'll have to do a lot of things manually; e.g. if you want to limit the number of tasks running at a time, you'd have to implement a queueing system (something ActionBlock handles for you) and then maintain it. I guess it depends on how many messages you're receiving and how fast your process handles them.
You'll have to check it out and think about how it could apply to your use case, as some of the details are a little sketchily implemented on my side, particularly around the ConcurrentBag idea.
So the idea behind what I've thrown together here is that you can start any number of tasks, add to the tasks running, or cancel tasks individually by using the collection.
The main thing, I think, is making the method that the callback runs fire off a thread that does the work, instead of subscribing within a separate thread.
I used Task.Factory.StartNew as you did, but stored the returned Task object in an object (TaskInfo) which also has its CancellationTokenSource and its Id (assigned externally) as properties, and then added that to a collection of TaskInfo, which is a property on the class this is all a part of:
Updated - to avoid this being too confusing, I've just updated the code that was here previously.
You'll have to update bits of it and fill in the blanks in places, like with whatever you have for my HeartbeatController and the few events that get called, because they're beyond the scope of the question, but the idea would be the same.
public class TaskContainer
{
    // TaskInfo (not shown) holds ProcessorId, Task and CancellationTokenSource.
    // Note: ConcurrentBag has no Remove method, so a real implementation needs a
    // different collection here - this is sketch-level, as mentioned above.
    private ConcurrentBag<TaskInfo> Tasks;

    public TaskContainer()
    {
        Tasks = new ConcurrentBag<TaskInfo>();
    }

    // entry point
    // UPDATED
    public void StartAndMonitor(int processorCount)
    {
        for (int i = 0; i <= processorCount; i++)
        {
            Processor processor = new Processor(i);
            CreateProcessorTask(processor);
        }
        this.IsRunning = true;
        MonitorTasks();
    }

    private void CreateProcessorTask(Processor processor)
    {
        CancellationTokenSource cancellationTokenSource = new CancellationTokenSource();
        Task taskInstance = Task.Factory.StartNew(
            () => processor.Start(cancellationTokenSource.Token)
        );
        // bind status update event
        processor.ProcessorStatusUpdated += ReportProcessorProcess;
        Tasks.Add(new TaskInfo()
        {
            ProcessorId = processor.ProcessorId,
            Task = taskInstance,
            CancellationTokenSource = cancellationTokenSource
        });
    }

    // This method gets called once, but the HeartbeatController gets an action as a param that it then
    // executes on a timer. I haven't included that, but you get the idea.
    // This method also checks for tasks that have stopped and restarts them if the manifest call says they should be running.
    // It will also start any new tasks included in the manifest and stop any that aren't included in the manifest.
    internal void MonitorTasks()
    {
        HeartbeatController.Beat(() =>
        {
            HeartBeatHappened?.Invoke(this, null);
            List<int> tasksToStart = new List<int>();
            // this is an api call or whatever drives your config that says what tasks must be running
            var newManifest = this.GetManifest(Properties.Settings.Default.ResourceId);

            // Task removed check - if a Processor is removed from the task pool, cancel it if running and remove it from the Tasks list.
            List<int> instanceIds = new List<int>();
            newManifest.Processors.ForEach(x => instanceIds.Add(x.ProcessorId));
            var removed = Tasks.Select(x => x.ProcessorId).ToList().Except(instanceIds).ToList();
            if (removed.Count() > 0)
            {
                foreach (var extaskId in removed)
                {
                    var task = Tasks.FirstOrDefault(x => x.ProcessorId == extaskId);
                    task.CancellationTokenSource?.Cancel();
                }
            }

            foreach (var newtask in newManifest.Processors)
            {
                var oldtask = Tasks.FirstOrDefault(x => x.ProcessorId == newtask.ProcessorId);
                // existing task check
                if (oldtask != null && oldtask.Task != null)
                {
                    if (!oldtask.Task.IsCanceled && (oldtask.Task.IsCompleted || oldtask.Task.IsFaulted))
                    {
                        var ex = oldtask.Task.Exception;
                        tasksToStart.Add(oldtask.ProcessorId);
                        continue;
                    }
                }
                else // new task check
                    tasksToStart.Add(newtask.ProcessorId);
            }

            foreach (var item in tasksToStart)
            {
                var taskToRemove = Tasks.FirstOrDefault(x => x.ProcessorId == item);
                if (taskToRemove != null)
                    Tasks.Remove(taskToRemove); // see note above: ConcurrentBag has no Remove
                var task = newManifest.Processors.FirstOrDefault(x => x.ProcessorId == item);
                if (task != null)
                {
                    CreateProcessorTask(task);
                }
            }
        });
    }
}
// UPDATED
public class Processor
{
    private int ProcessorId;
    private Subscriber<Message> subscriber;

    public Processor(int processorId) => ProcessorId = processorId;

    public void Start(CancellationToken token)
    {
        subscriber = new Subscriber<Message>()
        {
            Interval = 1000
        };
        subscriber.Callback(Process, m => m != null);
    }

    private void Process(Message m)
    {
        // do work
    }
}
Hope this gives you an idea of how else you can approach your problem and that I didn't miss the point :).
Update
To use events to update progress or show which tasks are processing, I'd extract them into their own class which has subscribe methods on it; when creating a new instance of that class, assign the event to a handler in the parent class, which can then update your UI or do whatever you want with that info.
So the content of Process() would look more like this:
Processor processor = new Processor();
Task task = Task.Factory.StartNew(() => processor.ProcessMessage(cancellationTokenSource.Token));
processor.StatusUpdated += ReportProcess;

Limit total concurrent tasks running [duplicate]

This question already has answers here: How to limit the amount of concurrent async I/O operations?
I have a method Create which is executed whenever a new message is seen on the service bus message queue (https://azure.microsoft.com/en-us/services/service-bus/).
I am trying to limit the total number of concurrent tasks that can run in parallel for all calls of Create to 5 tasks.
In my code Parallel.ForEach does not seem to do anything.
I have tried to add a mutex/lock around the makePdfAsync() invocation like this:
mutex.WaitOne();
if (currentNumTasks < MaxTasks)
{
    tasks.Add(makePdfAsync(form));
}
mutex.ReleaseMutex();
but it is extremely slow and makes the service bus throw.
How do I limit the number of concurrent tasks created across all invocations of Create?
public async Task Create(List<FormModel> forms)
{
    var tasks = new List<Task>();
    Parallel.ForEach(forms, new ParallelOptions { MaxDegreeOfParallelism = 5 }, form =>
    {
        tasks.Add(makePdfAsync(form));
    });
    await Task.WhenAny(Task.WhenAll(tasks), Task.Delay(TimeSpan.FromMinutes(10)));
}
public async Task makePdfAsync(FormModel form)
{
    var message = new PdfMessageModel();
    message.forms = new List<FormModel>() { form };
    var retry = 10;
    var uri = new Uri("http://localhost.:8007");
    var json = JsonConvert.SerializeObject(message);
    using (var wc = new WebClient())
    {
        wc.Encoding = System.Text.Encoding.UTF8;
        // reconnect with delay in case process is not ready
        while (true)
        {
            try
            {
                await wc.UploadStringTaskAsync(uri, json);
                break;
            }
            catch
            {
                if (retry-- == 0) throw;
            }
        }
    }
}
TL;DR. Create is a method on a class; it is called on many instances simultaneously. The concurrency is twofold: several invocations of Create run simultaneously, and within each invocation of Create several tasks run concurrently.
How do I limit the total number of tasks running at any one point?
You could look at using a system-wide semaphore. For example:
var throttle = new Semaphore(5, 5, "pdftaskthrottle");
if (throttle.WaitOne(5000))
{
    try
    {
        // do some task / thread stuff
        // .....
    }
    catch (Exception ex)
    {
        // handle
    }
    finally
    {
        // always remember to release the semaphore
        throttle.Release();
    }
}
else
{
    // we timed out ... try again?
}
If I understand you correctly, you effectively want a producer/consumer queue with a limit of 5 tasks. BlockingCollection would be best if that's what you're after. It has very good performance, as internally it uses SemaphoreSlim to do the blocking when necessary. You can also leverage Task together with it, e.g. by creating a BlockingCollection<Task<T>>. "C# in a Nutshell" has a good section on this; see the code below as a general example. Also, try to avoid using kernel-mode synchronization constructs like Mutex if possible, as they're slow (you have to pay for transitioning from managed code into native code!).
class PCQueue : IDisposable
{
    private BlockingCollection<Task> _taskQueue = new BlockingCollection<Task>();

    public PCQueue(int workerCount)
    {
        for (int i = 0; i < workerCount; i++)
            Task.Factory.StartNew(Consume);
    }

    public Task Enqueue(Action action, CancellationToken cancelToken = default(CancellationToken))
    {
        // A task object can either be generated using TaskCompletionSource
        // or instantiated directly (an unstarted or "cold" task).
        var task = new Task(action, cancelToken);
        _taskQueue.Add(task); // create a cold task and enqueue it
        return task;
    }

    public Task<TResult> Enqueue<TResult>(Func<TResult> func, CancellationToken cancelToken = default(CancellationToken))
    {
        var task = new Task<TResult>(func, cancelToken);
        _taskQueue.Add(task);
        return task;
    }

    void Consume()
    {
        foreach (var task in _taskQueue.GetConsumingEnumerable())
        {
            try
            {
                // We run the task synchronously on the consumer's thread.
                if (!task.IsCanceled) task.RunSynchronously();
            }
            catch (InvalidOperationException)
            {
                // Handle the unlikely event that the task is canceled in between
                // checking whether it's canceled and running it - a race condition!
            }
        }
    }

    public void Dispose() => _taskQueue.CompleteAdding();
}
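A usage sketch (my addition): with two workers, items are dequeued in FIFO order but up to two run at once; Dispose stops the workers once the queue drains.

using (var queue = new PCQueue(workerCount: 2))
{
    Task first = queue.Enqueue(() => Console.WriteLine("first"));
    Task<int> second = queue.Enqueue(() => 40 + 2);
    Task.WaitAll(first, second);
    Console.WriteLine(second.Result); // 42
}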

Async/Await or Task.Run in Console Application/Windows Service

I have been researching (including looking at all the other SO posts on this topic) the best way to implement a (most likely) Windows service worker that will pull items of work from a database and process them in parallel, asynchronously, in a 'fire-and-forget' manner in the background (the work item management will all be handled in the asynchronous method). The work items will be web service calls and database queries. There will be some throttling applied to the producer of these work items to ensure some kind of measured approach to scheduling the work. The examples below are very basic and are just there to highlight the logic of the while loop and for loop in place. Which is the ideal method, or does it not matter? Is there a more appropriate/performant way of achieving this?
async/await...
private static int counter = 1;

static void Main(string[] args)
{
    Console.Title = "Async";
    Task.Run(() => AsyncMain());
    Console.ReadLine();
}

private static async void AsyncMain()
{
    while (true)
    {
        // Imagine calling a database to get some work items to do, in this case 5 dummy items
        for (int i = 0; i < 5; i++)
        {
            var x = DoSomethingAsync(counter.ToString());
            counter++;
            Thread.Sleep(50);
        }
        Thread.Sleep(1000);
    }
}

private static async Task<string> DoSomethingAsync(string jobNumber)
{
    try
    {
        // Simulated mostly IO work - some could be long running
        await Task.Delay(5000);
        Console.WriteLine(jobNumber);
    }
    catch (Exception ex)
    {
        LogException(ex);
    }
    Log("job {0} has completed", jobNumber);
    return "fire and forget so not really interested";
}
Task.Run...
private static int counter = 1;

static void Main(string[] args)
{
    Console.Title = "Task";
    while (true)
    {
        // Imagine calling a database to get some work items to do, in this case 5 dummy items
        for (int i = 0; i < 5; i++)
        {
            var x = Task.Run(() => { DoSomethingAsync(counter.ToString()); });
            counter++;
            Thread.Sleep(50);
        }
        Thread.Sleep(1000);
    }
}

private static string DoSomethingAsync(string jobNumber)
{
    try
    {
        // Simulated mostly IO work - some could be long running
        Task.Delay(5000);
        Console.WriteLine(jobNumber);
    }
    catch (Exception ex)
    {
        LogException(ex);
    }
    Log("job {0} has completed", jobNumber);
    return "fire and forget so not really interested";
}
pull items of work from a database and process them in parallel asynchronously in a 'fire-and-forget' manner in the background
Technically, you want concurrency. Whether you want asynchronous concurrency or parallel concurrency remains to be seen...
The work items will be web service calls and database queries.
The work is I/O-bound, so that implies asynchronous concurrency as the more natural approach.
There will be some throttling applied to the producer of these work items to ensure some kind of measured approach to scheduling the work.
The idea of a producer/consumer queue is implied here. That's one option. TPL Dataflow provides some nice producer/consumer queues that are async-compatible and support throttling.
Alternatively, you can do the throttling yourself. For asynchronous code, there's a built-in throttling mechanism called SemaphoreSlim.
TPL Dataflow approach, with throttling:
private static int counter = 1;

static void Main(string[] args)
{
    Console.Title = "Async";
    var x = Task.Run(() => MainAsync());
    Console.ReadLine();
}

private static async Task MainAsync()
{
    var blockOptions = new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = 7
    };
    var block = new ActionBlock<string>(DoSomethingAsync, blockOptions);
    while (true)
    {
        var dbData = await ...; // Imagine calling a database to get some work items to do, in this case 5 dummy items
        for (int i = 0; i < 5; i++)
        {
            block.Post(counter.ToString());
            counter++;
            Thread.Sleep(50);
        }
        Thread.Sleep(1000);
    }
}

private static async Task DoSomethingAsync(string jobNumber)
{
    try
    {
        // Simulated mostly IO work - some could be long running
        await Task.Delay(5000);
        Console.WriteLine(jobNumber);
    }
    catch (Exception ex)
    {
        LogException(ex);
    }
    Log("job {0} has completed", jobNumber);
}
Asynchronous concurrency approach with manual throttling:
private static int counter = 1;
private static SemaphoreSlim semaphore = new SemaphoreSlim(7);

static void Main(string[] args)
{
    Console.Title = "Async";
    var x = Task.Run(() => MainAsync());
    Console.ReadLine();
}

private static async Task MainAsync()
{
    while (true)
    {
        var dbData = await ...; // Imagine calling a database to get some work items to do, in this case 5 dummy items
        for (int i = 0; i < 5; i++)
        {
            var x = DoSomethingAsync(counter.ToString());
            counter++;
            Thread.Sleep(50);
        }
        Thread.Sleep(1000);
    }
}

private static async Task DoSomethingAsync(string jobNumber)
{
    await semaphore.WaitAsync();
    try
    {
        try
        {
            // Simulated mostly IO work - some could be long running
            await Task.Delay(5000);
            Console.WriteLine(jobNumber);
        }
        catch (Exception ex)
        {
            LogException(ex);
        }
        Log("job {0} has completed", jobNumber);
    }
    finally
    {
        semaphore.Release();
    }
}
As a final note, I hardly ever recommend my own book on SO, but I do think it would really benefit you. In particular, sections 8.10 (Blocking/Asynchronous Queues), 11.5 (Throttling), and 4.4 (Throttling Dataflow Blocks).
First of all, let's fix some things.
In the second example you are calling
Task.Delay(5000);
without await. It is a bad idea. It creates a new Task instance which runs for 5 seconds but no one is waiting for it. Task.Delay is only useful with await. Mind you, do not use Task.Delay(5000).Wait() or you are going to get deadlocked.
In your second example you are trying to make the DoSomethingAsync method synchronous; let's call it DoSomethingSync and replace the Task.Delay(5000); with Thread.Sleep(5000);.
Now the second example is almost the old-school ThreadPool.QueueUserWorkItem. And there is nothing wrong with it, as long as you are not using some already-async API inside. Task.Run and ThreadPool.QueueUserWorkItem used in the fire-and-forget case are just the same thing; I would use the latter for clarity.
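To make the equivalence concrete, a minimal sketch (my addition, using the DoSomethingSync rename suggested above):

// Fire-and-forget, both queue the work to the thread pool:
Task.Run(() => DoSomethingSync(jobNumber));                    // returns a Task you could still observe
ThreadPool.QueueUserWorkItem(_ => DoSomethingSync(jobNumber)); // no Task at all; pure fire-and-forget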
This slowly drives us to the answer to the main question. Async or not async - that is the question! I would say: "Do not create async methods unless you have to use some async I/O inside your code." If, however, there is an async API you have to use, then the first approach would be more expected by those who are going to read your code years later.

Continue with a method after completing producer-consumer

I have a producer-consumer application in WPF. After I click a button, the following runs:
private async void Start_Click(object sender, RoutedEventArgs e)
{
    try
    {
        // set up data
        var producer = Producer();
        var consumer = Consumer();
        await Task.WhenAll(producer, consumer);
        // need to log the results in the Summary method
        Summary();
    }
    catch
    {
        // error handling omitted in the question
    }
}
The Summary method returns void; I assume that is proper.
private void Summary() { }
async Task Producer() { await something }
async Task Consumer() { await something }
EDIT:
My question is: in the Summary() method I have to use the calculated values from the tasks, but the Consumer task is a long-running process. The program runs Summary quickly, before getting the updated values, so it uses the initial values.
My thought:
await Task.WhenAll(producer, consumer);
Summary();
EDIT2: 11:08 AM 11/05/2014
private void Summary()
{
    myFail = 100 - mySuccess;
    _dataContext.MyFail = myFail; // update window upon property changed
}

async Task Consumer()
{
    try
    {
        Dictionary<string, string> dict = new Dictionary<string, string>();
        var executionDataflowBlockOptions = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 5,
            CancellationToken = cToken
        };
        var c = new ActionBlock<T>(
            t =>
            {
                if (cToken.IsCancellationRequested)
                    return;
                dict = Do(t, cToken);
                if (dict["Success"] == "Success")
                    mySuccess++;
The current problem is that mySuccess always has its initial value in the Summary method.
You can use the ContinueWith method to execute Summary after both producer and consumer have finished:
Task.WhenAll(producer, consumer)
    .ContinueWith(continuation => Summary());
EDIT 1
It seems that you are misusing the producer/consumer pattern.
The producer is supposed to produce the values and shovel them into one end of a communication pipe. On the other end of the pipe, the consumer consumes the values as they become available. In other words, the consumer waits for the producer to produce some value and to put the value in the pipe and for the value to arrive at the end of the pipe.
Usually this involves some sort of signaling mechanism where the producer signals (awakes) the consumer whenever a value has been created.
In your case, you don't have the signaling mechanism, and I strongly suspect that your producer is generating only one value. If the latter is the case, you can just return a value from the "producer".
If, however, your producer is creating more than one value, you can use the BlockingCollection<T> class to send values from producer to consumer.
In your Producer class, get a reference to the pipe and put data into it:
public class Producer
{
    private BlockingCollection<Data> _pipe;

    public void Start()
    {
        while (!done)
        {
            var value = ProduceValue();
            _pipe.Add(value);
        }
        // Signal the consumer that we're finished
        _pipe.CompleteAdding();
    }
}
In the Consumer class wait for the values to arrive and process each one:
public class Consumer
{
    private BlockingCollection<Data> _pipe;

    public void Start()
    {
        foreach (var value in _pipe.GetConsumingEnumerable())
        {
            // GetConsumingEnumerable will block until a value arrives and
            // will exit when the producer calls CompleteAdding()
            Process(value);
        }
    }
}
Having the above in place, you can use ContinueWith or await on the WhenAll method to run the Summary.
EDIT 2
As promised in the comments, I have analyzed the code you posted on the MSDN forum. There are several problems in the code.
First of all, and the simplest one to fix: you're not incrementing the counter in a thread-safe manner. An increment (value++) is not an atomic operation, so you should be careful when incrementing shared fields. An easy way to do this is:
Interlocked.Increment(ref evenNumber);
Now, the actual problems in your code:
As I mentioned earlier, the consumer does not know when the producer has finished producing values. So, after the producer exits the for block, it should signal that it has finished. The consumer waits for the producer's finish signal; otherwise it will wait forever for the next value, but there won't be one.
You are linking the BufferBlock to the consumer code, which starts to execute, but you're not waiting for the consumer block to finish - you're only waiting half a second before exiting the consumer method, leaving the worker threads of the consumer block to do their work in vain.
As a consequence of the above, your Report method executes before the processing is finished, outputting the value of the evenNumber counter at the moment the method executes, not when all processing is finished.
Below is the edited code with some comments:
class Program
{
    public static BufferBlock<int> m_Queue = new BufferBlock<int>(new DataflowBlockOptions { BoundedCapacity = 1000 });
    private static int evenNumber;

    static void Main(string[] args)
    {
        var producer = Producer();
        var consumer = Consumer();
        Task.WhenAll(producer, consumer).Wait();
        Report();
    }

    static void Report()
    {
        Console.WriteLine("There are {0} even numbers", evenNumber);
        Console.Read();
    }

    static async Task Producer()
    {
        for (int i = 0; i < 500; i++)
        {
            // Send a value to the consumer and wait for the value to be processed
            await m_Queue.SendAsync(i);
        }
        // Signal the consumer that there will be no more values
        m_Queue.Complete();
    }

    static async Task Consumer()
    {
        var executionDataflowBlockOptions = new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 4
        };
        var consumerBlock = new ActionBlock<int>(x =>
        {
            int j = DoWork(x);
            if (j % 2 == 0)
                // Increment the counter in a thread-safe way
                Interlocked.Increment(ref evenNumber);
        }, executionDataflowBlockOptions);

        // Link the buffer to the consumer
        using (m_Queue.LinkTo(consumerBlock, new DataflowLinkOptions { PropagateCompletion = true }))
        {
            // Wait for the consumer to finish.
            // This method will exit after all the data from the buffer was processed.
            await consumerBlock.Completion;
        }
    }

    static int DoWork(int x)
    {
        Thread.Sleep(100);
        return x;
    }
}
