Thread Safety in ConcurrentQueue (C#)

I have a MessagesManager thread to which different threads may send messages; this MessagesManager thread is then responsible for publishing those messages inside SendMessageToTcpIP() (the entry point of the MessagesManager thread).
class MessagesManager : IMessageNotifier
{
    private readonly AutoResetEvent _waitTillMessageQueueEmptyARE = new AutoResetEvent(false);
    private ConcurrentQueue<string> MessagesQueue = new ConcurrentQueue<string>();

    public void PublishMessage(string Message)
    {
        MessagesQueue.Enqueue(Message);
        _waitTillMessageQueueEmptyARE.Set();
    }

    public void SendMessageToTcpIP()
    {
        // keep waiting till a new message comes
        while (MessagesQueue.Count() == 0)
        {
            _waitTillMessageQueueEmptyARE.WaitOne();
        }
        // Copy the concurrent queue into a local queue - keep dequeuing items as they are inserted into the local queue
        Queue<string> localMessagesQueue = new Queue<string>();
        while (!MessagesQueue.IsEmpty)
        {
            string message;
            bool isRemoved = MessagesQueue.TryDequeue(out message);
            if (isRemoved)
                localMessagesQueue.Enqueue(message);
        }
        // Use the local queue for further processing
        while (localMessagesQueue.Count() != 0)
        {
            TcpIpMessageSenderClient.ConnectAndSendMessage(localMessagesQueue.Dequeue().PadRight(80, ' '));
            Thread.Sleep(2000);
        }
    }
}
The different threads (3-4) send their messages by calling PublishMessage(string Message) (using the same MessagesManager object). Once a message comes in, I push it into the concurrent queue and notify SendMessageToTcpIP() by setting _waitTillMessageQueueEmptyARE.Set();. Inside SendMessageToTcpIP(), I copy the messages from the concurrent queue into a local queue and then publish them one by one.
QUESTIONS: Is it thread safe to do enqueuing and dequeuing in this way? Could there be some strange effects due to it?

While this is probably thread-safe, there are built-in classes in .NET to help with the "many producers, one consumer" pattern, like BlockingCollection. You can rewrite your class like this:
class MessagesManager : IDisposable {
    // note that your ConcurrentQueue is still in play, passed to constructor
    private readonly BlockingCollection<string> MessagesQueue = new BlockingCollection<string>(new ConcurrentQueue<string>());

    public MessagesManager() {
        // start consumer thread here
        new Thread(SendLoop) {
            IsBackground = true
        }.Start();
    }

    public void PublishMessage(string Message) {
        // no need to notify here, will be done for you
        MessagesQueue.Add(Message);
    }

    private void SendLoop() {
        // this blocks until new items are available
        foreach (var item in MessagesQueue.GetConsumingEnumerable()) {
            // ensure that you handle exceptions here, or whole thing will break on exception
            TcpIpMessageSenderClient.ConnectAndSendMessage(item.PadRight(80, ' '));
            Thread.Sleep(2000); // only if you are sure this is required
        }
    }

    public void Dispose() {
        // this will "complete" GetConsumingEnumerable, so your thread will complete
        MessagesQueue.CompleteAdding();
        MessagesQueue.Dispose();
    }
}

.NET already provides ActionBlock<T>, which allows posting messages to a buffer and processing them asynchronously. By default, only one message is processed at a time.
Your code could be rewritten as:
// In an initialization function
ActionBlock<string> _hmiAgent = new ActionBlock<string>(async msg =>
{
    TcpIpMessageSenderClient.ConnectAndSendMessage(msg.PadRight(80, ' '));
    await Task.Delay(2000);
});

// In some other thread ...
foreach (....)
{
    _hmiAgent.Post(someMessage);
}

// When the application closes
_hmiAgent.Complete();
await _hmiAgent.Completion;
ActionBlock offers many benefits - you can specify a limit to the number of items it can accept in a buffer and specify that multiple messages can be processed in parallel. You can also combine multiple blocks in a processing pipeline. In a desktop application, a message can be posted to a pipeline in response to an event, get processed by separate blocks and results posted to a final block that updates the UI.
Padding, for example, could be performed by an intermediary TransformBlock<TIn, TOut>. This transformation is trivial and the cost of using the block is greater than that of the method call, but it serves as an illustration:
// In an initialization function
TransformBlock<string, string> _hmiAgent = new TransformBlock<string, string>(
    msg => msg.PadRight(80, ' '));

ActionBlock<string> _tcpBlock = new ActionBlock<string>(async msg =>
{
    TcpIpMessageSenderClient.ConnectAndSendMessage(msg);
    await Task.Delay(2000);
});

var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
_hmiAgent.LinkTo(_tcpBlock, linkOptions);
The posting code doesn't change at all:
_hmiAgent.Post(someMessage);
When the application terminates, we need to wait for the _tcpBlock to complete:
_hmiAgent.Complete();
await _tcpBlock.Completion;
Visual Studio 2015+ itself uses TPL Dataflow for such scenarios.
Bar Arnon provides a better example in TPL Dataflow Is The Best Library You're Not Using, which shows how both synchronous and asynchronous methods can be used in a block.

The code is thread-safe, since both ConcurrentQueue and AutoResetEvent are thread-safe. Your strings are only ever read, never written to, so this code is thread-safe.
However, you have to make sure you call SendMessageToTcpIP in some sort of loop;
otherwise, you have a dangerous race condition - some messages may get lost:
while (!MessagesQueue.IsEmpty)
{
    string message;
    bool isRemoved = MessagesQueue.TryDequeue(out message);
    if (isRemoved)
        localMessagesQueue.Enqueue(message);
}
// <<--- what happens if another thread enqueues a message here?
while (localMessagesQueue.Count() != 0)
{
    TcpIpMessageSenderClient.ConnectAndSendMessage(localMessagesQueue.Dequeue().PadRight(80, ' '));
    Thread.Sleep(2000);
}
Other than that, AutoResetEvent is an extremely heavy object: it uses a kernel object to synchronize threads, so every call is a system call, which may be costly. Consider using a user-mode synchronization primitive instead (in .NET, Monitor.Wait and Monitor.Pulse act as a condition variable).
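For illustration, a minimal sketch of the same producer/consumer handshake using Monitor as that condition variable (an uncontended lock stays in user mode, unlike an AutoResetEvent wait):
using System.Collections.Generic;
using System.Threading;

// Producer/consumer queue synchronized with Monitor instead of AutoResetEvent.
class MonitorMessageQueue
{
    private readonly object _gate = new object();
    private readonly Queue<string> _queue = new Queue<string>();

    public void Publish(string message)
    {
        lock (_gate)
        {
            _queue.Enqueue(message);
            Monitor.Pulse(_gate); // wake one waiting consumer
        }
    }

    public string Take()
    {
        lock (_gate)
        {
            while (_queue.Count == 0)
                Monitor.Wait(_gate); // releases _gate while waiting, reacquires on wake-up
            return _queue.Dequeue();
        }
    }
}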

Here is a refactored code snippet showing how I would implement this functionality:
class MessagesManager {
    private readonly AutoResetEvent messagesAvailableSignal = new AutoResetEvent(false);
    private readonly ConcurrentQueue<string> messageQueue = new ConcurrentQueue<string>();

    public void PublishMessage(string Message) {
        messageQueue.Enqueue(Message);
        messagesAvailableSignal.Set();
    }

    public void SendMessageToTcpIP() {
        while (true) {
            messagesAvailableSignal.WaitOne();
            while (!messageQueue.IsEmpty) {
                string message;
                if (messageQueue.TryDequeue(out message)) {
                    TcpIpMessageSenderClient.ConnectAndSendMessage(message.PadRight(80, ' '));
                }
            }
        }
    }
}
Points to note here:
This drains the queue completely: if there is at least one message, it will process all of them
The 2000 ms Thread.Sleep is removed


Creating a class that runs tasks sequentially [duplicate]

I know that asynchronous programming has seen a lot of changes over the years. I'm somewhat embarrassed that I let myself get this rusty at just 34 years old, but I'm counting on StackOverflow to bring me up to speed.
What I am trying to do is manage a queue of "work" on a separate thread, but in such a way that only one item is processed at a time. I want to post work on this thread and it doesn't need to pass anything back to the caller. Of course I could simply spin up a new Thread object and have it loop over a shared Queue object, using sleeps, interrupts, wait handles, etc. But I know things have gotten better since then. We have BlockingCollection, Task, async/await, not to mention NuGet packages that probably abstract a lot of that.
I know that "What's the best..." questions are generally frowned upon so I'll rephrase it by saying "What is the currently recommended..." way to accomplish something like this using built-in .NET mechanisms preferably. But if a third party NuGet package simplifies things a bunch, it's just as well.
I considered a TaskScheduler instance with a fixed maximum concurrency of 1, but it seems there is probably a much less clunky way to do that by now.
Background
Specifically, what I am trying to do in this case is queue an IP geolocation task during a web request. The same IP might wind up getting queued for geolocation multiple times, but the task will know how to detect that and skip out early if it's already been resolved. But the request handler is just going to throw these () => LocateAddress(context.Request.UserHostAddress) calls into a queue and let the LocateAddress method handle duplicate work detection. The geolocation API I am using doesn't like to be bombarded with requests, which is why I want to limit it to a single concurrent task at a time. However, it would be nice if the approach allowed easy scaling to more concurrent tasks with a simple parameter change.
To create an asynchronous single-degree-of-parallelism queue of work you can simply create a SemaphoreSlim, initialized to one, and then have the enqueuing method await the acquisition of that semaphore before starting the requested work.
public class TaskQueue
{
    private SemaphoreSlim semaphore;

    public TaskQueue()
    {
        semaphore = new SemaphoreSlim(1);
    }

    public async Task<T> Enqueue<T>(Func<Task<T>> taskGenerator)
    {
        await semaphore.WaitAsync();
        try
        {
            return await taskGenerator();
        }
        finally
        {
            semaphore.Release();
        }
    }

    public async Task Enqueue(Func<Task> taskGenerator)
    {
        await semaphore.WaitAsync();
        try
        {
            await taskGenerator();
        }
        finally
        {
            semaphore.Release();
        }
    }
}
Of course, to have a fixed degree of parallelism other than one, simply initialize the semaphore to some other number.
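For instance, a minimal usage sketch for the geolocation scenario (LocateAddressAsync is a hypothetical async wrapper around the lookup):
// Shared instance; all enqueued work runs at most one item at a time.
private static readonly TaskQueue GeoQueue = new TaskQueue();

public Task HandleRequest(HttpContext context)
{
    // Serialized by the semaphore inside TaskQueue
    // (note: SemaphoreSlim does not strictly guarantee FIFO order).
    return GeoQueue.Enqueue(() => LocateAddressAsync(context.Request.UserHostAddress));
}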
Your best option as I see it is using TPL Dataflow's ActionBlock:
var actionBlock = new ActionBlock<string>(address =>
{
    if (!IsDuplicate(address))
    {
        LocateAddress(address);
    }
});

actionBlock.Post(context.Request.UserHostAddress);
TPL Dataflow is a robust, thread-safe, async-ready and very configurable actor-based framework (available as a NuGet package).
Here's a simple example for a more complicated case. Let's assume you want to:
Enable concurrency (limited to the available cores).
Limit the queue size (so you won't run out of memory).
Have both LocateAddress and the queue insertion be async.
Cancel everything after an hour.
var actionBlock = new ActionBlock<string>(async address =>
{
    if (!IsDuplicate(address))
    {
        await LocateAddressAsync(address);
    }
}, new ExecutionDataflowBlockOptions
{
    BoundedCapacity = 10000,
    MaxDegreeOfParallelism = Environment.ProcessorCount,
    CancellationToken = new CancellationTokenSource(TimeSpan.FromHours(1)).Token
});

await actionBlock.SendAsync(context.Request.UserHostAddress);
Actually you don't need to run the tasks on one thread; you need them to run serially (one after another), FIFO. TPL doesn't have a class for that, but here is my very lightweight, non-blocking implementation with tests. https://github.com/Gentlee/SerialQueue
It also has @Servy's implementation there; tests show it is twice as slow as mine, and it doesn't guarantee FIFO.
Example:
private readonly SerialQueue queue = new SerialQueue();

async Task SomeAsyncMethod()
{
    var result = await queue.Enqueue(DoSomething);
}
Use BlockingCollection<Action> to create a producer/consumer pattern with one consumer (only one thing running at a time, like you want) and one or many producers.
First define a shared queue somewhere:
BlockingCollection<Action> queue = new BlockingCollection<Action>();
In your consumer Thread or Task you take from it:
// This will block until there's an item available
Action itemToRun = queue.Take();
Then from any number of producers on other threads, simply add to the queue:
queue.Add(() => LocateAddress(context.Request.UserHostAddress));
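Put together, a minimal end-to-end sketch of this pattern might look like this (the consumer loop and shutdown handling are assumptions layered on the fragments above):
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class WorkQueueDemo
{
    static readonly BlockingCollection<Action> queue = new BlockingCollection<Action>();

    static void Main()
    {
        // Single consumer: executes posted actions one at a time, in order.
        var consumer = Task.Run(() =>
        {
            foreach (var action in queue.GetConsumingEnumerable())
                action();
        });

        // Any number of producers may post work from any thread.
        for (int i = 0; i < 5; i++)
        {
            int n = i; // capture a copy, not the loop variable
            queue.Add(() => Console.WriteLine($"work item {n}"));
        }

        queue.CompleteAdding(); // signal that no more work is coming
        consumer.Wait();        // drain the remaining items, then exit
    }
}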
I'm posting a different solution here. To be honest, I'm not sure whether this is a good solution.
I'm used to using BlockingCollection to implement a producer/consumer pattern, with a dedicated thread consuming those items. That's fine if there is always data coming in and the consumer thread won't sit there doing nothing.
I encountered a scenario where one of the applications would like to send emails on a different thread, but the total number of emails is not that big.
My initial solution was to have a dedicated consumer thread (created by Task.Run()), but a lot of the time it just sits there and does nothing.
Old solution:
private readonly BlockingCollection<EmailData> _Emails =
    new BlockingCollection<EmailData>(new ConcurrentQueue<EmailData>());

// producer can add data here
public void Add(EmailData emailData)
{
    _Emails.Add(emailData);
}

public void Run()
{
    // create a consumer thread
    Task.Run(() =>
    {
        foreach (var emailData in _Emails.GetConsumingEnumerable())
        {
            SendEmail(emailData);
        }
    });
}

// sending email implementation
private void SendEmail(EmailData emailData)
{
    throw new NotImplementedException();
}
As you can see, if there are not enough emails to be sent (and that is my case), the consumer thread will spend most of its time sitting there doing nothing at all.
I changed my implementation to:
// create an empty task
private Task _SendEmailTask = Task.Run(() => { });

// caller will dispatch the email to here
// ContinueWith will use a thread pool thread (different to the
// _SendEmailTask thread) to send this email
// NOTE: if Add can be called from several threads at once, this
// read-modify-write of _SendEmailTask needs a lock (see below)
private void Add(EmailData emailData)
{
    _SendEmailTask = _SendEmailTask.ContinueWith(t =>
    {
        SendEmail(emailData);
    });
}

// actual implementation
private void SendEmail(EmailData emailData)
{
    throw new NotImplementedException();
}
It's no longer a producer/consumer pattern, but it won't have a thread sitting there doing nothing; instead, every time there is an email to send, a thread pool thread is used to do it.
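One caveat: the reassignment of _SendEmailTask is itself a race if Add is called concurrently from multiple threads. A small lock around the chaining fixes that; a sketch:
private readonly object _chainLock = new object();
private Task _sendEmailTask = Task.CompletedTask;

private void Add(EmailData emailData)
{
    lock (_chainLock)
    {
        // Chain onto the current tail, so emails still run one at a time, in order.
        _sendEmailTask = _sendEmailTask.ContinueWith(_ => SendEmail(emailData));
    }
}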
My library can:

Run queued items in random order
Handle multiple queues
Run prioritized items first
Re-queue items
Raise an event when all queues are completed
Cancel a running item, or cancel an item waiting to run
Dispatch events to the UI thread
public interface IQueue
{
    bool IsPrioritize { get; }
    bool ReQueue { get; }
    /// <summary>
    /// Don't use async
    /// </summary>
    /// <returns></returns>
    Task DoWork();
    bool CheckEquals(IQueue queue);
    void Cancel();
}

public delegate void QueueComplete<T>(T queue) where T : IQueue;
public delegate void RunComplete();

public class TaskQueue<T> where T : IQueue
{
    readonly List<T> Queues = new List<T>();
    readonly List<T> Runnings = new List<T>();

    [Browsable(false), DefaultValue((string)null)]
    public Dispatcher Dispatcher { get; set; }

    public event RunComplete OnRunComplete;
    public event QueueComplete<T> OnQueueComplete;

    int _MaxRun = 1;
    public int MaxRun
    {
        get { return _MaxRun; }
        set
        {
            bool flag = value > _MaxRun;
            _MaxRun = value;
            if (flag && Queues.Count != 0) RunNewQueue();
        }
    }

    public int RunningCount
    {
        get { return Runnings.Count; }
    }

    public int QueueCount
    {
        get { return Queues.Count; }
    }

    public bool RunRandom { get; set; } = false;

    // needs Queues locked first
    void StartQueue(T queue)
    {
        if (null != queue)
        {
            Queues.Remove(queue);
            lock (Runnings) Runnings.Add(queue);
            queue.DoWork().ContinueWith(ContinueTaskResult, queue);
        }
    }

    void RunNewQueue()
    {
        lock (Queues) // prioritize
        {
            // snapshot, because StartQueue removes from Queues while we iterate
            foreach (var q in Queues.Where(x => x.IsPrioritize).ToList()) StartQueue(q);
        }
        if (Runnings.Count >= MaxRun) return; // other
        else if (Queues.Count == 0)
        {
            if (Runnings.Count == 0 && OnRunComplete != null)
            {
                if (Dispatcher != null && !Dispatcher.CheckAccess()) Dispatcher.Invoke(OnRunComplete);
                else OnRunComplete.Invoke(); // on completed
            }
            else return;
        }
        else
        {
            lock (Queues)
            {
                T queue;
                if (RunRandom) queue = Queues.OrderBy(x => Guid.NewGuid()).FirstOrDefault();
                else queue = Queues.FirstOrDefault();
                StartQueue(queue);
            }
            if (Queues.Count > 0 && Runnings.Count < MaxRun) RunNewQueue();
        }
    }

    void ContinueTaskResult(Task result, object queue_obj) => QueueCompleted((T)queue_obj);

    void QueueCompleted(T queue)
    {
        lock (Runnings) Runnings.Remove(queue);
        if (queue.ReQueue) lock (Queues) Queues.Add(queue);
        if (OnQueueComplete != null)
        {
            if (Dispatcher != null && !Dispatcher.CheckAccess()) Dispatcher.Invoke(OnQueueComplete, queue);
            else OnQueueComplete.Invoke(queue);
        }
        RunNewQueue();
    }

    public void Add(T queue)
    {
        if (null == queue) throw new ArgumentNullException(nameof(queue));
        lock (Queues) Queues.Add(queue);
        RunNewQueue();
    }

    public void Cancel(T queue)
    {
        if (null == queue) throw new ArgumentNullException(nameof(queue));
        lock (Queues) Queues.RemoveAll(o => o.CheckEquals(queue));
        lock (Runnings) Runnings.ForEach(o => { if (o.CheckEquals(queue)) o.Cancel(); });
    }

    public void Reset(T queue)
    {
        if (null == queue) throw new ArgumentNullException(nameof(queue));
        Cancel(queue);
        Add(queue);
    }

    public void ShutDown()
    {
        MaxRun = 0;
        lock (Queues) Queues.Clear();
        lock (Runnings) Runnings.ForEach(o => o.Cancel());
    }
}
I know this thread is old, but it seems all the present solutions are extremely onerous. The simplest way I could find uses the LINQ Aggregate function to create a daisy-chained list of tasks.
var arr = new int[] { 1, 2, 3, 4, 5 };
var queue = arr.Aggregate(Task.CompletedTask,
    (prev, item) => prev.ContinueWith(antecedent => PerformWorkHere(item)));
The idea is to get your data into an IEnumerable (I'm using an int array), and then reduce that enumerable to a chain of tasks, starting with a default, completed task.
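If the work items are themselves asynchronous, the same fold works with one addition: Unwrap flattens the Task&lt;Task&gt; that ContinueWith produces, so each job fully completes before the next starts. A sketch under that assumption:
// Chain an arbitrary sequence of async jobs so they run strictly one after another.
Task RunSequentially(IEnumerable<Func<Task>> jobs) =>
    jobs.Aggregate(Task.CompletedTask,
        (prev, job) => prev.ContinueWith(_ => job()).Unwrap());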

Forcing certain code to always run on the same thread

We have an old 3rd party system (let's call it Junksoft® 95) that we interface with via PowerShell (it exposes a COM object), and I'm in the process of wrapping it in a REST API (ASP.NET Framework 4.8 and Web API 2). I use the System.Management.Automation NuGet package to create a PowerShell instance in which I instantiate Junksoft's COM API as a dynamic object that I then use:
// I'm omitting some exception handling and maintenance code for brevity
powerShell = System.Management.Automation.PowerShell.Create();
powerShell.AddScript(@"Add-Type -Path C:\Path\To\Junksoft\Scripting.dll");
powerShell.AddScript("New-Object Com.Junksoft.Scripting.ScriptingObject");
dynamic junksoftAPI = powerShell.Invoke()[0];

// Now we issue commands to junksoftAPI like this:
junksoftAPI.Login(user, pass);
int age = junksoftAPI.GetAgeByCustomerId(custId);
List<string> names = junksoftAPI.GetNames();
This works fine when I run all of this on the same thread (e.g. in a console application). However, for some reason this usually doesn't work when I put junksoftAPI into a System.Web.Caching.Cache and use it from different controllers in my web app. I say usually because this actually works when ASP.NET happens to give the incoming call to the thread that junksoftAPI was created on. If it doesn't, Junksoft 95 gives me an error.
Is there any way for me to make sure that all interactions with junksoftAPI happen on the same thread?
Note that I don't want to turn the whole web application into a single-threaded application! The logic in the controllers and elsewhere should happen like normal on different threads. It should only be the Junksoft interactions that happen on the Junksoft-specific thread, something like this:
[HttpGet]
public IHttpActionResult GetAge(...)
{
    // finding customer ID in database...
    ...
    int custAge = await Task.Run(() =>
    {
        // this should happen on the Junksoft-specific thread and not the next available thread
        var cache = new System.Web.Caching.Cache();
        var junksoftAPI = cache.Get(...); // this has previously been added to cache on the Junksoft-specific thread
        return junksoftAPI.GetAgeByCustomerId(custId);
    });
    // prepare a response using custAge...
}
You can create your own singleton worker thread to achieve this. Here is the code, which you can plug into your web application.
public class JunkSoftRunner
{
    private static JunkSoftRunner _instance;

    // singleton pattern to restrict all the actions to be executed on a single thread only
    public static JunkSoftRunner Instance => _instance ?? (_instance = new JunkSoftRunner());

    private readonly SemaphoreSlim _semaphore;
    private readonly AutoResetEvent _newTaskRunSignal;

    private TaskCompletionSource<object> _taskCompletionSource;
    private Func<object> _func;

    private JunkSoftRunner()
    {
        _semaphore = new SemaphoreSlim(1, 1);
        _newTaskRunSignal = new AutoResetEvent(false);
        var contextThread = new Thread(ThreadLooper)
        {
            Priority = ThreadPriority.Highest
        };
        contextThread.Start();
    }

    private void ThreadLooper()
    {
        while (true)
        {
            // wait till the next task signal is received
            _newTaskRunSignal.WaitOne();

            // next task execution signal is received
            try
            {
                // try to execute the task and get the result
                var result = _func.Invoke();
                // task executed successfully, set the result
                _taskCompletionSource.SetResult(result);
            }
            catch (Exception ex)
            {
                // task execution threw an exception, set the exception and continue with the looper
                _taskCompletionSource.SetException(ex);
            }
        }
    }

    public async Task<TResult> Run<TResult>(Func<TResult> func, CancellationToken cancellationToken = default(CancellationToken))
    {
        // allows only one thread to run at a time
        await _semaphore.WaitAsync(cancellationToken);

        // thread has acquired the semaphore and entered
        try
        {
            // create a new task completion source to wait for func to get executed on the context thread
            _taskCompletionSource = new TaskCompletionSource<object>();
            // set the function to be executed by the context thread
            _func = () => func();
            // signal the waiting context thread that it is time to execute the task
            _newTaskRunSignal.Set();
            // wait and return the result till the task execution is finished on the context/looper thread
            return (TResult)await _taskCompletionSource.Task;
        }
        finally
        {
            // release the semaphore to allow other threads to acquire it
            _semaphore.Release();
        }
    }
}
Console Main Method for testing:
public class Program
{
    // testing the junk soft runner
    public static void Main()
    {
        // get the singleton instance
        var softRunner = JunkSoftRunner.Instance;

        // simulate web requests on different threads
        for (var i = 0; i < 10; i++)
        {
            var taskIndex = i;
            // launch a web request on a new thread
            Task.Run(async () =>
            {
                Console.WriteLine($"Task{taskIndex} (ThreadID:'{Thread.CurrentThread.ManagedThreadId}') Launched");
                return await softRunner.Run(() =>
                {
                    Console.WriteLine($"->Task{taskIndex} Completed On '{Thread.CurrentThread.ManagedThreadId}' thread.");
                    return taskIndex;
                });
            });
        }
    }
}
Output:
Notice that, though the function was launched from different threads, some portion of the code always executed on the same context thread, with ID '5'.
But beware: though all the web requests are executed on independent threads, they will eventually wait for some tasks to get executed on the singleton worker thread. This will eventually create a bottleneck in your web application. That is, however, a limitation of this design.
Here is how you could issue commands to the Junksoft API from a dedicated STA thread, using a BlockingCollection class:
public class JunksoftSTA : IDisposable
{
    private readonly BlockingCollection<Action<Lazy<dynamic>>> _pump;
    private readonly Thread _thread;

    public JunksoftSTA()
    {
        _pump = new BlockingCollection<Action<Lazy<dynamic>>>();
        _thread = new Thread(() =>
        {
            var lazyApi = new Lazy<dynamic>(() =>
            {
                var powerShell = System.Management.Automation.PowerShell.Create();
                powerShell.AddScript(@"Add-Type -Path C:\Path\To\Junksoft.dll");
                powerShell.AddScript("New-Object Com.Junksoft.ScriptingObject");
                dynamic junksoftAPI = powerShell.Invoke()[0];
                return junksoftAPI;
            });
            foreach (var action in _pump.GetConsumingEnumerable())
            {
                action(lazyApi);
            }
        });
        _thread.SetApartmentState(ApartmentState.STA);
        _thread.IsBackground = true;
        _thread.Start();
    }

    public Task<T> CallAsync<T>(Func<dynamic, T> function)
    {
        var tcs = new TaskCompletionSource<T>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        _pump.Add(lazyApi =>
        {
            try
            {
                var result = function(lazyApi.Value);
                tcs.SetResult(result);
            }
            catch (Exception ex)
            {
                tcs.SetException(ex);
            }
        });
        return tcs.Task;
    }

    public Task CallAsync(Action<dynamic> action)
    {
        return CallAsync<object>(api => { action(api); return null; });
    }

    public void Dispose() => _pump.CompleteAdding();

    public void Join() => _thread.Join();
}
The purpose of using the Lazy class is for surfacing a possible exception during the construction of the dynamic object, by propagating it to the callers.
...exceptions are cached. That is, if the factory method throws an exception the first time a thread tries to access the Value property of the Lazy<T> object, the same exception is thrown on every subsequent attempt.
Usage example:
// A static field stored somewhere
public static readonly JunksoftSTA JunksoftStatic = new JunksoftSTA();

await JunksoftStatic.CallAsync(api => { api.Login("x", "y"); });
int age = await JunksoftStatic.CallAsync(api => api.GetAgeByCustomerId(custId));
In case you find that a single STA thread is not enough to serve all the requests in a timely manner, you could add more STA threads, all of them running the same code (private readonly Thread[] _threads; etc.). The BlockingCollection class is thread-safe and can be consumed concurrently by any number of threads.
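A sketch of that multi-thread variant, assuming a CreateJunksoftApi helper that wraps the PowerShell setup shown above (the helper and the constructor parameter are illustrative):
private readonly Thread[] _threads;

public JunksoftSTA(int threadCount)
{
    _pump = new BlockingCollection<Action<Lazy<dynamic>>>();
    _threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++)
    {
        _threads[i] = new Thread(() =>
        {
            // Each worker owns its own lazily-created Junksoft COM instance.
            var lazyApi = new Lazy<dynamic>(() => CreateJunksoftApi());
            foreach (var action in _pump.GetConsumingEnumerable())
                action(lazyApi);
        });
        _threads[i].SetApartmentState(ApartmentState.STA);
        _threads[i].IsBackground = true;
        _threads[i].Start();
    }
}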
If you had not said that it was a 3rd party tool, I would have assumed it is a GUI class. For practical reasons, it is a very bad idea to have multiple threads write to them. .NET enforces a strict "only the creating thread shall write" rule from 2.0 onward.
WebServers in general and ASP.NET in particular use a pretty big thread pool. We are talking tens to hundreds of threads per core. That means it is really hard to nail any request down to a specific thread. You might as well not try.
Again, looking at the GUI classes might be your best bet. You could basically make a single thread with the sole purpose of imitating a GUI's event queue. The main/UI thread of your average Windows Forms application is responsible for creating every GUI class instance. It is kept alive by polling/processing the event queue. It ends only when it receives a cancel command via the event queue. Dispatching just puts orders into that queue, so we can avoid cross-threading issues.

Task.Run() schedules tasks randomly / erratically under high loads

As I am working on processing bulk emails, I have used Task.Run to process those emails asynchronously without affecting the primary work. (Basically, the email-sending functionality should work on a different thread than the primary thread.) Imagine you're processing more than 1K emails per 30 seconds in a Windows Service.
The issue I am facing is that many times the Task method is not executed; it behaves completely randomly. Technically, it schedules the task randomly. Sometimes I receive the call in the SendEmail method and sometimes not. I have tried both of the approaches mentioned below.
Method 1
public void ProcessMails()
{
    Task.Run(() => SendEmail(emailModel));
}
Method 2
public async void ProcessMails()
{
    // here the SendEmail method is awaitable, but I have not used 'await' because
    // I need a non-blocking operation on the main thread
    SendEmail(emailModel);
}
Would anybody please let me know what could be the problem OR I am missing anything here?
As has been noted, it seems as though you're running out of resources to schedule the tasks that ultimately send your emails. Right now the sample code tries to force-feed all the work that needs to get scheduled immediately.
The other answer provided suggests using a blocking collection, but I think there is a cleaner, easier way. This sample should at least give you the right idea.
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

namespace ClassLibrary1
{
    public class MailHandler
    {
        private EmailLibrary emailLibrary = new EmailLibrary();

        private ExecutionDataflowBlockOptions options = new ExecutionDataflowBlockOptions()
        {
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        private ActionBlock<string> messageHandler;

        public MailHandler() => messageHandler = new ActionBlock<string>(msg => DelegateSendEmail(msg), options);

        public Task ProcessMail(string message) => messageHandler.SendAsync(message);

        private Task DelegateSendEmail(string message) => emailLibrary.SendEmail(message);
    }

    public class EmailLibrary
    {
        public Task SendEmail(string message) => Task.Delay(1000);
    }
}
Given the fairly high frequency of sending email, it is likely that you are scheduling too many Tasks for the scheduler.
In Method 1, calling Task.Run will create a new task each time, each of which needs to be scheduled on a thread. It is quite likely that you are exhausting your thread pool by doing this.
Although Method 2 will be less Task-hungry, even with unawaited Task invocation (fire and forget), the completion of the continuation after the async method will still need to be scheduled on the thread pool, which will adversely affect your system.
Instead of unawaited Tasks or Task.Run, and since you are in a Windows Service, I would instead have a long-running background thread dedicated to sending emails. This thread can work independently of your primary work, and emails can be scheduled to it via a queue.
If a single mail-sending thread is insufficient to keep pace with the mails, you can extend the number of EmailSender threads, but constrain this to a reasonable, finite number.
You should explore other optimizations too, which again will improve the throughput of your email sender, e.g.:
Can the email senders keep long-lived connections to the mail server?
Does the mail server accept batches of email?
Here's an example using BlockingCollection with a backing ConcurrentQueue of your email message Model.
Creating a queue which is shared between the producer "PrimaryWork" thread and the "EmailConsumer" thread (obviously, if you have an IoC container, it's best registered there)
Enqueuing mails on the primary work thread
The consumer EmailSender runs a loop on the blocking collection queue until CompleteAdding is called
I've used a TaskCompletionSource to provide a Task which will complete once all messages have been sent, i.e. so that graceful exit is possible without losing emails still in the queue.
public class PrimaryWork
{
    private readonly BlockingCollection<EmailModel> _enqueuer;

    public PrimaryWork(BlockingCollection<EmailModel> enqueuer)
    {
        _enqueuer = enqueuer;
    }

    public void DoWork()
    {
        // ... do your work
        for (var i = 0; i < 100; i++)
        {
            EnqueueEmail(new EmailModel
            {
                To = $"recipient{i}@foo.com",
                Message = $"Message {i}"
            });
        }
    }

    // i.e. queue work for the email sender
    private void EnqueueEmail(EmailModel message)
    {
        _enqueuer.Add(message);
    }
}

public class EmailSender
{
    private readonly BlockingCollection<EmailModel> _mailQueue;
    private readonly TaskCompletionSource<string> _tcsIsCompleted
        = new TaskCompletionSource<string>();

    public EmailSender(BlockingCollection<EmailModel> mailQueue)
    {
        _mailQueue = mailQueue;
    }

    public void Start()
    {
        Task.Run(() =>
        {
            try
            {
                while (!_mailQueue.IsCompleted)
                {
                    var nextMessage = _mailQueue.Take();
                    SendEmail(nextMessage).Wait();
                }
                _tcsIsCompleted.SetResult("ok");
            }
            catch (Exception)
            {
                _tcsIsCompleted.SetResult("fail");
            }
        });
    }

    public async Task Stop()
    {
        _mailQueue.CompleteAdding();
        await _tcsIsCompleted.Task;
    }

    private async Task SendEmail(EmailModel message)
    {
        // IO-bound work to the email server goes here ...
    }
}
Example of bootstrapping and starting the above producer / consumer classes:
public static async Task Main(string[] args)
{
    var theQueue = new BlockingCollection<EmailModel>(new ConcurrentQueue<EmailModel>());
    var primaryWork = new PrimaryWork(theQueue);
    var mailSender = new EmailSender(theQueue);

    mailSender.Start();
    primaryWork.DoWork();

    // Wait for all mails to be sent
    await mailSender.Stop();
}
I've put a complete sample up on Bitbucket here
Other Notes
The blocking collection (and the backing ConcurrentQueue) are thread-safe, so you can concurrently use any number of producer and consumer threads.
As per above, batching is encouraged, and asynchronous parallelism is possible (since each mail sender uses a thread, Task.WaitAll(tasks) will wait for a batch of tasks). A totally asynchronous sender could obviously use await Task.WhenAll(tasks), as sketched below.
As per the comments below, I believe the nature of your system (i.e. a Windows Service handling 2k messages / minute) warrants at least one dedicated thread for email sending, despite emailing likely being inherently IO-bound.
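For illustration, a minimal sketch of such a batched, fully asynchronous send (assuming SendEmail returns a Task as in the classes above, and using System.Linq):
// Hypothetical batch send: start a page of sends concurrently, then await them all.
private async Task SendBatchAsync(IEnumerable<EmailModel> batch)
{
    var tasks = batch.Select(m => SendEmail(m)).ToList(); // start all sends
    await Task.WhenAll(tasks);                            // complete when every send finishes
}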

Threadsafe FIFO Queue/Buffer

I need to implement a sort of task buffer. Basic requirements are:
Process tasks in a single background thread
Receive tasks from multiple threads
Process ALL received tasks i.e. make sure buffer is drained of buffered tasks after a stop signal is received
Order of tasks received per thread must be maintained
I was thinking of implementing it using a Queue like below. Would appreciate feedback on the implementation. Are there any other brighter ideas to implement such a thing?
public class TestBuffer
{
    private readonly object queueLock = new object();
    private Queue<Task> queue = new Queue<Task>();
    private bool running = false;

    public TestBuffer()
    {
    }

    public void start()
    {
        Thread t = new Thread(new ThreadStart(run));
        t.Start();
    }

    private void run()
    {
        running = true;
        bool run = true;
        while (run)
        {
            Task task = null;
            // Lock queue before doing anything
            lock (queueLock)
            {
                // If the queue is currently empty and it is still running
                // we need to wait until we're told something changed
                if (queue.Count == 0 && running)
                {
                    Monitor.Wait(queueLock);
                }
                // Check there is something in the queue
                // Note - there might not be anything in the queue if we were waiting for something to change and the queue was stopped
                if (queue.Count > 0)
                {
                    task = queue.Dequeue();
                }
            }
            // If something was dequeued, handle it
            if (task != null)
            {
                handle(task);
            }
            // Lock the queue again and check whether we need to run again
            // Note - make sure we drain the queue even if we are told to stop before it is empty
            lock (queueLock)
            {
                run = queue.Count > 0 || running;
            }
        }
    }

    public void enqueue(Task toEnqueue)
    {
        lock (queueLock)
        {
            queue.Enqueue(toEnqueue);
            Monitor.PulseAll(queueLock);
        }
    }

    public void stop()
    {
        lock (queueLock)
        {
            running = false;
            Monitor.PulseAll(queueLock);
        }
    }

    public void handle(Task dequeued)
    {
        dequeued.execute();
    }
}
You can actually handle this with the out-of-the-box BlockingCollection.
It is designed to have 1 or more producers, and 1 or more consumers. In your case, you would have multiple producers and one consumer.
When you receive a stop signal, have that signal handler
Signal producer threads to stop
Call CompleteAdding on the BlockingCollection instance
The consumer thread will continue to run until all queued items are removed and processed, then it will encounter the condition that the BlockingCollection is complete. When the thread encounters that condition, it just exits.
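A minimal sketch of that shutdown sequence (the MyTask type and method names are illustrative):
// Consumer: runs until the collection is marked complete AND fully drained.
void ConsumeAll(BlockingCollection<MyTask> buffer)
{
    foreach (var task in buffer.GetConsumingEnumerable())
        task.Execute(); // items added before CompleteAdding are still processed

    // the loop only ends once buffer.IsCompleted is true
}

// Stop-signal handler: stop producers, then let the consumer drain.
void OnStopSignal(BlockingCollection<MyTask> buffer)
{
    buffer.CompleteAdding(); // no new items accepted; the enumeration drains the rest
}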
You should think about ConcurrentQueue, which is in fact FIFO. If it is not suitable, try some of its relatives in Thread-Safe Collections. By using these you can avoid some risks.
I suggest you take a look at TPL DataFlow. BufferBlock is what you're looking for, but it offers so much more.
Look at my lightweight implementation of a threadsafe FIFO queue; it's a non-blocking synchronisation tool that uses the thread pool - better than creating your own threads in most cases, and better than using blocking sync tools such as locks and mutexes. https://github.com/Gentlee/SerialQueue
Usage:
var queue = new SerialQueue();
var result = await queue.Enqueue(() => /* code to synchronize */);
You could use Rx on .NET 3.5 for this. It might have never come out of RC, but I believe it is stable* and in use by many production systems. If you don't need Subject you might find primitives (like concurrent collections) for .NET 3.5 you can use that didn't ship with the .NET Framework until 4.0.
Alternative to Rx (Reactive Extensions) for .net 3.5
* - Nit picker's corner: Except for maybe advanced time windowing, which is out of scope, but buffers (by count and time), ordering, and schedulers are all stable.

Multi-threaded application interaction with logger thread

Here I am again with questions about multi-threading and an exercise from my Concurrent Programming class.
I have a multi-threaded server - implemented using .NET Asynchronous Programming Model - with GET (download) and PUT (upload) file services. This part is done and tested.
It happens that the statement of the problem says this server must have logging activity with minimum impact on the server response time, and it should be supported by a low-priority thread - the logger thread - created for this purpose. All logging messages shall be passed by the threads that produce them to this logger thread, using a communication mechanism that may not lock the thread that invokes it (besides the locking necessary to ensure mutual exclusion), and assuming that some logging messages may be ignored.
Here is my current solution; please help validate whether it stands as a solution to the stated problem:
using System;
using System.IO;
using System.Threading;

// Multi-threaded Logger
public class Logger {
    // text writer to use as logging output
    protected readonly TextWriter _output;
    // logger thread
    protected Thread _loggerThread;
    // logger thread wait timeout
    protected int _timeOut = 500; // 500 ms
    // number of attended log requests
    protected volatile int reqNr = 0;
    // logging queue
    protected readonly object[] _queue;

    protected struct LogObj {
        public DateTime _start;
        public string _msg;

        public LogObj(string msg) {
            _start = DateTime.Now;
            _msg = msg;
        }

        public LogObj(DateTime start, string msg) {
            _start = start;
            _msg = msg;
        }

        public override string ToString() {
            return String.Format("{0}: {1}", _start, _msg);
        }
    }

    public Logger(int dimension, TextWriter output) {
        // initialize queue with parameterized dimension
        this._queue = new object[dimension];
        // initialize logging output
        this._output = output;
        // initialize logger thread
        Start();
    }

    public Logger() {
        // initialize queue with 10 positions
        this._queue = new object[10];
        // initialize logging output to use console output
        this._output = Console.Out;
        // initialize logger thread
        Start();
    }

    public void Log(string msg) {
        lock (this) {
            for (int i = 0; i < _queue.Length; i++) {
                // seek the first available position in the queue
                if (_queue[i] == null) {
                    // insert pending log into queue position
                    _queue[i] = new LogObj(DateTime.Now, msg);
                    // notify logger thread of a pending log in the queue
                    Monitor.Pulse(this);
                    break;
                }
                // if there aren't any available positions in the logging queue, this
                // log is not considered and the thread returns
            }
        }
    }

    public void GetLog() {
        lock (this) {
            while (true) {
                for (int i = 0; i < _queue.Length; i++) {
                    // seek all occupied positions in the queue (those that have logs)
                    if (_queue[i] != null) {
                        // log
                        LogObj obj = (LogObj)_queue[i];
                        // make this position available
                        _queue[i] = null;
                        // print log to the output stream
                        _output.WriteLine(String.Format("[Thread #{0} | {1}ms] {2}",
                            Thread.CurrentThread.ManagedThreadId,
                            DateTime.Now.Subtract(obj._start).TotalMilliseconds,
                            obj.ToString()));
                    }
                }
                // after printing all pending logs (or if there aren't any),
                // the thread waits until another log arrives
                //Monitor.Wait(this, _timeOut);
                Monitor.Wait(this);
            }
        }
    }

    // Starts logger thread activity
    public void Start() {
        // Create the thread object, passing in the Logger.GetLog method
        // via a ThreadStart delegate. This does not start the thread.
        _loggerThread = new Thread(this.GetLog);
        _loggerThread.Priority = ThreadPriority.Lowest;
        _loggerThread.Start();
    }

    // Stops logger thread activity
    public void Stop() {
        _loggerThread.Abort();
        _loggerThread = null;
    }

    // Increments the number of attended log requests
    public void IncReq() { reqNr++; }
}
Basically, here are the main points of this code:
Start a low-priority thread that loops over the logging queue and prints pending logs to the output. After this, the thread is suspended till a new log arrives;
When a log arrives, the logger thread is awakened and does its work.
Is this solution thread-safe? I have been reading about the producers-consumers problem and its solution algorithm, but in this problem, although I have multiple producers, I only have one reader.
It seems it should work. The producers-consumers pattern shouldn't change greatly in the case of a single consumer. A few nitpicks:
Acquiring a lock may be an expensive operation (as @Vitaliy Lipchinsky says). I'd recommend benchmarking your logger against a naive "write-through" logger and a logger using interlocked operations. Another alternative would be exchanging the existing queue with an empty one in GetLog and leaving the critical section immediately. This way none of the producers will be blocked by long operations in the consumer.
Make LogObj a reference type (class). There's no point in making it a struct since you are boxing it anyway. Or else make the _queue field of type LogObj[] (that's better anyway).
Make your thread a background thread so that it won't prevent your program from closing if Stop isn't called.
Flush your TextWriter. Otherwise you risk losing even those records that managed to fit in the queue (10 items is a bit small, IMHO).
Implement IDisposable and/or a finalizer. Your logger owns a thread and a text writer, and those should be freed (and flushed - see above).
While it appears to be thread-safe, I don't believe it is particularly optimal. I would suggest a solution along these lines.
NOTE: I just read the other responses. What follows is a fairly optimal, optimistic-locking solution based on your own. The major differences are locking on an internal class, minimizing "critical sections", and providing graceful thread termination. If you want to avoid locking altogether, then you can try some of that volatile "non-locking" linked list stuff, as @Vitaliy Lipchinsky suggests.
using System.Collections.Generic;
using System.Linq;
using System.Threading;

...

public class Logger
{
    // BEST PRACTICE: private synchronization object.
    // lock on _syncRoot - you should have one for each critical
    // section - to avoid locking on public 'this' instance
    private readonly object _syncRoot = new object();

    // synchronization device for stopping our log thread.
    // initialized to unsignaled state - when set to signaled
    // we stop!
    private readonly AutoResetEvent _isStopping =
        new AutoResetEvent(false);

    // use a Queue<>, cleaner and less error prone than
    // manipulating an array. btw, check your indexing
    // on your array queue; while starvation will not
    // occur in your full pass, ordering is not preserved
    private readonly Queue<LogObj> _queue = new Queue<LogObj>();

    ...

    public void Log(string message)
    {
        // you want to lock ONLY when absolutely necessary,
        // which in this case is accessing the ONE resource
        // of _queue.
        lock (_syncRoot)
        {
            _queue.Enqueue(new LogObj(DateTime.Now, message));
        }
    }

    public void GetLog()
    {
        // while not stopping
        //
        // NOTE: _loggerThread is polling. to increase poll
        // interval, increase wait period. for a more event-
        // driven approach, consider using another
        // AutoResetEvent at end of loop, and signal it
        // from the Log() method above
        for (; !_isStopping.WaitOne(1);)
        {
            List<LogObj> logs = null;
            // again lock ONLY when you need to. because our log
            // operations may be time-intensive, we do not want
            // to block pessimistically. what we really want is
            // to dequeue all available messages and release the
            // shared resource.
            lock (_syncRoot)
            {
                // copy messages for local scope processing!
                //
                // NOTE: .NET 3.5 extension method. if not available,
                // logs = new List<LogObj>(_queue);
                logs = _queue.ToList();
                // clear the queue for new messages
                _queue.Clear();
                // release!
            }
            foreach (LogObj log in logs)
            {
                // do your thang
                ...
            }
        }
    }
}

...

public void Stop()
{
    // graceful thread termination. give threads a chance!
    _isStopping.Set();
    _loggerThread.Join(100);
    if (_loggerThread.IsAlive)
    {
        _loggerThread.Abort();
    }
    _loggerThread = null;
}
Actually, you ARE introducing locking here. You have locking while pushing a log entry to the queue (the Log method): if 10 threads simultaneously push 10 items into the queue and wake up the logger thread, the 11th thread will wait until the logger thread logs all the items...
If you want something really scalable - implement a lock-free queue (an example idea is below). With a lock-free queue the synchronization mechanism will be really straightforward (you can even use a single wait handle for notifications).
If you can't find a lock-free queue implementation on the web, here is an idea of how to do it:
Use a linked list for the implementation. Each node in the linked list contains a value and a volatile reference to the next node. Therefore, for the enqueue and dequeue operations you can use the Interlocked.CompareExchange method. I hope the idea is clear. If not - let me know and I'll provide more details.
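To make the idea concrete, here is a minimal sketch of such a linked-list queue built on Interlocked.CompareExchange (a simplified Michael-Scott queue; in later .NET versions ConcurrentQueue<T> provides a production-grade equivalent):
using System.Threading;

// Lock-free FIFO queue: a linked list with CAS-based enqueue/dequeue.
public class LockFreeQueue<T>
{
    private sealed class Node
    {
        public T Value;
        public Node Next; // only written via Interlocked, which supplies the fences
    }

    private Node _head; // dummy node; _head.Next is the first real item
    private Node _tail;

    public LockFreeQueue()
    {
        _head = _tail = new Node();
    }

    public void Enqueue(T value)
    {
        var node = new Node { Value = value };
        while (true)
        {
            Node tail = _tail;
            Node next = tail.Next;
            if (next == null)
            {
                // try to link the new node after the current tail
                if (Interlocked.CompareExchange(ref tail.Next, node, null) == null)
                {
                    // success; swing the tail forward (ok if another thread already did)
                    Interlocked.CompareExchange(ref _tail, node, tail);
                    return;
                }
            }
            else
            {
                // tail is lagging behind; help advance it, then retry
                Interlocked.CompareExchange(ref _tail, next, tail);
            }
        }
    }

    public bool TryDequeue(out T value)
    {
        while (true)
        {
            Node head = _head;
            Node next = head.Next;
            if (next == null) { value = default(T); return false; } // empty
            if (Interlocked.CompareExchange(ref _head, next, head) == head)
            {
                value = next.Value; // next becomes the new dummy node
                return true;
            }
        }
    }
}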
I'm just doing a thought experiment here, since I don't have time to actually try code right now, but I think you can do this without locks at all if you're creative.
Have your logging class contain a method that allocates a queue and a semaphore each time it's called (and another that deallocates the queue and semaphore when the thread is done). The threads that want to do logging will call this method when they start. When they want to log, they push the message onto their own queue and set the semaphore. The logger thread has a big loop that runs through the queues and checks the associated semaphores. If the semaphore associated with a queue is greater than zero, then the queue gets popped and the semaphore decremented.
Because you're not attempting to pop things off the queue until after the semaphore is set, and you're not setting the semaphore until after you've pushed things onto the queue, I think this will be safe. According to the MSDN documentation for the Queue class, if you enumerate the queue while another thread modifies the collection, an exception is thrown. Catch that exception and you should be good.
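A rough sketch of that thought experiment; note it swaps the plain Queue for ConcurrentQueue so the producer/consumer handoff is safe without relying on catching exceptions (all names are hypothetical):
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Each producer registers its own channel; the logger thread loops over them all.
public sealed class ProducerChannel
{
    public readonly ConcurrentQueue<string> Queue = new ConcurrentQueue<string>();
    public readonly SemaphoreSlim Pending = new SemaphoreSlim(0);

    public void Log(string message)
    {
        Queue.Enqueue(message);
        Pending.Release(); // "set the semaphore" only after pushing
    }
}

// Logger thread: the big loop over all channels, draining any with a non-zero count.
void LoggerLoop(IReadOnlyList<ProducerChannel> channels, TextWriter output)
{
    while (true)
    {
        foreach (var channel in channels)
        {
            while (channel.Pending.Wait(0)) // count > 0 guarantees an item is present
            {
                if (channel.Queue.TryDequeue(out var message))
                    output.WriteLine(message);
            }
        }
        Thread.Sleep(1); // crude pacing for the sketch
    }
}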
