How to enqueue and dequeue a lot at the same time? - C#

I have a simple scenario with two threads: the first thread permanently reads some data and enqueues it into a queue. The second thread first peeks at a single object from that queue and performs some conditional checks. If these pass, the single object is dequeued and handed off for processing.
I have tried the ConcurrentQueue, which is a thread-safe implementation of a simple queue, but the problem with this one is that all calls are blocking. This means that if the first thread is enqueuing an object, the second thread can't peek or dequeue an object.
In my situation I need to enqueue at the end and dequeue from the beginning of the queue at the same time.
The lock statement of C# would also block.
So my question is whether it is possible to perform both operations in parallel, without blocking each other, in a thread-safe way.
These are my first attempts; here is a simplified example of my problem.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

namespace Scenario {
    public class Program {
        public static void Main(string[] args) {
            Scenario scenario = new Scenario();
            scenario.Start();
            Console.ReadKey();
        }

        public class Scenario {
            public Scenario() {
                someData = new Queue<int>();
            }

            public void Start() {
                Task.Factory.StartNew(firstThread);
                Task.Factory.StartNew(secondThread);
            }

            private void firstThread() {
                Random random = new Random();
                while (true) {
                    int newData = random.Next(1, 100);
                    someData.Enqueue(newData);
                    Console.WriteLine("Enqueued " + newData);
                }
            }

            private void secondThread() {
                Random random = new Random();
                while (true) {
                    if (someData.Count == 0) {
                        continue;
                    }
                    int singleData = someData.Peek();
                    int someValue = random.Next(1, 100);
                    if (singleData > someValue || singleData == 1 || singleData == 99) {
                        singleData = someData.Dequeue();
                        Console.WriteLine("Dequeued " + singleData);
                        // ... processing ...
                    }
                }
            }

            private readonly Queue<int> someData;
        }
    }
}
Second example:
public class Scenario {
    public Scenario() {
        someData = new ConcurrentQueue<int>();
    }

    public void Start() {
        Task.Factory.StartNew(firstThread);
        Task.Factory.StartNew(secondThread);
    }

    private void firstThread() {
        Random random = new Random();
        while (true) {
            int newData = random.Next(1, 100);
            someData.Enqueue(newData);
            lock (syncRoot) { Console.WriteLine($"Enqued {enqued++} Dequed {dequed}"); }
        }
    }

    private void secondThread() {
        Random random = new Random();
        while (true) {
            if (!someData.TryPeek(out int singleData)) {
                continue;
            }
            int someValue = random.Next(1, 100);
            if (singleData > someValue || singleData == 1 || singleData == 99) {
                if (!someData.TryDequeue(out singleData)) {
                    continue;
                }
                lock (syncRoot) { Console.WriteLine($"Enqued {enqued} Dequed {dequed++}"); }
                // ... processing ...
            }
        }
    }

    private int enqued = 0;
    private int dequed = 0;
    private readonly ConcurrentQueue<int> someData;
    private static readonly object syncRoot = new object();
}

First off: I strongly encourage you to reconsider whether your technique of having multiple threads and a shared memory data structure is even the right approach at all. Code that has multiple threads of control sharing access to data structures is hard to get right, and failures can be subtle, catastrophic, and hard to debug.
Second: If you are bent upon multiple threads and a shared memory data structure, I strongly encourage you to use designed-by-experts data types like concurrent queues, rather than rolling your own.
Now that I've got those warnings out of the way: here is a way to address your concern. It is sufficiently complicated that you should obtain the services of an expert on the C# memory model to verify the correctness of your solution if you go with this. I would not consider myself competent to implement the scheme I'm about to describe, not without the help of someone who is actually an expert on the memory model.
The goal is to have a queue that supports simultaneous enqueue and dequeue operations and low lock contention.
What you want is two immutable stack variables called the enqueue stack and the dequeue stack, each with their own lock.
The enqueue operation is:
Take the enqueue lock
Push the item onto the enqueue stack; this produces a new stack in O(1) time.
Assign the newly produced stack to the enqueue stack variable.
Release the enqueue lock
The dequeue operation is:
Take the dequeue lock
If the dequeue stack is empty then
take the enqueue lock
enumerate the enqueue stack and use it to build the dequeue stack; this reverses the enqueue stack, which maintains the property we want: that the first in is the first out.
assign an empty immutable stack to the enqueue stack variable
release the enqueue lock
assign the new stack to the dequeue stack
If the dequeue stack is empty, throw, or abandon and retry later, or sleep until signaled by the enqueue operation, or whatever the right thing to do here is.
At this point the dequeue stack is not empty.
Pop an item from the dequeue stack, which produces a new stack in O(1).
Assign the new stack to the dequeue stack variable.
Release the dequeue lock.
Process the item.
Note that of course if there is only one thread dequeuing, then we don't need the dequeue lock at all, but with this scheme there can be many threads dequeuing.
Suppose there are 1000 items on the enqueue stack and zero on the dequeue stack. When we dequeue the first time, we do an expensive O(n) operation of reversing the enqueue stack once, but now we have 1000 items on the dequeue stack. Once the dequeue stack is big, the dequeueing thread can spend most of its time processing, while the enqueuing thread spends most of its time enqueuing. Contention on the enqueue lock is rare, but expensive when it happens.
Why use immutable data structures? Everything I described here would also work with mutable stacks, but (1) it is easier to reason about immutable stacks, (2) if you want to really live dangerously you can elide some of the locks and go for interlocked swap operations; make sure you understand everything about the possible re-orderings of operations in low-lock conditions if you're doing that.
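To make the recipe concrete, here is a rough sketch of it using ImmutableStack<T> from System.Collections.Immutable (the class and member names are mine, and per the warning above, treat it as an illustration to be reviewed by an expert, not production code):

using System.Collections.Immutable;

public class SimultaneousQueue<T>
{
    private readonly object _enqueueLock = new object();
    private readonly object _dequeueLock = new object();
    private ImmutableStack<T> _enqueueStack = ImmutableStack<T>.Empty;
    private ImmutableStack<T> _dequeueStack = ImmutableStack<T>.Empty;

    public void Enqueue(T item)
    {
        lock (_enqueueLock)
        {
            // pushing onto an immutable stack produces a new stack in O(1)
            _enqueueStack = _enqueueStack.Push(item);
        }
    }

    public bool TryDequeue(out T item)
    {
        lock (_dequeueLock)
        {
            if (_dequeueStack.IsEmpty)
            {
                lock (_enqueueLock)
                {
                    // reverse the enqueue stack to restore first-in-first-out order
                    var reversed = ImmutableStack<T>.Empty;
                    foreach (T t in _enqueueStack)
                        reversed = reversed.Push(t);
                    _enqueueStack = ImmutableStack<T>.Empty;
                    _dequeueStack = reversed;
                }
            }

            if (_dequeueStack.IsEmpty)
            {
                item = default(T);
                return false; // or throw, retry, or wait, as discussed above
            }

            // popping produces a new stack in O(1)
            _dequeueStack = _dequeueStack.Pop(out item);
            return true;
        }
    }
}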
UPDATE:
The real problem is that I can't dequeue and process a lot of points, because I am permanently reading and enqueuing new points. Those enqueue calls are blocking the processing step.
Well if that is your real problem then mentioning it in the question instead of burying it in a comment would be a good idea. Help us help you.
There are a number of things you could do here. You could for example set the priority of the enqueuing thread lower than the priority of the dequeuing thread. Or you could have multiple dequeuing threads, as many as there are CPUs in your machine. Or you could dynamically choose to drop some enqueue operations if the dequeues are not keeping up. Without knowing a lot more about your actual problem it is hard to give advice on how to solve it.
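For illustration, a minimal sketch of the "multiple dequeuing threads" option, with one consumer task per CPU draining a shared ConcurrentQueue<int> (the Process method is a hypothetical stand-in for your processing step, and the busy loop could be replaced by a BlockingCollection to avoid spinning):

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class MultiConsumer
{
    private static readonly ConcurrentQueue<int> queue = new ConcurrentQueue<int>();

    public static void StartConsumers()
    {
        // one dequeuing task per CPU, all draining the same queue
        for (int i = 0; i < Environment.ProcessorCount; i++)
        {
            Task.Factory.StartNew(() =>
            {
                while (true)
                {
                    if (queue.TryDequeue(out int item))
                        Process(item); // hypothetical processing step
                }
            }, TaskCreationOptions.LongRunning);
        }
    }

    private static void Process(int item) { /* ... */ }
}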

Batch process all items in ConcurrentBag

I have the following use case. Multiple threads are creating data points which are collected in a ConcurrentBag. Every x ms a single consumer thread looks at the data points that came in since the last time and processes them (e.g. count them + calculate average).
The following code more or less represents the solution that I came up with:
private static ConcurrentBag<long> _bag = new ConcurrentBag<long>();

static void Main()
{
    Task.Run(() => Consume());
    var producerTasks = Enumerable.Range(0, 8).Select(i => Task.Run(() => Produce()));
    Task.WaitAll(producerTasks.ToArray());
}

private static void Produce()
{
    for (int i = 0; i < 100000000; i++)
    {
        _bag.Add(i);
    }
}

private static void Consume()
{
    while (true)
    {
        var oldBag = _bag;
        _bag = new ConcurrentBag<long>();
        var average = oldBag.DefaultIfEmpty().Average();
        var count = oldBag.Count;
        Console.WriteLine($"Avg = {average}, Count = {count}");
        // Wait x ms
    }
}
Is a ConcurrentBag the right tool for the job here?
Is switching the bags the right way to achieve clearing the list for new data points and then processing the old ones?
Is it safe to operate on oldBag or could I run into trouble when I iterate over oldBag and a thread is still adding an item?
Should I use Interlocked.Exchange() for switching the variables?
EDIT
I guess the above code was not really a good representation of what I'm trying to achieve. So here is some more code to show the problem:
public class LogCollectorTarget : TargetWithLayout, ILogCollector
{
private readonly List<string> _logMessageBuffer;
public LogCollectorTarget()
{
_logMessageBuffer = new List<string>();
}
protected override void Write(LogEventInfo logEvent)
{
var logMessage = Layout.Render(logEvent);
lock (_logMessageBuffer)
{
_logMessageBuffer.Add(logMessage);
}
}
public string GetBuffer()
{
lock (_logMessageBuffer)
{
var messages = string.Join(Environment.NewLine, _logMessageBuffer);
_logMessageBuffer.Clear();
return messages;
}
}
}
The class's purpose is to collect logs so they can be sent to a server in batches. Every x seconds GetBuffer is called. This should get the current log messages and clear the buffer for new messages. It works with locks, but as they are quite expensive I don't want to lock on every logging operation in my program. That's why I wanted to use a ConcurrentBag as a buffer. But then I still need to switch or clear it when I call GetBuffer, without losing any log messages that arrive during the switch.
Since you have a single consumer, you can get by with a simple ConcurrentQueue, without swapping collections:
public class LogCollectorTarget : TargetWithLayout, ILogCollector
{
    private readonly ConcurrentQueue<string> _logMessageBuffer;

    public LogCollectorTarget()
    {
        _logMessageBuffer = new ConcurrentQueue<string>();
    }

    protected override void Write(LogEventInfo logEvent)
    {
        var logMessage = Layout.Render(logEvent);
        _logMessageBuffer.Enqueue(logMessage);
    }

    public string GetBuffer()
    {
        // How many messages should we dequeue?
        var count = _logMessageBuffer.Count;
        var messages = new StringBuilder();
        while (count > 0 && _logMessageBuffer.TryDequeue(out var message))
        {
            messages.AppendLine(message);
            count--;
        }
        return messages.ToString();
    }
}
If memory allocations become an issue, you can instead dequeue them to a fixed-size array and call string.Join on it. This way, you're guaranteed to do only two allocations (whereas the StringBuilder could do many more if the initial buffer isn't properly sized):
public string GetBuffer()
{
    // How many messages should we dequeue?
    var count = _logMessageBuffer.Count;
    var buffer = new string[count];
    for (int i = 0; i < count; i++)
    {
        _logMessageBuffer.TryDequeue(out var message);
        buffer[i] = message;
    }
    return string.Join(Environment.NewLine, buffer);
}
Is a ConcurrentBag the right tool for the job here?
It's the right tool for a job; whether it's right here really depends on what you are trying to do, and why. The example you have given is very simplistic and without any context, so it's hard to tell.
Is switching the bags the right way to achieve clearing the list for new data points and then processing the old ones?
The answer is no, for probably many reasons. What happens if a thread writes to it while you are switching it?
Is it safe to operate on oldBag or could I run into trouble when I iterate over oldBag and a thread is still adding an item?
No. You have just copied the reference; this will achieve nothing.
Should I use Interlocked.Exchange() for switching the variables?
Interlocked methods are great things, but they will not help you with your current problem; they are for thread-safe access to simple values. You seem confused here, and you need to look at more thread-safe examples.
However, let's point you in the right direction: forget about ConcurrentBag and those fancy classes. My advice is to start simple and use locking, so that you understand the nature of the problem.
If you want multiple tasks/threads to access a list, you can easily use the lock statement to guard access to the list/array so other threads aren't modifying it.
Obviously the code you have written is a nonsensical example: you are just adding consecutive numbers to a list and getting another thread to average them. This hardly needs to be producer/consumer at all, and would make more sense as synchronous code.
At this point I would point you to better architectures that would let you implement this pattern, e.g. TPL Dataflow, but I fear this is just a learning exercise, and unfortunately you really need to do more reading on multithreading and try more examples before we can truly help you with a problem.
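For reference, the heart of a TPL Dataflow version can be this small (a sketch assuming the System.Threading.Tasks.Dataflow package; the block replaces both the bag and the hand-written consumer loop):

using System;
using System.Threading.Tasks.Dataflow;

// producers post values; the block processes them on its own task(s)
var block = new ActionBlock<long>(value =>
{
    // your consume/aggregate logic goes here
    Console.WriteLine($"Got {value}");
});

block.Post(42);          // called from any producer thread
block.Complete();        // signal that no more input is coming
block.Completion.Wait(); // wait for pending items to drain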
It works with locks, but as they are quite expensive I don't want to lock on every logging operation in my program.
Acquiring an uncontended lock is actually quite cheap. Quoting from Joseph Albahari's book:
You can expect to acquire and release a lock in as little as 20 nanoseconds on a 2010-era computer if the lock is uncontended.
Locking becomes expensive when it is contended. You can minimize the contention by reducing the work inside the critical region to the absolute minimum. In other words don't do anything inside the lock that can be done outside the lock. In your second example the method GetBuffer does a String.Join inside the lock, delaying the release of the lock and increasing the chances of blocking other threads. You can improve it like this:
public string GetBuffer()
{
    string[] messages;
    lock (_logMessageBuffer)
    {
        messages = _logMessageBuffer.ToArray();
        _logMessageBuffer.Clear();
    }
    return String.Join(Environment.NewLine, messages);
}
But it can be optimized even further. You could use the technique of your first example, and instead of clearing the existing List<string>, just swap it with a new list:
public string GetBuffer()
{
    List<string> oldList;
    lock (_logMessageBuffer)
    {
        oldList = _logMessageBuffer;
        _logMessageBuffer = new(); // note: the field can no longer be declared readonly
    }
    return String.Join(Environment.NewLine, oldList);
}
Starting from .NET Core 3.0, the Monitor class has the property Monitor.LockContentionCount, that returns the number of times there was contention at the entry point of a lock. You could watch the delta of this property every second, and see if the number is concerning. If you get single-digit numbers, there is nothing to worry about.
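A minimal sketch of such a watcher (the one-second sampling loop is mine):

using System;
using System.Threading;

// Print the number of lock contentions per second (.NET Core 3.0+).
class ContentionWatcher
{
    static void Main()
    {
        long previous = Monitor.LockContentionCount;
        while (true)
        {
            Thread.Sleep(1000);
            long current = Monitor.LockContentionCount;
            Console.WriteLine($"Lock contentions in the last second: {current - previous}");
            previous = current;
        }
    }
}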
Touching some of your questions:
Is a ConcurrentBag the right tool for the job here?
No. The ConcurrentBag<T> is a very specialized collection, intended for mixed producer-consumer scenarios, mainly object pools. You don't have such a scenario here. A ConcurrentQueue<T> is preferable to a ConcurrentBag<T> in almost all scenarios.
Should I use Interlocked.Exchange() for switching the variables?
Only if the collection was immutable. If the _logMessageBuffer was an ImmutableQueue<T>, then it would be excellent to swap it with Interlocked.Exchange. With mutable types you have no idea if the old collection is still in use by another thread, and for how long. The operating system can suspend any thread at any time for a duration of 10-30 milliseconds or even more (demo). So it's not safe to use lock-free techniques. You have to lock.
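To illustrate, a minimal sketch of that immutable swap (the class and names are mine; ImmutableQueue<T> lives in System.Collections.Immutable):

using System;
using System.Collections.Immutable;
using System.Threading;

public class ImmutableLogBuffer
{
    // the whole queue is replaced atomically, never mutated in place
    private ImmutableQueue<string> _buffer = ImmutableQueue<string>.Empty;

    public void Write(string message)
    {
        // lock-free append: retry until our snapshot wasn't replaced meanwhile
        ImmutableQueue<string> snapshot, updated;
        do
        {
            snapshot = _buffer;
            updated = snapshot.Enqueue(message);
        }
        while (Interlocked.CompareExchange(ref _buffer, updated, snapshot) != snapshot);
    }

    public string GetBuffer()
    {
        // atomically grab everything logged so far and reset the buffer
        var old = Interlocked.Exchange(ref _buffer, ImmutableQueue<string>.Empty);
        return string.Join(Environment.NewLine, old);
    }
}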

Start threads in the order they were requested, only when the previous thread has finished

Sorry for the confusing title, but that's basically what I need. I could do something with global variables, but that would only be viable for two threads that are requested one after the other.
Here is some pseudocode that might explain it better.
/* Async function that gets requests from a server */
if () // received request from server
{
    new Thread(() =>
    {
        // do stuff
        // in the meantime a new thread has been requested from the server,
        // and another one 10 seconds later, etc.
        // wait for this current thread to finish
        // fire up the first thread that was requested while this one was running
        // after the second thread is finished, fire up the third thread that was requested 10 seconds after this one
        // etc...
    }).Start();
}
I don't know when each thread will be requested, as it is based on the server sending info to the client, so I can't use Task.ContinueWith, as it's dynamic.
So Michael suggested that I look into queues, and I came up with this:
static Queue<Action> myQ = new Queue<Action>();

static void Main(string[] args)
{
    new Thread(() =>
    {
        while (1 == 1)
        {
            if (myQ.FirstOrDefault() == null)
                break;
            myQ.FirstOrDefault().Invoke();
        }
    }).Start();

    myQ.Enqueue(() =>
    {
        TestQ("First");
    });
    myQ.Enqueue(() =>
    {
        TestQ("Second");
    });
    Console.ReadLine();
}

private static void TestQ(string s)
{
    Console.WriteLine(s);
    Thread.Sleep(5000);
    myQ.Dequeue();
}
I commented the code; I basically need to check whether the action is first in the queue or not.
EDIT: So I re-made it, and now it works. Surely there is a better way to do this, though, because I can't afford to burn a thread in an infinite loop.
You will have to use a global container for the threads. Maybe check Queues.
This class implements a queue as a circular array. Objects stored in a Queue are inserted at one end and removed from the other.
Queues and stacks are useful when you need temporary storage for information; that is, when you might want to discard an element after retrieving its value. Use Queue if you need to access the information in the same order that it is stored in the collection. Use Stack if you need to access the information in reverse order. Use ConcurrentQueue(Of T) or ConcurrentStack(Of T) if you need to access the collection from multiple threads concurrently.
Three main operations can be performed on a Queue and its elements:
Enqueue adds an element to the end of the Queue.
Dequeue removes the oldest element from the start of the Queue.
Peek returns the oldest element that is at the start of the Queue but does not remove it from the Queue.
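A quick illustration of those three operations:

var queue = new Queue<string>();
queue.Enqueue("first");             // added at the end
queue.Enqueue("second");
Console.WriteLine(queue.Peek());    // "first" - still in the queue
Console.WriteLine(queue.Dequeue()); // "first" - now removed
Console.WriteLine(queue.Count);     // 1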
EDIT (From what you added)
Here is how I would change your example code to implement the infinite loop and keep it under your control.
static Queue<Action> myQ = new Queue<Action>();

static void Main(string[] args)
{
    myQ.Enqueue(() =>
    {
        TestQ("First");
    });
    myQ.Enqueue(() =>
    {
        TestQ("Second");
    });

    Thread thread = new Thread(() =>
    {
        while (true)
        {
            Thread.Sleep(5000);
            if (myQ.Count > 0)
            {
                myQ.Dequeue().Invoke();
            }
        }
    });
    thread.Start();

    // Do other stuff, eventually signaling the loop to stop
    // (e.g. via a flag checked in the loop).
    Console.ReadLine();
}

private static void TestQ(string s)
{
    Console.WriteLine(s);
}
You could put the requests that you receive into a queue if there is a thread currently running. Then, to find out when threads return, they could fire an event. When this event fires, if there is something in the queue, start a new thread to process this new request.
The only thing with this is that you have to be careful about race conditions, since you are essentially communicating between multiple threads.
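A rough sketch of that suggestion (the class and its names are mine, standing in for the "fire an event when done" wiring):

using System;
using System.Collections.Generic;
using System.Threading;

public class SequentialRunner
{
    private readonly Queue<Action> _pending = new Queue<Action>();
    private readonly object _sync = new object();
    private bool _running;

    public void Submit(Action work)
    {
        lock (_sync)
        {
            if (_running)
            {
                _pending.Enqueue(work); // a thread is busy: queue the request
                return;
            }
            _running = true;
        }
        StartThread(work);
    }

    private void StartThread(Action work)
    {
        new Thread(() =>
        {
            work();
            OnThreadFinished(); // "fire an event" when the thread returns
        }).Start();
    }

    private void OnThreadFinished()
    {
        Action next = null;
        lock (_sync)
        {
            if (_pending.Count > 0)
                next = _pending.Dequeue();
            else
                _running = false;
        }
        if (next != null)
            StartThread(next); // start a new thread for the queued request
    }
}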

Consumer/producer pattern using a List

I have a WinForms app with one consumer and one producer task. My producer task periodically connects to a web service and retrieves a specified number of strings, which then need to be placed into some kind of concurrent fixed-size FIFO queue. My consumer task then processes these strings and sends them out as SMS messages (one string per message). The SOAP function that my producer task calls requires a parameter specifying the number of strings that I want to get. This number will be determined by the space available in my queue. So if I have a max queue size of 100 strings and there are 60 strings in the queue the next time my producer polls the web service, I need it to ask for 40 strings, since that's all I can fit in my queue at that moment.
Here's the code that I'm using to represent my fixed-size FIFO queue:
public class FixedSizeQueue<T>
{
    private readonly List<T> queue = new List<T>();
    private readonly object syncObj = new object();

    public int Size { get; private set; }

    public FixedSizeQueue(int size)
    {
        Size = size;
    }

    public void Enqueue(T obj)
    {
        lock (syncObj)
        {
            queue.Insert(0, obj);
            if (queue.Count > Size)
            {
                queue.RemoveRange(Size, queue.Count - Size);
            }
        }
    }

    public T[] Dequeue()
    {
        lock (syncObj)
        {
            var result = queue.ToArray();
            queue.Clear();
            return result;
        }
    }

    public T Peek()
    {
        lock (syncObj)
        {
            var result = queue[0];
            return result;
        }
    }

    public int GetCount()
    {
        lock (syncObj)
        {
            return queue.Count;
        }
    }
}
My producer task doesn't currently specify the number of strings that I need from the web service but it seems like it could be as simple as getting the current item count in my queue (q.GetCount()) and then subtracting it from my max queue size. However, even though GetCount() uses a lock, isn't it possible that as soon as GetCount() exits, my consumer task could process 10 strings in the queue meaning that I'll never actually be able to keep the queue 100% full?
Also, my consumer task basically needs to "peek" at the first string in the queue before trying to send it in an SMS message. In the event that the message can't be sent, I need to leave the string in its original position in the queue. My first thought about accomplishing this is to "peek" at the first string in the queue, try to send it in an SMS message, and then remove it from the queue if the send was successful. This way, if the send fails, the string is still in the queue at its original position. Does that sound reasonable?
This is a broad question, so there really is no definitive answer, but here are my thoughts.
However, even though GetCount() uses a lock, isn't it possible that as soon as GetCount() exits, my consumer task could process 10 strings in the queue meaning that I'll never actually be able to keep the queue 100% full?
Yes, it is, unless you lock on syncObj for the entire duration of your query to the web service. But the point of producer/consumer is to allow the consumer to process items while the producer is fetching more. There's really not much you can do about this; at some point, the queue will not be 100% full. If it always was 100% full then that would mean that the consumer isn't doing anything at all.
This way, if the send fails, the string is still in the queue at its original position. Does that sound reasonable?
Perhaps, but the way you have this coded, a Dequeue() operation returns the entire state of the queue and clears it. Your only option given this interface is to re-queue failed items to be processed later, which is a perfectly reasonable technique.
I would also consider adding a way for the consumer to block itself until there are items to be processed. For example:
public T[] WaitForItemAndDequeue(TimeSpan timeout)
{
    lock (syncObj) {
        if (queue.Count == 0 && !Monitor.Wait(syncObj, timeout)) {
            return null; // Timeout expired
        }
        return Dequeue();
    }
}

public T[] WaitForItem()
{
    lock (syncObj) {
        while (queue.Count == 0) { // wait while the queue is empty
            Monitor.Wait(syncObj);
        }
        return Dequeue();
    }
}
Then you have to change Enqueue() to call Monitor.Pulse(syncObj) after it has manipulated the list (so at the end of the method, but inside of the lock block).
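Applied to the Enqueue() from the question, that change looks like this:

public void Enqueue(T obj)
{
    lock (syncObj)
    {
        queue.Insert(0, obj);
        if (queue.Count > Size)
        {
            queue.RemoveRange(Size, queue.Count - Size);
        }
        Monitor.Pulse(syncObj); // wake up a waiting consumer
    }
}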

C# - Pass data back from ThreadPool thread to main thread

Current implementation: waits until parallelCount values are collected, uses the ThreadPool to process the values, waits until all threads complete, re-collects another set of values, and so on...
Code:
private static int parallelCount = 5;
private int taskIndex;
private object[] paramObjects = new object[parallelCount];

// Each ThreadPool thread should access only one item of the array,
// release object when done, to be used by another thread
private object[] reusableObjects = new object[parallelCount];

// Signals when all work items of the current batch have completed
// (this declaration was missing from the original snippet)
private ManualResetEvent resetEvent = new ManualResetEvent(false);

private void MultiThreadedGenerate(object paramObject)
{
    paramObjects[taskIndex] = paramObject;
    taskIndex++;
    if (taskIndex == parallelCount)
    {
        MultiThreadedGenerate();
        // Reset
        taskIndex = 0;
    }
}

/*
 * Called when 'paramObjects' array gets filled
 */
private void MultiThreadedGenerate()
{
    int remainingToGenerate = paramObjects.Length;
    resetEvent.Reset();
    for (int i = 0; i < paramObjects.Length; i++)
    {
        ThreadPool.QueueUserWorkItem(delegate(object obj)
        {
            try
            {
                int currentIndex = (int) obj;
                Generate(currentIndex, paramObjects[currentIndex], reusableObjects[currentIndex]);
            }
            finally
            {
                if (Interlocked.Decrement(ref remainingToGenerate) == 0)
                {
                    resetEvent.Set();
                }
            }
        }, i);
    }
    resetEvent.WaitOne();
}
I've seen significant performance improvements with this approach, however there are a number of issues to consider:
[1] Collecting values in paramObjects and synchronizing using resetEvent can be avoided, as there is no dependency between the threads (or between the current set of values and the next set of values). I'm only doing this to manage access to reusableObjects (when a set of paramObjects is done processing, I know that all objects in reusableObjects are free, so taskIndex is reset and each new task of the next set of values gets its own unique 'reusableObj' to work with).
[2] There is no real connection between the size of reusableObjects and the number of threads the ThreadPool uses. I might initialize reusableObjects to hold 10 objects, and say, due to some limitation, the ThreadPool can run only 3 threads for my MultiThreadedGenerate() method; then I'm wasting memory.
So, by getting rid of paramObjects, how can the above code be refined so that as soon as one thread completes its job, it returns the taskIndex (or the reusableObj) it used and no longer needs, so that it becomes available to the next value? Also, the code should create a reusableObject and add it to some collection only when there is demand for it. Is using a Queue here a good idea?
Thank you.
There's really no reason to do your own manual threading and task management any more. You could restructure this to a more loosely-coupled model using Task Parallel Library (and possibly System.Collections.Concurrent for result collation).
Performance could be further improved if you don't need to wait for a full complement of work before handing off each Task for processing.
TPL came along in .Net 4.0 but was back-ported to .Net 3.5. Download here.
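For illustration, here is a sketch of that shape (the pool size and the Generate stub are placeholders adapted from the question, not a definitive rewrite):

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class TplGenerator
{
    // pool of reusable objects, rented by whichever task is free next,
    // so the pool size is decoupled from the number of queued work items
    private readonly ConcurrentBag<object> _pool =
        new ConcurrentBag<object>(Enumerable.Range(0, 5).Select(_ => new object()));

    private readonly ConcurrentBag<object> _results = new ConcurrentBag<object>();

    public void GenerateAll(IEnumerable<object> paramObjects)
    {
        // hand each value off as its own task as soon as it arrives,
        // instead of batching parallelCount values behind a reset event
        var tasks = paramObjects.Select(p => Task.Factory.StartNew(() =>
        {
            // rent an object if one is free, otherwise create on demand
            if (!_pool.TryTake(out object reusable))
                reusable = new object();
            try
            {
                _results.Add(Generate(p, reusable)); // adapted from the question's Generate
            }
            finally
            {
                _pool.Add(reusable); // return for the next task
            }
        })).ToArray();

        Task.WaitAll(tasks);
    }

    private object Generate(object paramObject, object reusableObject)
    {
        return paramObject; // stand-in for the real work
    }
}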

Multi-threaded application interaction with logger thread

Here I am again with questions about multi-threading and an exercise from my Concurrent Programming class.
I have a multi-threaded server - implemented using the .NET Asynchronous Programming Model - with GET (download) and PUT (upload) file services. This part is done and tested.
It happens that the statement of the problem says this server must have logging activity with the minimum impact on the server response time, and that it should be supported by a low-priority thread - the logger thread - created for this purpose. All logging messages shall be passed by the threads that produce them to this logger thread, using a communication mechanism that may not lock the thread that invokes it (besides the necessary locking to ensure mutual exclusion), and assuming that some logging messages may be ignored.
Here is my current solution. Please help me validate whether it stands as a solution to the stated problem:
using System;
using System.IO;
using System.Threading;

// Multi-threaded Logger
public class Logger {
    // textwriter to use as logging output
    protected readonly TextWriter _output;
    // logger thread
    protected Thread _loggerThread;
    // logger thread wait timeout
    protected int _timeOut = 500; // 500 ms
    // amount of log requests attended
    protected volatile int reqNr = 0;
    // logging queue
    protected readonly object[] _queue;

    protected struct LogObj {
        public DateTime _start;
        public string _msg;

        public LogObj(string msg) {
            _start = DateTime.Now;
            _msg = msg;
        }

        public LogObj(DateTime start, string msg) {
            _start = start;
            _msg = msg;
        }

        public override string ToString() {
            return String.Format("{0}: {1}", _start, _msg);
        }
    }

    public Logger(int dimension, TextWriter output) {
        // initialize queue with parameterized dimension
        this._queue = new object[dimension];
        // initialize logging output
        this._output = output;
        // initialize logger thread
        Start();
    }

    public Logger() {
        // initialize queue with 10 positions
        this._queue = new object[10];
        // initialize logging output to use console output
        this._output = Console.Out;
        // initialize logger thread
        Start();
    }

    public void Log(string msg) {
        lock (this) {
            for (int i = 0; i < _queue.Length; i++) {
                // seek the first available position in the queue
                if (_queue[i] == null) {
                    // insert pending log into queue position
                    _queue[i] = new LogObj(DateTime.Now, msg);
                    // notify logger thread of a pending log in the queue
                    Monitor.Pulse(this);
                    break;
                }
                // if there aren't any available positions in the logging queue,
                // this log is not considered and the thread returns
            }
        }
    }

    public void GetLog() {
        lock (this) {
            while (true) {
                for (int i = 0; i < _queue.Length; i++) {
                    // seek all occupied positions in the queue (those that have logs)
                    if (_queue[i] != null) {
                        // log
                        LogObj obj = (LogObj)_queue[i];
                        // make this position available
                        _queue[i] = null;
                        // print log to the output stream
                        _output.WriteLine(String.Format("[Thread #{0} | {1}ms] {2}",
                            Thread.CurrentThread.ManagedThreadId,
                            DateTime.Now.Subtract(obj._start).TotalMilliseconds,
                            obj.ToString()));
                    }
                }
                // after printing all pending logs (or if there aren't any),
                // the thread waits until another log arrives
                //Monitor.Wait(this, _timeOut);
                Monitor.Wait(this);
            }
        }
    }

    // Starts logger thread activity
    public void Start() {
        // Create the thread object, passing in the Logger.GetLog method
        // via a ThreadStart delegate. This does not start the thread.
        _loggerThread = new Thread(this.GetLog);
        _loggerThread.Priority = ThreadPriority.Lowest;
        _loggerThread.Start();
    }

    // Stops logger thread activity
    public void Stop() {
        _loggerThread.Abort();
        _loggerThread = null;
    }

    // Increments number of attended log requests
    public void IncReq() { reqNr++; }
}
Basically, here are the main points of this code:
Start a low-priority thread that loops over the logging queue and prints pending logs to the output. After this, the thread is suspended until a new log arrives;
When a log arrives, the logger thread is awakened and does its work.
Is this solution thread-safe? I have been reading about the producers-consumers problem and its solution algorithm, but in this problem, although I have multiple producers, I only have one reader.
It seems it should work. Producers-consumers shouldn't change greatly in the case of a single consumer. A few nitpicks:
Acquiring a lock may be an expensive operation (as @Vitaliy Lipchinsky says). I'd recommend benchmarking your logger against a naive 'write-through' logger and a logger using interlocked operations. Another alternative would be exchanging the existing queue with an empty one in GetLog and leaving the critical section immediately. This way none of the producers will be blocked by long operations in the consumer.
Make LogObj a reference type (class). There's no point in making it a struct, since you are boxing it anyway. Or else make the _queue field of type LogObj[] (that's better anyway).
Make your thread a background thread, so that it won't prevent your program from closing if Stop isn't called.
Flush your TextWriter; otherwise you risk losing even those records that managed to fit into the queue (10 items is a bit small, IMHO).
Implement IDisposable and/or a finalizer. Your logger owns a thread and a text writer, and those should be freed (and flushed; see above).
While it appears to be thread-safe, I don't believe it is particularly optimal. I would suggest a solution along these lines
NOTE: just read the other responses. What follows is a fairly optimal, optimistic locking solution based on your own. The major differences are locking on an internal object, minimizing 'critical sections', and providing graceful thread termination. If you want to avoid locking altogether, then you can try some of that volatile "non-locking" linked list stuff that @Vitaliy Lipchinsky suggests.
using System.Collections.Generic;
using System.Linq;
using System.Threading;
...

public class Logger
{
    // BEST PRACTICE: private synchronization object.
    // lock on _syncRoot - you should have one for each critical
    // section - to avoid locking on public 'this' instance
    private readonly object _syncRoot = new object ();

    // synchronization device for stopping our log thread.
    // initialized to unsignaled state - when set to signaled
    // we stop!
    private readonly AutoResetEvent _isStopping =
        new AutoResetEvent (false);

    // use a Queue<>, cleaner and less error prone than
    // manipulating an array. btw, check your indexing
    // on your array queue, while starvation will not
    // occur in your full pass, ordering is not preserved
    private readonly Queue<LogObj> _queue = new Queue<LogObj>();

    ...

    public void Log (string message)
    {
        // you want to lock ONLY when absolutely necessary
        // which in this case is accessing the ONE resource
        // of _queue.
        lock (_syncRoot)
        {
            _queue.Enqueue (new LogObj (DateTime.Now, message));
        }
    }

    public void GetLog ()
    {
        // while not stopping
        //
        // NOTE: _loggerThread is polling. to increase poll
        // interval, increase wait period. for a more event
        // driven approach, consider using another
        // AutoResetEvent at end of loop, and signal it
        // from Log() method above
        for (; !_isStopping.WaitOne(1); )
        {
            List<LogObj> logs = null;
            // again lock ONLY when you need to. because our log
            // operations may be time-intensive, we do not want
            // to block pessimistically. what we really want is
            // to dequeue all available messages and release the
            // shared resource.
            lock (_syncRoot)
            {
                // copy messages for local scope processing!
                //
                // NOTE: .Net3.5 extension method. if not available
                // logs = new List<LogObj> (_queue);
                logs = _queue.ToList ();
                // clear the queue for new messages
                _queue.Clear ();
                // release!
            }
            foreach (LogObj log in logs)
            {
                // do your thang
                ...
            }
        }
    }

    ...

    public void Stop ()
    {
        // graceful thread termination. give threads a chance!
        _isStopping.Set ();
        _loggerThread.Join (100);
        if (_loggerThread.IsAlive)
        {
            _loggerThread.Abort ();
        }
        _loggerThread = null;
    }
}
Actually, you ARE introducing locking here. You have locking while pushing a log entry to the queue (the Log method): if 10 threads simultaneously push 10 items into the queue and wake up the logger thread, then the 11th thread will wait until the logger thread logs all the items...
If you want something really scalable, implement a lock-free queue (an example idea is below). With a lock-free queue the synchronization mechanism will be really straightforward (you can even use a single wait handle for notifications).
If you can't find a lock-free queue implementation on the web, here is an idea of how to do it:
Use a linked list for the implementation. Each node in the linked list contains a value and a volatile reference to the next node. Therefore, for the enqueue and dequeue operations you can use the Interlocked.CompareExchange method. I hope the idea is clear. If not, let me know and I'll provide more details.
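Here is a rough sketch of that idea - a linked-list queue where enqueue and dequeue are coordinated only with Interlocked.CompareExchange (unverified, in the spirit of the description above; the GC prevents the classic ABA problem because nodes are never reused):

using System.Threading;

public class LockFreeQueue<T>
{
    private sealed class Node
    {
        public T Value;
        public Node Next; // only ever written via Interlocked.CompareExchange
    }

    private Node _head; // dummy node; _head.Next is the real front
    private Node _tail;

    public LockFreeQueue()
    {
        _head = _tail = new Node();
    }

    public void Enqueue(T value)
    {
        var node = new Node { Value = value };
        while (true)
        {
            Node tail = _tail;
            Node next = tail.Next;
            if (next == null)
            {
                // try to link the new node after the current tail
                if (Interlocked.CompareExchange(ref tail.Next, node, null) == null)
                {
                    // swing the tail forward; fine if another thread already did
                    Interlocked.CompareExchange(ref _tail, node, tail);
                    return;
                }
            }
            else
            {
                // tail is lagging; help advance it, then retry
                Interlocked.CompareExchange(ref _tail, next, tail);
            }
        }
    }

    public bool TryDequeue(out T value)
    {
        while (true)
        {
            Node head = _head;
            Node next = head.Next;
            if (next == null)
            {
                value = default(T);
                return false; // queue is empty
            }
            // advance the dummy head past the dequeued node
            if (Interlocked.CompareExchange(ref _head, next, head) == head)
            {
                value = next.Value;
                return true;
            }
        }
    }
}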
I'm just doing a thought experiment here, since I don't have time to actually try code right now, but I think you can do this without locks at all if you're creative.
Have your logging class contain a method that allocates a queue and a semaphore each time it's called (and another that deallocates the queue and semaphore when the thread is done). The threads that want to do logging will call this method when they start. When they want to log, they push the message onto their own queue and set the semaphore. The logger thread has a big loop that runs through the queues and checks the associated semaphores. If the semaphore associated with the queue is greater than zero, then the queue gets popped off and the semaphore decremented.
Because you're not attempting to pop things off the queue until after the semaphore is set, and you're not setting the semaphore until after you've pushed things onto the queue, I think this will be safe. According to the MSDN documentation for the Queue class, if you are enumerating the queue and another thread modifies the collection, an exception is thrown. Catch that exception and you should be good.
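For what it's worth, a sketch of that thought experiment (all names are mine; note that Queue<T> is not documented as safe for concurrent reader/writer access, so this stays exactly that - a thought experiment):

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// One channel per producing thread: a plain queue plus a counting semaphore.
// The producer only pushes then releases; the logger only waits then pops.
public class PerThreadLogChannel
{
    public Queue<string> Queue { get; } = new Queue<string>();
    public SemaphoreSlim Pending { get; } = new SemaphoreSlim(0);

    public void Log(string message)
    {
        Queue.Enqueue(message); // push first...
        Pending.Release();      // ...then signal, so the logger never reads too early
    }
}

public static class LoggerLoop
{
    // The logger thread's big loop over all registered channels.
    public static void Run(IReadOnlyList<PerThreadLogChannel> channels, TextWriter output)
    {
        while (true)
        {
            foreach (var channel in channels)
            {
                // pop only while the semaphore says an item is definitely there
                while (channel.Pending.Wait(0))
                {
                    output.WriteLine(channel.Queue.Dequeue());
                }
            }
            Thread.Sleep(1); // crude poll interval
        }
    }
}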
