I am using a Queue (C#) to store data that has to be sent to any client connecting.
My lock object is private and readonly:
private readonly object completedATEQueueSynched = new object();
Only two methods enqueue:
1) one started by mouse movement, executed by the main-form thread:
public void handleEddingToolMouseMove(MouseEventArgs e)
{
AbstractTrafficElement de = new...
sendElementToAllPlayers(de);
lock (completedATEQueueSynched)
{
completedATEQueue.Enqueue(de);
}
}
2) one started by a button event, also executed by the main-form thread (it does not matter here, but better safe than sorry):
public void handleBLC(EventArgs e)
{
AbstractTrafficElement de = new...
sendElementToAllPlayers(de);
lock (completedATEQueueSynched)
{
completedATEQueue.Enqueue(de);
}
}
This method is called by the thread responsible for the specific connected client. Here it is:
private void sendSetData(TcpClient c)
{
NetworkStream clientStream = c.GetStream();
lock (completedATEQueueSynched)
{
foreach (AbstractTrafficElement ate in MainForm.completedATEQueue)
{
binaryF.Serialize(clientStream, ate);
}
}
}
If a client connects and I am moving my mouse at the same time, a deadlock occurs.
If I lock only the iteration, an InvalidOperationException is thrown, because the queue changed during enumeration.
I have tried the synchronized Queue wrapper as well, but it doesn't work for iterating (even in combination with locks).
Any ideas? I just don't see my mistake.
You can reduce the contention, probably enough to make it acceptable:
private void sendSetData(TcpClient c)
{
IEnumerable<AbstractTrafficElement> list;
lock (completedATEQueueSynched)
{
list = MainForm.completedATEQueue.ToList(); // take a snapshot
}
NetworkStream clientStream = c.GetStream();
foreach (AbstractTrafficElement ate in list)
{
binaryF.Serialize(clientStream, ate);
}
}
But of course a snapshot introduces its own bit of timing logic. What exactly does 'all elements' mean at any given moment?
It looks like ConcurrentQueue is what you want.
UPDATE
Yes, it works fine. TryDequeue internally uses Interlocked.CompareExchange and SpinWait. A lock is not a good choice here because it is too expensive; take a look at SpinLock, and don't forget about Data Structures for Parallel Programming.
Here is the Enqueue implementation from ConcurrentQueue; as you can see, only SpinWait and Interlocked.Increment are used. It looks pretty nice:
public void Enqueue(T item)
{
SpinWait spinWait = new SpinWait();
while (!this.m_tail.TryAppend(item, ref this.m_tail))
spinWait.SpinOnce();
}
internal void Grow(ref ConcurrentQueue<T>.Segment tail)
{
this.m_next = new ConcurrentQueue<T>.Segment(this.m_index + 1L);
tail = this.m_next;
}
internal bool TryAppend(T value, ref ConcurrentQueue<T>.Segment tail)
{
if (this.m_high >= 31)
return false;
int index = 32;
try
{
// empty try: putting the work in the finally below protects it from being
// interrupted by a thread abort between the increment and the array write
}
finally
{
index = Interlocked.Increment(ref this.m_high);
if (index <= 31)
{
this.m_array[index] = value;
this.m_state[index] = 1;
}
if (index == 31)
this.Grow(ref tail);
}
return index <= 31;
}
Henk Holterman's approach is good if your rate of enqueueing and dequeueing is not very high. Here, I think you are capturing mouse movements. If you expect to generate a lot of data in the queue, the above approach is not ideal: the lock becomes a point of contention between the network code and the enqueueing code, and its granularity is the whole queue.
In this case I'd recommend what GSerjo mentioned: ConcurrentQueue. I've looked into the implementation of this queue. It is very granular, operating at the level of single elements in the queue. While one thread is dequeueing, other threads can enqueue in parallel without stopping.
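For illustration, a hedged sketch of how the question's code might look with ConcurrentQueue (type names such as AbstractTrafficElement and binaryF are taken from the question; the constructor call is a placeholder):
// Sketch assuming the question's types; no lock object is needed at all.
private readonly ConcurrentQueue<AbstractTrafficElement> completedATEQueue =
    new ConcurrentQueue<AbstractTrafficElement>();

public void handleEddingToolMouseMove(MouseEventArgs e)
{
    AbstractTrafficElement de = new AbstractTrafficElement(); // placeholder construction
    sendElementToAllPlayers(de);
    completedATEQueue.Enqueue(de); // lock-free enqueue
}

private void sendSetData(TcpClient c)
{
    NetworkStream clientStream = c.GetStream();
    // In .NET 4, enumerating a ConcurrentQueue works on a snapshot of its
    // contents, so concurrent Enqueue calls cannot invalidate the loop.
    foreach (AbstractTrafficElement ate in completedATEQueue)
    {
        binaryF.Serialize(clientStream, ate);
    }
}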
I have a class that provides thread-safe access to LinkedList<> (adding and reading items).
class LinkedListManager {
public static object locker = new object();
public static LinkedList<AddXmlNodeArgs> tasks { get; set; }
public static EventWaitHandle wh { get; set; }
public void AddItemThreadSafe(AddXmlNodeArgs task) {
lock (locker)
tasks.AddLast(task);
wh.Set();
}
public LinkedListNode<AddXmlNodeArgs> GetNextItemThreadSafe(LinkedListNode<AddXmlNodeArgs> prevItem) {
LinkedListNode<AddXmlNodeArgs> nextItem;
if (prevItem == null) {
lock (locker)
return tasks.First;
}
lock (locker) // *1
nextItem = prevItem.Next;
if (nextItem == null) { // *2
wh.WaitOne();
return prevItem.Next;
}
lock (locker)
return nextItem;
}
}
I have 3 threads: 1st - writes data to tasks; 2nd and 3rd - read data from tasks.
In 2nd and 3rd threads I retrieve data from tasks by calling GetNextItemThreadSafe().
The problem is that sometimes GetNextItemThreadSafe() returns null when the parameter of the method (prevItem) is not null.
Question:
Can a thread somehow jump over lock (locker) (// *1) and get straight to // *2?
I think that's the only way GetNextItemThreadSafe() could return null...
I've spent a whole day trying to find the mistake, but it's extremely hard because it seems almost impossible to debug step by step (tasks contains 5,000 elements and the error occurs whenever it wants). By the way, sometimes the program works fine, without the exception.
I'm new to threads, so maybe I'm asking silly questions...
It's not clear what you're trying to achieve. Are both threads supposed to get the same elements of the linked list, or are you trying to have two threads process the tasks from the list in parallel? If it's the second case, then what you are doing cannot work. You'd be better off looking at BlockingCollection, which is thread-safe and designed for exactly this kind of multi-threaded producer/consumer pattern.
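For the second case, a minimal sketch of the BlockingCollection approach, reusing the AddXmlNodeArgs type from the question (Process is a hypothetical handler):
// Replaces the LinkedList, the lock, and the EventWaitHandle in one type.
BlockingCollection<AddXmlNodeArgs> tasks = new BlockingCollection<AddXmlNodeArgs>();

// 1st thread (writer):
tasks.Add(new AddXmlNodeArgs()); // placeholder construction

// 2nd and 3rd threads (readers): each task goes to exactly one reader,
// and the loop blocks while the collection is empty.
foreach (AddXmlNodeArgs task in tasks.GetConsumingEnumerable())
{
    Process(task); // hypothetical handler
}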
A lock is only held while executing the block of code following the lock statement. Since you lock multiple times around single statements, this effectively degenerates to locking only the one statement that follows each lock, after which another thread is free to jump in and consume the data. Perhaps what you meant is this:
public LinkedListNode<AddXmlNodeArgs> GetNextItemThreadSafe(LinkedListNode<AddXmlNodeArgs> prevItem) {
LinkedListNode<AddXmlNodeArgs> returnItem;
lock(locker) { // Lock the entire method contents to make it atomic
if (prevItem == null) {
returnItem = tasks.First;
}
else {
// *1
returnItem = prevItem.Next;
if (returnItem == null) { // *2
// wh.WaitOne(); // Waiting in a locked block is not a good idea
returnItem = prevItem.Next;
}
}
}
return returnItem;
}
Note that only assignments (as opposed to returns) occur within the locked block and there is a single return point at the bottom of the method.
I think the solution is the following:
In your Add method, add the node and Set the EventWaitHandle inside the same lock.
In the Get method, inside a lock, check whether the next element is null and, still inside the same lock, Reset the EventWaitHandle. Outside of the lock, wait on the EventWaitHandle.
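A minimal sketch of that pattern, reusing the names from the question (the retry loop is my assumption about the intended behavior):
public void AddItemThreadSafe(AddXmlNodeArgs task) {
    lock (locker) {
        tasks.AddLast(task);
        wh.Set(); // Set inside the same lock as the add
    }
}

public LinkedListNode<AddXmlNodeArgs> GetNextItemThreadSafe(LinkedListNode<AddXmlNodeArgs> prevItem) {
    while (true) {
        lock (locker) {
            LinkedListNode<AddXmlNodeArgs> next =
                prevItem == null ? tasks.First : prevItem.Next;
            if (next != null)
                return next;
            wh.Reset(); // Reset inside the lock so a concurrent Set is not lost
        }
        wh.WaitOne(); // wait outside the lock, then re-check
    }
}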
I have a simulation that generates data which must be saved to database.
ParallelLoopResult res = Parallel.For(0, 1000000, options, (r, state) =>
{
ComplexDataSet cds = GenerateData(r);
SaveDataToDatabase(cds);
});
The simulation generates a whole lot of data, so it wouldn't be practical to first generate it all and then save it to the database (up to 1 GB of data), and it also wouldn't make sense to save items one by one (the transactions would be too small to be practical). I want to insert them into the database as batch inserts of controlled size (say, 100 per commit).
However, I think my knowledge of parallel computing is less than theoretical. I came up with this (which, as you can see, is very flawed):
DataBuffer buffer = new DataBuffer(...);
ParallelLoopResult res = Parallel.For(0, 10000000, options, (r, state) =>
{
ComplexDataSet cds = GenerateData(r);
buffer.SaveDataToBuffer(cds, r == 10000000 - 1); // flag the final iteration
});
public class DataBuffer
{
int count = 0;
int limit = 100;
object _locker = new object();
ConcurrentBag<ComplexDataSet> _lastItemRef;
ConcurrentQueue<ConcurrentBag<ComplexDataSet>> ComplexDataSetsQueue { get; set; }
public void SaveDataToBuffer(ComplexDataSet data, bool isfinalcycle)
{
lock (_locker)
{
if(count >= limit)
{
ConcurrentBag<ComplexDataSet> dequeueRef;
if(ComplexDataSetsQueue.TryDequeue(out dequeueRef))
{
Commit(dequeueRef);
}
_lastItemRef = new ConcurrentBag<ComplexDataSet>{data};
ComplexDataSetsQueue.Enqueue(_lastItemRef);
count = 1;
}
else
{
// First time
if(_lastItemRef == null)
{
_lastItemRef = new ConcurrentBag<ComplexDataSet>{data};
ComplexDataSetsQueue.Enqueue(_lastItemRef);
count = 1;
}
// If buffer isn't full
else
{
_lastItemRef.Add(data);
count++;
}
}
if(isfinalcycle)
{
// Commit everything that hasn't been committed yet
ConcurrentBag<ComplexDataSet> dequeueRef;
while (ComplexDataSetsQueue.TryDequeue(out dequeueRef))
{
Commit(dequeueRef);
}
}
}
}
public void Commit(ConcurrentBag<ComplexDataSet> data)
{
// Commit data to database..should this be somehow in another thread or something ?
}
}
As you can see, I'm using a queue to create a buffer and then manually deciding when to commit. However, I have a strong feeling that this isn't a very performant solution to my problem. First, I'm unsure whether I'm doing the locking right. Second, I'm not sure whether this is fully thread-safe (or thread-safe at all).
Can you please take a look for a moment and comment on what I should do differently? Or is there a completely better way of doing this (using some kind of producer/consumer technique or something)?
Thanks and best wishes,
D.
There is no need to use locks or expensive concurrency-safe data structures. The data is all independent, so introducing locking and sharing will only hurt performance and scalability.
Parallel.For has an overload that lets you specify per-thread data. In this you can store a private queue and private database connection.
Also: Parallel.For internally partitions your range into smaller chunks. It's perfectly efficient to pass it a huge range, so nothing to change there.
Parallel.For(0, 10000000, () => new ThreadState(),
(i, loopstate, threadstate) =>
{
ComplexDataSet data = GenerateData(i);
threadstate.Add(data);
return threadstate;
}, threadstate => threadstate.Dispose());
sealed class ThreadState : IDisposable
{
readonly IDisposable db; // placeholder for the real MongoDb connection type
readonly Queue<ComplexDataSet> queue = new Queue<ComplexDataSet>();
public ThreadState()
{
// initialize db with a private MongoDb connection.
}
public void Add(ComplexDataSet cds)
{
queue.Enqueue(cds);
if(queue.Count == 100)
{
Commit();
}
}
void Commit()
{
db.Write(queue);
queue.Clear();
}
public void Dispose()
{
try
{
if(queue.Count > 0)
{
Commit();
}
}
finally
{
db.Dispose();
}
}
}
Now, MongoDb currently doesn't support truly concurrent inserts -- it holds some expensive locks in the server, so parallel commits won't gain you much (if any) speed. They want to fix this in the future, so you might get a free speed-up one day.
If you need to limit the number of database connections held, a producer/consumer setup is a good alternative. You can use a BlockingCollection queue to do this efficiently without using any locks:
// Specify a maximum of 1000 items in the collection so that we don't
// run out of memory if we get data faster than we can commit it.
// Add() will wait if it is full.
BlockingCollection<ComplexDataSet> commits =
new BlockingCollection<ComplexDataSet>(1000);
Task consumer = Task.Factory.StartNew(() =>
{
// This is the consumer. It processes the
// "commits" queue until it signals completion.
while(!commits.IsCompleted)
{
ComplexDataSet cds;
// Timeout of -1 will wait for an item or IsCompleted == true.
if(commits.TryTake(out cds, -1))
{
// Got at least one item, write it.
db.Write(cds);
// Continue dequeuing until the queue is empty, where it will
// timeout instantly and return false, or until we've dequeued
// 100 items.
for(int i = 1; i < 100 && commits.TryTake(out cds, 0); ++i)
{
db.Write(cds);
}
// Now that we're waiting for more items or have dequeued 100
// of them, commit. More can be continue to be added to the
// queue by other threads while this commit is processing.
db.Commit();
}
}
}, TaskCreationOptions.LongRunning);
try
{
// This is the producer.
Parallel.For(0, 1000000, i =>
{
ComplexDataSet data = GenerateData(i);
commits.Add(data);
});
}
finally // put in a finally to ensure the task closes down.
{
commits.CompleteAdding(); // marks adding complete; IsCompleted becomes true once the queue drains.
consumer.Wait(); // wait for task to finish committing all the items.
}
In your example you have 10,000,000 packages of work. Each of these needs to be distributed to a thread.
Assuming you don't have a really large number of CPU cores, this is not optimal. You also have to synchronize your threads on every buffer.SaveDataToBuffer call (by using locks). Additionally, you should be aware that the variable r isn't necessarily increased by one when viewed chronologically (example: Thread1 executes r with 1,2,3 and Thread2 with 4,5,6; chronologically this would lead approximately to the sequence 1,4,2,5,3,6 of r values passed to SaveDataToBuffer).
I would make the packages of work larger and then commit each package at once. This also has the benefit that you don't have to lock/synchronize too often.
Here's an example:
int total = 10000000;
int step = 1000;
Parallel.For(0, total / step, (r, state) =>
{
int start = r * step;
int end = start + step;
ComplexDataSet[] result = new ComplexDataSet[step];
for (int i = start; i < end; i++)
{
result[i - start] = GenerateData(i);
}
Commit(result);
});
In this example the whole work is split into 10,000 packages (which are executed in parallel), and every package generates 1,000 data items and commits them to the database.
With this solution, the Commit method might become a bottleneck if it isn't wisely designed. It would be best to make it thread-safe without using any locks. This can be accomplished if you don't share objects between threads that would need synchronization.
E.g., for a SQL Server backend, that would mean creating a dedicated SQL connection in the context of every Commit() call:
private void Commit(ComplexDataSet[] data)
{
using (var connection = new SqlConnection("connection string..."))
{
connection.Open();
// insert your data here...
}
}
Instead of increasing the complexity of the software, consider simplifying it. You can refactor the code into three parts:
Workers that enqueue
This is the concurrent GenerateData in Parallel.For that does the heavy computation and produces ComplexDataSet instances.
Actual queue
A concurrent queue that stores the results from [1], i.e. many ComplexDataSet instances. Here I assume that one instance of ComplexDataSet is not particularly resource-consuming and is fairly light. As long as the queue is concurrent, it will support parallel "inserts" and "deletes".
Workers that dequeue
Code that takes one instance of ComplexDataSet from the processing queue [2] and puts it into a concurrent bag (or other storage). Once the bag has N items, you block, stop dequeueing, flush the contents of the bag into the database, and clear it. Finally, you unblock and resume dequeueing.
Here is some metacode (it compiles, but needs improvement):
[1]
// [1] - Class is responsible for generating complex data sets and
// adding them to processing queue
class EnqueueWorker
{
//generate data and add to queue
internal void ParallelEnqueue(ConcurrentQueue<ComplexDataSet> resultQueue)
{
Parallel.For(1, 10000, (i) =>
{
ComplexDataSet cds = GenerateData(i);
resultQueue.Enqueue(cds);
});
}
//generate data
ComplexDataSet GenerateData(int i)
{
return new ComplexDataSet();
}
}
[3]
//[3] This guy takes sets from the processing queue and flush results when
// N items have been generated
class DequeueWorker
{
//buffer that holds processed dequeued data
private static ConcurrentBag<ComplexDataSet> buffer;
//lock to flush the data to the db once in a while
private static object syncRoot = new object();
//take item from processing queue and add it to internal buffer storage
//once buffer is full - flush it to the database
internal void ParallelDequeue(ConcurrentQueue<ComplexDataSet> resultQueue)
{
buffer = new ConcurrentBag<ComplexDataSet>();
int N = 100;
Parallel.For(1, 10000, (i) =>
{
//try dequeue
ComplexDataSet cds = null;
var spinWait = new SpinWait();
while (cds == null)
{
resultQueue.TryDequeue(out cds);
spinWait.SpinOnce();
}
//add to buffer
buffer.Add(cds);
//flush to database if needed
if (buffer.Count == N)
{
lock (syncRoot)
{
IEnumerable<ComplexDataSet> data = buffer.ToArray();
// flush data to database
buffer = new ConcurrentBag<ComplexDataSet>();
}
}
});
}
}
[2] and usage
class ComplexDataSet { }
class Program
{
//processing queueu - [2]
private static ConcurrentQueue<ComplexDataSet> processingQueue;
static void Main(string[] args)
{
// create new processing queue - single instance for whole app
processingQueue = new ConcurrentQueue<ComplexDataSet>();
//enqueue worker
Task enqueueTask = Task.Factory.StartNew(() =>
{
EnqueueWorker enqueueWorker = new EnqueueWorker();
enqueueWorker.ParallelEnqueue(processingQueue);
});
//dequeue worker
Task dequeueTask = Task.Factory.StartNew(() =>
{
DequeueWorker dequeueWorker = new DequeueWorker();
dequeueWorker.ParallelDequeue(processingQueue);
});
//wait for both workers, otherwise Main may exit before the work completes
Task.WaitAll(enqueueTask, dequeueTask);
}
}
Let's say I have a list and am streaming data from a named pipe into that list.
Hypothetical sample:
private void myStreamingThread()
{
while(mypipe.isconnected)
{
if (mypipe.hasdata)
myList.Add(mypipe.data);
}
}
Then on another thread I need to read that list every 1000ms for example:
private void myListReadingThread()
{
while(isStarted)
{
if (myList.Count > 0)
{
//do whatever I need to.
}
Thread.Sleep(1000);
}
}
My priority here is to be able to read the list every 1000 ms and do whatever I need with it, but at the same time it is very important to keep receiving the new data coming from the pipe.
What is a good way to go about this?
I forgot to mention that I am tied to .NET 3.5.
I would recommend using a Queue with a lock.
Queue<string> myQueue = new Queue<string>();
private void myStreamingThread()
{
while(mypipe.isconnected)
{
if (mypipe.hasdata)
{
lock (myQueue)
{
myQueue.Enqueue(mypipe.data);
}
}
}
}
If you want to empty the queue every 1000 ms, do not use Thread.Sleep. Use a timer instead.
System.Threading.Timer t = new Timer(myListReadingProc, null, 1000, 1000);
private void myListReadingProc(object s)
{
while (myQueue.Count > 0)
{
lock (myQueue)
{
string item = myQueue.Dequeue();
// do whatever
}
}
}
Note that the above assumes that the queue is only being read by one thread. If multiple threads are reading, then there's a race condition. But the above will work with a single reader and one or more writers.
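If multiple readers ever become necessary, a variant along these lines (a sketch, not tested) moves the count check inside the lock to remove that race:
private void myListReadingProc(object s)
{
    while (true)
    {
        string item;
        lock (myQueue)
        {
            if (myQueue.Count == 0)
                break; // nothing left; wait for the next timer tick
            item = myQueue.Dequeue();
        }
        // do whatever with item, outside the lock
    }
}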
I would suggest using a ConcurrentQueue (http://msdn.microsoft.com/en-us/library/dd267265.aspx). If you use a simple List<> you will encounter a lot of threading issues.
The other approach would be to use a wait handle called outstandingWork and wait on it instead of calling Thread.Sleep(). When you enqueue some work, you signal outstandingWork. This means you sleep when no work is available but start processing work immediately, instead of sleeping for the entire second.
Edit
As #Prix pointed out, you are using .NET 3.5, so you cannot use ConcurrentQueue. Use the Queue class with the following:
Queue<Work> queue;
AutoResetEvent outstandingWork = new AutoResetEvent(false);
void Enqueue(Work work)
{
lock (queue)
{
queue.Enqueue(work);
outstandingWork.Set();
}
}
Work DequeueMaybe()
{
lock (queue)
{
if (queue.Count == 0) return null;
return queue.Dequeue();
}
}
void DoWork()
{
while (true)
{
Work work = DequeueMaybe();
if (work == null)
{
outstandingWork.WaitOne();
continue;
}
// Do the work.
}
}
I have built a producer/consumer queue wrapping a .NET 4.0 ConcurrentQueue, with ManualResetEventSlim signaling between the producing side (Enqueue) and the consuming side (thread-based while(true) loops).
The queue looks like this:
public class ProducerConsumerQueue<T> : IDisposable, IProducerConsumerQueue<T>
{
private bool _IsActive=true;
public int Count
{
get
{
return this._workerQueue.Count;
}
}
public bool IsActive
{
get { return _IsActive; }
set { _IsActive = value; }
}
public event Dequeued<T> OnDequeued = delegate { };
public event LoggedHandler OnLogged = delegate { };
private ConcurrentQueue<T> _workerQueue = new ConcurrentQueue<T>();
private object _locker = new object();
Thread[] _workers;
#region IDisposable Members
int _workerCount=0;
ManualResetEventSlim _mres = new ManualResetEventSlim();
public void Dispose()
{
_IsActive = false;
_mres.Set();
LogWriter.Write("55555555555");
for (int i = 0; i < _workerCount; i++)
// Wait for the consumer's thread to finish.
{
_workers[i].Join();
}
LogWriter.Write("6666666666");
// Release any OS resources.
}
public ProducerConsumerQueue(int workerCount)
{
try
{
_workerCount = workerCount;
_workers = new Thread[workerCount];
// Create and start a separate thread for each worker
for (int i = 0; i < workerCount; i++)
(_workers[i] = new Thread(Work)).Start();
}
catch (Exception ex)
{
OnLogged(ex.Message + ex.StackTrace);
}
}
#endregion
#region IProducerConsumerQueue<T> Members
public void EnqueueTask(T task)
{
if (_IsActive)
{
_workerQueue.Enqueue(task);
//Monitor.Pulse(_locker);
_mres.Set();
}
}
public void Work()
{
while (_IsActive)
{
try
{
T item = Dequeue();
if (item != null)
OnDequeued(item);
}
catch (Exception ex)
{
OnLogged(ex.Message + ex.StackTrace);
}
}
}
#endregion
private T Dequeue()
{
try
{
T dequeueItem;
//if (_workerQueue.Count > 0)
//{
_workerQueue.TryDequeue(out dequeueItem);
if (dequeueItem != null)
return dequeueItem;
//}
if (_IsActive)
{
_mres.Wait();
_mres.Reset();
}
//_workerQueue.TryDequeue(out dequeueItem);
return dequeueItem;
}
catch (Exception ex)
{
OnLogged(ex.Message + ex.StackTrace);
T dequeueItem;
//if (_workerQueue.Count > 0)
//{
_workerQueue.TryDequeue(out dequeueItem);
return dequeueItem;
}
}
public void Clear()
{
_workerQueue = new ConcurrentQueue<T>();
}
}
When calling Dispose, it sometimes blocks on the Join (with one consuming thread) and the Dispose method gets stuck. I guess it gets stuck on the Wait on the reset event, but that is exactly why I call Set in Dispose.
Any suggestions?
Update: I understand your point about needing a queue internally. My suggestion to use a BlockingCollection<T> is based on the fact that your code contains a lot of logic to provide the blocking behavior. Writing such logic yourself is very prone to bugs (I know this from experience); so when there's an existing class within the framework that does at least some of the work for you, it's generally preferable to go with that.
A complete example of how you can implement this class using a BlockingCollection<T> is a little bit too large to include in this answer, so I've posted a working example on pastebin.com; feel free to take a look and see what you think.
I also wrote an example program demonstrating the above example here.
Is my code correct? I wouldn't say yes with too much confidence; after all, I haven't written unit tests, run any diagnostics on it, etc. It's just a basic draft to give you an idea how using BlockingCollection<T> instead of ConcurrentQueue<T> cleans up a lot of your logic (in my opinion) and makes it easier to focus on the main purpose of your class (consuming items from a queue and notifying subscribers) rather than a somewhat difficult aspect of its implementation (the blocking behavior of the internal queue).
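To give a flavor of it here, a rough sketch under my own assumed names (not the pastebin code):
public sealed class ProducerConsumerQueue<T> : IDisposable
{
    // FIFO by default: the parameterless constructor uses a ConcurrentQueue<T>.
    private readonly BlockingCollection<T> _queue = new BlockingCollection<T>();
    private readonly Thread _worker;

    public event Action<T> OnDequeued = delegate { };

    public ProducerConsumerQueue()
    {
        _worker = new Thread(Work);
        _worker.Start();
    }

    public void EnqueueTask(T task)
    {
        _queue.Add(task);
    }

    private void Work()
    {
        // Blocks while empty; ends cleanly once CompleteAdding is called
        // and the remaining items have been drained.
        foreach (T item in _queue.GetConsumingEnumerable())
        {
            OnDequeued(item);
        }
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}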
Question posed in a comment:
Any reason you're not using BlockingCollection<T>?
Your answer:
[...] i needed a queue.
From the MSDN documentation on the default constructor for the BlockingCollection<T> class:
The default underlying collection is a ConcurrentQueue<T>.
If the only reason you opted to implement your own class instead of using BlockingCollection<T> is that you need a FIFO queue, well then... you might want to rethink your decision. A BlockingCollection<T> instantiated using the default parameterless constructor is a FIFO queue.
That said, while I don't think I can offer a comprehensive analysis of the code you've posted, I can at least offer a couple of pointers:
I'd be very hesitant to use events in the way that you are here for a class that deals with such tricky multithreaded behavior. Calling code can attach any event handlers it wants, and these can in turn throw exceptions (which you don't catch), block for long periods of time, or possibly even deadlock for reasons completely outside your control--which is very bad in the case of a blocking queue.
There's a race condition in your Dequeue and Dispose methods.
Look at these lines of your Dequeue method:
if (_IsActive) // point A
{
_mres.Wait(); // point C
_mres.Reset(); // point D
}
And now take a look at these two lines from Dispose:
_IsActive = false;
_mres.Set(); // point B
Let's say you have three threads, T1, T2, and T3. T1 and T2 are both at point A, where each checks _IsActive and finds true. Then Dispose is called, and T3 sets _IsActive to false (but T1 and T2 have already passed point A) and then reaches point B, where it calls _mres.Set(). Then T1 gets to point C, moves on to point D, and calls _mres.Reset(). Now T2 reaches point C and will be stuck forever since _mres.Set will not be called again (any thread executing Enqueue will find _IsActive == false and return immediately, and the thread executing Dispose has already passed point B).
I'd be happy to try and offer some help on solving this race condition, but I'm skeptical that BlockingCollection<T> isn't in fact exactly the class you need for this. If you can provide some more information to convince me that this isn't the case, maybe I'll take another look.
Since _IsActive isn't marked as volatile and there's no lock around all access to it, each core can have a separately cached value that may never get refreshed. So setting _IsActive to false in Dispose will not necessarily be observed by all running threads.
http://igoro.com/archive/volatile-keyword-in-c-memory-model-explained/
private volatile bool _IsActive=true;
Can anyone see any problems with this producer/consumer unique-keyed buffer implementation? The idea is that if you add items for processing with the same key, only the latest value will be processed and the old/existing value will be thrown away.
public sealed class PCKeyedBuffer<K,V>
{
private readonly object _locker = new object();
private readonly Thread _worker;
private readonly IDictionary<K, V> _items = new Dictionary<K, V>();
private readonly Action<V> _action;
private volatile bool _shutdown;
public PCKeyedBuffer(Action<V> action)
{
_action = action;
(_worker = new Thread(Consume)).Start();
}
public void Shutdown(bool waitForWorker)
{
_shutdown = true;
if (waitForWorker)
_worker.Join();
}
public void Add(K key, V value)
{
lock (_locker)
{
_items[key] = value;
Monitor.Pulse(_locker);
}
}
private void Consume()
{
while (true)
{
IList<V> values;
lock (_locker)
{
while (_items.Count == 0) Monitor.Wait(_locker);
values = new List<V>(_items.Values);
_items.Clear();
}
foreach (V value in values)
{
_action(value);
}
if(_shutdown) return;
}
}
}
static void Main(string[] args)
{
PCKeyedBuffer<string, double> l = new PCKeyedBuffer<string, double>(delegate(double d)
{
Thread.Sleep(10);
Console.WriteLine(
"Processed: " + d.ToString());
});
for (double i = 0; i < 100; i++)
{
l.Add(i.ToString(), i);
}
for (double i = 0; i < 100; i++)
{
l.Add(i.ToString(), i);
}
for (double i = 0; i < 100; i++)
{
l.Add(i.ToString(), i);
}
Console.WriteLine("Done Enqeueing");
Console.ReadLine();
}
After a quick once-over, I would say that the following code in the Consume method
while (_items.Count == 0) Monitor.Wait(_locker);
should probably Wait using a timeout and check the _shutdown flag on each iteration, especially since you are not setting your consumer thread to be a background thread.
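A minimal sketch of that suggestion, reusing the fields from the question (the 250 ms timeout is an arbitrary assumption):
lock (_locker)
{
    while (_items.Count == 0)
    {
        if (_shutdown) return; // re-checked on every wakeup now
        Monitor.Wait(_locker, 250); // wake periodically instead of blocking forever
    }
    values = new List<V>(_items.Values);
    _items.Clear();
}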
In addition, the Consume method does not appear very scalable, since it single-handedly tries to process an entire queue of items. Of course this might depend on the rate at which items are being produced. I would probably have the consumer focus on a single item in the list and then use the TPL to run multiple concurrent consumers; this way you can take advantage of multiple cores while letting the TPL balance the workload for you. To reduce the locking required for a consumer processing a single item, you could use a ConcurrentDictionary.
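A hedged sketch of that direction, borrowing the key/value types from the question's Main (Process is a hypothetical handler):
// Producers overwrite per key without an explicit lock; the last write wins.
var items = new ConcurrentDictionary<string, double>();
items["42"] = 42.0;

// Each consumer claims one entry at a time; TryRemove hands every value to
// exactly one consumer, so several consumers can run in parallel.
foreach (string key in items.Keys)
{
    double value;
    if (items.TryRemove(key, out value))
        Process(value); // hypothetical handler
}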
As Chris pointed out, ConcurrentDictionary already exists and is more scalable. It was added to the base libraries in .NET 4.0 and is also available as an add-on for .NET 3.5.
This is one of the few attempts at creating a custom producer/consumer that is actually correct, so well done in that regard. However, as Chris pointed out, your stop flag will be ignored while Monitor.Wait is blocked; there is no need to rehash his suggestion for fixing that. The advice I can offer is to use a BlockingCollection instead of doing the Wait/Pulse calls manually. That would also solve the shutdown problem, since the Take method is cancellable. If you are not using .NET 4.0, it is available in the Reactive Extensions download that Stephen linked to. If that is not an option, Stephen Toub has a correct implementation here (except that his is not cancellable, though you can always use Thread.Interrupt to safely unblock it). What you can do is feed KeyValuePair items into the queue instead of using a Dictionary.
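A sketch of that last idea (note the tradeoff: a plain queue processes every submitted value instead of coalescing to the latest per key; Process is a hypothetical handler):
var queue = new BlockingCollection<KeyValuePair<string, double>>();

// Producer:
queue.Add(new KeyValuePair<string, double>("42", 42.0));
queue.CompleteAdding(); // signal consumers once production is finished

// Consumer: blocks until items arrive or adding completes.
foreach (KeyValuePair<string, double> pair in queue.GetConsumingEnumerable())
{
    Process(pair.Value); // hypothetical handler
}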