Proper use of .NET Concurrent Collections - C#

In my attempt to create concurrent Socket operations, I've created the following code:
ConcurrentQueue<byte[]> messageQueue;
ManualResetEvent resetEvent;
Thread outThread; // -> new Thread(BeginSending);

public void BeginSending() // invoked by outThread
{
    while (true)
    {
        resetEvent.WaitOne();
        while (messageQueue.Count > 0)
        {
            byte[] msg;
            messageQueue.TryDequeue(out msg);
            // send msg via socket
        }
        resetEvent.Reset();
    }
}

public void QueueMessage(byte[] msg) // invoked by the main thread
{
    messageQueue.Enqueue(msg);
    resetEvent.Set();
}
Is adding items to the ConcurrentQueue while a different thread is iterating/dequeuing it a dangerous thing?
From my understanding, many synchronized collections simply have individually synchronized methods, but is the same true for ConcurrentQueue and similar collections (ConcurrentBag, ConcurrentDictionary, ConcurrentStack)?

The ConcurrentQueue itself is OK, as long as you are not mutating the arrays stored as its elements.
However, your usage pattern with ManualResetEvent suggests that there is a better solution: with BlockingCollection<T> you can avoid the manual synchronization entirely.
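As a minimal sketch, assuming the same surrounding class as the question, the BlockingCollection<T> version could look like this:

using System.Collections.Concurrent;

BlockingCollection<byte[]> messages = new BlockingCollection<byte[]>();

public void BeginSending() // invoked by outThread
{
    // GetConsumingEnumerable blocks while the collection is empty and
    // completes once CompleteAdding() is called, so no reset event is
    // needed and no signal can be lost.
    foreach (byte[] msg in messages.GetConsumingEnumerable())
    {
        // send msg via socket
    }
}

public void QueueMessage(byte[] msg) // invoked by the main thread
{
    messages.Add(msg);
}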

Is adding items to the ConcurrentQueue while a different thread is iterating/dequeuing it a dangerous thing?
No, it is safe.

The ConcurrentQueue is fine, the ManualResetEvent is not:
public void BeginSending() // invoked by outThread
{
    while (true)
    {
        resetEvent.WaitOne();
        while (messageQueue.Count > 0)
        {
            byte[] msg;
            messageQueue.TryDequeue(out msg);
            // send msg via socket
        }

        // context switch: the main thread runs QueueMessage here
        messageQueue.Enqueue(msg); // main thread enqueues a new message
        resetEvent.Set();          // main thread signals the event

        // context switch back: outThread resumes and wipes out that signal
        resetEvent.Reset();
    }
}
Such a sequence of events results in the enqueued message being ignored: the message is still in the queue, but the signal announcing it has just been erased, so the consumer goes back to sleep. Either use a BlockingCollection, as suggested by the other posters, or use a semaphore for signal/wait.
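For the semaphore route, here is a minimal sketch using SemaphoreSlim (my choice of primitive, not prescribed by the answer). A semaphore counts releases instead of latching a single flag, so a Release that happens between the drain and the wait is not lost:

ConcurrentQueue<byte[]> messageQueue = new ConcurrentQueue<byte[]>();
SemaphoreSlim signal = new SemaphoreSlim(0); // count == pending messages

public void BeginSending() // consumer thread
{
    while (true)
    {
        signal.Wait(); // exactly one successful Wait per Release
        if (messageQueue.TryDequeue(out byte[] msg))
        {
            // send msg via socket
        }
    }
}

public void QueueMessage(byte[] msg) // producer thread
{
    messageQueue.Enqueue(msg);
    signal.Release(); // increments the count; never overwrites a signal
}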

The concurrent collections are designed to be thread-safe. Using them saves you the trouble of implementing that synchronization yourself.
Be aware, though, that the collection itself is synchronized, NOT the data inside it. Updating the objects stored in a collection without coordinating with the other threads can still introduce race conditions.
As with any class, it helps to understand the intended use and use cases of these collections.
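To make the distinction concrete, a small sketch (a hypothetical buffer-reuse scenario, not from the answer):

var queue = new ConcurrentQueue<byte[]>();
byte[] buffer = new byte[1024];

queue.Enqueue(buffer); // safe: the enqueue itself is synchronized
buffer[0] = 0xFF;      // unsafe: a consumer may be reading this same
                       // array on another thread right now

queue.Enqueue((byte[])buffer.Clone()); // safer: hand over a private copy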

Related

In the scenario of using Wait() and Pulse(), can we replace `while` with `if`?

On the internet I have seen many examples of Wait() and Pulse(), and they use two `while` loops, as in this example:
class MyQueue
{
    private Queue<string> queue = new Queue<string>();
    private const int CAPACITY = 3;

    public void Put(string element)
    {
        lock (this)
        {
            // first `while`
            while (queue.Count == CAPACITY)
            {
                Monitor.Wait(this);
            }
            queue.Enqueue(element);
            Console.WriteLine($"Put {element} ({queue.Count})");
            Monitor.Pulse(this);
        }
    }

    public string Take()
    {
        lock (this)
        {
            // second `while`
            while (queue.Count == 0)
            {
                Monitor.Wait(this);
            }
            string element = queue.Dequeue();
            Console.WriteLine($"Took {element} ({queue.Count})");
            Monitor.Pulse(this);
            return element;
        }
    }
}
In the Main():
MyQueue queue = new MyQueue();

new Thread(new ThreadStart(() => {
    queue.Take();
    queue.Take();
})).Start();

new Thread(new ThreadStart(() => {
    queue.Put("a");
    queue.Put("b");
    queue.Put("c");
    queue.Put("d");
    queue.Put("e");
    queue.Put("f");
})).Start();
I think I understand the scenario of using Pulse() and Wait().
In the above example, I think it's OK to replace the two `while` loops with `if`. I tried it and it printed the same result.
Is that right? Thank you.
In your exact example, it would probably be fine to do as you suggest. You have exactly one producer and one consumer, so they should always operate in concert with each other, ensuring a thread is woken only if its wait condition is resolved.
However:
The producer and consumer implementations would not be safe with `if` if you have more than one producer or consumer. This is because the threads could be racing: one thread could be made runnable, but then not scheduled until another thread has in some way invalidated the original resolution of the wait condition.
While I'm skeptical that the .NET Monitor class is subject to the problem of spurious wake-ups — i.e. a thread in a wait state being woken due to some event other than an explicit wake by a cooperating thread (e.g. calling Monitor.Pulse()) — people who know concurrent programming and C# much better than I do have said otherwise (see e.g. Does C# Monitor.Wait() suffer from spurious wakeups?). And if you're at all concerned about spurious wake-ups, you'll want a loop instead of a simple if, to ensure that you recheck the wait condition before proceeding, just in case it wasn't actually satisfied before your thread was woken.
See also Eric Lippert's article Monitor madness, part two.
All that said, note that a producer/consumer scenario is much more easily implemented in modern .NET/C# by using BlockingCollection<T>. You can even include a maximum length for the queue when creating the collection to provide the "block if full" behavior seen in your code example.
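As a sketch, the bounded BlockingCollection<T> equivalent of the MyQueue example might look like this (the capacity of 3 mirrors the CAPACITY constant above):

using System.Collections.Concurrent;

var queue = new BlockingCollection<string>(boundedCapacity: 3);

// Producer: blocks when 3 items are already queued ("block if full")
queue.Add("a");

// Consumer: blocks while the queue is empty
string element = queue.Take();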

Scaling Connections with BlockingCollection<T>()

I have a server which communicates with 50 or more devices over TCP LAN. There is a Task.Run for each socket's message-reading loop.
I buffer each incoming message into a blocking queue, where each blocking queue has a Task.Run consuming it via BlockingCollection.Take().
So something like (semi-pseudocode):
Socket Reading Task
Task.Run(() =>
{
    while (notCancelled)
    {
        element = ReadXml();
        switch (element)
        {
            case messageheader:
                MessageBlockingQueue.Add(deserialize<messageType>());
            ...
        }
    }
});
Message Buffer Task
Task.Run(() =>
{
    while (notCancelled)
    {
        Process(MessageQueue.Take());
    }
});
So that would make 50+ reading tasks and 50+ tasks blocking on their own buffers.
I did it this way to avoid blocking the reading loop and to distribute processing time across messages more fairly, or so I believe.
Is this an inefficient way to handle it? What would be a better way?
You may be interested in the "channels" work, in particular: System.Threading.Channels. The aim of this is to provide asynchronous producer/consumer queues, covering both single and multiple producer and consumer scenarios, upper limits, etc. By using an asynchronous API, you aren't tying up lots of threads just waiting for something to do.
Your read loop would become:
while (notCancelled) {
    var next = await queue.Reader.ReadAsync(optionalCancellationToken);
    Process(next);
}
and the producer:
switch (element)
{
    case messageheader:
        queue.Writer.TryWrite(deserialize<messageType>());
    ...
}
so: minimal changes
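For completeness, the channel itself might be created along these lines (a sketch; Message stands in for whatever deserialize<messageType>() returns, and the capacity and options are assumptions, not part of the answer):

using System.Threading.Channels;

// A bounded channel applies back-pressure if the consumer falls behind;
// Channel.CreateUnbounded<Message>() is the simpler alternative.
var queue = Channel.CreateBounded<Message>(new BoundedChannelOptions(1024)
{
    SingleReader = true, // one consumer task per connection
    SingleWriter = true  // one reading loop per connection
});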
Alternatively - or in combination - you could look into things like "pipelines" (https://www.nuget.org/packages/System.IO.Pipelines/) - since you're dealing with TCP data, this would be an ideal fit, and is something I've looked at for the custom web-socket server here on Stack Overflow (which deals with huge numbers of connections). Since the API is async throughout, it does a good job of balancing work - and the pipelines API is engineered with typical TCP scenarios in mind, for example partially consuming incoming data streams as you detect frame boundaries. I've written about this usage a lot, with code examples mostly here. Note that "pipelines" doesn't include a direct TCP layer, but the "kestrel" server includes one, or the third-party library https://www.nuget.org/packages/Pipelines.Sockets.Unofficial/ does (disclosure: I wrote it).
I actually do something similar in another project. What I learned or would do differently are the following:
First of all, it is better to use dedicated threads for the reading/writing loop (with new Thread(ParameterizedThreadStart)), because Task.Run uses a pool thread, and as you use it in a (nearly) endless loop, the thread is practically never returned to the pool.
var thread = new Thread(ReaderLoop) { Name = nameof(ReaderLoop) }; // priority, etc if needed
thread.Start(cancellationToken);
Your Process can be an event, which you can invoke asynchronously so that your reader loop can return immediately and process new incoming packages as fast as possible:
private void ReaderLoop(object state)
{
    var token = (CancellationToken)state;
    while (!token.IsCancellationRequested)
    {
        try
        {
            var message = MessageQueue.Take(token);
            OnMessageReceived(new MessageReceivedEventArgs(message));
        }
        catch (OperationCanceledException)
        {
            if (!disposed && IsRunning)
                Stop();
            break;
        }
    }
}
Please note that if a delegate has multiple targets, its asynchronous invocation is not trivial. I created this extension method for invoking a delegate on pool threads:
public static void InvokeAsync<TEventArgs>(this EventHandler<TEventArgs> eventHandler, object sender, TEventArgs args)
{
    void Callback(IAsyncResult ar)
    {
        var method = (EventHandler<TEventArgs>)ar.AsyncState;
        try
        {
            method.EndInvoke(ar);
        }
        catch (Exception e)
        {
            HandleError(e, method);
        }
    }

    foreach (EventHandler<TEventArgs> handler in eventHandler.GetInvocationList())
        handler.BeginInvoke(sender, args, Callback, handler);
}
So the OnMessageReceived implementation can be:
protected virtual void OnMessageReceived(MessageReceivedEventArgs e)
    => messageReceivedHandler.InvokeAsync(this, e);
Finally, it was a big lesson that BlockingCollection<T> has some performance issues. It uses SpinWait internally, whose SpinOnce method waits longer and longer if there has been no incoming data for a long time. This is a tricky issue, because even if you log every single step of the processing, you will not notice that everything starts delayed, unless you can also mock the server side. Here you can find a fast BlockingCollection implementation using an AutoResetEvent for signalling incoming data. I added a Take(CancellationToken) overload to it as follows:
/// <summary>
/// Takes an item from the <see cref="FastBlockingCollection{T}"/>
/// </summary>
public T Take(CancellationToken token)
{
    T item;
    while (!queue.TryDequeue(out item))
    {
        waitHandle.WaitOne(cancellationCheckTimeout); // can be 10-100 ms
        token.ThrowIfCancellationRequested();
    }
    return item;
}
Basically that's it. Maybe not everything is applicable in your case; e.g., if a near-immediate response is not crucial, the regular BlockingCollection will also do.
Yes, this is a bit inefficient, because you block ThreadPool threads.
I have already discussed this problem in Using Task.Yield to overcome ThreadPool starvation while implementing the producer/consumer pattern.
You can also look at examples of testing a producer-consumer pattern here:
https://github.com/BBGONE/TestThreadAffinity
You can use await Task.Yield in the loop to give other tasks access to this thread.
You can also solve it by using dedicated threads or, better, a custom TaskScheduler that uses its own thread pool. But it is inefficient to create 50+ plain threads; it is better to adjust the task so that it is more cooperative.
A BlockingCollection can block the thread for a long time while waiting to write (if bounded) or while waiting for items to read, so in that case it is better to use System.Threading.Tasks.Channels: https://github.com/stephentoub/corefxlab/blob/master/src/System.Threading.Tasks.Channels/README.md
They don't block the thread while waiting for the collection to become available to write or to read. There's an example of how it is used: https://github.com/BBGONE/TestThreadAffinity/tree/master/ThreadingChannelsCoreFX/ChannelsTest
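A rough sketch of the await Task.Yield idea (my interpretation, assuming MessageQueue is a BlockingCollection with TryTake; a production loop would also park on a real signal while the queue stays empty):

async Task ConsumeAsync(CancellationToken token)
{
    while (!token.IsCancellationRequested)
    {
        while (MessageQueue.TryTake(out var message))
        {
            Process(message);
        }
        // Re-queue this method at the back of the scheduler's queue so
        // other tasks get access to the thread instead of us spinning on it.
        await Task.Yield();
    }
}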

C# threading for processing message queue

I have the following requirements -
A thread which receives the messages and en-queue those.
A thread which processes the enqueued messages.
Now, the second thread always has to be alive, for which I have used an infinite while loop as follows:
private AutoResetEvent messageReset;
private Queue<byte[]> messageQueue;

// thread 2 method
private void ProcessIncomingMessages()
{
    messageReset.WaitOne(); // wait for signal
    while (true)
    {
        if (messageQueue.Count > 0)
        {
            // processing messages
        }
    }
}

public void SubmitMessageForProcessing(byte[] message)
{
    messageQueue.Enqueue(message); // enqueue message

    // Release the thread
    messageReset.Set();
}
Now, this infinite while loop is driving CPU utilization very high.
Is there any workaround to lower the CPU utilization?
NOTE: I can't add any Thread.Sleep statement, as the incoming messages are to be displayed on the UI with minimal delay.
Just use a BlockingCollection instead of Queue. It is thread-safe and will block on Take until some worker adds an item:
// Use default constructor to make BlockingCollection FIFO
private BlockingCollection<byte[]> messageQueue = new BlockingCollection<byte[]>();

// thread 2 method
private void ProcessIncomingMessages()
{
    while (true)
    {
        // will block until thread 1 adds a message
        byte[] message = messageQueue.Take();
        // processing messages
    }
}

public void SubmitMessageForProcessing(byte[] message)
{
    messageQueue.Add(message); // enqueue message
}
EDIT 2: I forgot to mention that, by using the default constructor, BlockingCollection will be FIFO. It will actually use a ConcurrentQueue as its item container.
If you wanted BlockingCollection to behave like a LIFO collection, you would need to pass an IProducerConsumerCollection that is LIFO to the constructor. The usual class for that would be ConcurrentStack.
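For example:

// FIFO (the default): backed by a ConcurrentQueue<T>
var fifo = new BlockingCollection<byte[]>();

// LIFO: pass a ConcurrentStack<T> as the backing store
var lifo = new BlockingCollection<byte[]>(new ConcurrentStack<byte[]>());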
EDIT: Some explanation of how your Queue is not thread-safe, and how this could lead to problems with your current code.
From the Microsoft documentation on Queue:
A Queue can support multiple readers concurrently, as long as the collection is not modified.
This means you cannot read and write from multiple threads at the same time.
Look at the following example, which also applies to the other answers that suggest just moving messageReset.WaitOne() into your while(true) block:
1. SubmitMessageForProcessing is called and signals messageReset.Set().
2. Thread 2 wakes up and tries to read data.
3. While thread 2 is reading data, SubmitMessageForProcessing is called a second time.
4. Now you are writing and reading at the same time, resulting in unexpected behavior (usually some kind of exception).
In your example, the while loop will busy-wait until the queue has at least one element. You can move the signal into that loop to reduce the busy-waiting and use less CPU.
private void ProcessIncomingMessages()
{
    while (true)
    {
        messageReset.WaitOne(100); // wait for signal
        while (messageQueue.Count > 0)
        {
            // processing messages
        }
    }
}
P.S. Unless you have some sort of custom locking mechanism, you must use a ConcurrentQueue<T> instead of a Queue<T> if you want to be thread-safe. Also, I put a timeout on the WaitOne call because there is a slim chance the signal will get set after you check Count but before the WaitOne call is reached. There may be other threading issues in your solution. If you're not confident about threading concerns, you might want to use a BlockingCollection, which takes care of a lot of the details for you.
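If you keep the event-based design rather than switching to BlockingCollection, a sketch of that combination (same names as the question, with Queue<T> swapped for ConcurrentQueue<T>) could be:

private AutoResetEvent messageReset = new AutoResetEvent(false);
private ConcurrentQueue<byte[]> messageQueue = new ConcurrentQueue<byte[]>();

private void ProcessIncomingMessages()
{
    while (true)
    {
        messageReset.WaitOne(100); // timeout covers a signal set between
                                   // draining the queue and waiting again
        while (messageQueue.TryDequeue(out byte[] message))
        {
            // processing messages
        }
    }
}

public void SubmitMessageForProcessing(byte[] message)
{
    messageQueue.Enqueue(message);
    messageReset.Set();
}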

Can I optimise this concurrency better?

I've recently begun writing my first multi-threaded code, and I'd appreciate some comments.
It delivers video samples from a buffer that is filled in the background by a stream parser (outside the scope of this question). If the buffer is empty, it needs to wait until the buffer level becomes acceptable and then continue.
Code is for Silverlight 4, some error-checking removed:
// External class requests samples - can happen multiple times concurrently
protected override void GetSampleAsync()
{
    Interlocked.Add(ref getVideoSampleRequestsOutstanding, 1);
}

// Runs on a background thread
void DoVideoPumping()
{
    do
    {
        if (getVideoSampleRequestsOutstanding > 0)
        {
            PumpNextVideoSample();
            // Decrement the counter
            Interlocked.Add(ref getVideoSampleRequestsOutstanding, -1);
        }
        else Thread.Sleep(0);
    } while (!this.StopAllBackgroundThreads);
}

void PumpNextVideoSample()
{
    // If the video sample buffer is empty, tell stream parser to give us more samples
    bool MyVidBufferIsEmpty = false;
    bool parserAtEndOfStream = false;
    ParseMoreSamplesIfMyVideoBufferIsLow(ref MyVidBufferIsEmpty, ref parserAtEndOfStream);

    if (parserAtEndOfStream) // No more data, start running down buffers
        this.RunningDownVideoBuffer = true;
    else if (MyVidBufferIsEmpty)
    {
        // Buffer is empty, wait for samples
        WaitingOnEmptyVideoBuffer = true;
        WaitOnEmptyVideoBuffer.WaitOne();
    }

    // Buffer is OK
    nextSample = DeQueueVideoSample(); // thread-safe, returns NULL if a problem

    // Send the sample to the external renderer
    ReportGetSampleCompleted(nextSample);
}
The code seems to work well. However, I'm told that using Thread.Sleep(...) like this is 'evil': when no samples are being requested, my code loops unnecessarily, eating up CPU time.
Can my code be further optimised? Since my class is designed for an environment where samples WILL be requested, does the potential 'pointless loop' scenario outweigh the simplicity of its current design?
Comments much appreciated.
This looks like the classic producer/consumer pattern. The normal way to solve this is with what is known as a blocking queue.
Version 4.0 of .net introduced a set of efficient, well-designed, concurrent collection classes for this very type of problem. I think BlockingCollection<T> will serve your present needs.
If you don't have access to .net 4.0 then there are many websites containing implementations of blocking queues. Personally my standard reference is Joe Duffy's book, Concurrent Programming on Windows. A good start would be Marc Gravell's blocking queue presented here in Stack Overflow.
The first advantage of using a blocking queue is that you stop using busy wait loops, hacky calls to Sleep() etc. Using a blocking queue to avoid this sort of code is always a good idea.
However, I perceive a more important benefit to using a blocking queue. At the moment your code to produce work items, consume them, and handle the queue is all intermingled. If you use a blocking queue correctly then you will end up with much better factored code which keeps separate various components of the algorithm: queue, producer and consumer.
You have one main problem: Thread.Sleep().
It has a granularity of ~20 ms, which is rather crude for video. In addition, Sleep(0) can starve lower-priority threads.
The better approach is waiting on a WaitHandle, preferably built into a queue; the blocking queue referenced in the other answer is a good and simple example.
The main key is that the threads need to be coordinated with signals, not by checking the value of a counter or the state of a data structure. Any checking takes resources (CPU), and thus you need signals (Monitor.Wait and Monitor.Pulse).
You could use an AutoResetEvent rather than a manual Thread.Sleep. It's fairly simple to do so:
AutoResetEvent e;

void RequestSample()
{
    Interlocked.Increment(ref requestsOutstanding);
    e.Set(); // also set this when StopAllBackgroundThreads = true!
}

void Pump()
{
    while (!this.StopAllBackgroundThreads)
    {
        e.WaitOne();
        int leftOver = Interlocked.Decrement(ref requestsOutstanding);
        while (leftOver >= 0)
        {
            PumpNextVideoSample();
            leftOver = Interlocked.Decrement(ref requestsOutstanding);
        }
        Interlocked.Increment(ref requestsOutstanding);
    }
}
Note that it's probably even more attractive to use a semaphore. Basically, synchronization overhead is liable to be almost nil in your scenario anyhow, and a simpler programming model is worth it. With a semaphore, you'd have something like this:
SimpleSemaphore sem;

void RequestSample()
{
    sem.Release();
}

void Pump()
{
    while (true)
    {
        sem.WaitOne();
        if (this.StopAllBackgroundThreads) break;
        PumpNextVideoSample();
    }
}
...I'd say the simplicity is worth it!
e.g. a simple implementation of a semaphore:
public sealed class SimpleSemaphore
{
    readonly object sync = new object();
    int val;

    public void WaitOne()
    {
        lock (sync)
        {
            while (true)
            {
                if (val > 0)
                {
                    val--;
                    return;
                }
                Monitor.Wait(sync);
            }
        }
    }

    public void Release()
    {
        lock (sync)
        {
            if (val == int.MaxValue)
                throw new Exception("Too many releases without waits.");
            val++;
            Monitor.Pulse(sync);
        }
    }
}
On one trivial benchmark, this trivial implementation needs ~1.7 seconds where Semaphore needs 7.5 and SemaphoreSlim needs 1.1; surprisingly reasonable, in other words.

How to async add elements to Queue<T> in C#?

public void EnqueueTask(int[] task)
{
    lock (_locker)
    {
        _taskQ.Enqueue(task);
        Monitor.PulseAll(_locker);
    }
}
So, here I'm adding elements to my queue, and then threads do some work with them. How can I add items to my queue asynchronously?
If you are using .NET 4, have a look at the new thread-safe collections; they are mostly non-blocking, so they will probably avoid the need for an async add.
Since you're using Queue<T> (recommended), Queue.Synchronized can't be used.
Besides that, I would use the thread pool. But your EnqueueTask method kind of implies that the threading logic is handled outside of your "TaskQueue" class (your method implies that it is a queue of tasks).
Your implementation also implies that it is not "here" we want to add logic, but rather in another place; the code you have there isn't really blocking for long, so I would turn things upside down.
It also implies that the thing taking items off the queue is already on another thread, since you use "PulseAll" to wake that thread up.
E.g.
public void StartQueueHandler()
{
    new Thread(StartWorker).Start();
}

private int[] Dequeue()
{
    lock (_locker)
    {
        while (_taskQ.Count == 0) Monitor.Wait(_locker);
        return _taskQ.Dequeue();
    }
}

private void StartWorker(object obj)
{
    while (_keepProcessing)
    {
        // Handle thread abort or have another "shut down" mechanism.
        int[] work = Dequeue();

        // Either: if work should be done in parallel without results,
        ThreadPool.QueueUserWorkItem(_ => DoWork(work));

        // or: if work should be done sequentially according to the queue,
        DoWork(work);
    }
}
Maybe something like this could work:
void AddToQueue(Queue queue, string mess)
{
    var t = new Thread(() => Queue.Synchronized(queue).Enqueue(mess));
    t.Start();
}
The new thread ensures that your current thread does not block.
Queue.Synchronized handles all locking of the queue. It could be replaced with your locker code, which might perform better.
The code from your question seems to indicate that you are attempting to implement a blocking queue. I make that observation from the call to Monitor.PulseAll after the Queue<T>.Enqueue. This is the normal pattern for signalling the dequeuing thread. So, if that is the case, the best option is to use the BlockingCollection class, which is available in .NET 4.0.
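A sketch of that migration (DoWork and the int[] task type are assumptions carried over from the snippets above):

using System.Collections.Concurrent;

private BlockingCollection<int[]> _taskQ = new BlockingCollection<int[]>();

public void EnqueueTask(int[] task)
{
    _taskQ.Add(task); // no lock and no PulseAll: the collection signals for you
}

// worker thread
private void ProcessTasks()
{
    // Blocks while empty; the loop ends once _taskQ.CompleteAdding() is called.
    foreach (int[] task in _taskQ.GetConsumingEnumerable())
    {
        DoWork(task);
    }
}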
