I have code that manages a large queue of data; it's locked with a lock statement to ensure only a single thread works on it at a time.
The order of data in the queue is really important, and each thread, with its parameters, can either add to the queue or take from it.
How do I ensure threads are queued to start in FIFO order, like my queue itself? Does the lock statement guarantee this?
var t = new Thread(() => parse(parameters)); // This is how I start my threads.
t.Start();
No, the lock statement does not guarantee FIFO ordering. Per Albahari:
If more than one thread contends the lock, they are queued on a “ready queue” and granted the lock on a first-come, first-served basis (a caveat is that nuances in the behavior of Windows and the CLR mean that the fairness of the queue can sometimes be violated).
If you want to ensure that your items are retrieved in a FIFO order, you should use the ConcurrentQueue<T> collection instead.
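For illustration, here is a minimal sketch of ConcurrentQueue<T> in this role (the int payload is just a placeholder):
using System.Collections.Concurrent;

var queue = new ConcurrentQueue<int>();
queue.Enqueue(1); // safe to call from any thread, no lock needed
queue.Enqueue(2);

int item;
if (queue.TryDequeue(out item))
{
    // item == 1: items come out in the order they were enqueued
}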
Edit: If you're targeting .NET 2.0, you could roll your own thread-safe queue. Here's a trivial one:
using System.Collections.Generic;

public class ThreadSafeQueue<T>
{
private readonly object syncLock = new object();
private readonly Queue<T> innerQueue = new Queue<T>();
public void Enqueue(T item)
{
lock (syncLock)
innerQueue.Enqueue(item);
}
public bool TryDequeue(out T item)
{
lock (syncLock)
{
if (innerQueue.Count == 0)
{
item = default(T);
return false;
}
item = innerQueue.Dequeue();
return true;
}
}
}
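A quick usage sketch of the queue above (keeping to C# 2.0 syntax, since the answer targets .NET 2.0):
ThreadSafeQueue<string> queue = new ThreadSafeQueue<string>();
queue.Enqueue("first");  // safe to call from any thread
queue.Enqueue("second");

string item;
if (queue.TryDequeue(out item))
{
    // item == "first": FIFO order is preserved
}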
Lock doesn't guarantee first-in, first-out access. An alternative, if you are limited to .NET 2.0, is Queue<T>. Keep in mind that Queue<T> is not thread-safe, so you must synchronize access to it.
I need a piece of code that can be executed by only one thread at a time, based on a parameter key:
private static readonly ConcurrentDictionary<string, SemaphoreSlim> Semaphores = new();
private async Task<TModel> GetValueWithBlockAsync<TModel>(string valueKey, Func<Task<TModel>> valueAction)
{
var semaphore = Semaphores.GetOrAdd(valueKey, s => new SemaphoreSlim(1, 1));
try
{
await semaphore.WaitAsync();
return await valueAction();
}
finally
{
semaphore.Release(); // Exception here - System.ObjectDisposedException
if (semaphore.CurrentCount > 0 && Semaphores.TryRemove(valueKey, out semaphore))
{
semaphore?.Dispose();
}
}
}
From time to time I get this error:
The semaphore has been disposed. : System.ObjectDisposedException: The semaphore has been disposed.
at System.Threading.SemaphoreSlim.CheckDispose()
at System.Threading.SemaphoreSlim.Release(Int32 releaseCount)
at Project.GetValueWithBlockAsync[TModel](String valueKey, Func`1 valueAction)
Every case I can imagine here looks thread-safe. Please help: what case did I miss?
You have a thread race here, where another task is trying to acquire the same semaphore, and acquires it when you Release - i.e. another thread is awaiting the semaphore.WaitAsync(). The check against CurrentCount is a race condition, and it could go either way depending on timing. The check for TryRemove is irrelevant, as the competing thread already got the semaphore out - it was, after all, awaiting the WaitAsync().
As discussed in the comments, you have a couple of race conditions here.
Thread 1 holds the lock and Thread 2 is waiting on WaitAsync(). Thread 1 releases the lock, and then checks semaphore.CurrentCount, before Thread 2 is able to acquire it.
Thread 1 holds the lock, releases it, and checks semaphore.CurrentCount, which passes. Thread 2 enters GetValueWithBlockAsync, calls Semaphores.GetOrAdd and fetches the semaphore. Thread 1 then calls Semaphores.TryRemove and disposes the semaphore.
You really need locking around the decision to remove an entry from Semaphores -- there's no way around this. You also don't have a way of tracking whether any threads have fetched a semaphore from Semaphores (and are either currently waiting on it, or haven't yet got to that point).
One way is to do something like this: have a lock which is shared between everyone, but which is only needed when fetching/creating a semaphore, and deciding whether to dispose it. We manually keep track of how many threads currently have an interest in a particular semaphore. When a thread has released the semaphore, it then acquires the shared lock to check whether anyone else currently has an interest in that semaphore, and disposes it only if noone else does.
private static readonly object semaphoresLock = new();
private static readonly Dictionary<string, State> semaphores = new();
private async Task<TModel> GetValueWithBlockAsync<TModel>(string valueKey, Func<Task<TModel>> valueAction)
{
State state;
lock (semaphoresLock)
{
if (!semaphores.TryGetValue(valueKey, out state))
{
state = new();
semaphores[valueKey] = state;
}
state.Count++;
}
try
{
await state.Semaphore.WaitAsync();
return await valueAction();
}
finally
{
state.Semaphore.Release();
lock (semaphoresLock)
{
state.Count--;
if (state.Count == 0)
{
semaphores.Remove(valueKey);
state.Semaphore.Dispose();
}
}
}
}
private class State
{
public int Count { get; set; }
public SemaphoreSlim Semaphore { get; } = new(1, 1);
}
The other option, of course, is to let Semaphores grow. Maybe you have a periodic operation to go through and clear out anything which isn't being used, but this will of course need to be protected to ensure that a thread doesn't suddenly become interested in a semaphore which is being cleared up.
I have a simple scenario with two threads: the first thread permanently reads some data and enqueues it into a queue. The second thread first peeks at a single object from that queue and performs some conditional checks. If these pass, the object is dequeued and passed on for processing.
I have tried to use ConcurrentQueue, which is a thread-safe implementation of a simple queue, but the problem is that all of its calls are blocking. This means that while the first thread is enqueuing an object, the second thread can't peek or dequeue one.
In my situation I need to enqueue at the end and dequeue from the beginning of the queue at the same time.
C#'s lock statement would also block.
So my question is whether it is possible to do these both operations in parallel without blocking each other in a thread safe way.
These were my first attempts, and here is a similar example of my problem.
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
namespace Scenario {
public class Program {
public static void Main(string[] args) {
Scenario scenario = new Scenario();
scenario.Start();
Console.ReadKey();
}
public class Scenario {
public Scenario() {
someData = new Queue<int>();
}
public void Start() {
Task.Factory.StartNew(firstThread);
Task.Factory.StartNew(secondThread);
}
private void firstThread() {
Random random = new Random();
while (true) {
int newData = random.Next(1, 100);
someData.Enqueue(newData);
Console.WriteLine("Enqueued " + newData);
}
}
private void secondThread() {
Random random = new Random();
while (true) {
if (someData.Count == 0) {
continue;
}
int singleData = someData.Peek();
int someValue = random.Next(1, 100);
if (singleData > someValue || singleData == 1 || singleData == 99) {
singleData = someData.Dequeue();
Console.WriteLine("Dequeued " + singleData);
// ... processing ...
}
}
}
private readonly Queue<int> someData;
}
}
}
Second example:
public class Scenario {
public Scenario() {
someData = new ConcurrentQueue<int>();
}
public void Start() {
Task.Factory.StartNew(firstThread);
Task.Factory.StartNew(secondThread);
}
private void firstThread() {
Random random = new Random();
while (true) {
int newData = random.Next(1, 100);
someData.Enqueue(newData);
lock (syncRoot) { Console.WriteLine($"Enqued {enqued++} Dequed {dequed}"); }
}
}
private void secondThread() {
Random random = new Random();
while (true) {
if (!someData.TryPeek(out int singleData)) {
continue;
}
int someValue = random.Next(1, 100);
if (singleData > someValue || singleData == 1 || singleData == 99) {
if (!someData.TryDequeue(out singleData)) {
continue;
}
lock (syncRoot) { Console.WriteLine($"Enqued {enqued} Dequed {dequed++}"); }
// ... processing ...
}
}
}
private int enqued = 0;
private int dequed = 0;
private readonly ConcurrentQueue<int> someData;
private static readonly object syncRoot = new object();
}
First off: I strongly encourage you to reconsider whether your technique of having multiple threads and a shared memory data structure is even the right approach at all. Code that has multiple threads of control sharing access to data structures is hard to get right, and failures can be subtle, catastrophic, and hard to debug.
Second: If you are bent upon multiple threads and a shared memory data structure, I strongly encourage you to use designed-by-experts data types like concurrent queues, rather than rolling your own.
Now that I've got those warnings out of the way: here is a way to address your concern. It is sufficiently complicated that you should obtain the services of an expert on the C# memory model to verify the correctness of your solution if you go with this. I would not consider myself to be competent to implement the scheme I'm about to describe, not without help of someone who is actually an expert on the memory model.
The goal is to have a queue that supports simultaneous enqueue and dequeue operations and low lock contention.
What you want is two immutable stack variables called the enqueue stack and the dequeue stack, each with their own lock.
The enqueue operation is:
Take the enqueue lock
Push the item onto the enqueue stack; this produces a new stack in O(1) time.
Assign the newly produced stack to the enqueue stack variable.
Release the enqueue lock
The dequeue operation is:
Take the dequeue lock
If the dequeue stack is empty then
take the enqueue lock
enumerate the enqueue stack and use it to build the dequeue stack; this reverses the enqueue stack, which maintains the property we want: that the first in is the first out.
assign an empty immutable stack to the enqueue stack variable
release the enqueue lock
assign the new stack to the dequeue stack
If the dequeue stack is still empty: throw, or abandon and retry later, or sleep until signaled by the enqueue operation, or whatever the right thing to do here is.
Otherwise, the dequeue stack is now not empty, so:
Pop an item from the dequeue stack, which produces a new stack in O(1).
Assign the new stack to the dequeue stack variable.
Release the dequeue lock.
Process the item.
Note that of course if there is only one thread dequeuing, then we don't need the dequeue lock at all, but with this scheme there can be many threads dequeuing.
Suppose there are 1000 items on the enqueue stack and zero on the dequeue stack. When we dequeue the first time, we do an expensive O(n) operation of reversing the enqueue stack once, but now we have 1000 items on the dequeue stack. Once the dequeue stack is big, the dequeueing thread can spend most of its time processing, while the enqueuing thread spends most of its time enqueuing. Contention on the enqueue lock is rare, but expensive when it happens.
Why use immutable data structures? Everything I described here would also work with mutable stacks, but (1) it is easier to reason about immutable stacks, (2) if you want to really live dangerously you can elide some of the locks and go for interlocked swap operations; make sure you understand everything about the possible re-orderings of operations in low-lock conditions if you're doing that.
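Here is a minimal sketch of the scheme just described, using ImmutableStack<T> from the System.Collections.Immutable package. The class and member names are mine, and per the warning above, this should be reviewed by someone expert in the memory model before real use:
using System.Collections.Immutable;

public class TwoLockQueue<T>
{
    private readonly object enqueueLock = new object();
    private readonly object dequeueLock = new object();
    private ImmutableStack<T> enqueueStack = ImmutableStack<T>.Empty;
    private ImmutableStack<T> dequeueStack = ImmutableStack<T>.Empty;

    public void Enqueue(T item)
    {
        lock (enqueueLock)
        {
            // Push produces a new stack in O(1); store it back.
            enqueueStack = enqueueStack.Push(item);
        }
    }

    public bool TryDequeue(out T item)
    {
        lock (dequeueLock)
        {
            if (dequeueStack.IsEmpty)
            {
                lock (enqueueLock)
                {
                    // Enumerating the enqueue stack yields newest-first, so
                    // pushing onto the dequeue stack reverses it: first in,
                    // first out.
                    foreach (T t in enqueueStack)
                        dequeueStack = dequeueStack.Push(t);
                    enqueueStack = ImmutableStack<T>.Empty;
                }
            }
            if (dequeueStack.IsEmpty)
            {
                item = default(T);
                return false; // caller decides: retry, sleep, or give up
            }
            dequeueStack = dequeueStack.Pop(out item);
            return true;
        }
    }
}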
UPDATE:
The real problem is that I can't dequeue and process a lot of points, because I am permanently reading and enqueuing new points. The enqueue calls are blocking the processing step.
Well if that is your real problem then mentioning it in the question instead of burying it in a comment would be a good idea. Help us help you.
There are a number of things you could do here. You could for example set the priority of the enqueuing thread lower than the priority of the dequeuing thread. Or you could have multiple dequeuing threads, as many as there are CPUs in your machine. Or you could dynamically choose to drop some enqueue operations if the dequeues are not keeping up. Without knowing a lot more about your actual problem it is hard to give advice on how to solve it.
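As a sketch of the first suggestion (the loop bodies are placeholders for the asker's actual reading and processing code):
using System.Threading;

class PrioritizedPipeline
{
    static void ProduceLoop() { /* read and enqueue points */ }
    static void ConsumeLoop() { /* dequeue and process points */ }

    static void Main()
    {
        // Give the consumer more CPU time by lowering the producer's priority.
        Thread producer = new Thread(ProduceLoop) { Priority = ThreadPriority.BelowNormal };
        Thread consumer = new Thread(ConsumeLoop);
        producer.Start();
        consumer.Start();
    }
}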
I need to implement a sort of task buffer. Basic requirements are:
Process tasks in a single background thread
Receive tasks from multiple threads
Process ALL received tasks i.e. make sure buffer is drained of buffered tasks after a stop signal is received
Order of tasks received per thread must be maintained
I was thinking of implementing it using a Queue, as below. I would appreciate feedback on this implementation. Are there any better ideas for implementing such a thing?
public class TestBuffer
{
private readonly object queueLock = new object();
private Queue<Task> queue = new Queue<Task>();
private bool running = false;
public TestBuffer()
{
}
public void start()
{
Thread t = new Thread(new ThreadStart(run));
t.Start();
}
private void run()
{
running = true;
bool run = true;
while(run)
{
Task task = null;
// Lock queue before doing anything
lock (queueLock)
{
// If the queue is currently empty and it is still running
// we need to wait until we're told something changed
if (queue.Count == 0 && running)
{
Monitor.Wait(queueLock);
}
// Check there is something in the queue
// Note - there might not be anything in the queue if we were waiting for something to change and the queue was stopped
if (queue.Count > 0)
{
task = queue.Dequeue();
}
}
// If something was dequeued, handle it
if (task != null)
{
handle(task);
}
// Lock the queue again and check whether we need to run again
// Note - Make sure we drain the queue even if we are told to stop before it is empty
lock (queueLock)
{
run = queue.Count > 0 || running;
}
}
}
public void enqueue(Task toEnqueue)
{
lock (queueLock)
{
queue.Enqueue(toEnqueue);
Monitor.PulseAll(queueLock);
}
}
public void stop()
{
lock (queueLock)
{
running = false;
Monitor.PulseAll(queueLock);
}
}
public void handle(Task dequeued)
{
dequeued.execute();
}
}
You can actually handle this with the out-of-the-box BlockingCollection.
It is designed to have 1 or more producers, and 1 or more consumers. In your case, you would have multiple producers and one consumer.
When you receive a stop signal, have that signal handler
Signal producer threads to stop
Call CompleteAdding on the BlockingCollection instance
The consumer thread will continue to run until all queued items are removed and processed, then it will encounter the condition that the BlockingCollection is complete. When the thread encounters that condition, it just exits.
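A minimal sketch of that arrangement (WorkItem stands in for the asker's Task class; the name is invented here):
using System.Collections.Concurrent;
using System.Threading;

public class WorkItem
{
    public void Execute() { /* real work goes here */ }
}

public class TaskBuffer
{
    private readonly BlockingCollection<WorkItem> queue = new BlockingCollection<WorkItem>();
    private Thread worker;

    public void Start()
    {
        worker = new Thread(() =>
        {
            // Blocks while the collection is empty, and exits only after
            // CompleteAdding has been called and every item is drained.
            foreach (WorkItem item in queue.GetConsumingEnumerable())
                item.Execute();
        });
        worker.Start();
    }

    public void Enqueue(WorkItem item)
    {
        queue.Add(item); // safe from any producer thread; per-thread order is kept
    }

    public void Stop()
    {
        queue.CompleteAdding(); // no further items accepted
        worker.Join();          // wait for the buffer to drain
    }
}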
Consider ConcurrentQueue, which is in fact FIFO. If it is not suitable, try some of its relatives in Thread-Safe Collections. Using these lets you avoid some risks.
I suggest you take a look at TPL DataFlow. BufferBlock is what you're looking for, but it offers so much more.
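A minimal sketch of BufferBlock (from the System.Threading.Tasks.Dataflow NuGet package; the int payload is a placeholder):
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class BufferBlockSketch
{
    static async Task Main()
    {
        var buffer = new BufferBlock<int>();

        // Consumer: drains items until the block completes.
        Task consumer = Task.Run(async () =>
        {
            while (await buffer.OutputAvailableAsync())
            {
                int item = await buffer.ReceiveAsync();
                // ... process item ...
            }
        });

        // Producers can Post from any thread.
        for (int i = 0; i < 10; i++)
            buffer.Post(i);

        buffer.Complete(); // consumer exits once the buffer drains
        await consumer;
    }
}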
Take a look at my lightweight implementation of a thread-safe FIFO queue. It is a non-blocking synchronization tool that uses the thread pool, which in most cases beats creating your own threads, and it avoids blocking tools such as locks and mutexes. https://github.com/Gentlee/SerialQueue
Usage:
var queue = new SerialQueue();
var result = await queue.Enqueue(() => /* code to synchronize */);
You could use Rx on .NET 3.5 for this. It may never have made it out of RC, but I believe it is stable* and in use by many production systems. If you don't need Subject, you may find backported primitives (like concurrent collections) for .NET 3.5 that didn't ship with the .NET Framework until 4.0.
Alternative to Rx (Reactive Extensions) for .net 3.5
* - Nit picker's corner: Except for maybe advanced time windowing, which is out of scope, but buffers (by count and time), ordering, and schedulers are all stable.
I have scenarios where I need a main thread to wait until every one of a set of possibly more than 64 threads has completed its work. To avoid the 64-handle limit on WaitHandle.WaitAll(), I wrote the following helper utility:
public static void WaitAll(WaitHandle[] handles)
{
if (handles == null)
throw new ArgumentNullException("handles",
"WaitHandle[] handles was null");
foreach (WaitHandle wh in handles) wh.WaitOne();
}
With this utility method, however, each wait handle is only examined after every preceding one in the array has been signalled, so it is in effect synchronous. It will not work if the wait handles are AutoResetEvents, which reset as soon as a waiting thread is released.
To fix this I am considering changing the code to the following, but I would like others to check whether it will work, point out any issues, or suggest a better way.
Thanks in advance:
public static void WaitAllParallel(WaitHandle[] handles)
{
if (handles == null)
throw new ArgumentNullException("handles",
"WaitHandle[] handles was null");
int actThreadCount = handles.Length;
object locker = new object();
foreach (WaitHandle wh in handles)
{
WaitHandle qwH = wh;
ThreadPool.QueueUserWorkItem(
delegate
{
try { qwH.WaitOne(); }
finally { lock(locker) --actThreadCount; }
});
}
while (actThreadCount > 0) Thread.Sleep(80);
}
If you know how many threads you have, you can use an interlocked decrement. This is how I usually do it:
private AutoResetEvent eventDone = new AutoResetEvent(false);
private int totalCount;

void StartWorkers()
{
    totalCount = 128;
    for (int i = 0; i < 128; i++)
    {
        ThreadPool.QueueUserWorkItem(ThreadWorker);
    }
}

void ThreadWorker(object state)
{
    try
    {
        // ... work and more work
    }
    finally
    {
        int runningCount = Interlocked.Decrement(ref totalCount);
        if (0 == runningCount)
        {
            // This is the last thread, notify the waiters
            eventDone.Set();
        }
    }
}
Actually, most of the time I don't even signal; instead I invoke a callback that continues the processing from where the waiter would have continued. Fewer blocked threads, more scalability.
I know this is different and may not apply to your case (e.g. it certainly won't work if some of those handles belong to I/O or events rather than threads), but it may be worth thinking about.
I'm not sure what exactly you're trying to do, but would a CountdownEvent (.NET 4.0) conceptually solve your problem?
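For reference, a minimal sketch of the CountdownEvent approach (the worker count and the work body are placeholders):
using System.Threading;

int workerCount = 128;
var countdown = new CountdownEvent(workerCount);

for (int i = 0; i < workerCount; i++)
{
    ThreadPool.QueueUserWorkItem(_ =>
    {
        try
        {
            // ... the worker's actual work ...
        }
        finally
        {
            countdown.Signal(); // decrements the remaining count
        }
    });
}

countdown.Wait(); // blocks until all workers have signaled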
I'm not a C# or .NET programmer, but you could use a semaphore that is posted when one of your worker threads exits. The monitoring thread would simply wait on the semaphore n times where n is the number of worker threads. Semaphores are traditionally used to count resources in use but they can be used to count jobs completed by waiting on the same semaphore for n times.
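Translated into C# (the answerer wrote in general terms, so this SemaphoreSlim rendering is my own):
using System.Threading;

int n = 8; // number of worker threads
var done = new SemaphoreSlim(0);

for (int i = 0; i < n; i++)
{
    new Thread(() =>
    {
        // ... the worker's actual work ...
        done.Release(); // post the semaphore as the thread exits
    }).Start();
}

// Monitoring thread: wait on the same semaphore n times,
// once per completed worker.
for (int i = 0; i < n; i++)
    done.Wait();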
When working with lots of simultaneous threads, I prefer to add each thread's ManagedThreadId into a Dictionary when I start the thread, and then have each thread invoke a callback routine that removes the dying thread's id from the Dictionary. The Dictionary's Count property tells you how many threads are active. Use the value side of the key/value pair to hold info that your UI thread can use to report status. Wrap the Dictionary with a lock to keep things safe.
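A minimal sketch of that pattern (class and member names are mine):
using System.Collections.Generic;
using System.Threading;

public class ThreadTracker
{
    private readonly object gate = new object();
    private readonly Dictionary<int, string> active = new Dictionary<int, string>();

    // Each thread registers itself when started; the value side can
    // hold status info for the UI thread to report.
    public void Register(string statusInfo)
    {
        lock (gate) active[Thread.CurrentThread.ManagedThreadId] = statusInfo;
    }

    // Each thread invokes this callback as it finishes.
    public void Unregister()
    {
        lock (gate) active.Remove(Thread.CurrentThread.ManagedThreadId);
    }

    // How many threads are still active.
    public int ActiveCount
    {
        get { lock (gate) return active.Count; }
    }
}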
ThreadPool.QueueUserWorkItem(o =>
{
try
{
using (var h = (o as WaitHandle))
{
if (!h.WaitOne(100000))
{
// Alert main thread of the timeout
}
}
}
finally
{
Interlocked.Decrement(ref actThreadCount);
}
}, wh);
I'm trying to use WebClient to download a bunch of files asynchronously. From my understanding, this is possible, but you need to have one WebClient object for each download. So I figured I'd just throw a bunch of them in a queue at the start of my program, then pop them off one at a time and tell them to download a file. When the file is done downloading, they can get pushed back onto the queue.
Pushing stuff onto my queue shouldn't be too bad, I just have to do something like:
lock(queue) {
queue.Enqueue(webClient);
}
Right? But what about popping them off? I want my main thread to sleep when the queue is empty (wait until another web client is ready so it can start the next download). I suppose I could use a Semaphore alongside the queue to keep track of how many elements are in the queue, and that would put my thread to sleep when necessary, but it doesn't seem like a very good solution. What happens if I forget to decrement/increment my Semaphore every time I push/pop something on/off my queue and they get out of sync? That would be bad. Isn't there some nice way to have queue.Dequeue() automatically sleep until there is an item to dequeue then proceed?
I'd also welcome solutions that don't involve a queue at all. I just figured a queue would be the easiest way to keep track of which WebClients are ready for use.
Here's an example using a Semaphore. IMO it is a lot cleaner than using a Monitor:
using System.Collections.Generic;
using System.Threading;

public class BlockingQueue<T>
{
Queue<T> _queue = new Queue<T>();
Semaphore _sem = new Semaphore(0, Int32.MaxValue);
public void Enqueue(T item)
{
lock (_queue)
{
_queue.Enqueue(item);
}
_sem.Release();
}
public T Dequeue()
{
_sem.WaitOne();
lock (_queue)
{
return _queue.Dequeue();
}
}
}
What you want is a producer/consumer queue.
I have a simple example of this in my threading tutorial - scroll about half way down that page. It was written pre-generics, but it should be easy enough to update. There are various features you may need to add, such as the ability to "stop" the queue: this is often performed by using a sort of "null work item" token; you inject as many "stop" items in the queue as you have dequeuing threads, and each of them stops dequeuing when it hits one.
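A sketch of that stop-token technique (WorkItem and its Stop sentinel are invented names; BlockingQueue<T> stands for any queue with a blocking Dequeue, such as the samples further down this page, which reject null items, hence a dedicated sentinel instead of a literal null token):
public class WorkItem
{
    // A dedicated sentinel avoids enqueuing null, which some
    // queue implementations reject.
    public static readonly WorkItem Stop = new WorkItem();
    public virtual void Execute() { /* real work goes here */ }
}

public class Consumer
{
    public void ConsumeLoop(BlockingQueue<WorkItem> queue)
    {
        while (true)
        {
            WorkItem item = queue.Dequeue(); // blocks until an item arrives
            if (item == WorkItem.Stop)
                break; // inject one Stop token per dequeuing thread
            item.Execute();
        }
    }
}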
Searching for "producer consumer queue" may well provide you with better code samples - this was really just to demonstrate waiting/pulsing.
IIRC, there are types in .NET 4.0 (as part of Parallel Extensions) which will do the same thing but much better :) I think you want a BlockingCollection wrapping a ConcurrentQueue.
I use a BlockingQueue to deal with exactly this type of situation. You can call .Dequeue when the queue is empty, and the calling thread will simply wait until there is something to Dequeue.
using System.Collections;
using System.Collections.Generic;
using System.Threading;

public class BlockingQueue<T> : IEnumerable<T>
{
private int _count = 0;
private Queue<T> _queue = new Queue<T>();
public T Dequeue()
{
lock (_queue)
{
while (_count <= 0)
Monitor.Wait(_queue);
_count--;
return _queue.Dequeue();
}
}
public void Enqueue(T data)
{
if (data == null)
throw new ArgumentNullException("data");
lock (_queue)
{
_queue.Enqueue(data);
_count++;
Monitor.Pulse(_queue);
}
}
IEnumerator<T> IEnumerable<T>.GetEnumerator()
{
while (true)
yield return Dequeue();
}
IEnumerator IEnumerable.GetEnumerator()
{
return ((IEnumerable<T>) this).GetEnumerator();
}
}
Just use this in place of a normal Queue and it should do what you need.
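For example, applied to the asker's WebClient pool (a sketch; the re-enqueue on completion is only outlined):
using System.Net;

BlockingQueue<WebClient> pool = new BlockingQueue<WebClient>();
for (int i = 0; i < 4; i++)
    pool.Enqueue(new WebClient());

// The main thread sleeps inside Dequeue until a client is available.
WebClient client = pool.Dequeue();
// ... start a download with client, and Enqueue it back into the pool
//     from the download-completed handler ...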