Just checking... _count is being accessed safely, right?
Both methods are accessed by multiple threads.
private int _count;

public void CheckForWork() {
    if (_count >= MAXIMUM) return;
    Interlocked.Increment(ref _count);
    Task t = Task.Run(() => Work());
    t.ContinueWith(CompletedWorkHandler);
}

public void CompletedWorkHandler(Task completedTask) {
    Interlocked.Decrement(ref _count);
    // Handle errors, etc...
}
This is thread safe, right?
Suppose MAXIMUM is one, count is zero, and five threads call CheckForWork.
All five threads could verify that count is less than MAXIMUM. The counter would then be bumped up to five and five jobs would start.
That seems contrary to the intention of the code.
Moreover: the field is not volatile. So what mechanism guarantees that any thread will read an up-to-date value on the no-memory-barrier path? Nothing guarantees that! You only make a memory barrier if the condition is false.
More generally: you are making a false economy here. By going with a low-lock solution you are saving the dozen nanoseconds that the uncontended lock would take. Just take the lock. You can afford the extra dozen nanoseconds.
And even more generally: do not write low-lock code unless you are an expert on processor architectures and know all optimizations that a CPU is permitted to perform on low-lock paths. You are not such an expert. I am not either. That's why I don't write low-lock code.
No, if (_count >= MAXIMUM) return; is not thread safe.
edit: You'd have to lock around the read too, which should then logically be grouped with the increment, so I'd rewrite it like this:
private int _count;
private readonly Object _locker_ = new Object();

public void CheckForWork() {
    lock (_locker_)
    {
        if (_count >= MAXIMUM)
            return;
        _count++;
    }
    Task.Run(() => Work()).ContinueWith(CompletedWorkHandler);
}

public void CompletedWorkHandler(Task completedTask) {
    lock (_locker_)
    {
        _count--;
    }
    ...
}
That's what Semaphore and SemaphoreSlim are for:
private readonly SemaphoreSlim WorkSem = new SemaphoreSlim(Maximum);

public void CheckForWork() {
    if (!WorkSem.Wait(0)) return; // no free slot: bail out without blocking
    Task.Run(() => Work()).ContinueWith(CompletedWorkHandler);
}

public void CompletedWorkHandler(Task completedTask) {
    WorkSem.Release();
    ...
}
No, what you have is not safe. The check to see if _count >= MAXIMUM could race with the call to Interlocked.Increment from another thread. This is actually really hard to solve using low-lock techniques. To get this to work properly you need to make a series of several operations appear atomic without using a lock. That is the hard part. The series of operations in question here are:
Read _count
Test _count >= MAXIMUM
Make a decision based on the above.
Increment _count depending on the decision made.
If you do not make all 4 of these steps appear atomic then there will be a race condition. The standard pattern for performing a complex operation without taking a lock is as follows.
public static T InterlockedOperation<T>(ref T location, Func<T, T> op)
    where T : class // Interlocked.CompareExchange<T> requires a reference type
{
    T initial, computed;
    do
    {
        initial = location;
        computed = op(initial); // op represents the operation to make atomic
    }
    while (Interlocked.CompareExchange(ref location, computed, initial) != initial);
    return computed;
}
Notice what is happening: the operation is repeatedly attempted until the CompareExchange (ICX) operation determines that the initial value has not changed between the time it was first read and the time the attempt was made to change it. This is the standard pattern, and the magic all happens because of the ICX call. Note, however, that this does not take into account the ABA problem.[1]
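For instance, here is a hypothetical use of that helper (the Node type and Push method are my own illustration, not from the question): a lock-free push onto an immutable singly linked list.

sealed class Node
{
    public readonly int Value;
    public readonly Node Next;
    public Node(int value, Node next) { Value = value; Next = next; }
}

private Node _head;

public void Push(int value)
{
    // each retry re-reads _head, so a concurrent push just causes another loop iteration
    InterlockedOperation(ref _head, head => new Node(value, head));
}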
What you could do:
So taking the above pattern and incorporating it into your code would result in this:
public void CheckForWork()
{
    int initial, computed;
    do
    {
        initial = _count;
        computed = initial < MAXIMUM ? initial + 1 : initial;
    }
    while (Interlocked.CompareExchange(ref _count, computed, initial) != initial);
    if (computed > initial)
    {
        Task.Run(() => Work());
    }
}
Personally, I would punt on the low-lock strategy entirely. There are several problems with what I presented above.
This might actually run slower than taking a hard lock. The reasons are difficult to explain and outside the scope of my answer.
Any deviation from what is above will likely cause the code to fail. Yes, it really is that brittle.
It is hard to understand. I mean look at it. It is ugly.
What you should do:
Going with the hard lock route your code might look like this.
private object _lock = new object();
private int _count;

public void CheckForWork()
{
    lock (_lock)
    {
        if (_count >= MAXIMUM) return;
        _count++;
    }
    Task.Run(() => Work()).ContinueWith(CompletedWorkHandler);
}

public void CompletedWorkHandler(Task completedTask)
{
    lock (_lock)
    {
        _count--;
    }
}
Notice that this is much simpler and considerably less error prone. You may find that this approach (hard lock) is actually faster than what I showed above (low lock). Again, the reason is tricky, and there are techniques that can be used to speed things up, but that is outside the scope of this answer.
[1] The ABA problem is not really an issue in this case because the logic does not depend on _count remaining unchanged. It only matters that its value is the same at two points in time, regardless of what happened in between. In other words, the problem reduces to one in which the value merely appears unchanged, even though in reality it may have changed.
Define thread safe.
If you want to ensure that _count will never be greater than MAXIMUM, then you did not succeed.
What you should do is lock around that too:
private int _count;
private object locker = new object();

public void CheckForWork()
{
    lock (locker)
    {
        if (_count >= MAXIMUM) return;
        _count++;
    }
    Task.Run(() => Work()).ContinueWith(CompletedWorkHandler);
}

public void CompletedWorkHandler(Task completedTask)
{
    lock (locker)
    {
        _count--;
    }
    ...
}
You might also want to take a look at the SemaphoreSlim class.
You can do the following if you don't want to lock or move to a semaphore:
if (_count >= MAXIMUM) return; // not necessary, but a handy early return

if (Interlocked.Increment(ref _count) > MAXIMUM)
{
    Interlocked.Decrement(ref _count); // we overshot: restore the old value
    return;
}
Task.Run(() => Work());
Increment returns the incremented value, so you can double-check whether _count was still below MAXIMUM; if the test fails, the decrement restores the old value. Note that _count can briefly overshoot MAXIMUM before it is restored, which is harmless here because no work is started in that window.
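Putting that fragment back into the question's methods, the whole thing might look like this (a sketch; the ContinueWith wiring is taken from the original question):

private int _count;

public void CheckForWork()
{
    if (_count >= MAXIMUM) return; // cheap early-out only; the real check is below

    if (Interlocked.Increment(ref _count) > MAXIMUM)
    {
        Interlocked.Decrement(ref _count); // we overshot: restore the old value
        return;
    }
    Task.Run(() => Work()).ContinueWith(CompletedWorkHandler);
}

public void CompletedWorkHandler(Task completedTask)
{
    Interlocked.Decrement(ref _count);
}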
Related
I wonder whether there is a better solution for this task. There is a function which is called concurrently by a number of threads, but if some thread is already executing the code, the other threads should skip that part of the code and wait until that thread finishes execution. Here is what I have for now:
int _flag = 0;
readonly ManualResetEventSlim Mre = new ManualResetEventSlim();

void Foo()
{
    if (Interlocked.CompareExchange(ref _flag, 1, 0) == 0)
    {
        Mre.Reset();
        try
        {
            // do stuff
        }
        finally
        {
            Mre.Set();
            Interlocked.Exchange(ref _flag, 0);
        }
    }
    else
    {
        Mre.Wait();
    }
}
What I want to achieve is faster execution, lower overhead and prettier look.
You could use a combination of an AutoResetEvent and a Barrier to do this.
You can use the AutoResetEvent to ensure that only one thread enters a "work" method.
The Barrier is used to ensure that all the threads wait until the one that entered the "work" method has returned from it.
Here's some sample code:
using System;
using System.Threading;
using System.Threading.Tasks;

namespace Demo
{
    class Program
    {
        const int TASK_COUNT = 3;
        static readonly Barrier barrier = new Barrier(TASK_COUNT);
        static readonly AutoResetEvent gate = new AutoResetEvent(true);

        static void Main()
        {
            Parallel.Invoke(task, task, task);
        }

        static void task()
        {
            while (true)
            {
                Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " is waiting at the gate.");

                // This bool is just for test purposes to prevent the same thread from doing the
                // work every time!
                bool didWork = false;

                if (gate.WaitOne(0))
                {
                    work();
                    didWork = true;
                    gate.Set();
                }

                Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " is waiting at the barrier.");
                barrier.SignalAndWait();

                if (didWork)
                    Thread.Sleep(10); // Give a different thread a chance to get past the gate!
            }
        }

        static void work()
        {
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " is entering work()");
            Thread.Sleep(3000);
            Console.WriteLine(Thread.CurrentThread.ManagedThreadId + " is leaving work()");
        }
    }
}
However, the Task Parallel Library may well have a better, higher-level solution. It's worth reading up on it a bit.
First of all, the waiting threads wouldn't do anything: they only wait, and after they get the signal from the event they simply move on out of the method, so you should add a while loop. After that, you can use an AutoResetEvent instead of the manual one, as @MatthewWatson suggested. You might also consider a SpinWait inside the loop, which is a lightweight solution.
Second, why use an int when the flag field is boolean by nature?
Third, why not use simple locking, as @grrrrrrrrrrrrr suggested? That is exactly what you are doing here: forcing the other threads to wait for one. If your code writes something from only one thread at a given time, but can be read by multiple threads, you can use the ReaderWriterLockSlim object for such synchronization, as sketched below.
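A minimal ReaderWriterLockSlim sketch (the field names here are illustrative, not from the question): any number of threads may read concurrently, while a writer gets exclusive access.

private readonly ReaderWriterLockSlim _rwLock = new ReaderWriterLockSlim();
private int _sharedState;

public int ReadState()
{
    _rwLock.EnterReadLock();  // any number of readers may hold this at once
    try { return _sharedState; }
    finally { _rwLock.ExitReadLock(); }
}

public void WriteState(int value)
{
    _rwLock.EnterWriteLock(); // exclusive: blocks until all readers have exited
    try { _sharedState = value; }
    finally { _rwLock.ExitWriteLock(); }
}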
What I want to achieve is faster execution, lower overhead and prettier look.
faster execution
Unless your "Do Stuff" is extremely fast, this code shouldn't have any major overhead.
lower overhead
Again, Interlocked.Exchange/CompareExchange are very low overhead, as is the manual reset event.
If your "Do Stuff" is really fast, e.g. moving a linked list head, then you can spin:
prettier look
Correct multi-threaded C# code rarely looks pretty when compared to correct single threaded C# code. The language idioms are just not there yet.
That said: if you have a really fast operation ("a few tens of cycles"), then you can spin (although without knowing exactly what your code is doing, I can't say if this is correct):
if (Interlocked.CompareExchange(ref _flag, 1, 0) == 0)
{
    try
    {
        // do stuff that is very quick.
    }
    finally
    {
        Interlocked.Exchange(ref _flag, 0);
    }
}
else
{
    SpinWait.SpinUntil(() => _flag == 0);
}
The first thing that springs to mind is to change it to use a lock. This won't skip the code, but will cause each thread getting to it to pause while the first thread executes its stuff. This way the lock will also automatically get released in the case of an exception.
object syncer = new object();

void Foo()
{
    lock (syncer)
    {
        //Do stuff
    }
}
I'm trying to implement something that manages a pool of resources such that the calling code can request an object and will be given one from the pool if it's available, or else it will be made to wait. I'm having trouble getting the synchronization to work correctly, however. What I have in my pool class is something like this (where autoEvent is an AutoResetEvent initially set as signaled):
public Foo GetFooFromPool()
{
    autoEvent.WaitOne();

    var foo = Pool.FirstOrDefault(p => !p.InUse);
    if (foo != null)
    {
        foo.InUse = true;
        autoEvent.Set();
        return foo;
    }
    else if (Pool.Count < Capacity)
    {
        System.Diagnostics.Debug.WriteLine("count {0}\t capacity {1}", Pool.Count, Capacity);
        foo = new Foo() { InUse = true };
        Pool.Add(foo);
        autoEvent.Set();
        return foo;
    }
    else
    {
        return GetFooFromPool();
    }
}

public void ReleaseFoo(Foo p)
{
    p.InUse = false;
    autoEvent.Set();
}
The idea is that when you call GetFooFromPool, you wait until signaled, then you try to find an existing Foo that is not in use. If you find one, you set it to InUse and then fire a signal so other threads can proceed. If you don't find one, you check to see if the pool is full. If not, you create a new Foo, add it to the pool and signal again. If neither of those conditions is satisfied, you are made to wait again by calling GetFooFromPool recursively.
Now in ReleaseFoo we just set InUse back to false, and signal the next thread waiting in GetFooFromPool (if any) to try and get a Foo.
The problem seems to be in my managing of the size of the pool. With a capacity of 5, I'm ending up with 6 Foos. I can see my debug line print count 0 a couple of times, and count 1 may appear a couple of times as well. So clearly I have multiple threads getting into the block when, as far as I can see, they shouldn't be able to.
What am I doing wrong here?
Edit: a double-checked lock like this:
else if (Pool.Count < Capacity)
{
    lock (locker)
    {
        if (Pool.Count < Capacity)
        {
            System.Diagnostics.Debug.WriteLine("count {0}\t capacity {1}", Pool.Count, Capacity);
            foo = new Foo() { InUse = true };
            Pool.Add(foo);
            autoEvent.Set();
            return foo;
        }
    }
}
This does seem to fix the problem, but I'm not sure it's the most elegant way to do it.
As was already mentioned in the comments, a counting semaphore is your friend.
Combine this with a concurrent stack and you have a nice, simple, thread-safe implementation, where you can still lazily allocate your pool items.
The bare-bones implementation below provides an example of this approach. Note that another advantage here is that you do not need to "contaminate" your pool items with an InUse member as a flag to track stuff.
Note that, as a micro-optimization, a stack is preferred over a queue in this case, because it hands out the most recently returned instance from the pool, which may still be in, e.g., the L1 cache.
public class GenericConcurrentPool<T> : IDisposable where T : class
{
    private readonly SemaphoreSlim _sem;
    private readonly ConcurrentStack<T> _itemsStack;
    private readonly Action<T> _onDisposeItem;
    private readonly Func<T> _factory;

    public GenericConcurrentPool(int capacity, Func<T> factory, Action<T> onDisposeItem = null)
    {
        // Seed the stack with nulls: actual items are allocated lazily in Pop().
        _itemsStack = new ConcurrentStack<T>(new T[capacity]);
        _factory = factory;
        _onDisposeItem = onDisposeItem;
        _sem = new SemaphoreSlim(capacity);
    }

    public async Task<T> CheckOutAsync()
    {
        await _sem.WaitAsync();
        return Pop();
    }

    public T CheckOut()
    {
        _sem.Wait();
        return Pop();
    }

    public void CheckIn(T item)
    {
        Push(item);
        _sem.Release();
    }

    public void Dispose()
    {
        _sem.Dispose();
        if (_onDisposeItem != null)
        {
            T item;
            while (_itemsStack.TryPop(out item))
            {
                if (item != null)
                    _onDisposeItem(item);
            }
        }
    }

    private T Pop()
    {
        T item;
        var result = _itemsStack.TryPop(out item);
        Debug.Assert(result);
        return item ?? _factory(); // a null seed means the item has not been created yet
    }

    private void Push(T item)
    {
        Debug.Assert(item != null);
        _itemsStack.Push(item);
    }
}
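A hypothetical usage sketch (the Foo type and its Dispose call are my own illustration):

// Capacity of 5: at most five items ever exist, created on demand.
var pool = new GenericConcurrentPool<Foo>(
    capacity: 5,
    factory: () => new Foo(),
    onDisposeItem: f => f.Dispose());

var foo = pool.CheckOut(); // blocks while all five items are checked out
try
{
    // ... use foo ...
}
finally
{
    pool.CheckIn(foo); // return it so waiting threads can proceed
}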
There are a few problems with what you're doing, but your specific race condition is likely caused by a situation like the following. Imagine you have a capacity of one.
1) There is one unused item in the pool.
2) Thread #1 grabs it and signals the event.
3) Thread #2 finds no unused item and gets inside the capacity block. It does not add an item yet.
4) Thread #1 returns the item to the pool and signals the event.
5) Repeat steps 1, 2, and 3 using two other threads (e.g. #3, #4).
6) Thread #2 adds an item to the pool.
7) Thread #4 adds an item to the pool.
There are now two items in a pool with a capacity of one.
Your implementation has other potential issues, however.
Depending on how your Pool.Count and Add() are synchronized, you might not see an up-to-date value.
You could potentially have multiple threads grab the same unused item.
Controlling access with an AutoResetEvent opens you up to difficult-to-find issues (like this one), because you are trying to use a lockless solution instead of just taking a lock and using Monitor.Wait() and Monitor.Pulse() for this purpose, as sketched below.
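For reference, a minimal sketch of that lock-plus-Monitor approach, assuming the same Pool collection and Capacity as in the question:

public Foo GetFooFromPool()
{
    lock (Pool)
    {
        while (true)
        {
            var foo = Pool.FirstOrDefault(p => !p.InUse);
            if (foo != null)
            {
                foo.InUse = true;
                return foo;
            }
            if (Pool.Count < Capacity)
            {
                foo = new Foo { InUse = true };
                Pool.Add(foo);
                return foo;
            }
            Monitor.Wait(Pool); // releases the lock and waits for a Pulse
        }
    }
}

public void ReleaseFoo(Foo p)
{
    lock (Pool)
    {
        p.InUse = false;
        Monitor.Pulse(Pool); // wake one waiting getter
    }
}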
In my project there is an audio thread updating at about 86 fps and a graphics thread running at 60 fps. Both threads produce and consume values from each other.
But it is not necessary to consume every value; only the latest one is important, and no notification is required because the threads just ask for a new value when they need one.
After reading tons of websites about threading I am a bit confused about what I really need, because my task is quite simple. With locks, my code would look like this:
private T aField; // memory location
private readonly object myLock = new object();

// other thread reads value
public void ReadValue(ref T val)
{
    lock (myLock) { val = aField; }
}

// this thread updates value
private void UpdateValue(T newVal)
{
    lock (myLock) { aField = newVal; }
}
My first question is: would this work for primitive types like float or int (<= 32 bits in size) without any lock, because the copy is a single assignment, which is atomic?
The next idea was protection by a bool:
private T aField; // memory location
private volatile bool isReading;
private volatile bool isWriting;

// other thread reads value
public void ReadValue(ref T val)
{
    isReading = true;
    if (!isWriting) val = aField;
    isReading = false;
}

// this thread updates value
private void UpdateValue(T newVal)
{
    isWriting = true;
    if (!isReading) aField = newVal;
    isWriting = false;
}
Looks good to me, but I am pretty sure I missed something. I can think of a worst-case scenario where the faster thread reads while the slower thread wants to write: then the fast thread will read the older value again the next time, because no update was done.
What I also found was a nonblocking update method, but I wonder if and how it can help me:
static void LockFreeUpdate<T>(ref T field, Func<T, T> updateFunction)
    where T : class
{
    var spinWait = new SpinWait();
    while (true)
    {
        T snapshot1 = field;
        T calc = updateFunction(snapshot1);
        T snapshot2 = Interlocked.CompareExchange(ref field, calc, snapshot1);
        if (snapshot1 == snapshot2) return;
        spinWait.SpinOnce();
    }
}
What is the most efficient method with the lowest latency?
For your case you do not need any locks; just add volatile to private T aField; to prevent any possible compiler optimizations.
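Note, though, that C# only allows volatile on reference types and primitives no larger than 32 bits (int, float, bool, etc.), not on double or long, so this sketch of the suggestion assumes such a field:

// Legal: float is a 32-bit primitive. volatile would be a compile
// error on a double or long field.
private volatile float aField;

public void ReadValue(ref float val)
{
    val = aField; // a single 32-bit read: atomic, and volatile keeps it fresh
}

private void UpdateValue(float newVal)
{
    aField = newVal; // a single 32-bit write: atomic
}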
Update: let me rephrase my question briefly.
There are N double numbers. There are N dedicated threads, each of which updates its own double number (_cachedProduct in the example below).
Somehow I need to have the sum of these numbers, and I need the IndexUpdated event to be raised as soon as possible after any double number changes (it would be nice if such an event could be raised in 10 µs or less).
Below is how I tried to implement this task.
===============================================
To calculate the stock exchange index I create a private double[] _cachedProduct field. This field is written by many threads:
// called from other threads
public override void InstrumentUpdated(Instrument instrument)
{
    if (!_initialized)
    {
        if (!Initialize())
        {
            return;
        }
    }

    int instrumentId = instrument.Id;
    OrderBook ob = Program.market.OrderBook(instrument);
    if (ob.MedianOrAskOrBid == null)
    {
        _cachedProduct[instrumentId] = 0;
    }
    else
    {
        _cachedProduct[instrumentId] = ((double) ob.MedianOrAskOrBid) * _ammounts[instrumentId];
    }
}
_ammounts is a pre-initialized array; please ignore the Initialize method and variable - they just work.
In a loop I just sum all the _cachedProduct values, and when the sum changes I notify others.
Task.Factory.StartNew(() =>
{
    while (true)
    {
        if (_initialized)
        {
            break;
        }
    }
    while (true)
    {
        CalculateAndNotify();
        //Thread.Sleep(5);
    }
}
, TaskCreationOptions.LongRunning);
protected void CalculateAndNotify()
{
    var oldValue = Value;
    Calculate();
    if (oldValue != Value)
    {
        NotifyIndexChanged();
    }
}
protected override void Calculate()
{
    double result = 0;
    for (int i = 0; i < _instrumentIds.Count(); i++)
    {
        int instrumentId = _instrumentIds[i];
        if (_cachedProduct[instrumentId] == 0)
        {
            Value = null;
            return;
        }
        result += _cachedProduct[instrumentId];
    }
    Value = result;
}
I must use Interlocked to update my double _cachedProduct values, but please ignore that fact for now; what other problems do you see with this code?
Should I call the Calculate method inside while(true) so I always use one core without delays? My machine has 24 cores, so I was thinking this is OK.
However, without Thread.Sleep(5) (commented out above) I do see a significant slow-down in the program overall, and I do not understand why. The program executes several dozen times slower in many places.
The question is whether my idea of using while(true) without any locking at all is OK, or whether I should introduce some locking method so that I only Calculate the index when one of the _cachedProduct values is updated.
I think you might get better performance and clearer code if you do not use an extra thread and loop for your sum. On every change to an instrument, calculate the difference and immediately update the index and perform the notification.
So a thread calling InstrumentUpdated for a single instrument would do:
double change = newvalue - currentvalue;
// Interlocked.Add has no double overload, so update the shared sum
// with a CompareExchange loop to keep it thread safe
double initial;
do { initial = StockExchangeSum; }
while (Interlocked.CompareExchange(ref StockExchangeSum, initial + change, initial) != initial);
NotifyIndexChanged();
Can double[] be a more complex type?
How does WaitHandle.WaitAny compare performance-wise?
Something like the following:
private Index[] indicies;

public class Index
{
    public EventWaitHandle Updated =
        new EventWaitHandle(false, EventResetMode.AutoReset);

    public double _value;
    public double Value
    {
        get { return _value; }
        set
        {
            if (_value != value)
            {
                _value = value;
                Updated.Set();
            }
        }
    }
}

Task.Factory.StartNew(() =>
{
    while (true)
    {
        WaitHandle.WaitAny(indicies.Select(i => i.Updated).ToArray());
        CalculateAndNotify();
    }
});
Some points for you to think about
Have you tried profiling your calculation block in isolation from the rest of the code? I noticed this in your Calculate function:
for (int i = 0; i < _instrumentIds.Count(); i++)
_instrumentIds.Count() iterates over the entire collection, and it is likely invoked on every trip around the loop, i.e. you are doing on the order of N²/2 iterations of _instrumentIds. A simple mitigation is sketched below.
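Assuming the collection itself is stable during the calculation, you can hoist the count out of the loop:

// Enumerate _instrumentIds once, not once per loop iteration.
int count = _instrumentIds.Count();
for (int i = 0; i < count; i++)
{
    int instrumentId = _instrumentIds[i];
    // ... loop body unchanged ...
}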
Is the _instrumentIds IEnumerable being modified during this calculation operation? If so, you could get all sorts of race conditions leading to incorrect answers.
Is the Task containing CalculateAndNotify called once, or is it called many times (nested)? E.g. is there some operation inside CalculateAndNotify that could cause it to be triggered recursively?
If so, you might find you have several calculations running simultaneously (using more than one thread until the pool is starved). Can you include some logging at the start and end of the operation, and perhaps count the number of simultaneous calculations, to check this?
If this is an issue, you could include some logic whereby the CalculateAndNotify operation is queued up and new calculate operations cannot be executed until the previous one has completed; a sketch follows.
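One way to do that (a sketch, using a hypothetical _calculating flag of my own) is the same CompareExchange gate that appeared earlier in this thread:

private int _calculating; // 0 = idle, 1 = a calculation is in progress

protected void CalculateAndNotifyGated()
{
    // only the thread that flips the flag from 0 to 1 runs the calculation
    if (Interlocked.CompareExchange(ref _calculating, 1, 0) != 0)
        return;
    try
    {
        CalculateAndNotify();
    }
    finally
    {
        Interlocked.Exchange(ref _calculating, 0);
    }
}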
I'm just wondering whether this code, which a fellow developer (who has since left) wrote, is OK; I think he wanted to avoid putting in a lock. Is there a performance difference between this and just using a straightforward lock?
private long m_LayoutSuspended = 0;

public void SuspendLayout()
{
    Interlocked.Exchange(ref m_LayoutSuspended, 1);
}

public void ResumeLayout()
{
    Interlocked.Exchange(ref m_LayoutSuspended, 0);
}

public bool IsLayoutSuspended
{
    get { return Interlocked.Read(ref m_LayoutSuspended) == 1; }
}
I was thinking that something like this would be easier with a lock? It will indeed be used by multiple threads, hence why locking/Interlocked was chosen.
Yes, what you are doing is safe from a race point of view as far as access to the m_LayoutSuspended field itself is concerned. However, a lock (or CompareExchange) is required if the code does something like the following:
if (!o.IsLayoutSuspended) // this check is not thread safe...
{
    o.SuspendLayout(); // ...because between the check and the actual write a race might occur
    ...
    o.ResumeLayout();
}
A safer way, which uses CompareExchange to make sure no race occurs:

private long m_LayoutSuspended = 0;

public bool SuspendLayout()
{
    // true only for the thread that actually flipped the flag from 0 to 1
    return Interlocked.CompareExchange(ref m_LayoutSuspended, 1, 0) == 0;
}
if (o.SuspendLayout())
{
    ....
    o.ResumeLayout();
}
Or better yet, simply use a lock; a sketch follows.
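For completeness, a lock-based sketch of the same class (my assumption of what that would look like):

private readonly object _layoutLock = new object();
private bool _layoutSuspended;

public void SuspendLayout()
{
    lock (_layoutLock) { _layoutSuspended = true; }
}

public void ResumeLayout()
{
    lock (_layoutLock) { _layoutSuspended = false; }
}

public bool IsLayoutSuspended
{
    // the lock guarantees visibility of the latest write
    get { lock (_layoutLock) { return _layoutSuspended; } }
}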
Personally I'd use a volatile Boolean:
private volatile bool m_LayoutSuspended = false;

public void SuspendLayout()
{
    m_LayoutSuspended = true;
}

public void ResumeLayout()
{
    m_LayoutSuspended = false;
}

public bool IsLayoutSuspended
{
    get { return m_LayoutSuspended; }
}
Then again, as I've recently acknowledged elsewhere, volatile doesn't mean quite what I thought it did. I suspect this is okay though :)
Even if you stick with Interlocked, I'd change it to an int... there's no need to make 32-bit systems potentially struggle to make a 64-bit write atomic when they can do it easily with 32 bits... A sketch of that int-based variant follows.
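A sketch of that int-based variant (same semantics as the original, assuming the same callers):

private int m_LayoutSuspended; // 0 = running, 1 = suspended

public void SuspendLayout()
{
    Interlocked.Exchange(ref m_LayoutSuspended, 1);
}

public void ResumeLayout()
{
    Interlocked.Exchange(ref m_LayoutSuspended, 0);
}

public bool IsLayoutSuspended
{
    // Interlocked.Read only has a long overload; for an int,
    // Volatile.Read gives an equally fresh read.
    get { return Volatile.Read(ref m_LayoutSuspended) == 1; }
}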