So I have many threads feeding me input data, which must be processed by a single thread in order of arrival. Currently, all the input items wind up inserted into a queue, and reads and writes to the queue are protected with the C# lock statement. However, over time, the CPU usage of the application rises to an unacceptable level, and the profiler says that the majority of the CPU time is being spent on the lock statement itself. Is there a more efficient synchronization method available in place of the lock, one that supports many writers and one reader?
It sounds like the writers are contending with each other for the lock. Consider a model where each writer has its own queue, and where the reader uses the Peek method to read the first message off each queue without removing it. The reader can then keep iterating over the queues, picking the earliest item among the heads of all the queues, and then removing and processing that item. It will be slower than your current architecture, but should eliminate the lock contention among the writers.
A trivial example might look like:
public class TimestampedItem<T> : IComparable<TimestampedItem<T>>
{
    public DateTime TimeStamp { get; set; }
    public T Data { get; set; }

    public int CompareTo(TimestampedItem<T> other)
    {
        return TimeStamp.CompareTo(other.TimeStamp);
    }
}

public void ReadFirstFromEachQueue<T>(IEnumerable<Queue<TimestampedItem<T>>> queues)
{
    while (true)
    {
        // Find the non-empty queue whose head item has the earliest timestamp.
        // Each Peek happens under that queue's own lock; because this is the
        // only reader, a queue's head cannot change between the peek and the dequeue.
        var earliest = queues
            .Where(q => { lock (q) { return q.Count > 0; } })
            .OrderBy(q => { lock (q) { return q.Peek().TimeStamp; } })
            .FirstOrDefault();

        if (earliest == null)
            continue; // all queues are empty; a real implementation would wait here

        TimestampedItem<T> item;
        lock (earliest)
        {
            item = earliest.Dequeue();
        }
        ProcessItem(item);
    }
}
If you are using .NET 4.0 you can use ConcurrentQueue<T>, which is part of the System.Collections.Concurrent collections, instead of the normal Queue<T>, and get rid of the lock when reading/writing your data to the queue; the concurrent collections are designed to handle concurrent reads and writes with lock-free code.
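A minimal sketch of that approach (the InputItem type, the Process method, and the sleep-based backoff are assumptions for illustration):
using System.Collections.Concurrent;
using System.Threading;

class InputPipeline
{
    private readonly ConcurrentQueue<InputItem> _queue = new ConcurrentQueue<InputItem>();

    // Any number of writer threads can call this concurrently, no lock needed.
    public void Write(InputItem item)
    {
        _queue.Enqueue(item);
    }

    // The single reader drains items in arrival order.
    public void ReadLoop()
    {
        while (true)
        {
            InputItem item;
            while (_queue.TryDequeue(out item))
                Process(item);
            Thread.Sleep(1); // simple backoff; a real reader might block on an event instead
        }
    }
}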
If you are not using 4.0, another option is to lock only if no other lock is currently held. You can achieve that with Monitor.TryEnter instead of lock (note that lock itself is a Monitor.Enter and Monitor.Exit combination). A sample implementation would be:
private readonly object _syncObject = new object();

private bool TryUpdate(object someData)
{
    // Returns immediately with false if another thread holds the lock.
    if (Monitor.TryEnter(_syncObject))
    {
        try
        {
            // Update the data here.
            return true;
        }
        finally
        {
            Monitor.Exit(_syncObject);
        }
    }
    return false;
}
It might be a big change to your app, but you could consider making your queue external to your application (for example MSMQ); your writer threads could then write to that queue to their hearts' content, and your reader could just pick the items off when it's ready. If the bulk of your CPU time is spent on the lock around your queue (I assume you are not actually locking around the work on the items being put on the queue), then putting the queue external to your app could really help. Ideally you could also split the writing and reading into separate processes.
Another thing to check is that the object you are locking on is not also being used as a lock somewhere else in your app. A monitor (the thing behind the lock statement) is probably the lightest-weight thread synchronization method there is, so rather than swapping it out, it might be best to re-architect things to avoid locking in the same process that is doing the processing of items.
I have a list of "Module" classes, List<Module> modules. These modules each contain their own public object to use as a lock when accessing data. Let's say I have a couple threads which perform processing on these modules at random times. Currently I have each thread perform the processing on the modules in order, like so:
foreach (Module module in modules)
{
    lock (module.Locker)
    {
        // Do stuff
    }
}
This has worked fine so far, but I have the feeling there's a lot of unnecessary waiting. For instance, if two threads start one right after another, but the first is performing heavy processing and the second one isn't, the second one will have to wait on every module while the first one is doing its processing.
This is the question then: Is there a "proper" or "most efficient" way to lock on elements in a list? I was going to do this:
foreach (Module module in modules.Randomize())
{
    lock (module.Locker)
    {
        // Do stuff
    }
}
Where "Randomize()" is just an extension method that returns the elements of the list in a random order. However, I was wondering if there's an even better way than random?
Assuming the work inside the lock is substantial and there is heavy contention, I'm introducing the additional overhead of creating a new List<T> and removing items from it:
public void ProcessModules(List<Module> modules)
{
    List<Module> myModules = new List<Module>(modules); // take a copy of the list
    int index = myModules.Count - 1;
    while (myModules.Count > 0)
    {
        if (index < 0)
        {
            index = myModules.Count - 1;
        }
        Module module = myModules[index];
        if (!Monitor.TryEnter(module.Locker))
        {
            index--;
            continue;
        }
        try
        {
            // Do processing of the module
        }
        finally
        {
            Monitor.Exit(module.Locker);
            myModules.RemoveAt(index);
            index--;
        }
    }
}
What this method does is take a copy of the modules passed in, then try to acquire each lock in turn; if a lock cannot be acquired (because another thread owns it), it skips that module and moves on. After finishing the list, it comes around again to see whether the other thread has released the lock; if not, it skips it again and moves on. This cycle continues until all the modules in the list have been processed.
This way, we're not waiting on any contended locks; we just keep processing the modules that are not locked by another thread.
lock expands to Monitor.Enter/Monitor.Exit; you can use Monitor.TryEnter to check whether the lock is already acquired, and if it is, skip that element and try to take another.
There will be overhead if multiple threads are processing the same ordered list of items, so the Randomize idea seems a good one (unless reordering is expensive compared to the processing itself, or the list can be changed while processing, etc.).
A totally different possibility is to prepare queues (from the list) for each thread in such a way that there is no cross-waiting (or waiting is minimized). Combined with Monitor.TryEnter this should be an ultimate solution. Unfortunately, I have no clue how to prepare such queues, nor how to skip processing a queue item, so I leave that to you =P.
Here is a snippet of what I mean:
foreach (var item in list)
{
    if (!item.Processed && Monitor.TryEnter(item.Locker))
    {
        try
        {
            // ... do job
            item.Processed = true;
        }
        finally
        {
            Monitor.Exit(item.Locker);
        }
    }
}
Not sure I entirely follow; however, from what I can tell your goal is to periodically do stuff to each module, and you want to use multiple threads because the stuff is time-consuming. If this is the case I would have a single thread periodically check all modules and have that thread use the TPL to spread the workload, like so:
Parallel.ForEach(modules, module =>
{
    lock (module.Locker)
    {
        // Do stuff
    }
});
As an aside, the guidance on locks is that the object that you lock on should be private, so I'd probably change to doing something like this:
Parallel.ForEach(modules, module => module.DoStuff());
// In the module implementation
private readonly object _lock = new object();

public void DoStuff()
{
    lock (this._lock)
    {
        // Do stuff here
    }
}
I.e. each module should be thread-safe and responsible for its own locking.
I have a one-process, two-thread application. Thread 1 will listen to a market data feed and update the most recent quote on thousands of stocks. Thread 2 will run a timer at a Sampling Frequency and take a snapshot of the most recent quotes for processing. Effectively, I need to down-sample an extremely fast market data feed.
My first guess at a solution is to use a BlockingQueue. To do this I need to move the timer functionality into Thread 1, which I can do by checking the clock every time a quote update comes in and sending a snapshot of the quotes onto the queue at the Sampling Frequency. My concern here is that the queue will consume a lot of memory and garbage collection will slow things down.
My second guess is to have Thread 1 copy the data into a locked member at the Sampling Frequency, which Thread 2 can access. My concern here is that the locks will be slow.
My third guess is to make the quote primitives volatile. Since one thread only writes and the other thread only reads, maybe this is appropriate?
Is there a best practice way to communicate the data between the threads for this latency sensitive application? This is not an ultra-high frequency application. I can tolerate latencies on the order of tens of ms.
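One way to realize the third guess without marking thousands of primitives volatile is for the writer to publish an immutable snapshot through a single reference; reference assignment is atomic in .NET. A minimal sketch (the Quote type and the snapshot shape are assumptions):
using System.Collections.Generic;

class QuoteBoard
{
    // One writer replaces the whole snapshot; the reader only reads the reference.
    private volatile Dictionary<string, Quote> _snapshot = new Dictionary<string, Quote>();

    // Thread 1: build a fresh dictionary, then publish it in one reference write.
    // The published dictionary must never be mutated afterwards.
    public void Publish(Dictionary<string, Quote> newSnapshot)
    {
        _snapshot = newSnapshot;
    }

    // Thread 2: grab whatever snapshot is current; it never sees a half-built one.
    public Dictionary<string, Quote> Read()
    {
        return _snapshot;
    }
}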
If you only have 2 threads accessing this resource (i.e. concurrent reads are not required), then the simplest (and one of the quickest) would just be to use the lock keyword:
public class QuoteStore
{
    private readonly List<Quote> _quotes = new List<Quote>();
    private readonly object _mutex = new object();

    public ReadOnlyCollection<Quote> GetQuotes()
    {
        lock (_mutex)
        {
            // Copy under the lock so the caller gets a stable snapshot.
            return new List<Quote>(_quotes).AsReadOnly();
        }
    }

    public void AddQuote(Quote quote)
    {
        lock (_mutex)
        {
            _quotes.Add(quote);
        }
    }
}
If however concurrent reads are required this would be a good fit for the ReaderWriterLockSlim class. You can acquire the read lock when copying data and the write lock when writing data eg:
public class QuoteStore : IDisposable
{
    private readonly ReaderWriterLockSlim _mutex = new ReaderWriterLockSlim();
    private readonly List<Quote> _quotes = new List<Quote>();

    public ReadOnlyCollection<Quote> GetQuotes()
    {
        _mutex.EnterReadLock();
        try
        {
            // Copy under the read lock so the caller gets a stable snapshot.
            return new List<Quote>(_quotes).AsReadOnly();
        }
        finally
        {
            _mutex.ExitReadLock();
        }
    }

    public void AddQuote(Quote quote)
    {
        _mutex.EnterWriteLock();
        try
        {
            _quotes.Add(quote);
        }
        finally
        {
            _mutex.ExitWriteLock();
        }
    }

    public void Dispose()
    {
        _mutex.Dispose();
    }
}
Or if you are using .NET 4 or above, there are many wonderful concurrently modifiable collections in the System.Collections.Concurrent namespace which you could probably use without any problem (they are largely lock-free and generally very quick, and some performance enhancements are coming in .NET 4.5 too!).
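For the latest-quote-per-symbol scenario in the question, a ConcurrentDictionary keyed by symbol would be a natural fit; a minimal sketch (the Quote type is an assumption):
using System.Collections.Concurrent;
using System.Collections.Generic;

class QuoteCache
{
    private readonly ConcurrentDictionary<string, Quote> _latest =
        new ConcurrentDictionary<string, Quote>();

    // Thread 1: overwrite the latest quote for a symbol on every feed update.
    public void OnQuote(string symbol, Quote quote)
    {
        _latest[symbol] = quote;
    }

    // Thread 2: take a point-in-time copy at the sampling frequency.
    public Dictionary<string, Quote> Snapshot()
    {
        return new Dictionary<string, Quote>(_latest);
    }
}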
Isn't this a case for a producer-consumer queue? The consumer will wait (Monitor.Wait) for the producer to pulse when a new feed comes in. As soon as a new or updated feed arrives, the producer will populate the queue and fire Monitor.Pulse.
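A minimal sketch of that wait/pulse arrangement (the Quote type is an assumption):
using System.Collections.Generic;
using System.Threading;

class FeedQueue
{
    private readonly Queue<Quote> _queue = new Queue<Quote>();
    private readonly object _lock = new object();

    // Producer: called by the feed thread for each update.
    public void Publish(Quote quote)
    {
        lock (_lock)
        {
            _queue.Enqueue(quote);
            Monitor.Pulse(_lock); // wake the consumer if it is waiting
        }
    }

    // Consumer: blocks until at least one item is available.
    public Quote Take()
    {
        lock (_lock)
        {
            while (_queue.Count == 0)
                Monitor.Wait(_lock); // releases the lock while waiting
            return _queue.Dequeue();
        }
    }
}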
In my application I manage a collection of locks that I need in order to serialize access to some objects (each object is assigned a lock). This collection of locks (the lock manager) also needs to be maintained in a thread-safe fashion (new locks are added and old locks removed as the objects that require serialization come and go).
The algorithm works something like this:
LockManager.Lock();
var myLock = LockManager.FindLock(myObject);
LockManager.Unlock(); // atomic
myLock.Lock(); // atomic
Swapping the two lines is not a good solution: if locking myLock blocked, that would also block the unlocking of LockManager, making requests for any other locks block as well.
What I would need is that the two marked lines are executed atomically. Is there a way to achieve this?
So you want to:
guarantee that the individual lock (via myLock) was entered
then unlock the LockManager
make the above two operations atomic
and not allow this new atomic operation to block if the individual lock cannot be entered immediately
Similar to how you cannot circumvent the laws of physics to create a perpetual motion machine, you also cannot circumvent the laws of computation by executing a sequence of operations atomically in such a manner that it does not block even if one of its constituents can, in fact, be expected to block. In other words, there is no way to make the operation complete until the individual parts also complete.
However, what we can do is attempt this atomic operation in an all-or-none manner that never blocks as long as we are okay with the "none" outcome. You see this a lot with the TryXXX methods that exist on a lot of concurrent data structures. All you would need to do is define a TryLock on your myLock type. Then, the LockManager could look like the following.
public class LockManager
{
    public bool TryEnterIndividualLock(object value)
    {
        Lock();
        try
        {
            var myLock = FindLock(value);
            if (myLock != null)
            {
                return myLock.TryLock();
            }
            return false;
        }
        finally
        {
            Unlock();
        }
    }
}
Then the calling code would look like this:
while (!LockManager.TryEnterIndividualLock(myObject))
{
// Do something else until the lock can be entered.
}
This would give you the atomicity you were looking for, but at the cost of the operation not succeeding. If you are relying on this operation succeeding immediately then you are going to have to rethink your overall design.
I need to design a thread-safe logger. My logger must have a Log() method that simply queues a text to be logged. Also, the logger must be lock-free, so that other threads can log messages without locking the logger. I need to design a worker thread that waits for some synchronization event and then logs all messages from the queue using standard .NET logging (which is not thread-safe). So what I am interested in is the synchronization of the worker thread and the Log function. Below is a sketch of the class that I designed. I think I must use Monitor.Wait/Pulse here, or some other means to suspend and resume the worker thread. I don't want to spend CPU cycles when there is no job for the logger.
Let me put it another way: I want to design a logger that will not block the caller threads that use it. I have a high performance system, and that is a requirement.
class MyLogger
{
    // This is a lock-free queue - threads can directly enqueue and dequeue
    private LockFreeQueue<String> _logQueue;

    // worker thread
    Thread _workerThread;
    bool _isRunning = true;

    // this function is used by other threads to queue log messages
    public void Log(String text)
    {
        _logQueue.Enqueue(text);
    }

    // this is the worker thread function
    private void ThreadRoutine()
    {
        while (_isRunning)
        {
            // do something here
        }
    }
}
"lock-free"does not mean that threads won't block each other. It means that they block each other through very efficient but also very tricky mechanisms. Only needed for very high performance scenarios and even the experts get it wrong (a lot).
Best advice: forget "lock-free"and just use a "thread-safe" queue.
I would recommend the "Blocking Queue" from this page.
And it's a matter of choice to include the ThreadRoutine (the Consumer) in the class itself.
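On .NET 4 and later, BlockingCollection<T> gives you such a blocking queue out of the box; a minimal sketch (WriteToUnderlyingLog is an assumed wrapper around the non-thread-safe logging call):
using System.Collections.Concurrent;

class BlockingLogger
{
    private readonly BlockingCollection<string> _messages = new BlockingCollection<string>();

    // Callers are only blocked for the brief internal lock inside Add.
    public void Log(string text)
    {
        _messages.Add(text);
    }

    // Worker thread: sleeps inside GetConsumingEnumerable until items arrive.
    private void ThreadRoutine()
    {
        foreach (string msg in _messages.GetConsumingEnumerable())
            WriteToUnderlyingLog(msg);
    }

    private void WriteToUnderlyingLog(string msg)
    {
        // assumption: calls the standard, non-thread-safe .NET logging here
    }
}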
To the second part of your question: it depends on what "some synchronization event" exactly is. If you are going to use a method call, then let that start a one-shot thread. If you want to wait on a semaphore, then don't use Monitor and Pulse; they are not reliable here. Use an AutoResetEvent/ManualResetEvent.
How to surface that depends on how you want to use it.
Your basic ingredients should look like this:
class Logger
{
    private AutoResetEvent _waitEvent = new AutoResetEvent(false);
    private object _locker = new object();
    private Queue<string> _queue = new Queue<string>();
    private bool _isRunning = true;

    public void Log(string msg)
    {
        lock (_locker) { _queue.Enqueue(msg); }
    }

    public void FlushQueue()
    {
        _waitEvent.Set();
    }

    private void WorkerProc(object state)
    {
        while (_isRunning)
        {
            _waitEvent.WaitOne();
            // process queue,
            // ***
            while (true)
            {
                string s = null;
                lock (_locker)
                {
                    if (_queue.Count == 0)
                        break;
                    s = _queue.Dequeue();
                }
                if (s != null)
                {
                    // process s
                }
            }
        }
    }
}
Part of the discussion seems to be what to do when processing the queue (marked ***). You can lock the queue and process all items, during which the adding of new entries will be blocked (for longer), or lock and retrieve entries one by one, locking only (very) briefly each time. I've added that last scenario above.
A summary: you don't want a lock-free solution but a block-free one. Block-free doesn't exist; you will have to settle for something that blocks as little as possible. The last iteration of my sample (incomplete) shows how to lock only around the Enqueue and Dequeue calls. I think that will be fast enough.
Has your profiler shown you that you are experiencing a large overhead by using a simple lock statement? Lock-free programming is very hard to get right, and if you really need it I would suggest taking something existing from a reliable source.
It's not hard to make this lock-free if you have atomic operations. Take a singly linked list; you just need the head pointer.
Log function:
1. Locally prepare the log item (node with logging string).
2. Set the local node's next pointer to head.
3. ATOMIC: Compare head with local node's next, if equal, replace head with address of local node.
4. If the operation failed, repeat from step 2, otherwise, the item is in the "queue".
Worker:
1. Copy head locally.
2. ATOMIC: Compare head with local one, if equal, replace head with NULL.
3. If the operation failed, repeat from step 1.
4. If it succeeded, process the items; which are now local and out of the "queue".
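A minimal sketch of those two operations using Interlocked (the node type, class name, and consumer signature are assumptions):
using System;
using System.Threading;

class LogNode
{
    public string Text;
    public LogNode Next;
}

class LockFreeLogList
{
    private LogNode _head; // the head pointer is the only shared state

    // Log function, steps 1-4: prepare the node, point it at head, CAS it in.
    public void Log(string text)
    {
        var node = new LogNode { Text = text }; // step 1
        LogNode snapshot;
        do
        {
            snapshot = _head;   // step 2: remember the head we saw
            node.Next = snapshot;
        }
        while (Interlocked.CompareExchange(ref _head, node, snapshot) != snapshot); // steps 3-4
    }

    // Worker, steps 1-4: atomically swap the whole list out, then process it locally.
    public void ProcessAll(Action<string> process)
    {
        LogNode list = Interlocked.Exchange(ref _head, null); // steps 1-3 in one call
        // Note: items come out newest-first (LIFO); reverse the chain if FIFO order matters.
        for (var node = list; node != null; node = node.Next)
            process(node.Text); // step 4
    }
}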
I have a function that's the main bottleneck of my application, because it's doing heavy string comparisons against a global list shared among the threads. My question is basically this:
Is it bad practice to lock the list (called List gList) multiple times in one function, only to lock it again later? (Basically: locking while doing the lookup, unlocking while getting a new item ready for insertion, then locking again and adding the new item.)
When I use a profiler I don't see any indication that I'm paying a heavy price for this, but could I be at a later point, or once the code is out in the wild? Does anyone have any best practices or personal experience with this?
How do you perform the locking? You may want to look into using ReaderWriterLockSlim, if that is not already the case.
Here is a simple usage example:
class SomeData
{
    private IList<string> _someStrings = new List<string>();
    private ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public void Add(string text)
    {
        _lock.EnterWriteLock();
        try
        {
            _someStrings.Add(text);
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }

    public bool Contains(string text)
    {
        _lock.EnterReadLock();
        try
        {
            return _someStrings.Contains(text);
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }
}
It sounds like you don't want to be releasing the lock between the lookup and the insertion. Either that, or you don't need to lock during the lookup at all.
Are you trying to add to the list only if the element is not already there? If so, then releasing the lock between the two steps allows another thread to add to the list while you are preparing your element. By the time you are ready to add, your lookup is out of date.
If it is not a problem that the lookup might be out of date, then you probably don't need to lock during the lookup at all.
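If the list must only ever contain unique items, a minimal sketch of holding the lock across both steps (the names _gList and _gLock are assumptions standing in for your globals):
private readonly List<string> _gList = new List<string>();
private readonly object _gLock = new object();

public bool AddIfMissing(string item)
{
    // Hold the lock across the lookup and the insert so no other
    // thread can add the same item between the two steps.
    lock (_gLock)
    {
        if (_gList.Contains(item))
            return false;
        _gList.Add(item);
        return true;
    }
}
The item should be fully prepared before entering the lock, so the critical section covers only the comparison and the add.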
In general, you want to hold a lock for as short a time as possible. The cost of a contended acquisition is much, much higher (it may have to go to the kernel) than the cost of a contention-free lock acquisition (which can be done in user space), so finer-grained locking will usually be good for performance even if it means acquiring the lock more times.
That said, make sure you profile in an appropriate situation for this: one with a high amount of simultaneous load. Otherwise your results will have little relationship to reality.
In my opinion there is too little data here to give a concrete answer. Generally it is not the number of locks that creates a performance issue, but the number of threads that are waiting for those locks.