multiple lock in same function - c#

I have a function thats the main bottleneck of my application, because its doing heavy string comparisions against a global list shared among the threads. My question is basicly this:
Is it bad practive to lock the list ( called List gList ) multiple times in 1 function. For then to lock it again later ( Basicly locking when doing the lookup, unlocking getting a new item ready for insertion then locking it again and adding the new item).
When i you a profiler i dont see any indication that im paying a heavy price for this, but could i be at a later point or when the code it out in the wild? Anyone got any best practive or personal experence in this?

How do you perform the locking? You may want to look into using ReaderWriterLockSlim, if that is not already the case.
Here is a simple usage example:
class SomeData
{
private IList<string> _someStrings = new List<string>();
private ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
public void Add(string text)
{
_lock.EnterWriteLock();
try
{
_someStrings.Add(text);
}
finally
{
_lock.ExitWriteLock();
}
}
public bool Contains(string text)
{
_lock.EnterReadLock();
try
{
return _someStrings.Contains(text);
}
finally
{
_lock.ExitReadLock();
}
}
}

It sounds like you don't want to be releasing the lock between the lookup and the insertion. Either that, or you don't need to lock during the lookup at all.
Are you trying to add to the list only if the element is not already there? If so, then releasing the lock between the two steps allows another thread to add to the list while you are preparing your element. By the time you are ready to add, your lookup is out of date.
If it is not a problem that the lookup might be out of date, then you probably don't need to lock during the lookup at all.

In general, you want to lock for as short a time as possible. The cost of a contention is much much higher (must go to kernel) than the cost of a contention-free lock acquisition (can be done in userspace), so finer-grained locking will usually be good for performance even if it means acquiring the lock more times.
That said, make sure you profile in an appropriate situation for this: one with a high amount of simultaneous load. Otherwise your results will have little relationship to reality.

In my opinion there are to few data to give a concrete answer. Generally not the number of locks creates a performance issue, but the number of threads that are waiting for that lock.

Related

lock keyword on a LINQ Parallel.ForEach<> loop

This is more a conceptual question. I was wondering if I used a lock inside of Parallel.ForEach<> loop if that would take away the benefits of Paralleling a foreachloop.
Here is some sample code where I have seen it done.
Parallel.ForEach<KeyValuePair<string, XElement>>(binReferences.KeyValuePairs, reference =>
{
lock (fileLockObject)
{
if (fileLocks.ContainsKey(reference.Key) == false)
{
fileLocks.Add(reference.Key, new object());
}
}
RecursiveBinUpdate(reference.Value, testPath, reference.Key, maxRecursionCount, ref recursionCount);
lock (fileLocks[reference.Key])
{
reference.Value.Document.Save(reference.Key);
}
});
Where fileLockObject and fileLocks are as follows.
private static object fileLockObject = new object();
private static Dictionary<string, object> fileLocks = new Dictionary<string, object>();
Does this technique completely make the loop not parallel?
I would like to see your thoughts on this.
It means all of the work inside of the lock can't be done in parallel. This greatly harms the performance here, yes. Since the entire body is not all locked (and locked on the same object) there is still some parallelization here though. Whether the parallelization that you do get adds enough benefit to surpass the overhead that comes with managing the threads and synchronizing around the locks is something you really just need to test yourself with your specific data.
That said, it looks like what you're doing (at least in the first locked block, which is the one I'd be more concerned with at every thread is locking on the same object) is locking access to a Dictionary. You can instead use a ConcurrentDictionary, which is specifically designed to be utilized from multiple threads, and will minimize the amount of synchronization that needs to be done.
if I used a lock ... if that would take away the benefits of Paralleling a foreachloop.
Proportionally. When RecursiveBinUpdate() is a big chunk of work (and independent) then it will still pay off. The locking part could be a less than 1%, or 99%. Look up Amdahls law, that applies here.
But worse, your code is not thread-safe. From your 2 operations on fileLocks, only the first is actually inside a lock.
lock (fileLockObject)
{
if (fileLocks.ContainsKey(reference.Key) == false)
{
...
}
}
and
lock (fileLocks[reference.Key]) // this access to fileLocks[] is not protected
change the 2nd part to:
lock (fileLockObject)
{
reference.Value.Document.Save(reference.Key);
}
and the use of ref recursionCount as a parameter looks suspicious too. It might work with Interlocked.Increment though.
The "locked" portion of the loop will end up running serially. If the RecursiveBinUpdate function is the bulk of the work, there may be some gain, but it would be better if you could figure out how to handle the lock generation in advance.
When it comes to locks, there's no difference in the way PLINQ/TPL threads have to wait to gain access. So, in your case, it only makes the loop not parallel in those areas that you're locking and any work outside those locks is still going to execute in parallel (i.e. all the work in RecursiveBinUpdate).
Bottom line, I see nothing substantially wrong with what you're doing here.

Is this use of the generic List thread safe

I have a System.Collections.Generic.List<T> to which I only ever add items in a timer callback. The timer is restarted only after the operation completes.
I have a System.Collections.Concurrent.ConcurrentQueue<T> which stores indices of added items in the list above. This store operation is also always performed in the same timer callback described above.
Is a read operation that iterates the queue and accesses the corresponding items in the list thread safe?
Sample code:
private List<Object> items;
private ConcurrentQueue<int> queue;
private Timer timer;
private void callback(object state)
{
int index = items.Count;
items.Add(new object());
if (true)//some condition here
queue.Enqueue(index);
timer.Change(TimeSpan.FromMilliseconds(500), TimeSpan.FromMilliseconds(-1));
}
//This can be called from any thread
public IEnumerable<object> AccessItems()
{
foreach (var index in queue)
{
yield return items[index];
}
}
My understanding:
Even if the list is resized when it is being indexed, I am only accessing an item that already exists, so it does not matter whether it is read from the old array or the new array. Hence this should be thread-safe.
Is a read operation that iterates the queue and accesses the corresponding items in the list thread safe?
Is it documented as being thread safe?
If no, then it is foolish to treat it as thread safe, even if it is in this implementation by accident. Thread safety should be by design.
Sharing memory across threads is a bad idea in the first place; if you don't do it then you don't have to ask whether the operation is thread safe.
If you have to do it then use a collection designed for shared memory access.
If you can't do that then use a lock. Locks are cheap if uncontended.
If you have a performance problem because your locks are contended all the time then fix that problem by changing your threading architecture rather than trying to do dangerous and foolish things like low-lock code. No one writes low-lock code correctly except for a handful of experts. (I am not one of them; I don't write low-lock code either.)
Even if the list is resized when it is being indexed, I am only accessing an item that already exists, so it does not matter whether it is read from the old array or the new array.
That's the wrong way to think about it. The right way to think about it is:
If the list is resized then the list's internal data structures are being mutated. It is possible that the internal data structure is mutated into an inconsistent form halfway through the mutation, that will be made consistent by the time the mutation is finished. Therefore my reader can see this inconsistent state from another thread, which makes the behaviour of my entire program unpredictable. It could crash, it could go into an infinite loop, it could corrupt other data structures, I don't know, because I'm running code that assumes a consistent state in a world with inconsistent state.
Big edit
The ConcurrentQueue is only safe with regard to the Enqueue(T) and T Dequeue() operations.
You're doing a foreach on it and that doesn't get synchronized at the required level.
The biggest problem in your particular case is the fact the enumerating of the Queue (which is a Collection in it's own right) might throw the wellknown "Collection has been modified" exception. Why is that the biggest problem ? Because you are adding things to the queue after you've added the corresponding objects to the list (there's also a great need for the List to be synchronized but that + the biggest problem get solved with just one "bullet"). While enumerating a collection it is not easy to swallow the fact that another thread is modifying it (even if on a microscopic level the modification is a safe - ConcurrentQueue does just that).
Therefore you absolutely need synchronize the access to the queues (and the central List while you're at it) using another means of synchronization (and by that I mean you can also forget abount ConcurrentQueue and use a simple Queue or even a List since you never Dequeue things).
So just do something like:
public void Writer(object toWrite) {
this.rwLock.EnterWriteLock();
try {
int tailIndex = this.list.Count;
this.list.Add(toWrite);
if (..condition1..)
this.queue1.Enqueue(tailIndex);
if (..condition2..)
this.queue2.Enqueue(tailIndex);
if (..condition3..)
this.queue3.Enqueue(tailIndex);
..etc..
} finally {
this.rwLock.ExitWriteLock();
}
}
and in the AccessItems:
public IEnumerable<object> AccessItems(int queueIndex) {
Queue<object> whichQueue = null;
switch (queueIndex) {
case 1: whichQueue = this.queue1; break;
case 2: whichQueue = this.queue2; break;
case 3: whichQueue = this.queue3; break;
..etc..
default: throw new NotSupportedException("Invalid queue disambiguating params");
}
List<object> results = new List<object>();
this.rwLock.EnterReadLock();
try {
foreach (var index in whichQueue)
results.Add(this.list[index]);
} finally {
this.rwLock.ExitReadLock();
}
return results;
}
And, based on my entire understanding of the cases in which your app accesses the List and the various Queues, it should be 100% safe.
End of big edit
First of all: What is this thing you call Thread-Safe ? by Eric Lippert
In your particular case, I guess the answer is no.
It is not the case that inconsistencies might arrise in the global context (the actual list).
Instead it is possible that the actual readers (who might very well "collide" with the unique writer) end up with inconsistencies in themselves (their very own Stacks meaning: local variables of all methods, parameters and also their logically isolated portion of the heap)).
The possibility of such "per-Thread" inconsistencies (the Nth thread wants to learn the number of elements in the List and finds out that value is 39404999 although in reality you only added 3 values) is enough to declare that, generally speaking that architecture is not thread-safe ( although you don't actually change the globally accessible List, simply by reading it in a flawed manner ).
I suggest you use the ReaderWriterLockSlim class.
I think you will find it fits your needs:
private ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim(LockRecursionPolicy.SupportsRecursion);
private List<Object> items;
private ConcurrentQueue<int> queue;
private Timer timer;
private void callback(object state)
{
int index = items.Count;
this.rwLock.EnterWriteLock();
try {
// in this place, right here, there can be only ONE writer
// and while the writer is between EnterWriteLock and ExitWriteLock
// there can exist no readers in the following method (between EnterReadLock
// and ExitReadLock)
// we add the item to the List
// AND do the enqueue "atomically" (as loose a term as thread-safe)
items.Add(new object());
if (true)//some condition here
queue.Enqueue(index);
} finally {
this.rwLock.ExitWriteLock();
}
timer.Change(TimeSpan.FromMilliseconds(500), TimeSpan.FromMilliseconds(-1));
}
//This can be called from any thread
public IEnumerable<object> AccessItems()
{
List<object> results = new List<object>();
this.rwLock.EnterReadLock();
try {
// in this place there can exist a thousand readers
// (doing these actions right here, between EnterReadLock and ExitReadLock)
// all at the same time, but NO writers
foreach (var index in queue)
{
this.results.Add ( this.items[index] );
}
} finally {
this.rwLock.ExitReadLock();
}
return results; // or foreach yield return you like that more :)
}
No because you are reading and writing to/from the same object concurrently. This is not documented to be safe so you can't be sure it is safe. Don't do it.
The fact that it is in fact unsafe as of .NET 4.0 means nothing, btw. Even if it was safe according to Reflector it could change anytime. You can't rely on the current version to predict future versions.
Don't try to get away with tricks like this. Why not just do it in an obviously safe way?
As a side note: Two timer callbacks can execute at the same time, so your code is doubly broken (multiple writers). Don't try to pull off tricks with threads.
It is thread-safish. The foreach statement uses the ConcurrentQueue.GetEnumerator() method. Which promises:
The enumeration represents a moment-in-time snapshot of the contents of the queue. It does not reflect any updates to the collection after GetEnumerator was called. The enumerator is safe to use concurrently with reads from and writes to the queue.
Which is another way of saying that your program isn't going to blow up randomly with an inscrutable exception message like the kind you'll get when you use the Queue class. Beware of the consequences though, implicit in this guarantee is that you may well be looking at a stale version of the queue. Your loop will not be able to see any elements that were added by another thread after your loop started executing. That kind of magic doesn't exist and is impossible to implement in a consistent way. Whether or not that makes your program misbehave is something you will have to think about and can't be guessed from the question. It is pretty rare that you can completely ignore it.
Your usage of the List<> is however utterly unsafe.

When to use the lock thread in C#?

I have a server which handles multiple incoming socket connections and creates 2 different threads which store the data in XML format.
I was using the lock statement for thread safety almost in every event handler called asyncronously and in the 2 threads in different parts of code. Sadly using this approach my application significantly slows down.
I tried to not use lock at all and the server is very fast in execution, even the file storage seems to boost; but the program crashes for reasons I don't understand after 30sec - 1min. of work.
So. I thought that the best way is to use less locks or to use it only there where's strictly necessary. As such, I have 2 questions:
Is the lock needed when I write to the public accessed variables (C# lists) only or even when I read from them ?
Is the lock needed only in the asyncronous threads created by the socket handler or in other places too ?
Someone could give me some practical guidelines, about how to operate. I'll not post the whole code this time. It hasn't sense to post about 2500 lines of code.
You ever sit in your car or on the bus at a red light when there's no cross traffic? Big waste of time, right? A lock is like a perfect traffic light. It is always green except when there is traffic in the intersection.
Your question is "I spend too much time in traffic waiting at red lights. Should I just run the red light? Or even better, should I remove the lights entirely and just let everyone drive through the intersection at highway speeds without any intersection controls?"
If you're having a performance problem with locks then removing locks is the last thing you should do. You are waiting at that red light precisely because there is cross traffic in the intersection. Locks are extraordinarily fast if they are not contended.
You can't eliminate the light without eliminating the cross traffic first. The best solution is therefore to eliminate the cross traffic. If the lock is never contended then you'll never wait at it. Figure out why the cross traffic is spending so much time in the intersection; don't remove the light and hope there are no collisions. There will be.
If you can't do that, then adding more finely-grained locks sometimes helps. That is, maybe you have every road in town converging on the same intersection. Maybe you can split that up into two intersections, so that code can be moving through two different intersections at the same time.
Note that making the cars faster (getting a faster processor) or making the roads shorter (eliminating code path length) often makes the problem worse in multithreaded scenarios. Just as it does in real life; if the problem is gridlock then buying faster cars and driving them on shorter roads gets them to the traffic jam faster, but not out of it faster.
Is the lock needed when I write to the public accessed variables (C# lists) only or even when I read from them ?
Yes (even when you read).
Is the lock needed only in the asyncronous threads created by the socket handler or in other places too ?
Yes. Wherever code accesses a section of code which is shared, always lock.
This sounds like you may not be locking individual objects, but locking one thing for all lock situations.
If so put in smart discrete locks by creating individual unique objects which relate and lock only certain sections at a time, which don't interfere with other threads in other sections.
Here is an example:
// This class simulates the use of two different thread safe resources and how to lock them
// for thread safety but not block other threads getting different resources.
public class SmartLocking
{
private string StrResource1 { get; set; }
private string StrResource2 { get; set; }
private object _Lock1 = new object();
private object _Lock2 = new object();
public void DoWorkOn1( string change )
{
lock (_Lock1)
{
_Resource1 = change;
}
}
public void DoWorkOn2( string change2 )
{
lock (_Lock2)
{
_Resource2 = change2;
}
}
}
Always use a lock when you access members (either read or write). If you are iterating over a collection, and from another thread you're removing items, things can go wrong quickly.
A suggestion is when you want to iterate a collection, copy all the items to a new collection and then iterate the copy. I.e.
var newcollection; // Initialize etc.
lock(mycollection)
{
// Copy from mycollection to newcollection
}
foreach(var item in newcollection)
{
// Do stuff
}
Likewise, only use the lock the moment you are actually writing to the list.
The reason that you need to lock while reading is:
let's say you are making change to one property and it has being read twice while the thread is inbetween a lock. Once right before we made any change and another after, then we will have inconsistent results.
I hope that helps,
Basically this can be answered pretty simple:
You need to lock all the things that are accessed by different threads. It actually doesnt really matter if its about reading or writing. If you are reading and another thread is overwriting the data at the same time the data read may get invalid and you possibly are performing invalid operations.

Foreach loop and variables with locking

Basically what I want to do is this psuedo code
List<DatabaseRecord> records;
List<ChangedItem> changedItems;
Parallel.ForEach<DatabaseRecord>(records, (item, loopState) =>
{
if (item.HasChanged)
{
lock (changedItems)
{
changedItems.Add(new ChangedItem(item));
}
}
});
But what I'm worried about is locking on the changedItems. While it works, I have heard that it has to serialize the locked object over and over. Is there a better way of doing this?
Why don't you use PLinq instead? No locking needed:
changedItems = records.AsParallel()
.Where(x => x.HasChanged)
.Select(x => new ChangedItem(x))
.ToList();
Since you are projecting into a new list of ChangedItem and do not have any side effects this would be the way to go in my opinion.
Could you use a ConcurrentCollection for your changedItems. Something like ConcurrentQueue? Then you wouldn't need to lock at all.
Update:
With regard to the ConcurrentQueue, the Enqueue won't block the thread in keeping the operation thread safe. It stays in user mode with a SpinWait...
public void Enqueue(T item)
{
SpinWait wait = new SpinWait();
while (!this.m_tail.TryAppend(item, ref this.m_tail))
{
wait.SpinOnce();
}
}
I don't think that locking on the list will cause the list to serialize/deserialize as locking takes place on a private field available on all objects (syncBlockIndex). However, the recommended way to go about locking is to create a private field that you will specifically use for locking:
object _lock = new object();
This is because you have control over what you lock on. If you publish access to your list via a property, then code outside of your control might take a lock on that object and thus introduce a deadlock situation.
With regards to PLinq, I think deciding to use it depends on what your host and load is like. For example, in ASP.NET, PLINQ is more processor hungry which will get it done more quickly but at the expense of taking processing away from serving other web requests. The syntax is admittedly a lot cleaner.
It seems that you using this code in single thread ?
if it's single thread , No lock needed .
Does it works alright when you remove the line " lock (changedItems) "
The code that #BrokenGlass have posted is more clear and easy to understand.

locking only when modifying vs entire method

When should locks be used? Only when modifying data or when accessing it as well?
public class Test {
static Dictionary<string, object> someList = new Dictionary<string, object>();
static object syncLock = new object();
public static object GetValue(string name) {
if (someList.ContainsKey(name)) {
return someList[name];
} else {
lock(syncLock) {
object someValue = GetValueFromSomeWhere(name);
someList.Add(name, someValue);
}
}
}
}
Should there be a lock around the the entire block or is it ok to just add it to the actual modification? My understanding is that there still could be some race condition where one call might not have found it and started to add it while another call right after might have also run into the same situation - but I'm not sure. Locking is still so confusing. I haven't run into any issues with the above similar code but I could just be lucky so far. Any help above would be appriciated as well as any good resources for how/when to lock objects.
You have to lock when reading too, or you can get unreliable data, or even an exception if a concurrent modification physically changes the target data structure.
In the case above, you need to make sure that multiple threads don't try to add the value at the same time, so you need at least a read lock while checking whether it is already present. Otherwise multiple threads could decide to add, find the value is not present (since this check is not locked), and then all try to add in turn (after getting the lock)
You could use a ReaderWriterLockSlim if you have many reads and only a few writes. In the code above you would acquire the read lock to do the check and upgrade to a write lock once you decide you need to add it. In most cases, only a read lock (which allows your reader threads to still run in parallel) would be needed.
There is a summary of the available .Net 4 locking primitives here. Definitely you should understand this before you get too deep into multithreaded code. Picking the correct locking mechanism can make a huge performance difference.
You are correct that you have been lucky so far - that's a frequent feature of concurrency bugs. They are often hard to reproduce without targeted load testing, meaning correct design (and exhaustive testing, of course) is vital to avoid embarrassing and confusing production bugs.
Lock the whole block before you check for the existence of name. Otherwise, in theory, another thread could add it between the check, and your code that adds it.
Actually locking just when you perform the Add really doesn't do anything at all. All that would do is prevent another thread from adding something simultaneously. But since that other thread would have already decided it was going to do the add, it would just try to do it anyway as soon as the lock was released.
If a resource can only be accessed by multiple threads, you do not need any locks.
If a resource can be accessed by multiple threads and can be modified, then all accesses/modifications need to be synchronized. In your example, if GetValueFromSomeWhere takes a long time to return, it is possible for a second call to be made with the same value in name, but the value has not been stored in the Dictionary.
ReaderWriterLock or the slim version if you under 4.0.
You will aquire the reader lock for the reads (will allow for concurrent reads) and upgrade the lock to the writer lock when something is to write (will allow only one write at the time and will block all the reads until is done, as well as the concurrent write-threads).
Make sure to release your locks with the pattern to avoid deadlocking:
void Write(object[] args)
{
this.ReaderWriterLock.AquireWriteLock(TimeOut.Infinite);
try
{
this.myData.Write(args);
}
catch(Exception ex)
{
}
finally
{
this.ReaderWriterLock.RelaseWriterLock();
}
}

Categories