I'm creating an app where I have a 50x50 map. On this map I can add dots, which are new instances of the class "dot". Every dot has its own thread, and every thread connected with a specific dot runs the "explore" method of the class. Inside that method there is another method, "check_place(x,y)", which is responsible for checking whether a place on the map has already been discovered. If not, the static class variable "num_discovered" should be incremented. This single method "check_place(x,y)" should be accessible in real time by every thread started in the app.
Constructor:
public dot(Form1 F)
{
    // ...
    thread = new System.Threading.Thread(new System.Threading.ThreadStart(explore)); // thread executing the explore method of the robot class
    thread.Start();
}
check_place(x,y) method:
static void check_place(int x, int y)
{
    lock (ob)
    {
        if (discovered[x, y] == false)
        {
            discovered[x, y] = true;
            num_discovered += 1;
        }
    }
}
In the explore method I invoke the "check_place(x,y)" method like this:
dot.check_place(x, y);
Is it enough to achieve a situation where at any single time only one dot can check whether a place was already discovered?
Yes. But what's the point?
If threads are spending all of their time waiting on other threads, what have you gained from being multi-threaded?
There are three (sometimes overlapping) reasons to spawn more threads:
To make use of more than one core at the same time: overall throughput increases.
To have work done while another thread is waiting on something else (typically I/O from file, DB or network): overall throughput increases.
To respond to user interaction while work is being done: overall throughput decreases, but the application feels faster to the user because they are being responded to separately.
Here the last doesn't apply.
If your "checking" involved I/O then the second might apply, and this strategy might make sense.
The first could well apply, but because all the threads are spending most of their time waiting on other threads, you don't gain an improvement in throughput.
Indeed, because there is overhead involved in setting up threads and switching between them, this code will be slower than just having one thread do everything: If only one thread can work at a time, then only have one thread!
So your use of a lock here is correct in that it prevents corruption and errors, but pointless in that it makes everything too slow.
What to do about this:
If your real case involves I/O or other reasons why the threads in fact spend most of their time out of each other's way, then what you have is fine.
Otherwise you've got two options.
Easy: Just use one thread.
Hard: Have finer locking.
One way to have finer locking would be to do double-checking:
static void check_place(int x, int y)
{
    if (!discovered[x, y])
        lock (ob)
            if (!discovered[x, y])
            {
                discovered[x, y] = true;
                num_discovered += 1;
            }
}
Now at the very least some threads will skip past some cases where discovered[x, y] is true without holding up the other threads.
This is useful when a thread is going to get a result at the end of the locked period. It's still not good enough here though, because each thread will just move on quickly to a case where it fights for the lock again.
If our lookup of discovered were itself thread-safe and that thread-safety was finely grained, then we could make some progress:
static void check_place(int x, int y)
{
    if (discovered.SetIfFalse(x, y))
        Interlocked.Increment(ref num_discovered);
}
So far though we've just moved the problem around; how do we make SetIfFalse thread-safe without using a single lock and causing the same problem?
There are a few approaches. We could use striped locks, or low-locking concurrent collections.
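For illustration, here is a hedged sketch of the striped-lock option; the class name, stripe count, and coordinate-to-stripe mapping are all my own assumptions, not code from the question:
using System.Threading;

static class StripedDiscovery // hypothetical name
{
    private const int StripeCount = 8; // arbitrary; tune against contention
    private static readonly object[] _stripes = CreateStripes();
    private static readonly bool[,] _discovered = new bool[50, 50];
    private static int _numDiscovered;

    private static object[] CreateStripes()
    {
        var stripes = new object[StripeCount];
        for (int i = 0; i != StripeCount; ++i)
            stripes[i] = new object();
        return stripes;
    }

    public static void CheckPlace(int x, int y)
    {
        // Map the cell to a stripe so threads working in different regions
        // usually take different lock objects.
        lock (_stripes[(x * 50 + y) % StripeCount])
        {
            if (!_discovered[x, y])
            {
                _discovered[x, y] = true;
                // The counter is shared across all stripes, so increment atomically.
                Interlocked.Increment(ref _numDiscovered);
            }
        }
    }
}
Threads checking cells that hash to different stripes never wait on each other; only collisions within a stripe serialize.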
It seems that you have a fixed-size structure of 50×50, in which case this isn't too hard:
private class DotMap
{
    // ints because we can't use Interlocked with bools
    private int[][] _map = new int[50][];

    public DotMap()
    {
        for (var i = 0; i != 50; ++i)
            _map[i] = new int[50];
    }

    public bool SetIfFalse(int x, int y)
    {
        // Atomically set the cell to 1 and report whether it was 0 before.
        return Interlocked.CompareExchange(ref _map[x][y], 1, 0) == 0;
    }
}
Now our advantages are:
All of our locking is much lower-level (but note that Interlocked operations will still slow down in the face of contention, albeit not as much as lock).
Much of our locking is out of the way of other locking. Specifically, the locking in SetIfFalse allows separate areas to be checked without getting in each other's way at all.
This is neither a panacea though (such approaches still suffer in the face of contention, and also bring their own costs) nor easy to generalise to other cases (changing SetIfFalse to something that does anything more than check and change that single value is not easy). It's still quite likely that even on a machine with a lot of cores this would be slower than the single-threaded approach.
Another possibility is to not have SetIfFalse thread-safe at all, but to ensure that the threads were each partitioned from each other so that they were never going to hit the same values, and that the structure is safe in the case of such multi-threaded access (fixed arrays of machine word-size elements are thread-safe when threads only ever hit different indices, but mutable structures where one can Add and/or Remove are not).
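A hedged sketch of that partitioning idea (the worker count, names, and row split are my own assumptions): each thread owns a disjoint band of rows, so no cell is ever touched by two threads and the array needs no synchronization at all.
using System.Threading;

class PartitionedExplore // hypothetical name
{
    private readonly bool[,] _discovered = new bool[50, 50];

    public void ExploreAll()
    {
        const int workerCount = 5;            // arbitrary: 50 rows split over 5 workers
        const int rowsPerWorker = 50 / workerCount;
        var threads = new Thread[workerCount];
        for (int w = 0; w != workerCount; ++w)
        {
            int firstRow = w * rowsPerWorker; // fresh per iteration, safe to capture
            int lastRow = firstRow + rowsPerWorker;
            threads[w] = new Thread(() =>
            {
                for (int x = firstRow; x != lastRow; ++x)
                    for (int y = 0; y != 50; ++y)
                        _discovered[x, y] = true; // only this thread ever hits these rows
            });
            threads[w].Start();
        }
        foreach (var t in threads)
            t.Join();
    }
}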
In all, you've got the right idea about how to use lock to keep threads from causing errors, and that is the approach to use 98% of the time when something lends itself well to multithreading because it involves threads waiting on something else. Your example, though, hits that lock too much to benefit from multiple cores, and creating code that does benefit is not trivial.
Your performance on this could potentially be pretty bad. I recommend using Task.Run here to increase efficiency when you need to run your explore method on multiple threads in parallel.
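As a rough sketch (assuming explore were made callable from outside, the dot constructor no longer started its own thread, and this runs inside an async method):
// Sketch only: "dots" is a hypothetical collection of already-constructed dots.
var tasks = new List<Task>();
foreach (var d in dots)
    tasks.Add(Task.Run(() => d.explore())); // runs on the thread pool
await Task.WhenAll(tasks);                  // wait for all explorations to finish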
As far as locking and thread safety go: if the lock in check_place is the only place you're setting bools in the discovered variable and setting the num_discovered variable, the existing code will work. If you start setting them from somewhere else in the code, you will need to use locks there as well.
Also, when reading from these variables, you should read the values into local variables inside locks that use the same lock object, to maintain thread safety there as well.
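For example, a read might look like this (assuming ob and num_discovered are visible at the call site):
int discoveredSoFar;
lock (ob) // the same lock object the writers use
{
    discoveredSoFar = dot.num_discovered;
}
// work with the local copy outside the lock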
I have other suggestions but those are the two most basic things you need here.
Related
I have a thread safe class which uses a particular resource that needs to be accessed exclusively. In my assessment it does not make sense to have the callers of various methods block on a Monitor.Enter or await a SemaphoreSlim in order to access this resource.
For instance I have some "expensive" asynchronous initialization. Since it does not make sense to initialize more than once, whether it be from multiple threads or a single one, multiple calls should return immediately (or even throw an exception). Instead one should create, init and then distribute the instance to multiple threads.
UPDATE 1:
MyClass uses two NamedPipes in either direction. The InitBeforeDistribute method is not really initialization, but rather properly setting up a connection in both directions. It does not make sense to make the pipe available to N threads before you have set up the connection. Once it is setup multiple threads can post work, but only one can actually read/write to the stream. My apologies for obfuscating this with poor naming of the examples.
UPDATE 2:
If InitBeforeDistribute implemented a SemaphoreSlim(1, 1) with proper await logic (instead of the interlocked operation throwing an exception), is the Add/Do Square method OK practice? It does not throw a redundant exception (such as in InitBeforeDistribute), while being lock-free?
The following would be a good bad example:
class MyClass
{
    private int m_isIniting = 0;               // exclusive access "lock"
    private volatile bool vm_isInited = false; // volatile because other methods will read it

    public async Task InitBeforeDistribute()
    {
        if (Interlocked.Exchange(ref this.m_isIniting, -1) != 0)
            throw new InvalidOperationException(
                "Cannot init concurrently! Did you distribute before init was finished?");
        try
        {
            if (this.vm_isInited)
                return;
            await Task.Delay(5000) // init asynchronously
                .ConfigureAwait(false);
            this.vm_isInited = true;
        }
        finally
        {
            Interlocked.Exchange(ref this.m_isIniting, 0);
        }
    }
}
Some points:
1. If there is a case where blocking/awaiting access to a lock makes perfect sense, then this example does not (make sense, that is).
2. Since I need to await in the method, I must use something like a SemaphoreSlim if I were to use a "proper" lock. Forgoing the semaphore for the example above allows me not to worry about disposing the class once I'm done with it. (I always disliked the idea of disposing an item used by multiple threads. This is a minor positive, for sure.)
3. If the method is called often there might be some performance benefits, which of course should be measured.
The above example does not make sense in ref. to (3.) so here is another example:
class MyClass
{
    private volatile bool vm_isInited = false; // see above example
    private int m_isWorking = 0;               // exclusive access "lock"
    private readonly ConcurrentQueue<Tuple<int, TaskCompletionSource<int>>> m_squareWork =
        new ConcurrentQueue<Tuple<int, TaskCompletionSource<int>>>();

    public Task<int> AddSquare(int number)
    {
        if (!this.vm_isInited) // see above example
            throw new InvalidOperationException(
                "You forgot to init! Did you already distribute?");
        var work = new Tuple<int, TaskCompletionSource<int>>(number, new TaskCompletionSource<int>());
        this.m_squareWork.Enqueue(work);
        Task ignored = DoSquare(); // fire and forget
        return work.Item2.Task;
    }

    private async Task DoSquare()
    {
        if (Interlocked.Exchange(ref this.m_isWorking, -1) != 0)
            return; // let someone else do the work for you
        do
        {
            try
            {
                Tuple<int, TaskCompletionSource<int>> work;
                while (this.m_squareWork.TryDequeue(out work))
                {
                    await Task.Delay(5000)      // Limiting resource that can only be
                        .ConfigureAwait(false); // used by one thread at a time.
                    work.Item2.TrySetResult(work.Item1 * work.Item1);
                }
            }
            finally
            {
                Interlocked.Exchange(ref this.m_isWorking, 0);
            }
        } while (this.m_squareWork.Count != 0 &&
                 Interlocked.Exchange(ref this.m_isWorking, -1) == 0);
    }
}
Are there some of the specific negative aspects of this "lock-free" example that I should pay attention to?
Most questions relating to "lock-free" code on SO generally advise against it, stating that it is for the "experts". Rarely (I could be wrong on this one) do I see suggestions for books/blogs/etc. that one can delve into, should one be so inclined. If there are any such resources I should look into, please share. Any suggestions will be highly appreciated!
Update: a great related article: Creating High-Performance Locks and Lock-free Code (for .NET)
The main point about lock-free algorithms is not that they are for experts.
The main point is: do you really need a lock-free algorithm here? I can't understand your logic here:
Since it does not make sense to initialize more than once, whether it be from multiple threads or a single one, multiple calls should return immediately (or even throw an exception).
Why can't your users simply wait for the result of initialization and use your resource after that? If you can do that, simply use the Lazy<T> class or even Asynchronous Lazy Initialization.
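For example, a minimal sketch of asynchronous lazy initialization built on Lazy<Task> (the Task.Delay stands in for the real setup, as in your example):
using System;
using System.Threading;
using System.Threading.Tasks;

class MyClass
{
    private readonly Lazy<Task> m_init;

    public MyClass()
    {
        // ExecutionAndPublication guarantees the factory runs exactly once.
        m_init = new Lazy<Task>(InitAsync,
            LazyThreadSafetyMode.ExecutionAndPublication);
    }

    private async Task InitAsync()
    {
        await Task.Delay(5000).ConfigureAwait(false); // stand-in for the real setup
    }

    // Every caller awaits the same task; callers after completion return immediately.
    public Task InitBeforeDistribute()
    {
        return m_init.Value;
    }
}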
You really should read about consensus numbers and CAS operations, and why they matter when implementing your own synchronization primitives.
In your code you are using the Interlocked.Exchange method, which isn't really a CAS, as it always exchanges the value, and it has a consensus number of 2. This means that a primitive built with this construction will work correctly for only 2 threads (not in your exact situation, but still 2).
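The difference in a nutshell (state here is just an illustrative field):
// Exchange writes unconditionally and returns the old value;
// CompareExchange (a true CAS) writes only if the current value matches the expected one.
int oldValue = Interlocked.Exchange(ref state, -1);           // always stores -1
int witness  = Interlocked.CompareExchange(ref state, -1, 0); // stores -1 only if state == 0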
I tried to work out whether your code works correctly for 3 threads, or whether there are circumstances that could lead your application into a damaged state, but I gave up after 30 minutes. And any of your team members will give up the same way after some time spent trying to understand your code. This is a waste of time, not only yours but your team's. Don't reinvent the wheel until you really have to.
My favorite book in this area is Writing High-Performance .NET Code by Ben Watson, and my favorite blog is Stephen Cleary's. If you can be more specific about what kind of book you are interested in, I can add some more references.
Having no locks in the program doesn't make your application lock-free. In .NET applications you really should not use exceptions for internal program flow. Consider that the initializing thread isn't scheduled for a while by the OS (for various reasons, whatever they may be).
In this case all other threads in your app will die, step by step, trying to access your shared resource. I can't say that this is lock-free code. Yes, there are no locks in it, but it doesn't guarantee the correctness of the program, and thus it isn't lock-free by definition.
The Art of Multiprocessor Programming by Maurice Herlihy and Nir Shavit is a great resource for lock-free and wait-free programming. Lock-freedom is a progress guarantee rather than a mode of programming, so to argue that an algorithm is lock-free, one has to validate or show proof of the progress guarantee. Lock-freedom in simple terms implies that the blocking or halting of one thread does not block the progress of other threads, or that if a thread is blocked infinitely often, then there is another thread that makes progress infinitely often.
This is more of a conceptual question. I was wondering whether using a lock inside a Parallel.ForEach<> loop would take away the benefits of parallelizing a foreach loop.
Here is some sample code where I have seen it done.
Parallel.ForEach<KeyValuePair<string, XElement>>(binReferences.KeyValuePairs, reference =>
{
    lock (fileLockObject)
    {
        if (fileLocks.ContainsKey(reference.Key) == false)
        {
            fileLocks.Add(reference.Key, new object());
        }
    }
    RecursiveBinUpdate(reference.Value, testPath, reference.Key, maxRecursionCount, ref recursionCount);
    lock (fileLocks[reference.Key])
    {
        reference.Value.Document.Save(reference.Key);
    }
});
Where fileLockObject and fileLocks are as follows.
private static object fileLockObject = new object();
private static Dictionary<string, object> fileLocks = new Dictionary<string, object>();
Does this technique completely make the loop not parallel?
I would like to see your thoughts on this.
It means all of the work inside the lock can't be done in parallel. This greatly harms the performance here, yes. Since the entire body is not locked (and not all locked on the same object), there is still some parallelization here, though. Whether the parallelization you do get adds enough benefit to surpass the overhead that comes with managing the threads and synchronizing around the locks is something you really just need to test yourself with your specific data.
That said, it looks like what you're doing (at least in the first locked block, which is the one I'd be more concerned with, as every thread is locking on the same object) is locking access to a Dictionary. You can instead use a ConcurrentDictionary, which is specifically designed to be used from multiple threads and will minimize the amount of synchronization that needs to be done.
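For instance, with fileLocks declared as a ConcurrentDictionary (System.Collections.Concurrent), the first locked block collapses into a single atomic call; a sketch:
private static readonly ConcurrentDictionary<string, object> fileLocks =
    new ConcurrentDictionary<string, object>();

// Inside the Parallel.ForEach body:
// GetOrAdd atomically returns the existing lock object for the key,
// or inserts and returns a new one, with no explicit lock needed.
object fileLock = fileLocks.GetOrAdd(reference.Key, _ => new object());
lock (fileLock)
{
    reference.Value.Document.Save(reference.Key);
}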
if I used a lock ... would that take away the benefits of parallelizing a foreach loop.
Proportionally. When RecursiveBinUpdate() is a big chunk of work (and independent), then it will still pay off. The locking part could be less than 1%, or 99%. Look up Amdahl's law; it applies here.
But worse, your code is not thread-safe. Of your 2 operations on fileLocks, only the first is actually inside a lock.
lock (fileLockObject)
{
    if (fileLocks.ContainsKey(reference.Key) == false)
    {
        ...
    }
}
and
lock (fileLocks[reference.Key]) // this access to fileLocks[] is not protected
change the 2nd part to:
lock (fileLockObject)
{
    reference.Value.Document.Save(reference.Key);
}
and the use of ref recursionCount as a parameter looks suspicious too. It might work with Interlocked.Increment though.
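That is, instead of recursionCount++ through the ref parameter, something like:
Interlocked.Increment(ref recursionCount); // atomic increment, safe across the parallel loop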
The "locked" portion of the loop will end up running serially. If the RecursiveBinUpdate function is the bulk of the work, there may be some gain, but it would be better if you could figure out how to handle the lock generation in advance.
When it comes to locks, there's no difference in the way PLINQ/TPL threads have to wait to gain access. So, in your case, it only makes the loop non-parallel in those areas that you're locking, and any work outside those locks will still execute in parallel (i.e. all the work in RecursiveBinUpdate).
Bottom line, I see nothing substantially wrong with what you're doing here.
How come if I have a statement like this:
private int sharedValue = 0;

public void SomeMethodOne()
{
    lock (this) { sharedValue++; }
}

public void SomeMethodTwo()
{
    lock (this) { sharedValue--; }
}
So for a thread to get into a lock, it must first check whether another thread is operating on it. If no thread is, it can enter, and it then has to write something to memory; this surely cannot be atomic, as it needs to read and write.
So how come it's impossible for one thread to be reading the lock, while the other is writing its ownership to it?
To simplify: why can't two threads both get into a lock at the same time?
It looks like you are basically asking how the lock works. How can the lock maintain internal state in an atomic manner without already having a lock built? It seems like a chicken-and-egg problem at first, does it not?
The magic all happens because of a compare-and-swap (CAS) operation. The CAS operation is a hardware level instruction that does 2 important things.
It generates a memory barrier so that instruction reordering is constrained.
It compares the contents of a memory address with another value and if they are equal then the original value is replaced with a new value. It does all of this in an atomic manner.
At the most fundamental level this is how the trick is accomplished. It is not that all other threads are blocked from reading while another is writing. That is totally the wrong way to think about it. What actually happens is that all threads are acting as writers simultaneously. The strategy is more optimistic than it is pessimistic. Every thread is trying to acquire the lock by performing this special kind of write called a CAS. You actually have access to a CAS operation in .NET via the Interlocked.CompareExchange (ICX) method. Every synchronization primitive can be built from this single operation.
If I were going to write a Monitor-like class (that is what the lock keyword uses behind the scenes) from scratch entirely in C# I could do it using the Interlocked.CompareExchange method. Here is an overly simplified implementation. Please keep in mind that this is most certainly not how the .NET Framework does it.1 The reason I present the code below is to show you how it could be done in pure C# code without the need for CLR magic behind the scenes and because it might get you thinking about how Microsoft could have implemented it.
public class SimpleMonitor
{
    private int m_LockState = 0;

    public void Enter()
    {
        int iterations = 0;
        while (!TryEnter())
        {
            if (iterations < 10) Thread.SpinWait(4 << iterations);
            else if (iterations % 20 == 0) Thread.Sleep(1);
            else if (iterations % 5 == 0) Thread.Sleep(0);
            else Thread.Yield();
            iterations++;
        }
    }

    public void Exit()
    {
        if (!TryExit())
        {
            throw new SynchronizationLockException();
        }
    }

    public bool TryEnter()
    {
        // CAS: acquire only if the state transitions from 0 (free) to 1 (held).
        return Interlocked.CompareExchange(ref m_LockState, 1, 0) == 0;
    }

    public bool TryExit()
    {
        // CAS: release only if the lock is currently held (1 -> 0).
        return Interlocked.CompareExchange(ref m_LockState, 0, 1) == 1;
    }
}
This implementation demonstrates a couple of important things.
It shows how the ICX operation is used to atomically read and write the lock state.
It shows how the waiting might occur.
Notice how I used Thread.SpinWait, Thread.Sleep(0), Thread.Sleep(1) and Thread.Yield while the lock is waiting to be acquired. The waiting strategy is overly simplified, but it does approximate a real life algorithm implemented in the BCL already. I intentionally kept the code simple in the Enter method above to make it easier to spot the crucial bits. This is not how I would have normally implemented this, but I am hoping it does drive home the salient points.
Also note that my SimpleMonitor above has a lot of problems. Here are but only a few.
It does not handle nested locking (a rough sketch of what adding that might look like follows below).
It does not provide Wait or Pulse methods like the real Monitor class. They are really hard to do right.
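For the first point, here is a rough sketch (my own, and certainly not how the BCL does it) of how re-entrancy could be layered on top of SimpleMonitor by remembering the owning thread:
using System.Threading;

public class ReentrantMonitor // hypothetical wrapper
{
    private readonly SimpleMonitor m_Inner = new SimpleMonitor();
    private int m_OwnerThreadId = -1; // only written by the current lock holder
    private int m_RecursionCount;

    public void Enter()
    {
        int me = Thread.CurrentThread.ManagedThreadId;
        if (m_OwnerThreadId == me)
        {
            m_RecursionCount++; // already held by this thread: just count the nesting
            return;
        }
        m_Inner.Enter();
        m_OwnerThreadId = me;
        m_RecursionCount = 1;
    }

    public void Exit()
    {
        if (m_OwnerThreadId != Thread.CurrentThread.ManagedThreadId)
            throw new SynchronizationLockException();
        if (--m_RecursionCount == 0)
        {
            m_OwnerThreadId = -1; // clear ownership before releasing
            m_Inner.Exit();
        }
    }
}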
1The CLR will actually use a special block of memory that exists on each reference type. This block of memory is referred to as the "sync block". The Monitor will manipulate bits in this block of memory to acquire and release the lock. This action may require a kernel event object. You can read more about it on Joe Duffy's blog.
lock in C# uses a Monitor behind the scenes to do the actual locking.
You can read more about Monitor here: http://msdn.microsoft.com/en-us/library/system.threading.monitor.aspx. The Enter method of the Monitor class ensures that only one thread can enter the critical section at a time:
Acquires a lock for an object. This action also marks the beginning of a critical section. No other thread can enter the critical section unless it is executing the instructions in the critical section using a different locked object.
BTW, you should avoid locking on this (lock(this)). You should use a private variable in the class (static or non-static) to protect the critical section. You can read more in the same link provided above, but the reason is:
When selecting an object on which to synchronize, you should lock only on private or internal objects. Locking on external objects might result in deadlocks, because unrelated code could choose the same objects to lock on for different purposes.
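Applied to the code in the question, that means something like this (the field name syncRoot is arbitrary):
private int sharedValue = 0;
private readonly object syncRoot = new object(); // private lock object

public void SomeMethodOne()
{
    lock (syncRoot) { sharedValue++; }
}

public void SomeMethodTwo()
{
    lock (syncRoot) { sharedValue--; }
}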
I have a server which handles multiple incoming socket connections and creates 2 different threads which store the data in XML format.
I was using the lock statement for thread safety in almost every event handler called asynchronously, and in the 2 threads, in different parts of the code. Sadly, with this approach my application slows down significantly.
I tried not using lock at all, and the server runs very fast; even the file storage seems to be faster. But the program crashes, for reasons I don't understand, after 30 sec - 1 min of work.
So I thought that the best way is to use fewer locks, or to use them only where strictly necessary. As such, I have 2 questions:
Is the lock needed only when I write to publicly accessible variables (C# lists), or even when I read from them?
Is the lock needed only in the asynchronous threads created by the socket handler, or in other places too?
Could someone give me some practical guidelines on how to proceed? I won't post the whole code this time; it doesn't make sense to post about 2,500 lines of code.
You ever sit in your car or on the bus at a red light when there's no cross traffic? Big waste of time, right? A lock is like a perfect traffic light. It is always green except when there is traffic in the intersection.
Your question is "I spend too much time in traffic waiting at red lights. Should I just run the red light? Or even better, should I remove the lights entirely and just let everyone drive through the intersection at highway speeds without any intersection controls?"
If you're having a performance problem with locks then removing locks is the last thing you should do. You are waiting at that red light precisely because there is cross traffic in the intersection. Locks are extraordinarily fast if they are not contended.
You can't eliminate the light without eliminating the cross traffic first. The best solution is therefore to eliminate the cross traffic. If the lock is never contended then you'll never wait at it. Figure out why the cross traffic is spending so much time in the intersection; don't remove the light and hope there are no collisions. There will be.
If you can't do that, then adding more finely-grained locks sometimes helps. That is, maybe you have every road in town converging on the same intersection. Maybe you can split that up into two intersections, so that code can be moving through two different intersections at the same time.
Note that making the cars faster (getting a faster processor) or making the roads shorter (eliminating code path length) often makes the problem worse in multithreaded scenarios. Just as it does in real life; if the problem is gridlock then buying faster cars and driving them on shorter roads gets them to the traffic jam faster, but not out of it faster.
Is the lock needed only when I write to publicly accessible variables (C# lists), or even when I read from them?
Yes (even when you read).
Is the lock needed only in the asynchronous threads created by the socket handler, or in other places too?
Yes. Wherever code accesses shared state, always lock.
This sounds like you may not be locking individual objects, but locking one thing for all lock situations.
If so, put in smart discrete locks by creating individual unique objects which relate to, and lock, only certain sections at a time, and which don't interfere with other threads in other sections.
Here is an example:
// This class simulates the use of two different thread-safe resources and how to lock them
// for thread safety without blocking other threads getting different resources.
public class SmartLocking
{
    private string StrResource1 { get; set; }
    private string StrResource2 { get; set; }

    private readonly object _Lock1 = new object();
    private readonly object _Lock2 = new object();

    public void DoWorkOn1(string change)
    {
        lock (_Lock1)
        {
            StrResource1 = change;
        }
    }

    public void DoWorkOn2(string change2)
    {
        lock (_Lock2)
        {
            StrResource2 = change2;
        }
    }
}
Always use a lock when you access members (either read or write). If you are iterating over a collection while another thread is removing items from it, things can go wrong quickly.
A suggestion: when you want to iterate a collection, copy all the items to a new collection and then iterate the copy, i.e.:
var newcollection = new List<object>(); // initialize etc. (use the element type of mycollection)
lock (mycollection)
{
    // Copy from mycollection to newcollection
}
foreach (var item in newcollection)
{
    // Do stuff
}
Likewise, only take the lock at the moment you are actually writing to the list.
The reason that you need to lock while reading is this: say one thread is making a change to a property while another thread reads it twice, once right before the change and once right after; the reader will see inconsistent results.
I hope that helps.
Basically this can be answered pretty simply:
You need to lock everything that is accessed by different threads. It actually doesn't really matter whether it's reading or writing. If you are reading while another thread is overwriting the data at the same time, the data you read may be invalid, and you may end up performing invalid operations.
I have two internal properties that use lazy-loading of backing fields and are used in a multi-threaded application, so I have implemented a double-checked locking scheme as per this MSDN article.
Now, firstly assuming that this is an appropriate pattern, all the examples show creating a single lock object for an instance. If my two properties are independent of each other, would it not be more efficient to create a lock instance for each property?
It occurs to me that maybe there is only one in order to avoid deadlocks or race conditions. An obvious situation doesn't come to mind, but I'm sure someone can show me one... (I'm not very experienced with multi-threaded code, obviously.)
private List<SomeObject1> _someProperty1;
private List<SomeObject2> _someProperty2;
private readonly object _syncLockSomeProperty1 = new object();
private readonly object _syncLockSomeProperty2 = new object();

internal List<SomeObject1> SomeProperty1
{
    get
    {
        if (_someProperty1 == null)
        {
            lock (_syncLockSomeProperty1)
            {
                if (_someProperty1 == null)
                {
                    _someProperty1 = new List<SomeObject1>();
                }
            }
        }
        return _someProperty1;
    }
    set
    {
        _someProperty1 = value;
    }
}

internal List<SomeObject2> SomeProperty2
{
    get
    {
        if (_someProperty2 == null)
        {
            lock (_syncLockSomeProperty2)
            {
                if (_someProperty2 == null)
                {
                    _someProperty2 = new List<SomeObject2>();
                }
            }
        }
        return _someProperty2;
    }
    set
    {
        _someProperty2 = value;
    }
}
If your properties are truly independent, then there's no harm in using independent locks for each of them.
In case the two properties (or, more specifically, their initializers) are independent of each other, as in the sample code you provided, it makes sense to have two different lock objects. However, when the initialization occurs rarely, the effect will be negligible.
Note that you should protect the setter's code as well. The lock statement imposes a so-called memory barrier, which is indispensable, especially on multi-CPU and/or multi-core systems, to prevent race conditions.
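For example, the setter of the first property could take the same lock as its getter:
set
{
    lock (_syncLockSomeProperty1)
    {
        _someProperty1 = value;
    }
}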
Yes, if they are independent of each other, this would indeed be more efficient, as access to one won't block access to the other. You're also on the money about the risk of a deadlock if that independence turned out to be false.
The question is, presuming that _someProperty1 = new List<SomeObject1>(); isn't the real code for assigning to _someProperty1 (hardly worth the lazy-load, is it?): can the code that fills SomeProperty1 ever call that which fills SomeProperty2, or vice versa, through any code path, no matter how bizarre?
Even if one can call the other, there can't be a deadlock, but if they can both call each other (or 1 calls 2, 2 calls 3, and 3 calls 1, and so on), then a deadlock can definitely happen.
As a rule, I'd start with broad locks (one lock for all locked tasks) and then make the locks narrower as an optimisation as needed. In cases where you have, say, 20 methods which need locking, then judging the safety can be harder (also, you begin to fill memory just with lock objects).
Note that there are two issues with your code also:
First, you don't lock in your setter. Possibly this is fine (you just want your lock to prevent multiple heavy calls to the loading method, and don't actually care if there are overwrites between the set and the get); possibly this is a disaster.
Second, depending on the CPU running it, double-checking as you've written it can have issues with read/write reordering, so you should either have a volatile field or call a memory barrier. See http://blogs.msdn.com/b/brada/archive/2004/05/12/130935.aspx
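With the volatile option, the backing-field declarations would simply become:
private volatile List<SomeObject1> _someProperty1;
private volatile List<SomeObject2> _someProperty2;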
Edit:
It's also worth considering whether it's really needed at all.
Consider that the operation itself should be thread-safe:
1. A bunch of stuff is done.
2. An object is created based on that bunch of stuff.
3. That object is assigned to the field.
Steps 1 and 2 happen on the thread doing the work, and step 3 is atomic. Therefore, the advantage of locking is:
1. If performing step 1 and/or 2 has its own threading issues, and isn't protected from them by its own locks, then locking is 100% necessary.
2. If it would be disastrous for something to have acted upon a value obtained in steps 1 and 2, and then later to do so again with steps 1 and 2 repeated, locking is 100% necessary.
3. Locking will prevent the waste of steps 1 and 2 being done multiple times.
So, if we can rule out the first two cases (it takes a bit of analysis, but it's often possible), then we only have the wasted work in case 3 to worry about. Now, maybe this is a big worry. However, if it would rarely come up, and would not be that much of a waste when it did, then the gains of not locking would outweigh the gains of locking.
If in doubt, locking is probably the safer approach, but it's possible that just living with the occasional wasted operation is better.