Is there any way to lock on an integer in C#? Integers cannot be used with lock because they are value types: they would be boxed, and lock only works on references.
The scenario is as follows: I have a forum based website with a moderation feature. What I want to do is make sure that no more than one moderator can moderate a post at any given time. To achieve this, I want to lock on the ID of the post.
I've had a couple of ideas so far (e.g. using a Dictionary<int, object>), but I'm looking for a better and cleaner way.
Any suggestions?
I like doing it like this:
public class Synchronizer {
private Dictionary<int, object> locks;
private object myLock;
public Synchronizer() {
locks = new Dictionary<int, object>();
myLock = new object();
}
public object this[int index] {
get {
lock (myLock) {
object result;
if (locks.TryGetValue(index, out result))
return result;
result = new object();
locks[index] = result;
return result;
}
}
}
}
Then, to lock on an int, you simply write the following (using the same synchronizer instance every time):
lock (sync[15]) { ... }
This class returns the same lock object when given the same index twice. When a new index arrives, it creates an object, stores it in the dictionary for subsequent calls, and returns it.
It can easily be changed to work generically with any struct or value type, or to be static so that the synchronizer object does not have to be passed around.
If this is a website, an in-process lock probably isn't the best approach. As soon as you scale the site out onto multiple servers, or add another site hosting an API (or anything else that requires another process to access the same data), all your in-process locking strategies immediately become ineffective.
I'd be inclined to look into database-based locking for this. The simplest approach is to use optimistic locking with something like a timestamp of when the post was last updated, and to reject updates made to a post unless the timestamps match.
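To make the optimistic approach concrete, here is a minimal in-memory sketch. The PostStore class and its members are hypothetical stand-ins for a Posts table with a version (or last-updated timestamp) column; in a real database the version comparison would normally sit in the WHERE clause of the UPDATE statement.

```csharp
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a Posts table with a version column.
public class PostStore
{
    private readonly object _gate = new object();
    private readonly Dictionary<int, (string Body, long Version)> _posts =
        new Dictionary<int, (string Body, long Version)>();

    public void Insert(int postId, string body)
    {
        lock (_gate) { _posts[postId] = (body, 1); }
    }

    // The caller keeps the returned Version and sends it back with the update.
    public (string Body, long Version) Read(int postId)
    {
        lock (_gate) { return _posts[postId]; }
    }

    // Succeeds only if nobody has updated the post since the caller read it.
    public bool TryUpdate(int postId, string newBody, long expectedVersion)
    {
        lock (_gate)
        {
            var current = _posts[postId];
            if (current.Version != expectedVersion)
                return false; // someone else moderated the post in the meantime
            _posts[postId] = (newBody, current.Version + 1);
            return true;
        }
    }
}
```

The second moderator to submit gets false back and can be shown the updated post instead of silently overwriting the first moderator's change.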
I've read a lot of comments mentioning that locking isn't safe for web applications, but, other than web farms, I haven't seen any explanations of why. I would be interested in hearing the arguments against it.
I have a similar need, though I'm caching re-sized images on the hard drive (which is obviously a local action so a web farm scenario isn't an issue).
Here is a redone version of what @Configurator posted. It includes a couple of features that @Configurator didn't include:
Unlocking: Ensures the list doesn't grow unreasonably large (we have millions of photos and we can have many different sizes for each).
Generic: Allows locking based on different data types (such as int or string).
Here's the code...
/// <summary>
/// Provides a way to lock a resource based on a value (such as an ID or path).
/// </summary>
public class Synchronizer<T>
{
private Dictionary<T, SyncLock> mLocks = new Dictionary<T, SyncLock>();
private object mLock = new object();
/// <summary>
/// Returns an object that can be used in a lock statement. Ex: lock(MySync.Lock(MyValue)) { ... }
/// </summary>
/// <param name="value"></param>
/// <returns></returns>
public SyncLock Lock(T value)
{
lock (mLock)
{
SyncLock theLock;
if (mLocks.TryGetValue(value, out theLock))
return theLock;
theLock = new SyncLock(value, this);
mLocks.Add(value, theLock);
return theLock;
}
}
/// <summary>
/// Unlocks the object. Called from Lock.Dispose.
/// </summary>
/// <param name="theLock"></param>
public void Unlock(SyncLock theLock)
{
lock (mLock) // guard the dictionary the same way Lock() does
{
mLocks.Remove(theLock.Value);
}
}
/// <summary>
/// Represents a lock for the Synchronizer class.
/// </summary>
public class SyncLock
: IDisposable
{
/// <summary>
/// This class should only be instantiated from the Synchronizer class.
/// </summary>
/// <param name="value"></param>
/// <param name="sync"></param>
internal SyncLock(T value, Synchronizer<T> sync)
{
Value = value;
Sync = sync;
}
/// <summary>
/// Makes sure the lock is removed.
/// </summary>
public void Dispose()
{
Sync.Unlock(this);
}
/// <summary>
/// Gets the value that this lock is based on.
/// </summary>
public T Value { get; private set; }
/// <summary>
/// Gets the synchronizer this lock was created from.
/// </summary>
private Synchronizer<T> Sync { get; set; }
}
}
Here's how you can use it...
public static readonly Synchronizer<int> sPostSync = new Synchronizer<int>();
....
using(var theLock = sPostSync.Lock(myID))
lock (theLock)
{
...
}
This option builds on the good answer provided by configurator with the following modifications:
Prevents the size of the dictionary from growing uncontrollably. Since new posts will get new ids, your dictionary of locks would otherwise grow indefinitely. The solution is to mod the id against a maximum dictionary size. This does mean that some ids will share the same lock (and will have to wait when they would otherwise not have to), but this is acceptable for a large enough dictionary size.
Uses ConcurrentDictionary so there is no need for a separate dictionary lock.
The code:
internal class IdLock
{
internal int LockDictionarySize
{
get { return m_lockDictionarySize; }
}
const int m_lockDictionarySize = 1000;
ConcurrentDictionary<int, object> m_locks = new ConcurrentDictionary<int, object>();
internal object this[ int id ]
{
get
{
object lockObject = new object();
int mapValue = id % m_lockDictionarySize;
lockObject = m_locks.GetOrAdd( mapValue, lockObject );
return lockObject;
}
}
}
Also, just for completeness, there is the alternative of string interning:
Mod the id against the maximum number of interned id strings you will allow.
Convert this modded value to a string.
Concatenate the modded string with a GUID or namespace name for name collision safety.
Intern this string.
lock on the interned string.
See this answer for some information:
The only benefit of the string interning approach is that you don't need to manage a dictionary. I prefer the dictionary of locks approach as the intern approach makes a lot of assumptions about how string interning works and that it will continue to work in this way. It also uses interning for something it was never meant / designed to do.
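For completeness only, the interning steps above could be sketched roughly as follows. The class name, the prefix, and the cap of 1000 are all assumptions made for illustration; the caveats about relying on interning still apply.

```csharp
using System;

// Sketch of the string-interning alternative; not a recommendation.
public static class InternedIdLock
{
    private const int MaxLocks = 1000;               // assumed cap on distinct lock strings
    private const string Prefix = "MyApp.PostLock:"; // hypothetical prefix to avoid name collisions

    public static object For(int id)
    {
        // id % MaxLocks lies strictly between -MaxLocks and MaxLocks, so Math.Abs cannot overflow.
        int slot = Math.Abs(id % MaxLocks);
        // string.Intern returns the same reference for equal strings.
        return string.Intern(Prefix + slot);
    }
}
```

Usage would be lock (InternedIdLock.For(postId)) { ... }. Ids that map to the same slot share one lock, exactly as in the dictionary version.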
I would personally go with either Greg's or Konrad's approach.
If you really do want to lock against the post ID itself (and assuming that your code will only ever be running in a single process) then something like this isn't too dirty:
public class ModeratorUtils
{
private static readonly HashSet<int> _LockedPosts = new HashSet<int>();
public void ModeratePost(int postId)
{
bool lockedByMe = false;
try
{
lock (_LockedPosts)
{
lockedByMe = _LockedPosts.Add(postId);
}
if (lockedByMe)
{
// do your editing
}
else
{
// sorry, can't edit at this time
}
}
finally
{
if (lockedByMe)
{
lock (_LockedPosts)
{
_LockedPosts.Remove(postId);
}
}
}
}
}
Why don't you lock on the whole posting instead just on its ID?
Coresystem at codeplex has two classes for thread synchronization based on value types; for details see http://codestand.feedbook.org/2012/06/lock-on-integer-in-c.html
I doubt you should use a database or O/S level feature such as locks for a business level decision. Locks incur significant overheads when held for long times (and in these contexts, anything beyond a couple of hundred milliseconds is an eternity).
Add a status field to the post. If you deal with several threads directly, then you can use O/S level locks, but only to set the flag.
You need a whole different approach to this.
Remember that with a website, you don't actually have a live running application on the other side that responds to what the user does.
You basically start a mini-app, which returns the web-page, and then the server is done. That the user ends up sending some data back is a by-product, not a guarantee.
So, you need the lock to persist after the application has returned the moderation page to the moderator, and then release it when the moderator is done.
And you need to handle some kind of timeout, what if the moderator closes his browser after getting the moderation page back, and thus never communicates back with the server that he/she is done with the moderation process for that post.
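A lock that outlives the request and expires by itself is often modelled as a lease. The sketch below is an in-memory illustration only; ModerationLeases and its members are hypothetical names, and on a real site the lease would be persisted (for example in the database) rather than held in process memory.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory lease table keyed by post id.
public class ModerationLeases
{
    private readonly object _gate = new object();
    private readonly Dictionary<int, (string Moderator, DateTime ExpiresAt)> _leases =
        new Dictionary<int, (string Moderator, DateTime ExpiresAt)>();
    private readonly TimeSpan _timeout;

    public ModerationLeases(TimeSpan timeout) { _timeout = timeout; }

    // Called when the moderation page is served to a moderator.
    public bool TryAcquire(int postId, string moderator, DateTime now)
    {
        lock (_gate)
        {
            if (_leases.TryGetValue(postId, out var lease)
                && lease.ExpiresAt > now
                && lease.Moderator != moderator)
            {
                return false; // someone else holds an unexpired lease
            }
            // Free, expired, or already ours: (re)start the lease.
            _leases[postId] = (moderator, now + _timeout);
            return true;
        }
    }

    // Called when the moderator submits or cancels.
    public void Release(int postId, string moderator)
    {
        lock (_gate)
        {
            if (_leases.TryGetValue(postId, out var lease) && lease.Moderator == moderator)
                _leases.Remove(postId);
        }
    }
}
```

If the moderator closes the browser and never reports back, the lease simply runs out and the next TryAcquire by someone else succeeds.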
Ideally you can avoid all the complex and brittle C# locking and replace it with database locking. If your transactions are designed correctly, you should be able to get by with DB transactions only.
Two boxed integers that happen to have the same value are completely independent objects.
So if you wanted to do this, your idea of Dictionary would probably be the way to go. You'd need to synchronize access to the dictionary to make sure you are always getting the same instance. And you'd have the problem of the dictionary growing in size.
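A tiny sketch shows why locking on a boxed int protects nothing. BoxingDemo is just an illustrative name; each conversion from int to object allocates a separate box.

```csharp
public static class BoxingDemo
{
    // Returns whether two boxes made from the same int are the same object.
    public static bool SameBox()
    {
        int postId = 15;
        object first = postId;  // boxes the value into a new object
        object second = postId; // boxes it again into another new object
        return object.ReferenceEquals(first, second);
    }
}
```

SameBox() returns false, so two threads doing lock ((object)postId) would each lock a different object and never block each other.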
C# locking is for thread safety and doesn't work the way you want it to for web applications.
The simplest solution is to add a column to the table that you want to lock, and when someone locks a row, write to the db that it is locked.
Don't let anyone open a post in edit mode if the column says it is locked for editing.
Otherwise maintain a static list of locked entry Ids and compare to that before allowing an edit.
You want to make sure that a delete doesn't happen twice?
CREATE PROCEDURE RemovePost( @postID int )
AS
if exists(select postID from Posts where postID = @postID)
BEGIN
DELETE FROM Posts where postID = @postID
-- Do other stuff
END
This is pretty much SQL server syntax, I'm not familiar with MyISAM. But it allows stored procedures. I'm guessing you can mock up a similar procedure.
Anyhow, this will work for the majority of cases. The only time it will fail is if two moderators submit at almost exactly the same time, and the exists() function passes on one request just before the DELETE statement executes on another request. I would happily use this for a small site. You could take it a step further and check that the delete actually deleted a row before continuing with the rest, which would guarantee the atomicity of it all.
Trying to create a lock in code, for this use case, I consider very impractical. You lose nothing by having two moderators attempting to delete a post, with one succeeding, and the other having no effect.
You should use a sync object like this:
public class YourForm
{
private static object syncObject = new object();
public void Moderate()
{
lock(syncObject)
{
// do your business
}
}
}
But this approach shouldn't be used in a web app scenario.
public static class ConexoesDeTeste
{
private static int NumeroDeConexoes = 0;
public static void Incrementar()
{
Interlocked.Increment(ref NumeroDeConexoes);
}
public static void Decrementar()
{
Interlocked.Decrement(ref NumeroDeConexoes);
}
public static int Obter() => NumeroDeConexoes;
}
Related
I'm trying to create my own cache implementation for an API. It is the first time I have worked with ConcurrentDictionary and I do not know if I am using it correctly. In a test, something threw an error, and so far I have not been able to reproduce it. Maybe someone experienced with concurrency or ConcurrentDictionary can look at the code and find what may be wrong. Thank you!
private static readonly ConcurrentDictionary<string, ThrottleInfo> CacheList = new ConcurrentDictionary<string, ThrottleInfo>();
public override void OnActionExecuting(HttpActionContext actionExecutingContext)
{
if (CacheList.TryGetValue(userIdentifier, out var throttleInfo))
{
if (DateTime.Now >= throttleInfo.ExpiresOn)
{
if (CacheList.TryRemove(userIdentifier, out _))
{
//TODO:
}
}
else
{
if (throttleInfo.RequestCount >= defaultMaxRequest)
{
actionExecutingContext.Response = ResponseMessageExtension.TooManyRequestHttpResponseMessage();
}
else
{
throttleInfo.Increment();
}
}
}
else
{
if (CacheList.TryAdd(userIdentifier, new ThrottleInfo(Seconds)))
{
//TODO:
}
}
}
public class ThrottleInfo
{
private int _requestCount;
public int RequestCount => _requestCount;
public ThrottleInfo(int addSeconds)
{
Interlocked.Increment(ref _requestCount);
ExpiresOn = ExpiresOn.AddSeconds(addSeconds);
}
public void Increment()
{
// this is about as thread safe as you can get.
// From MSDN: Increments a specified variable and stores the result, as an atomic operation.
Interlocked.Increment(ref _requestCount);
// you can return the result of Increment if you want the new value,
//but DO NOT set the counter to the result :[i.e. counter = Interlocked.Increment(ref counter);] This will break the atomicity.
}
public DateTime ExpiresOn { get; } = DateTime.Now;
}
If I understand what you are trying to do: if ExpiresOn has passed, remove the entry; otherwise update it, or add it if it does not exist.
You can certainly take advantage of the AddOrUpdate method to simplify some of your code.
Take a look here for some good examples: https://learn.microsoft.com/en-us/dotnet/standard/collections/thread-safe/how-to-add-and-remove-items
Hope this helps.
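As a rough sketch of what an AddOrUpdate-based version might look like: the ThrottleWindow and Throttle names below are hypothetical and are not the asker's classes. The value type is immutable, which matters because ConcurrentDictionary may invoke the update delegate more than once under contention, so the delegate must not have side effects.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical immutable replacement for a mutable throttle entry.
public sealed class ThrottleWindow
{
    public ThrottleWindow(int requestCount, DateTime expiresOn)
    {
        RequestCount = requestCount;
        ExpiresOn = expiresOn;
    }

    public int RequestCount { get; }
    public DateTime ExpiresOn { get; }
}

public static class Throttle
{
    private static readonly ConcurrentDictionary<string, ThrottleWindow> Windows =
        new ConcurrentDictionary<string, ThrottleWindow>();

    // Registers one request and returns true if it is within the limit.
    public static bool TryRegister(string user, int maxRequests, TimeSpan window, DateTime now)
    {
        ThrottleWindow updated = Windows.AddOrUpdate(
            user,
            // No entry yet: start a new window with one request.
            _ => new ThrottleWindow(1, now + window),
            // Existing entry: restart if expired, otherwise count one more request.
            (_, current) => now >= current.ExpiresOn
                ? new ThrottleWindow(1, now + window)
                : new ThrottleWindow(current.RequestCount + 1, current.ExpiresOn));

        return updated.RequestCount <= maxRequests;
    }
}
```

Expired entries are replaced in place rather than removed, which sidesteps the conditional-removal problem at the cost of keeping one small entry per user.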
The ConcurrentDictionary is sufficient as a thread-safe container only in cases where (1) the whole state that needs protection is its internal state (the keys and values it contains), and only if (2) this state can be mutated atomically using the specialized API it offers (GetOrAdd, AddOrUpdate). In your case the second requirement is not met, because you need to remove keys conditionally depending on the state of their value, and this scenario is not supported by the ConcurrentDictionary class.
So your current cache implementation is not thread safe. The fact that it throws exceptions only sporadically is a coincidence. It would still be non-thread-safe if it was totally throw-proof, because it would not be totally error-proof, meaning that it could occasionally (or permanently) transition to a state incompatible with its specifications (returning expired values for example).
Regarding the ThrottleInfo class, it suffers from a visibility bug that could remain unobserved if you tested the class extensively in one machine, and then suddenly emerge when you deployed your app in another machine with a different CPU architecture. The non-volatile private int _requestCount field is exposed through the public property RequestCount, so there is no guarantee (based on the C# specification) that all threads will see its most recent value. You can read this article by Igor Ostrovsky about the peculiarities of the memory models, which may convince you (like me) that employing lock-free techniques (using the Interlocked class in this case) with multithreaded code is more trouble than it's worth. If you read it and like it, there is also a part 2 of this article.
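If you do stay with Interlocked, one way to address the visibility concern is to read the field through Volatile.Read. This is only a sketch with a hypothetical RequestCounter class; a plain lock around both the read and the increment would be the simpler alternative.

```csharp
using System.Threading;

// Hypothetical counter: atomic increments plus a read with acquire semantics.
public class RequestCounter
{
    private int _requestCount;

    // Volatile.Read prevents the read from being cached or reordered away.
    public int RequestCount => Volatile.Read(ref _requestCount);

    // Returns the new value; do not assign the result back to the field.
    public int Increment() => Interlocked.Increment(ref _requestCount);
}
```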
Assuming the following case:
public Hashtable map = new Hashtable();
public void Cache(String fileName) {
if (!map.ContainsKey(fileName))
{
map.Add(fileName, new Object());
_Cache(fileName);
}
}
}
private void _Cache(String fileName) {
lock (map[fileName])
{
if (File Already Cached)
return;
else {
cache file
}
}
}
When having the following consumers:
Task.Run(()=> {
Cache("A");
});
Task.Run(()=> {
Cache("A");
});
Is there any way that the Cache method could throw a duplicate key exception, meaning that both tasks reach the map.Add call and try to add the same key?
Edit:
Would using the following data structure solve this concurrency problem?
public class HashMap<Key, Value>
{
private HashSet<Key> Keys = new HashSet<Key>();
private List<Value> Values = new List<Value>();
public int Count => Keys.Count;
public Boolean Add(Key key, Value value) {
int oldCount = Keys.Count;
Keys.Add(key);
if (oldCount != Keys.Count) {
Values.Add(value);
return true;
}
return false;
}
}
Yes, of course it would be possible. Consider the following fragment:
if (!map.ContainsKey(fileName))
{
map.Add(fileName, new Object());
Thread 1 may execute if (!map.ContainsKey(fileName)) and find that the map does not contain the key, so it will proceed to add it, but before it gets the chance to add it, Thread 2 may also execute if (!map.ContainsKey(fileName)), at which point it will also find that the map does not contain the key, so it will also proceed to add it. Of course, that will fail.
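The usual fix for this kind of check-then-act race is to make the check and the add a single atomic step. A minimal sketch, assuming a hypothetical FileCacheIndex class in place of the bare map:

```csharp
using System.Collections.Generic;

public class FileCacheIndex
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, object> _map = new Dictionary<string, object>();

    // Returns true only for the one caller that actually added the key.
    public bool TryRegister(string fileName)
    {
        lock (_gate)
        {
            // Check and add happen under the same lock, so no other thread
            // can slip in between them.
            if (_map.ContainsKey(fileName))
                return false;
            _map.Add(fileName, new object());
            return true;
        }
    }
}
```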
EDIT (after clarifications)
So, the problem seems to be how to keep the main map locked for as little as possible, and how to prevent cached objects from being initialized twice.
This is a complex problem, so I cannot give you a ready-to-run answer that will work, (especially since I do not currently even have a C# development environment handy,) but generally speaking, I think that you should proceed as follows:
Fully guard your map with lock().
Keep your map locked as little as possible; when an object is not found to be in the map, add an empty object to the map and exit the lock immediately. This will ensure that this map will not become a point of contention for all requests coming in to the web server.
After the check-if-present-and-add-if-not fragment, you are holding an object which is guaranteed to be in the map. However, this object may or may not be initialized at this point. That's fine. We will take care of that next.
Repeat the lock-and-check idiom, this time with the cached object: every single incoming request interested in that specific object will need to lock it, check whether it is initialized, and if not, initialize it. Of course, only the first request will suffer the penalty of initialization. Also, any requests that arrive before the object has been fully initialized will have to wait on their lock until the object is initialized. But that's all very fine, that's exactly what you want.
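The steps above could be sketched along these lines. TwoLevelCache, Entry and the load delegate are hypothetical names, and the cached content is a string purely for illustration.

```csharp
using System;
using System.Collections.Generic;

public class TwoLevelCache
{
    // Placeholder stored in the map; doubles as the per-key lock object.
    private sealed class Entry
    {
        public bool Initialized;
        public string Content;
    }

    private readonly object _mapLock = new object();
    private readonly Dictionary<string, Entry> _map = new Dictionary<string, Entry>();

    public string Get(string key, Func<string, string> load)
    {
        Entry entry;

        // Step 1: hold the map lock only long enough to find or add the placeholder.
        lock (_mapLock)
        {
            if (!_map.TryGetValue(key, out entry))
            {
                entry = new Entry();
                _map.Add(key, entry);
            }
        }

        // Step 2: lock the individual entry and initialize it at most once.
        lock (entry)
        {
            if (!entry.Initialized)
            {
                entry.Content = load(key);
                entry.Initialized = true;
            }
            return entry.Content;
        }
    }
}
```

Requests for different keys only contend briefly on the map lock, while requests for the same key queue on that key's entry until the first one finishes loading.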
I'm designing a class that I wish to make readonly after a main thread is done configuring it, i.e. "freeze" it. Eric Lippert calls this popsicle immutability. After it is frozen, it can be accessed by multiple threads concurrently for reading.
My question is how to write this in a thread safe way that is realistically efficient, i.e. without trying to be unnecessarily clever.
Attempt 1:
public class Foobar
{
private Boolean _isFrozen;
public void Freeze() { _isFrozen = true; }
// Only intended to be called by main thread, so checks if class is frozen. If it is the operation is invalid.
public void WriteValue(Object val)
{
if (_isFrozen)
throw new InvalidOperationException();
// write ...
}
public Object ReadSomething()
{
return it;
}
}
Eric Lippert seems to suggest this would be OK in this post.
I know writes have release semantics, but as far as I understand this only pertains to ordering, and it doesn't necessarily mean that all threads will see the value immediately after the write. Can anyone confirm this? This would mean this solution is not thread safe (this may not be the only reason of course).
Attempt 2:
The above, but using Interlocked.Exchange to ensure the value is actually published:
public class Foobar
{
private Int32 _isFrozen;
public void Freeze() { Interlocked.Exchange(ref _isFrozen, 1); }
public void WriteValue(Object val)
{
if (_isFrozen == 1)
throw new InvalidOperationException();
// write ...
}
}
Advantage here would be that we ensure the value is published without suffering the overhead on every read. If none of the reads are moved before the write to _isFrozen as the Interlocked method uses a full memory barrier I would guess this is thread safe. However, who knows what the compiler will do (and according to section 3.10 of the C# spec that seems like quite a lot), so I don't know if this is threadsafe.
Attempt 3:
Also do the read using Interlocked.
public class Foobar
{
private Int32 _isFrozen;
public void Freeze() { Interlocked.Exchange(ref _isFrozen, 1); }
public void WriteValue(Object val)
{
if (Interlocked.CompareExchange(ref _isFrozen, 0, 0) == 1)
throw new InvalidOperationException();
// write ...
}
}
Definitely thread safe, but it seems a little wasteful to have to do the compare exchange for every read. I know this overhead is probably minimal, but I'm looking for a reasonably efficient method (although perhaps this is it).
Attempt 4:
Using volatile:
public class Foobar
{
private volatile Boolean _isFrozen;
public void Freeze() { _isFrozen = true; }
public void WriteValue(Object val)
{
if (_isFrozen)
throw new InvalidOperationException();
// write ...
}
}
But Joe Duffy declared "sayonara volatile", so I won't consider this a solution.
Attempt 5:
Lock everything, seems a bit overkill:
public class Foobar
{
private readonly Object _syncRoot = new Object();
private Boolean _isFrozen;
public void Freeze() { lock(_syncRoot) _isFrozen = true; }
public void WriteValue(Object val)
{
lock(_syncRoot) // as above we could include an attempt that reads *without* this lock
if (_isFrozen)
throw new InvalidOperationException();
// write ...
}
}
Also seems definitely thread safe, but has more overhead than using the Interlocked approach above, so I would favour attempt 3 over this one.
And then I can come up with at least some more (I'm sure there are many more):
Attempt 6: use Thread.VolatileWrite and Thread.VolatileRead, but these are supposedly a little on the heavy side.
Attempt 7: use Thread.MemoryBarrier, seems a little too internal.
Attempt 8: create an immutable copy - don't want to do this
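For reference, attempt 6 might look roughly like the sketch below. It uses the System.Threading.Volatile class (available since .NET 4.5) as a stand-in for Thread.VolatileWrite and Thread.VolatileRead, and the class name is made up; like the other attempts, it assumes a single writer that freezes before readers arrive.

```csharp
using System;
using System.Threading;

public class VolatileFoobar
{
    private int _isFrozen; // 0 = mutable, 1 = frozen
    private object _value;

    // Release-style write: earlier writes cannot be moved after it.
    public void Freeze() { Volatile.Write(ref _isFrozen, 1); }

    // Acquire-style read: later reads cannot be moved before it.
    public bool IsFrozen => Volatile.Read(ref _isFrozen) == 1;

    public void WriteValue(object val)
    {
        if (IsFrozen)
            throw new InvalidOperationException("Object is frozen.");
        _value = val;
    }

    public object ReadSomething() { return _value; }
}
```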
Summarising:
which attempt would you use and why (or how would you do it if entirely different)? (i.e. what is the best way for publishing a value once that is then read concurrently, while being reasonably efficient without being overly "clever"?)
does .NET's memory model "release" semantics of writes imply that all other threads see updates (cache coherency etc.)? I generally don't want to think too much about this, but it's nice to have an understanding.
EDIT:
Perhaps my question wasn't clear, but I am looking in particular for reasons as to why the above attempts are good or bad. Note that I am talking here about a scenario of one single writer that writes then freezes before any concurrent reads. I believe attempt 1 is OK but I'd like to know exactly why (as I wonder if reads could be optimized away somehow, for example).
I care less about whether or not this is good design practice but more about the actual threading aspect of it.
Many thanks for the response the question received, but I have chosen to mark this as an answer myself because I feel that the answers given do not quite answer my question and I do not want to give the impression to anyone visiting the site that the marked answer is correct simply because it was automatically marked as such due to the bounty expiring.
Furthermore I do not think the answer with the highest number of votes was overwhelmingly voted for, not enough to mark it automatically as an answer.
I am still leaning to attempt #1 being correct, however, I would have liked some authoritative answers. I understand x86 has a strong model, but I don't want to (and shouldn't) code for a particular architecture, after all that's one of the nice things about .NET.
If you are in doubt about the answer, go for one of the locking approaches, perhaps with the optimizations shown here to avoid a lot of contention on the lock.
Maybe slightly off topic but just out of curiosity :) Why don't you use "real" immutability? e.g. making Freeze() return an immutable copy (without "write methods" or any other possibility to change the inner state) and using this copy instead of the original object. You could even go without changing the state and return a new copy (with the changed state) on each write operation instead (afaik the string class works this). "Real immutability" is inherently thread safe.
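A minimal sketch of that idea, with hypothetical FoobarBuilder and FrozenFoobar types: the mutable builder is used by the configuring thread only, and Freeze() hands out a separate object that has no write methods at all.

```csharp
using System.Collections.Generic;

// Immutable result: no method can change its state after construction.
public sealed class FrozenFoobar
{
    private readonly object[] _values;

    internal FrozenFoobar(object[] values) { _values = values; }

    public int Count => _values.Length;
    public object ReadSomething(int index) => _values[index];
}

// Mutable builder, intended for the configuring thread only.
public sealed class FoobarBuilder
{
    private readonly List<object> _values = new List<object>();

    public void WriteValue(object val) { _values.Add(val); }

    // ToArray copies, so later writes to the builder do not leak into the snapshot.
    public FrozenFoobar Freeze() => new FrozenFoobar(_values.ToArray());
}
```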
I vote for Attempt 5, use the lock(this) implementation.
This is the most reliable means of making this work. Reader/writer locks could be employed, but to very little gain. Just go with using a normal lock.
If necessary you could improve the 'frozen' performance by first checking _isFrozen and then locking:
void Freeze() { lock (this) _isFrozen = true; }
object ReadValue()
{
if (_isFrozen)
return Read();
else
lock (this) return Read();
}
void WriteValue(object value)
{
lock (this)
{
if (_isFrozen) throw new InvalidOperationException();
Write(value);
}
}
If you really create, fill and freeze the object before showing it to other threads, then you don't need anything special to deal with thread safety (the strong memory model of .NET is already your guarantee), so attempt 1 is valid.
But if you give the unfrozen object to another thread (or if you are simply creating your class without knowing how users will use it), then the solution that returns a new, fully immutable instance is probably better. In this case, the mutable instance is like the StringBuilder and the immutable instance is like the string. If you need an extra guarantee, the mutable instance may check its creator thread and throw exceptions if it is used from any other thread (in all methods, to avoid possible partial reads).
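The creator-thread check could be sketched like this. ThreadAffineFoobar is a hypothetical name, and Environment.CurrentManagedThreadId requires .NET 4.5 or later.

```csharp
using System;

public class ThreadAffineFoobar
{
    // Captured once, on whichever thread constructs the instance.
    private readonly int _ownerThreadId = Environment.CurrentManagedThreadId;
    private object _value;

    private void EnsureOwner()
    {
        if (Environment.CurrentManagedThreadId != _ownerThreadId)
            throw new InvalidOperationException("This instance may only be used by the thread that created it.");
    }

    public void WriteValue(object val)
    {
        EnsureOwner();
        _value = val;
    }

    public object ReadSomething()
    {
        EnsureOwner();
        return _value;
    }
}
```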
Attempt 2 is thread safe on x86 and other processors that have a strong memory model, but I would make thread safety the consumer's problem, because there is no way for you to do it efficiently within the consumed code. Consider:
if(!foo.frozen)
{
foo.apropery = "avalue";
}
The thread safety of the frozen property and the guard code in apropery's setter doesn't really matter, because even if they are perfectly thread safe you still have a race condition. Instead I would write it like this:
lock(foo)
{
if(!foo.frozen)
{
foo.apropery = "avalue";
}
}
and have neither of the properties inherently thread safe.
#1 - reader not threadsafe - I believe problem would be in reader side, not writer (code not shown)
#2 - reader not threadsafe - same as #1
#3 - promising, read check can be optimized out for most cases (when CPU caches are in sync)
Attempt 3:
Also do the read using Interlocked.
public class Foobar {
private object _syncRoot = new object();
private object _val; // the value being guarded
private int _isFrozen = 0; // perf compiler warning, but training code, so show defaults
// Why Exchange to 1 then throw away result. Best to just increment.
//public void Freeze() { Interlocked.Exchange(ref _isFrozen, 1); }
public void Freeze() { Interlocked.Increment(ref _isFrozen); }
public void WriteValue(Object val) {
// if this core can see _isFrozen then no special lock or sync needed
if (_isFrozen != 0)
throw new InvalidOperationException();
lock(_syncRoot) {
if (_isFrozen != 0)
throw new InvalidOperationException(); // the 'throw' is 100x-1000x more costly than the lock, just eat it
_val = val;
}
}
public object Read() {
// frozen is one-way, if one-way state has been published
// to my local CPU cache then just read _val.
// There are very strange corner cases when _isFrozen and _val fields are in
// different cache lines, but should be nearly impossible to hit unless
// dealing with very large structs (make it more likely to cross
// 4k cache line).
if (_isFrozen != 0)
return _val;
// else
lock(_syncRoot) { // _isFrozen is 0 here
if (_isFrozen != 0) // if _isFrozen is 1 here we just collided with writer using lock on other thread, or our CPU cache was out of sync and lock() forced the dirty cache line to be read from main memory
return _val;
throw new InvalidOperationException(); // throw is 100x-1000x more expensive than lock, eat the cost of lock
}
}
}
Joe Duffy's post about 'volatile is dead' is, I think, in the context of his next-gen CLR/OS architecture and of the CLR on ARM. For those of us doing multi-core x64/x86, I think volatile is fine. If perf is the primary concern, I suggest you measure the code above and compare it to volatile.
Unlike other folks posting answers I wouldn't jump straight to lock() if you have lots of readers (3 or more threads likely to read the same object at the same time). But in your sample you mix perf-sensitive question with exceptions when a collision happens, which doesn't make much sense. If you're using exceptions, then you can also use other higher-level constructs.
If you want complete safety but need to optimize for lots of concurrent readers change lock()/Monitor to ReaderWriterLockSlim.
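A sketch of what the ReaderWriterLockSlim variant might look like, using a hypothetical RwFoobar class; readers share the lock, while Freeze and WriteValue take it exclusively.

```csharp
using System;
using System.Threading;

public class RwFoobar
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private bool _isFrozen;
    private object _val;

    public void Freeze()
    {
        _lock.EnterWriteLock();
        try { _isFrozen = true; }
        finally { _lock.ExitWriteLock(); }
    }

    public void WriteValue(object val)
    {
        _lock.EnterWriteLock();
        try
        {
            if (_isFrozen)
                throw new InvalidOperationException();
            _val = val;
        }
        finally { _lock.ExitWriteLock(); }
    }

    // Any number of readers may hold the read lock at the same time.
    public object Read()
    {
        _lock.EnterReadLock();
        try { return _val; }
        finally { _lock.ExitReadLock(); }
    }
}
```

Whether this beats a plain lock depends on the read/write mix, so it is worth measuring before committing to it.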
.NET has new primitives to handle publishing values. Take a look at Rx. It can be very fast and lockless for some cases (I think they use optimizations similar to above).
If the value is written multiple times but only one value is kept, in Rx that is "new ReplaySubject(bufferSize: 1)". If you try it you might be surprised how fast it is. At the same time I applaud your attempt to learn this level of detail.
If you want to go lockless, get over your distaste for Thread.MemoryBarrier(). It is extremely important. But it has the same gotchas as volatile as described by Joe Duffy: it was designed as a hint to the compiler and CPU to prevent reordering of memory reads (which take a long time in CPU terms, so they are aggressively reordered when there are no hints present). When this reordering is combined with CLR constructs like auto-inlining of functions, you can see very surprising behavior at the memory and register level. MemoryBarrier() just disables those single-threaded memory access assumptions that the CPU and CLR use most of the time.
Perhaps my question wasn't clear, but I am looking in particular for reasons as to why the above attempts are good or bad. Note that I am talking here about a scenario of one single writer that writes then freezes before any concurrent reads. I believe attempt 1 is OK but I'd like to know exactly why (as I wonder if reads could be optimized away somehow, for example). I care less about whether or not this is good design practice but more about the actual threading aspect of it.
Ok, now I better understand what you are doing and looking for in a response. Allow me to elaborate on my previous answer promoting the use of locks by first addressing each of your attempts.
Attempt 1:
The approach of using a simple class that has no synchronization primitives of any form is entirely viable in your example. Since the 'authoring' thread is the only thread having access to this class during its mutating state, this should be safe. If and only if another thread has the potential to access the class before it is 'frozen' would you need to provide synchronization. Essentially, it's not possible for a thread to have a cache of something it has never seen.
Aside from a thread having a cached copy of the internal state of this list, there is one other concurrency issue that you should be concerned with. You should consider write reordering by the authoring thread. Your example solution doesn't have enough code for me to address this, but the process of handing this 'frozen' list to another thread is the heart of the issue. Are you using Interlocked.Exchange or writing to a volatile state?
I still advocate that is not the best approach simply because there is no guarantee that another thread has not seen the instance while it's mutating.
Attempt 2:
Attempt 2 should not be used. If you are using atomic writes to a member, you should also use atomic reads. I would never recommend one without the other, because without both reads and writes being atomic you haven't gained anything. The correct application of atomic reads and writes is your 'Attempt 3'.
Attempt 3:
This will guarantee an exception is thrown if a thread has attempted to mutate a frozen list. However it makes no assertion that a read is only acceptable on a frozen instance. This, IMHO, is just as bad as accessing our _isFrozen variable with atomic and non-atomic accessors. If you are going to say that it's important to safeguard writes, then you should always safeguard reads. One without the other is just 'odd'.
Overlooking my own feelings about writing code that guards writes but not reads, this is an acceptable approach given your specific uses. I have one writer, I write, I freeze, then I make it available to readers. Under this scenario your code works correctly. You rely on the atomic operation on the set of _isFrozen to provide the required memory barrier prior to handing the class to another thread.
In a nutshell this approach works, but again if a thread has an instance that is not frozen it's going to break.
Attempt 4:
While at heart this is nearly the same as attempt 3 (given one writer) there is one big difference. In this example, if you check _isFrozen in the reader then every access will require a memory barrier. This is unnecessary overhead once the list is frozen.
Still this has the same issue as Attempt 3 in that no assertions are made about the state of _isFrozen during the read so the performance should be identical in your example usage.
Attempt 5:
As I said this is my preference given the modification to read as appears in my other answer.
Attempt 6:
Is essentially the same as #4.
Attempt 7:
You could solve your specific needs with a Thread.MemoryBarrier. Essentially using the code from Attempt 1, you create the instance, call Freeze(), add your Thread.MemoryBarrier, and then share the instance (or share it within a lock). This should work great, again only under your limited use case.
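The create, freeze, barrier, share sequence described above could look roughly like the following sketch. FreezableList and Publisher are hypothetical stand-ins for the popsicle list and the code that shares it; only Thread.MemoryBarrier and Volatile.Read are real framework calls.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the popsicle list from the question.
public sealed class FreezableList
{
    private readonly List<int> _items = new List<int>();
    private bool _isFrozen;

    public void Add(int item)
    {
        if (_isFrozen) throw new InvalidOperationException("List is frozen.");
        _items.Add(item);
    }

    public void Freeze() { _isFrozen = true; }
    public int Count => _items.Count;
}

public static class Publisher
{
    // Readers obtain the list only through this field.
    private static FreezableList _shared;

    public static void BuildAndPublish()
    {
        var list = new FreezableList();
        list.Add(1);
        list.Add(2);
        list.Freeze();

        // Full fence: the writes above are not reordered past this point,
        // so they happen before the reference is stored.
        Thread.MemoryBarrier();
        _shared = list;
    }

    public static FreezableList Shared => Volatile.Read(ref _shared);
}
```

This only holds under the limited use case described: a single thread builds and freezes the instance before any other thread can see it.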
Attempt 8:
Without knowing more about this, I can't advise on the cost of the copy.
Summary
Again I prefer using a class that has some threading guarantee or none at all. Creating a class that is only 'partially' thread safe is, IMO, dangerous.
In the words of a famous jedi master:
Either do or do not there is no try.
The same goes for thread safety. The class should either be thread safe or not. Taking this approach you are left with either using my augmentation of Attempt 5, or using Attempt 7. Given the choice, I would never recommend #7.
So my recommendation stands firmly behind a completely thread-safe version. The performance cost between the two is so infinitesimally small it's almost non-existent. The reader threads will never hit the lock simply because of your usage scenario of having a single writer. Yet, if they do, proper behavior is still a certainty. Thus, as your code changes over time and suddenly your instance is being shared prior to being frozen, you don't wind up with a race condition that crashes your program. Thread safe or not, don't be half-in, or you'll wind up with a nasty surprise someday.
My preference is all classes shared by more than one thread are one of two types:
Completely immutable.
Completely Thread-safe.
Since a popsicle list is not immutable by design it does not fit #1. Therefore if you are going to share the object across threads it should fit #2.
Hopefully all this ranting further explains my reasoning :)
_syncRoot
Many people have noticed that I skipped the use of a _syncRoot on my locking implementation. While the reasons to use _syncRoot are valid they are not always necessary. In your example usage where you have a single writer the use of lock(this) should suffice nicely without adding another heap allocation for _syncRoot.
Is the thing constructed and written to, then permanently frozen and read multiple times?
Or do you freeze and unfreeze and refreeze it multiple times?
If it's the former, then perhaps the "is frozen" check should be in the reader method not the writer method (to prevent it reading before it's frozen).
Or, if it's the latter, then the use case you need to beware of is:
Main thread invokes the writer method, finds that it's not frozen, and therefore begins to write
Before the write has finished, someone tries to freeze the object and then reads from it, while the other (main) thread is still writing
In the latter case, Google shows a lot of results for multiple reader single writer which you might find interesting.
In general, each mutable object should have precisely one clearly-defined "owner"; shared objects should be immutable. Popsicles should not be accessible by multiple threads until after they are frozen.
Personally, I don't like forms of popsicle immutability with an exposed "freeze" method. I think a cleaner approach is to have AsMutable and AsImmutable methods (each of which would simply return the object unmodified when appropriate). Such an approach can allow for more robust promises about immutability. For example, if an "unshared mutable object" is being mutated while its AsImmutable member is being called (behavior which would be contrary to the object being "unshared"), the state of the data in the copy may be indeterminate, but whatever was returned would be immutable. By contrast, if one thread froze an object and then assumed it was immutable while another thread was writing to it, the "immutable" object could end up changing after it was frozen and its values were read.
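A minimal sketch of the AsMutable / AsImmutable idea follows. The type IntBag and all of its members are hypothetical, chosen only to show the shape: each method returns the same instance when no conversion is needed, and a copy otherwise.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type used only for illustration.
public sealed class IntBag
{
    private readonly List<int> _items;
    private readonly bool _isImmutable;

    public IntBag() : this(new List<int>(), false) { }

    private IntBag(List<int> items, bool isImmutable)
    {
        _items = items;
        _isImmutable = isImmutable;
    }

    public int Count => _items.Count;

    public void Add(int value)
    {
        if (_isImmutable) throw new InvalidOperationException("Instance is immutable.");
        _items.Add(value);
    }

    // Returns this instance when it is already immutable; otherwise an immutable copy.
    public IntBag AsImmutable() =>
        _isImmutable ? this : new IntBag(new List<int>(_items), true);

    // Returns this instance when it is already mutable; otherwise a mutable copy.
    public IntBag AsMutable() =>
        _isImmutable ? new IntBag(new List<int>(_items), false) : this;
}
```

Because the immutable instance owns its own copy of the data, later writes to the mutable original cannot change it.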
Edit
Based on further description, I would suggest having code which writes to the object do so within a monitor lock, and having the freeze routine look something like:
public Thingie Freeze() // Returns the object in question
{
    if (isFrozen) // Private field
        return this;
    else
        return DoFreeze();
}
Thingie DoFreeze()
{
    // The monitor is deliberately never exited once the freeze succeeds.
    if (Monitor.TryEnter(whatever))
    {
        isFrozen = true;
        return this;
    }
    else if (isFrozen)
        return this;
    else
        throw new InvalidOperationException("Object in use by writer");
}
The Freeze method may be called any number of times by any number of threads; it should be short enough to be inlined (though I haven't profiled it), and should thus take almost no time to execute. If the first access of the object in any thread is via the Freeze method, that should guarantee proper visibility under any reasonable memory model (even if the thread didn't see the updates to the object performed by the thread which created and originally froze it, it would perform the TryEnter, which would guarantee a memory barrier, and after that failed it would notice that the object was frozen and return it).
If code which is going to write the object acquires the lock first, an attempt to write to a frozen object could deadlock. If one would rather have such code throw an exception, one could use TryEnter and throw an exception if it can't get the lock.
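A writer that throws instead of blocking might be sketched as follows. The field names and the Write method are assumptions added for illustration. Note one detail that follows from Monitor being re-entrant: the thread that froze the object can still enter the monitor, so the writer also checks the flag after acquiring it.

```csharp
using System;
using System.Threading;

// Hypothetical sketch; field and member names are assumptions.
public sealed class Thingie
{
    private readonly object _gate = new object(); // private lock object
    private bool _isFrozen;
    private int _value;

    public int Value => _value;

    public void Write(int value)
    {
        // Fails immediately if another thread holds the lock (a writer, or the freezer).
        if (!Monitor.TryEnter(_gate))
            throw new InvalidOperationException("Object is frozen or in use by another writer.");
        try
        {
            // Monitors are re-entrant, so the freezing thread gets this far; check the flag too.
            if (_isFrozen)
                throw new InvalidOperationException("Object is frozen.");
            _value = value;
        }
        finally
        {
            Monitor.Exit(_gate);
        }
    }

    public Thingie Freeze()
    {
        if (_isFrozen) return this;
        if (Monitor.TryEnter(_gate))
        {
            _isFrozen = true; // the lock is intentionally never released
            return this;
        }
        if (_isFrozen) return this;
        throw new InvalidOperationException("Object in use by writer");
    }
}
```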
The object used for locking should be something which is exclusively held by the object to be frozen. If the object to be frozen doesn't hold a purely-private reference to anything, one could either lock on this or create a private object purely for locking purposes. Note that it is safe to abandon 'entered' monitor locks without cleanup; the GC will simply forget about them, since if no references exist to a lock there's no way anybody will ever care (or could even ask) whether the lock was entered at the time it was abandoned.
I am not sure in terms of cost how the following approach will do, but it is a bit different. Only initially if there are multiple threads trying to write value simultaneously will they encounter locks. Once it is frozen all later calls will get the exception directly.
Attempt 9:
public class Foobar
{
private readonly Object _syncRoot = new Object();
private object _val;
private Boolean _isFrozen;
private Action<object> WriteValInternal;
public void Freeze() { _isFrozen = true; }
public Foobar()
{
WriteValInternal = BeforeFreeze;
}
private void BeforeFreeze(object val)
{
lock (_syncRoot)
{
if (_isFrozen == false)
{
//Write the values....
_val = val;
//...
//...
//...
//and then modify the write value function
WriteValInternal = AfterFreeze;
Freeze();
}
else
{
throw new InvalidOperationException();
}
}
}
private void AfterFreeze(object val)
{
throw new InvalidOperationException();
}
public void WriteValue(Object val)
{
WriteValInternal(val);
}
public Object ReadSomething()
{
return _val;
}
}
Have you checked out Lazy<T>?
http://msdn.microsoft.com/en-us/library/dd642331.aspx
See also ThreadLocal<T>:
http://msdn.microsoft.com/en-us/library/dd642243.aspx
And actually looking further there is a Freezable class...
http://msdn.microsoft.com/en-us/library/vstudio/ms602734(v=vs.100).aspx
You can achieve this using PostSharp.
Start with an interface:
public interface IPseudoImmutable
{
bool IsFrozen { get; }
bool Freeze();
}
Then derive your attribute from InstanceLevelAspect like this:
/// <summary>
/// implement by divyang
/// </summary>
[Serializable]
[IntroduceInterface(typeof(IPseudoImmutable),
AncestorOverrideAction = InterfaceOverrideAction.Ignore, OverrideAction = InterfaceOverrideAction.Fail)]
public class PseudoImmutableAttribute : InstanceLevelAspect, IPseudoImmutable
{
private volatile bool isFrozen;
#region "IPseudoImmutable"
[IntroduceMember]
public bool IsFrozen
{
get
{
return this.isFrozen;
}
}
[IntroduceMember(IsVirtual = true, OverrideAction = MemberOverrideAction.Fail)]
public bool Freeze()
{
if (!this.isFrozen)
{
this.isFrozen = true;
}
return this.IsFrozen;
}
#endregion
[OnLocationSetValueAdvice]
[MulticastPointcut(Targets = MulticastTargets.Property | MulticastTargets.Field)]
public void OnValueChange(LocationInterceptionArgs args)
{
if (!this.IsFrozen)
{
args.ProceedSetValue();
}
}
}
public class ImmutableException : Exception
{
/// <summary>
/// The location name.
/// </summary>
private readonly string locationName;
/// <summary>
/// Initializes a new instance of the <see cref="ImmutableException"/> class.
/// </summary>
/// <param name="message">
/// The message.
/// </param>
public ImmutableException(string message)
: base(message)
{
}
public ImmutableException(string message, string locationName)
: base(message)
{
this.locationName = locationName;
}
public string LocationName
{
get
{
return this.locationName;
}
}
}
Then apply it to your class like this:
[PseudoImmutableAttribute]
public class MyTestClass
{
public string MyString { get; set; }
public int MyInitval { get; set; }
}
Then run it from multiple threads:
/// <summary>
/// The program.
/// </summary>
public class Program
{
/// <summary>
/// The main.
/// </summary>
/// <param name="args">
/// The args.
/// </param>
public static void Main(string[] args)
{
Console.Title = "Divyang Demo ";
var w = new Worker();
w.Run();
Console.ReadLine();
}
}
internal class Worker
{
private object SyncObject = new object();
public Worker()
{
var r = new Random();
this.ObjectOfMyTestClass = new MyTestClass { MyInitval = r.Next(500) };
}
public MyTestClass ObjectOfMyTestClass { get; set; }
public void Run()
{
Task readWork;
readWork = Task.Factory.StartNew(
action: () =>
{
for (;;)
{
// Task.Delay without await returns immediately, so block the thread instead.
System.Threading.Thread.Sleep(1000);
try
{
this.DoReadWork();
}
catch (Exception exception)
{
// Console.SetCursorPosition(80,80);
// Console.SetBufferSize(100,100);
Console.WriteLine("Read Exception : {0}", exception.Message);
}
}
// ReSharper disable FunctionNeverReturns
});
Task writeWork;
writeWork = Task.Factory.StartNew(
action: () =>
{
for (int i = 0; i < int.MaxValue; i++)
{
// Task.Delay without await returns immediately, so block the thread instead.
System.Threading.Thread.Sleep(1000);
try
{
this.DoWriteWork();
}
catch (Exception exception)
{
Console.SetCursorPosition(80, 80);
Console.SetBufferSize(100, 100);
Console.WriteLine("write Exception : {0}", exception.Message);
}
if (i == 5000)
{
((IPseudoImmutable)this.ObjectOfMyTestClass).Freeze();
}
}
});
// Wait on the two tasks; WaitAll with no arguments returns at once.
Task.WaitAll(readWork, writeWork);
}
/// <summary>
/// The do read work.
/// </summary>
public void DoReadWork()
{
// ThreadId where reading is done
var threadId = System.Threading.Thread.CurrentThread.ManagedThreadId;
// printing on screen
lock (this.SyncObject)
{
Console.SetCursorPosition(0, 0);
Console.SetBufferSize(290, 290);
Console.WriteLine("\n");
Console.WriteLine("Read Start");
Console.WriteLine("Read => Thread Id: {0} ", threadId);
Console.WriteLine("Read => this.objectOfMyTestClass.MyInitval: {0} ", this.ObjectOfMyTestClass.MyInitval);
Console.WriteLine("Read => this.objectOfMyTestClass.MyString: {0} ", this.ObjectOfMyTestClass.MyString);
Console.WriteLine("Read End");
Console.WriteLine("\n");
}
}
/// <summary>
/// The do write work.
/// </summary>
public void DoWriteWork()
{
// ThreadId where reading is done
var threadId = System.Threading.Thread.CurrentThread.ManagedThreadId;
// random number generator
var r = new Random();
var count = r.Next(15);
// new value for Int property
var tempInt = r.Next(5000);
this.ObjectOfMyTestClass.MyInitval = tempInt;
// new value for string Property
var tempString = "Randome" + r.Next(500).ToString(CultureInfo.InvariantCulture);
this.ObjectOfMyTestClass.MyString = tempString;
// printing on screen
lock (this.SyncObject)
{
Console.SetBufferSize(290, 290);
Console.SetCursorPosition(125, 25);
Console.WriteLine("\n");
Console.WriteLine("Write Start");
Console.WriteLine("Write => Thread Id: {0} ", threadId);
Console.WriteLine("Write => this.objectOfMyTestClass.MyInitval: {0} and New Value :{1} ", this.ObjectOfMyTestClass.MyInitval, tempInt);
Console.WriteLine("Write => this.objectOfMyTestClass.MyString: {0} and New Value :{1} ", this.ObjectOfMyTestClass.MyString, tempString);
Console.WriteLine("Write End");
Console.WriteLine("\n");
}
}
}
However, this still allows you to mutate the contents of properties such as arrays and lists. With more logic in the aspect, it could be made to work for all types of properties and fields.
I'd do something like this, inspired by C++ movable types. Just remember not to access the object after Freeze/Thaw.
Of course, you can add a _data != null check/throw if you want to be clear about why the user gets an NRE if accessing after thaw/freeze.
public class Data
{
public string _foo;
public int _bar;
}
public class Mutable
{
private Data _data;
public Mutable() => _data = new Data();
// Used by Frozen.Thaw() to hand the data back.
public Mutable(Data data) => _data = data;
public string Foo { get => _data._foo; set => _data._foo = value; }
public int Bar { get => _data._bar; set => _data._bar = value; }
public Frozen Freeze()
{
var f = new Frozen(_data);
_data = null;
return f;
}
}
public class Frozen
{
private Data _data;
public Frozen(Data data) => _data = data;
public string Foo => _data._foo;
public int Bar => _data._bar;
public Mutable Thaw()
{
var m = new Mutable(_data);
_data = null;
return m;
}
}
I have a number of static lists in my application, which are used to store data from my database and are used when looking up information:
public static IList<string> Names;
I also have some methods to refresh this data from the database:
public static void GetNames()
{
SQLEngine sql = new SQLEngine(ConnectionString);
lock (Names)
{
Names = sql.GetDataTable("SELECT * FROM Names").ToList<string>();
}
}
I initially didn't have the lock() in place; however, I noticed that very occasionally the requesting thread couldn't find the information in the list. Now, I am assuming that if the requesting thread tries to access the Names list, it can't until it has been fully updated.
Is this the correct methodology and usage of the lock() statement?
As a side note, I noticed on MSDN that one shouldn't use lock() on public variables. Could someone please elaborate in my particular scenario?
lock is only useful if all places intended to be synchronized also apply the lock. So every time you access Names you would be required to lock. At the moment, the lock only stops two threads swapping Names at the same time, which frankly isn't a problem here, as reference swaps are atomic anyway.
Another problem; presumably Names starts off null? You can't lock a null. Equally, you shouldn't lock on something that may change reference. If you want to synchronize, a common approach is something like:
// do not use for your scenario - see below
private static readonly object lockObj = new object();
then lock(lockObj) instead of your data.
With regards to not locking things that are visible externally; yes. That is because some other code could randomly choose to lock on it, which could cause unexpected blocking, and quite possibly deadlocks.
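As a rough sketch of what "every access takes the same private lock" means in practice, consider the following. NameCache, Gate, Replace and Contains are hypothetical names, and the data is passed in rather than loaded from a database.

```csharp
using System.Collections.Generic;

public static class NameCache
{
    // Private and readonly: outside code cannot lock on it, and the reference never changes.
    private static readonly object Gate = new object();
    private static List<string> _names = new List<string>();

    // Writer: builds the new list outside the lock, then swaps it in under the lock.
    public static void Replace(IEnumerable<string> fresh)
    {
        var copy = new List<string>(fresh);
        lock (Gate) { _names = copy; }
    }

    // Reader: takes the same lock as the writer.
    public static bool Contains(string name)
    {
        lock (Gate) { return _names.Contains(name); }
    }
}
```

The list itself is never exposed, so callers cannot mutate it behind the lock.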
The other big risk is that some of your code obtains the names, and then does a sort/add/remove/clear/etc - anything that mutates the data. Personally, I would be using a read-only list here. In fact, with a read-only list, all you have is a reference swap; since that is atomic, you don't need any locking:
public static IList<string> Names { get; private set; }
public static void UpdateNames() {
List<string> tmp = SomeSqlQuery();
Names = tmp.AsReadOnly();
}
And finally: public fields are very very rarely a good idea. Hence the property above. This will be inlined by the JIT, so it is not a penalty.
No, it's not correct since anyone can use the Names property directly.
public class SomeClass
{
private List<string> _names;
private object _namesLock = new object();
public IEnumerable<string> Names
{
get
{
if (_names == null)
{
lock (_namesLock )
{
if (_names == null)
GetNames(); // GetNames() returns void and assigns _names itself
}
}
return _names;
}
}
public void UpdateNames()
{
lock (_namesLock)
GetNames();
}
private void GetNames()
{
SQLEngine sql = new SQLEngine(ConnectionString);
_names = sql.GetDataTable("SELECT * FROM Names").ToList<string>();
}
}
Try to avoid static methods. At least use a singleton.
The check, lock, check is faster than a lock, check since the write will only occur once.
Assigning a property on usage is called lazy loading.
The _namesLock is required since you can't lock on null.
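If .NET 4 or later is available, the lazy-loading part can also be delegated to Lazy<T>. This is a different technique from the hand-written check, lock, check, shown here only as a sketch: NameSource and the loader delegate are hypothetical, and this version does not cover refreshing the list afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class NameSource
{
    private readonly Lazy<List<string>> _names;

    // Counts how often the loader actually ran (for demonstration only).
    public int LoadCount;

    public NameSource(Func<List<string>> loader)
    {
        // ExecutionAndPublication: at most one thread runs the loader,
        // and every thread sees the same resulting list.
        _names = new Lazy<List<string>>(
            () => { Interlocked.Increment(ref LoadCount); return loader(); },
            LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public IEnumerable<string> Names => _names.Value;
}
```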
From the code you have shown, the first time GetNames() is called the Names field is null, and locking on a null reference throws an ArgumentNullException. I would add a variable to lock on.
static object namesLock = new object();
Then in GetNames()
lock (namesLock)
{
if (Names == null)
Names = ...;
}
We do the if test inside of the lock() to stop race conditions. I'm assuming that the caller of GetNames() also does the same test.
Here's the task: I need to lock based on a filename. There can be up to a million different filenames. (This is used for large-scale disk-based caching).
I want low memory usage and low lookup times, which means I need a GC'd lock dictionary. (Only in-use locks can be present in the dict).
The callback action can take minutes to complete, so a global lock is unacceptable. High throughput is critical.
I've posted my current solution below, but I'm unhappy with the complexity.
EDIT: Please do not post solutions that are not 100% correct. For example, a solution which permits a lock to be removed from the dictionary between the 'get lock object' phase and the 'lock' phase is NOT correct, whether or not it is an 'accepted' design pattern or not.
Is there a more elegant solution than this?
Thanks!
[EDIT: I updated my code to use looping vs. recursion based on RobV's suggestion]
[EDIT: Updated the code again to allow 'timeouts' and a simpler calling pattern. This will probably be the final code I use. Still the same basic algorithm as in the original post.]
[EDIT: Updated code again to deal with exceptions inside callback without orphaning lock objects]
public delegate void LockCallback();
/// <summary>
/// Provides locking based on a string key.
/// Locks are local to the LockProvider instance.
/// The class handles disposing of unused locks. Generally used for
/// coordinating writes to files (of which there can be millions).
/// Only keeps key/lock pairs in memory which are in use.
/// Thread-safe.
/// </summary>
public class LockProvider {
/// <summary>
/// The only objects in this collection should be for open files.
/// </summary>
protected Dictionary<String, Object> locks =
new Dictionary<string, object>(StringComparer.Ordinal);
/// <summary>
/// Synchronization object for modifications to the 'locks' dictionary
/// </summary>
protected object createLock = new object();
/// <summary>
/// Attempts to execute the 'success' callback inside a lock based on 'key'. If successful, returns true.
/// If the lock cannot be acquired within 'timeoutMs', returns false.
/// In a worst-case scenario, it could take up to twice as long as 'timeoutMs' to return false.
/// </summary>
/// <param name="key"></param>
/// <param name="timeoutMs"></param>
/// <param name="success"></param>
public bool TryExecute(string key, int timeoutMs, LockCallback success){
//Record when we started. We don't want an infinite loop.
DateTime startedAt = DateTime.UtcNow;
// Tracks whether the lock acquired is still correct
bool validLock = true;
// The lock corresponding to 'key'
object itemLock = null;
try {
//We have to loop until we get a valid lock and it stays valid until we lock it.
do {
// 1) Creation/acquire phase
lock (createLock) {
// We have to lock on dictionary writes, since otherwise
// two locks for the same file could be created and assigned
// at the same time. (i.e, between TryGetValue and the assignment)
if (!locks.TryGetValue(key, out itemLock))
locks[key] = itemLock = new Object(); //make a new lock!
}
// Loophole (part 1):
// Right here - this is where another thread (executing part 2) could remove 'itemLock'
// from the dictionary, and potentially, yet another thread could
// insert a new value for 'itemLock' into the dictionary... etc, etc..
// 2) Execute phase
if (System.Threading.Monitor.TryEnter(itemLock, timeoutMs)) {
try {
// May take minutes to acquire this lock.
// Trying to detect an occurrence of loophole above
// Check that itemLock still exists and matches the dictionary
lock (createLock) {
object newLock = null;
validLock = locks.TryGetValue(key, out newLock);
validLock = validLock && newLock == itemLock;
}
// Only run the callback if the lock is valid
if (validLock) {
success(); // Extremely long-running callback, perhaps throwing exceptions
return true;
}
} finally {
System.Threading.Monitor.Exit(itemLock);//release lock
}
} else {
validLock = false; //So the finally clause doesn't try to clean up the lock, someone else will do that.
return false; //Someone else had the lock, they can clean it up.
}
//Are we out of time, still having an invalid lock?
if (!validLock && Math.Abs(DateTime.UtcNow.Subtract(startedAt).TotalMilliseconds) > timeoutMs) {
//We failed to get a valid lock in time.
return false;
}
// If we had an invalid lock, we have to try everything over again.
} while (!validLock);
} finally {
if (validLock) {
// Loophole (part 2). When loophole part 1 and 2 cross paths,
// A lock object may be removed before being used, and be orphaned
// 3) Cleanup phase - Attempt cleanup of lock objects so we don't
// have a *very* large and slow dictionary.
lock (createLock) {
// TryEnter() fails instead of waiting.
// A normal lock would cause a deadlock with phase 2.
// Specifying a timeout would add great and pointless overhead.
// Whoever has the lock will clean it up also.
if (System.Threading.Monitor.TryEnter(itemLock)) {
try {
// It succeeds, so no-one else is working on it
// (but may be preparing to, see loophole)
// Only remove the lock object if it
// still exists in the dictionary as-is
object existingLock = null;
if (locks.TryGetValue(key, out existingLock)
&& existingLock == itemLock)
locks.Remove(key);
} finally {
// Remove the lock
System.Threading.Monitor.Exit(itemLock);
}
}
}
}
}
// Ideally the only objects in 'locks' will be open operations now.
return true;
}
}
Usage example
LockProvider p = new LockProvider();
bool success = p.TryExecute("filename",1000,delegate(){
//This code executes within the lock
});
Depending on what you are doing with the files (you say disk-based caching, so I assume reads as well as writes), I would suggest trying something based on ReaderWriterLock. If you can upgrade to .NET 3.5, try ReaderWriterLockSlim instead, as it performs much better.
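For reference, the basic ReaderWriterLockSlim pattern looks like the sketch below. CacheIndex and its members are hypothetical, and this uses a single lock for one shared structure rather than one lock per file; it only shows how read and write sections are entered and exited.

```csharp
using System.Collections.Generic;
using System.Threading;

public sealed class CacheIndex
{
    private readonly ReaderWriterLockSlim _rw = new ReaderWriterLockSlim();
    private readonly Dictionary<string, long> _sizes = new Dictionary<string, long>();

    // Many threads may hold the read lock at the same time.
    public bool TryGetSize(string file, out long size)
    {
        _rw.EnterReadLock();
        try { return _sizes.TryGetValue(file, out size); }
        finally { _rw.ExitReadLock(); }
    }

    // The write lock is exclusive: it waits for readers and other writers to leave.
    public void SetSize(string file, long size)
    {
        _rw.EnterWriteLock();
        try { _sizes[file] = size; }
        finally { _rw.ExitWriteLock(); }
    }
}
```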
As a general step to reducing the potential endless recursion case in your example change the first bit of the code to the following:
do
{
// 1) Creation/acquire phase
lock (createLock){
// We have to lock on dictionary writes, since otherwise
// two locks for the same file could be created and assigned
// at the same time. (i.e, between TryGetValue and the assignment)
if (!locks.TryGetValue(key, out itemLock))
locks[key] = itemLock = new Object(); //make a new lock!
}
// Loophole (part 1):
// Right here - this is where another thread could remove 'itemLock'
// from the dictionary, and potentially, yet another thread could
// insert a new value for 'itemLock' into the dictionary... etc, etc..
// 2) Execute phase
lock(itemLock){
// May take minutes to acquire this lock.
// Real version would specify a timeout and a failure callback.
// Trying to detect an occurrence of loophole above
// Check that itemLock still exists and matches the dictionary
lock(createLock){
object newLock = null;
validLock = locks.TryGetValue(key, out newLock);
validLock = validLock && newLock == itemLock;
}
// Only run the callback if the lock is valid
if (validLock) callback(); // Extremely long-running callback.
}
// If we had an invalid lock, we have to try everything over again.
} while (!validLock);
This replaces your recursion with a loop which avoids any chance of a StackOverflow by endless recursion.
That solution sure looks brittle and complex. Having public callbacks inside locks is bad practice. Why not let LockProvider return some sort of 'lock' object, so that the consumers do the locking themselves? This separates the locking of the locks dictionary from the execution. It might look like this:
public class LockProvider
{
private readonly object globalLock = new object();
private readonly Dictionary<String, Locker> locks =
new Dictionary<string, Locker>(StringComparer.Ordinal);
public IDisposable Enter(string key)
{
Locker locker;
lock (this.globalLock)
{
if (!this.locks.TryGetValue(key, out locker))
{
this.locks[key] = locker = new Locker(this, key);
}
// Increase wait count inside the global lock
locker.WaitCount++;
}
// Call Enter and decrease wait count outside the
// global lock (to prevent deadlocks).
locker.Enter();
// Only one thread will be here at a time for a given locker.
locker.WaitCount--;
return locker;
}
private sealed class Locker : IDisposable
{
private readonly LockProvider provider;
private readonly string key;
private object keyLock = new object();
public int WaitCount;
public Locker(LockProvider provider, string key)
{
this.provider = provider;
this.key = key;
}
public void Enter()
{
Monitor.Enter(this.keyLock);
}
public void Dispose()
{
if (this.keyLock != null)
{
this.Exit();
this.keyLock = null;
}
}
private void Exit()
{
lock (this.provider.globalLock)
{
try
{
// Remove the key before releasing the lock, but
// only when no threads are waiting (because they
// will have a reference to this locker).
if (this.WaitCount == 0)
{
this.provider.locks.Remove(this.key);
}
}
finally
{
// Release the keyLock inside the globalLock.
Monitor.Exit(this.keyLock);
}
}
}
}
}
And the LockProvider can be used as follows:
public class Consumer
{
private LockProvider provider;
public void DoStufOnFile(string fileName)
{
using (this.provider.Enter(fileName))
{
// Long running operation on file here.
}
}
}
Note that Monitor.Enter is called before we enter the try statement (using), which means that in certain host environments (such as ASP.NET and SQL Server) locks may never be released when an asynchronous exception happens. Hosts like ASP.NET and SQL Server aggressively kill threads when timeouts occur. Rewriting this so that Monitor.Enter is called inside the try is a bit tricky, though.
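For a plain lock object, one way to take the lock inside the try block (assuming .NET 4 or later) is the Monitor.Enter(object, ref bool) overload. The helper below is a hypothetical sketch of that pattern, not a rewrite of the LockProvider above.

```csharp
using System;
using System.Threading;

public static class SafeLock
{
    // Runs 'body' while holding 'gate'. The lock is acquired inside the try block,
    // and lockTaken records whether it was actually obtained.
    public static void Run(object gate, Action body)
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(gate, ref lockTaken);
            body();
        }
        finally
        {
            // Only release what was really acquired.
            if (lockTaken) Monitor.Exit(gate);
        }
    }
}
```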
I hope this helps.
Could you not simply use a named Mutex, with the name derived from your filename?
Although not a lightweight synchronization primitive, it's simpler than managing your own synchronized dictionary.
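A sketch of the named-Mutex idea is shown below. FileMutex, NameFor and the "DiskCache_" prefix are hypothetical; the file name is hashed so that path separators never end up in the mutex name, and the sketch treats file names as case-sensitive, which is an assumption.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Threading;

public static class FileMutex
{
    // Builds a mutex name from a hash of the file name.
    public static string NameFor(string fileName)
    {
        using (var sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(fileName));
            return "DiskCache_" + BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // Returns false if the mutex could not be acquired within timeoutMs.
    public static bool TryExecute(string fileName, int timeoutMs, Action action)
    {
        using (var mutex = new Mutex(false, NameFor(fileName)))
        {
            bool acquired;
            try { acquired = mutex.WaitOne(timeoutMs); }
            // A previous owner exited without releasing; ownership passes to us.
            catch (AbandonedMutexException) { acquired = true; }

            if (!acquired) return false;
            try { action(); return true; }
            finally { mutex.ReleaseMutex(); }
        }
    }
}
```

The operating system discards a named mutex once the last handle to it is closed, so no dictionary cleanup is needed.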
However, if you really do want to do it this way, I'd have thought the following implementation looks simpler. You need a synchronized dictionary: either the .NET 4 ConcurrentDictionary or your own implementation if you're on .NET 3.5 or lower.
object myLock = new object();
lock (myLock)
{
    object otherLock = null;
    while (otherLock != myLock)
    {
        otherLock = lockDictionary.GetOrAdd(key, myLock);
        if (otherLock != myLock)
        {
            // Another thread has a lock in the dictionary
            if (!Monitor.TryEnter(otherLock, timeoutMs))
            {
                // Another thread still has a lock after a timeout
                failure();
                return;
            }
            else
            {
                // The other thread has finished; release its lock and retry GetOrAdd
                Monitor.Exit(otherLock);
            }
        }
    }
    // We've successfully added myLock to the dictionary
    try
    {
        // Do our stuff
        success();
    }
    finally
    {
        // ConcurrentDictionary exposes TryRemove rather than Remove
        object removed;
        lockDictionary.TryRemove(key, out removed);
    }
}
There doesn't seem to be an elegant way to do this in .NET, although I have improved the algorithm thanks to @RobV's suggestion of a loop. Here is the final solution I settled on.
It is immune to the 'orphaned reference' bug that seems to be typical of the standard pattern followed by @Steven's answer.
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
namespace ImageResizer.Plugins.DiskCache {
public delegate void LockCallback();
/// <summary>
/// Provides locking based on a string key.
/// Locks are local to the LockProvider instance.
/// The class handles disposing of unused locks. Generally used for
/// coordinating writes to files (of which there can be millions).
/// Only keeps key/lock pairs in memory which are in use.
/// Thread-safe.
/// </summary>
public class LockProvider {
/// <summary>
/// The only objects in this collection should be for open files.
/// </summary>
protected Dictionary<String, Object> locks =
new Dictionary<string, object>(StringComparer.Ordinal);
/// <summary>
/// Synchronization object for modifications to the 'locks' dictionary
/// </summary>
protected object createLock = new object();
/// <summary>
/// Attempts to execute the 'success' callback inside a lock based on 'key'. If successful, returns true.
/// If the lock cannot be acquired within 'timeoutMs', returns false.
/// In a worst-case scenario, it could take up to twice as long as 'timeoutMs' to return false.
/// </summary>
/// <param name="key"></param>
/// <param name="timeoutMs"></param>
/// <param name="success"></param>
public bool TryExecute(string key, int timeoutMs, LockCallback success){
//Record when we started. We don't want an infinite loop.
DateTime startedAt = DateTime.UtcNow;
// Tracks whether the lock acquired is still correct
bool validLock = true;
// The lock corresponding to 'key'
object itemLock = null;
try {
//We have to loop until we get a valid lock and it stays valid until we lock it.
do {
// 1) Creation/acquire phase
lock (createLock) {
// We have to lock on dictionary writes, since otherwise
// two locks for the same file could be created and assigned
// at the same time. (i.e, between TryGetValue and the assignment)
if (!locks.TryGetValue(key, out itemLock))
locks[key] = itemLock = new Object(); //make a new lock!
}
// Loophole (part 1):
// Right here - this is where another thread (executing part 2) could remove 'itemLock'
// from the dictionary, and potentially, yet another thread could
// insert a new value for 'itemLock' into the dictionary... etc, etc..
// 2) Execute phase
if (System.Threading.Monitor.TryEnter(itemLock, timeoutMs)) {
try {
// May take minutes to acquire this lock.
// Trying to detect an occurrence of loophole above
// Check that itemLock still exists and matches the dictionary
lock (createLock) {
object newLock = null;
validLock = locks.TryGetValue(key, out newLock);
validLock = validLock && newLock == itemLock;
}
// Only run the callback if the lock is valid
if (validLock) {
success(); // Extremely long-running callback, perhaps throwing exceptions
return true;
}
} finally {
System.Threading.Monitor.Exit(itemLock);//release lock
}
} else {
validLock = false; //So the finally clause doesn't try to clean up the lock, someone else will do that.
return false; //Someone else had the lock, they can clean it up.
}
//Are we out of time, still having an invalid lock?
if (!validLock && Math.Abs(DateTime.UtcNow.Subtract(startedAt).TotalMilliseconds) > timeoutMs) {
//We failed to get a valid lock in time.
return false;
}
// If we had an invalid lock, we have to try everything over again.
} while (!validLock);
} finally {
if (validLock) {
// Loophole (part 2). When loophole part 1 and 2 cross paths,
// A lock object may be removed before being used, and be orphaned
// 3) Cleanup phase - Attempt cleanup of lock objects so we don't
// have a *very* large and slow dictionary.
lock (createLock) {
// TryEnter() fails instead of waiting.
// A normal lock would cause a deadlock with phase 2.
// Specifying a timeout would add great and pointless overhead.
// Whoever has the lock will clean it up also.
if (System.Threading.Monitor.TryEnter(itemLock)) {
try {
// It succeeds, so no-one else is working on it
// (but may be preparing to, see loophole)
// Only remove the lock object if it
// still exists in the dictionary as-is
object existingLock = null;
if (locks.TryGetValue(key, out existingLock)
&& existingLock == itemLock)
locks.Remove(key);
} finally {
// Remove the lock
System.Threading.Monitor.Exit(itemLock);
}
}
}
}
}
// Ideally the only objects in 'locks' will be open operations now.
return true;
}
}
}
Consuming this code is very simple:
LockProvider p = new LockProvider();
bool success = p.TryExecute("filename",1000,delegate(){
//This code executes within the lock
});