Would the following method ensure that only one thread can read an ID at a time? I have a parallel process which uses the following method and I need it to return unique IDs. Unfortunately I cannot change the way the ID is structured.
private static int Seq = 0;
private static long dtDiff = 0;
private static object thisLock = new object();

private static object BuildClientID(string Code)
{
    lock (thisLock)
    {
        object sReturn = "";
        Seq++;
        dtDiff++;
        if (Seq == 1000)
        {
            Seq = 0;
            dtDiff = DateAndTime.DateDiff(DateInterval.Second, DateTime.Parse("1970-01-01"), DateTime.Now);
        }
        sReturn = dtDiff.ToString() + Code + Seq.ToString("000");
        return sReturn;
    }
}
I don't see any reason it wouldn't. Both the lock object and the method are static. The only thing left to determine is whether you need a more sophisticated synchronization primitive such as a Mutex, SpinLock, ReaderWriterLock, or Semaphore.
You'll need to study those to decide which, if any, fits your situation.
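For illustration, here is a minimal sketch (names are mine, not from the question) of ReaderWriterLockSlim, which pays off when reads vastly outnumber writes; a plain lock is simpler and usually enough for a case like yours:

private static readonly ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();
private static string cachedValue;

public static string ReadValue()
{
    rwLock.EnterReadLock(); // many threads may hold the read lock at once
    try { return cachedValue; }
    finally { rwLock.ExitReadLock(); }
}

public static void WriteValue(string newValue)
{
    rwLock.EnterWriteLock(); // exclusive: blocks readers and other writers
    try { cachedValue = newValue; }
    finally { rwLock.ExitWriteLock(); }
}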
Yes, it will work fine, as both threads will use the same static object as the lock and will have to wait for each other.
Edit: based on Dan's comments, consider making Seq and dtDiff properties and putting all access to them inside the same lock.
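A hedged sketch of that suggestion (the backing field names are illustrative; note that compound operations such as Seq++ still need one enclosing lock, which works here because Monitor locks are reentrant):

private static int seq = 0;
private static long diff = 0;
private static readonly object thisLock = new object();

// Every access goes through the same lock, not just BuildClientID.
private static int Seq
{
    get { lock (thisLock) { return seq; } }
    set { lock (thisLock) { seq = value; } }
}

private static long DtDiff
{
    get { lock (thisLock) { return diff; } }
    set { lock (thisLock) { diff = value; } }
}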
Related
I have multiple threads writing data to a common source, and I would like two threads to block each other if and only if they are touching the same piece of data.
It would be nice to have a way to lock specifically on an arbitrary key:
string id = GetNextId();

AcquireLock(id);
try
{
    DoDangerousThing();
}
finally
{
    ReleaseLock(id);
}
If nobody else is trying to lock the same key, I would expect they would be able to run concurrently.
I could achieve this with a simple dictionary of mutexes, but I would need to worry about evicting old, unused locks and that could become a problem if the set grows too large.
Is there an existing implementation of this type of locking pattern?
You can try using a ConcurrentDictionary<string, object> to create named lock instances. When you need a lock you haven't used before, you add it to the dictionary (GetOrAdd makes the addition atomic), and from then on every thread that asks for that name gets the same object back.
For example:
// Create a global lock map for your lock instances.
public static ConcurrentDictionary<string, object> GlobalLockMap =
    new ConcurrentDictionary<string, object>();

// ...

var oLockInstance = GlobalLockMap.GetOrAdd("lock name", x => new object());

if (oLockInstance == null)
{
    // handle error
}

lock (oLockInstance)
{
    // do work
}
You can use the ConcurrentDictionary<string, object> to create and reuse different locks. If you want to remove locks from the dictionary (while still being able to reopen the same named resource later), you always have to check, inside the critical region, whether the previously acquired lock has been removed or replaced by another thread. And take care to remove the lock from the dictionary as the last step before leaving the critical region.
static ConcurrentDictionary<string, object> _lockDict =
    new ConcurrentDictionary<string, object>();

// VERSION 1: single-shot method
public void UseAndCloseSpecificResource(string resourceId)
{
    bool isSameLock;
    object lockObj, lockObjCheck;

    do
    {
        lock (lockObj = _lockDict.GetOrAdd(resourceId, new object()))
        {
            if (isSameLock = (_lockDict.TryGetValue(resourceId, out lockObjCheck) &&
                              object.ReferenceEquals(lockObj, lockObjCheck)))
            {
                try
                {
                    // ... open, use, and close resource identified by resourceId ...
                    // ...
                }
                finally
                {
                    // This must be the LAST statement
                    _lockDict.TryRemove(resourceId, out lockObjCheck);
                }
            }
        }
    }
    while (!isSameLock);
}

// VERSION 2: separated "use" and "close" methods
// (can coexist with version 1)
public void UseSpecificResource(string resourceId)
{
    bool isSameLock;
    object lockObj, lockObjCheck;

    do
    {
        lock (lockObj = _lockDict.GetOrAdd(resourceId, new object()))
        {
            if (isSameLock = (_lockDict.TryGetValue(resourceId, out lockObjCheck) &&
                              object.ReferenceEquals(lockObj, lockObjCheck)))
            {
                // ... open and use (or reuse) resource identified by resourceId ...
            }
        }
    }
    while (!isSameLock);
}

public bool TryCloseSpecificResource(string resourceId)
{
    bool result = false;
    object lockObj, lockObjCheck;

    if (_lockDict.TryGetValue(resourceId, out lockObj))
    {
        lock (lockObj)
        {
            if (result = (_lockDict.TryGetValue(resourceId, out lockObjCheck) &&
                          object.ReferenceEquals(lockObj, lockObjCheck)))
            {
                try
                {
                    // ... close resource identified by resourceId ...
                    // ...
                }
                finally
                {
                    // This must be the LAST statement
                    _lockDict.TryRemove(resourceId, out lockObjCheck);
                }
            }
        }
    }

    return result;
}
The lock keyword (MSDN) already does this.
When you lock, you pass the object to lock on:
lock (myLockObject)
{
}
This uses the Monitor class with the specific object to synchronize any threads using lock on the same object.
Since string literals are "interned" – that is, they are cached for reuse so that every literal with the same value is in fact the same object – you can also do this for strings:
lock ("TestString")
{
}
Since you aren't dealing with string literals you could intern the strings you read as described in: C#: Strings with same contents.
It would even work if the reference used was copied (directly or indirectly) from an interned string (literal or explicitly interned). But I wouldn't recommend it. This is very fragile and can lead to hard-to-debug problems, due to the ease with which new instances of a string having the same value as an interned string can be created.
A lock will only block if something else has entered the locked section on the same object. Thus, no need to keep a dictionary around, just the applicable lock objects.
Realistically though, you'll need to maintain a ConcurrentDictionary or similar to allow your objects to access the appropriate lock object.
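As a hedged sketch of that (using the AcquireLock/ReleaseLock names from the question; eviction of stale entries is deliberately left unsolved here, which is exactly the concern the question raises):

private static readonly ConcurrentDictionary<string, object> _keyedLocks =
    new ConcurrentDictionary<string, object>();

public static void AcquireLock(string key)
{
    // GetOrAdd is atomic: every caller asking for the same key gets the same object.
    Monitor.Enter(_keyedLocks.GetOrAdd(key, _ => new object()));
}

public static void ReleaseLock(string key)
{
    object lockObj;
    if (_keyedLocks.TryGetValue(key, out lockObj))
        Monitor.Exit(lockObj); // must run on the same thread that called AcquireLock
}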
This is more of a design question I guess than an actual bug or a rant. I wonder what people think about the following behavior:
In .NET, when you want to represent an empty IEnumerable efficiently, you can use Enumerable.Empty<MyType>(), which caches the empty enumerable instance. It's a nice, free micro-optimization that could help if relied upon heavily.
However, this is what the implementation looks like:
public static IEnumerable<TResult> Empty<TResult>()
{
    return EmptyEnumerable<TResult>.Instance;
}

internal class EmptyEnumerable<TElement>
{
    static volatile TElement[] instance;

    public static IEnumerable<TElement> Instance
    {
        get
        {
            if (instance == null) instance = new TElement[0];
            return instance;
        }
    }
}
I would expect the assignment to happen within a lock, after another null check, but that's not what happens.
I wonder if this is a conscious decision (i.e. we don't care about potentially creating several objects that we'll just throw away immediately if this is accessed concurrently, because we'd rather avoid locking) or just ignorance?
What would you do?
This is safe because volatile orders all reads and writes of that field. By the time return instance; reads the field, at least one write has already set it to a valid value.
It is unclear which value is going to be returned, because multiple arrays can potentially be created here. But there will always be a non-null array.
Why did they do it? Well, a lock has more overhead than volatile, and the implementation is easy enough to pull off. Those extra instances will only be created a few times, if multiple threads happen to race into this method; each racing thread creates at most one instance. After initialization is complete there is zero garbage.
Note that without volatile the instance field could appear to flip back to null after having been assigned. That is very counterintuitive. Without any synchronization the compiler would be allowed to rewrite the code like this:
var instanceRead1 = instance;
var returnValue = instanceRead1;
if (instanceRead1 == null)
{
    returnValue = new TElement[0];
    instance = returnValue;
}
// Illustrative: a second read of the field need not see the value just written.
var instanceRead2 = instance;
if (instanceRead2 == returnValue) return instanceRead2;
else return null;
In the presence of concurrent writes instanceRead2 can be a different value than was just written. No compiler would do such a rewrite but it is legal. The CPU might do something like that on some architectures. Unlikely, but legal. Maybe there is a more plausible rewrite.
In that code there is the possibility of creating more than one array: a thread can create an array and then end up actually using one created by another thread, or two different threads can each end up with their own array. However, that just doesn't matter here. The code works correctly whether multiple objects are created or not; as long as an array is returned, it doesn't matter which array any given call returns. Additionally, the "expense" of creating an empty array is simply not very high. The decision was made (likely after a fair bit of testing) that the expense of synchronizing access to the field on every single access was greater than the cost of the very unlikely possibility that a couple of extra empty arrays get created.
This is not a pattern that you should emulate in your own (quasi) singletons unless you are also in a position where creating a new instance is cheap and creating multiple instances doesn't affect the functionality of the code. In effect, the only situation in which this works is when you're caching the value of a cheaply computed operation. It's a micro-optimization; it's not wrong, but it's not a big win either.
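If you do want to guarantee that only one instance ever escapes, without taking a lock, Interlocked.CompareExchange does that. This is a sketch of the alternative, not the BCL's actual code:

internal class EmptyEnumerable<TElement>
{
    private static TElement[] instance;

    public static IEnumerable<TElement> Instance
    {
        get
        {
            if (instance == null)
            {
                // Publish a new array only if the field is still null;
                // losers of the race discard theirs and use the winner's.
                Interlocked.CompareExchange(ref instance, new TElement[0], null);
            }
            return instance;
        }
    }
}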
Although benchmarks on code this small don't really yield the most trustworthy results, here are a few options compared (very bluntly). Results are in seconds for 1 billion iterations:
1. The current implementation (volatile instance, null check, no lock): 21.7
2. A lock on a static object syncRoot: 28.8
3. A static type initializer: 20.3
4. A lock on typeof(T) (so that no static type initializer is created): 29.3
As you can see, the lock-based approaches are the worst by far. The best is the static type initializer, which would also make the code cleaner. The real reason is probably not the lock itself but rather the size of the getter, and things like code inlining giving the compiler more opportunities to optimize.
The speed of creating 1 million (not billion this time) empty arrays for the same machine is 26ms.
The code:
using System;

namespace ConsoleSandbox
{
    class T1<T>
    {
        static volatile T[] _instance;

        public static T[] Instance
        {
            get
            {
                if (_instance == null) _instance = new T[0];
                return _instance;
            }
        }
    }

    class T2<T>
    {
        static T[] _instance;
        static object _syncRoot = new object();

        public static T[] Instance
        {
            get
            {
                if (_instance == null)
                    lock (_syncRoot)
                        if (_instance == null)
                            _instance = new T[0];
                return _instance;
            }
        }
    }

    class T3<T>
    {
        static T[] _instance = new T[0];

        public static T[] Instance
        {
            get { return _instance; }
        }
    }

    class T4<T>
    {
        static T[] _instance;

        public static T[] Instance
        {
            get
            {
                if (_instance == null)
                    lock (typeof(T4<T>))
                        if (_instance == null)
                            _instance = new T[0];
                return _instance;
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            int[][] res = new int[2][];

            var sw = new System.Diagnostics.Stopwatch();

            sw.Start();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T1<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T2<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T3<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T4<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000; i++)
                res[i % 2] = new int[0];
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            Console.WriteLine(res[0]);
            Console.WriteLine(res[1]);
        }
    }
}
There are a great number of articles available regarding thread-safe caching; here's an example:
private static object _lock = new object();

public void CacheData()
{
    SPListItemCollection oListItems;

    oListItems = (SPListItemCollection)Cache["ListItemCacheName"];
    if (oListItems == null)
    {
        lock (_lock)
        {
            // Ensure that the data was not loaded by a concurrent thread
            // while waiting for the lock.
            oListItems = (SPListItemCollection)Cache["ListItemCacheName"];
            if (oListItems == null)
            {
                oListItems = DoQueryToReturnItems();
                Cache.Add("ListItemCacheName", oListItems, ..);
            }
        }
    }
}
However, this example depends on the request for the cache also rebuilding the cache.
I'm looking for a solution where the request and rebuild are separate. Here's the scenario.
I have a web service that I want to monitor for certain types of error. If an error occurs, I create a monitor object and cache it; the object is updatable and is locked accordingly during updates. All's well so far.
Elsewhere, I check for the existence of the cached object, and the data it contains. This would work straight out of the box except for one particular scenario.
If the cached object is being updated (say, a status change), I would like to wait and get the latest info rather than the current info, which, if returned, would be out of date. So my fetch code needs to check whether the object is currently being created or updated, and if so wait, then retry.
As I pointed out, there are many examples of cache-locking patterns, but I can't seem to find one that fits this scenario. Any ideas as to how to go about this would be appreciated.
You can try the following code, which uses two locks. The write lock in the setter is quite simple and protects the cache from being written by more than one thread. The getter uses a standard double-check lock.
Now, the trick is in the Refresh() method, which uses the same lock as the getter. As its first step it removes the list from the cache, which causes any getter to fail the first null check and wait for the lock. In the meantime the method gets the items, sets the cache again, and releases the lock.
When control comes back to a getter, it reads the cache again, and now it contains the list.
public class CacheData
{
    private static object _readLock = new object();
    private static object _writeLock = new object();

    public SPListItemCollection ListItem
    {
        get
        {
            var oListItems = (SPListItemCollection)Cache["ListItemCacheName"];
            if (oListItems == null)
            {
                lock (_readLock)
                {
                    oListItems = (SPListItemCollection)Cache["ListItemCacheName"];
                    if (oListItems == null)
                    {
                        oListItems = DoQueryToReturnItems();
                        Cache.Add("ListItemCacheName", oListItems, ..);
                    }
                }
            }
            return oListItems;
        }
        set
        {
            lock (_writeLock)
            {
                Cache.Add("ListItemCacheName", value, ..);
            }
        }
    }

    public void Refresh()
    {
        lock (_readLock)
        {
            Cache.Remove("ListItemCacheName");
            var oListItems = DoQueryToReturnItems();
            ListItem = oListItems;
        }
    }
}
You can make the method and property static if you do not need a CacheData instance.
I have a number of static Lists in my application, which store data from my database and are used when looking up information:
public static IList<string> Names;
I also have some methods to refresh this data from the database:
public static void GetNames()
{
    SQLEngine sql = new SQLEngine(ConnectionString);
    lock (Names)
    {
        Names = sql.GetDataTable("SELECT * FROM Names").ToList<string>();
    }
}
I initially didn't have the lock() in place; however, I noticed that very occasionally the requesting thread couldn't find the information in the list. Now I am assuming that, with the lock, if the requesting thread tries to access the Names list it can't until the list has been fully updated.
Is this the correct methodology and usage of the lock() statement?
As a side note, I noticed on MSDN that one shouldn't use lock() on public variables. Could someone please elaborate for my particular scenario?
lock is only useful if every place that needs to be synchronized also takes the lock. So every time you access Names you would be required to lock. At the moment, that lock only stops two threads swapping Names at the same time, which frankly isn't a problem here anyway, as reference swaps are atomic.
Another problem: presumably Names starts off null? You can't lock on null, and equally you shouldn't lock on something whose reference may change. If you want to synchronize, a common approach is something like:
// do not use for your scenario - see below
private static readonly object lockObj = new object();
then lock(lockObj) instead of your data.
With regards to not locking things that are visible externally; yes. That is because some other code could randomly choose to lock on it, which could cause unexpected blocking, and quite possibly deadlocks.
The other big risk is that some of your code obtains the names, and then does a sort/add/remove/clear/etc - anything that mutates the data. Personally, I would be using a read-only list here. In fact, with a read-only list, all you have is a reference swap; since that is atomic, you don't need any locking:
public static IList<string> Names { get; private set; }

public static void UpdateNames()
{
    List<string> tmp = SomeSqlQuery();
    Names = tmp.AsReadOnly();
}
And finally: public fields are very, very rarely a good idea. Hence the property above. This will be inlined by the JIT, so there is no penalty.
No, it's not correct since anyone can use the Names property directly.
public class SomeClass
{
    private List<string> _names;
    private object _namesLock = new object();

    public IEnumerable<string> Names
    {
        get
        {
            if (_names == null)
            {
                lock (_namesLock)
                {
                    if (_names == null)
                        _names = GetNames();
                }
            }
            return _names;
        }
    }

    public void UpdateNames()
    {
        lock (_namesLock)
            _names = GetNames();
    }

    private List<string> GetNames()
    {
        SQLEngine sql = new SQLEngine(ConnectionString);
        return sql.GetDataTable("SELECT * FROM Names").ToList<string>();
    }
}
Try to avoid static methods. At least use a singleton.
The check-lock-check is faster than lock-check because the write happens only once, so after initialization most reads never take the lock at all.
Loading a value the first time it is used is called lazy loading.
The _namesLock is required since you can't lock on null.
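For what it's worth, Lazy<T> packages the check-lock-check dance for you. A sketch, assuming GetNames can be rewritten to return the list (note that a Lazy<T> cannot be refreshed later, which matters for the original question):

private static readonly Lazy<List<string>> _names =
    new Lazy<List<string>>(LoadNames, LazyThreadSafetyMode.ExecutionAndPublication);

public static IList<string> Names
{
    // The first caller runs LoadNames; concurrent callers block, then reuse the result.
    get { return _names.Value; }
}

private static List<string> LoadNames()
{
    SQLEngine sql = new SQLEngine(ConnectionString);
    return sql.GetDataTable("SELECT * FROM Names").ToList<string>();
}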
From the code you have shown, the first time GetNames() is called the Names property is null. A lock on a null reference throws an ArgumentNullException, so I would add a variable to lock on.
static object namesLock = new object();
Then in GetNames()
lock (namesLock)
{
    if (Names == null)
        Names = ...;
}
We do the if test inside of the lock() to stop race conditions. I'm assuming that the caller of GetNames() also does the same test.
While I was looking at some legacy application code, I noticed it is using a string object for thread synchronization. I'm trying to resolve some thread-contention issues in this program and was wondering if this could lead to some strange situations. Any thoughts?
private static string mutex = "ABC";

internal static void Foo(Rpc rpc)
{
    lock (mutex)
    {
        // do something
    }
}
Strings like that (from the code) could be "interned". This means all instances of "ABC" point to the same object. Even across AppDomains you can point to the same object (thx Steven for the tip).
If you have a lot of string-mutexes, from different locations, but with the same text, they could all lock on the same object.
The intern pool conserves string storage. If you assign a literal string constant to several variables, each variable is set to reference the same constant in the intern pool instead of referencing several different instances of String that have identical values.
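A quick check makes the interning visible (illustrative only):

string a = "ABC";
string b = "ABC";
Console.WriteLine(object.ReferenceEquals(a, b)); // True: both literals share one interned instance

string c = new string(new[] { 'A', 'B', 'C' });   // built at runtime: a distinct instance
Console.WriteLine(object.ReferenceEquals(a, c));                // False
Console.WriteLine(object.ReferenceEquals(a, string.Intern(c))); // True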
It's better to use:
private static readonly object mutex = new object();
Also, since your string is not const or readonly, you can change it. So (in theory) it is possible for one thread to lock on mutex and then change mutex to point at another object, after which a second thread can enter the critical section because its lock uses the new reference. Example:
private static string mutex = "1";
private static string mutex2 = "1"; // for 'lock' mutex2 and mutex are the same
private static void CriticalButFlawedMethod() {
lock(mutex) {
mutex += "."; // Hey, now mutex points to another reference/object
// You are free to re-enter
...
}
}
To answer your question (as some others already have), there are some potential problems with the code example you provided:
private static string mutex = "ABC";
The variable mutex is not immutable.
The string literal "ABC" will refer to the same interned object reference everywhere in your application.
In general, I would advise against locking on strings. However, there is a case I've run into where it is useful to do this.
There have been occasions where I have maintained a dictionary of lock objects where the key is something unique about some data that I have. Here's a contrived example:
void Main()
{
    var a = new SomeEntity { Id = 1 };
    var b = new SomeEntity { Id = 2 };

    Task.Run(() => DoSomething(a));
    Task.Run(() => DoSomething(a));
    Task.Run(() => DoSomething(b));
    Task.Run(() => DoSomething(b));
}

ConcurrentDictionary<int, object> _locks = new ConcurrentDictionary<int, object>();

void DoSomething(SomeEntity entity)
{
    var mutex = _locks.GetOrAdd(entity.Id, id => new object());

    lock (mutex)
    {
        Console.WriteLine("Inside {0}", entity.Id);
        // do some work
    }
}
The goal of code like this is to serialize concurrent invocations of DoSomething() within the context of the entity's Id. The downside is the dictionary. The more entities there are, the larger it gets. It's also just more code to read and think about.
I think .NET's string interning can simplify things:
void Main()
{
    var a = new SomeEntity { Id = 1 };
    var b = new SomeEntity { Id = 2 };

    Task.Run(() => DoSomething(a));
    Task.Run(() => DoSomething(a));
    Task.Run(() => DoSomething(b));
    Task.Run(() => DoSomething(b));
}

void DoSomething(SomeEntity entity)
{
    lock (string.Intern("dee9e550-50b5-41ae-af70-f03797ff2a5d:" + entity.Id))
    {
        Console.WriteLine("Inside {0}", entity.Id);
        // do some work
    }
}
The difference here is that I am relying on the string interning to give me the same object reference per entity id. This simplifies my code because I don't have to maintain the dictionary of mutex instances.
Notice the hard-coded UUID string that I'm using as a namespace. This matters if I adopt the same approach of locking on strings in another area of the application: namespacing the keys keeps unrelated locks from colliding on the same interned string.
Locking on strings can be a good idea or a bad idea depending on the circumstances and the attention that the developer gives to the details.
If you need to lock a string, you can create an object that pairs the string with an object that you can lock with.
class LockableString
{
    public string _String;
    public object MyLock; // a companion object to lock on instead of the string

    public LockableString()
    {
        MyLock = new object();
    }
}
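Usage might look like this (hedged; the public field names come from the class above):

var entry = new LockableString { _String = "ABC" };

lock (entry.MyLock) // lock the companion object, never the string itself
{
    entry._String += "D";
}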
My 2 cents:
ConcurrentDictionary is 1.5X faster than interned strings. I did a benchmark once.
To solve the "ever-growing dictionary" problem you can use a dictionary of semaphores instead of a dictionary of objects, i.e. ConcurrentDictionary<string, SemaphoreSlim> instead of <string, object>. Unlike lock statements, semaphores can track how many threads are waiting on them, and once all the locks are released you can remove the entry from the dictionary. See this question for solutions like that: Asynchronous locking based on a key.
Semaphores are even better because you can control the concurrency level: instead of "limit to one concurrent run" you can "limit to 5 concurrent runs". Awesome free bonus, isn't it? I had to code an email service that needed to limit the number of concurrent connections to a server; this came in very, very handy.
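A hedged sketch of that semaphore-based approach (names are mine; safe eviction is the subtle part and is only hinted at here, see the linked question for complete solutions):

private static readonly ConcurrentDictionary<string, SemaphoreSlim> _semaphores =
    new ConcurrentDictionary<string, SemaphoreSlim>();

public static async Task RunExclusiveAsync(string key, Func<Task> work)
{
    // initialCount 1 gives mutual exclusion; raise it to allow N concurrent runs.
    var semaphore = _semaphores.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));

    await semaphore.WaitAsync();
    try
    {
        await work();
    }
    finally
    {
        semaphore.Release();
        // Eviction is only safe when no other caller holds or is about to
        // wait on this semaphore (see the linked question for details).
    }
}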
I imagine that locking on interned strings could lead to memory bloat if the strings generated are many and all unique. Another approach that should be more memory efficient, and that solves the immediate deadlock issue, is:
// Returns an object to lock on, based on a string value.
private static readonly ConditionalWeakTable<string, object> _weakTable =
    new ConditionalWeakTable<string, object>();

public static object GetLock(string value)
{
    if (value == null) throw new ArgumentNullException(nameof(value));

    // Caveat: ConditionalWeakTable compares keys by reference, not by value,
    // so callers must pass the same string instance (e.g. an interned string)
    // to get the same lock object back.
    return _weakTable.GetOrCreateValue(value.ToLower());
}