The famous double-checked locking technique in C#

I saw this recommendation for a singleton in an MS Press book (partial code attached):
public static Singleton GetSingleton()
{
    if (s_value != null)
        return s_value;

    Monitor.Enter(s_lock);
    if (s_value == null)
    {
        Singleton temp = new Singleton();
        Interlocked.Exchange(ref s_value, temp);
    }
    Monitor.Exit(s_lock);
    return s_value;
}
Two lines of code are added in the second if block instead of just writing:
s_value = new Singleton();
This is meant to handle the situation where a second thread enters the method and finds s_value != null while the object it references has not yet been fully initialized.
My question is: can we just write this in the second if block instead:
{
    Singleton temp = new Singleton();
    s_value = temp; // instead of Interlocked.Exchange(ref s_value, temp);
}
So now the function is:
public static Singleton GetSingleton()
{
    if (s_value != null)
        return s_value;

    Monitor.Enter(s_lock);
    if (s_value == null)
    {
        Singleton temp = new Singleton();
        s_value = temp;
    }
    Monitor.Exit(s_lock);
    return s_value;
}
I guess not, because the book doesn't do it that way - but why not?
Does anyone have any suggestions? Is it possible that s_value could end up referencing an object that is not fully initialized? My understanding is that s_value is only assigned after temp has been fully constructed (I may be wrong). If I am wrong, can you point me to an example showing where this fails? Could the compiler produce different code for the two versions?

I'll post this not as a real answer but as an aside: if you're using .NET 4 you really should consider the Lazy<T> singleton pattern:
http://geekswithblogs.net/BlackRabbitCoder/archive/2010/05/19/c-system.lazylttgt-and-the-singleton-design-pattern.aspx
public class LazySingleton3
{
    // static holder for the instance; a lambda is needed to construct it,
    // since the constructor is private
    private static readonly Lazy<LazySingleton3> _instance
        = new Lazy<LazySingleton3>(() => new LazySingleton3());

    // private to prevent direct instantiation
    private LazySingleton3()
    {
    }

    // accessor for the instance
    public static LazySingleton3 Instance
    {
        get
        {
            return _instance.Value;
        }
    }
}
Thread-safe, easy to read, and obvious: what's not to like?
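For what it's worth, a quick usage sketch of the class above - the first access to Instance runs the lambda and constructs the singleton; every later access returns the same object:
// First access constructs the instance; later accesses return the cached object.
LazySingleton3 a = LazySingleton3.Instance;
LazySingleton3 b = LazySingleton3.Instance;
Console.WriteLine(object.ReferenceEquals(a, b)); // True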

While the assignment is atomic, it is also required that the memory is "synchronized" across the different cores/CPUs (full fence), or another core concurrently reading the value might get an outdated value (outside of the synchronization block). The Interlocked class does this for all its operations.
http://www.albahari.com/threading/part4.aspx
Edit: Wikipedia has some useful information about the issues with the double-locking pattern and where/when to use memory barriers for it to work.
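As a minimal sketch of what that means for the code above (assuming .NET 4.5's System.Threading.Volatile; declaring s_value as volatile achieves the same effect), the store must carry release semantics so the constructor's writes cannot be reordered past the publication of the reference:
using System.Threading;

public sealed class Singleton
{
    private static Singleton s_value;
    private static readonly object s_lock = new object();

    private Singleton() { }

    public static Singleton GetSingleton()
    {
        // Volatile.Read gives acquire semantics on the fast path.
        Singleton value = Volatile.Read(ref s_value);
        if (value != null)
            return value;

        lock (s_lock)
        {
            if (s_value == null)
            {
                Singleton temp = new Singleton();
                // Release semantics: all of the constructor's writes become
                // visible before any thread can observe the non-null reference.
                Volatile.Write(ref s_value, temp);
            }
            return s_value;
        }
    }
}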

Related

Inform the compiler that a variable might be updated from another thread

This would generally be done using volatile. But in the case of a long or double that's impossible.
Perhaps just making it public is enough, and the compiler then knows that this can be used by another assembly and won't "optimize it out"? Can this be relied upon? Some other way?
To be clear, I'm not worried about concurrent reading/writing of the variable. Only one thing - that it doesn't get optimized out. (Like in https://stackoverflow.com/a/1284007/939213 .)
The best way to prevent code removal is to actually use the value.
If you are worried about the compiler optimizing away the while loop in your example:
using System;
using System.Threading;

class Test
{
    long foo;

    static void Main()
    {
        var test = new Test();
        new Thread(delegate() { Thread.Sleep(500); test.foo = 255; }).Start();
        while (test.foo != 255) ;
        Console.WriteLine("OK");
    }
}
you could still use volatile to do this by modifying your while loop:
volatile int temp;

// code skipped in this sample
while (test.foo != 255) { temp = (int)test.foo; }
Now, assuming you are SURE you won't have any thread-safety issues: you are using your long foo, so it won't be optimized away, and you don't care about losing part of the long, since you are just trying to keep it alive.
Make sure you mark your code very clearly if you do something like this. Possibly write a VolatileLong class that wraps your long (and your volatile int) so other people understand what you are doing; such a wrapper is sketched below.
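A minimal sketch of such a wrapper (the name VolatileLong is hypothetical). This version uses Interlocked rather than the volatile-int trick, since Interlocked.Read and Interlocked.Exchange give atomic, fully fenced 64-bit access even on 32-bit platforms:
using System.Threading;

// Hypothetical wrapper giving volatile-like semantics for a 64-bit value.
// volatile cannot be applied to long, but Interlocked can.
class VolatileLong
{
    private long _value;

    public long Value
    {
        get { return Interlocked.Read(ref _value); }      // atomic read, full fence
        set { Interlocked.Exchange(ref _value, value); }  // atomic write, full fence
    }
}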
Other thread-safety tools, such as locks, will also prevent code removal. For example, the compiler is smart enough not to remove the double if in the singleton pattern:
if (_instance == null) {
    lock (_lock) {
        if (_instance == null) {
            _instance = new Singleton();
        }
    }
}
return _instance;

EmptyEnumerable<T>.Instance assignment and multi-threading design

This is more of a design question I guess than an actual bug or a rant. I wonder what people think about the following behavior:
In .NET, when you want to represent an empty IEnumerable efficiently, you can use Enumerable.Empty<MyType>(); this caches the empty enumerable instance. It's a nice, free micro-optimization, I guess, that could help if relied upon heavily.
However, this is what the implementation looks like:
public static IEnumerable<TResult> Empty<TResult>() {
    return EmptyEnumerable<TResult>.Instance;
}

internal class EmptyEnumerable<TElement>
{
    static volatile TElement[] instance;

    public static IEnumerable<TElement> Instance {
        get {
            if (instance == null) instance = new TElement[0];
            return instance;
        }
    }
}
I would expect the assignment to happen within a lock, after another null check, but that's not what happens.
I wonder if this is a conscious decision (i.e. we don't care of potentially creating several objects we will just throw away immediately if this is accessed concurrently, because we would rather avoid locking) or just ignorance?
What would you do?
This is safe because volatile sequences all reads and writes to that field. Before the read in return instance; there is always at least one write setting that field to a valid value.
It is unclear what value is going to be returned because multiple arrays can potentially be created here. But there will always be a non-null array.
Why did they do it? Well, a lock has more overhead than volatile and the implementation is easy enough to pull off. Those extra instances will only be created a few times if multiple threads happen to race to this method. For each thread racing at most one instance will be created. After initialization is complete there is zero garbage.
Note that without volatile, a read of the instance field can appear to flip back to null after it has been assigned. That is very counterintuitive. Without any synchronization, the compiler is allowed to rewrite the code like this:
var instanceRead1 = instance;
TElement[] returnValue = null;
if (instanceRead1 == null) {
    returnValue = new TElement[0];
    instance = returnValue;
}
var instanceRead2 = instance;
if (instanceRead2 == returnValue) return instanceRead2;
else return null;
In the presence of concurrent writes instanceRead2 can be a different value than was just written. No compiler would do such a rewrite but it is legal. The CPU might do something like that on some architectures. Unlikely, but legal. Maybe there is a more plausible rewrite.
In that code there is the possibility of creating more than one array. It's possible to either have a thread create an array and then end up actually using the one created from another thread, or for two different threads to each end up with their own array. However, that just doesn't matter here. The code will work correctly whether multiple objects are created or not. As long as an array is returned it doesn't matter which array is returned by any call ever. Additionally the "expense" of creating an empty array is simply not very high. The decision was made (likely after a fair bit of testing) that the expense of synchronizing access to the field every time the field is accessed ever was greater than the very unlikely possibility that a couple of additional empty arrays were created.
This is not a pattern that you should emulate in your own (quasi) singletons unless you are also in the position in which creating a new instance is cheap, and creating multiple instances doesn't affect the functionality of the code. In effect the only situation in which this works is when you're trying to cache the value of a cheaply computed operation. That's a micro optimization; it's not wrong, but it's also not a big win either.
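A sketch of the same idiom generalized (CheapCache is a hypothetical name, not a BCL type), to make explicit the conditions under which this lock-free pattern is acceptable:
// Benign-race caching: acceptable only when construction is cheap,
// side-effect free, and callers don't care which instance they receive.
internal static class CheapCache<T> where T : class, new()
{
    // volatile prevents a racing reader from observing a stale null
    // after another thread has published an instance.
    private static volatile T _instance;

    public static T Instance
    {
        get
        {
            if (_instance == null) _instance = new T(); // racers may each construct one
            return _instance;                           // always non-null here
        }
    }
}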
Although benchmarks of such small code do not really yield the most trustworthy results, here are a few options compared (very bluntly):
The current implementation: volatile instance and a null check, without a lock.
A lock on a static syncRoot object.
A static type initializer.
A lock on typeof(T) (so that no static type initializer is created).
Results (seconds for 1 billion iterations):
1. volatile, no lock: 21.7
2. lock on syncRoot: 28.8
3. static type initializer: 20.3
4. lock on typeof(T): 29.3
As you can see, the lock approaches are by far the worst. The best is the static type initializer, which would also make the code cleaner. The real cause is probably not the lock itself but the size of the getter: a larger getter hurts inlining and leaves the compiler fewer opportunities to optimize the code.
The speed of creating 1 million (not billion this time) empty arrays for the same machine is 26ms.
The code:
using System;

namespace ConsoleSandbox
{
    class T1<T>
    {
        static volatile T[] _instance;

        public static T[] Instance
        {
            get
            {
                if (_instance == null) _instance = new T[0];
                return _instance;
            }
        }
    }

    class T2<T>
    {
        static T[] _instance;
        static object _syncRoot = new object();

        public static T[] Instance
        {
            get
            {
                if (_instance == null)
                    lock (_syncRoot)
                        if (_instance == null)
                            _instance = new T[0];
                return _instance;
            }
        }
    }

    class T3<T>
    {
        static T[] _instance = new T[0];

        public static T[] Instance
        {
            get
            {
                return _instance;
            }
        }
    }

    class T4<T>
    {
        static T[] _instance;

        public static T[] Instance
        {
            get
            {
                if (_instance == null)
                    lock (typeof(T4<T>))
                        if (_instance == null)
                            _instance = new T[0];
                return _instance;
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            int[][] res = new int[2][];
            var sw = new System.Diagnostics.Stopwatch();

            sw.Start();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T1<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T2<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T3<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000000; i++)
                res[i % 2] = T4<int>.Instance;
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            sw.Restart();
            for (var i = 0; i < 1000000; i++)
                res[i % 2] = new int[0];
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);

            Console.WriteLine(res[0]);
            Console.WriteLine(res[1]);
        }
    }
}

C# usage of lock() statements and caching of data

I have a number of static lists in my application, which store data from my database and are used when looking up information:
public static IList<string> Names;
I also have some methods to refresh this data from the database:
public static void GetNames()
{
    SQLEngine sql = new SQLEngine(ConnectionString);
    lock (Names)
    {
        Names = sql.GetDataTable("SELECT * FROM Names").ToList<string>();
    }
}
I initially didn't have the lock() in place; however, I noticed that very occasionally the requesting thread couldn't find the information in the list. My assumption is that if the requesting thread tries to access the Names list, it now can't do so until the list has been fully updated.
Is this the correct methodology and usage of the lock() statement?
As a sidenote, I noticed on MSDN that one shouldn't use lock() on public variables. Could someone please elaborate for my particular scenario?
lock is only useful if all the places that need to be synchronized also take the lock, so every time you access Names you would be required to lock. At the moment, your lock only stops two threads from swapping Names at the same time, which frankly isn't a problem here, as reference swaps are atomic anyway.
Another problem: presumably Names starts off null? You can't lock on null. Equally, you shouldn't lock on something that may change reference. If you want to synchronize, a common approach is something like:
// do not use for your scenario - see below
private static readonly object lockObj = new object();
then lock(lockObj) instead of your data.
With regards to not locking things that are visible externally; yes. That is because some other code could randomly choose to lock on it, which could cause unexpected blocking, and quite possibly deadlocks.
The other big risk is that some of your code obtains the names, and then does a sort/add/remove/clear/etc - anything that mutates the data. Personally, I would be using a read-only list here. In fact, with a read-only list, all you have is a reference swap; since that is atomic, you don't need any locking:
public static IList<string> Names { get; private set; }

public static void UpdateNames() {
    List<string> tmp = SomeSqlQuery();
    Names = tmp.AsReadOnly();
}
And finally: public fields are very very rarely a good idea. Hence the property above. This will be inlined by the JIT, so it is not a penalty.
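To illustrate the reference-swap point, here is a reader sketch (assuming the Names property above): capture the reference once and iterate that snapshot, so a concurrent UpdateNames cannot change the list mid-loop.
// Capture the current reference once; UpdateNames swaps in a new list
// but never mutates the one we are iterating.
IList<string> snapshot = Names;
foreach (string name in snapshot)
{
    Console.WriteLine(name);
}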
No, it's not correct since anyone can use the Names property directly.
public class SomeClass
{
    private List<string> _names;
    private object _namesLock = new object();

    public IEnumerable<string> Names
    {
        get
        {
            if (_names == null)
            {
                lock (_namesLock)
                {
                    if (_names == null)
                        GetNames(); // GetNames assigns _names itself
                }
            }
            return _names;
        }
    }

    public void UpdateNames()
    {
        lock (_namesLock)
            GetNames();
    }

    private void GetNames()
    {
        SQLEngine sql = new SQLEngine(ConnectionString);
        _names = sql.GetDataTable("SELECT * FROM Names").ToList<string>();
    }
}
Try to avoid static methods; at least use a singleton.
The check, lock, check pattern is faster than lock, check, since the lock is only needed around the one-time write.
Assigning a field on first usage like this is called lazy loading.
The _namesLock is required since you can't lock on null.
From the code you have shown, the first time GetNames() is called the Names property is null. I don't know what a lock on a null object would do. I would add a dedicated variable to lock on.
static object namesLock = new object();
Then in GetNames()
lock (namesLock)
{
    if (Names == null)
        Names = ...;
}
We do the if test inside of the lock() to stop race conditions. I'm assuming that the caller of GetNames() also does the same test.

Can(should?) Lazy<T> be used as a caching technique?

I'd like to use .NET's Lazy<T> class to implement thread-safe caching. Suppose we had the following setup:
class Foo
{
    Lazy<string> cachedAttribute;

    Foo()
    {
        invalidateCache();
    }

    string initCache()
    {
        string returnVal = "";
        // CALCULATE RETURNVAL HERE
        return returnVal;
    }

    public String CachedAttr
    {
        get
        {
            return cachedAttribute.Value;
        }
    }

    void invalidateCache()
    {
        cachedAttribute = new Lazy<string>(initCache, true);
    }
}
My questions are:
Would this work at all?
How would the locking have to work?
I feel like I'm missing a lock somewhere near the invalidateCache, but for the life of me I can't figure out what it is.
I'm sure there's a problem with this somewhere, I just haven't figured out where.
[EDIT]
Ok, well it looks like I was right: there were things I hadn't thought about. If a thread sees an outdated cache, that would be a very bad thing, so it looks like Lazy is not safe enough. The property is accessed a lot, though, so I was engaging in premature optimization in hopes that I could learn something and have a pattern to use in the future for thread-safe caching. I'll keep working on it.
P.S.: I decided to make the object thread-un-safe and have access to the object be carefully controlled instead.
Well, it's not thread-safe in that one thread could still see the old value after another thread sees the new value after invalidation - because the first thread could have not seen the change to cachedAttribute. In theory, that situation could perpetuate forever, although it's pretty unlikely :)
Using Lazy<T> as a cache of unchanging values seems like a better idea to me - more in line with how it was intended - but if you can cope with the possibility of using an old "invalidated" value for an arbitrarily long period in another thread, I think this would be okay.
cachedAttribute is a shared resource that needs to be protected from concurrent modification.
Protect it with a lock:
private readonly object gate = new object();

public string CachedAttr
{
    get
    {
        Lazy<string> lazy;
        lock (gate)                      // 1. Lock
        {
            lazy = this.cachedAttribute; // 2. Get current Lazy<string>
        }                                // 3. Unlock
        return lazy.Value;               // 4. Get value of the Lazy<string>
                                         //    outside the lock
    }
}

void InvalidateCache()
{
    lock (gate)                          // 1. Lock
    {                                    // 2. Assign new Lazy<string>
        cachedAttribute = new Lazy<string>(initCache, true);
    }                                    // 3. Unlock
}
or use Interlocked.Exchange:
void InvalidateCache()
{
    Interlocked.Exchange(ref cachedAttribute, new Lazy<string>(initCache, true));
}
volatile might work as well in this scenario, but it makes my head hurt.
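For completeness, a minimal sketch of that volatile variant (same Foo class assumed): marking the field volatile makes the reference swap in invalidateCache immediately visible to readers, with no explicit lock.
// volatile ensures readers always see the most recently published
// Lazy<string>; the Lazy itself handles one-time initialization.
private volatile Lazy<string> cachedAttribute;

public string CachedAttr
{
    get { return cachedAttribute.Value; }
}

void invalidateCache()
{
    cachedAttribute = new Lazy<string>(initCache, true);
}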

Threadsafe lazy loading when the loading could fail

I've spent about an hour searching for a consensus on something I'm trying to accomplish, but have yet to find anything conclusive in a particular direction.
My situation is as follows:
I have a multi-threaded application (.NET web service)
I have classes that use objects that take non-negligible time to load, so I would like to maintain them as static class members
The code that constructs these objects intermittently has a low chance of failure
I was previously using an approach that constructs these objects in a static constructor. The problem with this was that, as mentioned above, the constructor would occasionally fail, and once a .NET static constructor fails, the whole class is hosed until the process is restarted. There are no second chances with that approach.
The most intuitive-seeming approach after this was to use double-checked locking. There are a lot of pages around that talk about the evils of double-checked locking and say to use a static constructor, which I was already doing, but that doesn't seem to be an option for me, as the static constructor has the potential to fail and bring down the whole class.
The implementation (simplified, of course) I'm thinking of using is the following. All class and member names are purely demonstrative and not what I'm actually using. Is this approach going to be problematic? Can anyone suggest a better approach?
public class LazyMembers
{
    private static volatile XmlDocument s_doc;
    private static volatile XmlNamespaceManager s_nsmgr;
    private static readonly object s_lock = new object();

    private static void EnsureStaticMembers()
    {
        if (s_doc == null || s_nsmgr == null)
        {
            lock (s_lock)
            {
                if (s_doc == null || s_nsmgr == null)
                {
                    // The following method might fail
                    // with an exception, but if it succeeds,
                    // s_doc and s_nsmgr will be initialized
                    s_doc = LoadDoc(out s_nsmgr);
                }
            }
        }
    }

    public XmlNamespaceManager NamespaceManager
    {
        get
        {
            EnsureStaticMembers();
            return s_nsmgr;
        }
    }

    public XmlDocument GetDocClone()
    {
        EnsureStaticMembers();
        return (XmlDocument)s_doc.Clone();
    }
}
If you use .NET 4.0, you can use Lazy<T> with LazyThreadSafetyMode (which mode depends on whether you can tolerate a few instances of T being created in a multithreaded environment). In your case you want the Lazy<T>(Func<T>, LazyThreadSafetyMode) constructor - see MSDN.
Otherwise (if you use 3.5 or lower) you can use the CAS technique to create a single instance without locking.
Something like this:
get {
    if (_instance == null) {
        var singleton = new Singleton();
        if (Interlocked.CompareExchange(ref _instance, singleton, null) != null) {
            // We lost the race; dispose of the extra instance if necessary.
            if (singleton is IDisposable) ((IDisposable)singleton).Dispose();
        }
    }
    return _instance;
}
However, here you can only achieve LazyThreadSafetyMode.PublicationOnly behaviour - only one instance will ever be visible to other threads, but a few may be created.
Also, there should not be any problems with the double check for null in your code - it's safe in the .NET world (at least on x86 machines and the associated memory model). There were some problems in the Java world before 2004, AFAIK.
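A minimal sketch of the .NET 4.0 route for the failure-prone loader above (LoadDocOrThrow is a hypothetical stand-in for the loading code): with LazyThreadSafetyMode.PublicationOnly, an exception thrown by the factory is not cached, so the next access simply retries the load - unlike a failed static constructor, which poisons the class for good.
using System;
using System.Threading;
using System.Xml;

public static class LazyDocHolder
{
    // PublicationOnly: exceptions from the factory are never cached,
    // so a failed load can be retried on the next access.
    private static readonly Lazy<XmlDocument> s_doc =
        new Lazy<XmlDocument>(LoadDocOrThrow, LazyThreadSafetyMode.PublicationOnly);

    public static XmlDocument GetDocClone()
    {
        return (XmlDocument)s_doc.Value.Clone();
    }

    private static XmlDocument LoadDocOrThrow()
    {
        var doc = new XmlDocument();
        doc.Load("members.xml"); // may throw; safe to retry later
        return doc;
    }
}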
