Static initialization guarantees singleton thread safety? (C#) [duplicate] - c#

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Is the C# static constructor thread safe?
Jon Skeet's excellent article at http://csharpindepth.com/Articles/General/Singleton.aspx and other articles I've read make it clear that double-check locking doesn't work in either C# or Java unless one explicitly marks the instance as "volatile." If you don't, the check comparing it to null could return false even though the instance constructor hasn't finished running. In Mr. Skeet's third sample, he states this clearly: "The Java memory model doesn't ensure that the constructor completes before the reference to the new object is assigned to instance. The Java memory model underwent a reworking for version 1.5, but double-check locking is still broken after this without a volatile variable (as in C#)"
However, most everyone agrees (including Mr. Skeet, in samples four and five in his article), that the use of static initialization is a simple way to get a threadsafe singleton instance. He states that "static constructors in C# are specified to execute only when an instance of the class is created or a static member is referenced, and to execute only once per AppDomain."
That makes sense, but what seems to be missing is the guarantee that the reference to the new object is assigned only after the constructor completes - otherwise we'd get the same kind of issue that makes double-check locking fail unless you mark the instance as volatile. Is there a guarantee that, when using static initialization to call the instance constructor (as opposed to calling the instance constructor from a property's get{}, like we do with double-check locking), that the constructor will fully complete before any other thread can get a reference to the object?
Thanks!

that the constructor will fully complete before any other thread can get a reference to the object?
The static initializer will be invoked once only (by the system, at least) per AppDomain, and in a synchronized way, taking "beforefieldinit" into account. So assuming you don't do anything bizarre, any static fields assigned in the static initializer should be OK; any other attempts to use the static field should get held (blocked) behind the static constructor.
the reference to the new object is assigned only after the constructor completes
It happens when it happens. Any static field initializers happen before what you typically think of as the constructor, for example. But since other threads are blocked, this shouldn't be an issue.
However:
if your static initializer itself passes a reference outside (by calling a method with the reference as an argument, including "arg0"), then all bets are off
if you use reflection to invoke the static constructor (yes, you can do this), craziness often follows
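To make the guarantee concrete, here is a minimal sketch of a statically initialized singleton being read from several threads. The names Singleton, FullyConstructed, ConstructorCalls and SingletonDemo are hypothetical, chosen only for this illustration, and the Thread.Sleep merely simulates a slow constructor. Under the behaviour described above, every thread should block until the type initializer has finished and then see the same, completed instance.

```csharp
using System;
using System.Threading;

public sealed class Singleton
{
    // Counts constructor runs; hypothetical instrumentation for this demo only.
    public static int ConstructorCalls;

    // Set as the last step of the constructor.
    public readonly bool FullyConstructed;

    // Assigned by the type initializer, which the CLR runs once per AppDomain.
    private static readonly Singleton instance = new Singleton();

    // An explicit static constructor stops the C# compiler from marking the type beforefieldinit.
    static Singleton() { }

    private Singleton()
    {
        Interlocked.Increment(ref ConstructorCalls);
        Thread.Sleep(50); // simulate slow construction
        FullyConstructed = true;
    }

    public static Singleton Instance { get { return instance; } }
}

public static class SingletonDemo
{
    // Returns true if every thread saw the same, fully constructed instance.
    public static bool AllThreadsSawCompleteInstance(int threadCount)
    {
        var seen = new Singleton[threadCount];
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            int slot = i; // per-iteration copy for the closure
            threads[i] = new Thread(() => seen[slot] = Singleton.Instance);
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();

        foreach (var s in seen)
        {
            if (s == null || !s.FullyConstructed || !ReferenceEquals(s, seen[0])) return false;
        }
        return Singleton.ConstructorCalls == 1;
    }

    public static void Main()
    {
        Console.WriteLine(AllThreadsSawCompleteInstance(8));
    }
}
```

Note that the constructor does not hand "this" to any other code, which is exactly the "all bets are off" case listed above.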

Yes; the guarantee is in the statement that it will only execute once per AppDomain.
It could only be unsafe if it could execute more than once; as stated, it can't, so all is well :)

Related

Any guidance of deciding a particular object for locking [duplicate]

This question already has answers here:
C# lock statement, what object to lock on?
(4 answers)
Closed 8 years ago.
I have seen a few different ways of using a particular object for the lock construct.
Have a dedicated private static local variable and lock on that
private static object syncObj = new object();
...
lock(syncObj)
{
}
Have a dedicated private local variable and lock on that
private object syncObj = new object();
...
lock(syncObj)
{
}
Another one is to use the object itself.
private List<MyClass> SomeObjects = new List<MyClass>();
....
lock(SomeObjects)
{
}
One approach is to use the type of a particular object.
private List<MyClass> SomeObjects = new List<MyClass>();
...
lock(SomeObjects.GetType())
{
}
Another possibility is using lock(this), but the general recommendation seems to be to avoid this.
My questions.
1. Are there any other ways of using objects for locks?
2. How can I decide which approach should be used for a particular scenario?
If Marc Gravell's answer to C# lock statement, what object to lock on? were the top vote-getter (and it should have been, IMHO), I would happily have voted to close this question as a duplicate.
But, it's not. And this question does have some minor differences in context. So…
Taking your five examples in particular:
Have a dedicated private static local variable and lock on that
This works well for scenarios where you have some static member that needs to be synchronized. A common implementation convention in .NET is to make all static members thread-safe, and for types which are not themselves specifically targeting multi-threaded code, to not bother making instance members thread-safe.
Note that the declaration should be static readonly, to ensure that the locking object remains the same through the lifetime of the program.
Have a dedicated private local variable and lock on that
This is better for protecting instance members, in a class that is specifically supposed to be thread-safe. While a static lock object would also work, that can be needlessly contentious. That is, private instance members are generally safe from other instances of the object touching them, as classes usually only operate on their own instance members, not those of other instances.
A static lock object would require all threads operating on any instance to synchronize their execution, when it would be safe for threads using different instances to operate concurrently.
As with the static lock object, make the field readonly.
Another one is to use the object itself.
I tend to try to avoid doing this. If the object is maintained privately, and you are sure that the reference is never known to any code other than your own and that within the object itself, it can be safe enough. But it can be risky: even if the object is currently never used elsewhere, you can't be sure that the code will never change such that it later is.
If the reference were to become available outside your own class, then it's possible some other code could also lock on the object in an inopportune way. The worst would be if it acquired some other lock, then tried to lock on that object, while your own code tried to acquire the other lock having already locked on that object. Deadlock. Less bad is simply increasing thread contention on the lock. The code would still work, but may not run as well.
One approach is to use the type of a particular object
This combines the worst of the above: publicly available reference, static member. Strongly advised against.
Another possibility is using lock(this), but the general recommendation seems to be to avoid this.
This is just a variation on #3, except that you are practically guaranteed that the reference will be used by some other code. Not advised.
You can, in the sense that "it is possible", use any reference with a lock statement (i.e. any class value…you definitely should not use any struct value, as suggested in the answer to the question which Preston has suggested as a duplicate for this one). However, there are lots of ways to get this wrong.
IMHO, the best policy is to keep it simple: match the static/instance declaration of whatever it is you're trying to protect (either a whole object, or in some cases specific operations on a specific object), always make the field readonly, and be done with it.
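As a rough sketch of that policy, the class below guards static state with a private static readonly lock object and instance state with a private readonly one. The Inventory and InventoryDemo types and their members are hypothetical, invented only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public class Inventory
{
    // Static state is guarded by a static lock object.
    private static readonly object staticSync = new object();
    private static int totalItems;

    // Instance state is guarded by a per-instance lock object, so threads
    // working on different Inventory instances do not contend with each other.
    private readonly object sync = new object();
    private readonly List<string> items = new List<string>();

    public void Add(string item)
    {
        lock (sync) { items.Add(item); }
        // The two locks are taken one after the other, never nested.
        lock (staticSync) { totalItems++; }
    }

    public int Count { get { lock (sync) { return items.Count; } } }

    public static int TotalItems { get { lock (staticSync) { return totalItems; } } }
}

public static class InventoryDemo
{
    // Adds perThread items to the same inventory from each of threadCount threads.
    public static void Fill(Inventory inventory, int threadCount, int perThread)
    {
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < perThread; n++) inventory.Add("item");
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
    }

    public static void Main()
    {
        var inventory = new Inventory();
        Fill(inventory, 4, 250);
        Console.WriteLine(inventory.Count);
    }
}
```

Neither lock object is reachable from outside the class, so no other code can lock on it in an inopportune way.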

can there be concurrency issues when using C# class with only static methods and no variables?

Have I understood correctly that all threads have copy of method's variables in their own stack so there won't be problems when a static method is called from different threads?
Yes and no. If the parameters are value types, then yes they have their own copies. Or if the reference type is immutable, then it can't be altered and you have no issues. However, if the parameters are mutable reference types, then there are still possible thread safety issues to consider with the arguments being passed in.
Does that make sense? If you pass a reference type as an argument, its reference is passed "by value", so it's a new reference that refers back to the same object. Thus you could have two different threads potentially altering the same object in a non-thread-safe way.
If each of those instances is created and used only in the thread using it, then chances are low that you'd get bitten, but I just wanted to emphasize that using static methods with only locals/parameters is not a guarantee of thread-safety (the same goes for instance methods, of course, as noted by Chris).
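Here is a small sketch of that situation: a static method with no static fields, called from two threads with the same List<int>. The names ListHelper, AppendRange and RunSharedListDemo are hypothetical. The explicit lock object passed alongside the list is one possible way for the callers to cooperate; without it, the concurrent List<T>.Add calls would be a race, because List<T> is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ListHelper
{
    // Touches only its parameters and locals, yet it mutates an object
    // that the caller may also have handed to another thread.
    public static void AppendRange(List<int> target, object sync, int count)
    {
        for (int i = 0; i < count; i++)
        {
            // All threads sharing 'target' must use the same 'sync' object.
            lock (sync) { target.Add(i); }
        }
    }

    // Runs AppendRange on two threads against one shared list and returns its final size.
    public static int RunSharedListDemo(int perThread)
    {
        var shared = new List<int>();
        var sync = new object();

        var t1 = new Thread(() => AppendRange(shared, sync, perThread));
        var t2 = new Thread(() => AppendRange(shared, sync, perThread));
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();

        return shared.Count;
    }

    public static void Main()
    {
        Console.WriteLine(RunSharedListDemo(1000));
    }
}
```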
Have I understood correctly that all threads have copy of method's variables in their own stack so there won't be problems when a static method is called from different threads?
No.
First off, it is false that "all threads have a copy of the method's local variables in their own stack." A local variable is only generated on the stack when it has a short lifetime; local variables may have arbitrarily long lifetimes if they are (1) closed-over outer variables, (2) declared in an iterator block, or (3) declared in an async method.
In all of those cases a local variable created by an activation of a method on one thread can later be mutated by multiple threads. Doing so is not threadsafe.
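A short sketch of case (1), a closed-over local: the variable counter below is declared as a local, but because the lambda captures it, both threads operate on the same storage. ClosureDemo and CountWithTwoThreads are hypothetical names. Interlocked.Increment is used here as one way to make the shared update safe; a plain counter++ in its place could lose updates.

```csharp
using System;
using System.Threading;

public static class ClosureDemo
{
    public static int CountWithTwoThreads(int perThread)
    {
        // Declared as a local, but captured by the lambda below,
        // so it is shared by every thread that runs the lambda.
        int counter = 0;

        ThreadStart work = () =>
        {
            for (int i = 0; i < perThread; i++)
            {
                Interlocked.Increment(ref counter); // atomic increment of the shared variable
            }
        };

        var t1 = new Thread(work);
        var t2 = new Thread(work);
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();

        return counter;
    }

    public static void Main()
    {
        Console.WriteLine(CountWithTwoThreads(100000));
    }
}
```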
Second, there are plenty of possible problems when calling static methods from different threads. The fact that local variables are sometimes allocated on the stack does not magically make access to shared memory by static methods suddenly correct.
can there be concurrency issues when using C# class with only static methods and no variables?
I assume you mean "no static variables" and not "no local variables".
Absolutely there can be. For example, here's a program with no static variables, no non-static methods, no objects created apart from the second thread, and a single local variable to hold a reference to that thread. None of the methods other than the cctor actually do anything at all. This program deadlocks. You cannot assume that just because your program is dead simple that it contains no threading bugs!
Exercise to the reader: describe why this program that appears to contain no locks actually deadlocks.
class MyClass
{
    static MyClass()
    {
        // Let's run the initialization on another thread!
        var thread = new System.Threading.Thread(Initialize);
        thread.Start();
        thread.Join();
    }

    static void Initialize()
    { /* TODO: Add initialization code */ }

    static void Main()
    { }
}
It sounds like you are looking for some magical way of knowing that your program has no threading issues. There is no such magical way of knowing that, short of making it single-threaded. You're going to have to analyze your use of threads and shared data structures.
There is no such guarantee unless all of the variables are immutable reference types or value types.
If the variables are mutable reference types, proper synchronization needs to be performed.
EDIT: Mutable variables only need to be synchronized if they are shared between threads- locally declared mutables that are not exposed outside of the method need not be synchronized.
Yes, unless the methods use only locally scoped variables and no global variables, so that there is no way for any of those methods to affect the state of any object. If that is true, you have no problems using them in multithreaded code. I would say that, under these conditions, whether they are static or not is irrelevant.
If they are variables local to the method then yes, you have nothing to worry about. Just make sure you are not passing parameters by reference or accessing global variables and changing them in different threads. Then you will be in trouble.
static methods can refer to data in static fields -- either in their class or outside of it -- which may not be thread safe.
So ultimately the answer to your question is "no", because there may be problems, although usually there won't be.
Two threads can still operate on the same object, either because the object is passed as a parameter to methods on different threads, or because it can be accessed globally via a Singleton or the like. In either case, all bets are off.
Mark
As an addendum to the answers about why static methods are not necessarily thread-safe, it's worth considering why they might be, and why they often are.
The first reason why they might be is, I think, the sort of case you were thinking of:
public static int Max(int x, int y)
{
return x > y ? x : y;
}
This pure function is thread-safe because there is no way for it to affect code on any other thread: the locals x and y remain local to the thread they are on, not being stored in a shared location, captured in a delegate, or otherwise leaving the purely local context.
It's always worth noting, that combinations of thread-safe operations can be non thread-safe (e.g. doing a thread-safe read of whether a concurrent dictionary has a key followed by a thread-safe read of the value for that key, is not thread-safe as state can change between those two thread-safe operations). Static members tend not to be members that can be combined in such non thread-safe ways in order to avoid this.
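The concurrent-dictionary case can be sketched as follows. CacheLookup, GetRacy and GetSafe are hypothetical names; ConcurrentDictionary and its ContainsKey, indexer and TryGetValue members are the real .NET API. Each call on its own is thread-safe, but the check-then-read combination in GetRacy is not atomic.

```csharp
using System;
using System.Collections.Concurrent;

public static class CacheLookup
{
    // Two thread-safe operations combined unsafely: another thread may remove
    // the key after ContainsKey returns true, and the indexer then throws
    // KeyNotFoundException.
    public static string GetRacy(ConcurrentDictionary<string, string> cache, string key)
    {
        return cache.ContainsKey(key) ? cache[key] : null;
    }

    // One operation that checks for the key and reads the value together.
    public static string GetSafe(ConcurrentDictionary<string, string> cache, string key)
    {
        string value;
        return cache.TryGetValue(key, out value) ? value : null;
    }

    public static void Main()
    {
        var cache = new ConcurrentDictionary<string, string>();
        cache["a"] = "alpha";
        Console.WriteLine(GetSafe(cache, "a"));
        Console.WriteLine(GetSafe(cache, "missing") == null);
    }
}
```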
A static method may also guarantee its own thread-safety:
public static object GetCachedEntity(string key)
{
    object ret; //local and remains so.
    lock(_cache) //same lock used on every operation that deals with _cache;
        return _cache.TryGetValue(key, out ret) ? ret : null;
}
OR:
public static object GetCachedEntity(string key)
{
    object ret;
    return _cache.TryGetValue(key, out ret) ? ret : null; //_cache is known to be thread-safe in itself!
}
Of course here this is no different than an instance member which protects itself against corruption from other threads (by co-operating with all other code that deals with the objects they share).
Notably though, it is very common for static members to be thread-safe, and instance members to not be thread-safe. Almost every static member of the FCL guarantees thread-safety in the documentation, and almost every instance member does not barring some classes specifically designed for concurrent use (even in some cases where the instance member actually is thread-safe).
The reasons are two-fold:
The sort of operations most commonly useful for static members are either pure functions (most of the static members of the Math class, for example) or read static read-only variables which will not be changed by other threads.
It's very hard to bring your own synchronisation to a third-party's static members.
The second point is important. If I have an object whose instance members are not thread-safe, then assuming that calls do not affect non-thread-safe data shared between different instances (possible, but almost certainly a bad design), then if I want to share it between threads, I can provide my own locking to do so.
If however, I am dealing with static members that are not thread-safe, it is much harder for me to do this. Indeed, considering that I may be racing not just with my own code, but with code from other parties, it may be impossible. This would make any such public static member next to useless.
Ironically, the reason that static members tend to be thread-safe is not that it's easier to make them so (though that does cover the pure functions), but that it's harder to! So hard, in fact, that the author of the code has to do it for the user, because the user won't be able to do it themselves.

Why can't we lock on a value type?

I was trying to lock a Boolean variable when I encountered the following error :
'bool' is not a reference type as required by the lock statement
It seems that only reference types are allowed in lock statements, but I'm not sure I understand why.
Andreas is stating in his comment:
When [a value type] object is passed from one thread to the other, a copy is made, so the threads end up working on 2 different objects, which is safe.
Is it true? Does that mean that when I do the following, I am in fact modifying two different x in the xToTrue and the xToFalse method?
using System;
using System.Threading;

public static class Program {
    public static Boolean x = false;

    [STAThread]
    static void Main(string[] args) {
        var t = new Thread(() => xToTrue());
        t.Start();
        // ...
        xToFalse();
    }

    private static void xToTrue() {
        Program.x = true;
    }

    private static void xToFalse() {
        Program.x = false;
    }
}
(this code alone is clearly useless in its state, it is only for the example)
P.S: I know about this question on How to properly lock a value type. My question is not related to the how but to the why.
Just a wild guess here...
but if the compiler let you lock on a value type, you would end up locking nothing at all... because each time you passed the value type to the lock, you would be passing a boxed copy of it; a different boxed copy. So the locks would be as if they were entirely different objects. (since, they actually are)
Remember that when you pass a value type for a parameter of type object, it gets boxed (wrapped) into a reference type. This makes it a brand-new object each time this happens.
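The boxing behaviour is easy to check directly. In this small sketch (BoxingDemo and its method names are hypothetical), boxing the same int twice yields two distinct objects, while a box created once and stored in an object variable keeps its identity.

```csharp
using System;

public static class BoxingDemo
{
    // Boxes the same value twice; each conversion to object allocates a new box.
    public static bool SeparateBoxesAreSameObject()
    {
        int x = 7;
        object first = x;
        object second = x;
        return ReferenceEquals(first, second);
    }

    // Boxes once and copies the reference; both variables point at the same box.
    public static bool SingleBoxIsSameObject()
    {
        int x = 7;
        object boxed = x;
        object alias = boxed;
        return ReferenceEquals(boxed, alias);
    }

    public static void Main()
    {
        Console.WriteLine(SeparateBoxesAreSameObject());
        Console.WriteLine(SingleBoxIsSameObject());
    }
}
```

Two lock statements that each box the value would therefore be locking on unrelated objects.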
You cannot lock a value type because it doesn't have a sync root record.
Locking is performed by CLR and OS internals mechanisms that rely upon an object having a record that can only be accessed by a single thread at a time - sync block root. Any reference type would have:
Pointer to a type
Sync block root
Pointer to the instance data in heap
A lock (x) statement expands to (roughly):
System.Threading.Monitor.Enter(x);
try {
...
}
finally {
System.Threading.Monitor.Exit(x);
}
Although they would compile, Monitor.Enter/Exit require a reference type because a value type would be boxed to a different object instance each time so each call to Enter and Exit would be operating on different objects.
From the MSDN Enter method page:
Use Monitor to lock objects (that is, reference types), not value types. When you pass a value type variable to Enter, it is boxed as an object. If you pass the same variable to Enter again, it is boxed as a separate object, and the thread does not block. In this case, the code that Monitor is supposedly protecting is not protected. Furthermore, when you pass the variable to Exit, still another separate object is created. Because the object passed to Exit is different from the object passed to Enter, Monitor throws SynchronizationLockException. For more information, see the conceptual topic Monitors.
If you're asking conceptually why this isn't allowed, I would say the answer stems from the fact that a value type's identity is exactly equivalent to its value (that's what makes it a value type).
So anyone anywhere in the universe talking about the int 4 is talking about the same thing - how then can you possibly claim exclusive access to lock on it?
I was wondering why the .Net team decided to limit developers and allow Monitor to operate on references only. At first, you might think it would be good to lock on a System.Int32 instead of defining a dedicated object variable just for locking, since these lock objects usually don't do anything else.
But then it appears that any feature provided by the language must have strong semantics, not just be useful for developers. The semantics of value types is that whenever a value type appears in code, its expression is evaluated to a value. So, from a semantic point of view, if we write 'lock (x)' and x is a primitive value type, it's the same as saying "lock a block of critical code against the value of the variable x", which sounds more than strange, for sure :). Meanwhile, when we meet reference variables in code we are used to thinking "Oh, it's a reference to an object", which implies that the reference can be shared between code blocks, methods, classes and even threads and processes, and thus can serve as a guard.
In two words, value type variables appear in code only to be evaluated to their actual value in each and every expression - nothing more.
I guess that's one of the main points.
Because value types don't have the sync block that the lock statement uses to lock on an object. Only reference types carry the overhead of the type info, sync block etc.
If you box your value type then you now have an object containing the value and can lock on that object (I expect), since it now has the extra overhead that objects have (a pointer to a sync block that is used for locking, a pointer to the type information etc). As everyone else is stating though, if you box a value you will get a NEW object every time you box it, so you will be locking on different objects every time, which completely defeats the purpose of taking a lock.
This would probably work (although it's completely pointless and I haven't tried it)
int x = 7;
object boxed = (object)x;
//thread1:
lock (boxed){
...
}
//thread2:
lock(boxed){
...
}
As long as everyone uses boxed and the object boxed is only set once you would probably get correct locking since you are locking on the boxed object and it's only being created once. DON'T do this though.. it's just a thought exercise (and might not even work - like I said, I haven't tested it ).
As to your second question - No, the value is not copied for each thread. Both threads will be using the same boolean, but the threads are not guaranteed to see the freshest value for it (when one thread sets the value it might not get written back to the memory location immediately, so any other thread reading the value would get an 'old' result).
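One common way to deal with the visibility problem for a simple flag is to mark the field volatile (a lock around every access would also work). The sketch below is an illustration under that assumption, not the only option; StopFlagDemo and its members are hypothetical names. Both threads use the same static bool, and the worker is expected to observe the write made by the main thread.

```csharp
using System;
using System.Threading;

public static class StopFlagDemo
{
    // One shared field, not a per-thread copy. 'volatile' prevents the read in
    // the loop below from being hoisted out of the loop or served from a stale value.
    private static volatile bool stopRequested;

    // Starts a worker that spins until the flag is set, then sets the flag.
    // Returns true if the worker finished within the timeout.
    public static bool WorkerStopsAfterRequest()
    {
        stopRequested = false;

        var worker = new Thread(() =>
        {
            while (!stopRequested)
            {
                Thread.SpinWait(100); // busy-wait briefly between checks
            }
        });
        worker.Start();

        Thread.Sleep(50);      // let the worker run for a moment
        stopRequested = true;  // written by this thread, read by the worker

        return worker.Join(TimeSpan.FromSeconds(5));
    }

    public static void Main()
    {
        Console.WriteLine(WorkerStopsAfterRequest());
    }
}
```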
The following is taken from MSDN:
The lock (C#) and SyncLock (Visual Basic) statements can be used to ensure that a block of code runs to completion without interruption by other threads. This is accomplished by obtaining a mutual-exclusion lock for a given object for the duration of the code block.
and
The argument provided to the lock keyword must be an object based on a reference type, and is used to define the scope of the lock.
I would assume that this is in part because the lock mechanism uses an instance of that object to create the mutual exclusion lock.
According to this MSDN Thread, the changes to a reference variable may not be visible to all the threads and they might end up using stale values, and AFAIK I think value types do make a copy when they are passed between threads.
To quote exactly from MSDN
It's also important to clarify that the fact the assignment is atomic
does not imply that the write is immediately observed by other
threads. If the reference is not volatile, then it's possible for
another thread to read a stale value from the reference some time
after your thread has updated it. However, the update itself is
guaranteed to be atomic (you won't see a part of the underlying
pointer getting updated).
I think this is one of those cases where the answer to why is "because a Microsoft engineer implemented it that way".
The way locking works under the hood is by creating a table of lock structures in memory and then using the object's vtable to remember the position in the table where the required lock is. This gives the appearance that every object has a lock when in fact they don't. Only those that have been locked do. As value types don't have a reference, there is no vtable to store the lock's position in.
Why Microsoft chose this strange way of doing things is anyone's guess. They could have made Monitor a class you had to instantiate. I'm sure I have seen an article by an MS employee that said that on reflection this design pattern was a mistake, but I can't seem to find it now.

does final static automatically employ lazy instantiation?

the page at http://www.javaworld.com/javaworld/jw-04-2003/jw-0425-designpatterns.html?page=5 says that code like this:
public final static Singleton INSTANCE = new Singleton();
automatically employs lazy instantiation.
I want to verify if
1) all compilers do this, or is it that the compiler is free to do whatever it wishes to
2) and since c# does not have the "final" keyword, what's the best way to translate this into c# (and at the same time it should automatically employ lazy instantiation too)
Yes. The static initializer is guaranteed to run before you are able to access that INSTANCE. There are two negatives with this approach:
If an error occurs within the Singleton's construction, then the error is a little harder to debug ("Error in initializer").
On first use of the class, that object will be instantiated. If you did the locking approach, then it would not be instantiated until it was needed. However, being that the example is a singleton, then this is not a problem at all, but it could be a drag on an unused, yet lazily instantiated piece of code elsewhere that is not a singleton.
The translation for C# is readonly instead of final.
In my opinion, this is still vastly preferable to the secondary approach (synchronized/locked, checked instantiation within a static getter) because it does not require any synchronization code, which makes it faster, easier to read and just as easy to use.
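For the C# side of the question, here is a hedged sketch. The commented-out line is the direct readonly translation mentioned above, which is created when the type is initialized. The active code uses Lazy<T>, which is not mentioned in the answer and is swapped in here only to show one way of deferring creation until Instance is first read. LazySingleton and IsCreated are hypothetical names.

```csharp
using System;

public sealed class LazySingleton
{
    // Direct counterpart of the Java line, created when the type is initialized:
    // public static readonly LazySingleton Instance = new LazySingleton();

    // Lazy<T> with this constructor uses its default thread-safe mode, so the
    // factory delegate runs at most once, on the first read of Value.
    private static readonly Lazy<LazySingleton> lazy =
        new Lazy<LazySingleton>(() => new LazySingleton());

    private LazySingleton() { }

    public static LazySingleton Instance { get { return lazy.Value; } }

    // True once the instance has actually been constructed.
    public static bool IsCreated { get { return lazy.IsValueCreated; } }
}

public static class LazySingletonDemo
{
    public static void Main()
    {
        Console.WriteLine(LazySingleton.IsCreated);   // state before the first access
        LazySingleton first = LazySingleton.Instance; // triggers construction
        Console.WriteLine(LazySingleton.IsCreated);   // state after the first access
        Console.WriteLine(ReferenceEquals(first, LazySingleton.Instance));
    }
}
```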

Static methods in C#?

What is the performance concern with static methods versus non-static methods? I have read that static methods are better in terms of performance, but I want to know how they are faster. If a method is not using any instance member, then the compiler should take care of it and treat it as a static method.
Edit: Eric comments more on this here, and hints that there are some times when call is used... although note that his new() example isn't guaranteed ;-p
In the original compiler (pre-1.1), the compiler did treat non-virtual instance methods (without this) as static; the problem was that this led to some odd problems with null checking, i.e.
obj.SomeMethod();
didn't throw an exception (for obj=null and a non-virtual method SomeMethod which didn't touch this). That was bad if you ever changed the implementation of SomeMethod. When they investigated the cost of adding the explicit null check (i.e. null-check then static-call), it turned out to be just the same as using a virtual-call, so they did that instead, which makes it far more flexible and predictable.
Note that the "don't throw an exception" is also entirely the behaviour if SomeMethod is an extension-method (static).
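A small sketch of the difference described here, using hypothetical names (StringExtensions.IsBlank, Widget.Describe, NullCallDemo). The extension method is an ordinary static call that receives null as its first argument, while the non-virtual instance method is null-checked at the call site even though its body never touches this.

```csharp
using System;

public static class StringExtensions
{
    // Extension method: compiled as a plain static call, so 's' may be null.
    public static bool IsBlank(this string s)
    {
        return s == null || s.Trim().Length == 0;
    }
}

public class Widget
{
    // Non-virtual instance method that does not use 'this'.
    public string Describe() { return "widget"; }
}

public static class NullCallDemo
{
    // Calls the extension method on a null reference; no exception is thrown.
    public static bool ExtensionCallOnNullSucceeds()
    {
        string s = null;
        return s.IsBlank();
    }

    // Calls the instance method on a null reference and reports whether it threw.
    public static bool InstanceCallOnNullThrows()
    {
        Widget w = null;
        try
        {
            w.Describe();
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ExtensionCallOnNullSucceeds());
        Console.WriteLine(InstanceCallOnNullThrows());
    }
}
```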
I think at one point you could emit IL to invoke a regular instance method via static-call, but the last time I tried I got the "oh no you don't!" message from the CLR (this operation may destabilise the runtime); either they blocked this entirely, or (perhaps more likely) I borked the custom IL.
Yes a static call would be faster - you don't need to create an instance of the object before you call the method. (Although you obviously won't notice the difference)
In practical terms it doesn't matter if the compiler optimizes a method (makes the instance method static): you won't call the instance method unless you've already created the instance, right?
At the end of the day you should rather try to optimize your code for maintainability rather than trying to save 3 nanoseconds here or there.
See this question.
Here's the excerpt:
a static call is 4 to 5 times faster
than constructing an instance every
time you call an instance method.
However, we're still only talking
about tens of nanoseconds per call
I doubt the compiler will treat it as a static method, although you can check for yourself. The benefit would be no creation of the instance. No garbage collector to worry about. And only the static constructor to be called, if there is one.
Static methods are fast because no instance has to be constructed.
But if you create an instance only once and keep it in a static member, performance is equal.
Either way, the difference is very small relative to total performance.
Yes, a static method is fast, but an object referenced by a static variable stays reachable for the lifetime of the AppDomain, so the GC does not release its memory even when it is no longer needed; that can be an issue.
But more than anything else you should consider the design of the application: memory and speed keep increasing, but your design may suffer if you don't use static variables properly.
