I recently made a recommendation to one of my colleagues stating that on our current project (C#) "services should be stateless and therefore static".
My colleague agreed and indicated that in our project the services are (and should be) indeed stateless. However my colleague disagreed that static implies no state and that stateless should mean static.
My questions is “does a method marked as static imply that it requires no state and that in a majority of cases should stateless methods be made static”.
Static nearly means global. There is still an instance, and there is still state in that instance, but it is the static instance, which means there is only one and all callers always refer to that one.
does a method marked as static imply
that it requires no state
1) No. You cannot say static methods imply that it requires no state because static methods can access static/singleton resources.
a majority of cases should stateless
methods be made static
2) Yes. Methods that require no state, therefore require no instance, should be generally made static.
A static method, in C#, can access the static variables of its containing class, and if it does, it's not stateless. I've seen some painful instances of non-reentrant "stateless" static methods triggering fun race conditions.
A truly stateless method can indeed be made static, and generally should.
I find it rather scary to say that stateless is the same as static as these are two different worlds. Stateless means that there is no state kept, i.e., a perfect example is an HTTP connection (once data is send, the connection is closed and there's no memory kept), where we are actually trying to do our best to maintain state regardless (login state for one).
Static on the other hand is a term used to describe a way that a method is invoked. In C# that means that a method can be invoked without a class instance, but an instance of a class is not the same as state. There's still the static instance and that's perfectly capable of maintaining state: any static member variable, field or property can maintain state. A static method or class is also perfectly capable of maintaining state by using memory mapped files, a database or whatever. Static is a calling convention, nothing more and is not related to being stateless or not.
I think his statement makes about as much sense as "Democracies should use yellow paper ballots".
He is mixing a high level design concepts "stateless services" with a low level technical implementation detail "using static classes".
Stateless services can (and have been) implemented in languages which supports only static variables (e.g. COBOL, RPG) and languages which dont even allow static variables (Erlang etc.).
I could easily imagine a case where stateless service was implemented largly using static classes, becuase they were there and already implemented the correct business logic, although its generally considered good Java programing practice not to use static classes unless you really need to.
He also serioulsy misunderstands what "static" is all about -- a static variable is a way of storing state between invocations -- and would therefore seem a better match with a "stateful" service.
static is a language keyword and state is a design concept. There's an obvious relationship between these two things, but it is a relationship of the concrete to the metaphysical, not a relationship of cause and effect. It is possible for static methods to refernce some kinds of state information.
Regarding stateless methods, here we are talking about method that don't reference a class instance, i.e. a this pointer. Marking these methods as static improves the clarity of the code and is consistent with best practices. Note that in this case we are talking a specific kind of "statelessness" and not making a general commentary about the use of stateful contexts.
static can be stateful. you just have to define static containers for said state. and the containers are shared among all calls to your static methods.
The short answer to your question is "no", static does not imply "no state".
In reference to your other comments, static can be used to help you implement a stateless service, but static alone is not sufficient. Another tool/technique to help make something stateless is to use immutable data structures. This (currently) isn't one of C#'s fortes, especially when compared to F#.
Beyond rehashing all the definitions of "static" that one can run through, the answer is "yes." Static methods may happily modify the state of the Class itself (represented through static variables), and even modify the state of instances of the class (notably, when they get passed an instance or set of instances). However, most of the time, you will use static methods in cases where no state is changed. The most important example is to find or create an instance (factory methods).
That said, the real answer is "no." In real life (Web Services over HTTP, for instance, or interaction with any kind of Orb), services never expose their service methods using actual static methods. You usually call static methods to get an instance of the service (or an instance of the service factory, from which you get an instance!), and then work with that. This is because your service proxy, internally, needs to keep track of where it's at. So while your methods seem stateless to you, they are really not.
Hope this wasn't too confusing :)
A Static class is not stateless. It can still have variables which, although static, have a state.
A Static class without any class-level variables is stateless. It holds no data.
Every class has a class definition structure, where static fields are represented and stored. Every "instance" of the class has access to the static fields stored in the class definition (a data structure caleld CORINFO_CLASS_STRUCT). Even when NO instances have been created, code anywhere in your assembly can access these static class-level fields by using the syntax classname.StaticFieldName, without any instance at all.
Since the values stored in these static class-level fields are persisted, they are definitely state. In fact, they are state shared by not only any instances of the class that might exist, they are shared throughout the assembly, whether any instances have been created or not.
Even more significant, since once a CORINFO_CLASS_STRUCT class definition has been loaded, unlike a true instance of the class, it is never unloaded until the assembly (or the AppDomain) is unloaded, so it is arguably more stateful than any instance field defined in a class, because an instance field dissapears when the instance gets garbage collected.
For more information check out CORINFO_CLASS_STRUCT link to Don Boxes' great book, Essential .Net
That is generally correct. However you can have static variables, which allow your static methods to have state. For instance, take this FooBarFactory class:
class FooBarFactory
{
private static _id = 0;
public FooBar MakeAFooBar()
{
FooBar foo = new FooBar();
foo.ID = _id;
_id++;
}
}
class FooBar
{
public int ID {get;set;}
}
In that case your static method has a minimum of state, but it's (probably) neccessary.
If you want a single instance with state, use a singleton pattern; it makes the intent clear that you're working with a single-occurrence object.
From that, I would then treat all static classes and methods as stateless. It just helps keep sanity.
Related
I'm using dependency injection in my C# projects and generally everything is ok.
Nevertheless, I often hear the rule "constructor must only consist of trivial operations - assign dependencies and do nothing more" I.e:
//dependencies
interface IMyFooDependency
{
string GetBuzz();
int DoOtherStuff();
}
interface IMyBarDependency
{
void CrunchMe();
}
//consumer
class MyNiceConsumer
{
private readonly IMyFooDependency foo;
private readonly IMyBarDependency bar;
private /*readonly*/ string buzz;//<---question here
MyNiceConsumer(IMyFooDependency foo, IMyBarDependency bar)
{
//omitting null checks
this.foo = foo;
this.bar = bar;
//OR
this.buzz = foo.GetBuzz();//is this a bad thing to do?
}
}
UPD: assume IMyFooDependency can't be replaced with the GetBuzz(), as in that case the answer is obvious: "do not depend on foo".
UPD2: Please, understand that this question is not about eliminating dependency from foo in a hypothetic code, but about understanding a principle of good constructor design.
So, my questions is following: is this really a bad pattern to include non-trivial logic in constructor(i.e. obtaining buzz value, making some calculations based on dependencies.)
Personally I, unless lazy load is necessary, would include foo.GetBuzz() in constructor, as object need to be initialized after call to its constructor.
The only drawback I see: by including non-trivial logic you increase the number of places where something might go wrong, and you'll get an obfuscated error message from your IoC container(but same things happen in case of invalid parameter, so the drawback is rather minor)
Any other considerations for eliding non-trivial constructors?
If you need IMyFooDependency only for buzz creation, then you actually need buzz:
class MyNiceConsumer
{
private readonly IMyBarDependency bar;
private readonly string buzz;
MyNiceConsumer(string buzz, IMyBarDependency bar)
{
this.buzz = buzz;
this.bar = bar;
}
}
And create instance of nice consumer this way:
new MyNiceConsumer(foo.GetBuzz(), bar);
I don't see any difference between obtaining buzz before passing parameters to constructor, or obtaining it inside constructor. Same value will be returned from repository. So, you don't need to depend on repository.
UPDATE: Technically there is nothing wrong with complex initialization logic in constructor. Take a look on winforms InitializeComponent method, where all controls are created, initialized and added to form.
But it violates SRP (creation and initialization) and its hard to test. You can read more about this flaw on writing testable code guide. Main idea:
Do not create collaborators in your constructor, but pass them in.
(Don’t look for things! Ask for things!)
The rationale for not doing any work in the constructor comes from looking at the execution of the program in two phases. The first phase is to wire up your object graph. The second phase is to do the "real work".
There is a tension between this ideal and efficiently maintaining a class's invariants and internal state. The less setup you can do in your constructor, the more difficult all of your methods will be to implement because they must take into account the varying possible internal state of the object. Remember, the constructor is the only code you can be sure is called for an object.
The way out of this conundrum is to realize that an object's "real work" is defined by it's interface and behavior in relation to other objects. That is, the dependencies provided to the constructor and objects provided as arguments to methods later down the road.
Feel free to do any kind of setup you like in your constructor that does not have a noticeable effect on other objects in your system. Likewise, be very sensitive to timing issues in your object's construction.
If you determine that a File object can't exist without a filename provided by the user: don't call keyboard.filename_from_keyboard() in the constructor. Instead you design your system such that the object is created by a factory (provider) during execution with the filename provided to the constructor or you allow the File object to exist without a filename. Maybe it can get it's own filename during execution? This is part of managing your object's lifetime and it's the hardest part IMO. This gets very subtle because "real work" involves creating objects too. But I digress...
In your example you would have to decide if foo.GetBuzz() breaks that condition. If GetBuzz() is a referentially transparent function, you're almost always in the clear to call it in the constructor. If GetBuzz() involves any I/O, user interaction or changes any noticeable internal state of any other object, then it is probably does not need to be called from a constructor.
As lazyberezovsky rightly mentioned, don't look for things! Ask for things!
If the creating code (let's say, MyNiceCreator) treats foo as an opaque value and news up MyNiceConsumer, then most likely creation should not be the responsibility of MyNiceCreator. The code that creates the MyNiceConsumer instance must be able to give the required values to the constructor.
A better pattern:
MyNiceCreator should "ask" for a MyNiceConsumer instance. This way the creation of MyNiceConsumer instance will be the responsibility of the appropriate class.
Are there overall rules/guidelines for what makes a method thread-safe? I understand that there are probably a million one-off situations, but what about in general? Is it this simple?
If a method only accesses local variables, it's thread safe.
Is that it? Does that apply for static methods as well?
One answer, provided by #Cybis, was:
Local variables cannot be shared among threads because each thread gets its own stack.
Is that the case for static methods as well?
If a method is passed a reference object, does that break thread safety? I have done some research, and there is a lot out there about certain cases, but I was hoping to be able to define, by using just a few rules, guidelines to follow to make sure a method is thread safe.
So, I guess my ultimate question is: "Is there a short list of rules that define a thread-safe method? If so, what are they?"
EDIT
A lot of good points have been made here. I think the real answer to this question is: "There are no simple rules to ensure thread safety." Cool. Fine. But in general I think the accepted answer provides a good, short summary. There are always exceptions. So be it. I can live with that.
If a method (instance or static) only references variables scoped within that method then it is thread safe because each thread has its own stack:
In this instance, multiple threads could call ThreadSafeMethod concurrently without issue.
public class Thing
{
public int ThreadSafeMethod(string parameter1)
{
int number; // each thread will have its own variable for number.
number = parameter1.Length;
return number;
}
}
This is also true if the method calls other class method which only reference locally scoped variables:
public class Thing
{
public int ThreadSafeMethod(string parameter1)
{
int number;
number = this.GetLength(parameter1);
return number;
}
private int GetLength(string value)
{
int length = value.Length;
return length;
}
}
If a method accesses any (object state) properties or fields (instance or static) then you need to use locks to ensure that the values are not modified by a different thread:
public class Thing
{
private string someValue; // all threads will read and write to this same field value
public int NonThreadSafeMethod(string parameter1)
{
this.someValue = parameter1;
int number;
// Since access to someValue is not synchronised by the class, a separate thread
// could have changed its value between this thread setting its value at the start
// of the method and this line reading its value.
number = this.someValue.Length;
return number;
}
}
You should be aware that any parameters passed in to the method which are not either a struct or immutable could be mutated by another thread outside the scope of the method.
To ensure proper concurrency you need to use locking.
for further information see lock statement C# reference and ReadWriterLockSlim.
lock is mostly useful for providing one at a time functionality,
ReadWriterLockSlim is useful if you need multiple readers and single writers.
If a method only accesses local variables, it's thread safe. Is that it?
Absolultely not. You can write a program with only a single local variable accessed from a single thread that is nevertheless not threadsafe:
https://stackoverflow.com/a/8883117/88656
Does that apply for static methods as well?
Absolutely not.
One answer, provided by #Cybis, was: "Local variables cannot be shared among threads because each thread gets its own stack."
Absolutely not. The distinguishing characteristic of a local variable is that it is only visible from within the local scope, not that it is allocated on the temporary pool. It is perfectly legal and possible to access the same local variable from two different threads. You can do so by using anonymous methods, lambdas, iterator blocks or async methods.
Is that the case for static methods as well?
Absolutely not.
If a method is passed a reference object, does that break thread safety?
Maybe.
I've done some research, and there is a lot out there about certain cases, but I was hoping to be able to define, by using just a few rules, guidelines to follow to make sure a method is thread safe.
You are going to have to learn to live with disappointment. This is a very difficult subject.
So, I guess my ultimate question is: "Is there a short list of rules that define a thread-safe method?
Nope. As you saw from my example earlier an empty method can be non-thread-safe. You might as well ask "is there a short list of rules that ensures a method is correct". No, there is not. Thread safety is nothing more than an extremely complicated kind of correctness.
Moreover, the fact that you are asking the question indicates your fundamental misunderstanding about thread safety. Thread safety is a global, not a local property of a program. The reason why it is so hard to get right is because you must have a complete knowledge of the threading behaviour of the entire program in order to ensure its safety.
Again, look at my example: every method is trivial. It is the way that the methods interact with each other at a "global" level that makes the program deadlock. You can't look at every method and check it off as "safe" and then expect that the whole program is safe, any more than you can conclude that because your house is made of 100% non-hollow bricks that the house is also non-hollow. The hollowness of a house is a global property of the whole thing, not an aggregate of the properties of its parts.
There is no hard and fast rule.
Here are some rules to make code thread safe in .NET and why these are not good rules:
Function and all functions it calls must be pure (no side effects) and use local variables. Although this will make your code thread-safe, there is also very little amount of interesting things you can do with this restriction in .NET.
Every function that operates on a common object must lock on a common thing. All locks must be done in same order. This will make the code thread safe, but it will be incredibly slow, and you might as well not use multiple threads.
...
There is no rule that makes the code thread safe, the only thing you can do is make sure that your code will work no matter how many times is it being actively executed, each thread can be interrupted at any point, with each thread being in its own state/location, and this for each function (static or otherwise) that is accessing common objects.
It must be synchronized, using an object lock, stateless, or immutable.
link: http://docs.oracle.com/javase/tutorial/essential/concurrency/immutable.html
Have I understood correctly that all threads have copy of method's variables in their own stack so there won't be problems when a static method is called from different threads?
Yes and no. If the parameters are value types, then yes they have their own copies. Or if the reference type is immutable, then it can't be altered and you have no issues. However, if the parameters are mutable reference types, then there are still possible thread safety issues to consider with the arguments being passed in.
Does that make sense? If you pass a reference type as an argument, it's reference is passed "by value" so it's a new reference that refers back to the old object. Thus you could have two different threads potentially altering the same object in a non-thread-safe way.
If each of those instances are created and used only in the thread using them, then chances are low that you'd get bit, but I just wanted to emphasize that just because you're using static methods with only locals/parameters is not a guarantee of thread-safety (same with instance of course as noted by Chris).
Have I understood correctly that all threads have copy of method's variables in their own stack so there won't be problems when a static method is called from different threads?
No.
First off, it is false that "all threads have a copy of the method's local variables in their own stack." A local variable is only generated on the stack when it has a short lifetime; local variables may have arbitrarily long lifetimes if they are (1) closed-over outer variables, (2) declared in an iterator block, or (3) declared in an async method.
In all of those cases a local variable created by an activation of a method on one thread can later be mutated by multiple threads. Doing so is not threadsafe.
Second, there are plenty of possible problems when calling static methods from different threads. The fact that local variables are sometimes allocated on the stack does not magically make access to shared memory by static methods suddenly correct.
can there be concurrency issues when using C# class with only static methods and no variables?
I assume you mean "no static variables" and not "no local variables".
Absolutely there can be. For example, here's a program with no static variables, no non-static methods, no objects created apart from the second thread, and a single local variable to hold a reference to that thread. None of the methods other than the cctor actually do anything at all. This program deadlocks. You cannot assume that just because your program is dead simple that it contains no threading bugs!
Exercise to the reader: describe why this program that appears to contain no locks actually deadlocks.
class MyClass
{
static MyClass()
{
// Let's run the initialization on another thread!
var thread = new System.Threading.Thread(Initialize);
thread.Start();
thread.Join();
}
static void Initialize()
{ /* TODO: Add initialization code */ }
static void Main()
{ }
}
It sounds like you are looking for some magical way of knowing that your program has no threading issues. There is no such magical way of knowing that, short of making it single-threaded. You're going to have to analyze your use of threads and shared data structures.
There is no such guarantee unless all of the variables are immutable reference types or value types.
If the variables are mutable reference types, proper synchronization needs to be performed.
EDIT: Mutable variables only need to be synchronized if they are shared between threads- locally declared mutables that are not exposed outside of the method need not be synchronized.
Yes, unless methods use only local scope variable and no any gloval variable, so there is no any way any of that methods can impact on the state of any object, if this is true, you have no problems to use it in multithreading. I would say, that even , in this conditions, static they or not, is not relevant.
If they are variables local to the method then yes, you have nothing to worry about. Just make sure you are not passing parameters by reference or accessing global variables and changing them in different threads. Then you will be in trouble.
static methods can refer to data in static fields -- either in their class or outside of it -- which may not be thread safe.
So ultimately the answer to your question is "no", because there may be problems, although usually there won't be.
Two threads should still be able to operate on the same object either by the object being passed in to methods on different threads as parameters, or if an object can be accessed globally via Singleton or the like all bets are off.
Mark
As an addendum to the answers about why static methods are not necessarily thread-safe, it's worth considering why they might be, and why they often are.
The first reason why they might be is, I think, the sort of case you were thinking of:
public static int Max(int x, int y)
{
return x > y ? x : y;
}
This pure function is thread-safe because there is no way for it to affect code on any other thread, the locals x and y remain local to the thead they are on, not being stored in a shared location, captured in a delegate, or otherwise leaving the purely local context.
It's always worth noting, that combinations of thread-safe operations can be non thread-safe (e.g. doing a thread-safe read of whether a concurrent dictionary has a key followed by a thread-safe read of the value for that key, is not thread-safe as state can change between those two thread-safe operations). Static members tend not to be members that can be combined in such non thread-safe ways in order to avoid this.
A static method may also guarantee it's own thread-safety:
public object GetCachedEntity(string key)
{
object ret; //local and remains so.
lock(_cache) //same lock used on every operation that deals with _cache;
return _cache.TryGetValue(key, out ret) ? ret : null;
}
OR:
public object GetCachedEntity(string key)
{
object ret;
return _cache.TryGetValue(key, out ret) ? ret : null; //_cache is known to be thread-safe in itself!
}
Of course here this is no different than an instance member which protects itself against corruption from other threads (by co-operating with all other code that deals with the objects they share).
Notably though, it is very common for static members to be thread-safe, and instance members to not be thread-safe. Almost every static member of the FCL guarantees thread-safety in the documentation, and almost every instance member does not barring some classes specifically designed for concurrent use (even in some cases where the instance member actually is thread-safe).
The reasons are two-fold:
The sort of operations most commonly useful for static members are either pure functions (most of the static members of the Math class, for example) or read static read-only variables which will not be changed by other threads.
It's very hard to bring your own synchronisation to a third-party's static members.
The second point is important. If I have an object whose instance members are not thread-safe, then assuming that calls do not affect non-thread-safe data shared between different instances (possible, but almost certainly a bad design), then if I want to share it between threads, I can provide my own locking to do so.
If however, I am dealing with static members that are not thread-safe, it is much harder for me to do this. Indeed, considering that I may be racing not just with my own code, but with code from other parties, it may be impossible. This would make any such public static member next to useless.
Ironically, the reason that static members tend to be thread-safe is not that it's easier to make them so (though that does cover the pure functions), but that it's harder to! So hard in-fact that the author of the code has to do it for the user, because the user won't be able to themselves.
the page at http://www.javaworld.com/javaworld/jw-04-2003/jw-0425-designpatterns.html?page=5 says that code like this:
public final static Singleton INSTANCE = new Singleton();
automatically employs lazy instantiation.
I want to verify if
1) all compilers do this, or is it that the compiler is free to do whatever it wishes to
2) and since c# does not have the "final" keyword, what's the best way to translate this into c# (and at the same time it should automatically employ lazy instantiation too)
Yes. The static initializer is guaranteed to run before you are able to access that INSTANCE. There are two negatives with this approach:
If an error occurs within the Singleton's construction, then the error is a little harder to debug ("Error in initializer").
On first use of the class, that object will be instantiated. If you did the locking approach, then it would not be instantiated until it was needed. However, being that the example is a singleton, then this is not a problem at all, but it could be a drag on an unused, yet lazily instantiated piece of code elsewhere that is not a singleton.
The translation for C# is readonly instead of final.
In my opinion, this is still vastly preferable to the secondary approach (synchronized/locked, checked instantiation within the a static getter) because it does not require any synchronization code, which is faster, easier to read and just as easy to use.
While reading Jon Skeet's article on singletons in C# I started wondering why we need lazy initialization in a first place. It seems like the fourth approach from the article should be sufficient, here it is for reference purposes:
public sealed class Singleton
{
static readonly Singleton instance=new Singleton();
// Explicit static constructor to tell C# compiler
// not to mark type as beforefieldinit
static Singleton()
{
}
Singleton()
{
}
public static Singleton Instance
{
get
{
return instance;
}
}
}
In the rare circumstances, where you have other static methods on a singleton, lazy initialization may have benefits, but this is not a good design.
So can people enlighten me why lazy initialization is such a hot thing?
In those scenarios where whatever it is you are initializing might not be needed at all, and is expensive to initialize, (in terms of CPU cycles or resources), then implementing a lazy initializor saves that cost in those cases where the object is not required.
If the object will always be required, or is relatively cheap to initialize, then there is no added benefit from a lazy initializer.
In any event, implementing a lazy initializer improperly can make a singleton non-thread-safe, so if you need this pattern, be careful to do it correctly. Jon's article has a pattern, (I think it's the last one), that addresses this issue.
You don't need lazy initialization on a singleton, if the only reason you're going to use that type is to reference the instance.
If, however, you reference any other property or method on the type, or the type itself, you will initialize the singleton.
Granted, good design would leave one task for one type, so this shouldn't be a problem. If you make your singleton class to "complex", however, lazy initialization can help keep you from having consequences due to initializing too soon.
I'm not sure that this applies to C# as well, but I'll answer for C++:
One of the ideas behind Singletons is that the order of initialization of static objects is undefined. So if you have a static class that tries to use another static class, you might be in trouble.
A concrete example: Let's say I have a Log class, which I decided to implement as a Singleton. Let's also assume I have a method called LoadDB, which for whatever reason, is a static called at the start of the program. Assuming LoadDB decides it needs to log something, it will call the Singleton. But since the order of static initialization is undefined, it might be doing something that's an error.
A Singleton fixes this, since once you call GetInstance(), no matter where you are in the initialization process, the object is guaranteed to exist.
Again, I don't know if this is relevant, but at least that's the history of it (as far as I remember).
From Java DI perspective, lazy initialization is good, there are always (say, spring) beans in another API that you might not want to use, for example, an eagerly loaded singleton cache (something that is shared with everyone using the cache) might not be needed for you although it may be referred as a dependency in your same code. Whats the point in loading the singleton wasting resources?
The lazy initialization implementation choice is tricky, in spring, would you choose lazy-init="true" (spring eagerly instantiates singletons), an init-method/destroy-method, #PostConstruct, InitializingBean - afterPropertiesSet() method or return same spring instance in a getInstance() method?
The choice is a tradeoff of testability over reusability outside the spring container.