It's generally agreed that relying heavily on global state is to be avoided. Wouldn't using static classes and methods amount to the same thing?
Global data is bad; however, many of its issues can be avoided by working with static methods instead.
I'm going to take the position of Rich Hickey on this one and explain it like this:
To build the most reliable systems in C#, use static methods and classes, but not global data. For instance, if you pass a data object into a static method, and that static method does not access any static data, then you can be assured that given that input the output of the function will always be the same. This is the position taken by Erlang, Lisp, Clojure, and every other functional programming language.
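For example, here is a minimal sketch of that idea in C# (the Order type and every name below are invented for illustration, not taken from any real code base):

public sealed class Order
{
    public decimal UnitPrice { get; }
    public int Quantity { get; }

    public Order(decimal unitPrice, int quantity)
    {
        UnitPrice = unitPrice;
        Quantity = quantity;
    }
}

public static class OrderMath
{
    // No static fields are read or written, so for a given Order this
    // always returns the same value, no matter which thread calls it.
    public static decimal Total(Order order)
    {
        return order.UnitPrice * order.Quantity;
    }
}

Because Total touches nothing but its argument, you can call it from any number of threads without coordination, as long as the Order itself isn't being mutated underneath it.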
Using static methods can greatly simplify multi-threaded coding, since, if programmed correctly, only one thread will have access to a given set of data at a time. And that is really what it comes down to: having global data is bad since it is state that can be changed by who knows what thread at any time. Static methods, however, allow for very clean code that can be tested in smaller increments.
I know this will be hotly debated, as it flies in the face of C#'s OOP thought process, but I have found that the more static methods I use, the cleaner and more reliable my code is.
This video explains it better than I can, but shows how immutable data, and static methods can produce some extremely thread-safe code.
Let me clarify some issues with global data a bit more. Constant (or read-only) global data isn't nearly as big an issue as mutable (read/write) global data. Therefore, if it makes sense to have a global cache of data, use global data! To some extent every application that uses a database already has that, since we could say that a SQL database is one massive global variable that holds data.
So making a blanket statement like I did above is probably a bit strong. Instead, let's say that having global data introduces many issues that can be avoided by having local data instead.
Some languages, such as Erlang, get around this issue by having the cache live in a separate process (or thread) that handles all requests for that data. That way you know that all requests and modifications to that data will be handled atomically, and the global cache will not be left in some unknown state.
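A rough C# sketch of that single-owner idea (all names and the string-to-string cache are assumptions made for illustration): every read and write is funneled through one dedicated thread, so the dictionary itself is never touched concurrently.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class SingleOwnerCache
{
    private readonly Dictionary<string, string> cache = new Dictionary<string, string>();
    private readonly BlockingCollection<Action> requests = new BlockingCollection<Action>();

    public SingleOwnerCache()
    {
        // One thread owns the dictionary; nobody else ever touches it.
        var owner = new Thread(() =>
        {
            foreach (var request in requests.GetConsumingEnumerable())
                request();
        });
        owner.IsBackground = true;
        owner.Start();
    }

    public Task SetAsync(string key, string value)
    {
        var done = new TaskCompletionSource<bool>();
        requests.Add(() => { cache[key] = value; done.SetResult(true); });
        return done.Task;
    }

    public Task<string> GetAsync(string key)
    {
        var done = new TaskCompletionSource<string>();
        requests.Add(() => done.SetResult(cache.TryGetValue(key, out var v) ? v : null));
        return done.Task;
    }
}

Callers just await GetAsync/SetAsync; only the owner thread ever sees the dictionary, which is roughly what the Erlang-style approach buys you.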
static doesn't necessarily mean global. Classes and members can be static and private, hence only applying to the specific class. That said, having too many public static members instead of using appropriate ways to pass data (method calls, callbacks, etc.) is generally bad design.
If you're trying to be purist about your OO development, then statics probably don't fit the mold.
However the real world is messier than theory, and statics are often a very useful way to solve some development problems. Use them when appropriate, and in moderation.
As an addition to whatever else is said, final static variables are just fine; constants are a good thing. The only exception is when/if you should just move them to a properties file so they're easier to change.
Mutable static variables are bad because they're just global state. The best discussion I know of about this is here, under the heading "Why Global Variables Should Be Avoided When Unnecessary".
Static methods have several drawbacks that often make them undesirable - the biggest one being that they cannot be used polymorphically.
First, why are the old global variables so bad? Because it is state that is accessible from anywhere, any time. Hard to track.
There are no such problems with static methods.
That leaves static fields (variables). If you declared a public static field in a class, that would truly be a global variable and it would be bad.
But make the static field private and most problems are solved. Or better, they are limited to the containing class and that makes them solvable.
public class Foo
{
    private static int counter = 0;

    // Exposes how many Foo instances have been constructed so far.
    public static int GetCounterValue()
    {
        return counter;
    }

    //...

    public Foo()
    {
        //other tasks
        counter++;
    }
}
In the code above you can see that we count how many Foo objects were created. This can be useful in many cases.
The static keyword does not mean global; it tells you that the member belongs to the class itself rather than to any instance, which can be very useful in various cases. So, in conclusion: class-level things are static, object-level things are not.
Static methods are used to implement traits in Scala. In C#, extension methods (which are static) fulfill that role in part. That could be seen, as the DCI proponents state, as a "higher order form of polymorphism".
Also, static methods can be used to implement functions. This is what F# uses to implement modules. (And also VB.NET.) Functions are useful for (unsurprisingly) functional-programming. And sometimes they're just the way something should be modeled (like the "functions" in the Math class). Again, C# comes close here.
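For instance, a C# extension method is nothing more than a static method with a bit of call-site sugar; a small sketch with invented names:

public static class StringExtensions
{
    // A plain static method, but the "this" modifier lets it be called
    // as if it were an instance method: "hello".Shout() returns "HELLO!"
    public static string Shout(this string text)
    {
        return text.ToUpperInvariant() + "!";
    }
}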
I think the bad thing about global variables is the idea of having global state - variables that can be manipulated anywhere and tend to cause unintended side effects in far-flung areas of a program.
Static data would be similar to global variables in that they introduce a kind of global state. Static methods though are not nearly as bad, assuming they are stateless.
Not entirely. Static actually determines when, where and how often something is instantiated, not who has access to it.
I have been looking for a neat answer to this design question with no success. I could not find help in either the ".NET Framework design guidelines" or the "C# programming guidelines".
I basically have to expose a pattern as an API so the users can define and integrate their algorithms into my framework like this:
1)
// This is what I provide
public abstract class AbstractDoSomething
{
    public abstract SomeThing DoSomething();
}
Users need to implement this abstract class; they have to implement the DoSomething method (which I can then call from within my framework and use).
2)
I found out that this can also be achieved by using delegates:
public sealed class DoSomething
{
    public String Id;
    public Func<SomeThing> DoSomethingFunc;
}
In this case, a user can only use the DoSomething class this way:
DoSomething doSomething = new DoSomething()
{
    Id = "ThisIsMyID",
    DoSomethingFunc = () => new SomeThing()
};
Question
Which of these two options is best for an easy, usable and most importantly understandable to expose as an API?
EDIT
In case of 1: the registration is done this way (assuming MyDoSomething extends AbstractDoSomething):
MyFramework.AddDoSomething("DoSomethingIdentifier", new MyDoSomething());
In case of 2: the registration is done like this:
MyFramework.AddDoSomething(new DoSomething());
Which of these two options is best for an easy, usable and most importantly understandable to expose as an API?
The first is more "traditional" in terms of OOP, and may be more understandable to many developers. It can also have advantages in terms of letting the user manage the lifetimes of the objects (i.e., you can let the class implement IDisposable and dispose of instances on shutdown, etc.), and it is easy to extend in future versions without breaking backwards compatibility, since adding virtual members to the base class won't break the API. Finally, it can be simpler to use if you want to compose this automatically with something like MEF, which can simplify or remove the "registration" step from the user's standpoint (they can just create the subclass, drop it in a folder, and have it discovered and used automatically).
The second is a more functional approach, and is simpler in many ways. This allows the user to implement your API with far fewer changes to their existing code, as they just need to wrap the necessary calls in a lambda with closures instead of creating a new type.
That being said, if you're going to take the approach of using a delegate, I wouldn't even make the user create a class - just use a method like:
MyFramework.AddOperation("ThisIsMyID", () => DoFoo());
This makes it a little bit more clear, in my opinion, that you're adding an operation to the system directly. It also completely eliminates the need for another type in your public API (DoSomething), which again simplifies the API.
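On the framework side, that registration can be as simple as a dictionary of named delegates. A sketch (MyFramework, AddOperation, and SomeThing follow the names used above; everything else, including the Run method, is assumed):

using System;
using System.Collections.Generic;

public static class MyFramework
{
    // SomeThing is the result type from the question.
    private static readonly Dictionary<string, Func<SomeThing>> operations =
        new Dictionary<string, Func<SomeThing>>();

    public static void AddOperation(string id, Func<SomeThing> operation)
    {
        operations[id] = operation;
    }

    public static SomeThing Run(string id)
    {
        return operations[id]();
    }
}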
I would go with the abstract class / interface if:
DoSomething is required
DoSomething will normally get really big (so DoSomething's implementation can be split into several private/protected methods)
I would go with delegates if:
DoSomething can be treated as an event (OnDoingSomething)
DoSomething is optional (so you default it to a no-op delegate)
Though personally, if it were in my hands, I would always go with the delegate model. I just love the simplicity and elegance of higher-order functions. But while implementing the model, be careful about memory leaks. Subscribed events are one of the most common causes of memory leaks in .NET. This means that if you have an object that exposes events, any object subscribed to those events will not be garbage-collected while the publisher is alive and the subscription is still in place, since the event holds a strong reference to the subscriber.
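A hedged illustration of that leak (both types here are invented): the publisher's event delegate holds a strong reference to the subscriber, so a long-lived publisher keeps every still-subscribed subscriber alive until it unsubscribes.

using System;

public class Publisher
{
    public event EventHandler SomethingHappened;

    public void Raise()
    {
        SomethingHappened?.Invoke(this, EventArgs.Empty);
    }
}

public class Subscriber
{
    private readonly Publisher publisher;

    public Subscriber(Publisher publisher)
    {
        this.publisher = publisher;
        // The publisher now holds a strong reference to this Subscriber.
        publisher.SomethingHappened += OnSomethingHappened;
    }

    private void OnSomethingHappened(object sender, EventArgs e) { /* react */ }

    // Without this, a long-lived Publisher keeps the Subscriber alive.
    public void Detach()
    {
        publisher.SomethingHappened -= OnSomethingHappened;
    }
}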
As is typical for most of these types of questions, I would say "it depends". :)
But I think the reason for using the abstract class versus the lambda really comes down to behavior. Usually, I think of the lambda being used as a callback type of functionality, where you'd like something custom to happen when something else happens. I do this a lot in my client-side code:
- make a service call
- get some data back
- now invoke my callback to handle that data accordingly
You can do the same with the lambdas -- they are specific and are targeted for very specific situations.
Using the abstract class (or interface) really comes down to where your class's behavior is driven by the environment around it: what's happening, what client am I dealing with, etc.? These larger questions could suggest that you should define a set of behaviors and then allow your developers (or consumers of your API) to create their own sets of behavior based upon their requirements. Granted, you could do the same with lambdas, but I think it would be more complex to develop and also more complex to clearly communicate to your users.
So, I guess my rough rule of thumb is:
- use lambdas for specific callback or side-effect customized behaviors;
- use abstract classes or interfaces to provide a mechanism for object behavior customization (or at least the majority of the object's primary behavior).
Sorry I can't give you a clear definition, but I hope this helps. Good luck!
A few things to consider :
How many different functions/delegates would need to be overridden? If many functions, inheritance will group "sets" of overrides in an easier-to-understand way. If you have a single "registration" function, but many sub-portions can be delegated out to the implementor, this is a classic case of the "Template Method" pattern, which makes the most sense to implement via inheritance.
How many different implementations of the same function will be needed? If just one, then inheritance is good, but if you have many implementations a delegate might save overhead.
If there are multiple implementations, will the program need to switch between them, or will it only ever use a single implementation? If switching is required, delegates might be easier, but I would be cautious here, especially depending on the answer to #1. See the Strategy pattern.
If the override needs access to any protected members, use inheritance. If it can rely only on public members, use a delegate.
Other choices would be events, and extension methods as well.
When it comes to designing classes and "communication" between them, I always try to design them in such a way that all object construction and composition take place in the object's constructor. I don't like the idea of object construction and composition happening from outside, i.e. other objects setting properties and calling methods on my object to initialize it. This gets especially ugly when multiple objects try to do this to your object and you never know in what order your properties/methods will be called.
Unfortunately, I stumble upon such situations quite often, especially now with the growing popularity of dependency injection frameworks: lots of libraries and frameworks rely on some kind of external object initialization, and quite often require not only constructor injection but property injection too.
My questions are:
Is it OK to have objects that rely on some method or property being called on them, after which they can consider themselves initialized?
Is there some kind of pattern for situations where your object acts as a receiver and must support multiple interfaces that call it, and the order of these calls matters? (Something better than setting flags like ThisWasDone, ThatWasCalled.)
Is it OK to have objects that rely on some method or property being called on them, after which they can consider themselves initialized?
No. Init methods are a pain since there is no guarantee that they will get called. A simple solution is to switch to interfaces and use factory or builder pattern to compose the implementation.
Mark Seemann has written an article about it: http://blog.ploeh.dk/2011/05/24/DesignSmellTemporalCoupling.aspx
Is there some kind of pattern for situations where your object acts as a receiver and must support multiple interfaces that call it, and the order of these calls matters? (Something better than setting flags like ThisWasDone, ThatWasCalled.)
Builder pattern.
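For the temporal-coupling case, a minimal builder sketch (every name here is invented): the builder collects the pieces, and the object it produces is complete from the moment it exists, so there is no "remember to call Init()" window.

public sealed class Connection
{
    public string Host { get; }
    public int Port { get; }

    // The only way in is through the builder, so a Connection is always complete.
    internal Connection(string host, int port)
    {
        Host = host;
        Port = port;
    }
}

public sealed class ConnectionBuilder
{
    private string host = "localhost";
    private int port = 80;

    public ConnectionBuilder WithHost(string host) { this.host = host; return this; }
    public ConnectionBuilder WithPort(int port) { this.port = port; return this; }

    public Connection Build() => new Connection(host, port);
}

// Usage: var conn = new ConnectionBuilder().WithHost("example.org").WithPort(8080).Build();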
I think it is OK, but there are implications. If this is an object to be used by others, you need to ensure that an exception is thrown any time a method or property is set or accessed before the required initialization has been performed.
Obviously it is much more convenient and intuitive if you can take care of this in the constructor; then you don't have to implement these checks.
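If you are stuck with a separate Init step anyway, a guard along these lines (all names are hypothetical) at least turns misuse into an immediate, explicit failure instead of a subtle bug:

using System;

public class Worker
{
    private bool initialized;
    private string connectionString;

    public void Init(string connectionString)
    {
        this.connectionString = connectionString;
        initialized = true;
    }

    public void DoWork()
    {
        if (!initialized)
            throw new InvalidOperationException("Call Init() before DoWork().");
        // ... actual work using connectionString ...
    }
}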
I don't see anything wrong with this. It may not be so convenient, but you cannot ALWAYS use initialization in the ctor, just like you cannot always drive on a green light. These are decisions that you make based on your app's requirements.
It's OK. Imagine your object needs, for example, to read data from a TCP stream or a file that could be missing or corrupted. Raising an exception from the ctor is bad.
It's OK. If you think, for example, about a compiler for some DSL of yours, it can look like:
A) find all global variables and check whether the sum of their memory allocations satisfies your device requirements
B) parse for errors
C) check for self cycling
And so on...
Hope this helps.
Answering (1)
Why not? An engine needs the driver, because the driver must insert the key and then power the car on. Will a car do things like detecting current speed if the engine is stopped? Will the car show the remaining oil without being powered on?
Some programming goals won't be able to have their actors initialized during object construction, and this isn't because it's an improper way of doing things but because it's the natural, regular and/or semantically sensible way of representing their whole behavior.
Answering (2)
A decent class-usage documentation will be your best friend. As with the answer to (1), there are some things in this world that must be done in a certain order to get them done right, and that's not a problem but a requirement.
Checking an object's state using flags isn't a problem either; it's a good way of adding reliability to your object models, because the object's own behaviors and the consumers of them will be aware of whether things got done as expected or not.
First of all, Factory Method.
public class MyClass
{
    private MyClass()
    {
    }

    public static MyClass Create()
    {
        return new MyClass();
    }
}
Second of all, why do you not want another class creating an object for you? (Factory)
public class MyThingFactory
{
    public IThing CreateThing(Speed speed)
    {
        if (speed == Speed.Fast)
        {
            return new FastThing();
        }
        return new SlowThing();
    }
}
Third, why do multiple classes have side effects on new instances of your class? Don't you have declarative control over what other classes have access to your object?
When they say static classes should not have state/side effects, does that mean that this:
static void F(Human h)
{
    h.Name = "asd";
}
is violating it?
Edit:
I have a private variable now called p, which is an integer. It's never read at all throughout the entire program, so it can't affect any program flow.
Is this violating "no side effects"?
static int p;

static void F(Human h)
{
    p = 123;
    h.Name = "asd";
}
The input and output are still always the same in this case.
When you say "they", who are you refering to?
Anyways, moving on. A method such as what you presented is completely fine - if that's what you want it to do, then OK. No worries.
Similarly, it is completely valid for a static class to have some static state. Again, it could be that you would need that at some point.
The real thing to watch out for is something like
static class A
{
    private static int x = InitX();

    static A()
    {
        Console.WriteLine("A()");
    }

    private static int InitX()
    {
        Console.WriteLine("InitX()");
        return 0;
    }

    // ...
}
If you use something along these lines, then you could easily be confused about when the static constructor is called and when InitX() is called. If you have side effects / state changes that occur like in this example, that would be bad practice.
But as far as your actual question goes, those kinds of state changes and side effects are fine.
Edit
Looking at your second example, and taking the rule precisely as it is stated, then, yes, you are in violation of it.
But...
Don't let that rule necessarily stop you from doing things like this. It can be very useful in some cases; e.g., when a method does an intensive calculation, memoization is an easy way to reduce the performance cost. While memoization technically has state and side effects, the output is always the same for every input, which is what really matters.
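A small sketch of that kind of memoization (the Fibonacci function is just a stand-in for any expensive calculation): the static dictionary is mutable state, but callers still observe a pure input-to-output mapping.

using System.Collections.Concurrent;

public static class Fib
{
    private static readonly ConcurrentDictionary<int, long> cache =
        new ConcurrentDictionary<int, long>();

    // Same input always yields the same output; the cache only changes how
    // much work is done, not what the result is.
    public static long Compute(int n)
    {
        return cache.GetOrAdd(n, k => k < 2 ? k : Compute(k - 1) + Compute(k - 2));
    }
}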
A side effect of a static member would mean that it changes the value of some other member in its containing class. The static member in your case does not affect other members of its class, so it is not violating the sentence you have mentioned.
EDIT
In the second example you've added by editing your question, you are violating it.
It is perfectly acceptable for methods of a static class to change the state of objects that are passed to them. Indeed, that is the primary use for non-function static methods (since a non-function method which doesn't change the state of something would be pretty useless).
The pattern to be avoided is having a static class where methods have side-effects that are not limited to the passed-in objects or objects referenced by them. Suppose, for example, one had an embroidery-plotting class which had functions to select an embroidery module, and to scale, translate, or rotate future graphic operations. If multiple routines expect to do some drawing, it could be difficult to prevent device-selections or transformations done by one routine from affecting other routines. There are two common ways to resolve this problem:
Have all the static graphic routines accept a parameter which will hold a handle to the current device and world transform.
Have a non-static class which holds a device handle and world transform, and have it expose a full set of graphic methods.
In many cases, the best solution will be to have a class which uses the second approach for its external interface, but possibly uses the first method internally. The first approach is somewhat better with regard to the Single Responsibility Principle, but from an external calling standpoint, using class methods is often nicer than using static ones.
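A rough sketch of that combination (the plotting domain and every name here are assumptions): the instance class owns the device handle and transform and exposes the friendly API, while internally it forwards to static routines that receive that state as an explicit parameter.

using System;

public sealed class PlotterState
{
    public IntPtr DeviceHandle { get; set; }
    public double Scale { get; set; } = 1.0;
}

public static class PlotterCore
{
    // All state arrives through the parameter; nothing static is read or written.
    public static void DrawLine(PlotterState state, double x1, double y1, double x2, double y2)
    {
        // ... issue drawing commands to state.DeviceHandle, applying state.Scale ...
    }
}

public sealed class Plotter
{
    private readonly PlotterState state = new PlotterState();

    public void SetScale(double scale)
    {
        state.Scale = scale;
    }

    public void DrawLine(double x1, double y1, double x2, double y2)
    {
        PlotterCore.DrawLine(state, x1, y1, x2, y2);
    }
}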
I'm working on a class library and have opted for a route with my design to make implementation and thread safety slightly easier, however I'm wondering if there might be a better approach.
A brief background is that I have a multi-threaded heuristic algorithm within a class library that, once set up with a scenario, should attempt to solve it. However, I obviously want it to be thread safe, and I don't want a change that someone makes to anything while it is solving to cause crashes or errors.
The current approach I've got is that if I have a class A, then I create a number of InternalA instances for each A instance. The InternalA has many of the important properties from the A class, but is internal and inaccessible outside the library.
The downside of this is that if I wish to extend the decision-making logic (or actually let someone do this outside the library), then I need to change the code within InternalA (or provide some sort of delegate function).
Does this sound like the right approach?
It's hard to really say from just that, but I can say that if you can make everything immutable, your life will be a lot easier. Look at how functional languages approach immutable data structures and collections. The less shared mutable data you have, the simpler threading will be.
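As a hedged sketch of what "make it immutable" can look like here (the Route type is invented, not part of your library): every "change" yields a new object, so a solver thread can hold its snapshot without worrying that another thread will mutate it.

using System.Collections.Generic;
using System.Linq;

public sealed class Route
{
    private readonly List<int> stops;

    public Route(IEnumerable<int> stops)
    {
        this.stops = stops.ToList();   // private copy, never exposed for mutation
    }

    public IReadOnlyList<int> Stops
    {
        get { return stops; }
    }

    // "Modification" returns a fresh Route; the original is never touched.
    public Route WithStopAppended(int stop)
    {
        return new Route(stops.Concat(new[] { stop }));
    }
}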
Why not? Create a generic class that accepts a locking strategy, i.e. a class with two members (e.g. Lock/Unlock), so you could provide:
a thread-safe implementation (the implementation can use Monitor.Enter/Exit inside),
a system-wide safe implementation (using a Mutex),
an unsafe but fast implementation (using an empty implementation).
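Here is a rough sketch of that idea (every name is invented): the container takes a locking strategy with Lock/Unlock members, so the same code can be made thread-safe, system-wide safe (a Mutex-based strategy would slot in the same way), or left unlocked for speed.

using System.Threading;

public interface ILockStrategy
{
    void Lock();
    void Unlock();
}

// Thread-safe within a process, via Monitor.Enter/Exit.
public sealed class MonitorLock : ILockStrategy
{
    private readonly object gate = new object();
    public void Lock() { Monitor.Enter(gate); }
    public void Unlock() { Monitor.Exit(gate); }
}

// Unsafe but fast: the empty implementation.
public sealed class NoLock : ILockStrategy
{
    public void Lock() { }
    public void Unlock() { }
}

public sealed class Guarded<T>
{
    private readonly ILockStrategy locking;
    private T value;

    public Guarded(T value, ILockStrategy locking)
    {
        this.value = value;
        this.locking = locking;
    }

    public T Get()
    {
        locking.Lock();
        try { return value; }
        finally { locking.Unlock(); }
    }

    public void Set(T newValue)
    {
        locking.Lock();
        try { value = newValue; }
        finally { locking.Unlock(); }
    }
}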
Another way I have had some success with is using interfaces to achieve functional separation. The cost of this approach is that you end up with some fields "repeated", because each interface requires total separation from the other interfaces' fields.
In my case I had two threads that need to pass over a set of data that is potentially large and needs as little garbage collection as possible. I.e. I only want to pass change information from the first stage to the second, and then have the first process the next work unit.
This was achieved by the use of change buffers to pass changes from one interface to the next.
This allows one thread to work away at one interface, make all its changes, and then publish a struct containing the changes that the other interface (thread) needs to apply prior to its work.
By doing this you have a double buffer: thread 1 produces a change report whilst thread 2 consumes the last report. If you add more interfaces (and threads), it appears as if there are pulses of work moving through the threads.
This was based on my research and I have no doubt that there are better methods available now.
My aim when coming up with this, however, was to avoid the need for locks in the vast majority of code by designing out race conditions. The other major consideration is performance in garbage collection, which may not be an issue for you.
This way is all good until you need complex interactions between threads... then you find that you start forcing the layout of your buffer structures for reuse, to get around inheritance, which in turn has an upkeep overhead.
A little more information on the problem to help...
The heuristic I'm using is to solve TSP-like problems. What happens right at the start of each calculation is that all the aspects that form the problem (salesman/places to visit) are cloned so they aren't affected across threads.
This means each thread can change data (such as stock left on a salesman, etc.), as there are a number of values that change during the calculation as things progress. What I'd quite like to do is allow checks such as HasSufficientStock() (for a simple example) to be overridden by a developer using the library.
Unfortunately, at present, to add further protection across threads and make some simpler/lightweight classes, I convert them to these internal classes, and those are the things that are actually used and cloned.
For example
class A
{
    public double Stock { get; }

    // Processing and cloning actually work using these InternalA's
    internal InternalA ConvertToInternal() => new InternalA { Stock = Stock };
}

internal class InternalA : ICloneable
{
    public double Stock { get; set; }

    public bool HasSufficientStock() => Stock > 0;   // placeholder body; real check elided

    public object Clone() => MemberwiseClone();      // required by ICloneable
}
I found a while ago (and I want to confirm again) that if you declare a class level variable, you should not call its constructor until the class constructor or load has been called. The reason was performance - but are there other reasons to do or not do this? Are there exceptions to this rule?
ie: this is what I do based on what I think the best practice is:
public class SomeClass
{
    private PersonObject _person;

    public SomeClass()
    {
        _person = new PersonObject("Smitface");
    }
}
as opposed to:
public class SomeClass
{
    private PersonObject _person = new PersonObject("Smitface");

    public SomeClass()
    {
    }
}
If you set your variable outside of the constructor, then there is no error handling available. While in your example it makes no difference, there are many cases where you may want some sort of error handling. In that case, using your first option would be correct.
Nescio talked about what implications this would have on your application if there were some constructor failures.
For that reason, I always use Option #1.
Honestly, if you look at the IL, all that happens in the second case is the compiler moves the initialization to the constructor for you.
Personally, I like to see all initialization done in the constructor. If I'm working on a throwaway prototype project, I don't mind having the initialization and declaration in the same spot, but for my "I want to keep this" projects I do it all in the constructor.
Actually, in spite of what others have said, it can be important whether your initialization is inside or outside the constructor, as there is different behaviour during object construction if the object is in a hierarchy (i.e. the order in which things get run is different).
See this post and this post from Eric Lippert which explains the semantic difference between the two in more detail.
So the answer is that in the majority of cases it doesn't make any difference, and it certainly doesn't make any difference in terms of performance, but in a minority of cases it could make a difference, and you should know why, and make a decision based on that.
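The classic way to see that difference is a virtual call from a base constructor (the classes below are hypothetical): derived field initializers run before the base constructor, while the derived constructor body runs after it, so only the first assignment is visible at that point.

using System;

class Base
{
    protected Base()
    {
        // Virtual dispatch reaches Derived.Describe before Derived's constructor body has run.
        Describe();
    }

    protected virtual void Describe() { }
}

class Derived : Base
{
    private string byInitializer = "set by field initializer";
    private string byCtorBody;

    public Derived()
    {
        byCtorBody = "set in constructor body";
    }

    protected override void Describe()
    {
        Console.WriteLine(byInitializer ?? "(null)");   // prints "set by field initializer"
        Console.WriteLine(byCtorBody ?? "(null)");      // prints "(null)"
    }
}

class Program
{
    static void Main()
    {
        new Derived();   // triggers the output described above
    }
}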
There's a common pattern called Dependency Injection or Inversion of Control (IoC) that offers exactly these two mechanisms for "injecting" a dependent object (like a DAL class) into a class that's further up the dependency chain (further from the database).
In this pattern, using a ctor, you would:
public class SomeClass
{
    private PersonObject per;

    public SomeClass(PersonObject person)
    {
        per = person;
    }
}
private PersonObject Joe = new PersonObject("Smitface");
SomeClass MyObj = new SomeClass(Joe);
Now you could, for example, pass in a real DAL class for a production call, or a test DAL class in a unit test method...
It depends on the context of how the variable will be used. Naturally, constants and static or readonly fields should be initialized on declaration; otherwise they typically should be initialized in the constructor. That way you can swap out design patterns for how your objects are instantiated fairly easily, without having to worry about when the variables will be initialized.
You should generally prefer the second variant. It's more robust to changes in your code. Suppose you add a constructor. Now you have to remember to initialize your variables there as well, unless you use the second variant.
Of course, this only counts if there are no compelling reasons to use in-constructor initialization (as mentioned by discorax).
I prefer the latter, but only because I find it neater.
It's personal preference really, they both do the same thing.
I like to initialize in the constructor because that way it all happens in one place, and also because it makes it easier if you decide to create an overloaded constructor later on.
Also, it helps to remind me of things that I'll want to clean up in a destructor/finalizer.
The latter can take advantage of lazy instantiation, i.e. it won't initialize the variable until it is referenced.
I think this type of question is stylistic only, so who cares which way you do it. The language allows both, so other people are going to do both. Don't make bad assumptions.
The first declaration is actually cleaner. The second conceals the fact that the initialization still happens in the constructor (for a static field, in the static constructor). If for any reason that constructor fails, the whole type is unusable for the rest of the application.
I prefer to initialize variables as soon as possible, since it avoids (some) null errors.
Edit: Obviously in this simplified example there is no difference; however, in the general case I believe it is good practice to initialize class variables where they are declared, if possible. That makes it impossible to refer to a variable before it is initialized, which eliminates some errors that are possible when you initialize fields in the constructor.
As you get more class variables and the initialization sequence in the constructor gets more complex, it gets easier to introduce bugs where one initialization depends on another initialization that hasn't happened yet.