Having spent a bit of time learning about functional programming, it's becoming more and more natural for me to want to work with static methods that don't perform any mutation.
Are there any reasons why I should curb this instinct?
The question I find a bit odd, because static methods and methods that perform no mutations are two orthogonal classifications of methods. You can have mutating static methods and nonmutating instance methods.
For me, it has become more and more natural to combine functional and OO programming; I like instance methods that perform no mutations. Functional programming is easier to understand because it discourages complex mutations; OO programming is easier to understand because the code and the data it operates on are close together. Why choose? Embrace the power of "and"; do both!
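As a minimal sketch of that combination, here is an instance method that performs no mutation. The Money class and its members are hypothetical names chosen for illustration; they do not come from the answer above.

```csharp
using System;

// Hypothetical value class, used only to illustrate the idea.
public sealed class Money
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    // An instance method (the data and the code live together),
    // but it mutates nothing: it returns a new object instead.
    public Money Add(Money other)
    {
        if (other.Currency != Currency)
            throw new InvalidOperationException("Currency mismatch.");
        return new Money(Amount + other.Amount, Currency);
    }
}
```

Calling a.Add(b) leaves both a and b untouched, so the method is as easy to reason about as a static pure function while still reading like ordinary OO code.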
You can write working programs this way, but it's not idiomatic. If you want to work on a team, I'd try to curb it. If no one else is reading your code, go nuts.
For some reason I think of this quote when I read your question:
You can write Fortran in any language.
If the intent of C# were to be purely functional, static would be unnecessary because everything would be static by default. If you are strict about following OOP practices and the SOLID principles, your code effectively becomes functional (I know there's a quote out there about this somewhere) so you end up getting the best of both worlds.
The reason I would curb it in a multi-user project would be that it's not typical C# (it's really C# with handcuffs). You just need one person to break the rule and declare a static mutable property and everything goes to hell.
Not completely. I do like my extension methods, and LINQ, but an OO language should be used in an OO fashion. Besides, it's all imperative at the CPU, and for several layers on top of that.
Great question.
I think the answer depends on the context of what your code does, and how much of it is static.
My code has been seeing fewer static cases since I program to interfaces a lot now and mark some methods 'protected virtual' rather than 'static' for cases such as unit testing's extract and override pattern. That's not to say you couldn't just call static methods from those, though.
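A rough sketch of the extract and override pattern mentioned above, assuming a hypothetical InvoiceService that depends on the current time. All class and method names here are illustrative.

```csharp
using System;

// Hypothetical class; the names are illustrative only.
public class InvoiceService
{
    public bool IsOverdue(DateTime dueDate)
    {
        return GetNow() > dueDate;
    }

    // The static call (DateTime.UtcNow) is extracted into a
    // protected virtual method so a test subclass can override it.
    protected virtual DateTime GetNow()
    {
        return DateTime.UtcNow;
    }
}

// Test double that pins the clock to a fixed value.
public class TestableInvoiceService : InvoiceService
{
    private readonly DateTime _now;

    public TestableInvoiceService(DateTime now) { _now = now; }

    protected override DateTime GetNow() { return _now; }
}
```

The production class still calls a static member internally; only the seam around it is virtual.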
There are many reasons to curb this instinct when you write OOP code.
The main one concerns object state and behaviour:
A static class or method needs everything passed in as parameters, and you have to be sure which parameters to pass to it. An instance manages this itself when the method is related to its state.
I think this reason alone is enough to curb the use of the static modifier.
And for a clearer rationale, we should listen to Mrs. Liskov.
I often find myself in a situation where I am repeating two or three lines of code in a method multiple times and then wonder whether I should put them in a separate method to avoid code duplication. But when I move those lines out of the method, I find that the method I just created is not reusable, was used only once, or requires an overload to be useful for another method.
My question is: what sort of patterns should we be looking for that indicate we should create a new method? I appreciate your response.
Don't put too much functionality in one method/class. Try to follow the single responsibility principle. It'll take some time getting familiar with that approach. But once you reach that level, you'll notice that it's done all by itself. Before coding, try to ask yourself, what functional units your concept includes.
For example, you want to develop an application, that can index the content of pdf files. It's just fictional, but at first sight, I could identify at least three components:
PdfParser - this provides you with the content of a pdf
Indexer - gets input from parser and counts meaningful words
Repository - it's for persistence; this could be made generic; so just say repository.Get<IndexData>(filename) or something
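The three components above could be sketched roughly as follows. Every type and member name here is hypothetical, and "meaningful words" is simplified to "words longer than three characters", which is an assumption made only to keep the example short.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contracts following the fictional example above.
public interface IPdfParser
{
    string ExtractText(string filename);
}

public interface IRepository
{
    T Get<T>(string key) where T : class;
    void Save<T>(string key, T value) where T : class;
}

public sealed class IndexData
{
    public IDictionary<string, int> WordCounts { get; }

    public IndexData(IDictionary<string, int> wordCounts)
    {
        WordCounts = wordCounts;
    }
}

public sealed class Indexer
{
    // Counts words in the parser's output. "Meaningful" is assumed
    // to mean "longer than three characters" for this sketch.
    public IndexData Index(string text)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        var words = text.Split(new[] { ' ', '\t', '\r', '\n' },
                               StringSplitOptions.RemoveEmptyEntries);
        foreach (var word in words)
        {
            if (word.Length <= 3) continue;
            int current;
            counts.TryGetValue(word, out current); // current is 0 when the key is missing
            counts[word] = current + 1;
        }
        return new IndexData(counts);
    }
}
```

Each type has one job, so the parser or the storage can be swapped without touching the indexing logic.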
You should also try to code against interfaces. Especially when some kind of UI is involved. For example, you are developing a chat client with WinForms. If you follow the MVC/MVVM-pattern, you can easily (i.e., easier than coding against a Form object) use your original logic with a WPF version of the client.
I would start by reading about the DRY principle (Don't Repeat Yourself); hopefully it will give you a good answer to your question, which, by the way, is one that all developers should be asking themselves. Great question!
See Don't repeat yourself
I wanted to leave it at DRY because it is such a simple but powerful concept that will need some reading and a lot of practice to get good at. But let me try to answer your question directly (IMHO):
If you can't give your method a name that reflects exactly what your method is doing, break it into pieces that have meaning.
You'll find yourself DRYing up your code with ease, reusable pieces will show up, and you probably will never find yourself repeating code.
I would do this even if it meant having methods with only a couple of lines of code.
Following this practice will give meaning to your code and make it readable, predictable, and definitely more reusable.
If the lines of code that you intend to move to another method perform a specific set of actions (like read a file, calculate a value, etc.) then it is best to refactor into another helper method. Again, do this only if the helper method is being called at several places in your code or if your caller method is too long (definition of too long depends on the developer).
Similar questions
How do programmers practice code reuse
What techniques do you use to maximise code reuse?
Code Reusability: Is it worth it?
Coding Priorities: Performance, Maintainability, Reusability?
As a general rule, always think of those situations as functional entities. If a piece of code functionally performs a task (complex string conversion, parsing, etc.), you should write a reusable method.
If that function is specific to a certain type, then write an extension method.
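For instance, a small string helper could be written as an extension method like this. Truncate is a hypothetical helper invented for the example, not a framework method.

```csharp
using System;

public static class StringExtensions
{
    // Hypothetical helper: returns at most maxLength characters of value.
    public static string Truncate(this string value, int maxLength)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (maxLength < 0) throw new ArgumentOutOfRangeException(nameof(maxLength));
        return value.Length <= maxLength ? value : value.Substring(0, maxLength);
    }
}
```

It is still a static method underneath, but callers can write text.Truncate(10) as if it were declared on string.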
You could create a local variable inside your function of type Action<> or Func<> and assign the code snippet to it. Then you can use it everywhere inside your function without polluting your class with too many little helper functions.
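A minimal sketch of that approach, with hypothetical names. The delegate is used twice inside the method but never becomes a member of the class.

```csharp
using System;
using System.Globalization;

public static class ReportFormatter
{
    public static string FormatRange(double low, double high)
    {
        // Local delegate: reusable inside this method, invisible outside it.
        Func<double, string> fmt =
            v => v.ToString("F2", CultureInfo.InvariantCulture);

        return fmt(low) + " to " + fmt(high);
    }
}
```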
If you build a method for reusability, but don't use it in more than one place, then the reusability of your method isn't really verified.
Extract methods when it makes sense, and redesign those methods for reusability when you actually have the opportunity to reuse code.
I'm pretty new to C# so bear with me.
One of the first things I noticed about C# is that many of the classes are static method heavy. For example...
Why is it:
Array.ForEach(arr, proc)
instead of:
arr.ForEach(proc)
And why is it:
Array.Sort(arr)
instead of:
arr.Sort()
Feel free to point me to some FAQ on the net. If a detailed answer is in some book somewhere, I'd welcome a pointer to that as well. I'm looking for the definitive answer on this, but your speculation is welcome.
Because those are utility classes. The class construction is just a way to group them together, considering there are no free functions in C#.
Assuming this answer is correct, instance methods require additional space in a "method table." Making array methods static may have been an early space-saving decision.
This, along with avoiding the this pointer check that Amitd references, could provide significant performance gains for something as ubiquitous as arrays.
Also see this rule from FxCop:
CA1822: Mark members as static
Rule Description
Members that do not access instance data or call instance methods can be marked as static (Shared in Visual Basic). After you mark the methods as static, the compiler will emit nonvirtual call sites to these members. Emitting nonvirtual call sites will prevent a check at runtime for each call that makes sure that the current object pointer is non-null. This can achieve a measurable performance gain for performance-sensitive code. In some cases, the failure to access the current object instance represents a correctness issue.
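To make the rule concrete, here is a small sketch with hypothetical names: one method that uses instance data and one that does not, which is the kind of member CA1822 is about.

```csharp
public class TemperatureConverter
{
    private readonly string _label;

    public TemperatureConverter(string label) { _label = label; }

    // Reads instance data (_label), so it has to stay an instance method.
    public string Label()
    {
        return _label;
    }

    // Touches no instance data and calls no instance methods,
    // so it can be marked static as the rule suggests.
    public static double ToFahrenheit(double celsius)
    {
        return celsius * 9.0 / 5.0 + 32.0;
    }
}
```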
Perceived functionality.
"Utility" functions are unlike much of the functionality OO is meant to target.
Think about the case with collections, I/O, math and just about all utility.
With OO you generally model your domain. None of those things really fit in your domain--it's not like you are coding and go "Oh, we need to order a new hashtable, ours is getting full". Utility stuff often just doesn't fit.
We get pretty close, but it's still not very OO to pass around collections (where is your business logic? where do you put the methods that manipulate your collection and that other little piece or two of data you are always passing around with it?)
Same with numbers and math. It's kind of tough to have Integer.sqrt() and Long.sqrt() and Float.sqrt()--it just doesn't make sense, nor does "new Math().sqrt()". There are a lot of areas it just doesn't model well. If you are looking for mathematical modeling then OO may not be your best bet. (I made a pretty complete "Complex" and "Matrix" class in Java and made them fairly OO, but making them really taught me some of the limits of OO and Java--I ended up "Using" the classes from Groovy mostly)
I've never seen anything anywhere NEAR as good as OO for modeling your business logic, being able to demonstrate the connections between code and managing your relationship between data and code though.
So we fall back on a different model when it makes more sense to do so.
The classic motivations against static:
Hard to test
Not thread-safe
Increases code size in memory
1) C# has several tools available that make testing static methods relatively easy. A comparison of C# mocking tools, some of which support static mocking: https://stackoverflow.com/questions/64242/rhino-mocks-typemock-moq-or-nmock-which-one-do-you-use-and-why
2) There are well-known, performant ways to do static object creation/logic without losing thread safety in C#. For example implementing the Singleton pattern with a static class in C# (you can jump to the fifth method if the inadequate options bore you): http://www.yoda.arachsys.com/csharp/singleton.html
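One commonly used way to get thread-safe, lazy static creation is System.Lazy<T>. This is a sketch with a hypothetical Configuration class, and it is not necessarily the variant the linked article lists fifth.

```csharp
using System;

public sealed class Configuration
{
    // Lazy<T> constructed with only a factory uses
    // LazyThreadSafetyMode.ExecutionAndPublication, so the factory
    // runs at most once even when several threads race on Instance.
    private static readonly Lazy<Configuration> LazyInstance =
        new Lazy<Configuration>(() => new Configuration());

    public static Configuration Instance
    {
        get { return LazyInstance.Value; }
    }

    private Configuration() { }
}
```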
3) As #K-ballo mentions, every method contributes to code size in memory in C#, rather than instance methods getting special treatment.
That said, the 2 specific examples you pointed out are just a problem of legacy code support for the static Array class before generics and some other code sugar was introduced back in C# 1.0 days, as #Inerdia said. I tried to answer assuming you had more code you were referring to, possibly including outside libraries.
The Array class isn't generic and can't be made fully generic because this would break backwards compatibility. There's some sort of magic going on where arrays implement IList<T>, but that's only for single-dimension arrays with a lower bound of 0 – "list-ish" arrays.
I'm guessing the static methods are the only way to add generic methods that work over any shape of array regardless of whether it qualifies for the above-mentioned compiler magic.
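Both points can be seen in a short sketch. ArrayDemo and its methods are hypothetical; Array.Sort and the array-to-IList<T> conversion are the real framework features being illustrated.

```csharp
using System;
using System.Collections.Generic;

public static class ArrayDemo
{
    public static int[] SortedCopy(int[] source)
    {
        var copy = (int[])source.Clone();
        Array.Sort(copy);            // static method on System.Array
        return copy;
    }

    public static int CountViaIList(int[] source)
    {
        // Works because a single-dimension, zero-based array
        // can be used as an IList<T>.
        IList<int> list = source;
        return list.Count;
    }
}
```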
So yeah, the question basically says it all. What do you gain when you ensure that private members / methods / whatever are marked private (or protected, or public, or internal, etc) appropriately?
I mean, of course I could just go and mark all my methods as public and everything should still work fine. Of course, if we'd talk about good programming practice (which I am a solid advocate of, by the way), I'd mark a method as private if it should be marked as such, no questions asked.
But let's set aside good programming practice, and just look at this in terms of actual quantitative gain. What do I get for proper scoping of my methods, members, classes, etc.?
I'm thinking that this would most generally translate to performance gains, but I'd appreciate it if someone could provide more detail about it.
(For purposes of this question, I'm thinking more along C#.NET, but hey, feel free to provide answers on whatever language / framework you deem fit.)
EDIT: Most pointed out that this doesn't lead to performance gain, and yeah, thinking back, I don't even know why I thought that. Lack of coffee probably.
In any case, any good programmer should know about how proper scopes (1) help your code maintenance / (2) control the proper use of your library / app / package; I was kinda curious as to whether or not there was any other benefit you get from it that's not apparently obvious outright. Based on the answers below, it looks like it basically sums up to just those two things most importantly.
Performance has absolutely nothing to do with the visibility of methods. Virtual methods have some overhead, but that's not why we scope. It has to do with maintenance of code. Your public methods are the API to your class or library. You as a class designer want to provide some guarantee to the outside world that future changes aren't going to break other peoples code. By marking some methods private, you take away the ability for users to depend on certain implementations which allows you freedom to change that implementation at will.
Even languages that don't have visibility modifiers, like python, have conventions for marking methods as internal and subject to change. By prefixing the method with an _underscore(), you're signalling to the outside world that if you use that method, you do so at your own risk, as it can change at any time.
On the other hand, public methods are an explicit entry way into your code. Every effort should go towards making public methods backward compatible to avoid the pitfalls I described above.
By better encapsulation, you provide a better API. Only the methods and properties that are of interest to the user of your class are visible.
Next to that, you ensure that members that should not be called or modified cannot be called or modified.
That's the most important thing. Why do you think this would lead to performance gains?
As I see it, you gain two important benefits from proper scoping. First, your API is reduced in size and clearly focused on the task at hand.
Second, you get a less brittle implementation as you are free to change implementation details without altering the exposed API.
I cannot see how accessibility modifiers would affect performance in any way.
There are mainly two types of methods/properties.
That are helpful to perform a task to whoever consumes it. (Recommended Scope: Public)
That are helpful to the above methods to get their task done. (Recommended Scope: Private or Protected)
Type 1 methods are the only methods that client code requires; it does not need any others. This avoids confusion, keeps things simple and prevents client code from doing something wrong.
Type 2 methods are methods into which Type 1 methods are divided. They help Type 1 methods to complete their task and still allow them to be simple, concise, less complex and more readable. They are not really needed for client code but just the class/module itself.
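A small sketch of the two types, using a hypothetical PasswordPolicy class. The specific rules (eight characters, one digit) are arbitrary choices for the example.

```csharp
using System;

public class PasswordPolicy
{
    // Type 1: the only member client code needs to know about.
    public bool IsAcceptable(string password)
    {
        return password != null
            && HasMinimumLength(password)
            && HasDigit(password);
    }

    // Type 2: helpers that keep the public method short. They are
    // private, so they can change without breaking any caller.
    private static bool HasMinimumLength(string password)
    {
        return password.Length >= 8;
    }

    private static bool HasDigit(string password)
    {
        foreach (var c in password)
        {
            if (char.IsDigit(c)) return true;
        }
        return false;
    }
}
```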
A fair example would be of a car. What you have is a gas pedal, brakes, gearbox, etc. You don't have an interface to minor details for what is under the hood. That is for the mechanic.
In C# programming, it helps to make sure that your API/classes/methods/members are "easy to use correctly and difficult to use incorrectly".
I know this has been discussed many times, but I am not sure I really understand why Java and C# designers chose to omit this feature from these languages. I am not interested in how I can make workarounds (using interfaces, cloning, or any other alternative), but rather in the rationale behind the decision.
From a language design perspective, why has this feature been declined?
P.S: I'm using words such as "omitted", which some people may find inadequate, as C# was designed in an additive (rather than subtractive) approach. However, I am using such words because the feature existed in C++ before these languages were designed, so it is omitted in the sense of being removed from a programmer's toolbox.
In this interview, Anders said:
Anders Hejlsberg: Yes. With respect to const, it's interesting, because we hear that complaint all the time too: "Why don't you have const?" Implicit in the question is, "Why don't you have const that is enforced by the runtime?" That's really what people are asking, although they don't come out and say it that way.
The reason that const works in C++ is because you can cast it away. If you couldn't cast it away, then your world would suck. If you declare a method that takes a const Bla, you could pass it a non-const Bla. But if it's the other way around you can't. If you declare a method that takes a non-const Bla, you can't pass it a const Bla. So now you're stuck. So you gradually need a const version of everything that isn't const, and you end up with a shadow world. In C++ you get away with it, because as with anything in C++ it is purely optional whether you want this check or not. You can just whack the constness away if you don't like it.
I guess primarily because:
it can't properly be enforced, even in C++ (you can cast it)
a single const at the bottom can force a whole chain of const in the call tree
Both can be problematic. But especially the first: if it can't be guaranteed, what use is it? Better options might be:
immutable types (either full immutability, or popsicle immutability)
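Popsicle immutability means an object is mutable until it is frozen and read-only afterwards. A minimal sketch, with a hypothetical Settings class:

```csharp
using System;

public sealed class Settings
{
    private bool _frozen;
    private string _name = "";

    public string Name
    {
        get { return _name; }
        set
        {
            // Writes are rejected once Freeze() has been called.
            if (_frozen)
                throw new InvalidOperationException("Object is frozen.");
            _name = value;
        }
    }

    public bool IsFrozen { get { return _frozen; } }

    // One-way switch: there is deliberately no Unfreeze().
    public void Freeze() { _frozen = true; }
}
```

Unlike const, this is checked at run time, not by the compiler.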
As to why they did it those involved have said so:
http://blogs.msdn.com/ericgu/archive/2004/04/22/118238.aspx
http://blogs.msdn.com/slippman/archive/2004/01/22/61712.aspx
also mentioned by Raymond Chen
http://blogs.msdn.com/oldnewthing/archive/2004/04/27/121049.aspx
In a multi language system this would have been very complex.
As for Java, how would you have such a property behave? There are already techniques for making objects immutable, which is arguably a better way to achieve this with additional benefits. In fact you can emulate const behaviour by declaring a superclass/superinterface that implements only the methods that don't change state, and then having a subclass/subinterface that implements the mutating methods. By upcasting your mutable class to an instance of class with no write methods, other bits of code cannot modify the object without specifically casting it back to the mutable version (which is equivalent to casting away const).
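The same idea carries over to C#. Here is a sketch with hypothetical names, where the read-only supertype is an interface and the mutable class adds the write method.

```csharp
// Read-only view: exposes only members that do not change state.
public interface IReadOnlyCounter
{
    int Value { get; }
}

// Mutable implementation adds the mutating method.
public sealed class Counter : IReadOnlyCounter
{
    public int Value { get; private set; }

    public void Increment() { Value++; }
}

public static class Reporter
{
    // Receives only the read-only view, so it cannot call Increment()
    // without an explicit downcast, the analogue of casting away const.
    public static string Describe(IReadOnlyCounter counter)
    {
        return "Count = " + counter.Value;
    }
}
```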
Even if you don't want the object to be strictly immutable, if you really wanted (which I wouldn't recommend) you could put some kind of 'lock' mode on it so that it could only be mutated when unlocked. Have the lock/unlock methods be private, or protected as appropriate, and you get some level of access control there. Alternatively, if you don't intend for the method taking it as a parameter to modify it at all, pass in a copy of that object, or if copying the entire object is too heavyweight then some other lightweight data object that contains just the necessary information. You could even use dynamic proxies to create a proxy to your object that turn any calls to mutation methods into no-ops.
Basically there are already a whole bunch of ways to prevent a class being mutated, which let you choose one that fits most appropriately into your situation (hint: choose pure immutability wherever possible as it makes the object trivially threadsafe and easier to reason with in general). There are no obvious semantics for how const could be implemented that would be an improvement on these techniques, it would be another thing to learn that would either lack flexibility, or be so flexible as to be useless.
That is, unless I've missed something, which is entirely possible. :-)
Java has its own version of const: final. Joshua Bloch describes in his Effective Java how to use the final keyword effectively. (By the way, const is a reserved but unused keyword in Java.)
I know that in C# 3.0 you can do some functional programming magic with LINQ and lambda expressions and all that stuff. However, is it really possible to go completely "pure" functional in C#? By "pure" I mean having methods that are pure (always give the same output for the same input) and completely free of side effects. How do we get around the fact that we do not even have an immutable integer type in C#?
If you want to program in a pure functional way, there is nothing stopping you.
On the other hand, if you have some program, there is no magic flag you can flip to force the program to behave in a pure functional way.
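To illustrate both points, here is a sketch of a method that is pure purely by convention. Stats and SumOfSquares are hypothetical names; nothing in the language checks the purity claim in the comment.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Stats
{
    // Pure by discipline only: it reads its argument, touches no
    // shared state, and returns a value. The compiler does not
    // enforce any of that.
    public static int SumOfSquares(IEnumerable<int> values)
    {
        return values.Select(v => v * v).Sum();
    }
}
```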
For ints (immutable)
If you use an int as a parameter, it is passed by value. Any changes are not propagated to the caller.
If you use an int declared in one method's scope in a closure within the method, then that int variable is shared. In this case, one must either pledge not to modify the int (programmer enforced), or simply not use an int in this way.
And if you truly need an immutable int, have you seen the readonly keyword?
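The three cases above can be sketched as follows. ClosureDemo and Holder are hypothetical names used only for this example.

```csharp
using System;

public static class ClosureDemo
{
    public static int PassedByValue()
    {
        int original = 1;
        Bump(original);      // the callee receives a copy
        return original;     // the caller's variable is unchanged
    }

    private static void Bump(int n) { n = n + 1; }

    public static int SharedCapture()
    {
        int counter = 0;
        // The lambda captures the variable itself, not a copy of its value.
        Action increment = () => counter++;
        increment();
        increment();
        return counter;      // reflects both increments
    }
}

public sealed class Holder
{
    // readonly: assignable only in the declaration or a constructor.
    public readonly int Value;

    public Holder(int value) { Value = value; }
}
```

PassedByValue returns 1 and SharedCapture returns 2, which is the difference the answer is pointing at.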
Have you looked into F#? It seems much more along the lines of what you are talking about. C# just really isn't designed with functional programming in mind, and therefore won't really give you any of the benefits that are normally associated with a functional language.
Unfortunately no - C# is not a pure functional language and it does not intend to be. What has happened is that the C# team has seen that there are benefits to adding certain functionally-styled constructs and syntax to the language.
Functional purity is better found in other places (Lisp derivatives like Common Lisp and Scheme are good places to start).