What are the benefits of proper scoping? - c#

So yeah, the question basically says it all. What do you gain when you ensure that private members / methods / whatever are marked private (or protected, or public, or internal, etc) appropriately?
I mean, of course I could just go and mark all my methods as public and everything should still work fine. Of course, if we'd talk about good programming practice (which I am a solid advocate of, by the way ), I'd mark a method as private if it should be marked as such, no questions asked.
But let's set aside good programming practice, and just look at this in terms of actual quantitative gain. What do I get for proper scoping of my methods, members, classes, etc.?
I'm thinking that this would most generally translate to performance gains, but I'd appreciate it if someone could provide more detail about it.
(For purposes of this question, I'm thinking more along C#.NET, but hey, feel free to provide answers on whatever language / framework you deem fit.)
EDIT: Most pointed out that this doesn't lead to performance gain, and yeah, thinking back, I don't even know why I thought that. Lack of coffee probably.
In any case, any good programmer should know about how proper scopes (1) help your code maintenance / (2) control the proper use of your library / app / package; I was kinda curious as to whether or not there was any other benefit you get from it that's not apparently obvious outright. Based on the answers below, it looks like it basically sums up to just those two things most importantly.

Performance has absolutely nothing to do with the visibility of methods. Virtual methods have some overhead, but that's not why we scope. It has to do with maintenance of code. Your public methods are the API to your class or library. You as a class designer want to provide some guarantee to the outside world that future changes aren't going to break other peoples code. By marking some methods private, you take away the ability for users to depend on certain implementations which allows you freedom to change that implementation at will.
Even languages that don't have visibility modifiers, like python, have conventions for marking methods as internal and subject to change. By prefixing the method with an _underscore(), you're signalling to the outside world that if you use that method, you do so at your own risk, as it can change at any time.
On the other hand, public methods are an explicit entry way into your code. Every effort should go towards making public methods backward compatible to avoid the pitfalls I described above.

By better encapsulation, you provide a better API. Only methods / properties that are of interest of the user of your class are available : visible.
Next to that, you ensure that certain variables that should not be called / modified, cannot be called/modified.
That's the most important thing. Why do you think this would lead to performance gains ?

As I see you gain two important features from proper scoping. You API is reduced in size and clearly focused on the task at hand.
Second, you get a less brittle implementation as you are free to change implementation details without altering the exposed API.
I cannot see how accessibility modifiers would affect performance in any way.

There are mainly two types of methods/properties.
That are helpful to perform a task to whoever consumes it. (Recommended Scope: Public)
That are helpful to the above methods to get their task done. (Recommended Scope: Private or Protected)
Type 1 methods are the only methods that any client code requires and does not need any other method. This avoids confusion, keeps things simple and prevents client code to do something wrong.
Type 2 methods are methods into which Type 1 methods are divided. They help Type 1 methods to complete their task and still allow them to be simple, concise, less complex and more readable. They are not really needed for client code but just the class/module itself.
A fair example would be of a car. What you have is a gas pedal, brakes, gearbox, etc. You don't have an interface to minor details for what is under the hood. That is for the mechanic.

In C# programming, it helps to make sure that your API/classes/methods/members are "easy to use correctly and difficult to use incorrectly".

Related

Does not use OO Encapsulation creates Security Risks?

This question maybe sounds some stupid, but i have the curiosity to know, if expose the fields of an object like publics, can create a security risk or a hole in my aplication that can be exploited by other persons?
public class AClass
{
public int AProperty { get; set; }
//Less Secure?
public int APublicField;
}
Thanks
Access modifiers (public, private, protected) are simply a mechanism to allow different levels of encapsulation for those with source access. You should employ them with OOP best practices to make your logic easy to maintain. They have absolutely nothing to do with the security of your application.
In C# you can use RTTI to reflect on private data just as easily as public. Thanks to the CLR and binary compatibility guarantees of C# you can even do this on external binaries from within your own. Even if you couldn't use reflection, an attacker can use a dissasembler and inspect the code as intermediate language. You may want to read this post.
An astute attacker familiar with the platform could even look for clues on how to reverse engineer your application running kernel level debuggers such as softICE.
You could always try to use some form of security through obscurity by encapsulating your logic in an attempt to make the execution less obvious, but obviously if someone is persistant enough they will find a way.
The best bet for security is to do your research and look for well used libraries both commercial and open source. The more exposure there is in cryptography the better as library developers and white hats can collaborate to fix exploits as they are discovered.
The two "fields" in your snippet are essentially equivalent. The first one is simply an auto-property. So, neither is more or less "secure" than the other.
In fact, the setters and getters are just a historical best practice of the OOP.
Both of the solutions that you exposed don't respect the encapsulation, because when you change the name or the type of an attribute you must change it in all the classes where it's used.
However, one advantage of the public attribute is that you get less code, which helps to understand and maintain the code easily.

C# classes - Why so many static methods?

I'm pretty new to C# so bear with me.
One of the first things I noticed about C# is that many of the classes are static method heavy. For example...
Why is it:
Array.ForEach(arr, proc)
instead of:
arr.ForEach(proc)
And why is it:
Array.Sort(arr)
instead of:
arr.Sort()
Feel free to point me to some FAQ on the net. If a detailed answer is in some book somewhere, I'd welcome a pointer to that as well. I'm looking for the definitive answer on this, but your speculation is welcome.
Because those are utility classes. The class construction is just a way to group them together, considering there are no free functions in C#.
Assuming this answer is correct, instance methods require additional space in a "method table." Making array methods static may have been an early space-saving decision.
This, along with avoiding the this pointer check that Amitd references, could provide significant performance gains for something as ubiquitous as arrays.
Also see this rule from FXCOP
CA1822: Mark members as static
Rule Description
Members that do not access instance data or call instance methods can
be marked as static (Shared in Visual Basic). After you mark the
methods as static, the compiler will emit nonvirtual call sites to
these members. Emitting nonvirtual call sites will prevent a check at
runtime for each call that makes sure that the current object pointer
is non-null. This can achieve a measurable performance gain for
performance-sensitive code. In some cases, the failure to access the
current object instance represents a correctness issue.
Perceived functionality.
"Utility" functions are unlike much of the functionality OO is meant to target.
Think about the case with collections, I/O, math and just about all utility.
With OO you generally model your domain. None of those things really fit in your domain--it's not like you are coding and go "Oh, we need to order a new hashtable, ours is getting full". Utility stuff often just doesn't fit.
We get pretty close, but it's still not very OO to pass around collections (where is your business logic? where do you put the methods that manipulate your collection and that other little piece or two of data you are always passing around with it?)
Same with numbers and math. It's kind of tough to have Integer.sqrt() and Long.sqrt() and Float.sqrt()--it just doesn't make sense, nor does "new Math().sqrt()". There are a lot of areas it just doesn't model well. If you are looking for mathematical modeling then OO may not be your best bet. (I made a pretty complete "Complex" and "Matrix" class in Java and made them fairly OO, but making them really taught me some of the limits of OO and Java--I ended up "Using" the classes from Groovy mostly)
I've never seen anything anywhere NEAR as good as OO for modeling your business logic, being able to demonstrate the connections between code and managing your relationship between data and code though.
So we fall back on a different model when it makes more sense to do so.
The classic motivations against static:
Hard to test
Not thread-safe
Increases code size in memory
1) C# has several tools available that make testing static methods relatively easy. A comparison of C# mocking tools, some of which support static mocking: https://stackoverflow.com/questions/64242/rhino-mocks-typemock-moq-or-nmock-which-one-do-you-use-and-why
2) There are well-known, performant ways to do static object creation/logic without losing thread safety in C#. For example implementing the Singleton pattern with a static class in C# (you can jump to the fifth method if the inadequate options bore you): http://www.yoda.arachsys.com/csharp/singleton.html
3) As #K-ballo mentions, every method contributes to code size in memory in C#, rather than instance methods getting special treatment.
That said, the 2 specific examples you pointed out are just a problem of legacy code support for the static Array class before generics and some other code sugar was introduced back in C# 1.0 days, as #Inerdia said. I tried to answer assuming you had more code you were referring to, possibly including outside libraries.
The Array class isn't generic and can't be made fully generic because this would break backwards compatibility. There's some sort of magic going on where arrays implement IList<T>, but that's only for single-dimension arrays with a lower bound of 0 – "list-ish" arrays.
I'm guessing the static methods are the only way to add generic methods that work over any shape of array regardless of whether it qualifies for the above-mentioned compiler magic.

Does reflection breaks the idea of private methods, because private methods can be access outside of the class?

Does reflection break the idea of private methods? Because private methods can be accessed from outside of the class? (Maybe I don't understand the meaning of reflection or miss something else, please tell me)
http://en.wikipedia.org/wiki/Reflection_%28computer_science%29
Edit:
If relection breaks the idea of private methods - do we use private methods only for program logic and not for program security?
Thanks
do we use private methods only for program logic and not for program security?
It is not clear what you mean by "program security". Security cannot be discussed in a vacuum; what resources are you thinking of protecting against what threats?
The CLR code access security system is intended to protect resources of user data from the threat of hostile partially trusted code running on the user's machine.
The relationship between reflection, access control and security in the CLR is therefore complicated. Briefly and not entirely accurately, the rules are these:
full trust means full trust. Fully trusted code can access every single bit of memory in the process. That includes private fields.
The ability to reflect on privates in partial trust is controlled by a permission; if it is not granted then partial trust code may not do reflection on privates.
See Link for details.
The desktop CLR supports a mode called "restricted skip visibility" in which the rules for how reflection and the security system interact are slightly different. Basically,
partially trusted code that has the right to use private reflection may access a private field via reflection if the partially trusted code is accessing a private field from a type that comes from an assembly with equal or less trust.
See
Link
for details
The executive summary is: you can lock partially trusted code down sufficiently that it is not able to use reflection to look at private stuff. You cannot lock down full trust code; that's why it's called "full trust". If you want to restrict it then don't trust it.
So: does making a field private protect it from the threat of low trust code attempting to read it, and thereby steal user's data? Yes. Does it protect it from the threat of high trust code reading it? No. If the code is both trusted by the user and hostile to the user then the user has a big problem. They should not have trusted that code.
Note that for example, making a field private does not protect a secret in your code from a user who has your code and is hostile to you. The security system protects good users from evil code. It doesn't protect good code from evil users. If you want to make something private to keep it from a user then you are on a fool's errand. If you want to make it private to keep a secret from evil hackers who have lured the user into running hostile low-trust code then that is a good technique.
Reflection does provide a way to circumvent Java's Access Protection Modifiers and therefore violates strict encapsulation as it realised in C++ and Java. However this does not matter as much as you might think.
Access Protection Modifiers are intended to assist programmers to develop modular well factored systems, not to be uncompromising gate keepers. There are sometimes very good reasons to break strict encapsulation such as Unit Testing and framework development.
While it may initially be difficult to stomach the idea that Access Protection Modifiers are easily circumventable, try to remember that there are many languages (Python, Ruby etc.) that do not have them at all. These languages are used to build large and complex systems just like languages which do provide access protection.
There is some debate on whether Access Protection Modifiers are a help or a hindrance. Even if you do value access protection treat it like a helping hand, not the making or breaking of your project.
Yes, but it is not a problem.
Encapsulation is not about security or secrets, just about organizing things.
Reflection is not part of 'normal' programming. If you want to use it to break encapsulation, you accept the risks (versioning problems etc)
Reflection should only be used when there are no better (less invasive) ways to accomplish something.
Reflection is for system-level 'tooling' like persistence mapping and should be tucked away in well tested libraries. I would find any use of reflection in normal application code suspect.
I started with "it is not a problem". I meant: as long as you use reflection as intended. And careful.
It's like your house. Locks only keep out honest people, or people who aren't willing to pick your lock.
Data is data, if someone is determined enough, they can do anything with your code. Literally anything.
So yes, reflection will allow people to do things you don't want them to do with your code, for example access private fields and methods. However, the important thing is that people will not accidentally do this. If they're using reflection, they know they're doing something they probably aren't intended to do, just like no one accidentally picks the lock on your front door.
No, reflection doesn't break the idea of private methods. At least not per se. There is nothing that says that reflection can't obey access restrictions.
Badly designed reflection breaks the idea of private methods, but that doesn't have anything to do with reflection per se: anything which is badly designed can break the idea of private methods. In particular, a bad design of private methods can also obviously break the idea of private methods.
What do I mean by badly designed? Well, as I said above, there is nothing stopping you from having a language in which reflection obeys access restrictions. The problem with this is that e.g. debuggers, profilers, coverage tools, IntelliSense, IDEs, tools in general need to be able to violate access restrictions. Since there is no way to present different different versions of reflection to different clients, most languages opt for tools over safety. (E is the counterexample, which has absolutely no reflective capabilities whatsoever, as a conscious design choice.)
But, who says that you cannot present different versions of reflection to different clients? Well, the problem is simply that in the classical implementation of reflection, all objects are reponsible for reflecting on themselves, and since there is only one of every object, there can be only version of reflection.
So, where does the idea of bad design come in? Well, note the word "responsible" in the above paragraph. Every object is responsible for reflecting on itself. Also, every object is responsible for whatever it is that it was written for in the first place. In other words: every object has at least two responsibilities. This violates one of the basic principles of object-oriented design: the Single Responsibility Principle.
The solution is rather simple: break up the object. The original object is simply responsible for whatever it was originally written for. And there is another object (called a Mirror because it is an object that reflects other objects) which is responsible for reflection. And now that the responsibility for reflection is broken out into a separate object, what's stopping us from having not one, but two, three, many Mirror Objects? One that respects access restrictions, one that only allows an object to reflect upon itself but not any other objects, one that only allows introspection (i.e. is read-only), one that only allows to reflect on read-only callsite information (i.e. for a profiler), one that gives full access to the entire system including violating access restrictions (for a debugger), one that only gives read-only access to the method names and signatures and respects access restrictions (for IntelliSense) and so on …
As a nice bonus, this means that Mirrors are essentially Capabilities (in the capability-security sense of the word) for reflection. IOW: Mirrors are the Holy Grail on the decade-long quest to reconcile security and runtime dynamic metaprogramming.
The concept of Mirrors was originally invented in Self from where it carried over into Animorphic Smalltalk/Strongtalk and then Newspeak. Interestingly, the Java Debugging Interface is based on Mirrors, so the designers of Java (or rather the JVM) clearly knew about them, but Java's reflection is broken.
Yes, reflection breaks this idea. Native languages also have some tricks to break OOP rules, for example, in C++ it is possible to change private class members using pointer tricks. However, by using these tricks, we get the code which can be incompatible with future class versions - this is the price we pay for breaking OOP rules.
It does, as other already stated.
However, I remember that in Java there can be a security manager active, that could prevent you from accessing any private methods, even with reflection, if you don't have the rights to do so. If you run your local JVM, such a manager is usually not active.
Yes, Reflection could be used to violate encapsulation and even cause incorrect behavior. Keep in mind that the assembly needs to be trusted to perform reflection, so there are still some protections in place.
Yes it breaks the encapsulation, if you want it to. However, it can be put to good use - like writing unit tests for private methods, or sometimes - as I have learned from my own experience - getting around bugs in third party APIs :)
Note that encapsulation != security. Encapsulation is an object oriented design concept and is only meant for improving the design. For security, there is SecurityManager in java.
Yes. Reflection breaks encapsulation principle. That's not only to get access to private members but rather expose whole structure of a class.
I think this is a matter of opinion, but if you are using reflection to get around the encapsulation put in place by a developer on a class, then you are defeating the purpose.
So, to answer your question, it breaks the idea of encapsulation (or information hiding), which simply states that private properties/methods are private so they cant be mucked with outside the class.
Reflection makes it possible for any CLR class to examine and manipulate properties and fields of other CLR classes, but not necessarily to do so sensibly. It's possible for a class to obscure the meaning of properties and fields or protect them against tampering by having them depend in non-obvious fashion upon each other, static fields, underlying OS info, etc.
For example, a class could keep in some variable an encrypted version of the OS handle for its main window. Using reflection, another class could see that variable, but without knowing the encryption method it could not identify the window to which it belonged or make the variable refer to another window.
I've seen classes that claim to act as "universal serializers"; they can be very useful if applied to something like a data-storage-container class which is missing a "serializable" attribute but is otherwise entirely straightforward. They will produce gobbledygook if applied to any class whose creator has endeavored to obscure things.
Yes, it does break encapsulation. But there are many good reasons to use it in some situations.
For example:
I use MSCaptcha in some websites, but it renders a < div> around the < img > tag that messes with my HTML. Then i can use a standard < img> tag and use reflection to get the value of the captcha's image id to construct a URL.
The image id is a private Property but using reflection i can get that value.
access control through private/protected/package/public is not primarily meant for security.
it helps good guys to do the right thing, but doesn't prevent bad guys from doing wrong things.
generally we assume others are good guys, and we include their code into our application without though.
if you can't trust the guy of a library you are including, you are screwed.

Why C# implements methods as non-virtual by default?

Unlike Java, why does C# treat methods as non-virtual functions by default? Is it more likely to be a performance issue rather than other possible outcomes?
I am reminded of reading a paragraph from Anders Hejlsberg about several advantages the existing architecture is bringing out. But, what about side effects? Is it really a good trade-off to have non-virtual methods by default?
Classes should be designed for inheritance to be able to take advantage of it. Having methods virtual by default means that every function in the class can be plugged out and replaced by another, which is not really a good thing. Many people even believe that classes should have been sealed by default.
virtual methods can also have a slight performance implication. This is not likely to be the primary reason, however.
I'm surprised that there seems to be such a consensus here that non-virtual-by-default is the right way to do things. I'm going to come down on the other - I think pragmatic - side of the fence.
Most of the justifications read to me like the old "If we give you the power you might hurt yourself" argument. From programmers?!
It seems to me like the coder who didn't know enough (or have enough time) to design their library for inheritance and/or extensibility is the coder who's produced exactly the library I'm likely to have to fix or tweak - exactly the library where the ability to override would come in most useful.
The number of times I've had to write ugly, desperate work-around code (or to abandon usage and roll my own alternative solution) because I can't override far, far outweighs the number of times I've ever been bitten (e.g. in Java) by overriding where the designer might not have considered I might.
Non-virtual-by-default makes my life harder.
UPDATE: It's been pointed out [quite correctly] that I didn't actually answer the question. So - and with apologies for being rather late....
I kinda wanted to be able to write something pithy like "C# implements methods as non-virtual by default because a bad decision was made which valued programs more highly than programmers". (I think that could be somewhat justified based on some of the other answers to this question - like performance (premature optimisation, anyone?), or guaranteeing the behaviour of classes.)
However, I realise I'd just be stating my opinion and not that definitive answer that Stack Overflow desires. Surely, I thought, at the highest level the definitive (but unhelpful) answer is:
They're non-virtual by default because the language-designers had a decision to make and that's what they chose.
Now I guess the exact reason that they made that decision we'll never.... oh, wait! The transcript of a conversation!
So it would seem that the answers and comments here about the dangers of overriding APIs and the need to explicitly design for inheritance are on the right track but are all missing an important temporal aspect: Anders' main concern was about maintaining a class's or API's implicit contract across versions. And I think he's actually more concerned about allowing the .Net / C# platform to change under code rather than concerned about user-code changing on top of the platform. (And his "pragmatic" viewpoint is the exact opposite of mine because he's looking from the other side.)
(But couldn't they just have picked virtual-by-default and then peppered "final" through the codebase? Perhaps that's not quite the same.. and Anders is clearly smarter than me so I'm going to let it lie.)
Because it's too easy to forget that a method may be overridden and not design for that. C# makes you think before you make it virtual. I think this is a great design decision. Some people (such as Jon Skeet) have even said that classes should be sealed by default.
To summarize what others said, there are a few reasons:
1- In C#, there are many things in syntax and semantics that come straight from C++. The fact that methods where not-virtual by default in C++ influenced C#.
2- Having every method virtual by default is a performance concern because every method call must use the object's Virtual Table. Moreover, this strongly limits the Just-In-Time compiler's ability to inline methods and perform other kinds of optimization.
3- Most importantly, if methods are not virtual by default, you can guarantee the behavior of your classes. When they are virtual by default, such as in Java, you can't even guarantee that a simple getter method will do as intended because it could be overridden to do anything in a derived class (of course you can, and should, make the method and/or the class final).
One might wonder, as Zifre mentioned, why the C# language did not go a step further and make classes sealed by default. That's part of the whole debate about the problems of implementation inheritance, which is a very interesting topic.
C# is influenced by C++ (and more). C++ does not enable dynamic dispatch (virtual functions) by default. One (good?) argument for this is the question: "How often do you implement classes that are members of a class hiearchy?". Another reason to avoid enabling dynamic dispatch by default is the memory footprint. A class without a virtual pointer (vpointer) pointing to a virtual table, is ofcourse smaller than the corresponding class with late binding enabled.
The performance issue is not so easy to say "yes" or "no" to. The reason for this is the Just In Time (JIT) compilation which is a run time optimization in C#.
Another, similar question about "speed of virtual calls.."
The simple reason is design and maintenance cost in addition to performance costs. A virtual method has additional cost as compared with a non-virtual method because the designer of the class must plan for what happens when the method is overridden by another class. This has a big impact if you expect a particular method to update internal state or have a particular behavior. You now have to plan for what happens when a derived class changes that behavior. It's much harder to write reliable code in that situation.
With a non-virtual method you have total control. Anything that goes wrong is the fault of the original author. The code is much easier to reason about.
If all C# methods were virtual then the vtbl would be much bigger.
C# objects only have virtual methods if the class has virtual methods defined. It is true that all objects have type information that includes a vtbl equivalent, but if no virtual methods are defined then only the base Object methods will be present.
#Tom Hawtin: It is probably more accurate to say that C++, C# and Java are all from the C family of languages :)
Coming from a perl background I think C# sealed the doom of every developer who might have wanted to extend and modify the behaviour of a base class' thru a non virtual method without forcing all users of the new class to be aware of potentially behind the scene details.
Consider the List class' Add method. What if a developer wanted to update one of several potential databases whenever a particular List is 'Added' to? If 'Add' had been virtual by default the developer could develop a 'BackedList' class that overrode the 'Add' method without forcing all client code to know it was a 'BackedList' instead of a regular 'List'. For all practical purposes the 'BackedList' can be viewed as just another 'List' from client code.
This makes sense from the perspective of a large main class which might provide access to one or more list components which themselves are backed by one or more schemas in a database. Given that C# methods are not virtual by default, the list provided by the main class cannot be a simple IEnumerable or ICollection or even a List instance but must instead be advertised to the client as a 'BackedList' in order to ensure that the new version of the 'Add' operation is called to update the correct schema.
It is certainly not a performance issue. Sun's Java interpreter uses he same code to dispatch (invokevirtual bytecode) and HotSpot generates exactly the same code whether final or not. I believe all C# objects (but not structs) have virtual methods, so you are always going to need the vtbl/runtime class identification. C# is a dialect of "Java-like languages". To suggest it comes from C++ is not entirely honest.
There is an idea that you should "design for inheritance or else prohibit it". Which sounds like a great idea right up to the moment you have a severe business case to put in a quick fix. Perhaps inheriting from code that you don't control.
Performance.
Imagine a set of classes that override a virtual base method:
class Base {
public virtual int func(int x) { return 0; }
}
class ClassA: Base {
public override int func(int x) { return x + 100; }
}
class ClassB: Base {
public override int func(int x) { return x + 200; }
}
Now imagine you want to call the func method:
Base foo;
//...sometime later...
int x = foo.func(42);
Look at what the CPU has to actually do:
mov ecx, bfunc$ -- load the address of the "ClassB.func" method from the VMT
push 42 -- push the 42 argument
call [eax] -- call ClassB.func
No problem? No, problem!
The assembly isn't that hard to follow:
mov ecx, foo$: This needs to reach into memory, and hit the part of the object's Virtual Method Table (VMT) to get the address of the overridden foo method. The CPU will begin the fetch of the data from memory, and then it will continue on:
push 42: Push the argument 42 onto the stack for the call to the function. No problem, that can run right away, and then we continue to:
call [ecx] Call the address of the ClassB.func function. ← π•Šπ•‹π”Έπ•ƒπ•ƒ!!!
That's a problem. The address of ClassB.func function has not been fetched from the VMT yet. This means that the CPU doesn't know where the go to next. Ideally it would follow a jump and continue spectatively executing instructions as it waits for the address of ClassB.func to come back from memory. But it can't; so we wait.
If we are lucky: the data already is in the L2 cache. Getting a value out of the L2 cache into a place where it can be used is going to take 12-15 cycles. The CPU can't know where to go next without having to wait for memory for 12-15 cycles.
𝕋𝕙𝕖 β„‚β„™π•Œ π•šπ•€ 𝕀π•₯𝕒𝕝𝕝𝕖𝕕 𝕗𝕠𝕣 πŸ™πŸš-πŸ™πŸ 𝕔π•ͺ𝕔𝕝𝕖𝕀
Our program is stuck doing nothing for 12-15 cycles.
The CPU core has 7 execution engines. The main job of the CPU is keeping those 7 pipelines full of stuff to do. That means:
JITing your machine code into a different order
Starting the fetch from memory as soon as possible, letting us move on to other things
executing 100, 200, 300 instructions ahead. It will be executing 17 iterations ahead in your loop, across multiple function call and returns
it has a branch predictor to try to guess which way a comparison will go, so that it can continue executing ahead while we wait. If it guesses wrong, then it does have to undo all that work. But the branch predictor is not stupid - it's right 94% of the time.
Your CPU has all this power, and capability, and it's just STALLED FOR 15 CYCLES!?
This is awful. This is terrible. And you suffer this penalty every time you call a virtual method - whether you actually overrode it or not.
Our program is 12-15 cycles slower every method call because the language designer made virtual methods opt-out rather than opt-in.
This is why Microsoft decided to not make all methods virtual by default: they learned from Java's mistakes.
Someone ported Android to C#, and it was faster
In 2012, the Xamarin people ported all of Android's Dalvik (i.e. Java) to C#. From them:
Performance
When C# came around, Microsoft modified the language in a couple of significant ways that made it easier to optimize. Value types were introduced to allow small objects to have low overheads and virtual methods were made opt-in, instead of opt-out which made for simpler VMs.
(emphasis mine)

Why was constness removed from Java and C#?

I know this has been discussed many times, but I am not sure I really understand why Java and C# designers chose to omit this feature from these languages. I am not interested in how I can make workarounds (using interfaces, cloning, or any other alternative), but rather in the rationale behind the decision.
From a language design perspective, why has this feature been declined?
P.S: I'm using words such as "omitted", which some people may find inadequate, as C# was designed in an additive (rather than subtractive) approach. However, I am using such words because the feature existed in C++ before these languages were designed, so it is omitted in the sense of being removed from a programmer's toolbox.
In this interview, Anders said:
Anders Hejlsberg: Yes. With respect to
const, it's interesting, because we
hear that complaint all the time too:
"Why don't you have const?" Implicit
in the question is, "Why don't you
have const that is enforced by the
runtime?" That's really what people
are asking, although they don't come
out and say it that way.
The reason that const works in C++ is
because you can cast it away. If you
couldn't cast it away, then your world
would suck. If you declare a method
that takes a const Bla, you could pass
it a non-const Bla. But if it's the
other way around you can't. If you
declare a method that takes a
non-const Bla, you can't pass it a
const Bla. So now you're stuck. So you
gradually need a const version of
everything that isn't const, and you
end up with a shadow world. In C++ you
get away with it, because as with
anything in C++ it is purely optional
whether you want this check or not.
You can just whack the constness away
if you don't like it.
I guess primarily because:
it can't properly be enforced, even in C++ (you can cast it)
a single const at the bottom can force a whole chain of const in the call tree
Both can be problematic. But especially the first: if it can't be guaranteed, what use is it? Better options might be:
immutable types (either full immutability, or popsicle immutability)
As to why they did it those involved have said so:
http://blogs.msdn.com/ericgu/archive/2004/04/22/118238.aspx
http://blogs.msdn.com/slippman/archive/2004/01/22/61712.aspx
also mentioned by Raymond Chen
http://blogs.msdn.com/oldnewthing/archive/2004/04/27/121049.aspx
In a multi language system this would have been very complex.
As for Java, how would you have such a property behave? There are already techniques for making objects immutable, which is arguably a better way to achieve this with additional benefits. In fact you can emulate const behaviour by declaring a superclass/superinterface that implements only the methods that don't change state, and then having a subclass/subinterface that implements the mutating methods. By upcasting your mutable class to an instance of class with no write methods, other bits of code cannot modify the object without specifically casting it back to the mutable version (which is equivalent to casting away const).
Even if you don't want the object to be strictly immutable, if you really wanted (which I wouldn't recommend) you could put some kind of 'lock' mode on it so that it could only be mutated when unlocked. Have the lock/unlock methods be private, or protected as appropriate, and you get some level of access control there. Alternatively, if you don't intend for the method taking it as a parameter to modify it at all, pass in a copy of that object, or if copying the entire object is too heavyweight then some other lightweight data object that contains just the necessary information. You could even use dynamic proxies to create a proxy to your object that turn any calls to mutation methods into no-ops.
Basically there are already a whole bunch of ways to prevent a class being mutated, which let you choose one that fits most appropriately into your situation (hint: choose pure immutability wherever possible as it makes the object trivially threadsafe and easier to reason with in general). There are no obvious semantics for how const could be implemented that would be an improvement on these techniques, it would be another thing to learn that would either lack flexibility, or be so flexible as to be useless.
That is, unless I've missed something, which is entirely possible. :-)
Java have its own version of const; final. Joshua Bloch describes in his Effective Java
how you effectively use the final keyword. (btw, const is a reserved keyword in Java, for future discrepancies)

Categories