I came across this today, and I am surprised that I haven't noticed it before. Given a simple C# program similar to the following:
using System;

public class Program
{
    public static void Main(string[] args)
    {
        Method();           // Called the method with no arguments.
        Method("a string"); // Called the method with a string.
        Console.ReadLine();
    }

    public static void Method()
    {
        Console.WriteLine("Called the method with no arguments.");
    }

    public static void Method(string aString = "a string")
    {
        Console.WriteLine("Called the method with a string.");
    }
}
You get the output shown in the comments for each method call.
I understand why the compiler chooses the overloads that it does, but why is this allowed in the first place? I am not asking what the overload resolution rules are, I understand those, but I am asking if there is a technical reason why the compiler allows what are essentially two overloads with the same signature?
As far as I can tell, a function overload with a signature that differs from another overload only through having an additional optional argument offers nothing more than it would if the argument (and all preceding arguments) were simply required.
One thing it does do is make it possible for a programmer (who probably isn't paying enough attention) to think they're calling a different overload to the one that they actually are.
I suppose it's a fairly uncommon case, and the answer for why this is allowed may just be because it's simply not worth the complexity to disallow it, but is there another reason why C# allows function overloads to differ from others solely through having one additional optional argument?
His point that Eric Lippert could have an answer led me to this https://meta.stackoverflow.com/a/323382/1880663, which makes it sound like my question will only annoy him. I'll try to rephrase it to make it clearer that I'm asking about the language design, and that I'm not looking for a spec reference.
I appreciate it! I am happy to talk about language design; what annoys me is when I waste time doing so when the questioner is very unclear about what would actually satisfy their request. I think your question was phrased clearly.
The comment to your question posted by Hans is correct. The language design team was well aware of the issue you raise, and this is far from the only potential ambiguity created by optional / named arguments. We considered a great many scenarios for a long time and designed the feature as carefully as possible to mitigate potential problems.
All design processes are the result of compromise between competing design principles. Obviously there were many arguments for the feature that had to be balanced against the significant design, implementation and testing costs, as well as the costs to users in the form of confusion, bugs, and so on, from accidental construction of ambiguities such as the one you point out.
I'm not going to rehash what was dozens of hours of debate; let me just give you the high points.
The primary motivating scenario for the feature was, as Hans notes, popular demand, particularly coming from developers who use C# with Office. (And full disclosure, as a guy on the team that wrote the C# programming model for Word and Excel before I joined the C# team, I was literally the first one asking for it; the irony that I then had to implement this difficult feature a couple years later was not lost on me.) Office object models were designed to be used from Visual Basic, a language that has long had optional / named parameter support.
C# 4 might have seemed like a bit of a "thin" release in terms of obvious features. That's because a lot of the work done in that release was infrastructure for allowing more seamless interoperability with object models that were designed for dynamic languages. The dynamic typing feature is the obvious one, but there were numerous other small features added that combine together to make working with dynamic and legacy COM object models easier. Named / optional arguments was just one of them.
The fact that we had existing languages like VB that had this specific feature for decades and the world hadn't ended yet was further evidence that the feature was both doable and valuable. It's great having an example where you can learn from its successes and failures before designing a new version of the feature.
As for the specific situation you mention: we considered doing things like detecting when there was a possible ambiguity and issuing a warning, but that then opens up a whole other can of worms. Warnings have to be for code that is common, plausible and almost certainly wrong, and there should be a clear way to address the problem that causes the warning to go away. Writing an ambiguity detector is a lot of work; believe me, it took way longer to write the ambiguity detection in overload resolution than it took to write the code to handle successful cases. We didn't want to spend a lot of time on adding a warning for a rare scenario that is hard to detect and for which there might be no clear advice on how to eliminate the warning.
Also, frankly, if you write code where you have two methods named the same thing that do something completely different depending on which one you call, you already have a larger design problem on your hands! Fix that problem first, rather than worrying that someone is going to accidentally call the wrong method; make it so that either method is the right one to call.
This behaviour is documented by Microsoft on MSDN. Have a look at Named and Optional Arguments (C# Programming Guide).
If two candidates are judged to be equally good, preference goes to a candidate that does not have optional parameters for which arguments were omitted in the call. This is a consequence of a general preference in overload resolution for candidates that have fewer parameters.
One reason they may have decided to implement it this way is so that you can add an overload to a method afterwards without having to change all of the method calls that are already written.
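For example, here is a minimal sketch of that scenario (the Logger class and LogLevel enum are made up for illustration): the second overload is added later with an optional parameter, and existing calls keep compiling and, per the rule quoted above, keep binding to the overload with no omitted optional arguments.

using System;

enum LogLevel { Info, Warning, Error }

static class Logger
{
    // Original method, shipped in version 1 of the library.
    public static void Log(string message)
    {
        Console.WriteLine(message);
    }

    // Overload added later with an optional parameter. Existing call sites
    // such as Log("started") still compile and keep binding to the overload
    // above, because it has no omitted optional parameters.
    public static void Log(string message, LogLevel level = LogLevel.Info)
    {
        Console.WriteLine($"[{level}] {message}");
    }
}

class Demo
{
    static void Main()
    {
        Logger.Log("started");                    // resolves to Log(string)
        Logger.Log("disk full", LogLevel.Error);  // resolves to Log(string, LogLevel)
    }
}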
UPDATE
I'm surprised that even Jon Skeet has no real explanation of why they did it like this.
I think this question basically boils down to how those signatures are represented by the intermediate language. Note that the signatures of both overloads are not equal! The second method has a signature like this:
.method public hidebysig static void Method([opt] string aString) cil managed
{
    .param [1] = string('a string')
    // ...
}
In IL the signature of the method is different: it takes a string, which is marked as optional. This changes how the parameter gets initialized, but does not change the presence of the parameter itself.
The compiler cannot tell which method you intend to call, so it uses the one that fits best, based on the arguments you provide. Since you did not provide any arguments for the first call, it assumes that you are calling the overload without any parameters.
In the end it is a question of good code design. As a rule of thumb, I either use optional parameters or overloads, depending on what I want to do: optional parameters are good if the logic within the method does not depend on the provided arguments, while overloads are good for providing a different implementation for different sets of arguments. If you ever find yourself checking whether a parameter equals a default value in order to decide what to do, you should probably go for an overload. On the other hand, if you find yourself repeating large chunks of code in many overloads, you should try extracting optional parameters.
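A minimal sketch of that rule of thumb (the interface and method names are made up for illustration): the first method branches on whether its optional parameter still has its default value, which suggests it really wants to be the two overloads shown below it.

using System;

interface IReportFormatter { string Format(string text); }

class ReportWriter
{
    // Smell: the method checks its parameter against the default value
    // to decide what to do.
    public void WriteSmelly(string text, IReportFormatter formatter = null)
    {
        if (formatter == null)
            Console.WriteLine(text);                    // "default" behaviour
        else
            Console.WriteLine(formatter.Format(text));  // different behaviour
    }

    // Clearer: two overloads, each with a single responsibility.
    public void Write(string text) => Console.WriteLine(text);
    public void Write(string text, IReportFormatter formatter)
        => Console.WriteLine(formatter.Format(text));
}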
There's also a good answer by Chuck Skeet to this question.
Related
After a couple of hours of research (on MSDN websites and so on) I didn't manage to find out why the generic Dictionary<TKey, TValue> does not provide a ForEach() method like List<T> does. Could someone please give me an explanation? (I know that it's not hard to implement it as an extension method, and a great example can be seen here; I was just wondering whether there might be a particular reason why it's not provided by the .NET libraries in the first place.)
Thanks in advance.
Because it's questionable why List<T> has it in the first place. There is no need to repeat the same mistake everywhere. Eric Lippert gives two reasons as to why in his blog post:
The first reason is that doing so violates the functional programming principles that all the other sequence operators are based upon. Clearly the sole purpose of a call to this method is to cause side effects. (...)
The second reason is that doing so adds zero new representational power to the language. Doing this lets you rewrite this perfectly clear code:
foreach(Foo foo in foos){ statement involving foo; }
into this code:
foos.ForEach((Foo foo)=>{ statement involving foo; });
which uses almost exactly the same characters in slightly different order. And yet the second version is harder to understand, harder to debug, and introduces closure semantics, thereby potentially changing object lifetimes in subtle ways.
(...)
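For completeness, the extension method the question alludes to is easy to write yourself; here is a minimal sketch (this is not a BCL API, just an illustration), with the caveat that Lippert's objections above apply to it just as much as to List<T>.ForEach:

using System;
using System.Collections.Generic;

static class DictionaryExtensions
{
    // Invokes the given action for every key/value pair in the dictionary.
    public static void ForEach<TKey, TValue>(
        this IDictionary<TKey, TValue> dictionary,
        Action<TKey, TValue> action)
    {
        foreach (KeyValuePair<TKey, TValue> pair in dictionary)
            action(pair.Key, pair.Value);
    }
}

// Usage:
// var ages = new Dictionary<string, int> { ["Ada"] = 36, ["Alan"] = 41 };
// ages.ForEach((name, age) => Console.WriteLine($"{name}: {age}"));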
I've looked at both the Named Parameter Idiom and the Boost::Parameter library. What advantages does each one have over the other? Is there a good reason to always choose one over the other, or might each of them be better than the other in some situations (and if so, what situations)?
Implementing the Named Parameter Idiom is really easy, almost as easy as using Boost.Parameter, so it kind of boils down to one main point.
- Do you already have Boost dependencies? If you don't, Boost.Parameter isn't special enough to merit adding the dependency.
Personally I've never seen Boost.Parameter in production code; 100% of the time it's been a custom implementation of named parameters, but that's not necessarily a good thing.
Normally, I'm a big fan of Boost, but I wouldn't use the Boost.Parameter library for a couple of reasons:
- If you don't know what's going on, the call looks like you're assigning a value to a variable in the scope of the calling function before making the call. That can be very confusing.
- There is too much boilerplate code necessary to set it up in the first place.
Another point: while I have never used the Named Parameter Idiom, I have used Boost.Parameter for defining up to 20 optional arguments, and my compile times are insane. What used to take a couple of seconds now takes 30 seconds. This adds up if you have a library of components built with Boost.Parameter that your one little application depends on. Of course, I might be implementing it wrongly, but I hope this changes, because other than that, I really like it.
The Named Parameter Idiom is a LOT simpler. I can't see (right now) why we would need the complexity of the Boost.Parameter library. (Even the supposed "feature" of deduced parameters seems like a way to introduce coding errors ;) )
You probably don't want Boost.Parameter for general application logic so much as you would want it for library code that you are developing where it can be quite a time saver for clients of the library.
Never heard of either, but reviewing the links, named parameter is WAY easier and more obvious to understand. I'd pick it in a heartbeat over the boost implementation.
From an interview article with Anders Hejlsberg: "the way we do overload resolution in C# is different from any other language".
Can somebody provide some examples with C# and Java?
What Anders was getting at here was that the original design team explicitly designed the overload resolution algorithm to have certain properties that worked nicely with versioning scenarios, even though those properties seem backwards or confusing when you consider the scenarios without versioning.
Probably the most common example of that is the rule in C# that if any method on a more-derived class is an applicable candidate, it is automatically better than any method on a less-derived class, even if the less-derived method has a better signature match. This rule is not, to my knowledge, found in other languages that have overload resolution. It seems counterintuitive; if there's a method that is a better signature match, why not choose it? The reason is because the method that is a better signature match might have been added in a later version and thereby be introducing a "brittle base class" failure.
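A minimal sketch of that rule (class and method names made up for illustration): Base.M(int) is an exact match for the argument, yet overload resolution picks the method declared on the more-derived class. If Base.M(int) was added in a later version of the base class, existing calls keep binding to Derived.M(double) instead of silently changing behaviour.

using System;

class Base
{
    public void M(int x) { Console.WriteLine("Base.M(int)"); }
}

class Derived : Base
{
    public void M(double x) { Console.WriteLine("Derived.M(double)"); }
}

class Program
{
    static void Main()
    {
        Derived d = new Derived();
        d.M(1); // Prints "Derived.M(double)", not "Base.M(int)",
                // even though Base.M(int) is an exact match for the argument.
    }
}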
For more thoughts on how various languages handle brittle base class failures, see Link, and for more thoughts on overload resolution, see Link.
The way that C# handles overloading from an internal perspective is what's different.
The complete quote from Anders:
I have always described myself as a pragmatic guy. It's funny, because versioning ended up being one of the pillars of our language design. It shows up in how you override virtual methods in C#. Also, the way we do overload resolution in C# is different from any other language I know of, for reasons of versioning. Whenever we looked at designing a particular feature, we would always cross check with versioning. We would ask, "How does versioning change this? How does this function from a versioning perspective?" It turns out that most language design before has given very little thought to that.
Unlike Java, why does C# treat methods as non-virtual by default? Is it more likely to be a performance concern than something else?
I am reminded of reading a paragraph from Anders Hejlsberg about several advantages the existing architecture brings. But what about the side effects? Is it really a good trade-off to have non-virtual methods by default?
Classes should be designed for inheritance to be able to take advantage of it. Having methods virtual by default means that every function in the class can be plugged out and replaced by another, which is not really a good thing. Many people even believe that classes should have been sealed by default.
Virtual methods can also have a slight performance implication. This is not likely to be the primary reason, however.
I'm surprised that there seems to be such a consensus here that non-virtual-by-default is the right way to do things. I'm going to come down on the other - I think pragmatic - side of the fence.
Most of the justifications read to me like the old "If we give you the power you might hurt yourself" argument. From programmers?!
It seems to me like the coder who didn't know enough (or have enough time) to design their library for inheritance and/or extensibility is the coder who's produced exactly the library I'm likely to have to fix or tweak - exactly the library where the ability to override would come in most useful.
The number of times I've had to write ugly, desperate work-around code (or to abandon usage and roll my own alternative solution) because I can't override far, far outweighs the number of times I've ever been bitten (e.g. in Java) by overriding where the designer might not have considered I might.
Non-virtual-by-default makes my life harder.
UPDATE: It's been pointed out [quite correctly] that I didn't actually answer the question. So - and with apologies for being rather late....
I kinda wanted to be able to write something pithy like "C# implements methods as non-virtual by default because a bad decision was made which valued programs more highly than programmers". (I think that could be somewhat justified based on some of the other answers to this question - like performance (premature optimisation, anyone?), or guaranteeing the behaviour of classes.)
However, I realise I'd just be stating my opinion and not that definitive answer that Stack Overflow desires. Surely, I thought, at the highest level the definitive (but unhelpful) answer is:
They're non-virtual by default because the language-designers had a decision to make and that's what they chose.
Now I guess the exact reason that they made that decision we'll never.... oh, wait! The transcript of a conversation!
So it would seem that the answers and comments here about the dangers of overriding APIs and the need to explicitly design for inheritance are on the right track but are all missing an important temporal aspect: Anders' main concern was about maintaining a class's or API's implicit contract across versions. And I think he's actually more concerned about allowing the .Net / C# platform to change under code rather than concerned about user-code changing on top of the platform. (And his "pragmatic" viewpoint is the exact opposite of mine because he's looking from the other side.)
(But couldn't they just have picked virtual-by-default and then peppered "final" through the codebase? Perhaps that's not quite the same.. and Anders is clearly smarter than me so I'm going to let it lie.)
Because it's too easy to forget that a method may be overridden and not design for that. C# makes you think before you make it virtual. I think this is a great design decision. Some people (such as Jon Skeet) have even said that classes should be sealed by default.
To summarize what others said, there are a few reasons:
1- In C#, there are many things in syntax and semantics that come straight from C++. The fact that methods were not virtual by default in C++ influenced C#.
2- Having every method virtual by default is a performance concern because every method call must use the object's Virtual Table. Moreover, this strongly limits the Just-In-Time compiler's ability to inline methods and perform other kinds of optimization.
3- Most importantly, if methods are not virtual by default, you can guarantee the behavior of your classes. When they are virtual by default, such as in Java, you can't even guarantee that a simple getter method will do as intended because it could be overridden to do anything in a derived class (of course you can, and should, make the method and/or the class final).
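A minimal sketch of the third point (the class and member names are made up for illustration): in C#, the getter below cannot be replaced by a derived class because it was never marked virtual, so its behaviour is guaranteed; only the explicitly virtual member can be overridden.

class Account
{
    private readonly decimal balance;
    public Account(decimal balance) { this.balance = balance; }

    // Non-virtual by default: a derived class cannot replace this
    // implementation, so its behaviour is guaranteed to callers.
    public decimal GetBalance() => balance;

    // Only explicitly virtual members can be overridden.
    public virtual string Describe() => $"Account with balance {balance}";
}

class AuditedAccount : Account
{
    public AuditedAccount(decimal balance) : base(balance) { }

    // Allowed: Describe() was declared virtual.
    public override string Describe() => "[audited] " + base.Describe();

    // public override decimal GetBalance() => 0;   // compile error: nothing to override
}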
One might wonder, as Zifre mentioned, why the C# language did not go a step further and make classes sealed by default. That's part of the whole debate about the problems of implementation inheritance, which is a very interesting topic.
C# is influenced by C++ (and more). C++ does not enable dynamic dispatch (virtual functions) by default. One (good?) argument for this is the question: "How often do you implement classes that are members of a class hierarchy?". Another reason to avoid enabling dynamic dispatch by default is the memory footprint: a class without a virtual pointer (vptr) pointing to a virtual table is of course smaller than the corresponding class with late binding enabled.
The performance issue is not so easy to say "yes" or "no" to. The reason for this is the Just In Time (JIT) compilation which is a run time optimization in C#.
Another, similar question about "speed of virtual calls.."
The simple reason is design and maintenance cost in addition to performance costs. A virtual method has additional cost as compared with a non-virtual method because the designer of the class must plan for what happens when the method is overridden by another class. This has a big impact if you expect a particular method to update internal state or have a particular behavior. You now have to plan for what happens when a derived class changes that behavior. It's much harder to write reliable code in that situation.
With a non-virtual method you have total control. Anything that goes wrong is the fault of the original author. The code is much easier to reason about.
If all C# methods were virtual then the vtbl would be much bigger.
C# objects only have virtual methods if the class has virtual methods defined. It is true that all objects have type information that includes a vtbl equivalent, but if no virtual methods are defined then only the base Object methods will be present.
@Tom Hawtin: It is probably more accurate to say that C++, C# and Java are all from the C family of languages :)
Coming from a Perl background, I think C# sealed the doom of every developer who might have wanted to extend and modify the behaviour of a base class through a non-virtual method, without forcing all users of the new class to be aware of potentially behind-the-scenes details.
Consider the List class' Add method. What if a developer wanted to update one of several potential databases whenever a particular List is 'Added' to? If 'Add' had been virtual by default the developer could develop a 'BackedList' class that overrode the 'Add' method without forcing all client code to know it was a 'BackedList' instead of a regular 'List'. For all practical purposes the 'BackedList' can be viewed as just another 'List' from client code.
This makes sense from the perspective of a large main class which might provide access to one or more list components which themselves are backed by one or more schemas in a database. Given that C# methods are not virtual by default, the list provided by the main class cannot be a simple IEnumerable or ICollection or even a List instance but must instead be advertised to the client as a 'BackedList' in order to ensure that the new version of the 'Add' operation is called to update the correct schema.
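A minimal sketch of the problem described above (BackedList is the hypothetical class from the answer, with the database update reduced to a console message): since List<T>.Add is not virtual, a derived class can only hide it with new, and the hidden method is silently skipped whenever the object is used through a List<T> reference.

using System;
using System.Collections.Generic;

class BackedList<T> : List<T>
{
    // List<T>.Add is not virtual, so this can only *hide* it, not override it.
    public new void Add(T item)
    {
        Console.WriteLine("Updating the backing store...");  // stand-in for the DB update
        base.Add(item);
    }
}

class Demo
{
    static void Main()
    {
        BackedList<int> backed = new BackedList<int>();
        backed.Add(1);     // prints the message: compile-time type is BackedList<int>

        List<int> asList = backed;
        asList.Add(2);     // silently skips BackedList.Add: no message, no backing-store update
    }
}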
It is certainly not a performance issue. Sun's Java interpreter uses the same code to dispatch (the invokevirtual bytecode) and HotSpot generates exactly the same code whether final or not. I believe all C# objects (but not structs) have virtual methods, so you are always going to need the vtbl/runtime class identification. C# is a dialect of "Java-like languages". To suggest it comes from C++ is not entirely honest.
There is an idea that you should "design for inheritance or else prohibit it". Which sounds like a great idea right up to the moment you have a severe business case to put in a quick fix. Perhaps inheriting from code that you don't control.
Performance.
Imagine a set of classes that override a virtual base method:
class Base {
    public virtual int func(int x) { return 0; }
}

class ClassA : Base {
    public override int func(int x) { return x + 100; }
}

class ClassB : Base {
    public override int func(int x) { return x + 200; }
}
Now imagine you want to call the func method:
Base foo;
//...sometime later...
int x = foo.func(42);
Look at what the CPU has to actually do:
mov ecx, foo$   -- load the address of the ClassB.func method from the object's VMT
push 42         -- push the 42 argument
call [ecx]      -- call ClassB.func
No problem? No, problem!
The assembly isn't that hard to follow:
mov ecx, foo$: This needs to reach into memory, and hit the part of the object's Virtual Method Table (VMT) to get the address of the overridden func method. The CPU will begin the fetch of the data from memory, and then it will continue on:
push 42: Push the argument 42 onto the stack for the call to the function. No problem, that can run right away, and then we continue to:
call [ecx]: Call the address of the ClassB.func function. STALL!
That's a problem. The address of the ClassB.func function has not been fetched from the VMT yet. This means that the CPU doesn't know where to go next. Ideally it would follow the jump and continue speculatively executing instructions while it waits for the address of ClassB.func to come back from memory. But it can't, so we wait.
If we are lucky, the data is already in the L2 cache. Getting a value out of the L2 cache into a place where it can be used is going to take 12-15 cycles. The CPU can't know where to go next without waiting for memory for those 12-15 cycles.
The CPU is stalled for 12-15 cycles.
Our program is stuck doing nothing for 12-15 cycles.
The CPU core has 7 execution engines. The main job of the CPU is keeping those 7 pipelines full of stuff to do. That means:
reordering your machine code on the fly (out-of-order execution)
starting the fetch from memory as soon as possible, letting us move on to other things
executing 100, 200, 300 instructions ahead; it will be executing 17 iterations ahead in your loop, across multiple function calls and returns
using a branch predictor to guess which way a comparison will go, so that it can keep executing ahead while we wait. If it guesses wrong it has to undo all that work, but the branch predictor is not stupid - it's right about 94% of the time.
Your CPU has all this power, and capability, and it's just STALLED FOR 15 CYCLES!?
This is awful. This is terrible. And you suffer this penalty every time you call a virtual method - whether you actually overrode it or not.
Our program is 12-15 cycles slower every method call because the language designer made virtual methods opt-out rather than opt-in.
This is why Microsoft decided to not make all methods virtual by default: they learned from Java's mistakes.
Someone ported Android to C#, and it was faster
In 2012, the Xamarin people ported all of Android's Dalvik (i.e. Java) to C#. From them:
Performance
When C# came around, Microsoft modified the language in a couple of significant ways that made it easier to optimize. Value types were introduced to allow small objects to have low overheads and virtual methods were made opt-in, instead of opt-out which made for simpler VMs.
(emphasis mine)
Do default parameters for methods violate Encapsulation?
What was the rationale behind not providing default parameters in C#?
I would take this as the "official" answer from Microsoft. However, default (and named) parameters will most definitely be available in C# 4.0.
No, it doesn't affect encapsulation in any way. It simply is not often necessary. Often, creating an overload which takes fewer arguments is a more flexible and cleaner solution, so C#'s designers simply did not see a reason to add the complexity of default parameters to the language.
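A minimal sketch of that overloading pattern (the class and method names are made up for illustration), which is how "default" arguments were typically expressed before C# 4.0:

using System;

class MessageBoxish
{
    // The "full" method carries all the parameters.
    public static void Show(string text, string caption, bool modal)
    {
        Console.WriteLine($"{caption}: {text} (modal={modal})");
    }

    // Overloads with fewer parameters forward to it, supplying the defaults.
    public static void Show(string text, string caption) => Show(text, caption, true);
    public static void Show(string text) => Show(text, "Info", true);
}

// Usage:
// MessageBoxish.Show("Saved.");
// MessageBoxish.Show("Saved.", "Result", false);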
Adding "Another way to do the same thing" is always a tradeoff. In some cases it may be convenient. But the more syntax you make legal, the more complex the language becomes to learn, and the more you may wall yourself in, preventing future extension. (Perhaps they'd one day come up with another extension to the language, which uses a similar syntax. Then that'd be impossible to add, because it'd conflict with the feature they added earlier)
As has been noted, default parameters were not a prioritized feature, but are likely to be added in C# 4.0. However, I believe there were excellent reasons not to include them earlier (in 4.0, as I've understood it, it's mostly to support duck typing styles of programming, where default parameters increase type compatibility).
I believe excessive parameter lists (certainly more than 4-5 distinct parameters) are a code smell. Default parameters are not evil in themselves, but they risk encouraging poor design, delaying the refactoring into more objects.
To your first question - no, it's exactly the same as providing multiple overloaded constructors. As for the second, I couldn't say.
Default parameters will be included in C# 4.0
Some reading material about it:
click
click
It also seems that the author of this post will publish an article in the near future on why MS chose to implement default params in C#.
Here is an answer as to why it's not provided in C#:
http://blogs.msdn.com/csharpfaq/archive/2004/03/07/85556.aspx
One drawback of the default parameter implementation in C# 4.0 is that it creates a dependency on the parameter's name. This already existed in VB, which could be one reason why they chose to implement it in 4.0.
Another drawback is that the default value used depends on how you cast your object (it is resolved from the static type at the call site). You can read about it here: http://saftsack.fs.uni-bayreuth.de/~dun3/archives/optional-parameters-conclusion-treat-like-unsafe/216.html
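A minimal sketch of that second gotcha (the interface and class names are made up for illustration): default values are substituted at the call site from the statically known member, so the same object can receive different defaults depending on the type of the reference you call it through.

using System;

interface IGreeter
{
    void Greet(string name = "interface default");
}

class Greeter : IGreeter
{
    public void Greet(string name = "class default")
    {
        Console.WriteLine(name);
    }
}

class Demo
{
    static void Main()
    {
        Greeter g = new Greeter();
        g.Greet();        // prints "class default"

        IGreeter ig = g;  // same object, different static type
        ig.Greet();       // prints "interface default"
    }
}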