From the Anders Hejlsberg interview article: "the way we do overload resolution in C# is different from any other language"
Can somebody provide some examples with C# and Java?
What Anders was getting at here was that the original design team explicitly designed the overload resolution algorithm to have certain properties that worked nicely with versioning scenarios, even though those properties seem backwards or confusing when you consider the scenarios without versioning.
Probably the most common example of that is the rule in C# that if any method on a more-derived class is an applicable candidate, it is automatically better than any method on a less-derived class, even if the less-derived method has a better signature match. This rule is not, to my knowledge, found in other languages that have overload resolution. It seems counterintuitive; if there's a method that is a better signature match, why not choose it? The reason is because the method that is a better signature match might have been added in a later version and thereby be introducing a "brittle base class" failure.
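A minimal sketch of that rule (hypothetical Base/Derived classes, not from the interview): the applicable candidate on the more-derived class wins even though the base class has an exact signature match.

using System;

class Base
{
    public void M(int x) { Console.WriteLine("Base.M(int)"); }
}

class Derived : Base
{
    // Imagine this overload was added in a later version of the library.
    public void M(double x) { Console.WriteLine("Derived.M(double)"); }
}

class Demo
{
    static void Main()
    {
        var d = new Derived();
        d.M(1); // Prints "Derived.M(double)": the derived candidate is applicable
                // (int converts to double), so the exact match on Base is never considered.
    }
}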
For more thoughts on how various languages handle brittle base class failures, see
Link
and for more thoughts on overload resolution, see
Link
The way that C# handles overloading from an internal perspective is what's different.
The complete quote from Anders:
I have always described myself as a pragmatic guy. It's funny, because versioning ended up being one of the pillars of our language design. It shows up in how you override virtual methods in C#. Also, the way we do overload resolution in C# is different from any other language I know of, for reasons of versioning. Whenever we looked at designing a particular feature, we would always cross check with versioning. We would ask, "How does versioning change this? How does this function from a versioning perspective?" It turns out that most language design before has given very little thought to that.
Related
I came across this today, and I am surprised that I haven't noticed it before. Given a simple C# program similar to the following:
using System;

public class Program
{
    public static void Main(string[] args)
    {
        Method(); // Called the method with no arguments.
        Method("a string"); // Called the method with a string.
        Console.ReadLine();
    }

    public static void Method()
    {
        Console.WriteLine("Called the method with no arguments.");
    }

    public static void Method(string aString = "a string")
    {
        Console.WriteLine("Called the method with a string.");
    }
}
You get the output shown in the comments for each method call.
I understand why the compiler chooses the overloads that it does, but why is this allowed in the first place? I am not asking what the overload resolution rules are, I understand those, but I am asking if there is a technical reason why the compiler allows what are essentially two overloads with the same signature?
As far as I can tell, a function overload with a signature that differs from another overload only through having an additional optional argument offers nothing more than it would if the argument (and all preceding arguments) were simply required.
One thing it does do is make it possible for a programmer (who probably isn't paying enough attention) to think they're calling a different overload from the one that they actually are.
I suppose it's a fairly uncommon case, and the answer for why this is allowed may just be because it's simply not worth the complexity to disallow it, but is there another reason why C# allows function overloads to differ from others solely through having one additional optional argument?
His point that Eric Lippert could have an answer led me to this https://meta.stackoverflow.com/a/323382/1880663, which makes it sound like my question will only annoy him. I'll try to rephrase it to make it clearer that I'm asking about the language design, and that I'm not looking for a spec reference.
I appreciate it! I am happy to talk about language design; what annoys me is when I waste time doing so when the questioner is very unclear about what would actually satisfy their request. I think your question was phrased clearly.
The comment to your question posted by Hans is correct. The language design team was well aware of the issue you raise, and this is far from the only potential ambiguity created by optional / named arguments. We considered a great many scenarios for a long time and designed the feature as carefully as possible to mitigate potential problems.
All design processes are the result of compromise between competing design principles. Obviously there were many arguments for the feature that had to be balanced against the significant design, implementation and testing costs, as well as the costs to users in the form of confusion, bugs, and so on, from accidental construction of ambiguities such as the one you point out.
I'm not going to rehash what was dozens of hours of debate; let me just give you the high points.
The primary motivating scenario for the feature was, as Hans notes, popular demand, particularly coming from developers who use C# with Office. (And full disclosure, as a guy on the team that wrote the C# programming model for Word and Excel before I joined the C# team, I was literally the first one asking for it; the irony that I then had to implement this difficult feature a couple years later was not lost on me.) Office object models were designed to be used from Visual Basic, a language that has long had optional / named parameter support.
C# 4 might have seemed like a bit of a "thin" release in terms of obvious features. That's because a lot of the work done in that release was infrastructure for allowing more seamless interoperability with object models that were designed for dynamic languages. The dynamic typing feature is the obvious one, but there were numerous other small features added that combine together to make working with dynamic and legacy COM object models easier. Named / optional arguments was just one of them.
The fact that we had existing languages like VB that had this specific feature for decades and the world hadn't ended yet was further evidence that the feature was both doable and valuable. It's great having an example where you can learn from its successes and failures before designing a new version of the feature.
As for the specific situation you mention: we considered doing things like detecting when there was a possible ambiguity and issuing a warning, but that then opens up a whole other can of worms. Warnings have to be for code that is common, plausible and almost certainly wrong, and there should be a clear way to address the problem that causes the warning to go away. Writing an ambiguity detector is a lot of work; believe me, it took way longer to write the ambiguity detection in overload resolution than it took to write the code to handle successful cases. We didn't want to spend a lot of time on adding a warning for a rare scenario that is hard to detect and for which there might be no clear advice on how to eliminate the warning.
Also, frankly, if you write code where you have two methods named the same thing that do something completely different depending on which one you call, you already have a larger design problem on your hands! Fix that problem first, rather than worrying that someone is going to accidentally call the wrong method; make it so that either method is the right one to call.
This behaviour is specified by Microsoft on MSDN. Have a look at Named and Optional Arguments (C# Programming Guide).
If two candidates are judged to be equally good, preference goes to a candidate that does not have optional parameters for which arguments were omitted in the call. This is a consequence of a general preference in overload resolution for candidates that have fewer parameters.
One reason they may have decided to implement it this way could be so that you can overload a method afterwards without having to change all the method calls that are already written.
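For example, a minimal sketch of that versioning scenario (hypothetical Logger class): an optional parameter is added in a later version and existing call sites keep compiling unchanged. Note this is source-compatible only; callers compiled against the old single-parameter signature would still need a recompile.

using System;

static class Logger
{
    // v1 shipped as: public static void Log(string message)
    // v2 adds an optional parameter; the old call site below still compiles unchanged.
    public static void Log(string message, bool includeTimestamp = false)
    {
        Console.WriteLine(includeTimestamp ? DateTime.Now + " " + message : message);
    }
}

class Demo
{
    static void Main()
    {
        Logger.Log("started");                         // existing v1-style call, still fine
        Logger.Log("started", includeTimestamp: true); // new call opts in to the new behaviour
    }
}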
UPDATE
I'm surprised that even Jon Skeet has no real explanation of why they did it like this.
I think this question basically boils down to how those signatures are represented by the intermediate language. Note that the signatures of both overloads are not equal! The second method has a signature like this:
.method public hidebysig static void Method([opt] string aString) cil managed
{
    .param [1] = string('a string')
    // ...
}
In IL the signature of the method is different: it takes a string, which is marked as optional. This changes how the parameter gets initialized, but it does not change the presence of the parameter.
The compiler is not able to decide which method you are calling, so it uses the one that fits best, based on the parameters you provide. Since you did not provide any parameters for the first call, it assumes that you are calling the overload without any parameters.
In the end it is a question about good code design. As a rule of thumb, I either use optional parameters or overloads, depending on what I want to do: Optional parameters are good, if the logic within the method does not depend on the provided arguments, while overloads are good to provide a different implementation for different sets of arguments. If you ever find yourself checking if a parameter equals a default value in order to decide what to do, you should probably go for an overload. On the other hand, if you find yourself repeating large chunks of code in many overloads, you should try extracting optional parameters.
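As an illustration of that rule of thumb, a small sketch (hypothetical methods, not from the question):

using System;

static class Reports
{
    // Good fit for an optional parameter: the logic does not branch on whether the caller supplied it.
    public static void Print(string text, int copies = 1)
    {
        for (int i = 0; i < copies; i++)
            Console.WriteLine(text);
    }

    // Better as overloads: each signature provides a genuinely different implementation.
    public static void Export(string path)     { Console.WriteLine("Writing to file " + path); }
    public static void Export(Uri destination) { Console.WriteLine("Uploading to " + destination); }
}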
There's also a good answer by Jon Skeet to this question.
Consider, for example, the method:
public static Attribute GetCustomAttribute(this ParameterInfo element, Type attributeType);
defined in System.Reflection.CustomAttributeExtensions
Wouldn't it make more sense to define instead:
public static T GetCustomAttribute<T>(this ParameterInfo element) where T : Attribute;
And save the casting?
The non-generic method of retrieving a custom attribute is from the old .NET days, when generics weren't implemented yet.
For current and future coding, you can take advantage of CustomAttributeExtensions.GetCustomAttributes<T>, provided you're coding against .NET 4.5 or above.
Sadly (or maybe just naturally), software has a sequential evolution. There was a time when generics weren't with us (.NET 1.0 and 1.1), there's a lot of code base inherited from those early .NET versions, and because of framework team prioritization it seems that not every method that would be better off taking generic parameter(s) has been updated yet.
About inherited code
@BenRobinson said in a comment below:
The point I was making was that these extension methods were added in .NET 4.5 (all of them, not just the generic ones) and extension methods were added after generics, so non-generic extension methods of any kind have nothing to do with backwards compatibility with .NET 1.0/1.1.
I'm adding this to avoid confusion: don't read "inherited code" as Microsoft leaving the non-generic code base unchanged for backwards compatibility with third-party code.
What I'm pointing out is that the current .NET version itself carries a lot of code inherited from early .NET versions, whether or not Microsoft's intention is to maintain backwards compatibility with third-party code.
I assume or guess that the .NET Framework team has prioritized new base class library (BCL) and satellite framework additions, and some members coming from the pre-generics era are still as-is because the change isn't worth the effort. We could discuss whether it's worth the effort or whether they made design mistakes, but Stack Overflow isn't a discussion board, is it?
There is indeed an overload that is equivalent to your example, GetCustomAttribute<T>(ParameterInfo); however, to call this method without nasty reflection, you need to know the type of T at compile time. If you only know the type of T at run time, then the equivalent method is GetCustomAttribute(ParameterInfo, Type).
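A small sketch of the distinction (using a hypothetical MarkerAttribute and assuming .NET 4.5's System.Reflection.CustomAttributeExtensions):

using System;
using System.Reflection;

[AttributeUsage(AttributeTargets.Parameter)]
class MarkerAttribute : Attribute { }

class Demo
{
    static void Inspect(ParameterInfo parameter, Type attributeTypeKnownOnlyAtRunTime)
    {
        // Type known at compile time: the generic overload returns a typed result, no cast needed.
        MarkerAttribute marker = parameter.GetCustomAttribute<MarkerAttribute>();

        // Type known only at run time: the non-generic overload is the one you can call directly.
        Attribute attribute = parameter.GetCustomAttribute(attributeTypeKnownOnlyAtRunTime);
    }
}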
Generics were added to the C# language in version 2. I believe attributes were in the language at version 1 or 1.1 (I can't remember which; I think it was version 1, but I could be wrong).
This means that even though they would have saved a lot of unnecessary casting by changing all methods to use generics instead, they could have broken backwards compatibility. And breaking backwards compatibility is bad™.
Edit:
Also, I just thought about one more reason.
If you are writing reflection code, it's often quite a hassle to call a generic method via reflection (C# has a really stupid API for doing so...), so using the non-generic version is in many cases much easier than using the generic one.
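To illustrate that hassle, a hedged sketch of what invoking the generic GetCustomAttribute<T>(ParameterInfo) looks like when the attribute type is only known at run time (the helper name is hypothetical):

using System;
using System.Linq;
using System.Reflection;

static class ReflectionHelpers
{
    static Attribute GetAttributeViaGenericMethod(ParameterInfo parameter, Type attributeType)
    {
        // Locate the open generic GetCustomAttribute<T>(ParameterInfo) definition...
        MethodInfo open = typeof(CustomAttributeExtensions)
            .GetMethods(BindingFlags.Public | BindingFlags.Static)
            .First(m => m.Name == "GetCustomAttribute"
                        && m.IsGenericMethodDefinition
                        && m.GetParameters().Length == 1
                        && m.GetParameters()[0].ParameterType == typeof(ParameterInfo));

        // ...close it over the runtime type and invoke it.
        MethodInfo closed = open.MakeGenericMethod(attributeType);
        return (Attribute)closed.Invoke(null, new object[] { parameter });

        // Compare with the non-generic call, which is a single line:
        // return parameter.GetCustomAttribute(attributeType);
    }
}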
Edit again:
Damn, Ben Robinson beat me to the reflection point by a minute! :)
We are trying to use the IDesign C# Coding Standard. Unfortunately, I found no comprehensive document explaining all the rules it gives, and his book does not always help either.
Here are the open questions that remain for me (from chapter 2, Coding Practices):
No. 26: Avoid providing explicit values for enums unless they are integer powers of 2
No. 34: Always explicitly initialize an array of reference types using a for loop
No. 50: Avoid events as interface members
No. 52: Expose interfaces on class hierarchies
No. 73: Do not define method-specific constraints in interfaces
No. 74: Do not define constraints in delegates
Here's what I think about those:
I thought that providing explicit values would be especially useful when adding new enum members at a later point in time. If these members are added between other already existing members, I would provide explicit values to make sure the integer representation of existing members does not change.
No idea why I would want to do this. I'd say this totally depends on the logic of my program.
I see that there is an alternative option of providing "sink interfaces" (simply providing all the "OnXxxHappened" methods directly), but what is the reason to prefer one over the other?
Unsure what he means here: could this mean "When implementing an interface explicitly in a non-sealed class, consider providing the implementation in a protected virtual method that can be overridden"? (see Programming .NET Components 2nd Edition, end of chapter "Interfaces and Class Hierarchies", and the sketch after this list).
I suppose this is about providing a "where" clause when using generics, but why is this bad on an interface?
I suppose this is about providing a "where" clause when using generics, but why is this bad on a delegate?
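For reference, a minimal sketch of the pattern guessed at for rule 52 above (hypothetical Document class, my reading of the book's pattern rather than a quote from it): the interface is implemented explicitly, and the actual behaviour lives in a protected virtual method that derived classes can override.

using System;

public class Document : IComparable
{
    // Explicit implementation keeps CompareTo off the public surface of the class...
    int IComparable.CompareTo(object obj)
    {
        return CompareTo(obj);
    }

    // ...while the protected virtual method lets the class hierarchy refine the behaviour.
    protected virtual int CompareTo(object obj)
    {
        return 0; // placeholder comparison
    }
}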
Ok so I will basically chip in with the little rep I have on stackoverflow: You can all kill me for opening a religious debate.
The hint to working with Juval's code standard is really in its preface:
"I believe that while fully understanding every insight that goes into a particular programming decision may require reading books and even years of experience, applying the standard should not."
Next hint is to read his work.
Next would be doing what he advises.
Maybe along the way you will understand why he is recognised as a "Microsoft Software Legend".
The next might be that it is delivered in a read-only pdf.
That didn't satisfy you?
The key here is realizing this is not a religious narrative, it is indeed a pragmatic one.
My personal experience with his standard is pretty much:
If you follow the advice as best you can, you will get a productive outcome.
Your need for refactoring will be less, and your productivity in the later cycles of the project will be higher.
This is an experience based evaluation.
Basically what you are looking for are the answers to "why" certain portions of the coding standard is in there, specifically these: 26, 34, 50, 52, 73, 74
You are asking these questions from a pragmatic perspective: You want to incorporate the standard and you "somehow" know this will give you a benefit in the longer run.
By examining these rules, through experience, you will perhaps come to understand why they are indeed there. Experience is here doing and not doing your code in accordance with the principles.
But really that isn't the point. The point is that by working from the basis of a well considered, mature and reliable standard, your work now becomes work of quality.
On how to follow a standard
Really read the rules as advice; they are already very strictly worded: "Avoid", "Do not", "Do" and so on really mean what they say. Avoid means "think really hard before breaking the principle". "Do not" really means "do not".
Juval is all about decoupling, and most of his harder-to-grok advice really stems from people not thinking "separation" into their code design.
Sometimes this is hard to do with a team that works in less abstract terms or in more feature oriented environments, and you can therefore sometimes find it necessary to break rules, "to get the team moving", but you really should refactor this to the standard when you can do so.
After a few years, a few projects more, you perhaps will come to understand the rationale for each and every rule in the simple guide. If you are a bright student. Me, I'm not so bright, so I base it on experience: Stuff that is in the non-standardized parts of the code often fails during integration testing. It is often harder to come back to. It often ties poorly to the environment and so on.
It really is a matter of trust. If you can't find it in yourself to trust this standard (not adopt it - trust it), I will propose to you that you will find it hard to ever really write comprehensible C# code that is generally of an acceptable quality.
This is of course not considering the few really smart and really experienced people out there who have managed to build up their own internalizable rule sets on how to write code that is:
Stable, readable, maintainable, extendable and generally mega-ble.
This is not saying that this is the only standard, just that it does indeed work. Also over time.
You will be able to find other standards, most if not all of them sub-standard to this one for code you "really want to work" in the long run, involving a team of developers and situations that change in real time.
So consider your standard, it is really setting the bar for the quality you provide for the $ you earn.
Do not reconsider it, unless you have "real good reasons to".
But do learn from it.
You reached here and you are still not satisfied. You want answers!
Darn it. Ok let me give you my clouded and sub-par experience based view on the rules your team can't grok.
No. 26: Avoid providing explicit values for enums unless they are integer powers of 2
It's about simplicity and working within that simplicity.
Enums in C# are already numbered starting from "i = 0", with "++i" injected for every new field you define. The simplest way to [read code with/] think about enums is therefore that either you flag them or you enumerate them.
When you don't do this, you are probably doing something wrong.
If you use the enum to "map" some other domain model, this rule can be broken, but the enum should be visibly separated from normal enums through placement/namespace/etc. - to indicate you are "doing something not ordinary".
Now look at the enums you have created out-of-standard. Really look at them. Probably they are mapping something that really does not belong in a hard-coded enum at all. You wanted the efficiency of a switch, but you have in actuality now begun to hard-code out-of-domain properties. Sometimes this is ok and can be refactored later, sometimes it is not.
So avoid it.
The argument of "inserting into the middle" during the development process is not really an issue that should break the standard. Why? Well, if you are using or in any way storing the "int value" of the enumeration, then you are already diverging into a usage pattern where you need to stay very focused indeed. Not using 0, 1, 2 is not the answer to problems in that particular domain, i.e. you are probably "doing it wrong".
The EF code-first argument is probably not a reason to circumvent the rule here either. If you feel you have such a need, please do describe your situation in a separate thread and I will try to guide you on how to both follow the standard and work with your persistence framework.
My initial take would be: if they are code-first POCO entities, then let them abide by the code standard. After all, your team has decided to work with a code-first view on the data model. Allow them to do just that, following the same semantics as the rest of their code base.
If you run into specific problems related to the mapping to database tables, then solve these problems while maintaining the standard. For EF 4.3 I would say use the get/set with int pattern and replace in 5.0.
The real gritty stuff here is how to maintain this entity as the database evolves. Be sure to provide clear migration paths for your production database when your enum-containing entities change design. Be very sure if the entities themselves change values. This goes for add/remove/insert of new definitions, not just add.
No. 34: Always explicitly initialize an array of reference types using a for loop
My guess is that this is an "experience-based" rule, in the sense that he looked through a lot of code and this point seemed to fail often for those who did not do it.
No. 50: Avoid events as interface members
Basically a question of separation - your interfaces are now coupled "both" ways, into the interface and out of it. Read his book for clarification on why that is not "best practice" in the C# world.
In my limited view this is probably argumentation along the lines of: "Call-backs are very different in nature from function-calls. they make assumptions on the state of the receiver at an undefined "later time" of execution. If you want to provide callbacks, by all means do provide them, but make them into separate interfaces and also define an interface to negotiate these callbacks, in effect establishing all the hard things you wanted to abstract away, but really just circumvented. Or probably just use a recognized pattern. aka read the book."
No. 52: Expose interfaces on class hierarchies
Separation. I can't really explain it any further than: the alternative is hard to factor into a larger context/solution.
No. 73: Do not define method-specific constraints in interfaces
Separation. same as #52
No. 74: Do not define constraints in delegates
Separation.
Conclusion
......
Another view on #50 is that it is one of those rules where:
If you don't get it, it's probably not important for you.
Importance here is an estimate on how critical the code is - and how critically you want that code to always work as intended.
This again leads into a broader context on how to verify your code actually is quality.
Some would say this can be done with tests only.
They however fail to understand the interconnectedness between defensively writing quality software and aggressively trying to prove it wrong.
In the real world, those two should be parts of a greater totality with the actual requirements of the software being the third.
Doing a quick mock-up? Who cares. Doing a little visual thingy whose runtime constraints you understand? Who cares - well, probably you should, but it is perhaps acceptable. And so on.
In the end compliance to a coding standard is not something you take on lightly.
There are reasons for doing it that go beyond just the code:
Collaboration and communication primarily.
Agreement on shared views.
Assertions of assumptions.
etc.
No. 26: Powers of two mean you want to use the enum as a bitmask (flags). That's the only reason to specify enum values. For adding new members later on, you can still append them to the enum definition without changing existing values. There is no reason to put them between existing members.
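A minimal sketch of the two cases (hypothetical enums): explicit values only where the enum is a flags bitmask, implicit values everywhere else.

using System;

[Flags]
enum FileAccessRights
{
    None    = 0,
    Read    = 1,
    Write   = 2,
    Execute = 4,
    All     = Read | Write | Execute
}

enum Color // plain enumeration: let the compiler number the members 0, 1, 2, ...
{
    Red,
    Green,
    Blue
}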
No. 34: I think he wants to avoid the situation where you have an array which contains (partially) uninitialized pointers (null references). As the consumer of an array, it's tempting not to check for null entries in a valid array variable.
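A small sketch of what rule 34 guards against (hypothetical Widget class): new Widget[3] gives you three null references, so each slot is initialized explicitly before the array is handed to consumers.

using System;

class Widget { }

class Demo
{
    static void Main()
    {
        var widgets = new Widget[3];           // every element is null at this point
        for (int i = 0; i < widgets.Length; i++)
            widgets[i] = new Widget();         // now consumers won't trip over null entries

        Console.WriteLine(widgets[0] != null); // True
    }
}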
No. 26: He's quite wrong, at least for public enums. The problem crops up when you remove an item, not when you add one (for which adding to the end of the list would be equivalent to adding an item with the next available value).
For the others, I'm really not sure why he's making these recommendations, although I have to admit that I find a few (or maybe most) of them rather dubious...
So yeah, the question basically says it all. What do you gain when you ensure that private members / methods / whatever are marked private (or protected, or public, or internal, etc) appropriately?
I mean, of course I could just go and mark all my methods as public and everything should still work fine. Of course, if we'd talk about good programming practice (which I am a solid advocate of, by the way), I'd mark a method as private if it should be marked as such, no questions asked.
But let's set aside good programming practice, and just look at this in terms of actual quantitative gain. What do I get for proper scoping of my methods, members, classes, etc.?
I'm thinking that this would most generally translate to performance gains, but I'd appreciate it if someone could provide more detail about it.
(For purposes of this question, I'm thinking more along C#.NET, but hey, feel free to provide answers on whatever language / framework you deem fit.)
EDIT: Most pointed out that this doesn't lead to performance gain, and yeah, thinking back, I don't even know why I thought that. Lack of coffee probably.
In any case, any good programmer should know about how proper scopes (1) help your code maintenance / (2) control the proper use of your library / app / package; I was kinda curious as to whether or not there was any other benefit you get from it that's not apparently obvious outright. Based on the answers below, it looks like it basically sums up to just those two things most importantly.
Performance has absolutely nothing to do with the visibility of methods. Virtual methods have some overhead, but that's not why we scope. It has to do with maintenance of code. Your public methods are the API to your class or library. You as a class designer want to provide some guarantee to the outside world that future changes aren't going to break other peoples code. By marking some methods private, you take away the ability for users to depend on certain implementations which allows you freedom to change that implementation at will.
Even languages that don't have visibility modifiers, like python, have conventions for marking methods as internal and subject to change. By prefixing the method with an _underscore(), you're signalling to the outside world that if you use that method, you do so at your own risk, as it can change at any time.
On the other hand, public methods are an explicit entry way into your code. Every effort should go towards making public methods backward compatible to avoid the pitfalls I described above.
By better encapsulation, you provide a better API. Only the methods / properties that are of interest to the user of your class are available: visible.
Next to that, you ensure that certain variables that should not be called / modified, cannot be called/modified.
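To make those two points concrete, a small sketch (hypothetical BankAccount class): the field can only change through the public members, so the invariant cannot be broken from outside.

public class BankAccount
{
    private decimal balance; // cannot be read or modified directly from outside the class

    public void Deposit(decimal amount)
    {
        if (amount <= 0)
            throw new System.ArgumentOutOfRangeException(nameof(amount));
        balance += amount;
    }

    public decimal Balance
    {
        get { return balance; } // read-only to callers
    }
}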
That's the most important thing. Why do you think this would lead to performance gains?
As I see it, you gain two important things from proper scoping. Your API is reduced in size and clearly focused on the task at hand.
Second, you get a less brittle implementation as you are free to change implementation details without altering the exposed API.
I cannot see how accessibility modifiers would affect performance in any way.
There are mainly two types of methods/properties:
Type 1: those that help whoever consumes them perform a task. (Recommended scope: public)
Type 2: those that help the above methods get their task done. (Recommended scope: private or protected)
Type 1 methods are the only methods that any client code requires; it does not need any other method. This avoids confusion, keeps things simple and prevents client code from doing something wrong.
Type 2 methods are methods into which Type 1 methods are divided. They help Type 1 methods to complete their task and still allow them to be simple, concise, less complex and more readable. They are not really needed for client code but just the class/module itself.
A fair example would be of a car. What you have is a gas pedal, brakes, gearbox, etc. You don't have an interface to minor details for what is under the hood. That is for the mechanic.
In C# programming, it helps to make sure that your API/classes/methods/members are "easy to use correctly and difficult to use incorrectly".
Do default parameters for methods violate Encapsulation?
What was the rationale behind not providing default parameters in C#?
I would take this as the "official" answer from Microsoft. However, default (and named) parameters will most definitely be available in C# 4.0.
No, it doesn't affect encapsulation in any way. It simply is not often necessary. Often, creating an overload which takes fewer arguments is a more flexible and cleaner solution, so C#'s designer simply did not see a reason to add the complexity of default parameters to the language.
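For illustration, a sketch of the overload-chaining style that answer describes, as it looked before C# 4.0 added optional parameters (hypothetical Mailer class):

using System;

static class Mailer
{
    // The shorter overload simply forwards a default value to the full one.
    public static void Send(string to)
    {
        Send(to, "No subject");
    }

    public static void Send(string to, string subject)
    {
        Console.WriteLine("Sending '" + subject + "' to " + to);
    }
}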
Adding "Another way to do the same thing" is always a tradeoff. In some cases it may be convenient. But the more syntax you make legal, the more complex the language becomes to learn, and the more you may wall yourself in, preventing future extension. (Perhaps they'd one day come up with another extension to the language, which uses a similar syntax. Then that'd be impossible to add, because it'd conflict with the feature they added earlier)
As has been noted, default parameters were not a prioritized feature, but they are likely to be added in C# 4.0. However, I believe there were excellent reasons not to include them earlier (in 4.0, as I've understood it, it's mostly to support duck-typing styles of programming, where default parameters increase type compatibility).
I believe excessive parameter lists (certainly more than 4-5 distinct parameters) is a code smell. Default parameters are not evil in themselves, but risk encouraging poor design, delaying the refactoring into more objects.
To your first question - no, it's exactly the same as providing multiple overloaded constructors. As for the second, I couldn't say.
Default parameters will be included in C# 4.0
Some reading material about it:
click
click
It also seems that the author of this post will publish an article in the near future on why MS chose to implement default params in C#.
Here is an answer to why it's not provided in C#:
http://blogs.msdn.com/csharpfaq/archive/2004/03/07/85556.aspx
One drawback of the default parameter implementation in C# 4.0 is that it creates a dependency on the parameter's name. This already existed in VB, which could be one reason why they chose to implement it in 4.0.
Another drawback is that the default value depends on how you cast your object. You can read about it here: http://saftsack.fs.uni-bayreuth.de/~dun3/archives/optional-parameters-conclusion-treat-like-unsafe/216.html.
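A minimal sketch of that cast dependency (hypothetical IGreeter/Greeter types): the default is substituted at the call site from the compile-time type of the reference, not from the runtime type.

using System;

interface IGreeter
{
    void Greet(string name = "interface default");
}

class Greeter : IGreeter
{
    public void Greet(string name = "class default")
    {
        Console.WriteLine(name);
    }
}

class Demo
{
    static void Main()
    {
        var greeter = new Greeter();
        greeter.Greet();             // prints "class default"
        ((IGreeter)greeter).Greet(); // prints "interface default" - same object, different default
    }
}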