I have a function that I would like to overload to take the same data in a different form, that is:
void encode(byte[,,],float)
and
void encode(Bitmap[],float)
I have written one overload of the function, and it is quite long (about 60 lines).
My question is, when writing the second overload, should I copy most of the code of the first overload and make little changes, or should I convert the data and call the first overload?
Never copy code from one method to another; that is a big mistake. For overloading, write the method that takes the most parameters, and have the other overloads call it, passing default values for the parameters they don't take.
If at all possible, avoid large repetitions.
One overload calling another, as you suggest, is often a good approach.
It can also often work well to factor out the commonality into a private method that both overloads call. This private method could be generic if necessary to allow for similar operations on different types.
There are though times when repetition is inevitable, particularly when overloading on the primitive types. Even here though see if you can factor out at least some of the functionality, or consider T4 templates.
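For the case in the question, the delegation approach might look like the sketch below. The BitmapsToBytes helper is hypothetical; the real mapping from Bitmap[] to byte[,,] depends on your pixel format and layout.
using System.Drawing;

class Encoder
{
    // The single real implementation (the existing ~60-line overload).
    public void encode(byte[,,] data, float quality)
    {
        // ... existing implementation ...
    }

    // The second overload converts its input and delegates.
    public void encode(Bitmap[] frames, float quality)
    {
        encode(BitmapsToBytes(frames), quality);
    }

    // Hypothetical conversion helper; shown here reading only the red
    // channel as an example of one possible layout.
    static byte[,,] BitmapsToBytes(Bitmap[] frames)
    {
        var result = new byte[frames.Length, frames[0].Height, frames[0].Width];
        for (int f = 0; f < frames.Length; f++)
            for (int y = 0; y < frames[f].Height; y++)
                for (int x = 0; x < frames[f].Width; x++)
                    result[f, y, x] = frames[f].GetPixel(x, y).R;
        return result;
    }
}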
I have recently found an interesting behavior of C# compiler. Imagine an interface like this:
public interface ILogger
{
void Info(string operation, string details = null);
void Info(string operation, object details = null);
}
Now if we do
logger.Info("Create")
The compiler will complain that it does not know which overload to choose (ambiguous invocation). That seems logical, but when you try this:
logger.Info("Create", null)
It suddenly has no trouble figuring out that null is a string. Moreover, the behavior of overload resolution seems to have changed over time: I found a bug in old code that used to work and stopped working because the compiler decided to use a different overload.
So I am really wondering why C# does not generate the same error in the second case as it does in the first. That would seem logical, but instead it resolves the call to a seemingly arbitrary overload.
P.S. I don't think that it's good to provide such ambiguous interfaces and do not recommend that, but legacy is legacy and has to be maintained :)
A breaking change introduced in C# 6 improved overload resolution. Here is the relevant entry from the list of new features:
Improved overload resolution
There are a number of small improvements to overload resolution, which will likely result in more things just working the way you’d expect them to. The improvements all relate to “betterness” – the way the compiler decides which of two overloads is better for a given argument.
One place where you might notice this (or rather stop noticing a problem!) is when choosing between overloads taking nullable value types. Another is when passing method groups (as opposed to lambdas) to overloads expecting delegates. The details aren’t worth expanding on here – just wanted to let you know!
but instead it resolves the call to a seemingly arbitrary overload.
No, C# doesn't pick overloads randomly; a genuinely ambiguous call produces a compile error. C# picks the better method. Refer to section 7.5.3.2, Better function member, in the C# specification:
7.5.3.2 Better function member
Otherwise, if MP has more specific parameter types than MQ, then MP is better than MQ. Let {R1, R2, …, RN} and {S1, S2, …, SN} represent the uninstantiated and unexpanded parameter types of MP and MQ. MP’s parameter types are more specific than MQ’s if, for each parameter, RX is not less specific than SX, and, for at least one parameter, RX is more specific than SX:
Given that string is more specific than object, and that null is implicitly convertible to string, the mystery is solved.
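A minimal illustration of that rule, reusing the ILogger interface from the question:
public interface ILogger
{
    void Info(string operation, string details = null);
    void Info(string operation, object details = null);
}

class Demo
{
    static void Use(ILogger logger)
    {
        // logger.Info("Create");    // error CS0121: the call is ambiguous
        logger.Info("Create", null); // compiles: Info(string, string) is the better member,
                                     // because string is more specific than object
    }
}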
I have a method that takes 30 parameters. I took the parameters and put them into one class, so that I could just pass one parameter (the class) into the method. In the case of refactoring, is it perfectly fine to pass in an object that encapsulates all the parameters, even if that is all it contains?
That is a great idea. It is typically how data contracts are done in WCF for example.
One advantage of this model is that if you add a new parameter, the consumer of the class doesn't need to change just to add the parameter.
As David Heffernan mentions, it can help self document the code:
FrobRequest frobRequest = new FrobRequest
{
FrobTarget = "Joe",
Url = new Uri("http://example.com"),
Count = 42,
};
FrobResult frobResult = Frob(frobRequest);
While other answers here correctly point out that passing an instance of a class is better than passing 30 parameters, be aware that a large number of parameters may be a symptom of an underlying issue.
E.g., many times static methods grow in their number of parameters, because they should have been instance methods all along, and you are passing a lot of info that could more easily be maintained in an instance of that class.
Alternatively, look for ways to group the parameters into objects of a higher abstraction level. Dumping a bunch of unrelated parameters into a single class is a last resort IMO.
See How many parameters are too many? for some more ideas on this.
It's a good start. But now that you've got that new class, consider turning your code inside-out. Move the method which takes that class as a parameter into your new class (of course, passing an instance of the original class as the parameter). Now you've got a big method, alone in a class, and it will be easier to tease it apart into smaller, more manageable, testable methods. Some of those methods might move back to the original class, but a fair chunk will probably stay in your new class. You've moved beyond Introduce Parameter Object on to Replace Method with Method Object.
Having a method with thirty parameters is a pretty strong sign that the method is too long and too complicated. Too hard to debug, too hard to test. So you should do something about it, and Introduce Parameter Object is a fine place to start.
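A rough sketch of that inside-out move, using the hypothetical FrobRequest from the earlier answer:
public class FrobResult { }

public class FrobRequest
{
    public string FrobTarget { get; set; }
    public int Count { get; set; }

    // Formerly a free-standing Frob(FrobRequest) method; after the
    // refactoring it lives with the data it operates on, where it can
    // be teased apart into smaller private steps.
    public FrobResult Execute()
    {
        Validate();
        return DoFrob();
    }

    void Validate() { /* ... */ }
    FrobResult DoFrob() { /* ... */ return new FrobResult(); }
}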
Whilst refactoring to a Parameter Object isn't in itself a bad idea it shouldn't be used to hide the problem that a class that needs 30 pieces of data provided from elsewhere could still be something of a code smell. The Introduce Parameter Object refactoring should probably be regarded as a step along the way in a broader refactoring process rather than the end of that procedure.
One of the concerns that it doesn't really address is that of Feature Envy. Does the fact that the class being passed the Parameter Object is so interested in the data of another class not indicate that maybe the methods that operate on that data should be moved to where the data resides? It's really better to identify clusters of methods and data that belong together and group them into classes, thereby increasing encapsulation and making your code more flexible.
After several iterations of splitting off behaviour and the data it operates on into separate units you should find that you no longer have any classes with enormous numbers of dependencies which is always a better end result because it'll make your code more supple.
That is an excellent idea and a very common solution to the problem. Methods with more than 2 or 3 parameters get exponentially harder and harder to understand.
Encapsulating all this in a single class makes for much clearer code. Because your properties have names you can write self-documenting code like this:
args.Height = 42;
args.Width = 666;
obj.DoSomething(args);
Naturally, when you have a lot of parameters, the alternative based on positional identification is simply horrid.
Yet another benefit is that extra parameters can be added to the interface contract without forcing changes at all call sites. However, this is not always as trivial as it seems: if different call sites require different values for the new parameter, they are harder to hunt down than with the parameter-based approach, where adding a parameter forces a change at every call site and you can let the compiler do the work of finding them all.
Martin Fowler calls this Introduce Parameter Object in his book Refactoring. With that citation, few would call it a bad idea.
30 parameters is a mess. I think it's way prettier to have a class with the properties. You could even create multiple "parameter classes" for groups of parameters that fit in the same category.
You could also consider using a structure instead of a class.
But what you're trying to do is very common and a great idea!
It can be reasonable to use a Plain Old Data class whether you're refactoring or not. I'm curious as to why you thought it might not be.
Maybe C# 4.0's optional and named parameters would be a good alternative to this?
In any case, the approach you are describing can also be good for abstracting the program's behavior. For example, you could have a standard SaveImage(ImageSaveParameters saveParams) method in an interface, where ImageSaveParameters is itself an interface and can carry additional parameters depending on the image format. For example, JpegSaveParameters has a Quality property while PngSaveParameters has a BitDepth property.
This is how the Save dialog in Paint.NET does it, so it is a very real-life example.
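A sketch of what that design might look like; the type names follow the answer above, but the exact shapes are assumptions:
public interface ImageSaveParameters { }

public class JpegSaveParameters : ImageSaveParameters
{
    public int Quality { get; set; }   // JPEG-specific setting
}

public class PngSaveParameters : ImageSaveParameters
{
    public int BitDepth { get; set; }  // PNG-specific setting
}

public interface IImageSaver
{
    void SaveImage(ImageSaveParameters saveParams);
}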
As stated before, it is the right step to take, but consider the following too:
your method might be too complex (you should consider dividing it into more methods, or even turn it into a separate class)
if you create the class for the parameters, make it immutable
if many of the parameters could be null or could have some default value, you might want to use the builder pattern for your class (a sketch follows below).
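For example, a minimal builder sketch (all names here are hypothetical): defaults live in the builder, and the resulting parameter object is immutable.
public sealed class ReportOptions
{
    public string Title { get; private set; }
    public int PageSize { get; private set; }

    ReportOptions(string title, int pageSize)
    {
        Title = title;
        PageSize = pageSize;
    }

    public sealed class Builder
    {
        string title = "Untitled"; // default value
        int pageSize = 50;         // default value

        public Builder WithTitle(string value) { title = value; return this; }
        public Builder WithPageSize(int value) { pageSize = value; return this; }
        public ReportOptions Build() { return new ReportOptions(title, pageSize); }
    }
}

// Usage:
// ReportOptions options = new ReportOptions.Builder().WithTitle("Sales").Build();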
So many great answers here. I would like to add my two cents.
Parameter object is a good start. But there is more that can be done. Consider the following (Ruby examples):
/1/ Instead of simply grouping all the parameters, see if there can be meaningful grouping of parameters. You might need more than one parameter object.
def display_line(startPoint, endPoint, option1, option2)
might become
def display_line(line, display_options)
/2/ A parameter object may have fewer properties than the original number of parameters.
def double_click?(cursor_location1, control1, cursor_location2, control2)
might become
def double_click?(first_click_info, second_click_info)
# MouseClickInfo being the parameter object type
# having cursor_location and control_at_click as properties
Such uses will help you discover opportunities to add meaningful behavior to these parameter objects. You will find that they shake off their initial Data Class smell sooner than you might expect. :--)
My question is about naming, design, and implementation choices. I can see myself going in two different directions with how to solve an issue and I'm interested to see where others who may have come across similar concerns would handle the issue. It's part aesthetics, part function.
A little background on the code... I created a type called ISlice<T> that provides a reference into a section of a source of items which can be a collection (e.g. array, list) or a string. The core support comes from a few implementation classes that support fast indexing using the Begin and End markers for the slice to get the item from the original source. The purpose is to provide slicing capabilities similar to what the Go language provides while using Python style indexing (i.e. both positive and negative indexes are supported).
To make creating slices (instances of ISlice<T>) easier and more "fluent", I created a set of extension methods. For example:
static public ISlice<T> Slice<T>(this IList<T> source, int begin, int end)
{
return new ListSlice<T>(source, begin, end);
}
static public ISlice<char> Slice(this string source, int begin, int end)
{
return new StringSlice(source, begin, end);
}
There are others, such as providing optional begin/end parameters, but the above will suffice for where I'm going with this.
These routines work well and make it easy to slice up a collection or a string. What I also need is a way to take a slice and create a copy of it as an array, a list, or a string. That's where things get "interesting". Originally I thought I'd need to create ToArray and ToList extension methods, but then remembered that the LINQ variants perform optimizations if your collection implements ICollection<T>. In my case ISlice<T> does inherit from it, though much to my chagrin, as I dislike throwing NotSupportedException from methods like Add. Regardless, I get those for free. Great.
What about converting back into a string, as there's no built-in support for converting an IEnumerable<char> easily back into a string? The closest thing I found is one of the string.Concat overloads, but it would not handle chars as efficiently as it could. Just as important from a design standpoint, it doesn't jump out as a "conversion" routine.
The first thought was to create a ToString extension method, but that doesn't work as ToString is an instance method which means it trumps extension methods and would never be called. I could override ToString, but the behavior would be inconsistent as ListSlice<T> would need to special case its ToString for times where T is a char. I don't like that as the ToString will give something useful when the type parameter is a char, but the class name in other cases. Also, if there are other slice types created in the future I'd have to create a common base class to ensure the same behavior or each class would have to implement this same check. An extension method on the interface would handle that much more elegantly.
The extension method leads me to a naming convention issue. The obvious choice is ToString, but as stated earlier that's not allowed. I could name it something different, but what? ToNewString? NewString? CreateString? Something in the To-family of methods would let it fall in with the ToArray/ToList routines, but ToNewString sticks out as odd when seen in IntelliSense and the code editor. NewString/CreateString are not as discoverable, as you'd have to know to look for them; they don't fit the "conversion method" pattern that the To-family methods provide.
Go with overriding ToString and accept the inconsistent behavior hardcoded into the ListSlice<T> implementation and other implementations? Go with the more flexible, but potentially more poorly named extension method route? Is there a third option I haven't considered?
My gut tells me to go with ToString despite my reservations, though it also occurred to me... Would you even consider ToString giving you a useful output on a collection/enumerable type? Would that violate the principle of least surprise?
Update
Most implementations of slicing operations provide a copy, albeit a subset, of the data from whatever source was used for the slice. This is perfectly acceptable in most use cases and leaves for a clean API as you can simply return the same data type back. If you slice a list, you return a list containing only the items in the range specified in the slice. If you slice a string, you return a string. And so on.
The slicing operations I'm describing above are solving an issue when working with constraints which make this behavior undesirable. For example, if you work with large data sets, the slice operations would lead to unnecessary additional memory allocations not to mention the performance impact of copying the data. This is especially true if the slices will have further processing done on them before getting to your final results. So, the goal of the slice implementation is to have references into larger data sets to avoid making unnecessary copies of the information until it becomes beneficial to do so.
The catch is that at the end of the processing the desire to turn the slice-based processed data back into a more API and .NET friendly type like lists, arrays, and strings. It makes the data easier to pass into other APIs. It also allows you to discard the slices, thus, also the large data set the slices referenced.
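To make the intended workflow concrete, here is a small usage sketch built on the Slice extension methods shown earlier (the collection sizes are arbitrary):
var items = new List<int>(Enumerable.Range(0, 1000000));
ISlice<int> window = items.Slice(1000, 2000); // a reference into items; no copy yet
int[] copy = window.ToArray();                // the copy happens only here

string text = "some very large string";
ISlice<char> part = text.Slice(5, 9);         // again, no copy
string result = new string(part.ToArray());   // materialize as a string at the end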
Would you even consider ToString giving you a useful output on a collection/enumerable type? Would that violate the principle of least surprise?
No, and yes. That would be completely unexpected behavior, since it would behave differently than every other collection type.
As for this:
What about converting back into a string as there's no built-in support for converting an IEnumerable<char> easily back into a string?
Personally, I would just use the string constructor taking an array:
string result = new string(mySlice.ToArray());
This is explicit, understood, and expected - I expect to create a new string by passing an object to a constructor.
Perhaps the reason for your conundrum is the fact that you are treating string as an ICollection<char>. You haven't provided details about the problem you are trying to solve, but maybe that's a wrong assumption.
It's true that a string is an IEnumerable<char>. But as you've noticed, assuming a direct mapping to a collection of chars creates problems. Strings are just too "special" in the framework.
Looking at it from the other end, would it be obvious that the difference between an ISlice<char> and ISlice<byte> is that you can concatenate the former into a string? Would there be a concatenate operation on the latter that makes sense? What about ISlice<string>? Shouldn't I be able to concatenate those as well?
Sorry I'm not providing specific answers but maybe these questions will point you at the right solution for your problem.
Overloaded methods tend to encourage a habit of duplicating code between all the methods of the method group. For example, I may concatenate a string, write it to a file, etc. in one method, and then do the same in another method that takes one additional parameter (creating the overload).
The methods themselves could go in a base class, which would make the concrete class look cleaner, but then the base class has the same problem (the problem has merely moved). The params keyword seems like a solution, but I can imagine that if I really think the idea through (using params rather than individual parameters), some other issue will turn up.
Am I therefore the only one to think that overloads promote code duplication?
Thanks
Usually I'd have the actual implementation in the overload with the most parameters, and have the other overloads call this one passing defaults for the parameters which aren't set.
I certainly wouldn't be duplicating code which writes to a file across different overloads - in fact, that code alone could probably be refactored out into its own properly parameterized private method.
In addition to the options above, the upcoming version of C# adds default parameter values, which are basically just syntactic sugar for what Winston suggested.
public void WriteToFile(string file, bool overwrite = false)
{
    // implementation
}
A common pattern that more or less eliminates this problem is to have one base implementation, and have each overload call the base implementation, like so:
void WriteToFile(string fileName, bool overwrite) {
    // implementation
}

void WriteToFile(string fileName) {
    WriteToFile(fileName, false);
}
This way there is only one implementation.
If all of the overloads share the same code and just handle it slightly differently, perhaps you should factor out another function that each overload calls, or, if one of them is a base, generic version, have each of the other overloads call the generic one.
Is there a cost associated with overloading methods in .Net?
So if I have 3 methods like:
Calculate (int)
Calculate (float)
Calculate (double)
and these methods are called at runtime "dynamically" based on what's passed to the Calculate method, what would be the cost of this overload resolution?
Alternatively I could have a single Calculate and make the difference in the method body, but I thought that would require the method to evaluate the type every time it's called.
Are there better ways/designs to solve this with maybe no overhead? Or better yet, what's the best practice for handling cases like these? I want to have the same class/method name, but different behaviour.
EDIT: Thanks all. Just one thing, in case it makes a difference. Say you have a DLL containing these methods and a C# program that allows the user to add them as UI items (without specifying the type). So the user adds the UI item Calculate(5), then Calculate(12.5), etc., and the C# app executes them. Would there still be no overhead?
As far as the runtime is concerned, these are different methods. It is the same as writing:
CalculateInt(int)
CalculateFloat(float)
Regarding performance, except in very special cases, you can safely ignore the method call overhead.
This question is about method overloading, not polymorphism. As far as I know, there isn't a penalty for method overloading, since the compiler will figure out which method to call at compile time based on the type of the argument being passed to it.
Polymorphism only comes into play where you are using a derived type as a substitute for a base class.
First, unless your profiler is telling you there's a performance problem, you shouldn't sacrifice good design for assumed performance gains.
To answer your question, the resolution of method calls isn't dynamic. It's determined at compile time, so there's no cost in that respect. The only cost would be in the potential implicit casting of a type to fit the parameter type.
The method overload (i.e. the method table slot) is not resolved dynamically at runtime; it is resolved statically at compile time, so there is zero cost in having overloaded methods.
I assume you're thinking of virtual methods, where the actual implementation may be overridden by a derived type and the method is picked based on the concrete type. But that has nothing to do with overloading, as all overrides of a virtual method occupy the same method table slot.
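To make the distinction concrete, a small self-contained illustration:
class Calculator
{
    public void Calculate(int value) { }
    public void Calculate(double value) { }
}

class Shape { public virtual void Draw() { } }
class Circle : Shape { public override void Draw() { } }

class Demo
{
    static void Main()
    {
        var calc = new Calculator();
        calc.Calculate(42);   // overload chosen at compile time: Calculate(int)
        calc.Calculate(4.2);  // overload chosen at compile time: Calculate(double)

        Shape shape = new Circle();
        shape.Draw();         // override chosen at runtime: Circle.Draw
    }
}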
There is no "cost" at run time because the compiler determines which method to call at compile time. The IL that is generated specifically calls the method that takes the appropriate parameters.
Take this, for example:
public class Calculator
{
public void Calculate(int value)
{
//Do Something
}
public void Calculate(decimal value)
{
//Do Something
}
public void Calculate(double value)
{
//Do Something
}
}
static void Main(string[] args)
{
int i = 0;
Calculator calculator = new Calculator();
calculator.Calculate(i);
}
The following call is made in IL to calculate the variable "i":
L_000b: callvirt instance void ConsoleApplication1.Calculator::Calculate(int32)
Notice that it specifies the overload taking an int32, which is the same type as the variable passed in from the Main method.
So if there is a cost at all it's only at compile time. No worries.
As JaredPar noted below:
There is a misconception in your question. Overloaded methods in C# are not called dynamically at runtime. All method calls are bound statically at compile time. Hence there is no "searching" for a method at runtime; it's predetermined at compile time which overload will be called.
3 different methods are produced in the IL based on the type. The only cost will be if you are casting from one type to another, but that cost won't be significant unless you do a large number of them. So you could stick to Calculate(double) and cast from there.
There is a misconception in your question. Overloaded methods in C# are not called dynamically at runtime. All method calls are bound statically at compile time. Hence there is no "searching" for a method at runtime, it's predetermined at compile time which overload will be called.
Note: This changes a bit with C# 4.0 and dynamic. With a dynamic object it is possible that an overload will be chosen at runtime based on the type of the object. This is not the case with C# 3.0 and below though.
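A small sketch of that C# 4.0 behavior: when an argument is dynamic, the overload is picked at runtime from the value's actual type.
class Calculator
{
    public string Calculate(int value) { return "int"; }
    public string Calculate(double value) { return "double"; }
}

class Demo
{
    static void Main()
    {
        var calc = new Calculator();

        dynamic x = 5;
        System.Console.WriteLine(calc.Calculate(x)); // prints "int"

        dynamic y = 12.5;
        System.Console.WriteLine(calc.Calculate(y)); // prints "double"
    }
}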