How is validating a parameter "high up" on the callstack done? - c#

I've been reading the Framework Design Guidelines book, a book on designing frameworks in .NET, with excerpts from the framework designers on the decisions they made regarding each section (E.g. parameter design, exception handling, etc).
One of the tips, under parameter design, is to validate parameters as "high up on the callstack" as possible. This is because the work here is not as expensive as it is low on the callstack, so a performance penalty is not as costly when validating high up in the callstack.
Does this mean that when I pass parameters into a method or constructor, I validate them before doing anything else, or do I do so just before using the parameters (So there could be 100 lines of code between the parameter in the definition and the usage of the parameter)?
Thanks

Prefer to validate in the public API of an assembly. That means the public methods of the public classes.
Prefer to validate in the public methods of your classes. So if your class requires a non-null pointer to another object to work correctly, you could enforce this by requiring it as a constructor parameter and throwing an exception when a null pointer is supplied. From that point forward none of the member methods need to test if the pointer is non-null.
The idea is that no user can break your class (or assembly) by feeding invalid data. Of course the code won't work either way, but if you fail in a controlled way, it's more clear to the calling code what is wrong, and you won't have unpleasant side effects like resource leaks (or worse).
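As a minimal sketch of the constructor-validation idea above (the `Logger` and `OrderProcessor` types are hypothetical, made up for illustration), the public entry points validate and the private helper relies on them:

```csharp
using System;

// Hypothetical types, used only to illustrate the answer above.
public sealed class Logger
{
    public void Write(string message) => Console.WriteLine(message);
}

public sealed class OrderProcessor
{
    private readonly Logger _logger;

    public OrderProcessor(Logger logger)
    {
        // Validate once, at the public entry point.
        _logger = logger ?? throw new ArgumentNullException(nameof(logger));
    }

    public void Process(string orderId)
    {
        // Public method: validate its own argument before any work is done.
        if (string.IsNullOrWhiteSpace(orderId))
            throw new ArgumentException("Order id must not be empty.", nameof(orderId));

        WriteAudit(orderId);
    }

    // Private helper: relies on the checks above, so it does not re-validate.
    private void WriteAudit(string orderId) => _logger.Write($"Processed {orderId}");
}
```

Because `_logger` is readonly and checked in the constructor, no member method needs to test it for null again.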

Failing fast is generally a good practice. All arguments passed to a method should be validated as soon as possible, before any unnecessary calculations are performed, because that eases debugging and allows for easier recovery from the faulty situation.
With respect to input validation, I consider performance a minor concern.

I haven't read the specific guidelines you mention, but I expect they're talking about the case where method A calls method B, which calls method C and a parameter value gets passed through all three calls. It's better to validate that parameter at the start of method A than somewhere in the middle of method C because if it's invalid, then you get to skip all of the stuff that happens in A and B and the start of C. This is especially true if B or C are called inside loops because then the low-level validation would occur many times instead of just once at the start of A.
Of course you have to balance that with how complicated the validation of the parameter is. It may just be way easier to understand if you validate it in the same place you use it.
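The A, B, C scenario might look roughly like the sketch below. The `ReportBuilder` class and its method names are invented for illustration: `Build` plays the role of A, `FormatLine` of B, and `Decorate` of C.

```csharp
using System;
using System.Collections.Generic;

public static class ReportBuilder
{
    // Method A: the public entry point validates each parameter once.
    public static List<string> Build(IReadOnlyList<int> values, string prefix)
    {
        if (values == null) throw new ArgumentNullException(nameof(values));
        if (string.IsNullOrEmpty(prefix))
            throw new ArgumentException("Prefix must not be empty.", nameof(prefix));

        var lines = new List<string>(values.Count);
        foreach (int value in values)
            lines.Add(FormatLine(value, prefix)); // B runs once per element
        return lines;
    }

    // Method B: private, so it trusts the check already done in A.
    private static string FormatLine(int value, string prefix) => Decorate(prefix) + value;

    // Method C: checking 'prefix' here would repeat the test on every loop iteration.
    private static string Decorate(string prefix) => "[" + prefix + "] ";
}
```

An invalid `prefix` is rejected before the loop starts, so none of the work in B or C is done for a call that was going to fail anyway.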

Validate them as early as you can in your method!

What I believe this to mean is that you should validate data that could be invalid as soon as you receive it. Once it has been validated then no more checks are needed. If you wait until the bottom of the call stack then you may have to validate many times because your call tree may have many branches.
I would whole-heartedly agree with this advice, but not on the grounds of performance. By validating at the point of entry you are in a much better position to give a meaningful error message to the client who supplied the data. And by reducing the amount of validation that you do, you will end up with much clearer code.

Related

Understanding the point of Delegates in C#

I have done a fair bit of reading and I am at the stage where I am beginning to grasp what they do, but am at a loss when it comes to why or where I would use them. In each example I've seen the recurring definition seems to be as a method pointer, and you can use this in place of a call to the method which is apparently useful when the developer doesn't know which method to call or the selection of a method is based on a condition or state.
This is where I struggle a bit, why can't I just have an if statement or a switch statement and then call the method directly based on the outcome? What's so bad about calling a method directly from an object instance? From my understanding a Delegate offers a better way to do this but I can't understand what's better about it, from my perspective it's just a round-about way to achieve the same thing an if statement could do when deciding which method to call.
I'm at a loss and have been rambling on for quite a bit now, any help at all on the matter would be greatly appreciated!
"why can't I just have an if statement or a switch statement and then call the method directly based on the outcome?"
This would be fine, if you had 2 or 3 different branches and methods. Now imagine having tens or hundreds of methods which can be potentially called depending on a situation. I wouldn't want to be the one to write that if statement.
Imagine having 100 different potential abilities for a character in a game. Each ability can have its own method to perform. Depending on which abilities a player has, you can just put those methods into a list for that character using delegates. Now it's fully customizable: the player's abilities aren't hard-coded, they can be picked up or lost during the course of the game very easily, and there can be thousands of abilities, not to mention the number of potential combinations.
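A rough sketch of the abilities idea follows. All names are hypothetical, and it uses a dictionary keyed by ability name rather than a plain list so that abilities can be looked up and removed by name; `Func<string, string>` is the delegate type here.

```csharp
using System;
using System.Collections.Generic;

public sealed class Character
{
    // Each ability is a delegate: any method that takes a target name
    // and returns a description of what happened.
    private readonly Dictionary<string, Func<string, string>> _abilities =
        new Dictionary<string, Func<string, string>>();

    public void Learn(string name, Func<string, string> ability) => _abilities[name] = ability;

    public bool Forget(string name) => _abilities.Remove(name);

    public string Use(string name, string target)
    {
        // No if/switch over every possible ability: just look up and invoke.
        return _abilities.TryGetValue(name, out var ability)
            ? ability(target)
            : $"{name} is not known";
    }
}

public static class Abilities
{
    public static string Fireball(string target) => $"Fireball hits {target}";
    public static string Heal(string target) => $"{target} is healed";
}
```

With this in place, `hero.Learn("fireball", Abilities.Fireball)` adds an ability at run time and `hero.Forget("fireball")` removes it, without touching the `Character` class.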
Think about it this way. According to the SOLID principles of object-oriented design, every object should be responsible for a single part of the functionality. Following this principle, we can say that:
Classes are responsible for working with custom objects, structs for sets of data, methods for actions, events for signaling that something has happened, and delegates for the corresponding action that should take place in response to those events.
Events and methods are 'busy' with their own single part of the functionality, and therefore should not also be responsible for connecting an event to the method that handles it. That's why we need delegates...

Is it better to pass references down a chain or to use public static variables

Say we have a Game class.
The game class needs to pass down a reference to its spritebatch. That is, the class calls a method passing it, and that method in turn passes it to other methods, until it is finally used.
Is this bad for performance? Is it better to just use statics?
I see one obvious disadvantage of statics though, being unable to make duplicate functionality in the same application.
It is not easy to answer your question, as you have not stated the specific requirement, but I can give you some general advice.
Always consider encapsulation: do not expose properties if they are not used elsewhere.
Performance: passing a reference type carries no meaningful penalty, because only the reference is copied. If your type is a value type, there will be a very small penalty because the value itself is copied.
So there is a trade-off between design and performance. Unless your method is called millions of times, you never have to think about a public static property.
There are cons and pros like in everything.
Whether this is good or bad from a performance point of view depends on how computationally intensive that code is and how often it is used inside your game.
So here are my considerations on the subject.
Passing as a parameter:
Cons: one more variable is pushed onto the stack for each function call. This is very fast, but its cost depends on how the code in question is used, and avoiding it can bring some benefit, which is why I list it under cons.
Pros: you explicitly state that the function at the top of the call stack needs that parameter for reading and/or writing, so anyone looking at the code can easily see the semantic dependencies of your calls.
Using a static:
Cons: there is no clear evidence (other than direct knowledge or well-written documentation) of which values would or could affect the computation inside those functions.
Pros: you don't pass it on the stack for every function in the chain.
I would personally recommend passing it as a parameter, because this clearly shows what the calling code depends on, and even if there were some measurable performance drawback, it would most probably not be relevant in your case. But again, as Rico Mariani always suggests: measure, measure, measure...
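To make the parameter-passing recommendation concrete, here is a small sketch. The `SpriteBatch` class below is a hypothetical stand-in that only records what was drawn, not the real class from the question, and `Level` and `Player` are invented names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the sprite batch from the question.
public sealed class SpriteBatch
{
    public List<string> Drawn { get; } = new List<string>();
    public void Draw(string sprite) => Drawn.Add(sprite);
}

public sealed class Player
{
    // The signature states the dependency explicitly.
    public void Draw(SpriteBatch batch) => batch.Draw("player");
}

public sealed class Level
{
    private readonly Player _player = new Player();

    // Passing 'batch' copies only a reference, not the SpriteBatch object itself.
    public void Draw(SpriteBatch batch)
    {
        batch.Draw("background");
        _player.Draw(batch);
    }
}
```

Because nothing is static, the same `Level` can draw into two different batches, which addresses the "duplicate functionality" concern raised in the question.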
Statics are usually not the best way, because if you later want to make multiple instances you might be in trouble.
Of course, passing references costs a little performance, but how much it matters depends on how many instances you create. Unless you are creating millions of objects in a very short time, it is unlikely to be an issue.

Action on each method's return value

What I'd like to do is take some action using the value returned by every method in a class.
So for instance, if I have a class Order which has a method
    public Customer GetCustomer()
    {
        Customer CustomerInstance = // get customer
        return CustomerInstance;
    }
Let's say I want to log the creation of these - Log(CustomerInstance);
My options (AFAIK) are:
1. Call Log() in each of these methods before returning the object. I'm not a fan of this because it gets unwieldy if used on a lot of classes with a lot of methods. It is also not an intrinsic part of the method's purpose.
2. Use composition or inheritance to layer the log call on the Order class, similar to:
    public Customer GetCustomer()
    {
        Customer CustomerInstance = this.originalCustomer.GetCustomer();
        Log(CustomerInstance);
        return CustomerInstance;
    }
   I don't think this buys me anything over #1.
3. Create extension methods on each of the returned types:
    Customer CustomerInstance = Order.GetCustomer().Log();
   which has just as many downsides.
I'm looking to do this for every (or almost every) object returned, automatically if possible, without having to write double the amount of code. I feel like I'm either trying to bend the language into doing something it's not supposed to, or failing to recognize some language feature that would enable this. Possible solutions would be greatly appreciated.
You need to look into Aspect Oriented Programming:
Typically, an aspect is scattered or tangled as code, making it harder to understand and maintain. It is scattered by virtue of the function (such as logging) being spread over a number of unrelated functions that might use its function, possibly in entirely unrelated systems, different source languages, etc. That means to change logging can require modifying all affected modules. Aspects become tangled not only with the mainline function of the systems in which they are expressed but also with each other. That means changing one concern entails understanding all the tangled concerns or having some means by which the effect of changes can be inferred.
Adding logging is one of the uses of this methodology.
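One way to approximate this kind of interception in plain C# is a dynamic proxy. The sketch below is not the Policy Injection Application Block or a full AOP framework. It assumes a runtime that provides `System.Reflection.DispatchProxy` (.NET Core and later), that the class is accessed through an interface, and it uses a hypothetical `IOrder` that returns a `string` instead of a `Customer` to stay short.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical interface and class standing in for Order from the question.
public interface IOrder
{
    string GetCustomer();
}

public class Order : IOrder
{
    public string GetCustomer() => "Alice";
}

// Intercepts every call made through the interface and logs the return value.
public class LoggingProxy<T> : DispatchProxy where T : class
{
    private T _target;
    private Action<string> _log;

    public static T Wrap(T target, Action<string> log)
    {
        // Create returns an object that implements T and derives from LoggingProxy<T>.
        T proxy = Create<T, LoggingProxy<T>>();
        var self = (LoggingProxy<T>)(object)proxy;
        self._target = target;
        self._log = log;
        return proxy;
    }

    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        // Forward the call to the real object, then log whatever it returned.
        object result = targetMethod.Invoke(_target, args);
        _log($"{targetMethod.Name} returned {result}");
        return result;
    }
}
```

Calling `LoggingProxy<IOrder>.Wrap(new Order(), Console.WriteLine)` yields an `IOrder` whose every method call is logged without any change to `Order` itself. The trade-offs are reflection overhead on each call and the need for an interface.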
You should check out Microsoft's Enterprise Library.
You may find the Policy Injection Application Block useful.
Your option 1 is, in my opinion, the way to do it. Even if this will be at the end of each method, that's what is done. I would not add extra layers of obscurity because it's 'not an intrinsic purpose' of a method.
By the way, Aspect Oriented Programming addresses exactly this issue that you have (see ChrisF's answer), but then we're not talking C# anymore.

C# / Object oriented design - maintaining valid object state

When designing a class, should logic to maintain valid state be incorporated in the class or outside of it ? That is, should properties throw exceptions on invalid states (i.e. value out of range, etc.), or should this validation be performed when the instance of the class is being constructed/modified ?
It belongs in the class. Nothing but the class itself (and any helpers it delegates to) should know, or be concerned with, the rules that determine valid or invalid state.
Yes, properties should check for valid/invalid values when being set. That's what they're for.
It should be impossible to put a class into an invalid state, regardless of the code outside it. That should make it clear.
On the other hand, the code outside it is still responsible for using the class correctly, so frequently it will make sense to check twice. The class's methods may throw an ArgumentException if passed something they don't like, and the calling code should ensure that this doesn't happen by having the right logic in place to validate input, etc.
There are also more complex cases where there are different "levels" of client involved in a system. An example is an OS - an application runs in "User mode" and ought to be incapable of putting the OS into an invalid state. But a driver runs in "Kernel mode" and is perfectly capable of corrupting the OS state, because it is part of a team that is responsible for implementing the services used by the applications.
This kind of dual-level arrangement can occur in object models; there can be "exterior" clients of the model that only see valid states, and "interior" clients (plug-ins, extensions, add-ons) which have to be able to see what would otherwise be regarded as "invalid" states, because they have a role to play in implementing state transitions. The definition of invalid/valid is different depending on the role being played by the client.
Generally this belongs in the class itself, but to some extent it has to also depend on your definition of 'valid'. For example, consider the System.IO.FileInfo class. Is it valid if it refers to file that no longer exists? How would it know?
I would agree with @Joel. Typically this would be found in the class. However, I would not have the property accessors implement the validation logic. Rather I'd recommend a validation method for the persistence layer to call when the object is being persisted. This allows you to localize the validation logic in a single place and make different choices for valid/invalid based on the persistence operation being performed. If, for example, you are planning to delete an object from the database, do you care that some of its properties are invalid? Probably not -- as long as the ID and row versions are the same as those in the database, you just go ahead and delete it. Likewise, you may have different rules for inserts and updates, e.g., some fields may be null on insert, but required on update.
It depends.
If the validation is simple, and can be checked using only information contained in the class, then most of the time it's worth while to add the state checks to the class.
There are times, however, when it's not really possible or desirable to do so.
A great example is a compiler. Checking the state of abstract syntax trees (ASTs) to make sure a program is valid is usually not done by either property setters or constructors. Instead, the validation is usually done by a tree visitor, or a series of mutually recursive methods in some sort of "semantic analysis class". In either case, however, properties are validated long after their values are set.
Also, with objects used to hold UI state it's usually a bad idea (from a usability perspective) to throw exceptions when invalid values are set. This is particularly true for apps that use WPF data binding. In that case you want to display some sort of modeless feedback to the customer rather than throwing an exception.
The class really should maintain valid values. It shouldn't matter if these are entered through the constructor or through properties. Both should reject invalid values. If both a constructor parameter and a property require the same validation, you can either use a common private method to validate the value for both the property and the constructor or you can do the validation in the property and use the property inside your constructor when setting the local variables. I would recommend using a common validation method, personally.
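A minimal sketch of the common-validation-method approach follows. The `Thermostat` class and its 5 to 35 degree range are arbitrary examples, not taken from the question.

```csharp
using System;

public sealed class Thermostat
{
    private int _targetCelsius;

    public Thermostat(int targetCelsius)
    {
        // The constructor and the setter share the same rule.
        _targetCelsius = Validate(targetCelsius, nameof(targetCelsius));
    }

    public int TargetCelsius
    {
        get => _targetCelsius;
        set => _targetCelsius = Validate(value, nameof(value));
    }

    // The single place that knows the rule; the range is an arbitrary example.
    private static int Validate(int celsius, string paramName)
    {
        if (celsius < 5 || celsius > 35)
            throw new ArgumentOutOfRangeException(
                paramName, celsius, "Target must be between 5 and 35.");
        return celsius;
    }
}
```

Both entry points reject the same values, so an instance can never hold an out-of-range target, and the rule only has to be changed in one place.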
Your class should throw an exception if it receives invalid values. All in all, good design can help reduce the chances of this happening.
The valid state of a class is best expressed with the concept of a class invariant. It is a boolean expression which must hold true for the objects of that class to be valid.
The Design by Contract approach suggests that you, as a developer of class C, should guarantee that the class invariant holds:
After construction
After a call to a public method
This will imply that, since the object is encapsulated (no one can modify it except via calls to public methods), the invariant will also be satisfied on entering any public method, or on entering the destructor (in languages with destructors), if any.
Each public method states preconditions that the caller must satisfy, and postconditions that will be satisfied by the class at the end of every public method. Violating a precondition effectively violates the contract of the class, so that it can still be correct but it doesn't have to behave in any particular way, nor maintain the invariant, if it is called with a precondition violation. A class that fulfills its contract in the absence of caller violations can be said to be correct.
A concept different from correct but complementary to it (and certainly belonging to the multiple factors of software quality) is that of robust. In our context, a robust class will detect when one of its methods is called without fulfilling the method preconditions. In such cases, an assertion violation exception will typically be thrown, so that the caller knows that he blew it.
So, answering your question, both the class and its caller have obligations as part of the class contract. A robust class will detect contract violations and spit. A correct caller will not violate the contract.
Classes belonging to the public interface of a code library should be compiled as robust, while inner classes could be tested as robust but then run in the released product as just correct, without the precondition checks on. This depends on a number of things and was discussed elsewhere.
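The contract ideas above could be sketched in C# roughly as follows. The `Account` class is a hypothetical example; the precondition checks always run (the "robust" part), while the invariant check is compiled in only when the `DEBUG` symbol is defined.

```csharp
using System;
using System.Diagnostics;

public sealed class Account
{
    public decimal Balance { get; private set; }

    public Account(decimal openingBalance)
    {
        if (openingBalance < 0)
            throw new ArgumentOutOfRangeException(nameof(openingBalance));
        Balance = openingBalance;
        CheckInvariant(); // the invariant must hold after construction
    }

    public void Withdraw(decimal amount)
    {
        // Preconditions: the caller's obligation, detected by a robust class.
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount));
        if (amount > Balance)
            throw new InvalidOperationException("Insufficient funds.");

        Balance -= amount;
        CheckInvariant(); // and after every public method
    }

    // Invariant: Balance >= 0. Calls to this method are removed by the
    // compiler unless DEBUG is defined, so release builds skip the check.
    [Conditional("DEBUG")]
    private void CheckInvariant() =>
        Debug.Assert(Balance >= 0, "Invariant violated: negative balance");
}
```

A correct caller checks `Balance` before calling `Withdraw`; a caller that does not gets an exception instead of an object in an invalid state.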

What guidelines are appropriate for determining when to implement a class member as a property versus a method?

The .NET coding standards PDF from SubMain that has started showing up in the "Sponsored By" area seems to indicate that properties are only appropriate for logical data members (see pages 34-35 of the document). Methods are deemed appropriate in the following cases:
The operation is a conversion, such as Object.ToString().
The operation is expensive enough that you want to communicate to the user that they should consider caching the result.
Obtaining a property value using the get accessor would have an observable side effect.
Calling the member twice in succession produces different results.
The order of execution is important.
The member is static but returns a value that can be changed.
The member returns an array.
Do most developers agree on the properties vs. methods argument above? If so, why? If not, why not?
They seem sound, and basically in line with MSDN member design guidelines:
http://msdn.microsoft.com/en-us/library/ms229059.aspx
One point that people sometimes seem to forget (*) is that callers should be able to set properties in any order. Particularly important for classes that support designers, as you can't be sure of the order generated code will set properties.
(*) I remember early versions of the Ajax Control Toolkit on Codeplex had numerous bugs due to developers forgetting this one.
As for "Calling the member twice in succession produces different results", every rule has an exception, as the property DateTime.Now illustrates.
Those are interesting guidelines, and I agree with them. It's interesting in that they are setting the rules based on "everything is a property except the following". That said, they are good guidelines for avoiding problems by defining something as a property that can cause issues later.
At the end of the day a property is just a structured method, so the rule of thumb I use is based on Object Orientation -- if the member represents data owned by the entity, it should be defined as a property; if it represents behavior of the entity it should be implemented as a method.
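As a small illustration of the data-versus-behavior rule of thumb (the `Team` class is a made-up example), cheap data owned by the object is exposed as properties, while state changes and array-returning members are methods:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Team
{
    private readonly List<string> _members = new List<string>();

    // Data owned by the entity, cheap to read and free of side effects: properties.
    public string Name { get; set; } = "";
    public int MemberCount => _members.Count;

    // Behavior that changes state: a method.
    public void AddMember(string member) => _members.Add(member);

    // Returns an array, built and sorted afresh on every call: a method.
    public string[] GetMembersSorted() =>
        _members.OrderBy(m => m, StringComparer.Ordinal).ToArray();
}
```

A caller can read `MemberCount` in a loop without worrying about cost, while the `Get` prefix on `GetMembersSorted` hints that the result is worth caching.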
Fully agreed.
According to the coding guidelines properties are "nouns" and methods are "verbs". Keep in mind that a user may call the property very often while thinking it would be a "cheap" operation.
On the other hand, it's usually expected that a method may "take more time", so a user will consider caching method results.
What's so interesting about those guidelines is that they are clearly an argument for having extension properties as well as extension methods. Shame.
I never personally came to the conclusion or had the gut feeling that properties are fast, but the guidelines say they should be, so I just accept it.
I always struggle with what to name my slow "get" methods while avoiding FxCop warnings. GetPeopleList() sounds good to me, but then FxCop tells me it might be better as a property.
