Under certain circumstances, I get an NRE when closing a form using its Close button, which simply calls the native (WinForms) Close() method.
Certain paths through the code work fine, but one particular path causes the Null Reference Exception. Since this exception nominally means that something null is being referenced, how could that happen when the form is simply being closed? I could imagine a memory leak, perhaps, but I don't understand what null thing could be getting referenced.
What potential causes in the code might there be?
UPDATE
Answer to Jon Skeet:
I'm unable to debug this in the normal way for reasons boring and laborious to relate (again), but what I can do is this:
catch (Exception ex)
{
    MessageBox.Show(ex.Message);
    MessageBox.Show(ex.InnerException.ToString());
    SSCS.ExceptionHandler(ex, "frmEntry.saveDSD");
}
The last is an internal/custom exception processing method.
All I get from these lines is:
"Null Reference Exception"
Nothing (empty string)
"Exception: NullReferenceException Location: frmEntry.btnSave.click
Note that the last exception shown now implicates btnSave.click as the culprit, whereas it formerly pointed the finger at saveDSD. Curiouser and curiouser. Is this a case of the Hawthorne Effect* in action? (Well, a modified Hawthorne Effect, in the sense that adding this debugging code might be changing things.)
The canonical Hawthorne Effect being more like this sort of scenario: A bunch of cats are playing basketball. Some girls walk up and watch. The cats start showing off and hotdogging, and one breaks his leg. Is it the girls' fault? No. Would it have happened had they not been watching? No.
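One side note on the debugging code itself: if ex.InnerException happens to be null, the second MessageBox.Show() call will itself throw a fresh NRE from inside the catch block. A null-safe variant (a sketch; SSCS is the question's own helper):

catch (Exception ex)
{
    MessageBox.Show(ex.Message);
    // Guard: InnerException can be null, and calling ToString() on it
    // would throw a second NRE from inside the catch block.
    MessageBox.Show(ex.InnerException == null
        ? "(no inner exception)"
        : ex.InnerException.ToString());
    SSCS.ExceptionHandler(ex, "frmEntry.saveDSD");
}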
UPDATE 2
Talk about the Hawthorne Effect: When I copied the two MessageBox.Show() calls to the catch block of frmEntry.btnSave.click, I then got:
"Null Reference Exception"
"Null Reference Exception"
"Exception: NullReferenceException Location: frmEntry.Closing"
IOW, the location of the NRE keeps moving about, like a squirrel on a country road when [h,sh]e meets two cars going opposite directions.
UPDATE 3
And it happened once more: adding those MessageBox.Show()s to the form's Closing event causes the NRE to pop up out of a different hole and declare itself. This inbred code is begetting (or cloning) half-wits by the score, or so it seems.
UPDATE 4
If writing bad code were a crime, the cat who wrote this stuff should be in solitary confinement in a SuperMax.
Here's the latest egregious headscratcher/howler:
int cancelled = ListRecords.cancelled;
if (cancelled == 0)
. . .
The public "cancelled" member of ListRecords (in another class) is only ever assigned values of 0 and 1. Thus, not only does cancelled sound like a bool, it acts like a bool, too. Why was it not declared as a bool?!?!?
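The trivial fix, sketched (assuming the member really is static and only ever holds 0 and 1; the rename is mine):

// In ListRecords, declare the flag as what it is (name is illustrative):
public static bool Cancelled;

// The call site then reads naturally; the original "cancelled == 0" meant "not cancelled":
if (!ListRecords.Cancelled)
{
    // ...
}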
The official image of this codebase should be Edvard Munch's "The Scream"
UPDATE 5
Maybe I'm venting, but recent experiences have caused me to come up with a new name for certain types of code, and an illustration for many projects. Code that doesn't go anywhere but simply takes up space (assignments made but then never acted on, or empty methods/handlers called) I now call "Winchester Mystery House code."
And for the typical project (I've been a "captured" employee as well as a contractor on many projects now, having almost 20 years experience in programming), I now liken the situation to a group of people wallowing in quicksand. When they hire new people to "come aboard," they really want them to jump into the quicksand with them. How is this going to help matters? Is it just a case of "misery loves company"? What they should be doing is saying "Throw us a rope!" not "Come on in, the quicksand is fine!"
Many times it's probably just entropy and "job security" at play; if they keep wrestling with the monstrosity they've chosen or created, their hard-won knowledge in how to keep the beast more-or-less at bay will keep them in their semi-comfy job. If they make any radical changes (most teams do need to make some radical improvements, IMO), this may endanger their employment status. Maybe the answer is to employ/contract "transition teams" to move teams/companies from prehistoric dinosaur-clubbing times into the 21st Century, training the "old guard" so that they can preserve their jobs. After all, there aren't enough of these "saviors" to steal the old hands' jobs anyway - train them up, get them up to speed, and move on. All would benefit. But many seemingly prefer to continue wallowing in the mire, or quicksand.
Before anybody hires a programmer, they should give them at the minimum a test wherein passing the test would prove that they are at least familiar with the basic tenets delineated in Steve McConnell's "Code Complete."
Okay, back to battling the quicksand...
UPDATE 6
Or: "Refactoring Spo-dee-o-dee":
A "gotcha" with refactoring is if you change the name of a variable such as this:
ChangeListType chgLst; // ChangeListType is an enum
...to this:
ChangeListType changeListType;
...and if there is a data table with a column named "chgLst", SQL statements can also get changed, which can ruin, if not your day, at least part of it.
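For example (a hypothetical illustration; the table name is invented), with "search in strings" enabled during the rename:

ChangeListType chgLst;   // ChangeListType is an enum

// Before the rename, the string literal matches the real column name:
string sql = "SELECT * FROM Changes WHERE chgLst = @status";

// A rename of chgLst -> changeListType that also touches string literals
// turns this into "... WHERE changeListType = @status", which fails at
// runtime because the database column is still named chgLst.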
Another potential gotcha is when you get "[var name] is only assigned but its value is never used" hints. You may think you can just cavalierly 86 all of these, but beware: make sure any related code you comment out along with the assignments to these dead vars isn't producing relied-upon side effects. Admittedly, that would be a bad way to code (side-effectful methods storing their return values in vars that are then ignored), but ... it happens, especially when the original coder was a propeller-head mad-scientist cowboy. A hypothetical sketch of the trap follows.
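(SaveToDatabase and entry below are invented names, purely for illustration:)

// The hint flags 'rowCount' as assigned-but-never-used ...
int rowCount = SaveToDatabase(entry);   // ... but SaveToDatabase also writes the record!

// Deleting the whole line silently drops the save. The safe cleanup keeps
// the call and removes only the dead variable:
SaveToDatabase(entry);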
This code is so chock full of anti-patterns that I wouldn't be surprised if Butterick and/or Simplicity put out a contract on this guy.
On second thought, aside from the architecture, design, coding, and formatting of this project, it's not really all that bad...
Mary Shelley's "Frankenstein" was very prescient indeed, but not in the way most people have thought. Rather than a general foreboding about technology run amok, it is much more specific than that: it is a foregleam of most modern software projects, where "pieces parts" from here and there are jammed together willy-nilly with little regard to whether "the hip bone's connected to the thigh bone" (or should be) and whether those parts match up or will reject one another; the left hand doesn't know what the right is doing, and devil take the hindmost! Etc.
That can happen under any of these situations:
object obj is null and your code has obj.ToString()
string item is null and your code has item.Trim()
Settings mySettings is null and your code has mySettings.Path
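For illustration, each of these produces the same exception (a minimal sketch using the names from the list above; Settings is the hypothetical type mentioned):

object obj = null;
string s = obj.ToString();       // NullReferenceException

string item = null;
item = item.Trim();              // NullReferenceException

Settings mySettings = null;
string path = mySettings.Path;   // NullReferenceException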
Less common causes would be a running thread that posts information, or a serial port that tries to receive data after its SerialDataReceivedEventHandler has been removed.
Look at what your suspect form's code and the parent form's code are doing about the time this suspect form is closed and what other processes might be running on the suspect form.
Related
A favorable outcome would be to prevent this exception entirely, or failing that, at least to handle it gracefully.
I am getting an exception thrown within Microsoft code. On top of that, the method throwing the exception is System.Windows.Input.Manipulations.ManipulationSequence.ProcessManipulators, which I can't find in Microsoft Reference Source.
When the exception is thrown, I can see that one line down in the Call Stack window it references Windows.Input.Manipulations.ManipulationProcessor2D.ProcessManipulators, which does exist in Microsoft Reference Source.
But as you can see, it doesn't have a sibling class named ManipulationSequence.
As for the exception itself, it is a System.ArgumentOutOfRangeException with the message: "Timestamp values must not decrease. Parameter name: timestamp. Actual value was 6590630705479."
The fully qualified signature of the method throwing the exception is System.Windows.Input.Manipulations.ManipulationSequence.ProcessManipulators(long timestamp, System.Collections.Generic.IEnumerable<System.Windows.Input.Manipulations.Manipulator2D> manipulators, System.Windows.Input.Manipulations.ManipulationSequence.ISettings settings)
It appears as if one other person in the universe has had this problem, but it could not be reproduced according to the only comment.
I have 6 MediaElement objects on a canvas that are all running videos when being manipulated, so I feel as though it might have something to do with the CPU being taxed and slowing down, possibly making timestamps be sent into the method out of order (though the same problem occurs when using Image rather than MediaElement). The exception happens sporadically, sometimes it will happen after just a few seconds of messing around with the objects, sometimes it can go for a few minutes or more of manipulating the objects.
My code that does the actual manipulation within ManipulationDelta looks like this:
//Get current values to manipulate
TransformGroup group = (TransformGroup)element.RenderTransform.Clone();
TranslateTransform translate = (TranslateTransform)group.Children[0].Clone();
ScaleTransform scale = (ScaleTransform)group.Children[1].Clone();
RotateTransform rotate = (RotateTransform)group.Children[2].Clone();
//...does manipulations on each by changing values...
//Apply transformation changes
group.Children[0] = translate;
group.Children[1] = scale;
group.Children[2] = rotate;
element.RenderTransform = group;
I have a Storyboard in XAML messing with the RotateTransform, so I can't really use MatrixTransform.
I am creating this using WPF with .NET 4.5.1. The error occurs in both Windows 8.1 and Windows 7. Any ideas on how to prevent this exception from occurring?
Some thoughts as I investigate the problem:
I also have ManipulationInertiaStarting in play here as a possible cause of this error.
I just added e.Handled = true; to the end of ManipulationCompleted, which wasn't there before. I haven't got the error since (though, again, very sporadic, so it is hard to tell when it is fixed).
If a ManipulationDelta method is not yet complete, and it is hit again from user input, could there be some sort of race condition occurring where the first method hit is starved for CPU resources and the second runs through, then when the first method finally completes the timestamp created is in the past?
Per a comment, this isn't likely.
I conferred with a co-worker to gain better understanding. He helped me realize I can't swallow the exception from within my methods that handle manipulation events because the exception is happening before it gets there, in the actual creation of the manipulation data. So the only place I can handle the exception is on App.Main() (the first place in the Call Stack where my code exists), which makes handling it gracefully all the more difficult.
I had this exact problem myself.
After a lot of testing it could be reproduced with slower machines under heavy load.
The application was for digital signage and showed a lot of different items (video, HTML, images, etc.) and also had some animations.
I am not sure about it, but it seems to be a problem with handling the input events in time.
For myself, I could "solve" this issue by moving code out of the manipulation handlers and running it asynchronously, and by profiling and rewriting code performance-wise (I shortened the path run inside the event as much as possible, and did everything that could be deferred later with a Task).
I also added an exception handler to my application to "ignore and log" this issue, because it had no other impact.
Feel free to contact me for more info on this.
PS: this is my first answer here, so I hope it is all right the way I wrote it.
I had similar problems when developing for WinRT.
It's sometimes possible to use the DispatcherUnhandledException event to ignore one particular exception. To do that, add an event listener, check whether the exception is the one you want (because it's generally a bad idea to suppress all exceptions), and then set the Handled property.
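A minimal sketch of that approach in App.xaml.cs (the filter below matches the timestamp exception from the question above; the placement and names are illustrative):

public partial class App : Application
{
    protected override void OnStartup(StartupEventArgs e)
    {
        base.OnStartup(e);
        DispatcherUnhandledException += (sender, args) =>
        {
            // Suppress only the known manipulation-timestamp exception;
            // every other exception still crashes loudly, as it should.
            var ex = args.Exception as ArgumentOutOfRangeException;
            if (ex != null && ex.ParamName == "timestamp")
            {
                args.Handled = true;  // ignore (and optionally log) this one case
            }
        };
    }
}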
OK, so the title may have been confusing, so I have posted two code snippets to illustrate what I mean.
NOTE: allUsers is just a collection.
RegularUser regUser = new RegularUser(userName, password, name, emailAddress);
allUsers.Add(regUser);
VS
allUsers.Add(new RegularUser(userName, password, name, emailAddress));
Which snippet, A or B, is better, and why?
What are the advantages or disadvantages?
The example I wrote is C#, but does the language (C#, Java, etc.) make a difference?
As far as C# is concerned, both of your code examples are practically identical at the IL level. The second example still creates a reference to the created object and pushes it onto the stack; you just don't have a local variable hooked up to it. This will not create any performance problems at all.
1) Which snippet A or B is better and why?
They're really identical. The compiled code will be nearly identical, since a temporary object is pushed onto the stack, then used in the method call.
2) What are the advantages or disadvantages?
The main advantages and disadvantages to the approach are really just readability.
Your first example has the advantage of keeping a single "operation" per line of code, which, in many ways, is more maintainable.
The second example removes the unnecessary variable declaration, which may be more maintainable.
Personally, I feel that the number of parameters in your RegularUser constructor would probably push me, in this instance, towards your first option. I typically find that, when a line of code gets to be more than about half a screen width on an average monitor, it's easier to read and understand if it's split up. Splitting this up by introducing the temporary and calling Add separately makes this more clear.
However, if you're just adding an integer or a class that's very small, I'd probably vote to skip the unnecessary variable. This is completely a personal preference, however; your mileage may (and probably will) vary.
3) The example i wrote was C# but does the language (C#, Java etc) make a difference?
No, for the most part. This is really language/implementation dependent, but most languages will have the same basic behavior and performance in both cases. It is possible (and highly likely) that some languages may treat this differently, but most mainstream languages will not.
I really like to create them the first way unless I really really know what is going on. It is much harder to do debugging if you don't create the object first...
The compiler will just turn the 2nd version into the 1st for you, anyway, so there isn't a net negative effect.
Pros of #1:
easier to debug (!)
theoretically easier to read, clearer
can use the object later
Cons:
more verbose
can be unnecessary, especially for trivial objects
Result:
1 for anything complex to create, or that may need to be inspected easily at debug time
2 for lots of annoying little stuff, like the following.
var list = new List<NameValuePair>(3);
list.Add(new NameValuePair("name", "valuable"));
list.Add(new NameValuePair("age", "valuable"));
list.Add(new NameValuePair("height", "not valuable"));

var dates = new List<DateTime>();
dates.Add(DateTime.Now);
dates.Add(DateTime.Now.Date.AddYears(-2));
As far as I know there isn't a real difference between languages when it comes to this. Some may not allow it, though.
Both are equal in terms of performance.
In terms of maintainability the second case is a nightmare; it is (nearly) impossible to trace in a debugger, so I tend to prefer the first one. In my early OOP days I was always writing the second, because "I knew that they were objects and I was sooo good at grasping objects that I ... blah blah blah", but that wore off with time, and especially with maintenance time.
Also, suppose that someone wants you to
FilterClass.FilterUser(regUser)
or
Database.AddToDatabase(regUser)
because it is the right place to do so, the first scenario is better.
Finally, when do you stop?
allUsers.Add(new RegularUser(new ReadFromInput(new EscapedName(new Name(new String(userName)))), password, name, emailAddress));
[ This is a result of Best Practice: Should functions return null or an empty object? but I'm trying to be very general. ]
In a lot of legacy (um...production) C++ code that I've seen, there is a tendency to write a lot of NULL (or similar) checks to test pointers. Many of these get added near the end of a release cycle when adding a NULL-check provides a quick fix to a crash caused by the pointer dereference--and there isn't a lot of time to investigate.
To combat this, I started to write code that took a (const) reference parameter instead of the (much) more common technique of passing a pointer. No pointer, no desire to check for NULL (ignoring the corner case of actually having a null reference).
In C#, the same C++ "problem" is present: the desire to check every unknown reference against null (ArgumentNullException) and to quickly fix NullReferenceExceptions by adding a null check.
It seems to me, one way to prevent this is to avoid null objects in the first place by using empty objects (String.Empty, EventArgs.Empty) instead. Another would be to throw an exception rather than return null.
I'm just starting to learn F#, but it appears there are far fewer null objects in that environment. So maybe you don't really have to have a lot of null references floating around?
Am I barking up the wrong tree here?
Passing non-null just to avoid a NullReferenceException is trading a straightforward, easy-to-solve problem ("it blows up because it's null") for a much more subtle, hard-to-debug problem ("something several calls down the stack is not behaving as expected because much earlier it got some object which has no meaningful information but isn't null").
NullReferenceException is a wonderful thing! It fails hard, loud, fast, and it's almost always quick and easy to identify and fix. It's my favorite exception, because I know when I see it, my task is only going to take about 2 minutes. Contrast this with a confusing QA or customer report trying to describe strange behavior that has to be reproduced and traced back to the origin. Yuck.
It all comes down to what you, as a method or piece of code, can reasonably infer about the code which called you. If you are handed a null reference, and you can reasonably infer what the caller might have meant by null (maybe an empty collection, for example?) then you should definitely just deal with the nulls. However, if you can't reasonably infer what to do with a null, or what the caller means by null (for example, the calling code is telling you to open a file and gives the location as null), you should throw an ArgumentNullException.
If you maintain proper coding practices like this at every "gateway" point (the logical bounds of functionality in your code), NullReferenceExceptions should be much rarer.
I tend to be dubious of code with lots of NULLs, and try to refactor them away where possible with exceptions, empty collections, Java Optionals, and so on.
The "Introduce Null Object" pattern in Martin Fowler's Refactoring (page 260) may also be helpful. A Null Object responds to all the methods a real object would, but in a way that "does the right thing". So rather than always check an Order to see if order.getDiscountPolicy() is NULL, make sure the Order has a NullDiscountPolicy in these cases. This streamlines the control logic.
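A minimal C# sketch of that idea, using Fowler's discount example (the interface and class names here are assumptions, not Fowler's code):

public interface IDiscountPolicy
{
    decimal Apply(decimal price);
}

// The Null Object: responds like a real policy, but "does the right thing"
// for the no-discount case, so callers never have to test for null.
public class NullDiscountPolicy : IDiscountPolicy
{
    public decimal Apply(decimal price) { return price; }
}

// Order guarantees GetDiscountPolicy() never returns null:
public class Order
{
    private IDiscountPolicy discountPolicy = new NullDiscountPolicy();
    public IDiscountPolicy GetDiscountPolicy() { return discountPolicy; }
}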
Null gets my vote. Then again, I'm of the 'fail-fast' mindset.
String.IsNullOrEmpty(...) is very helpful too, I guess it catches either situation: null or empty strings. You could write a similar function for all your classes you're passing around.
If you are writing code that returns null as an error condition, then don't: generally, you should throw an exception instead - far harder to miss.
If you are consuming code that you fear may return null, then mostly these are boneheaded exceptions: perhaps do some Debug.Assert checks at the caller to sense-check the output during development. You shouldn't really need vast numbers of null checks in your production code, but if some 3rd-party library returns lots of nulls unpredictably, then sure: do the checks.
In 4.0, you might want to look at code-contracts; this gives you much better control to say "this argument should never be passed in as null", "this function never returns null", etc - and have the system validate those claims during static analysis (i.e. when you build).
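A small sketch of what that can look like with System.Diagnostics.Contracts (the method and types here are illustrative):

using System.Diagnostics.Contracts;

public Customer FindCustomer(string id)
{
    Contract.Requires(id != null);                          // "never passed in as null"
    Contract.Ensures(Contract.Result<Customer>() != null);  // "never returns null"

    return customerRepository.Lookup(id);  // hypothetical lookup
}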
The thing about null is that it doesn't come with meaning. It is merely the absence of an object.
So, if you really mean an empty string/collection/whatever, always return the relevant object and never null. If the language in question allows you to specify that, do so.
In the case where you want to return something that means not a value specifiable with the static type, then you have a number of options. Returning null is one answer, but without a meaning is a little dangerous. Throwing an exception may actually be what you mean. You might want to extend the type with special cases (probably with polymorphism, that is to say the Special Case Pattern (a special case of which is the Null Object Pattern)). You might want to wrap the return value in an type with more meaning. Or you might want to pass in a callback object. There usually are many choices.
I'd say it depends. For a method returning a single object, I'd generally return null. For a method returning a collection, I'd generally return an empty collection (non-null). These are more along the lines of guidelines than rules, though.
If you are serious about wanting to program in a "nullless" environment, consider using extension methods more often; they are immune to NullReferenceExceptions and at least "pretend" that null isn't there anymore:
public static string GetExtension(this string s)
{
    // (extension methods must live in a static class)
    return new FileInfo(s ?? "").Extension;
}
which can be called as:
// this code will never throw, not even when somePath becomes null
string somePath = GetDataFromElseWhereCanBeNull();
textBoxExtension.Text = somePath.GetExtension();
I know, this is only a convenience, and many people rightly consider it a violation of OO principles (though the "founder" of OO, Bertrand Meyer, considers null evil and completely banished it from his OO design, which is applied in the Eiffel language, but that's another story). EDIT: Dan mentions that Bill Wagner (More Effective C#) considers it bad practice, and he's right. Ever considered the IsNull extension method ;-) ?
To make your code more readable, another hint may be in order: use the null-coalescing operator more often to designate a default when an object is null:
// load settings
WriteSettings(currentUser.Settings ?? new Settings());
// example of some readonly property
public string DisplayName
{
    get
    {
        return (currentUser ?? User.Guest).DisplayName;
    }
}
None of these take the occasional check for null away (and ?? is nothing more than a hidden if-branch). I prefer as little null in my code as possible, simply because I believe it makes the code more readable. When my code gets cluttered with if-statements for null, I know there's something wrong in the design and I refactor. I suggest anybody do the same, but I know that opinions vary wildly on the matter.
(Update) Comparison with exceptions
Not mentioned in the discussion so far is the similarity with exception handling. When you find yourself ubiquitously ignoring null whenever you consider it's in your way, it is basically the same as writing:
try
{
//...code here...
}
catch (Exception) {}
which has the effect of removing any trace of the exceptions only to find it raises unrelated exceptions much later in the code. Though I consider it good to avoid using null, as mentioned before in this thread, having null for exceptional cases is good. Just don't hide them in null-ignore-blocks, it will end up having the same effect as the catch-all-exceptions blocks.
As for the exception protagonists: their position usually stems from transactional programming and strong exception-safety guarantees, or from blind guidelines. At any decent complexity (async workflows, I/O, and especially networking code) exceptions are simply inappropriate. That is why you see Google-style docs on the matter in C++, and why all good async code "doesn't enforce it" (think of your favourite managed pools as well).
There is more to it, and while it might look like a simplification, it really is that simple. For one, you will get a lot of exceptions in something that wasn't designed for heavy exception use... anyway, I digress; read up on this from the world's top library designers. The usual place is Boost (just don't mix it up with the other camp in Boost that loves exceptions, because they had to write music software :-).
In your instance, and this is not Fowler's expertise, an efficient "empty object" idiom is only possible in C++ thanks to the available casting mechanisms (perhaps, but certainly not always, by means of dominance). On the other hand, with your null type you are capable of throwing exceptions and doing whatever you want while preserving the clean call site and structure of the code.
In C#, your choice can be a single instance of a type that is either good or malformed; as such it is capable of throwing exceptions or simply running as is. So it might or might not violate other contracts (up to you what you think is better, depending on the quality of the code you're facing).
In the end, it does clean up call sites, but don't forget you will face a clash with many libraries (especially returns from containers/dictionaries; end iterators spring to mind, as does any other code "interfacing" with the outside world). Plus, null-as-value checks are extremely optimised pieces of machine code, something to keep in mind; but I will agree any day that wild pointer usage without understanding constness, references, and more is going to lead to different kinds of mutability, aliasing, and perf problems.
To add: there is no silver bullet, and crashing on a null reference, using a null reference in managed space, or throwing and not handling an exception are identical problems, despite what the managed and exception world will try to sell you. Any decent environment offers protection from those (heck, you can install any filter on any OS you want; what else do you think VMs do?), and there are so many other attack vectors that this one has been overhammered to pieces. Enter x86 verification from Google yet again, their own way of doing much faster and better "IL", "dynamic"-friendly code, etc.
Go with your instinct on this, weigh the pros and cons, and localise the effects. In the future your compiler will optimise all that checking anyway, and far more efficiently than any runtime or compile-time human method (but not as easily for cross-module interaction).
I try to avoid returning null from a method wherever possible. There are generally two kinds of situations - when null result would be legal, and when it should never happen.
In the first case, when no result is legal, there are several solutions available to avoid null results and null checks that are associated with them: Null Object pattern and Special Case pattern are there to return substitute objects that do nothing, or do some specific thing under specific circumstances.
If it is legal to return no object, but still there are no suitable substitutes in terms of Null Object or Special Case, then I typically use the Option functional type - I can then return an empty option when there is no legal result. It is then up to the client to see what is the best way to deal with empty option.
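For illustration, a bare-bones Option in C# might look like this (a sketch; this is not a standard BCL type):

public struct Option<T>
{
    private readonly bool hasValue;
    private readonly T value;

    private Option(T value) { this.hasValue = true; this.value = value; }

    public bool HasValue { get { return hasValue; } }
    public T Value { get { return value; } }

    public static Option<T> Some(T value) { return new Option<T>(value); }
    public static Option<T> None { get { return default(Option<T>); } }
}

// The client then decides what to do with an empty option:
// Option<Customer> found = FindCustomer(id);
// if (found.HasValue) { /* use found.Value */ }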
Finally, if it is not legal to have any object returned from a method, simply because the method cannot produce its result if something is missing, then I choose to throw an exception and cut further execution.
How are empty objects better than null objects? You're just renaming the symptom. The problem is that the contracts for your functions are too loosely defined "this function might return something useful, or it might return a dummy value" (where the dummy value might be null, an "empty object", or a magic constant like -1.) But no matter how you express this dummy value, callers still have to check for it before they use the return value.
If you want to clean up your code, the solution should be to narrow down the function so that it doesn't return a dummy value in the first place.
If you have a function which might return a value, or might return nothing, then pointers are a common (and valid) way to express this. But often, your code can be refactored so that this uncertainty is removed. If you can guarantee that a function returns something meaningful, then callers can rely on it returning something meaningful, and then they don't have to check the return value.
You can't always return an empty object, because 'empty' is not always defined. For example what does it mean for an int, float or bool to be empty?
Returning a NULL pointer is not necessarily a bad practice, but I think it's a better practice to return a (const) reference (where it makes sense to do so of course).
And recently I've often used a Fallible class:
Fallible<std::string> theName = obj.getName();
if (theName)
{
// ...
}
There are various implementations available for such a class (check Google Code Search), I also created my own.
I have inherited some code that uses the ref keyword extensively and unnecessarily. The original developer apparently feared objects would be cloned like primitive types if ref was not used, and did not bother to research the issue before writing 50k+ lines of code.
This, combined with other bad coding practices, has created some situations that are absurdly dangerous on the surface. For example:
Customer person = NextInLine();
//person is Alice
person.DataBackend.ChangeAddress(ref person, newAddress);
//person could now be Bob, Eve, or null
Could you imagine walking into a store to change your address, and walking out as an entirely different person?
Scary, but in practice the use of ref in this application seems harmlessly superfluous. I am having trouble justifying the extensive amount of time it would take to clean it up. To help sell the idea, I pose the following question:
How else can unnecessary use of ref be destructive?
I am especially concerned with maintenance. Plausible answers with examples are preferred.
You are also welcome to argue clean-up is not necessary.
I would say the biggest danger is if the parameter were set to null inside the function for some reason:
public void MakeNull(ref Customer person)
{
    // random code
    person = null;
    return;
}
Now, you're not just a different person, you've been erased from existence altogether!
As long as whoever is developing this application understands that:
By default, object references are passed by value.
and:
With the ref keyword, object references are passed by reference.
If the code works as expected now and your developers understand the difference, it's probably not worth the effort it's going to take to remove them all.
I'll just add the worst use of the ref keyword I've ever seen, the method looked something like this:
public bool DoAction(ref Exception exception) {...}
Yup, you had to declare and pass an Exception reference in order to call the method, then check the method's return value to see if an exception had been caught and returned via the ref exception.
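For contrast, the conventional shape for that intent is the Try pattern with an out parameter (a sketch; the catch-all is kept only to mirror the original method's behavior):

public bool TryDoAction(out Exception exception)
{
    try
    {
        // ... do the actual work ...
        exception = null;
        return true;
    }
    catch (Exception ex)  // catch-all kept only to mirror the original design
    {
        exception = ex;
        return false;
    }
}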
Can you work out why the originator of the code thought they needed to have the parameter as a ref? Was it because they did update it and then removed the functionality, or was it simply because they didn't understand C# at the time?
If you think the clean-up is worth it, then go ahead with it - particularly if you have the time now. You might not be in a position to fix it properly when a real issue does arise, as it will most likely be an urgent bug fix and you won't have the time to do a proper job.
It is quite common in C# to modify the values of arguments inside methods, since they usually arrive by value and not by ref. This applies to both reference and value types; with a ref argument, setting the reference to null, for instance, would change the original reference as well. This can lead to very strange and painful bugs when other developers work "as usual". Creating recursive methods with ref arguments is a no-go.
Apart from this, you have restrictions on what you can pass by ref. For instance, you cannot pass constant values, readonly fields, properties, etc., so a lot of helper variables are required when calling methods with ref arguments.
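A quick sketch of those restrictions (SomeProperty here is a hypothetical int property):

void Bump(ref int x) { x++; }

// At a call site:
const int Limit = 10;
int n = 1;

Bump(ref n);               // fine: a writable local variable
// Bump(ref Limit);        // compile error: a constant cannot be passed by ref
// Bump(ref SomeProperty); // compile error: a property cannot be passed by ref

int temp = SomeProperty;   // so a helper variable is needed instead
Bump(ref temp);
SomeProperty = temp;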
Last but not least, the performance is likely not as good, since ref requires more indirection (a ref is just a reference which needs to be resolved on every access) and may also keep objects alive longer, since the references don't go out of scope as quickly.
To me, smells like a C++ developer making unwarranted assumptions.
I'd be wary of making wholesale changes to something that works. (I'm assuming it works because you don't comment about it being broken, just about it being dangerous).
The last thing you want to do is to break something subtle and have to spend a week tracking down the problem.
I suggest you clean up as you go - one method at a time.
Find a method that uses ref where you're sure it isn't required.
Change the method signature and fix up the calls.
Test.
Repeat.
While the specific problems you have may be more severe than most cases, your situation is pretty common - having a large code base that doesn't comply with our current understanding of the "right way" to do things.
Wholesale "upgrades" often run into difficulties. Refactoring as you go - bringing things up to spec as you work on them - is much safer.
There's precedent here in the building industry. For example, electrical wiring in older (say, 19th century) buildings doesn't need to be touched unless there's a problem. When there is a problem, though, the new work has to be completed to modern standards.
I would try to fix it. Just perform a solution-wide string replacement with a regex, and check the unit tests after that. I am aware that this might break the code. But how often do you use ref? Almost never, right? And given that the developer did not know how it works, I consider the chance that it is used somewhere (the way it should be) even smaller. And if the code breaks - well, roll back ...
How else can unnecessary use of ref be destructive?
The other answers deal with semantic issues which is definitely the most important thing. Good code is self-documenting, and when I give a ref parameter, I assume it will change. If it doesn't, the API failed to be self-documenting.
But for fun, how about we look at another aspect -- performance?
void ChangeAddress(ref Customer person, Address address)
{
    person.Address = address;
}
Here person is a reference to a reference, so there will be some indirection introduced whenever you access it. Let's look at some assembly this might translate into:
mov eax, [person] ; load the reference to person.
mov [eax+Address], address ; assign address to person.Address.
Now the non-ref version:
void ChangeAddress(Customer person, Address address)
{
    person.Address = address;
}
There's no indirection here, so we can get rid of one read:
mov [person+Address], address ; assign address to person.Address.
In practice, one hopes that .NET caches [person] in the ref version, amortizing the indirection cost over multiple accesses. It probably won't actually be a 50% drop in instruction count outside of trivial methods like the ones here.
In C# I use the #warning and #error directives,
#warning This is dirty code...
#error Fix this before everything explodes!
This way, the compiler will let me know that I still have work to do. What technique do you use to mark code so you won't forget about it?
Mark them with // TODO, // HACK or other comment tokens that will show up in the task pane in Visual Studio.
See Using the Task List.
Todo comment as well.
We've also added a special keyword, NOCHECKIN, and a commit hook in our source control system (very easy to do with at least CVS or SVN) that scans all files and refuses to check in a file if it finds the text NOCHECKIN anywhere.
This is very useful if you just want to test something out and be certain that it doesn't accidentally get checked in (past the watchful eyes during the diff of everything that's committed to source control).
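A toy C# version of such a scanner that a hook could invoke (purely illustrative; real hooks are usually shell scripts wired into CVS/SVN):

using System;
using System.IO;
using System.Linq;

class NoCheckinScanner
{
    // Exit code 1 blocks the commit when any given file contains NOCHECKIN.
    static int Main(string[] args)
    {
        var offenders = args.Where(f => File.ReadAllText(f).Contains("NOCHECKIN"))
                            .ToList();
        foreach (var file in offenders)
            Console.Error.WriteLine("NOCHECKIN found in " + file);
        return offenders.Count > 0 ? 1 : 0;
    }
}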
I use a combination of //TODO: //HACK: and throw new NotImplementedException(); on my methods to denote work that was not done. Also, I add bookmarks in Visual Studio on lines that are incomplete.
//TODO: Person's name - please fix this.
This is in Java; you can then look at tasks in Eclipse, which will locate all references to this tag and can group them by person, so that you can assign a TODO to someone else or look only at your own.
If I've got to drop everything in the middle of a change, then
#error finish this
If it's something I should do later, it goes into my bug tracker (which is used for all tasks).
'To do' comments are great in theory, but not so good in practice, at least in my experience. If you are going to be pulled away for long enough to need them, then they tend to get forgotten.
I favor Jon T's general strategy, but I usually do it by just plain breaking the code temporarily - I often insert a deliberately undefined method reference and let the compiler remind me about what I need to get back to:
PutTheUpdateCodeHere();
An approach that I've really liked is "Hack Bombing", as demonstrated by Oren Eini here.
try
{
    // do stuff
    return true;
}
catch // no idea how to prevent an exception here at the moment; this makes it work for now...
{
    if (DateTime.Today > new DateTime(2007, 2, 7))
        throw new InvalidOperationException("fix me already!! no catching exceptions like this!");
    return false;
}
Add a test in a disabled state. They show up in all the build reports.
If that doesn't work, I file a bug.
In particular, I haven't seen TODO comments ever decrease in quantity in any meaningful way. If I didn't have time to do it when I wrote the comment, I don't know why I'd have time later.
//TODO: Finish this
If you use VS, you can set up your own Task Tags under Tools > Options > Environment > Task List.
gvim highlights both "// XXX" and "// TODO" in yellow, which amazed me the first time I marked some code that way to remind myself to come back to it.
I'm a C++ programmer, but I imagine my technique could be easily implemented in C# or any other language for that matter:
I have a ToDo(msg) macro that expands into constructing a static object at local scope whose constructor outputs a log message. That way, the first time I execute unfinished code, I get a reminder in my log output that tells me that I can defer the task no longer.
It looks like this:
class ToDo_helper
{
public:
    ToDo_helper(const std::string& msg, const char* file, int line)
    {
        std::string header(79, '*');
        Log(LOG_WARNING) << header << '\n'
                         << "  TO DO:\n"
                         << "    Task: " << msg << '\n'
                         << "    File: " << file << '\n'
                         << "    Line: " << line << '\n'
                         << header;
    }
};

#define TODO_HELPER_2(X, file, line) \
    static Error::ToDo_helper tdh##line(X, file, line)
#define TODO_HELPER_1(X, file, line) TODO_HELPER_2(X, file, line)
#define ToDo(X) TODO_HELPER_1(X, __FILE__, __LINE__)
... and you use it like this:
void some_unfinished_business() {
ToDo("Take care of unfinished business");
}
It's not a perfect world, and we don't always have infinite time to refactor or ponder the code.
I sometimes put //REVIEW in the code if it's something I want to come back to later. i.e. code is working, but perhaps not convinced it's the best way.
// REVIEW - RP - Is this the best way to achieve x? Could we use algorithm y?
Same goes for //REFACTOR
// REFACTOR - should pull this method up and remove near-dupe code in XYZ.cs
I use // TODO: or // HACK: as a reminder that something is unfinished with a note explaining why.
I often (read 'rarely') go back and finish those things due to time constraints.
However, when I'm looking over the code I have a record of what was left uncompleted and more importantly WHY.
One more comment I use often at the end of the day or week:
// START HERE CHRIS
^^^^^^^^^^^^^^^^^^^^
Tells me where I left off so I can minimize my bootstrap time on Monday morning.
// TODO: <explanation>
if it's something that I haven't gotten around to implementing, and don't want to forget.
// FIXME: <explanation>
if it's something that I don't think works right, and want to come back later or have other eyes look at it.
Never thought of the #error/#warning options. Those could come in handy too.
I use //FIXME: xxx for broken code, and //CHGME: xxx for code that needs attention but works (perhaps only in a limited context).
Todo Comment.
These are three different ways I have found helpful to flag something that needs to be addressed.
Place a comment flag next to the code that needs to be looked at. Most IDEs can recognize common flags and display them in an organized fashion; usually your IDE has a task window specifically designed for these flags. The most common comment flag is //TODO. This is how you would use it:
//TODO: Fix this before it is released. This causes an access violation because it is using memory that isn't created yet.
One way to flag something that needs to be addressed before release would be to create a useless variable. Most compilers will warn you if you have a variable that isn't used. Here is how you could use this technique:
int This_Is_An_Access_Violation = 0;
IDE bookmarks. Most products will come with a way to place a bookmark in your code for future reference. This is a good idea, except that it can only be seen by you. When you share your code, most IDEs won't share your bookmarks. You can check the help file system of your IDE to see how to use its bookmarking features.
I also use TODO: comments. I understand the criticism that they rarely actually get fixed, and that they'd be better off reported as bugs. However, I think that misses a couple points:
I use them most during heavy development, when I'm constantly refactoring and redesigning things. So I'm looking at them all the time. In situations like that, most of them actually do get addressed. Plus it's easy to do a search for TODO: to make sure I didn't miss anything.
It can be very helpful for people reading your code, to know the spots that you think were poorly written or hacked together. If I'm reading unfamiliar code, I tend to look for organizational patterns, naming conventions, consistent logic, etc.. If that consistency had to be violated one or two times for expediency, I'd rather see a note to that effect. That way I don't waste time trying to find logic where there is none.
If it's some long term technical debt, you can comment like:
// TODO: This code loan causes an annual interest rate of 7.5% developer/hour. Upfront fee as stated by the current implementation. This contract is subject of prior authorization from the DCB (Developer's Code Bank), and tariff may change without warning.
... err. I guess a TODO will do it, as long as you don't simply ignore them.
This is my list of temporary comment tags I use:
//+TODO Usual meaning.
//+H Where I was working last time.
//+T Temporary/test code.
//+B Bug.
//+P Performance issue.
To indicate different priorities, e.g.: //+B vs //+B+++
Advantages:
Easy to search-in/remove-from the code (look for //+).
Easy to filter on a priority basis, e.g.: search for //+B to find all bugs, search for //+B+++ to only get high priority ones.
Can be used with C++, C#, Java, ...
Why the //+ notation? Because the + symbol looks like a little t, for temporary.
Note: this is not a Standard recommendation, just a personal one.
As most programmers seem to do here, I use TODO comments. Additionally, I use Eclipse's task interface Mylyn. When a task is active, Mylyn remembers all resources I have opened. This way I can track
where in a file I have to do something (and what),
in which files I have to do it, and
to what task they are related.
Besides keying off the "TODO:" comment, many IDE's also key off the "TASK:" comment. Some IDE's even let you configure your own special identifier.
It is probably not a good idea to sprinkle your code base with uninformative TODOs, especially if you have multiple contributors over time. This can be quite confusing to the newcomers. However, what seems to me to work well in practice is to state the author and when the TODO was written, with a header (50 characters max) and a longer body.
Whatever you pack into the TODO comments, I'd recommend to be systematic in how you track them. For example, there is a service that examines the TODO comments in your repository based on git blame (http://www.tickgit.com).
I developed my own command-line tool to enforce a consistent style for TODO comments using ideas from the answers here (https://github.com/mristin/opinionated-csharp-todos). It was fairly easy to integrate into continuous integration so that the task list is regenerated on every push to master.
It also makes sense to have the task list separate from your IDE for situations when you discuss the TODOs in a meeting with other people, when you want to share it by email etc.