I have inherited some code that uses the ref keyword extensively and unnecessarily. The original developer apparently feared objects would be cloned like primitive types if ref was not used, and did not bother to research the issue before writing 50k+ lines of code.
This, combined with other bad coding practices, has created some situations that are absurdly dangerous on the surface. For example:
Customer person = NextInLine();
//person is Alice
person.DataBackend.ChangeAddress(ref person, newAddress);
//person could now be Bob, Eve, or null
Could you imagine walking into a store to change your address, and walking out as an entirely different person?
Scary, but in practice the use of ref in this application seems harmlessly superfluous. I am having trouble justifying the extensive amount of time it would take to clean it up. To help sell the idea, I pose the following question:
How else can unnecessary use of ref be destructive?
I am especially concerned with maintenance. Plausible answers with examples are preferred.
You are also welcome to argue clean-up is not necessary.
I would say the biggest danger is if the parameter were set to null inside the function for some reason:
public void MakeNull(ref Customer person)
{
// random code
person = null;
return;
}
Now, you're not just a different person, you've been erased from existence altogether!
As long as whoever is developing this application understands that:
By default, object references are passed by value.
and:
With the ref keyword, object references are passed by reference.
If the code works as expected now and your developers understand the difference, it's probably not worth the effort it's going to take remove them all.
I'll just add the worst use of the ref keyword I've ever seen, the method looked something like this:
public bool DoAction(ref Exception exception) {...}
Yup, you had to declare and pass an Exception reference in order to call the method, then check the method's return value to see if an exception had been caught and returned via the ref exception.
Can you work out why the originator of the code thought that they needed to have the parameter as a ref? Was it because they did update it and then removed the functionality or was it simply because they didn't understand c# at the time?
If you think the clean up is worth it, then go ahead with it - particularly if you have the time now. You might not be in a position to do fix it properly when a real issue does arise, as it will most likely be an urgent bug fix and you won't have the time to do a proper job.
It is quite common in C# to modify the values of arguments in methods since they usually are by value, and not by ref. This applies to both reference and value types; setting a reference to null for instance would change the original reference. This could lead to very strange and painful bugs when other developers work "as usual". Creating recursive methods with ref arguments is a no-go.
Apart from this, you have restrictions on what you can pass by ref. For instance, you cannot pass constant values, readonly fields, properties etc., so that a lot of helper variables are required when calling methods with ref arguments.
Last but not least the performance if likely not nearly as well, since it requires more indirections (a ref is just a reference which needs to be resolved on every access) and may also keep objects alive longer, since the references are not going out of scope as quickly.
To me, smells like a C++ developer making unwarranted assumptions.
I'd be wary of making wholesale changes to something that works. (I'm assuming it works because you don't comment about it being broken, just about it being dangerous).
The last thing you want to do is to break something subtle and have to spend a week tracking down the problem.
I suggest you clean up as you go - one method at a time.
Find a method that uses ref where you're sure it isn't required.
Change the method signature and fix up the calls.
Test.
Repeat.
While the specific problems you have may be more severe than most cases, your situation is pretty common - having a large code base that doesn't comply with our current understanding of the "right way" to do things.
Wholesale "upgrades" often run into difficulties. Refactoring as you go - bringing things up to spec as you work on them - is much safer.
There's precedent here in the building industry. For example, electrical wiring in older (say, 19th century) buildings doesn't need to be touched unless there's a problem. When there is a problem, though, the new work has to be completed to modern standards.
I would try to fix it. Just perform a solution wide string replacement with regex and check the unit tests after that. I am aware that this might break the code. But how often do you use ref? Almost never, right? And given that the developer did not know how it works, I consider the chance that it is used somewhere (the way it should) even smaller. And if the code break - well, rollback ...
How else can unnecessary use of ref be destructive?
The other answers deal with semantic issues which is definitely the most important thing. Good code is self-documenting, and when I give a ref parameter, I assume it will change. If it doesn't, the API failed to be self-documenting.
But for fun, how about we look at another aspect -- performance?
void ChangeAddress(ref Customer person, Address address)
{
person.Address = address;
}
Here person is a reference to a reference, so there will be some indirection introduced whenever you access it. Lets look at some assembly this might translate into:
mov eax, [person] ; load the reference to person.
mov [eax+Address], address ; assign address to person.Address.
Now the non-ref version:
void ChangeAddress(Customer person, Address address)
{
person.Address = address;
}
There's no indirection here, so we can get rid of one read:
mov [person+Address], address ; assign address to person.Address.
In practice, one hopes that .NET caches [person] in the ref version, amortizing the indirection cost over multiple accesses. It probably won't actually be a 50% drop in instruction count outside of trivial methods like the ones here.
Related
Is it possible to create an object that can register whether the current thread leaves the method where it was created, or to check whether this has happened when a method on the instance gets called?
ScopeValid obj;
void Method1()
{
obj = new ScopeValid();
obj.Something();
}
void Method2()
{
Method1();
obj.Something(); //Exception
}
Can this technique be accomplished? I would like to develop a mechanism similar to TypedReference and ArgIterator, which can't "escape" the current method. These types are handled specially by the compiler, so I can't mimic this behavior exactly, but I hope it is possible to create at least a similar rule with the same results - disallow accessing the object if it has escaped the method where it was created.
Note that I can't use StackFrame and compare methods, because the object might escape and return to the same method.
Changing method behavior based upon the source of the call is a bad design choice.
Some example problems to consider with such a method include:
Testability - how would you test such a method?
Refactoring the calling code - What if the user of your code just does an end run around your error message that says you can't do that in a different method than it was created? "Okay, fine! I'll just do my bad thing in the same method, says the programmer."
If the user of your code breaks it, and it's their fault, let it break. Better to just document your code with something like:
IInvalidatable - Types which implement this member should be invalidated with Invalidate() when you are done working with this.
Ignoring the obvious point that this almost seems like is re-inventing IDisposible and using { } blocks (which have language support), if the user of your code doesn't use it right, it's not really your concern.
This is likely technically possible with AOP (I'm thinking PostSharp here), but it still depends on the user using your code correctly - they would have to have it in the build process, and failing to function if they aren't using a tool just because you're trying to make it easy on them is evil.
Another point - If you are just attempting to create an object which cannot be used outside of one method, and any attempted operation outside of the method would fail, just declare it a local inside the method.
Related: How to find out which assembly handled the request
Years laters, it seems this feature was finally added to C# 7.2: ref struct.
Another related language feature is the ability to declare a value type that must be stack allocated. In other words, these types can never be created on the heap as a member of another class. The primary motivation for this feature was Span and related structures. Span may contain a managed pointer as one of its members, the other being the length of the span. It's actually implemented a bit differently because C# doesn't support pointers to managed memory outside of an unsafe context. Any write that changes the pointer and the length is not atomic. That means a Span would be subject to out of range errors or other type safety violations were it not constrained to a single stack frame. In addition, putting a managed pointer on the GC heap typically crashes at JIT time.
This prevents the code from moving the value to the heap, which partly solves my original problem. I am not sure how returning a ref struct is constrained, though.
I have been working with Java and C# for a while now and I've been asking myself this many times but haven't ever found the answer I was looking for.
When I have to call an object method (this means it's not static), I have to use the call through instance of class, for example:
MyClass myInstance = new MyClass();
myInstance.nonStaticMethod();
I see this kind of code everywhere, so I was thinking if one-line call (example below) is behaving differently performance wise or it's just the sake of standards?
This is what I meant with one-line call:
new MyClass().nonStaticMethod();
The performance would probably be the same.
However, having calls such as new MyClass().nonStaticMethod(); usually reeks of a code smell - what state was encapsulated by the object that you only needed to invoke a method on it? (i.e. Why was that not a static method?)
EDIT: I do not intend to say it is always bad - in some cases, such idioms are encouraged (such as in the case of fluent builder objects) - but you will notice that in these cases, the resulting object is still significant in some way.
I would go with the first way i.e.
MyClass myInstance = new MyClass();
myInstance.nonStaticMethod();
The problem with the second one i.e. new MyClass().nonStaticMethod();
is in case you want to call another method from same object you don't have any choice The only thing you can do is new MyClass().nonStaticMethod1(); which actually creates a new object everytime. IMHO I don't think any one of them would performance better than the other. IN lack of performance gain I would definitely choose the one which is more clear and understandable hence my choice.
If you look at the byte code generated you can prove there is absolutely no different to the performance or anything else. Except you are using a local variable which should be discarded.
Unless you have measured that you have a performance problem, you should assume you don't and guessing where a performance problem might be is just that, no matter how much experience you have performance optimizing Java applications.
When faced with a question like this, you should first consider what is the simplest and clearest, because the optimizer looks for standard patterns and if you write confusing code, you will not only confuse yourself and others but the optimizer as well and it is more likely to be slower as a result.
Assuming you never need to access the object's instance again, there is no difference. Do whichever you prefer.
Of course if you want to do anything else with that object later on you'll need it in a variable.
Ok so the title may have been confusing so i have posted 2 code snippets to illustrate what i mean.
NOTE: allUsers is just a collection.
RegularUser regUser = new RegularUser(userName, password, name, emailAddress);
allUsers.Add(regUser);
VS
allUsers.Add(new RegularUser(userName, password, name, emailAddress));
Which snippet A or B is better and why?
What are the advantages or disadvantages?
The example i wrote was C# but does the language (C#, Java etc) make a difference?
As far as C# is concerned, both of your code examples are practically identical at the IL level. The second examples still creates a reference to the created object and pushes it onto the stack, you just don't have a local variable hooked up to it. This will not create any performance problems at all.
1) Which snippet A or B is better and why?
They're really identical. The compiled code will be nearly identical, since a temporary object is pushed onto the stack, then used in the method call.
2) What are the advantages or disadvantages?
The main advantages and disadvantages to the approach are really just readability.
Your first example has the advantage of keeping a single "operation" per line of code, which, in many ways, is more maintainable.
The second example removes the unnecessary variable declaration, which may be more maintainable.
Personally, I feel that the number of parameters in your RegularUser constructor would probably push me, in this instance, towards your first option. I typically find that, when a line of code gets to be more than about half a screen width on an average monitor, it's easier to read and understand if it's split up. Splitting this up by introducing the temporary and calling Add separately makes this more clear.
However, if you're just adding an integer or a class that's very small, I'd probably vote to skip the unnecessary variable. This is completely a personal preference, however - your milage may (and probably will) vary.
3) The example i wrote was C# but does the language (C#, Java etc) make a difference?
No, for the most part. This is really language/implementation dependent, but most languages will have the same basic behavior and performance in both cases. It is possible (and highly likely) that some languages may treat this differently, but most mainstream languages will not.
I really like to create them the first way unless I really really know what is going on. It is much harder to do debugging if you don't create the object first...
The compiler will just turn the 2nd version into the 1st for you, anyway, so there isn't a net negative effect.
Pros of #1:
easier to debug (!)
theoretically easier to read, clearer
can use the object later
Cons:
more verbose
can be unnecessary, especially for trivial objects
Result:
1 for anything complex to create, or that may need to be inspected easily at debug time
2 for lots of annoying little stuff, like the following.
var list = new List<NameValuePair>(3);
list.Add( new NameValuePair("name", "valuable");
list.add( new NameValuePair("age", "valuable");
list.add( new NameValuePair("height", "not valuable");
var dates = new List<date>();
dates.Add(DateTime.Now());
dates.Add(DateTime.Now().Date().AddYears(-2));
As far as I know there isn't a real difference between languages when it comes to this. Some may not allow it, though.
Both are equal in terms of performance.
In terms of maintainability the second case is a nightmare, it is (nearly) impossible to trace in a debugger. So I tend to prefer the first one. In my early oop days I was always writing the second, because "I knew that they were objects and I was sooo good at grasping objects that I ... blah blah blah", but that wore off with time and especially maintenance time
Also, suppose that someone wants you to
FilterClass.FilterUser(regUser)
or
Database.AddToDatabase(regUser)
because it is the right place to do so, the first scenario is better.
Finally, when do you stop?
allUsers.Add(new RegularUser(new ReadFromInput(new EscapedName(new Name(new String(userName)))), password, name, emailAddress));
[ This is a result of Best Practice: Should functions return null or an empty object? but I'm trying to be very general. ]
In a lot of legacy (um...production) C++ code that I've seen, there is a tendency to write a lot of NULL (or similar) checks to test pointers. Many of these get added near the end of a release cycle when adding a NULL-check provides a quick fix to a crash caused by the pointer dereference--and there isn't a lot of time to investigate.
To combat this, I started to write code that took a (const) reference parameter instead of the (much) more common technique of passing a pointer. No pointer, no desire to check for NULL (ignoring the corner case of actually having a null reference).
In C#, the same C++ "problem" is present: the desire to check every unknown reference against null (ArgumentNullException) and to quickly fix NullReferenceExceptions by adding a null check.
It seems to me, one way to prevent this is to avoid null objects in the first place by using empty objects (String.Empty, EventArgs.Empty) instead. Another would be to throw an exception rather than return null.
I'm just starting to learn F#, but it appears there are far fewer null objects in that enviroment. So maybe you don't really have to have a lot of null references floating around?
Am I barking up the wrong tree here?
Passing non-null just to avoid a NullReferenceException is trading a straightforward, easy-to-solve problem ("it blows up because it's null") for a much more subtle, hard-to-debug problem ("something several calls down the stack is not behaving as expected because much earlier it got some object which has no meaningful information but isn't null").
NullReferenceException is a wonderful thing! It fails hard, loud, fast, and it's almost always quick and easy to identify and fix. It's my favorite exception, because I know when I see it, my task is only going to take about 2 minutes. Contrast this with a confusing QA or customer report trying to describe strange behavior that has to be reproduced and traced back to the origin. Yuck.
It all comes down to what you, as a method or piece of code, can reasonably infer about the code which called you. If you are handed a null reference, and you can reasonably infer what the caller might have meant by null (maybe an empty collection, for example?) then you should definitely just deal with the nulls. However, if you can't reasonably infer what to do with a null, or what the caller means by null (for example, the calling code is telling you to open a file and gives the location as null), you should throw an ArgumentNullException.
Maintaining proper coding practices like this at every "gateway" point - logical bounds of functionality in your code—NullReferenceExceptions should be much more rare.
I tend to be dubious of code with lots of NULLs, and try to refactor them away where possible with exceptions, empty collections, Java Optionals, and so on.
The "Introduce Null Object" pattern in Martin Fowler's Refactoring (page 260) may also be helpful. A Null Object responds to all the methods a real object would, but in a way that "does the right thing". So rather than always check an Order to see if order.getDiscountPolicy() is NULL, make sure the Order has a NullDiscountPolicy in these cases. This streamlines the control logic.
Null gets my vote. Then again, I'm of the 'fail-fast' mindset.
String.IsNullOrEmpty(...) is very helpful too, I guess it catches either situation: null or empty strings. You could write a similar function for all your classes you're passing around.
If you are writing code that returns null as an error condition, then don't: generally, you should throw an exception instead - far harder to miss.
If you are consuming code that you fear may return null, then mostly these are boneheaded exceptions: perhaps do some Debug.Assert checks at the caller to sense-check the output during development. You shouldn't really need vast numbers of null checks in you production, but if some 3rd party library returns lots of nulls unpredictably, then sure: do the checks.
In 4.0, you might want to look at code-contracts; this gives you much better control to say "this argument should never be passed in as null", "this function never returns null", etc - and have the system validate those claims during static analysis (i.e. when you build).
The thing about null is that it doesn't come with meaning. It is merely the absence of an object.
So, if you really mean an empty string/collection/whatever, always return the relevant object and never null. If the language in question allows you to specify that, do so.
In the case where you want to return something that means not a value specifiable with the static type, then you have a number of options. Returning null is one answer, but without a meaning is a little dangerous. Throwing an exception may actually be what you mean. You might want to extend the type with special cases (probably with polymorphism, that is to say the Special Case Pattern (a special case of which is the Null Object Pattern)). You might want to wrap the return value in an type with more meaning. Or you might want to pass in a callback object. There usually are many choices.
I'd say it depends. For a method returning a single object, I'd generally return null. For a method returning a collection, I'd generally return an empty collection (non-null). These are more along the lines of guidelines than rules, though.
If you are serious about wanting to program in a "nullless" environment, consider using extension methods more often, they are immune to NullReferenceExceptions and at least "pretend" that null isn't there anymore:
public static GetExtension(this string s)
{
return (new FileInfo(s ?? "")).Extension;
}
which can be called as:
// this code will never throw, not even when somePath becomes null
string somePath = GetDataFromElseWhereCanBeNull();
textBoxExtension.Text = somePath.GetExtension();
I know, this is only convenience and many people correctly consider it violation of OO principles (though the "founder" of OO, Bertrand Meyer, considers null evil and completely banished it from his OO design, which is applied to the Eiffel language, but that's another story). EDIT: Dan mentions that Bill Wagner (More Effective C#) considers it bad practice and he's right. Ever considered the IsNull extension method ;-) ?
To make your code more readable, another hint may be in place: use the null-coalescing operator more often to designate a default when an object is null:
// load settings
WriteSettings(currentUser.Settings ?? new Settings());
// example of some readonly property
public string DisplayName
{
get
{
return (currentUser ?? User.Guest).DisplayName
}
}
None of these take the occasional check for null away (and ?? is nothing more then a hidden if-branch). I prefer as little null in my code as possible, simply because I believe it makes the code more readable. When my code gets cluttered with if-statements for null, I know there's something wrong in the design and I refactor. I suggest anybody to do the same, but I know that opinions vary wildly on the matter.
(Update) Comparison with exceptions
Not mentioned in the discussion so far is the similarity with exception handling. When you find yourself ubiquitously ignoring null whenever you consider it's in your way, it is basically the same as writing:
try
{
//...code here...
}
catch (Exception) {}
which has the effect of removing any trace of the exceptions only to find it raises unrelated exceptions much later in the code. Though I consider it good to avoid using null, as mentioned before in this thread, having null for exceptional cases is good. Just don't hide them in null-ignore-blocks, it will end up having the same effect as the catch-all-exceptions blocks.
For the exception protagonists they usually stem from transactional programming and strong exception safety guarantees or blind guidelines. In any decent complexity, ie. async workflow, I/O and especially networking code they are simply inappropriate. The reason why you see Google style docs on the matter in C++, as well as all good async code 'not enforcing it' (think your favourite managed pools as well).
There is more to it and while it might look like a simplification, it really is that simple. For one you will get a lot of exceptions in something that wasn't designed for heavy exception use.. anyway I digress, read upon on this from the world's top library designers, the usual place is boost (just don't mix it up with the other camp in boost that loves exceptions, because they had to write music software :-).
In your instance, and this is not Fowler's expertise, an efficient 'empty object' idiom is only possible in C++ due to available casting mechanism (perhaps but certainly not always by means of dominance ).. On ther other hand, in your null type you are capable throwing exceptions and doing whatever you want while preserving the clean call site and structure of code.
In C# your choice can be a single instance of a type that is either good or malformed; as such it is capable of throwing acceptions or simply running as is. So it might or might not violate other contracts ( up to you what you think is better depending on the quality of code you're facing ).
In the end, it does clean up call sites, but don't forget you will face a clash with many libraries (and especially returns from containers/Dictionaries, end iterators spring to mind, and any other 'interfacing' code to the outside world ). Plus null-as-value checks are extremely optimised pieces of machine code, something to keep in mind but I will agree any day wild pointer usage without understanding constness, references and more is going to lead to different kind of mutability, aliasing and perf problems.
To add, there is no silver bullet, and crashing on null reference or using a null reference in managed space, or throwing and not handling an exception is an identical problem, despite what managed and exception world will try to sell you. Any decent environment offers a protection from those (heck you can install any filter on any OS you want, what else do you think VMs do), and there are so many other attack vectors that this one has been overhammered to pieces. Enter x86 verification from Google yet again, their own way of doing much faster and better 'IL', 'dynamic' friendly code etc..
Go with your instinct on this, weight the pros and cons and localise the effects.. in the future your compiler will optimise all that checking anyway, and far more efficiently than any runtime or compile-time human method (but not as easily for cross-module interaction).
I try to avoid returning null from a method wherever possible. There are generally two kinds of situations - when null result would be legal, and when it should never happen.
In the first case, when no result is legal, there are several solutions available to avoid null results and null checks that are associated with them: Null Object pattern and Special Case pattern are there to return substitute objects that do nothing, or do some specific thing under specific circumstances.
If it is legal to return no object, but still there are no suitable substitutes in terms of Null Object or Special Case, then I typically use the Option functional type - I can then return an empty option when there is no legal result. It is then up to the client to see what is the best way to deal with empty option.
Finally, if it is not legal to have any object returned from a method, simply because the method cannot produce its result if something is missing, then I choose to throw an exception and cut further execution.
How are empty objects better than null objects? You're just renaming the symptom. The problem is that the contracts for your functions are too loosely defined "this function might return something useful, or it might return a dummy value" (where the dummy value might be null, an "empty object", or a magic constant like -1.) But no matter how you express this dummy value, callers still have to check for it before they use the return value.
If you want to clean up your code, the solution should be to narrow down the function so that it doesn't return a dummy value in the first place.
If you have a function which might return a value, or might return nothing, then pointers are a common (and valid) way to express this. But often, your code can be refactored so that this uncertainty is removed. If you can guarantee that a function returns something meaningful, then callers can rely on it returning something meaningful, and then they don't have to check the return value.
You can't always return an empty object, because 'empty' is not always defined. For example what does it mean for an int, float or bool to be empty?
Returning a NULL pointer is not necessarily a bad practice, but I think it's a better practice to return a (const) reference (where it makes sense to do so of course).
And recently I've often used a Fallible class:
Fallible<std::string> theName = obj.getName();
if (theName)
{
// ...
}
There are various implementations available for such a class (check Google Code Search), I also created my own.
Short Version
For those who don't have the time to read my reasoning for this question below:
Is there any way to enforce a policy of "new objects only" or "existing objects only" for a method's parameters?
Long Version
There are plenty of methods which take objects as parameters, and it doesn't matter whether the method has the object "all to itself" or not. For instance:
var people = new List<Person>();
Person bob = new Person("Bob");
people.Add(bob);
people.Add(new Person("Larry"));
Here the List<Person>.Add method has taken an "existing" Person (Bob) as well as a "new" Person (Larry), and the list contains both items. Bob can be accessed as either bob or people[0]. Larry can be accessed as people[1] and, if desired, cached and accessed as larry (or whatever) thereafter.
OK, fine. But sometimes a method really shouldn't be passed a new object. Take, for example, Array.Sort<T>. The following doesn't make a whole lot of sense:
Array.Sort<int>(new int[] {5, 6, 3, 7, 2, 1});
All the above code does is take a new array, sort it, and then forget it (as its reference count reaches zero after Array.Sort<int> exits and the sorted array will therefore be garbage collected, if I'm not mistaken). So Array.Sort<T> expects an "existing" array as its argument.
There are conceivably other methods which may expect "new" objects (though I would generally think that to have such an expectation would be a design mistake). An imperfect example would be this:
DataTable firstTable = myDataSet.Tables["FirstTable"];
DataTable secondTable = myDataSet.Tables["SecondTable"];
firstTable.Rows.Add(secondTable.Rows[0]);
As I said, this isn't a great example, since DataRowCollection.Add doesn't actually expect a new DataRow, exactly; but it does expect a DataRow that doesn't already belong to a DataTable. So the last line in the code above won't work; it needs to be:
firstTable.ImportRow(secondTable.Rows[0]);
Anyway, this is a lot of setup for my question, which is: is there any way to enforce a policy of "new objects only" or "existing objects only" for a method's parameters, either in its definition (perhaps by some custom attributes I'm not aware of) or within the method itself (perhaps by reflection, though I'd probably shy away from this even if it were available)?
If not, any interesting ideas as to how to possibly accomplish this would be more than welcome. For instance I suppose if there were some way to get the GC's reference count for a given object, you could tell right away at the start of a method whether you've received a new object or not (assuming you're dealing with reference types, of course--which is the only scenario to which this question is relevant anyway).
EDIT:
The longer version gets longer.
All right, suppose I have some method that I want to optionally accept a TextWriter to output its progress or what-have-you:
static void TryDoSomething(TextWriter output) {
// do something...
if (output != null)
output.WriteLine("Did something...");
// do something else...
if (output != null)
output.WriteLine("Did something else...");
// etc. etc.
if (output != null)
// do I call output.Close() or not?
}
static void TryDoSomething() {
TryDoSomething(null);
}
Now, let's consider two different ways I could call this method:
string path = GetFilePath();
using (StreamWriter writer = new StreamWriter(path)) {
TryDoSomething(writer);
// do more things with writer
}
OR:
TryDoSomething(new StreamWriter(path));
Hmm... it would seem that this poses a problem, doesn't it? I've constructed a StreamWriter, which implements IDisposable, but TryDoSomething isn't going to presume to know whether it has exclusive access to its output argument or not. So the object either gets disposed prematurely (in the first case), or doesn't get disposed at all (in the second case).
I'm not saying this would be a great design, necessarily. Perhaps Josh Stodola is right and this is just a bad idea from the start. Anyway, I asked the question mainly because I was just curious if such a thing were possible. Looks like the answer is: not really.
No, basically.
There's really no difference between:
var x = new ...;
Foo(x);
and
Foo(new ...);
and indeed sometimes you might convert between the two for debugging purposes.
Note that in the DataRow/DataTable example, there's an alternative approach though - that DataRow can know its parent as part of its state. That's not the same thing as being "new" or not - you could have a "detach" operation for example. Defining conditions in terms of the genuine hard-and-fast state of the object makes a lot more sense than woolly terms such as "new".
Yes, there is a way to do this.
Sort of.
If you make your parameter a ref parameter, you'll have to have an existing variable as your argument. You can't do something like this:
DoSomething(ref new Customer());
If you do, you'll get the error "A ref or out argument must be an assignable variable."
Of course, using ref has other implications. However, if you're the one writing the method, you don't need to worry about them. As long as you don't reassign the ref parameter inside the method, it won't make any difference whether you use ref or not.
I'm not saying it's good style, necessarily. You shouldn't use ref or out unless you really, really need to and have no other way to do what you're doing. But using ref will make what you want to do work.
No. And if there is some reason that you need to do this, your code has improper architecture.
Short answer - no there isn't
In the vast majority of cases I usually find that the issues that you've listed above don't really matter all that much. When they do you could overload a method so that you can accept something else as a parameter instead of the object you are worried about sharing.
// For example create a method that allows you to do this:
people.Add("Larry");
// Instead of this:
people.Add(new Person("Larry"));
// The new method might look a little like this:
public void Add(string name)
{
Person person = new Person(name);
this.add(person); // This method could be private if neccessary
}
I can think of a way to do this, but I would definitely not recommend this. Just for argument's sake.
What does it mean for an object to be a "new" object? It means there is only one reference keeping it alive. An "existing" object would have more than one reference to it.
With this in mind, look at the following code:
class Program
{
static void Main(string[] args)
{
object o = new object();
Console.WriteLine(IsExistingObject(o));
Console.WriteLine(IsExistingObject(new object()));
o.ToString(); // Just something to simulate further usage of o. If we didn't do this, in a release build, o would be collected by the GC.Collect call in IsExistingObject. (not in a Debug build)
}
public static bool IsExistingObject(object o)
{
var oRef = new WeakReference(o);
#if DEBUG
o = null; // In Debug, we need to set o to null. This is not necessary in a release build.
#endif
GC.Collect();
GC.WaitForPendingFinalizers();
return oRef.IsAlive;
}
}
This prints True on the first line, False on the second.
But again, please do not use this in your code.
Let me rewrite your question to something shorter.
Is there any way, in my method, which takes an object as an argument, to know if this object will ever be used outside of my method?
And the short answer to that is: No.
Let me venture an opinion at this point: There should not be any such mechanism either.
This would complicate method calls all over the place.
If there was a method where I could, in a method call, tell if the object I'm given would really be used or not, then it's a signal to me, as a developer of that method, to take that into account.
Basically, you'd see this type of code all over the place (hypothetical, since it isn't available/supported:)
if (ReferenceCount(obj) == 1) return; // only reference is the one we have
My opinion is this: If the code that calls your method isn't going to use the object for anything, and there are no side-effects outside of modifying the object, then that code should not exist to begin with.
It's like code that looks like this:
1 + 2;
What does this code do? Well, depending on the C/C++ compiler, it might compile into something that evaluates 1+2. But then what, where is the result stored? Do you use it for anything? No? Then why is that code part of your source code to begin with?
Of course, you could argue that the code is actually a+b;, and the purpose is to ensure that the evaluation of a+b isn't going to throw an exception denoting overflow, but such a case is so diminishingly rare that a special case for it would just mask real problems, and it would be really simple to fix by just assigning it to a temporary variable.
In any case, for any feature in any programming language and/or runtime and/or environment, where a feature isn't available, the reasons for why it isn't available are:
It wasn't designed properly
It wasn't specified properly
It wasn't implemented properly
It wasn't tested properly
It wasn't documented properly
It wasn't prioritized above competing features
All of these are required to get a feature to appear in version X of application Y, be it C# 4.0 or MS Works 7.0.
Nope, there's no way of knowing.
All that gets passed in is the object reference. Whether it is 'newed' in-situ, or is sourced from an array, the method in question has no way of knowing how the parameters being passed in have been instantiated and/or where.
One way to know if an object passed to a function (or a method) has been created right before the call to the function/method is that the object has a property that is initialized with the timestamp passed from a system function; in that way, looking at that property, it would be possible to resolve the problem.
Frankly, I would not use such method because
I don't see any reason why the code should now if the passed parameter is an object right created, or if it has been created in a different moment.
The method I suggest depends from a system function that in some systems could not be present, or that could be less reliable.
With the modern CPUs, which are a way faster than the CPUs used 10 years ago, there could be the problem to use the right value for the threshold value to decide when an object has been freshly created, or not.
The other solution would be to use an object property that is set to a a value from the object creator, and that is set to a different value from all the methods of the object.
In this case the problem would be to forget to add the code to change that property in each method.
Once again I would ask to myself "Is there a really need to do this?".
As a possible partial solution if you only wanted one of an object to be consumed by a method maybe you could look at a Singleton. In this way the method in question could not create another instance if it existed already.