Why is 'using' improving C# performances - c#

It seems that in most cases the C# compiler could call Dispose() automatically. Like most cases of the using pattern look like:
public void SomeMethod()
{
...
using (var foo = new Foo())
{
...
}
// Foo isn't use after here (obviously).
...
}
Since foo isn't used (that's a very simple detection) and since its not provided as argument to another method (that's a supposition that applies to many use cases and can be extended), the compiler could automatically and immediately call Dispose() without the developper requiring to do it.
This means that in most cases the using is pretty useless if the compiler does some smart job. IDisposable seem low level enough to me to be taken in account by a compiler.
Now why isn't this done? Wouldn't that improve the performances (if the developpers are... dirty).

A couple of points:
Calling Dispose does not increase performance. IDisposable is designed for scenarios where you are using limited and/or unmanaged resources that cannot be accounted for by the runtime.
There is no clear and obvious mechanism as to how the compiler could treat IDisposable objects in the code. What makes it a candidate for being disposed of automatically and what doesn't? If the instance is (or could) be exposed outside of the method? There's nothing to say that just because I pass an object to another function or class that I want it to be usable beyond the scope of the method
Consider, for example, a factory patter that takes a Stream and deserializes an instance of a class.
public class Foo
{
public static Foo FromStream(System.IO.Stream stream) { ... }
}
And I call it:
Stream stream = new FileStream(path);
Foo foo = Foo.FromStream(stream);
Now, I may or may not want that Stream to be disposed of when the method exits. If Foo's factory reads all of the necessary data from the Stream and no longer needs it, then I would want it to be disposed of. If the Foo object has to hold on to the stream and use it over its lifetime, then I wouldn't want it to be disposed of.
Likewise, what about instances that are retrieved from something other than a constructor, like Control.CreateGraphics(). These instances could exist outside of the code, so the compiler wouldn't dispose of them automatically.
Giving the user control (and providing an idiom like the using block) makes the user's intention clear and makes it much easier to spot places where IDisposable instances are not being properly disposed of. If the compiler were to automatically dispose of some instances, then debugging would be that much more difficult as the developer had to decipher how the automatic disposal rules applied to each and every block of code that used an IDisposable object.
In the end, there are two reasons (by convention) for implementing IDisposable on a type.
You are using an unmanaged resource (meaning you're making a P/Invoke call that returns something like a handle that must be released by a different P/Invoke call)
Your type has instances of IDisposable that should be disposed of when this object's lifetime is over.
In the first case, all such types are supposed to implement a finalizer that calls Dispose and releases all unmanaged resources if the developer fails to do so (this is to prevent memory and handle leaks).

Garbage Collection (while not directly related to IDisposable, is what cleans up unused objects) isn't that simple.
Let me re-word this a little bit. Automatically calling Dispose() isn't that simple. It also won't directly increase performance. More on that a little later.
If you had the following code:
public void DoSomeWork(SqlCommand command)
{
SqlConnection conn = new SqlConnection(connString);
conn.Open();
command.Connection = conn;
// Rest of the work here
}
How would the compiler know when you were done using the conn object? Or if you passed a reference to some other method that was holding on to it?
Explicitly calling Dispose() or using a using block clearly states your intent and forces things to get cleaned up properly.
Now, back to performance. Simply calling Dispose() on an Object doesn't guarantee any performance increase. The Dispose() method is used for "cleaning up" resources when you're done with an Object.
The performance increase can come when using un-managed resources. If a managed object doesn't properly dispose of its un-managed resources, then you have a memory leak. Ugly stuff.
Leaving the determination to call Dispose() up to the compiler would take away that level of clarity and make debugging memory leaks caused by un-managed resources that much more difficult.

You're asking the compiler to perform a semantic analysis of your code. The fact that something isn't explicitly referenced after a certain point in the source does not mean that it isn't being used. If I create a chain of references and pass one out to a method, which may or may not store that reference in a property or some other persistent container, should I really expect the compiler to trace through all of that and figure out what I really meant?
Volatile entities may also be a concern.
Besides, using() {....} is more readable and intuitive, which is worth a lot in terms of maintainability.
As engineers or programmers, we strive to be efficient, but that is rarely the same thing as lazy.

Look at the MSDN Artilce for the C# Using Statement The using statement is just a short cut to keep from doing a try and finally in allot of places. Calling the dispose is not a low level functionality like Garbage Collection.
As you can see using is translated into.
{
Font font1 = new Font("Arial", 10.0f);
try
{
byte charset = font1.GdiCharSet;
}
finally
{
if (font1 != null)
((IDisposable)font1).Dispose();
}
}
How would the compiler know where to put the finally block? Does it call it on Garbage Collection?
Garabage Collection doesn't happen as soon as you leave a method. Read this article on Garbage Collection to understand it better. Only after there are no references to the object. A resource could be tied up for much longer than needed.
The thought that keeps popping into my head is that the compiler should not protect developers who do not clean up there resources. Just because a language is managed doesn't mean that it is going to protect from yourself.

C++ supports this; they call it "stack semantics for reference types". I support adding this to C#, but it will require different syntax (changing the semantics based on whether or not a local variable is passed to another method isn't a good idea).

I think that you are thinking about finalizers. Finalizers use the destructor syntax in c#, and they are called automatically by the garbage collector. Finalizers are only appropriate to use when you are cleaning up unmanaged resources.
Dispose is intended to allow for early cleanup of unmanaged resources (and it can be used to clean managed resources as well).
Detection is actually trickier than it looks. What if you have code like this:
var mydisposable = new...
AMethod(mydisposable);
// (not used again)
It's possible that some code in AMethod holds on to a reference to myDisposable.
Maybe it gets assigned to an instance variable inside of that method
Maybe myDisposable subscribes to an event inside of AMethod (then the event publisher holds a reference to myDisposable)
Maybe another thread is spawned by AMethod
Maybe mydisposable becomes "enclosed" by an anonymous method or lamba expression inside of AMethod.
All of those things make it difficult to know for absolute certain that your object is no longer in use, so Dispose is there to let a developer say "ok, I know that it's safe to run my cleanup code now);
Bear in mind also that dispose doesn't deallocate your object -- only the GC can do that. (The GC does have the magic to understand all of the scenarios that I described, and it knows when to clean up your object, and if you really need code to run when the GC detects no references, you can use a finalizer). Be careful with finalizers, though -- they are only for unmanaged allocations that your class owns.
You can read more about this stuff here:
http://msdn.microsoft.com/en-us/magazine/bb985010.aspx
and here: http://www.bluebytesoftware.com/blog/2005/04/08/DGUpdateDisposeFinalizationAndResourceManagement.aspx
If you need unmanaged handle cleanup, read about SafeHandles as well.

It's not the responsibility of the compiler to interpret the scopes in your application and do things like figure out when you no longer need memory. In fact, I'm pretty sure that's an impossible problem to solve, because there's no way for the compiler to know what your program will look like at runtime, no matter how smart it is.
This is why we have the garbage collection. The problem with garbage collection is that it runs on an indeterminate interval, and typically if an object implements IDisposable, the reason is because you want the ability to dispose of it immediately. Like, right now immediately. Constructs such as database connections aren't just disposable because they have some special work to do when they get trashed - it's also because they are scarce.

I seems difficult for the G.C. to know that you won't be using this variable anymore later in the same method. Obviously, if you leave the method, and don't keep a further reference to you variable, the G.C. will dispose it. But using using in you sample, tells the G.C. that you are sure that you will not be using this variable anymore after.

The using statement has nothing to do with performance (unless you consider avoiding resource/memory leaks as performance).
All it does for you is guarantee that the IDisposable.Dispose method is called on the object in question when it goes out of scope, even if an exception has occurred inside the using block.
The Dispose() method is then responsible for releasing any resources used by the object. These are most often unmanaged resources such as files, fonts, images etc, but could also be simple "clean-up" activities on managed objects (not garbage collection however).
Of course if the Dispose() method is implemented badly, the using statement provides zero benefit.

I think the OP is saying "why bother with 'using' when the compiler should be able to work it out magically pretty easily".
I think the OP is saying that
public void SomeMethod()
{
...
var foo = new Foo();
... do stuff with Foo ...
// Foo isn't use after here (obviously).
...
}
should be equivalent to
public void SomeMethod()
{
...
using (var foo = new Foo())
{
... do stuff with Foo ...
}
// Foo isn't use after here (obviously).
...
}
because Foo isn't used again.
The answer of course is that the compiler cannot work it out pretty easily. Garbage Collection (what magically calls "Dispose()" in .NET) is a very complicated field. Just because the symbol isn't being used below that doesn't mean that the variable isn't being used.
Take this example:
public void SomeMethod()
{
...
var foo = new Foo();
foo.DoStuffWith(someRandomObject);
someOtherClass.Method(foo);
// Foo isn't use after here (obviously).
// Or is it??
...
}
In this example, someRandomObject and someOtherClass might both have references to what Foo points out, so if we called Foo.Dispose() it would break them. You say you're just imagining the simple case, but the only 'simple case' where what you're proposing works is the case where you make no method calls from Foo and do not pass Foo or any of its members to anything else - effectively when you don't even use Foo at all in which case you probably have no need to declare it. Even then, you can never be sure that some kind of reflection or event hackery didn't get a reference to Foo just by its very creation, or that Foo didn't hook itself up with something else during its constructor.

In addition to the fine reasons listed above, since the problem can't be solved reliably for all cases, those "easy" cases are something that code analysis tools can and do detect. Let the compiler do stuff deterministically, and let your automatic code analysis tools tell you when you're doing something silly like forgetting to call Dispose.

Related

Do I have to set DataTable to null after using it? [duplicate]

Should you set all the objects to null (Nothing in VB.NET) once you have finished with them?
I understand that in .NET it is essential to dispose of any instances of objects that implement the IDisposable interface to release some resources although the object can still be something after it is disposed (hence the isDisposed property in forms), so I assume it can still reside in memory or at least in part?
I also know that when an object goes out of scope it is then marked for collection ready for the next pass of the garbage collector (although this may take time).
So with this in mind will setting it to null speed up the system releasing the memory as it does not have to work out that it is no longer in scope and are they any bad side effects?
MSDN articles never do this in examples and currently I do this as I cannot
see the harm. However I have come across a mixture of opinions so any comments are useful.
Karl is absolutely correct, there is no need to set objects to null after use. If an object implements IDisposable, just make sure you call IDisposable.Dispose() when you're done with that object (wrapped in a try..finally, or, a using() block). But even if you don't remember to call Dispose(), the finaliser method on the object should be calling Dispose() for you.
I thought this was a good treatment:
Digging into IDisposable
and this
Understanding IDisposable
There isn't any point in trying to second guess the GC and its management strategies because it's self tuning and opaque. There was a good discussion about the inner workings with Jeffrey Richter on Dot Net Rocks here: Jeffrey Richter on the Windows Memory Model and
Richters book CLR via C# chapter 20 has a great treatment:
Another reason to avoid setting objects to null when you are done with them is that it can actually keep them alive for longer.
e.g.
void foo()
{
var someType = new SomeType();
someType.DoSomething();
// someType is now eligible for garbage collection
// ... rest of method not using 'someType' ...
}
will allow the object referred by someType to be GC'd after the call to "DoSomething" but
void foo()
{
var someType = new SomeType();
someType.DoSomething();
// someType is NOT eligible for garbage collection yet
// because that variable is used at the end of the method
// ... rest of method not using 'someType' ...
someType = null;
}
may sometimes keep the object alive until the end of the method. The JIT will usually optimized away the assignment to null, so both bits of code end up being the same.
No don't null objects. You can check out https://web.archive.org/web/20160325050833/http://codebetter.com/karlseguin/2008/04/28/foundations-of-programming-pt-7-back-to-basics-memory/ for more information, but setting things to null won't do anything, except dirty your code.
Also:
using(SomeObject object = new SomeObject())
{
// do stuff with the object
}
// the object will be disposed of
In general, there's no need to null objects after use, but in some cases I find it's a good practice.
If an object implements IDisposable and is stored in a field, I think it's good to null it, just to avoid using the disposed object. The bugs of the following sort can be painful:
this.myField.Dispose();
// ... at some later time
this.myField.DoSomething();
It's good to null the field after disposing it, and get a NullPtrEx right at the line where the field is used again. Otherwise, you might run into some cryptic bug down the line (depending on exactly what DoSomething does).
Chances are that your code is not structured tightly enough if you feel the need to null variables.
There are a number of ways to limit the scope of a variable:
As mentioned by Steve Tranby
using(SomeObject object = new SomeObject())
{
// do stuff with the object
}
// the object will be disposed of
Similarly, you can simply use curly brackets:
{
// Declare the variable and use it
SomeObject object = new SomeObject()
}
// The variable is no longer available
I find that using curly brackets without any "heading" to really clean out the code and help make it more understandable.
In general no need to set to null. But suppose you have a Reset functionality in your class.
Then you might do, because you do not want to call dispose twice, since some of the Dispose may not be implemented correctly and throw System.ObjectDisposed exception.
private void Reset()
{
if(_dataset != null)
{
_dataset.Dispose();
_dataset = null;
}
//..More such member variables like oracle connection etc. _oraConnection
}
The only time you should set a variable to null is when the variable does not go out of scope and you no longer need the data associated with it. Otherwise there is no need.
this kind of "there is no need to set objects to null after use" is not entirely accurate. There are times you need to NULL the variable after disposing it.
Yes, you should ALWAYS call .Dispose() or .Close() on anything that has it when you are done. Be it file handles, database connections or disposable objects.
Separate from that is the very practical pattern of LazyLoad.
Say I have and instantiated ObjA of class A. Class A has a public property called PropB of class B.
Internally, PropB uses the private variable of _B and defaults to null. When PropB.Get() is used, it checks to see if _PropB is null and if it is, opens the resources needed to instantiate a B into _PropB. It then returns _PropB.
To my experience, this is a really useful trick.
Where the need to null comes in is if you reset or change A in some way that the contents of _PropB were the child of the previous values of A, you will need to Dispose AND null out _PropB so LazyLoad can reset to fetch the right value IF the code requires it.
If you only do _PropB.Dispose() and shortly after expect the null check for LazyLoad to succeed, it won't be null, and you'll be looking at stale data. In effect, you must null it after Dispose() just to be sure.
I sure wish it were otherwise, but I've got code right now exhibiting this behavior after a Dispose() on a _PropB and outside of the calling function that did the Dispose (and thus almost out of scope), the private prop still isn't null, and the stale data is still there.
Eventually, the disposed property will null out, but that's been non-deterministic from my perspective.
The core reason, as dbkk alludes is that the parent container (ObjA with PropB) is keeping the instance of _PropB in scope, despite the Dispose().
Stephen Cleary explains very well in this post: Should I Set Variables to Null to Assist Garbage Collection?
Says:
The Short Answer, for the Impatient
Yes, if the variable is a static field, or if you are writing an enumerable method (using yield return) or an asynchronous method (using async and await). Otherwise, no.
This means that in regular methods (non-enumerable and non-asynchronous), you do not set local variables, method parameters, or instance fields to null.
(Even if you’re implementing IDisposable.Dispose, you still should not set variables to null).
The important thing that we should consider is Static Fields.
Static fields are always root objects, so they are always considered “alive” by the garbage collector. If a static field references an object that is no longer needed, it should be set to null so that the garbage collector will treat it as eligible for collection.
Setting static fields to null is meaningless if the entire process is shutting down. The entire heap is about to be garbage collected at that point, including all the root objects.
Conclusion:
Static fields; that’s about it. Anything else is a waste of time.
There are some cases where it makes sense to null references. For instance, when you're writing a collection--like a priority queue--and by your contract, you shouldn't be keeping those objects alive for the client after the client has removed them from the queue.
But this sort of thing only matters in long lived collections. If the queue's not going to survive the end of the function it was created in, then it matters a whole lot less.
On a whole, you really shouldn't bother. Let the compiler and GC do their jobs so you can do yours.
Take a look at this article as well: http://www.codeproject.com/KB/cs/idisposable.aspx
For the most part, setting an object to null has no effect. The only time you should be sure to do so is if you are working with a "large object", which is one larger than 84K in size (such as bitmaps).
I believe by design of the GC implementors, you can't speed up GC with nullification. I'm sure they'd prefer you not worry yourself with how/when GC runs -- treat it like this ubiquitous Being protecting and watching over and out for you...(bows head down, raises fist to the sky)...
Personally, I often explicitly set variables to null when I'm done with them as a form of self documentation. I don't declare, use, then set to null later -- I null immediately after they're no longer needed. I'm saying, explicitly, "I'm officially done with you...be gone..."
Is nullifying necessary in a GC'd language? No. Is it helpful for the GC? Maybe yes, maybe no, don't know for certain, by design I really can't control it, and regardless of today's answer with this version or that, future GC implementations could change the answer beyond my control. Plus if/when nulling is optimized out it's little more than a fancy comment if you will.
I figure if it makes my intent clearer to the next poor fool who follows in my footsteps, and if it "might" potentially help GC sometimes, then it's worth it to me. Mostly it makes me feel tidy and clear, and Mongo likes to feel tidy and clear. :)
I look at it like this: Programming languages exist to let people give other people an idea of intent and a compiler a job request of what to do -- the compiler converts that request into a different language (sometimes several) for a CPU -- the CPU(s) could give a hoot what language you used, your tab settings, comments, stylistic emphases, variable names, etc. -- a CPU's all about the bit stream that tells it what registers and opcodes and memory locations to twiddle. Many things written in code don't convert into what's consumed by the CPU in the sequence we specified. Our C, C++, C#, Lisp, Babel, assembler or whatever is theory rather than reality, written as a statement of work. What you see is not what you get, yes, even in assembler language.
I do understand the mindset of "unnecessary things" (like blank lines) "are nothing but noise and clutter up code." That was me earlier in my career; I totally get that. At this juncture I lean toward that which makes code clearer. It's not like I'm adding even 50 lines of "noise" to my programs -- it's a few lines here or there.
There are exceptions to any rule. In scenarios with volatile memory, static memory, race conditions, singletons, usage of "stale" data and all that kind of rot, that's different: you NEED to manage your own memory, locking and nullifying as apropos because the memory is not part of the GC'd Universe -- hopefully everyone understands that. The rest of the time with GC'd languages it's a matter of style rather than necessity or a guaranteed performance boost.
At the end of the day make sure you understand what is eligible for GC and what's not; lock, dispose, and nullify appropriately; wax on, wax off; breathe in, breathe out; and for everything else I say: If it feels good, do it. Your mileage may vary...as it should...
I think setting something back to null is messy. Imagine a scenario where the item being set to now is exposed say via property. Now is somehow some piece of code accidentally uses this property after the item is disposed you will get a null reference exception which requires some investigation to figure out exactly what is going on.
I believe framework disposables will allows throw ObjectDisposedException which is more meaningful. Not setting these back to null would be better then for that reason.
Some object suppose the .dispose() method which forces the resource to be removed from memory.

Empty "using" statement in Dispose

Recently I've seen some code written as follows:
public void Dipose()
{
using(_myDisposableField) { }
}
This seems pretty strange to me, I'd prefer to see myDisposableField.Dispose();
What reasons are there for using "using" to dispose your objects over explicitly making the call?
No, none at all. It will just compile into an empty try/finally and end up calling Dispose.
Remove it. You'll make the code faster, more readable, and perhaps most importantly (as you continue reading below) more expressive in its intent.
Update: they were being slightly clever, equivalent code needs a null check and as per Jon Skeet's advice, also take a local copy if multi-threading is involved (in the same manner as the standard event invocation pattern to avoid a race between the null check and method call).
IDisposable tmp = _myDisposableField;
if (tmp != null)
tmp.Dispose();
From what I can see in the IL of a sample app I've written, it looks like you also need to treat _myDisposableField as IDisposable directly. This will be important if any type implements the IDisposable interface explicitly and also provides a public void Dispose() method at the same time.
This code also doesn't attempt to replicate the try-finally that exists when using using, but it is sort of assumed that this is deemed unnecessary. As Michael Graczyk points out in the comments, however, the use of the finally offers protection against exceptions, in particular the ThreadAbortException (which could occur at any point). That said, the window for this to actually happen in is very small.
Although, I'd stake a fiver on the fact they did this not truly understanding what subtle "benefits" it gave them.
There is a very subtle but evil bug in the example you posted.
While it "compiles" down to:
try {}
finally
{
if (_myDisposableField != null)
((IDisposable)_myDisposableField).Dispose();
}
objects should be instantiated within the using clause and not outside:
You can instantiate the resource object and then pass the variable to the using statement, but this isn't a best practice. In this case, after control leaves the using block, the object remains in scope but probably has no access to its unmanaged resources. In other words, it's not fully initialized anymore. If you try to use the object outside the using block, you risk causing an exception to be thrown. For this reason, it's better to instantiate the object in the using statement and limit its scope to the using block.
—using statement (C# Reference)
In other words, it's dirty and hacky.
The clean version is extremely clearly spelled out on MSDN:
if you can limit the use of an instance to a method, then use a using block with the constructor call on its border. Do not use Dispose directly.
if you need (but really need) to keep an instance alive until the parent is disposed, then dispose explicitly using the Disposable pattern and nothing else. There are different ways of implementing a dispose cascade, however they need to be all done similarly to avoid very subtle and hard to catch bugs. There's a very good resource on MSDN in the Framework Design Guidelines.
Finally, please note the following you should only use the IDisposable pattern if you use unmanaged resources. Make sure it's really needed :-)
As already discussed in this answer, this is a cheeky way of avoiding a null test, but: there can be more to it than that. In modern C#, in many cases you could achieve similar with a null-conditional operator:
public void Dipose()
=> _myDisposableField?.Dispose();
However, it is not required that the type (of _myDisposableField) has Dispose() on the public API; it could be:
public class Foo : IDisposable {
void IDisposable.Dispose() {...}
}
or even worse:
public class Bar : IDisposable {
void IDisposable.Dispose() {...}
public void Dispose() {...} // some completely different meaning! DO NOT DO THIS!
}
In the first case, Dispose() will fail to find the method, and in the second case, Dispose() will invoke the wrong method. In either of these cases, the using trick will work, as will a cast (although this will do slightly different things again if it is a value-type):
public void Dipose()
=> ((IDisposable)_myDisposableField)?.Dispose();
If you aren't sure whether the type is disposable (which happens in some polymorphism scenarios), you could also use either:
public void Dipose()
=> (_myDisposableField as IDisposable)?.Dispose();
or:
public void Dipose()
{
using (_myDisposableField as IDisposable) {}
}
The using statement defines the span of code after which the referenced object should be disposed.
Yes, you could just call a .dispose once it was done with but it would be less clear (IMHO) what the scope of the object was. YMMV.

Generic function to handle disposing IDisposable objects

I am working on a class that deals with a lot of Sql objects - Connection, Command, DataAdapter, CommandBuilder, etc. There are multiple instances where we have code like this:
if( command != null )
{
command.Dispose();
}
if( dataAdapter != null )
{
dataAdapter.Dispose();
}
etc
I know this is fairly insufficient in terms of duplication, but it has started smelling. The reason why I think it smells is because in some instances the object is also set to null.
if( command != null )
{
command.Dispose();
command = null;
}
I would love to get rid of the duplication if possible. I have come up with this generic method to dispose of an object and set it to null.
private void DisposeObject<TDisposable>( ref TDisposable disposableObject )
where TDisposable : class, IDisposable
{
if( disposableObject != null )
{
disposableObject.Dispose();
disposableObject = null;
}
}
My questions are...
Is this generic function a bad idea?
Is it necessary to set the object to null?
EDIT:
I am aware of the using statement, however I cannot always use it because I have some member variables that need to persist longer than one call. For example the connection and transaction objects.
Thanks!
You should consider if you can use the using statement.
using (SqlCommand command = ...)
{
// ...
}
This ensures that Dispose is called on the command object when control leaves the scope of the using. This has a number of advantages over writing the clean up code yourself as you have done:
It's more concise.
The variable will never be set to null - it is declared and initialized in one statement.
The variable goes out of scope when the object is disposed so you reduce the risk of accidentally trying to access a disposed resource.
It is exception safe.
If you nest using statements then the resources are naturally disposed in the correct order (the reverse order that they were created in).
Is it necessary to set the object to null?
It is not usually necessary to set variables to null when you have finished using them. The important thing is that you call Dispose when you have finished using the resource. If you use the above pattern, it is not only unnecessary to set the variable to null - it will give a compile error:
Cannot assign to 'c' because it is a 'using variable'
One thing to note is that using only works if an object is acquired and disposed in the same method call. You cannot use this pattern is if your resources need to stay alive for the duration of more than one method call. In this case you might want to make your class implement IDisposable and make sure the resources are cleaned up when your Dispose method is called. I this case you will need code like you have written. Setting variables to null is not wrong in this case but it is not important because the garbage collector will clean up the memory correctly anyway. The important thing is to make sure all the resources you own are disposed when your dispose method is called, and you are doing that.
A couple of implementation details:
You should ensure that if your Dispose is called twice that it does not throw an exception. Your utility function handles this case correctly.
You should ensure that the relevant methods on your object raise an ObjectDisposedException if your object has already been disposed.
You should implement IDisposable in the class that owns these fields. See my blog post on the subject. If this doesn't work well, then the class is not following OOP principles, and needs to be refactored.
It is not necessary to set variables to null after disposing them.
If you're objects have a lot of cleanup to do, they might want to track what needs to be deleted in a separate list of disposables, and handle all of them at once. Then at teardown it doesn't need to remember everything that needs to be disposed (nor does it need to check for null, it just looks in the list).
This probably doesn't build, but for expanatory purposes, you could include a RecycleBin in your class. Then the class only needs to dispose the bin.
public class RecycleBin : IDisposable
{
private List<IDisposable> _toDispose = new List<IDisposable>();
public void RememberToDispose(IDisposable disposable)
{
_toDispose.Add(disposable);
}
public void Dispose()
{
foreach(var d in _toDispose)
d.Dispose();
_toDispose.Clear();
}
}
I assume these are fields and not local variables, hence why the using keyword doesn't make sense.
Is this generic function a bad idea?
I think it's a good idea, and I've used a similar function a few times; +1 for making it generic.
Is it necessary to set the object to null?
Technically an object should allow multiple calls to its Dispose method. (For instance, this happens if an object is resurrected during finalization.) In practice, it's up to you whether you trust the authors of these classes or whether you want to code defensively. Personally, I check for null, then set references to null afterwards.
Edit: If this code is inside your own object's Dispose method then failing to set references to null won't leak memory. Instead, it's handy as a defence against double disposal.
I'm going to assume that you are creating the resource in one method, disposing it in another, and using it in one or more others, making the using statement useless for you.
In which case, your method is perfectly fine.
As far the second part of your question ("Is setting it to null necessary?"), the simple answer is "No, but it's not hurting anything".
Most objects hold one resource -- memory, which the Garbage collection deals with freeing, so we don't have to worry about it. Some hold some other resource as well : a file handle, a database connection, etc. For the second category, we must implement IDisposable, to free up that other resource.
Once the Dispose method is called, both categories because the same : they are holding memory. In this case, we can just let the variable go out of scope, dropping the reference to the memory, and allowing to GC to free it eventually -- Or we can force the issue, by setting the variable to null, and explicitly dropping the reference to the memory. We still have to wait until the GC kicks in for the memory to actually be freed, and more than likely the variable will go out of scope anyway, moments after to set it to null, so in the vast majority of cases, it will have no effect at all, but in a few rare cases, it will allow the memory to be freed a few seconds earlier.
However, if your specific case, where you are checking for null to see if you should call Dispose at all, you probably should set it to null, if there is a chance you might call Dispose() twice.
Given that iDisposable does not include any standard way of determining whether an object has been disposed, I like to set things to null when I dispose of them. To be sure, disposing of an object which has already been disposed is harmless, but it's nice to be able to examine an object in a watch window and tell at a glance which fields have been disposed. It's also nice to be able to have code test to ensure that objects which should have been disposed, are (assuming the code adheres to the convention of nulling out variables when disposing of them, and at no other time).
Why don't you use using C# construct? http://msdn.microsoft.com/en-us/library/yh598w02.aspx
Setting to null if not required.
Others have recommended the using construct, which I also recommend. However, I'd like to point out that, even if you do need a utility method, it's completely unnecessary to make it generic in the way that you've done. Simply declare your method to take an IDisposable:
private static void DisposeObject( ref IDisposable disposableObject )
{
if( disposableObject != null )
{
disposableObject.Dispose();
disposableObject = null;
}
}
You never need to set variables to null. The whole point of IDisposable.Dispose is to get the object into a state whereby it can hang around in memory harmlessly until the GC finalises it, so you just "dispose and forget".
I'm curious as to why you think you can't use the using statement. Creating an object in one method and disposing it in a different method is a really big code smell in my book, because you soon lose track of what's open where. Better to refactor the code like this:
using(var xxx = whatever()) {
LotsOfProcessing(xxx);
EvenMoreProcessing(xxx);
NowUseItAgain(xxx);
}
I'm sure there is a standard pattern name for this, but I just call it "destroy everything you create, but nothing else".
I answered this in my question and answer here:
Implementing IDisposable (the Disposable Pattern) as a service (class member)
It implemented a simplistic, reusable component that works for any IDisposable member.

Do you need to dispose of objects and set them to null?

Do you need to dispose of objects and set them to null, or will the garbage collector clean them up when they go out of scope?
Objects will be cleaned up when they are no longer being used and when the garbage collector sees fit. Sometimes, you may need to set an object to null in order to make it go out of scope (such as a static field whose value you no longer need), but overall there is usually no need to set to null.
Regarding disposing objects, I agree with #Andre. If the object is IDisposable it is a good idea to dispose it when you no longer need it, especially if the object uses unmanaged resources. Not disposing unmanaged resources will lead to memory leaks.
You can use the using statement to automatically dispose an object once your program leaves the scope of the using statement.
using (MyIDisposableObject obj = new MyIDisposableObject())
{
// use the object here
} // the object is disposed here
Which is functionally equivalent to:
MyIDisposableObject obj;
try
{
obj = new MyIDisposableObject();
}
finally
{
if (obj != null)
{
((IDisposable)obj).Dispose();
}
}
Objects never go out of scope in C# as they do in C++. They are dealt with by the Garbage Collector automatically when they are not used anymore. This is a more complicated approach than C++ where the scope of a variable is entirely deterministic. CLR garbage collector actively goes through all objects that have been created and works out if they are being used.
An object can go "out of scope" in one function but if its value is returned, then GC would look at whether or not the calling function holds onto the return value.
Setting object references to null is unnecessary as garbage collection works by working out which objects are being referenced by other objects.
In practice, you don't have to worry about destruction, it just works and it's great :)
Dispose must be called on all objects that implement IDisposable when you are finished working with them. Normally you would use a using block with those objects like so:
using (var ms = new MemoryStream()) {
//...
}
EDIT On variable scope. Craig has asked whether the variable scope has any effect on the object lifetime. To properly explain that aspect of CLR, I'll need to explain a few concepts from C++ and C#.
Actual variable scope
In both languages the variable can only be used in the same scope as it was defined - class, function or a statement block enclosed by braces. The subtle difference, however, is that in C#, variables cannot be redefined in a nested block.
In C++, this is perfectly legal:
int iVal = 8;
//iVal == 8
if (iVal == 8){
int iVal = 5;
//iVal == 5
}
//iVal == 8
In C#, however you get a a compiler error:
int iVal = 8;
if(iVal == 8) {
int iVal = 5; //error CS0136: A local variable named 'iVal' cannot be declared in this scope because it would give a different meaning to 'iVal', which is already used in a 'parent or current' scope to denote something else
}
This makes sense if you look at generated MSIL - all the variables used by the function are defined at the start of the function. Take a look at this function:
public static void Scope() {
int iVal = 8;
if(iVal == 8) {
int iVal2 = 5;
}
}
Below is the generated IL. Note that iVal2, which is defined inside the if block is actually defined at function level. Effectively this means that C# only has class and function level scope as far as variable lifetime is concerned.
.method public hidebysig static void Scope() cil managed
{
// Code size 19 (0x13)
.maxstack 2
.locals init ([0] int32 iVal,
[1] int32 iVal2,
[2] bool CS$4$0000)
//Function IL - omitted
} // end of method Test2::Scope
C++ scope and object lifetime
Whenever a C++ variable, allocated on the stack, goes out of scope it gets destructed. Remember that in C++ you can create objects on the stack or on the heap. When you create them on the stack, once execution leaves the scope, they get popped off the stack and gets destroyed.
if (true) {
MyClass stackObj; //created on the stack
MyClass heapObj = new MyClass(); //created on the heap
obj.doSomething();
} //<-- stackObj is destroyed
//heapObj still lives
When C++ objects are created on the heap, they must be explicitly destroyed, otherwise it is a memory leak. No such problem with stack variables though.
C# Object Lifetime
In CLR, objects (i.e. reference types) are always created on the managed heap. This is further reinforced by object creation syntax. Consider this code snippet.
MyClass stackObj;
In C++ this would create an instance on MyClass on the stack and call its default constructor. In C# it would create a reference to class MyClass that doesn't point to anything. The only way to create an instance of a class is by using new operator:
MyClass stackObj = new MyClass();
In a way, C# objects are a lot like objects that are created using new syntax in C++ - they are created on the heap but unlike C++ objects, they are managed by the runtime, so you don't have to worry about destructing them.
Since the objects are always on the heap the fact that object references (i.e. pointers) go out of scope becomes moot. There are more factors involved in determining if an object is to be collected than simply presence of references to the object.
C# Object references
Jon Skeet compared object references in Java to pieces of string that are attached to the balloon, which is the object. Same analogy applies to C# object references. They simply point to a location of the heap that contains the object. Thus, setting it to null has no immediate effect on the object lifetime, the balloon continues to exist, until the GC "pops" it.
Continuing down the balloon analogy, it would seem logical that once the balloon has no strings attached to it, it can be destroyed. In fact this is exactly how reference counted objects work in non-managed languages. Except this approach doesn't work for circular references very well. Imagine two balloons that are attached together by a string but neither balloon has a string to anything else. Under simple ref counting rules, they both continue to exist, even though the whole balloon group is "orphaned".
.NET objects are a lot like helium balloons under a roof. When the roof opens (GC runs) - the unused balloons float away, even though there might be groups of balloons that are tethered together.
.NET GC uses a combination of generational GC and mark and sweep. Generational approach involves the runtime favouring to inspect objects that have been allocated most recently, as they are more likely to be unused and mark and sweep involves runtime going through the whole object graph and working out if there are object groups that are unused. This adequately deals with circular dependency problem.
Also, .NET GC runs on another thread(so called finalizer thread) as it has quite a bit to do and doing that on the main thread would interrupt your program.
As others have said you definitely want to call Dispose if the class implements IDisposable. I take a fairly rigid position on this. Some might claim that calling Dispose on DataSet, for example, is pointless because they disassembled it and saw that it did not do anything meaningful. But, I think there are fallacies abound in that argument.
Read this for an interesting debate by respected individuals on the subject. Then read my reasoning here why I think Jeffery Richter is in the wrong camp.
Now, on to whether or not you should set a reference to null. The answer is no. Let me illustrate my point with the following code.
public static void Main()
{
Object a = new Object();
Console.WriteLine("object created");
DoSomething(a);
Console.WriteLine("object used");
a = null;
Console.WriteLine("reference set to null");
}
So when do you think the object referenced by a is eligible for collection? If you said after the call to a = null then you are wrong. If you said after the Main method completes then you are also wrong. The correct answer is that it is eligible for collection sometime during the call to DoSomething. That is right. It is eligible before the reference is set to null and perhaps even before the call to DoSomething completes. That is because the JIT compiler can recognize when object references are no longer dereferenced even if they are still rooted.
You never need to set objects to null in C#. The compiler and runtime will take care of figuring out when they are no longer in scope.
Yes, you should dispose of objects that implement IDisposable.
If the object implements IDisposable, then yes, you should dispose it. The object could be hanging on to native resources (file handles, OS objects) that might not be freed immediately otherwise. This can lead to resource starvation, file-locking issues, and other subtle bugs that could otherwise be avoided.
See also Implementing a Dispose Method on MSDN.
I agree with the common answer here that yes you should dispose and no you generally shouldn't set the variable to null... but I wanted to point out that dispose is NOT primarily about memory management. Yes, it can help (and sometimes does) with memory management, but it's primary purpose is to give you deterministic releasing of scarce resources.
For example, if you open a hardware port (serial for example), a TCP/IP socket, a file (in exclusive access mode) or even a database connection you have now prevented any other code from using those items until they are released. Dispose generally releases these items (along with GDI and other "os" handles etc. which there are 1000's of available, but are still limited overall). If you don't call dipose on the owner object and explicitly release these resources, then try to open the same resource again in the future (or another program does) that open attempt will fail because your undisposed, uncollected object still has the item open. Of course, when the GC collects the item (if the Dispose pattern has been implemented correctly) the resource will get released... but you don't know when that will be, so you don't know when it's safe to re-open that resource. This is the primary issue Dispose works around. Of course, releasing these handles often releases memory too, and never releasing them may never release that memory... hence all the talk about memory leaks, or delays in memory clean up.
I have seen real world examples of this causing problems. For instance, I have seen ASP.Net web applications that eventually fail to connect to the database (albeit for short periods of time, or until the web server process is restarted) because the sql server 'connection pool is full'... i.e, so many connections have been created and not explicitly released in so short a period of time that no new connections can be created and many of the connections in the pool, although not active, are still referenced by undiposed and uncollected objects and so can't be reused. Correctly disposing the database connections where necessary ensures this problem doesn't happen (at least not unless you have very high concurrent access).
If they implement the IDisposable interface then you should dispose them. The garbage collector will take care of the rest.
EDIT: best is to use the using command when working with disposable items:
using(var con = new SqlConnection("..")){ ...
Always call dispose. It is not worth the risk. Big managed enterprise applications should be treated with respect. No assumptions can be made or else it will come back to bite you.
Don't listen to leppie.
A lot of objects don't actually implement IDisposable, so you don't have to worry about them. If they genuinely go out of scope they will be freed automatically. Also I have never come across the situation where I have had to set something to null.
One thing that can happen is that a lot of objects can be held open. This can greatly increase the memory usage of your application. Sometimes it is hard to work out whether this is actually a memory leak, or whether your application is just doing a lot of stuff.
Memory profile tools can help with things like that, but it can be tricky.
In addition always unsubscribe from events that are not needed. Also be careful with WPF binding and controls. Not a usual situation, but I came across a situation where I had a WPF control that was being bound to an underlying object. The underlying object was large and took up a large amount of memory. The WPF control was being replaced with a new instance, and the old one was still hanging around for some reason. This caused a large memory leak.
In hindsite the code was poorly written, but the point is that you want to make sure that things that are not used go out of scope. That one took a long time to find with a memory profiler as it is hard to know what stuff in memory is valid, and what shouldn't be there.
When an object implements IDisposable you should call Dispose (or Close, in some cases, that will call Dispose for you).
You normally do not have to set objects to null, because the GC will know that an object will not be used anymore.
There is one exception when I set objects to null. When I retrieve a lot of objects (from the database) that I need to work on, and store them in a collection (or array). When the "work" is done, I set the object to null, because the GC does not know I'm finished working with it.
Example:
using (var db = GetDatabase()) {
// Retrieves array of keys
var keys = db.GetRecords(mySelection);
for(int i = 0; i < keys.Length; i++) {
var record = db.GetRecord(keys[i]);
record.DoWork();
keys[i] = null; // GC can dispose of key now
// The record had gone out of scope automatically,
// and does not need any special treatment
}
} // end using => db.Dispose is called
Normally, there's no need to set fields to null. I'd always recommend disposing unmanaged resources however.
From experience I'd also advise you to do the following:
Unsubscribe from events if you no longer need them.
Set any field holding a delegate or an expression to null if it's no longer needed.
I've come across some very hard to find issues that were the direct result of not following the advice above.
A good place to do this is in Dispose(), but sooner is usually better.
In general, if a reference exists to an object the garbage collector (GC) may take a couple of generations longer to figure out that an object is no longer in use. All the while the object remains in memory.
That may not be a problem until you find that your app is using a lot more memory than you'd expect. When that happens, hook up a memory profiler to see what objects are not being cleaned up. Setting fields referencing other objects to null and clearing collections on disposal can really help the GC figure out what objects it can remove from memory. The GC will reclaim the used memory faster making your app a lot less memory hungry and faster.
I have to answer, too.
The JIT generates tables together with the code from it's static analysis of variable usage.
Those table entries are the "GC-Roots" in the current stack frame. As the instruction pointer advances, those table entries become invalid and so ready for garbage collection.
Therefore: If it is a scoped variable, you don't need to set it to null - the GC will collect the object.
If it is a member or a static variable, you have to set it to null
A little late to the party, but there is one scenario that I don't think has been mentioned here - if class A implements IDisposable, and exposes public properties that are also IDisposable objects, then I think it's good practice for class A not only to dispose of the disposable objects that it has created in its Dispose method, but also to set them to null. The reason for this is that disposing an object and letting it get GCed (because there are no more references to it) are by no means the same thing, although it is pretty definitely a bug if it happens. If a client of Class A does dispose its object of type ClassA, the object still exists. If the client then tries to access one of these public properties (which have also now been disposed) the results can be quite unexpected. If they have been nulled as well as disposed, there will be a null reference exception immediately, which will make the problem easier to diagnose.

C#: Is there an Advantage to Disposing Resources in Reverse Order of their Allocation?

Many years ago, I was admonished to, whenever possible, release resources in reverse order to how they were allocated. That is:
block1 = malloc( ... );
block2 = malloc( ... );
... do stuff ...
free( block2 );
free( block1 );
I imagine on a 640K MS-DOS machine, this could minimize heap fragmentation. Is there any practical advantage to doing this in a C# /.NET application, or is this a habit that has outlived its relevance?
If your resources are created well, this shouldn't matter (much).
However, many poorly created libraries don't do proper checking. Disposing of resources in reverse of their allocation typically means that you're disposing of resource dependent on other resources first - which can prevent poorly written libraries from causing problems. (You never dispose of a resource, then use one that's depending on the first's existence in this case.)
It also is good practice, since you're not going to accidentally dispose a resource required by some other object too early.
Here's an example: look at a database operation. You don't want to close/dispose your connection before closing/disposing your command (which uses the connection).
Don't bother. The GarbageCollector reserves the right to defragment and move objects on the heap, so there's no telling what order things are in.
In addition, if you're disposing A and B and A references B it shouldn't matter if A disposes B when you dispose A, since the Dispose method should be callable more than once without an exception being thrown.
If you are referring to the time the destructor on the objects gets called, then that's up the garbage collector, the programming can have very little influence over that, and it is explicity non-deterministic according to the language definition.
If you are referring to calling IDisposable.Dispose(), then that depends on the behavior of the objects that implement the IDisposable interface.
In general, the order doesn't matter for most Framework objects, except to the extent that it matters to the calling code. But if object A maintains a dependency on object B, and object B is disposed, then it could very well be important not to do certain things with object A.
In most cases, Dispose() is not called directly, but rather it is called implicitly as part of a using or foreach statement, in which case the reverse-order pattern will naturally emerge, according to the statement embedding.
using(Foo foo = new Foo())
using(FooDoodler fooDoodler = new FooDoodler(foo))
{
// do stuff
// ...
// fooDoodler automatically gets disposed before foo at the end of the using statement.
}
Nested 'usings' shows you the 'outlived' is not really on, and rarely is (not to go and say never after 40 years of evidence).. And that includes the stack-based VM that runs on say CMOS.
[ Despite some attempts by MSDN.com and Duffius to make it vanish, you know manage it all for you the difference between the heap and stack. What a smart idea.. in space ]
"The runtime doesn't make any guarantees as to the order in which Finalize methods are called. For example, let's say there is an object that contains a pointer to an inner object. The garbage collector has detected that both objects are garbage. Furthermore, say that the inner object's Finalize method gets called first. Now, the outer object's Finalize method is allowed to access the inner object and call methods on it, but the inner object has been finalized and the results may be unpredictable. For this reason, it is strongly recommended that Finalize methods not access any inner, member objects."
http://msdn.microsoft.com/en-us/magazine/bb985010.aspx
So you can worry about your LIFO dispose semantics as much as you like, but if you leak one, the Dispose()'s are going to be called in whatever order the CLR fancies.
(This is more or less what Will said, above)

Categories