Clean Up Vs Memory Reclaim in .Net

Clean Up Vs Memory Reclaim in .Net - c#

I was reading this MSDN reference:
Although the garbage collector is able
to track the lifetime of an object
that encapsulates an unmanaged
resource, it does not have specific
knowledge about how to clean up the
resource. For these types of objects,
the .NET Framework provides the
Object.Finalize method, which allows
an object to clean up its unmanaged
resources properly when the garbage
collector reclaims the memory used by
the object. By default, the Finalize
method does nothing. If you want the
garbage collector to perform cleanup
operations on your object before it
reclaims the object's memory, you must
override the Finalize method in your
class.
I understand how GC works but this give me a thought that what is actually CleanUp? Is it just reclaiming memory if it is than why it is having different name?

Beware that this is not the full story either, as finalizing only occurs when the object is garbage collected. In actual fact you should release all unmanaged resources (file handles, mutexes, unmanaged memory) as soon as possible. You should have a look at the IDisposable interface, which defines the Dispose() function.
Wherever possible your disposer should run the same method to free resources as the finalizer would, but then call GC.SuppressFinalize() to stop it from running again (in the finalizer), as there is a minor performance hit when using objects that implement finalizers.

They used a generic phrase such as "clean up" because other things may need to be done besides just reclaiming memory. I can see how this may be a little confusing, since the quote mentions cleaning up resources and reclaiming memory in the same sentence. In that case, what they mean is that the garbage collector reclaims the memory used by the managed code that actually called into an unmanaged library (a wrapper class, for example), but leaves the unmanaged-specific reclamation process up to the developer (closing file handles, freeing buffers, etc).
As an example, I have a Graphviz wrapper library containing a Graph class. This class wraps the functions used to create graphs, add nodes to them, etc. Internally, this class maintains a pointer to an unmanaged graph structure allocated by Graphiz itself. To the .NET Framework, this is merely an IntPtr and it has no idea how to free it during garbage collection. So, when a managed Graph object is no longer being used, the garbage collector frees up the memory used by the pointer, but not the data it points to. To do this, I have to implement a finalizer that calls the unmanaged function agclose (the Graphviz function that releases the resources used by a graph).

An example would be if you wrote a component that used some operating system resource like named pipe or memory mapped file. You could used the finalize operation to release the resource back to the os.

Cleaning up a non-managed resource could include closing network connections, files, database connections, etc.. Of course, it could also include the deallocation of memory for that resource.

CleanUp is here means free up any bounded resource (Harddisk, Network bandwidth, Sound Card, Memory, CPU, etc) and since .NET have no managed reference to unmanaged code, it could just let you do the job at the right moment yourself using Finalize() method before GC swaps it. If you don't CleanUp you would end up some orphan unmanaged code at an unknown state which are still using resources. It's better to implement IDisposable and CleanUp by calling Dispose() on your object.

Related

Why doesn't CLR handle the cleanup code?

I have just started with the .NET framework. Today, I was taught about the IDisposable interface and the dispose() method. I was taught a few things regarding it:
dispose() should contain the cleanup code corresponding to an object(like closing any resources occupied by any objects - files or database connections,etc.)
I was also told that in case we don't do it in the dispose() method, the same could be done in the destructor, but that doesn't ensure immediate execution, and we are left to the mercy of GC.
And if at all we don't provide any cleanup code at all, the GC will forcefully terminate all connections to resources that our objects were holding. Hence, we should handle the cleanup code ourselves.
But I was curious as to why doesn't CLR handle this on it's own? It takes care of Memory Management, it takes care of Garbage Collection. So, it should very well know which Object holds onto which resource(s) and when that Object dies off. So, it should be capable of de-allocating those resources as well?
I asked a few people about it. The answer I was given was that it is because we need to close it gracefully, where as GC closes it forcefully. Is it actually the reason?

In .NET there's much more than managed code that the GC knows about. There's like a huge volume of unmanaged code involved: all the file handles, database connections, network sockets, ... all this is plain ol' unmanaged Win32 code. You can't even believe that in almost every single BCL function you are calling from your pretty C# application, you will be hitting like tons of unmanaged functions written in C++ (and may God forbid VB6) and buried deep into the internals of the OS itself. All those functions are allocating unmanaged memory, handles, ... The managed world doesn't know what happens there.
For example every single time you open a file (FileStream) you are basically calling (behind the scenes of course) the CreateFile unmanaged Win32 function. This function allocates an unmanaged file handle directly from the file system. .NET and the GC has strictly no way of tracking this unmanaged code and everything it does. That's why those classes implement the IDisposable interface. So that you could always wrap their instances in using statements and ensure that the Dispose method is always called, even in the event of an exception, and this as soon as possible. The Dispose method will take care of calling another unmanaged function to clean the mess it created.
So basically the way you could think about the IDisposable interface is the following:
The day when we have an operating system written in a fully managed language (something like Midori for example from Microsoft Research) we will probably no longer need IDisposable as the GC will be able to completely replace it as it will have knowledge of everything that happens within this system.

The point of IDisposable and Dispose() is that you should clean up unmanaged memory. That's memory .NET didn't allocate, which came from outside sources and thus the GC cannot know about it. So it cannot clean it up for you automatically. Essentially that's precisely the difference between managed and unmanaged memory ;-)
Generally you should implement Dispose() to clean up whatever unmanaged resources your class uses and implement the finalizer to call Dispose() too. The finalizer is just a safeguard, though. It will make sure that those resources get cleaned up eventually, if the caller forgets to dispose of your class properly.

The IDisposable interface is there to provide you a way to clean up un-managed resources. The CLR only manages your managed resources for you.
In other words, the CLR only knows how to clean up the things that it manages. If you open connections to the rest of the system (like opening files, database connections, etc.), those are your responsibility and you need to tell the CLR how you want it to clean those up for you.

It can only take care of memory management for .NET objects. Any code that needs to use unmanaged resources (because it interacts with a C++ library, for example) falls outside the garbage collector's bailiwick. All that code needs to be told when to release its resources the old-fashioned way.

There's no way for the .Net framework (and the GC) to know how to release a un-managed resource. All it can do, is destroy the reference your managed code has to the resource. It is a lot better to actually call .Close() on a connection to your database server (thereby telling it that the connection should go back into the poll of available connection), than just destroying the reference, and letting it timeout on it's own after a set amount of seconds.
So whenever possible, use the IDisposable interface when referencing un-managed resources!

IDisposable is used when you don't want the GC to handle that particular artifact. The most common example are connections, or file handles. You don't want to wait for the GC to run before releasing a file, or to close a connection to the database, since you don't know when that will happen.
Most people associate IDisposable with unmanaged resources, which is mostly accurate, but fail to remember that finalizers are the proper .NET way to handle those. IDisposable provides a way of deterministicly disposing if that is important to your program.

The IDisposable interface is simply a convention to allow you to deterministically dispose of managed and unmanaged resources. It alone doesn't replace garbage collection or do anything involving the garbage collector itself.
It is more apparent with unmanaged resources because unless these are handled (either in a finalizer or with deterministic disposal) they will remain as a memory leak until the process ends. With managed memory, if you don't deterministically dispose of the items they will be undeterministically collected (assuming eventual eligibility for collection) by the GC, because they are managed (this is also the reason why the dispose pattern doesn't include managed items in the finalizer route).
IDisposable itself doesn't do anything, it is just a recognised interface (and is supported in code with the using keyword) that people expect to find when handling items that use consumable resources, unmanaged memory, external items, etc.
The CLR cannot possibly know when an external item is finished with. That is entirely dependent on the flow of your application. If you happen to also not know when to dispose an object, the finalizer syntax is useful. If you implement a finalizer on a custom class, the garbage collection process will run this finalizer just prior to final collection. This is your last chance to tidy up after yourself.

we use Dispose in order to dispose unmanaged resssource as file access or connection database, because GC don't have information about this unmanaged ressource.
you can also use Finalize, but it's not performant because you save your ressource in finalisation structure, and GC pass in the end of dispose cycle by this finalisation structure, and it's not performant

Where can I find more detail on disposal of COM resources in .NET?

The background for my question is this: I am converting a message processing app that uses many COM components from VB6 to C#. Many of the COM components in the application are fine-grained components that are used in high numbers and frequency within message processing loops. I am seeing a massive (and progressively growing) increase in memory usage when processing a set of test messages in the C# app as compared to the VB6 app. I used a memory profiler on the application that confirmed that the high memory usage was due to live instances of COM objects on the application's unmanaged heap. I know that these components are not being "leaked" due to live references because if I put a GC.Collect() at the core of the message processing loop, the memory usage is flat and nearly identical to the VB6 app (although the performance degrades horribly as one would expect).
I have read everything I can find on the multi-generation garbage collector in C#, runtime callable wrappers, unmanaged resource memory allocation, Marshal.ReleaseComObject(), Marshal.FinalReleaseComObject(), etc. None of it explains why the application is holding onto live COM objects in the unmanaged heap when the corresponding RCWs are eligible for garbage collection.
In some articles, I have seen allusions to the possibility that the actual implementation of the garbage collector in C# may involve optimizations such as not performing collection of all eligible objects in a particular generation. If this or something like it were true, it could explain why eligible RCWs and their corresponding COM objects are not collected and destroyed. Another explanation could be if the destruction of a COM object in the unmanaged heap is not directly tied to the collection of its corresponding RCW. I have not found anything that offers this degree of detail on how COM objects are handled in .NET. I need to understand this better because my app's memory usage is currently unacceptable. Any pointers or recommendations would be greatly appreciated.
Edit: I should add that I am quite familiar with finalizers (which are not documented to exist on RCWs) and the IDisposable interface (which RCWs do not implement). To the best of my understanding, Marshal.ReleaseComObject() is the proper method of explicitly "disposing" of a COM reference. I was careful to add that statement for every known usage of a COM object in my app and it resulted in no difference in memory usage.
Further, it is not clear why a lack of disposal or finalization code could be the problem when the addition of an explicit GC.Collect() results in no memory problems. Neither Dispose() nor the presence of a finalizer result in the actual collection of objects. The former permits an object to suppress its finalization step (if any) and the latter allows for the cleanup of unmanaged resources of which none are exposed in an RCW.

Are you implementing IDisposable properly in the classes that instantiate your COM objects?
You need to implement IDisposable and dispose of your COM RCW's in the Dispose() method. Then all code that instantiates classes that implement IDisposable should call it either explicitly or by using a using() statement, like so:
var first = new DisposableObject();
...
first.Dispose();
and
using(var first = new DisposableObject())
{
...
}
IDisposable is the only way to get the CLR to dispose of these objects in a timely manner and to make sure you lose COM references.

Use a Finalizer or destructor to clean up the memory used by the COM objects.
http://msdn.microsoft.com/en-us/library/66x5fx1b.aspx
Alternatively, if you want the objects to clean up immediately, you can implement IDispose, and use a using statement in your code that instantiates the COM object.

What happens if I don't call Dispose on the pen object?

What happens if I don't call Dispose on the pen object in this code snippet?
private void panel_Paint(object sender, PaintEventArgs e)
{
var pen = Pen(Color.White, 1);
//Do some drawing
}

A couple of corrections should be made here:
Regarding the answer from Phil Devaney:
"...Calling Dispose allows you to do deterministic cleanup and is highly recommended."
Actually, calling Dispose() does not deterministically cause a GC collection in .NET - i.e. it does NOT trigger a GC immediately just because you called Dispose(). It only indirectly signals to the GC that the object can be cleaned up during the next GC (for the Generation that the object lives in). In other words, if the object lives in Gen 1 then it wouldn't be disposed of until a Gen 1 collection takes place. One of the only ways (though not the only) that you can programmatically and deterministically cause the GC to perform a collection is by calling GC.Collect(). However, doing so is not recommended since the GC "tunes" itself during runtime by collecting metrics about your memory allocations during runtime for your app. Calling GC.Collect() dumps those metrics and causes the GC to start its "tuning" all over again.
Regarding the answer:
IDisposable is for disposing unmanaged resources. This is the pattern in .NET.
This is incomplete. As the GC is non-deterministic, the Dispose Pattern, (How to properly implement the Dispose pattern), is available so that you can release the resources you are using - managed or unmanaged. It has nothing to do with what kind of resources you are releasing. The need for implementing a Finalizer does have to do with what kind of resources you are using - i.e. ONLY implement one if you have non-finalizable (i.e. native) resources. Maybe you are confusing the two. BTW, you should avoid implementing a Finalizer by using the SafeHandle class instead which wraps native resources which are marshaled via P/Invoke or COM Interop. If you do end up implementing a Finalizer, you should always implement the Dispose Pattern.
One critical note which I haven't seen anyone mention yet is that if disposable object is created and it has a Finalizer (and you never really know whether they do - and you certainly shouldn't make any assumptions about that), then it will get sent directly to the Finalization Queue and live for at least 1 extra GC collection.
If GC.SuppressFinalize() is not ultimately called, then the finalizer for the object will be called on the next GC. Note that a proper implementation of the Dispose pattern should call GC.SuppressFinalize(). Thus, if you call Dispose() on the object, and it has implemented the pattern properly, you will avoid execution of the Finalizer. If you don't call Dispose() on an object which has a finalizer, the object will have its Finalizer executed by the GC on the next collection. Why is this bad? The Finalizer thread in the CLR up to and including .NET 4.6 is single-threaded. Imagine what happens if you increase the burden on this thread - your app performance goes to you know where.
Calling Dispose on an object provides for the following:
reduce strain on the GC for the process;
reduce the app's memory pressure;
reduce the chance of an OutOfMemoryException (OOM) if the LOH (Large Object Heap) gets fragmented and the object is on the LOH;
Keep the object out of the Finalizable and f-reachable Queues if it has a Finalizer;
Make sure your resources (managed and unmanaged) are cleaned up.
Edit:
I just noticed that the "all knowing and always correct" MSDN documentation on IDisposable (extreme sarcasm here) actually does say
The primary use of this interface is
to release unmanaged resources
As anyone should know, MSDN is far from correct, never mentions or shows 'best practices', sometimes provides examples that don't compile, etc. It is unfortunate that this is documented in those words. However, I know what they were trying to say: in a perfect world the GC will cleanup all managed resources for you (how idealistic); it will not, however cleanup unmanaged resources. This is absolutely true. That being said, life is not perfect and neither is any application. The GC will only cleanup resources that have no rooted-references. This is mostly where the problem lies.
Among about 15-20 different ways that .NET can "leak" (or not free) memory, the one that would most likely bite you if you don't call Dispose() is the failure to unregister/unhook/unwire/detach event handlers/delegates. If you create an object that has delegates wired to it and you don't call Dispose() (and don't detach the delegates yourself) on it, the GC will still see the object as having rooted references - i.e. the delegates. Thus, the GC will never collect it.
#joren's comment/question below (my reply is too long for a comment):
I have a blog post about the Dispose pattern I recommend to use - (How to properly implement the Dispose pattern). There are times when you should null out references and it never hurts to do so. Actually, doing so does do something before GC runs - it removes the rooted reference to that object. The GC later scans its collection of rooted references and collects those that do not have a rooted reference. Think of this example when it is good to do so: you have an instance of type "ClassA" - let's call it 'X'. X contains an object of type "ClassB" - let's call this 'Y'. Y implements IDisposable, thus, X should do the same to dispose of Y. Let's assume that X is in Generation 2 or the LOH and Y is in Generation 0 or 1. When Dispose() is called on X and that implementation nulls out the reference to Y, the rooted reference to Y is immediately removed. If a GC happens for Gen 0 or Gen 1, the memory/resources for Y is cleaned up but the memory/resources for X is not since X lives in Gen 2 or the LOH.

The Pen will be collected by the GC at some indeterminate point in the future, whether or not you call Dispose.
However, any unmanaged resources held by the pen (e.g., a GDI+ handle) will not be cleaned up by the GC. The GC only cleans up managed resources. Calling Pen.Dispose allows you to ensure that these unmanaged resources are cleaned up in a timely manner and that you aren't leaking resources.
Now, if the Pen has a finalizer and that finalizer cleans up the unmanaged resources then those said unmanaged resources will be cleaned up when the Pen is garbage collected. But the point is that:
You should call Dispose explicitly so that you release your unmanaged resources, and
You shouldn't need to worry about the implementation detail of if there is a finalizer and it cleans up the unmanaged resources.
Pen implements IDisposable. IDisposable is for disposing unmanaged resources. This is the pattern in .NET.
For previous comments on the this topic, please see this answer.

The underlying GDI+ pen handle will not be released until some indeterminate time in the future i.e. when the Pen object is garbage collected and the object's finalizer is called. This might not be until the process terminates, or it might be earlier, but the point is its non-deterministic. Calling Dispose allows you to do deterministic cleanup and is highly recommended.

The total amount of .Net memory in use is the .Net part + all 'external' data in use. OS objects, open files, database and network connections all take some resources that are not purely .Net objects.
Graphics uses Pens and other objects wich are actually OS objects that are 'quite' expensive to keep around. (You can swap your Pen for a 1000x1000 bitmap file). These OS objects only get removed from the OS memory once you call a specific cleanup function. The Pen and Bitmap Dispose functions do this for you immediatly when you call them.
If you don't call Dispose the garbage collector will come to clean them up 'somewhere in the future*'.
(It will actually call the destructor/finalize code that probably calls Dispose())
*on a machine with infinite memory (or more then 1GB) somewhere in the future can be very far into the future. On a machine doing nothing it can be easily longer then 30 minutes to clean up that huge bitmap or very small pen.

If you really want to know how bad it is when you don't call Dispose on graphics objects you can use the CLR Profiler, available free for the download here. In the installation folder (defaults to C:\CLRProfiler ) is CLRProfiler.doc which has a nice example of what happens when you don't call Dispose on a Brush object. It is very enlightening. The short version is that graphics objects take up a larger chunk of memory than you might expect and they can hang around for a long time unless you call Dispose on them. Once the objects are no longer in use the system will, eventually, clean them up, but that process takes up more CPU time than if you had just called Dispose when you were finished with the objects.
You may also want to read up on using IDisposable here and here.

It will keep the resources until the garbage collector cleans it up

Depends if it implements finalizer and it calls the Dispose on its finalize method. If so, handle will be released at GC.
if not, handle will stay around until process is terminated.

With graphic stuff it can be very bad.
Open the Windows Task Manager. Click "choose columns" and choose column called "GDI Objects".
If you don't dispose certain graphic objects, this number will keep raising and raising.
In older versions of Windows this can crash the whole application (limit was 10000 as far as I remember), not sure about Vista/7 though but it's still a bad thing.

the garbage collector will collect it anyway BUT it matters WHEN:
if you dont call dispose on an object that you dont use it will live longer in memory and gets promoted to higher generations meaning that collecting it has a higher cost.

in back of my mind first idea came to the surface is that this object will be disposed as soon as the method finishes execution!, i dont know where did i got this info!, is it right?

What are unmanaged objects?

What are unmanaged objects? Can you please explain it in terms of the CLR? I learned on the internet that they say unmanaged objects don't run under the CLR environment. Can you please give me an example of unmanaged objects?

Any memory not managed by the CLR memory management (i.e. garbage collector) is unmanaged memory.
An OS file handle is one example of unmanaged memory (under .NET and windows).
To properly dispose of unmanaged
resources, it is recommended that you
implement a public Dispose or Close
method that executes the necessary
cleanup code for the object. The
IDisposable interface provides the
Dispose method for resource classes
that implement the interface. Because
it is public, users of your
application can call the Dispose
method directly to free memory used by
unmanaged resources. When you properly
implement a Dispose method, the
Finalize method becomes a safeguard to
clean up resources in the event that
the Dispose method is not called.
Ref: Cleaning Up Unmanaged Resources

In simple words, the unmanaged objects are the objects that aren't managed by the .Net framework.
Best example is the database connection or file operation are handled by the OS at the end and need to be liberated explicitly (File.Close() or Connection close) and won't be handled automatically by the Garbage collector.

VC++6.0 samples or many of activeX and COM objects your using everyday for your application or website are unmanaged, for example Excel VBA is unmanaged and too many other samples.

I learned on the internet that they say unmanaged objects don't run under the CLR environment.
This is not right, the CLR is pretty much able to do everything whats possible within C. In C# you have got a keywoard called unsafe which allows you to access even pointers and pointer offsets. I have a project where I do heavy Interop with a game engine and the C wrapper is so small, because I can handle all the memory objects within the CLR/C#.
By doesn't run they probably wanted to make it explicitly clear that the unamanged objects are not handled by the virtual machine: you have to do the cleanup or create wrapper classes which do the clean up for you.

Release resources in .Net C#

I'm new to C# and .NET, ,and have been reading around about it.
I need to know why and when do I need to release resources? Doesn't the garbage collector take care of everything? When do I need to implement IDisposable, and how is it different from destructor in C++?
Also, if my program is rather small i.e. a screensaver, do I need to care about releasing resources?
Thanks.

The garbage collector is only aware of memory. That's fine for memory, because one bit of memory is pretty much as good as any other, so long as you've got enough of it. (This is all modulo cache coherency etc.)
Now compare that with file handles. The operating system could have plenty of room to allocate more file handles - but if you've left a handle open to a particular file, no-one else will be able to open that particular file for writing. You should tell the system when you're done with a handle - usually by closing the relevant stream - as soon as you're finished, and do so in a way that closes it even if an exception is thrown. This is usually done with a using statement, which is like a try/finally with a call to Dispose in the finally block.
Destructors in C++ are very different from .NET finalizers, as C++ destructors are deterministic - they're automatically called when the relevant variable falls out of scope, for example. Finalizers are run by the garbage collector at some point after an object is no longer referenced by any "live" objects, but the timing is unpredictable. (In some rare cases, it may never happen.)
You should implement IDisposable yourself if you have any clean-up which should be done deterministically - typically that's the case if one of your instance variables also implements IDisposable. It's pretty rare these days to need to implement a finalizer yourself - you usually only need one if you have a direct hold on operating system handles, usually in the form of IntPtr; SafeHandle makes all of this a lot easier and frees you from having to write the finalizer yourself.

Basically, you need to worry about releasing resources to unmanaged code - anything outside the .NET framework. For example, a database connection or a file on the OS.
The garbage collector deals with managed code - code in the .NET framework.
Even a small application may need to release unmanaged resources, for example it may write to a local text file. When you have finished with the resource you need to ensure the object's Dispose method is called. The using statement simplifies the syntax:
using (TextWriter w = File.CreateText("test.txt"))
{
w.WriteLine("Test Line 1");
}
The TextWriter object implements the IDisposable interface so as soon as the using block is finished the Dispose method is called and the object can be garbage collected. The actual time of collection cannot be guaranteed.
If you create your own classes that need to be Disposed of properly you will need to implement the IDisposable interface and Dispose pattern yourself. On a simple application you probably won't need to do this, if you do this is a good resource.

Resources are of two kinds - managed, and unmanaged. Managed resources will be cleaned up by the garbage collector if you let it - that is, if you release any reference to the object. However, the garbage collection does not know how to release unmanaged resources that a managed object holds - file handles, and other OS resources for example.
IDisposable is best practice when there's a managed resource you want released promptly (like a database connection), and vital when there are unmanaged resources which you need to have released. The typical pattern:
public void Dispose()
protected void Dispose(bool disposing)
Lets you ensure that unmanaged resources are released whether by the Dispose method or by object finalisation.

You don't need to release memory in managed objects like strings or arrays - that is handled by the garbage collector.
You should clean up operating system resources and some unmanaged objects when you have finished using them. If you open a file you should always remember to close that file when you have finished using it. If you open a file exclusively and forget to close, the next time you try to open that file it might still be locked. If something implements IDisposable, you should definitely consider whether you need to close it properly. The documentation will usually tell you what the Dispose method does and when it should be called.
If you do forget, the garbage collector will eventually run the finalizer which should clean up the object correctly and release the unmanaged resources, but this does not happen immediately after the object becomes eligible for garbage collection, and it in fact might not run at all.
Also it is useful to know about the using statement.

The garbage collector releases MEMORY and cleans up - through disposition - elemetns it removes. BUT: IT only does so when it has memory pressure.
THis is seriously idiotic for ressources whree I may want to explicitely release them. Save to file, for example, is supposed to: Open the file, write out the data and - close the file, so it can be copied away by the user if he wants, WITHOUT waiting for the GC to come around and release the memory for the file object, which may not happen for hours.

You only need to worry about precious resources. Most objects you create while programming do not fit into this category. As you say, the garbage collector will take care of these.
What you do need to be mindful of is objects that implement IDisposable, which is an indication that the resources it owns are precious and should not wait for the finalizer thread to be cleaned up. The only time you would need to implement IDisposable is on classes that own a) objects that implement IDisposable (such as a file stream), or b) unmanaged resources.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.