Is it really necessary to release COM components from Office PIAs when you don't need them anymore, by invoking Marshal.ReleaseComObject(..)?
I found various and contradictory advice on this topic on the web. In my opinion, since the Outlook PIA always returns new references to its interfaces as return values from its methods, it is not necessary to explicitly release them. Am I right?
For VS 2010, see Marshal.ReleaseComObject Is Considered Dangerous.
With Microsoft Office, in general, you do need to explicitly release your references, which can be safely done in two stages:
(1) First release all the minor objects to which you do not hold a named variable via a call to GC.Collect() followed by GC.WaitForPendingFinalizers(). (You need to call this pair twice if the objects involved could have finalizers, such as when using Visual Studio Tools for Office (VSTO).)
(2) Then explicitly release the objects to which you do hold a named variable via a call to Marshal.FinalReleaseComObject() on each object.
That's it. :-)
I discussed this in more detail in a previous post, along with a code example.
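As a rough sketch (not the exact code from that post), and assuming an Excel automation scenario where workbook and app are the only named COM variables you hold, the two stages look roughly like this:
using System;
using System.Runtime.InteropServices;
using Excel = Microsoft.Office.Interop.Excel;

static class OfficeCleanup
{
    // workbook and app are the named COM variables the caller holds.
    public static void CleanUp(Excel.Workbook workbook, Excel.Application app)
    {
        // Stage 1: let the GC collect the "minor" objects that were never
        // stored in a named variable (e.g. app.Workbooks, sheet.Cells[1, 1]).
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();                    // second pass for objects with finalizers
        GC.WaitForPendingFinalizers();

        // Stage 2: explicitly release the objects you do hold named variables for.
        if (workbook != null) Marshal.FinalReleaseComObject(workbook);
        if (app != null) Marshal.FinalReleaseComObject(app);
    }
}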
PIAs are .NET interop wrappers. This means that the object's destructor (or Dispose, I can't remember which) will automatically handle its reference count. The trick is that some references won't be released until the garbage collector runs. It depends on what the COM object instantiates. For instance, a COM object that opens database cursors will keep those cursors alive in memory until the reference count on those cursors is released. With .NET/COM interop, the references aren't released until the garbage collector executes or you explicitly release the reference using Marshal.ReleaseComObject (or FinalReleaseComObject).
I personally haven't worked with the Microsoft Office PIAs, but under most circumstances, you shouldn't have to explicitly release the references. It is only when your application starts to lock other resources or crash that you should start being suspicious about dangling references.
EDIT: If you run into a situation where you do need to cleanup COM/Interop objects, use Marshal.FinalReleaseComObject - which takes the reference count all the way to zero instead of just decrementing by one - and set the object reference to null. You can explicitly force garbage collection (GC.Collect) if you really want to be safe, but be careful of doing GC too often as it does invoke a noticeable performance hit.
There are some good practices here using a managed wrapper; worth checking out.
Maybe it's just my superstition, but I decided to explicitly release the Office PIA via Marshal.ReleaseComObject() because when my application crashed, the references to Excel and Word were staying open. I didn't dig too deep into why (stupid deadlines), but releasing them as part of my class's dispose pattern fixed that problem.
You do need to do so if you want the instance of the Office application to exit, as described in this post.
And it's difficult to get it right in all but the most simple scenarios.
There's one simple rule about .Net/COM interop - When in doubt, always Release(). :-)
My experience shows that you have to; otherwise (at least with Outlook) the application may not shut down at all.
But this opens another can of worms: it looks like the RCWs are per process, so you can break some other add-in that happens to have a reference to the same object.
I have posted a related question here, but I still have no clear answer. I'll edit this post once I know more.
The Outlook add-in tries to correctly retrieve the exceptions of a recurring appointment (Outlook.Exceptions). I found the following documentation:
https://learn.microsoft.com/de-de/office/client-developer/outlook/pia/how-to-find-a-specific-appointment-in-a-recurring-appointment-series
There it says:
[...] When you work with recurring appointment items, you should release any prior references, obtain new references to the recurring appointment item before you access or modify the item, and release these references as soon as you are finished and have saved the changes. This practice applies to the recurring AppointmentItem object, and any Exception or RecurrencePattern object. To release a reference in Visual Basic, set that existing object to Nothing. In C#, explicitly release the memory for that object.
[...]
Now my question is: how do I do that? The example on the referenced page does not explicitly release the memory.
Would it be sufficient to set appt to null?
Setting it to null is definitely not enough. In this case, you additionally have to force the garbage collector to sweep the heap as soon as possible. You do this correctly with calls to GC.Collect() and GC.WaitForPendingFinalizers(). Calling the pair twice is safe, and it ensures that cycles are definitely cleaned up too.
But I'd recommend using System.Runtime.InteropServices.Marshal.ReleaseComObject to release an Outlook object when you have finished using it. Then set a variable to Nothing in Visual Basic (null in C#) to release the reference to the object. This is particularly important if your add-in attempts to enumerate more than 256 Outlook items in a collection that is stored on a Microsoft Exchange Server (this number was increased in latest versions). If you do not release these objects in a timely manner, you can reach the limit imposed by Exchange on the maximum number of items opened at any one time.
The ReleaseComObject method is used to explicitly control the lifetime of a COM object used from managed code. You should use this method to free the underlying COM object that holds references to resources in a timely manner or when objects must be freed in a specific order.
Every time a COM interface pointer enters the common language runtime (CLR), it is wrapped in an RCW.
The RCW has a reference count that is incremented every time a COM interface pointer is mapped to it. The ReleaseComObject method decrements the reference count of an RCW. When the reference count reaches zero, the runtime releases all its references on the unmanaged COM object, and throws a System.NullReferenceException if you attempt to use the object further. If the same COM interface is passed more than one time from unmanaged to managed code, the reference count on the wrapper is incremented every time, and calling ReleaseComObject returns the number of remaining references.
The ReleaseComObject method enables you to force an RCW reference count release so that it occurs precisely when you want it to. However, improper use of ReleaseComObject may cause your application to fail, or may cause an access violation.
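To illustrate the earlier advice about enumerating Outlook items, here is a rough sketch (the folder parameter and the appointment handling are assumptions, not code from any particular add-in) of releasing each item as soon as it has been processed:
using System.Runtime.InteropServices;
using Outlook = Microsoft.Office.Interop.Outlook;

static class CalendarScanner
{
    public static void EnumerateAppointments(Outlook.Folder calendar)
    {
        Outlook.Items items = calendar.Items;
        try
        {
            foreach (object entry in items)
            {
                var appt = entry as Outlook.AppointmentItem;
                if (appt == null)
                {
                    Marshal.ReleaseComObject(entry);
                    continue;
                }
                try
                {
                    // work with appt here (read properties, save changes, ...)
                }
                finally
                {
                    // Release the item before moving to the next one, so we never
                    // keep hundreds of server-side objects open at the same time.
                    Marshal.ReleaseComObject(appt);
                }
            }
        }
        finally
        {
            Marshal.ReleaseComObject(items);
        }
    }
}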
Memory management in C# (CLR runtime) is automatic and employs a garbage collector.
Periodically, the garbage collector checks for unreachable objects to be reclaimed. This process is not deterministic unless you force it (but normally you shouldn't).
To have the memory of an object freed, you simply have to make it unreachable: this can happen through various combinations of setting all the references to the offending object to null, or letting all of them go out of scope (provided that you didn't assign it to fields or properties).
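For example (the variable names are just for illustration), both of these routes end up with the array unreachable:
using System;

class ReachabilityDemo
{
    static void Main()
    {
        var data = new byte[1024 * 1024];
        Console.WriteLine(data.Length);

        // Route 1: drop the only reference explicitly;
        // the array is unreachable from this point on.
        data = null;

        // Route 2: do nothing and let the variable go out of scope;
        // once Main returns, the local reference dies with the stack frame.
        // Either way, the GC reclaims the array on some later collection.
    }
}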
Today I saw a piece of code that seemed odd to me at first glance and made me reconsider. Here is a shortened version of the code:
if (list != null)
{
    list.Clear();
    list = null;
}
My thought was, why not replace it simply by:
list = null;
I read a bit and I understand that clearing a list will remove the references to the objects, allowing the GC to do its thing, but will not "resize" it; the memory allocated for the list itself stays the same.
On the other side, setting it to null would also remove the reference to the list (and thus to its items), likewise allowing the GC to do its thing.
So I have been trying to figure out a reason to do it like the first block. One scenario I thought of is if you have two references to the list. The first block would clear the items in the list, so even if the second reference remains, the GC can still reclaim the memory allocated for the items.
Nonetheless, I feel like there's something weird about this, so I would like to know whether the scenario I mentioned makes sense.
Also, are there any other scenarios where we would have to Clear() a list right before setting the reference to null?
Finally, if the scenario I mentioned makes sense, wouldn't it be better to just make sure we don't hold multiple references to this list at once, and how would we do that (explicitly)?
Edit: I get the difference between Clearing and Nulling the list. I'm mostly curious to know if there is something inside the GC that would make it so that there would be a reason to Clear before Nulling.
The list.Clear() is not necessary in your scenario (where the List is private and only used within the class).
A great intro level link on reachability / live objects is http://levibotelho.com/development/how-does-the-garbage-collector-work :
How does the garbage collector identify garbage?
In Microsoft’s implementation of the .NET framework, the garbage collector determines if an object is garbage by examining the reference type variables pointing to it. In the context of the garbage collector, reference type variables are known as “roots”. Examples of roots include:
A reference on the stack
A reference in a static variable
A reference in another object on the managed heap that is not eligible for garbage collection
A reference in the form of a local variable in a method
The key bit in this context is A reference in another object on the managed heap that is not eligible for garbage collection. Thus, if the List is eligible to be collected (and the objects within the list aren't referenced elsewhere) then those objects in the List are also eligible to be collected.
In other words, the GC will realise that list and its contents are unreachable in the same pass.
So, is there an instance where list.Clear() would be useful? Yes. It might be useful if you have two references to a single List (e.g. as two fields in two different objects). One of those references may wish to clear the list in a way that also impacts the other reference - in which case list.Clear() is perfect.
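A small sketch of that shared-list scenario (the Producer/Consumer names are invented for illustration):
using System;
using System.Collections.Generic;

class Producer { public List<string> Buffer; }
class Consumer { public List<string> Buffer; }

class SharedListDemo
{
    static void Main()
    {
        var shared = new List<string> { "a", "b", "c" };
        var producer = new Producer { Buffer = shared };
        var consumer = new Consumer { Buffer = shared };

        // Nulling one field does not affect the other holder.
        producer.Buffer = null;
        Console.WriteLine(consumer.Buffer.Count);   // still 3

        // Clearing the list is visible through every reference.
        consumer.Buffer.Clear();
        Console.WriteLine(shared.Count);            // 0
    }
}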
This answer started as a comment for Mick, who claims that:
It depends on which version of .NET you are working with. On mobile platforms like Xamarin or mono, you may find that the garbage collector needs this kind of help in order to do its work.
That statement is begging to be fact checked. So, let us see...
.NET
.NET uses a generational mark-and-sweep garbage collector. You can see an outline of the algorithm in What happens during a garbage collection. In summary, it walks the object graph, and any object it cannot reach can be erased.
Thus, the garbage collector will correctly identify the items of the list as collectible in the same iteration, regardless of whether or not you clear the list. There is no need to decouple the objects beforehand.
This means that clearing the list does not help the garbage collector on the regular implementation of .NET.
Note: If there were another reference to the list, then the fact that you cleared the list would be visible.
Mono and Xamarin
Mono
As it turns out, the same is true for Mono.
Xamarin.Android
Also true for Xamarin.Android.
Xamarin.iOS
However, Xamarin.iOS requires additional considerations. In particular, MonoTouch will use wrapped Objective-C objects which are beyond the garbage collector. See Avoid strong circular references under iOS Performance. These objects require different semantics.
Xamarin.iOS will minimize the use of Objective-C objects by keeping a cache:
C# NSObjects are also created on demand when you invoke a method or a property that returns an NSObject. At this point, the runtime will look into an object cache and determine whether a given Objective-C NSObject has already been surfaced to the managed world or not. If the object has been surfaced, the existing object will be returned, otherwise a constructor that takes an IntPtr as a parameter is invoked to construct the object.
The system keeps these objects alive even when there are no references from managed code:
User-subclasses of NSObjects often contain C# state so whenever the Objective-C runtime performs a "retain" operation on one of these objects, the runtime creates a GCHandle that keeps the managed object alive, even if there are no C# visible references to the object. This simplifies bookkeeping a lot, since the state will be preserved automatically for you.
Emphasis mine.
Thus, under Xamarin.iOS, if there is a chance that the list might contain wrapped Objective-C objects, this code would help the garbage collector.
See the question How does memory management works on Xamarin.IOS, where Miguel de Icaza explains in his answer that the semantics are to "retain" the object when you take a reference and to "release" it when the reference is set to null.
On the Objective-C side, "release" does not mean destroying the object. Objective-C uses a reference-counting garbage collector. When we "retain" the object the counter is incremented, and when we "release" it the counter is decremented. The system destroys the object when the counter reaches zero. See: About Memory Management.
Therefore, Objective-C is bad at handling circular references (if A references B and B references A, their reference counts never reach zero, even if they cannot be reached), so you should avoid them in Xamarin.iOS. In fact, forgetting to decouple references will lead to leaks in Xamarin.iOS. See: Xamarin iOS memory leaks everywhere.
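To make the decoupling idea concrete, here is a plain C# sketch (the Parent/Child classes are invented; on Xamarin.iOS the same idea applies to wrapped native objects such as view controllers and their views):
using System;

class Child
{
    public Parent Owner;   // back-reference that completes the cycle
}

class Parent : IDisposable
{
    public Child Child = new Child();

    public Parent()
    {
        Child.Owner = this;    // A references B, B references A
    }

    public void Dispose()
    {
        // Break the cycle explicitly so neither side keeps the other alive
        // when native retain counts, rather than the GC, decide lifetime.
        if (Child != null)
        {
            Child.Owner = null;
            Child = null;
        }
    }
}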
Others
dotGNU also uses a generational mark and sweep garbage collector.
I also had a look at CrossNet (which compiles IL to C++); it appears they attempted to implement it too. I do not know how good it is.
It depends on which version of .NET you are working with. On mobile platforms like Xamarin or Mono, you may find that the garbage collector needs this kind of help in order to do its work, whereas on desktop platforms the garbage collector implementation may be more elaborate. Each implementation of the CLI out there is going to have its own implementation of the garbage collector, and it is likely to behave differently from one implementation to another.
I can remember 10 years ago working on a Windows Mobile application which had memory issues and this sort of code was the solution. This was probably due to the mobile platform requiring a garbage collector that was more frugal with processing power than the desktop.
Decoupling objects helps simplify the analysis the garbage collector needs to do and helps avoid scenarios where the garbage collector fails to recognise that a large graph of objects has actually become disconnected from all the threads in your application, which results in memory leaks.
Anyone who believes you can't have memory leaks in .NET is an inexperienced .NET developer. On desktop platforms just ensuring Dispose is called on objects which implement them may be enough, however with other implementations you may find it is not.
List.Clear() will decouple the objects in the list from the list and each other.
EDIT: So to be clear, I'm not claiming that any particular implementation currently out there is susceptible to memory leaks. And again, depending on when this answer is read, the robustness of the garbage collector in any implementation of the CLI out there could have changed since the time of writing.
Essentially I'm suggesting if you know that your code needs to be cross platform and used across many implementations of the .NET framework, especially implementations of the .NET framework for mobile devices, it could be worth investing time into decoupling objects when they are no longer required. In that case I'd start off by adding decoupling to classes that already implement Dispose, and then if needed look at implementing IDisposable on classes that don't implement IDisposable and ensuring Dispose is called on those classes.
How to tell for sure if it's needed? You need to instrument and monitor the memory usage of your application on each platform it is to be deployed on. Rather than writing lots of superfluous code, I think the best approach is to wait until your monitoring tools indicate you have memory leaks.
As mentioned in the docs:
List.Clear Method (): Count is set to 0, and references to other objects from elements of the collection are also released.
In your 1st snippet:
if (list != null)
{
    list.Clear();
    list = null;
}
If you just set the list to null, it means that you release your reference to the actual list object in memory (so the list itself remains in memory) and wait for the garbage collector to come and release its allocated memory.
But the problem is that your list may contain elements that hold references to other objects, for example:
list → objectA, objectB, objectC
objectB → objectB1, objectB2
So, after setting the list to null, the list has no references and should be collected by the garbage collector later, but objectB1 and objectB2 still have a reference from objectB (which is still in memory), and because of that the garbage collector needs to analyse the whole object reference chain. To remove that confusion, this snippet uses the .Clear() function first.
Clearing the list ensures that if the list is not garbage collected for some reason, then at the very least, the elements it contained can still be disposed of.
As stated in the comments, preventing other references to the list from existing requires careful planning, and clearing the list before nulling it doesn't incur a big enough performance hit to justify trying to avoid doing so.
We are using the EWS Managed API, which polls MS Exchange for new mail messages after a given interval. With each invocation of the polling call (PullSubscription.GetEvents()), Microsoft's API fails to properly dispose the NetworkStream and causes memory to increase proportionately. This was previously discussed here, but never resolved. Using ANTS Profiler we were able to determine which objects were continuously growing in memory and isolate the issue.
Now that the issue has been isolated - is there a way to dispose of a NetworkStream created in an external API that we don't have a reference to? GC.Collect() doesn't seem to dispose it since it still has an active reference. What can we do to cleanup the dangling reference? Is there some wrapper we can use to force cleanup of their buggy SDK?
There is no way to force the GC to release memory for a referenced object!
First of all, I would suggest contacting Microsoft itself for help with this bug.
Second, are you talking about "disposal" or just memory release? They are two totally different things (IDisposable pattern, finalizers).
Third, can you just dereference the objects that hold references to these streams?
Fourth, one possible solution could be to decompile the problematic code with Reflector, work out how to reach the fields that are keeping the objects referenced, and then use reflection in your own code to access those private fields and set them to null.
That is a very dirty hack, but if you have no other way, it is the only thing I can think of. Do this only if you cannot go any other way.
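A hedged sketch of that reflection hack; the field name is a pure guess and would have to be discovered with a decompiler first:
using System;
using System.Reflection;

static class DirtyHacks
{
    // fieldName must be found by inspecting the decompiled SDK;
    // "connection" below is only a placeholder.
    public static void NullOutPrivateField(object target, string fieldName)
    {
        FieldInfo field = target.GetType()
            .GetField(fieldName, BindingFlags.Instance | BindingFlags.NonPublic);
        if (field == null)
            return;

        var value = field.GetValue(target) as IDisposable;
        if (value != null)
            value.Dispose();              // dispose the dangling stream if possible

        field.SetValue(target, null);     // drop the reference so the GC can collect it
    }
}

// usage (hypothetical): DirtyHacks.NullOutPrivateField(pullSubscription, "connection");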
The easiest way would be to run the part that interfaces with the SDK in its own AppDomain and, after you are done, unload the AppDomain. This will cause all the memory allocated in the AppDomain to be freed.
But you will need to add some work to your project, since you can only exchange objects with an AppDomain that derive from MarshalByRefObject or are marked as serializable.
This will also allow you to monitor the amount of memory consumed by the AppDomain, so you can create your AppDomain, run the buggy SDK in it, and unload it if it reaches a chosen limit of memory consumption.
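A rough sketch of that approach, assuming the .NET Framework (AppDomain unloading does not exist on .NET Core and later); ExchangePoller is a hypothetical MarshalByRefObject wrapper around the SDK calls:
using System;

// Lives in the worker AppDomain and wraps the calls into the buggy SDK.
public class ExchangePoller : MarshalByRefObject
{
    public void PollOnce()
    {
        // call PullSubscription.GetEvents() etc. here
    }
}

class Host
{
    static void Main()
    {
        AppDomain worker = AppDomain.CreateDomain("ews-worker");
        try
        {
            var poller = (ExchangePoller)worker.CreateInstanceAndUnwrap(
                typeof(ExchangePoller).Assembly.FullName,
                typeof(ExchangePoller).FullName);

            poller.PollOnce();
            // With AppDomain.MonitoringIsEnabled = true you could also watch
            // worker.MonitoringTotalAllocatedMemorySize and recycle the domain
            // once it passes a chosen limit.
        }
        finally
        {
            // Unloading the domain frees everything allocated inside it.
            AppDomain.Unload(worker);
        }
    }
}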
I'm having issues with finalizers seemingly being called early in a C++/CLI (and C#) project I'm working on. This seems to be a very complex problem and I'm going to be mentioning a lot of different classes and types from the code. Fortunately it's open source, and you can follow along here: Pstsdk.Net (mercurial repository) I've also tried linking directly to the file browser where appropriate, so you can view the code as you read. Most of the code we deal with is in the pstsdk.mcpp folder of the repository.
The code right now is in a fairly hideous state (I'm working on that), and the current version of the code I'm working on is in the Finalization fixes (UNSTABLE!) branch. There are two changesets in that branch, and to understand my long-winded question, we'll need to deal with both. (changesets: ee6a002df36f and a12e9f5ea9fe)
For some background, this project is a C++/CLI wrapper of an unmanaged library written in C++. I am not the coordinator of the project, and there are several design decisions that I disagree with, as I'm sure many of you who look at the code will, but I digress. We wrap many of the layers of the original library in the C++/CLI DLL, but expose the easy-to-use API in the C# DLL. This is done because the intention of the project is to convert the entire library to managed C# code.
If you're able to get the code to compile, you can use this test code to reproduce the problem.
The problem
The latest changeset, entitled moved resource management code to finalizers, to show bug, shows the original problem I was having. Every class in this code uses the same pattern to free the unmanaged resources. Here is an example (C++/CLI):
DBContext::~DBContext()
{
    this->!DBContext();
    GC::SuppressFinalize(this);
}

DBContext::!DBContext()
{
    if (_pst.get() != nullptr)
        _pst.reset(); // _pst is a clr_scoped_ptr (managed type)
                      // that wraps a shared_ptr<T>.
}
This code has two benefits. First, when a class such as this is in a using statement, the resources are properly freed immediately. Secondly, if a dispose is forgotten by the user, when the GC finally decides to finalize the class, the unmanaged resources will be freed.
The problem with this approach, which I simply cannot get my head around, is that occasionally the GC will decide to finalize some of the classes that are used to enumerate over data in the file. This happens with many different PST files, and I've been able to determine it has something to do with the Finalize method being called, even though the class is still in use.
I can consistently get it to happen with this file (download)1. The finalizer that gets called early is in the NodeIdCollection class in the DBAccessor.cpp file. If you are able to run the code linked to above (this project can be difficult to set up because of the dependencies on the boost library), the application will fail with an exception, because the _nodes list is set to null and the _db_ pointer was reset as a result of the finalizer running.
1) Are there any glaring problems with the enumeration code in the NodeIdCollection class that would cause the GC to finalize this class while it's still in use?
I've only been able to get the code to run properly with the workaround I've described below.
An unsightly workaround
Now, I was able to work around this problem by moving all of the resource management code from each of the finalizers (!classname) to the destructors (~classname). This has solved the problem, though it hasn't satisfied my curiosity as to why the classes are finalized early.
However, there is a problem with the approach, and I'll admit that it's more a problem with the design. Due to the heavy use of pointers in the code, nearly every class handles its own resources, and requires each class be disposed. This makes using the enumerations quite ugly (C#):
foreach (var msg in pst.Messages)
{
    // If this using statement were removed, we would have
    // memory leaks
    using (msg)
    {
        // code here
    }
}
The using statement acting on the item in the collection just screams wrong to me; however, with this approach it is very much necessary to prevent any memory leaks. Without it, Dispose never gets called and the memory is never freed, even if the Dispose method on the pst class is called.
I have every intention of trying to change this design. The fundamental problem when this code was first being written, besides the fact that I knew little to nothing about C++/CLI, was that I couldn't put a native class inside of a managed one. I feel it might be possible to use scoped pointers that will free the memory automatically when the class is no longer in use, but I can't be sure if that's a valid way to go about this or if it would even work. So, my second question is:
2) What would be the best way to handle the unmanaged resources in the managed classes in a painless way?
To elaborate, could I replace a native pointer with the clr_scoped_ptr wrapper that was just recently added to the code (clr_scoped_ptr.h from this stackexchange question)? Or would I need to wrap the native pointer in something like a scoped_ptr<T> or smart_ptr<T>?
Thank you for reading all of this, I know it was a lot. I hope I've been clear enough that I might get some insight from people a little more experienced than I am. It's such a large question that I intend to add a bounty when the site allows me to. Hopefully, someone can help.
Thanks!
1 This file is part of the freely available Enron dataset of PST files
The clr_scoped_ptr is mine, and comes from here.
If it has any errors, please let me know.
Even if my code isn't perfect, using a smart pointer is the correct way to deal with this issue, even in managed code.
You do not need to (and should not) reset a clr_scoped_ptr in your finalizer. Each clr_scoped_ptr will itself be finalized by the runtime.
When using smart pointers, you do not need to write your own destructor or finalizer. The compiler-generated destructor will automatically call destructors on all subobjects, and every subobject finalizer will run when it is collected.
Looking closer at your code, there is indeed an error in NodeIdCollection. GetEnumerator() must return a different enumerator object each time it is called, so that each enumeration begins at the start of the sequence. You're reusing a single enumerator, meaning that the position is shared between successive calls to GetEnumerator(). That's bad.
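For illustration only, here is a C# sketch of the shape a corrected GetEnumerator() usually takes (the real class is C++/CLI and the field here is a placeholder), returning a fresh enumerator per call so every foreach starts at the beginning:
using System.Collections;
using System.Collections.Generic;

class NodeIdCollection : IEnumerable<uint>
{
    private readonly List<uint> _nodeIds = new List<uint>();

    public IEnumerator<uint> GetEnumerator()
    {
        // A new enumerator per call: each caller gets its own position
        // instead of sharing one cached enumerator.
        return _nodeIds.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}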
Refreshing my memory of destructors/finalizers from some Microsoft documentation, I think you could at least simplify your code a little.
Here's my version of your sequence:
DBContext::~DBContext()
{
    this->!DBContext();
}

DBContext::!DBContext()
{
    delete _pst;
    _pst = NULL;
}
The GC::SuppressFinalize is automatically done by C++/CLI, so there is no need for that. Since the _pst variable is initialised in the constructor (and deleting a null pointer causes no problems anyway), I can't see any reason to complicate the code by using smart pointers.
On a debugging note, I wonder if you can help make the problem more apparent by sprinkling in a few calls to "GC::Collect". That should force finalization on dangling objects for you.
Hope this helps a little,
I have been searching for a tool to debug unreleased COM references, which typically cause e.g. Word/Outlook processes to hang in memory when the code does not call Marshal.ReleaseComObject on all COM instances correctly. (Outlook 2007 partially fixes this for Outlook add-ins, but this is a generic question.)
Is there a tool that would display at least a list of COM references (by type) held by managed code? Ideally, it would also display memory-profiler-style object trees helping to debug where the reference increment occurred.
Debugging at runtime is not as important as being able to attach to a hung process, because the problem typically occurs when the code is done with the COM interface and someone forgot to release something: the application (e.g. winword) hangs in memory even after the calling managed application quits.
If such a tool does not exist, what is the (technical?) reason? It would be very useful for debugging a lot of otherwise very hard to find problems when working with COM interop.
Not quite the answer (or as simple an answer as) you are looking for, but ReleaseComObject (like the native Release which it wraps) returns an integer which should indicate the number of outstanding references remaining. You could add code to your project to explicitly do an AddRef so you could then Release, see the new count, and check whether it is what you are expecting.
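A hedged sketch of using that return value; comObject stands in for whichever interface pointer you are tracking:
using System;
using System.Runtime.InteropServices;

static class ComRefCounting
{
    public static void ReleaseCompletely(object comObject)
    {
        if (comObject == null || !Marshal.IsComObject(comObject))
            return;

        // ReleaseComObject returns the remaining RCW reference count,
        // so logging it shows how many releases were still outstanding.
        int remaining;
        do
        {
            remaining = Marshal.ReleaseComObject(comObject);
            Console.WriteLine("Remaining RCW references: " + remaining);
        } while (remaining > 0);
    }
}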
Here is my answer to debugging COM references using the debugger:
Troubleshooting a COM+ application deadlock