Excessive memory usage in C# with lots of COM objects

I have an application, originally written in VB6, that I converted to C# with a tool, with pretty good success from a functional perspective. It processes a high volume of messages using lots of small to medium-sized COM (C++) objects.
I noticed that a particular test run in the old VB6 app that ran using less than 40M of memory required nearly 900M in the C# app. If I put a GC.Collect() in the inner-most message processing loop of the C# app, it uses the same or less memory as the VB6 app although it is then really, really slow. This leads me to believe there is no "leak" in the absolute sense of the word.
I then ran the C# app through the AQTime memory profiler and it reported that there were an excessive number of COM/C++ objects live on the heap. I hypothesized that this was because the runtime callable wrappers around the COM objects were quite small and never (or rarely) triggered collection in C# even if their referenced COM objects were substantially larger. I thought I could address this by adding explicit Marshal.ReleaseComObject() calls around the COM objects in the C# app. I went and did this in a lot of places where the lifetime of the COM objects was easy to determine. I noticed only a very slight reduction in memory usage.
I am wondering why I did not have better success with this. Looking through the static methods in the Marshal class, I see some that lead me to believe either that I may be missing some subtlety in the handling of COM references, or that my assumption that COM objects are destroyed immediately when the RCW's reference count reaches zero is incorrect.
I would appreciate any suggestions for other approaches that I could try, or pointers to things that I may have overlooked or misunderstood.

Sorry for the link instead of a good synopsis, but I've never had that issue myself as I've dealt with IE and mshtml in a long lived scenario.
The article states:
When using a COM object from a .NET-based application, there are two objects involved: the RCW and the COM object (or objects). Garbage collection is only aware of the size of the RCW (which can be small), not of the COM object (which may be large). Therefore, while the .NET-based application might release the RCW, garbage collection may not reclaim the RCW even as memory runs out. As long as the RCW stays in memory, the COM object that it manages stays in memory also.
There are two mechanisms that ensure that COM objects are released from memory: the AppDomain object and the ReleaseComObject method. Using an AppDomain provides the simplest solution to managing COM objects but has performance costs and can expose a security risk. Using ReleaseComObject avoids those costs but requires more careful planning and coding.
COM Handling
Marshal.ReleaseComObject
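As a minimal sketch of the ReleaseComObject approach described in the quoted article: the helper below releases an RCW deterministically and skips anything that is not a COM object. The names ComCleanup, Release and UseAndRelease are made up for this illustration; only Marshal.IsComObject and Marshal.ReleaseComObject are real framework calls.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ComCleanup
{
    // Releases one reference held by the RCW and returns the remaining RCW
    // reference count, or -1 when the argument is null or not a COM object.
    public static int Release(object candidate)
    {
        if (candidate == null || !Marshal.IsComObject(candidate))
        {
            return -1;
        }
        return Marshal.ReleaseComObject(candidate);
    }

    // Runs an action against an object and releases it afterwards,
    // even if the action throws.
    public static void UseAndRelease<T>(T target, Action<T> action) where T : class
    {
        try
        {
            action(target);
        }
        finally
        {
            Release(target);
        }
    }
}
```

Note that every intermediate COM object returned by a property or method call gets its own RCW, so a chain such as a.B.C creates wrappers that also have to be released if the goal is to free everything promptly.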

Related

Tracking Down a .NET Windows Service Memory Leak

Before installing my windows service in production, I was looking for reliable tests that I can perform to make sure my code doesn't contain memory leaks.
However, all I could find on the net was advice to watch used memory in Task Manager, or to use paid memory profiler tools.
From my understanding, looking at Task Manager is not really helpful and cannot confirm a memory leak (if there is one).
How can I confirm whether there is a memory leak or not?
Are there any free tools to find the source of memory leaks?
Note: I'm using .Net Framework 4.6 and Visual Studio 2015 Community

Well you can use task manager.
GC apps can leak memory, and it will show there.
But...
Free tool - ".Net CLR profiler"
There is a free tool, and it's from Microsoft, and it's awesome. This is a must-use for all programs that leak references. Search MS' site.
Leaking references means you forget to set object references to null, or they never leave scope, and this is almost as likely to occur in Garbage collected languages as not - lists building up and not clearing, event handlers pointing to delegates, etc.
It's the GC equivalent of memory leaks and has the same result. This program tells you what references are taking up tons of memory - and you will know if it's supposed to be that way or not, and if not, you can go find them and fix the problem!
It even has a cool visualization of what objects allocate what memory (so you can track down mistakes). I believe there are youtubes of this if you need an explanation.
Wikipedia page with download links...
NOTE: You will likely have to run your app as something other than a service to use this. The profiler starts first and then runs your app. You can do this with TopShelf or by just putting the guts in a DLL that runs from an EXE that implements the service integration (service host pattern).

Although managed code implies no direct memory management, you still have to manage your instances. Those instances 'claim' memory. And it is all about the usage of these instances, keeping them alive when you don't expect them to be.
Just one of many examples: wrong usage of disposable classes can result in a lot of instances claiming memory. For a Windows service, a slow but steady increase of instances can eventually result in too much memory usage.
Yes, there is a tool to analyze memory leaks. It just isn't free. However, you might be able to identify your problem within the 7-day trial.
I would suggest taking a look at the .NET Memory Profiler.
It is great to analyze memory leaks during development. It uses the concept of snapshots to compare new instances, disposed instances etc. This is a great help to understand how your service uses its memory. You can then dig deeper into why new instances get created or are kept alive.
Yes, you can test to confirm whether memory leaks are introduced.
However, just out-of-the box this will not be very useful. This is because no one can anticipate what will happen during runtime. The tool can analyze your app for common issues, but this is not guaranteed.
However, you can use this tool to integrate memory consumption into your unit test framework like NUnit or MSTest.

Of course a memory profiler is the first kind of tool to try, but it will only tell you whether your instances keep increasing. You still want to know whether it is normal that they are increasing. Also, once you have established that some instances keep increasing for no good reason, (meaning, you have a leak,) you will want to know precisely which call trees lead to their allocation, so that you can troubleshoot the code that allocates them and fix it so that it does eventually release them.
Here is some of the knowledge I have collected over the years in dealing with such issues:
Test your service as a regular executable as much as possible. Trying to test the service as an actual service just makes things too complicated.
Get in the habit of explicitly undoing everything that you do at the end of the scope of that thing which you are doing. For example, if you register an observer to the event of some observee, there should always be some point in time (the disposal of the observer or the observee?) at which you de-register it. In theory, garbage collection should take care of that by collecting the entire graph of interconnected observers and observees, but in practice, if you don't kick the habit of forgetting to undo things that you do, you get memory leaks.
Use IDisposable as much as possible, and make your destructors report if someone forgot to invoke Dispose(). More about this method here: Mandatory disposal vs. the "Dispose-disposing" abomination Disclosure: I am the author of that article.
Have regular checkpoints in your program where you release everything that should be releasable (as if the program is performing an orderly shutdown in order to terminate) and then force a garbage collection to see whether you have any leaks.
If instances of some class appear to be leaking, use the following trick to discover the precise calling tree that caused their allocation: within the constructor of that class, capture the current stack trace and store it. (The usual form of this trick is to allocate an exception object without throwing it and read its stack trace, but in .NET the StackTrace property of an exception that was never thrown is null, so capture a System.Diagnostics.StackTrace or Environment.StackTrace instead.) If you discover later that this object has been leaked, you have the necessary stack trace. Just don't do this with too many objects, because obtaining a stack trace is ridiculously slow, only Microsoft knows why.
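Two of the habits above (a finalizer that reports a forgotten Dispose(), and storing the allocation stack trace in the constructor) can be combined in one small sketch. The class name TrackedResource and the ReportLeak hook are hypothetical; the trace is captured with System.Diagnostics.StackTrace, swapped in for the unthrown exception because that is what yields a usable trace in .NET.

```csharp
using System;
using System.Diagnostics;

public sealed class TrackedResource : IDisposable
{
    // Hypothetical sink for leak reports; replace with real logging.
    public static Action<string> ReportLeak { get; set; } =
        message => Console.Error.WriteLine(message);

    private readonly string allocationTrace;
    private bool disposed;

    public TrackedResource()
    {
        // Capturing a stack trace is slow, so enable this only for suspect types.
        allocationTrace = new StackTrace(true).ToString();
    }

    public string AllocationTrace => allocationTrace;

    public bool IsDisposed => disposed;

    public void Dispose()
    {
        disposed = true;
        // A disposed instance has nothing to report, so skip the finalizer.
        GC.SuppressFinalize(this);
    }

    // Runs only if nobody called Dispose(); reports where the instance was created.
    ~TrackedResource()
    {
        if (!disposed)
        {
            ReportLeak("TrackedResource was not disposed. Allocated at:" +
                       Environment.NewLine + allocationTrace);
        }
    }
}
```

The finalizer runs on the finalizer thread at an unspecified time, so the report tells you that a leak happened and where the object came from, not when it was dropped.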

You could try the free Memoscope memory profiler
https://github.com/fremag/MemoScope.Net
I do not agree that you can trust Task Manager to check whether you have a memory leak. The problem with a garbage collector is that it can decide, based on heuristics, to keep the memory after a memory spike and not return it to the OS. You might have a 2 GB commit size, but 90% of it can be free.
You should use VMMap to check during the tests what types of memory your process contains. You do not only have the managed heap, but also the unmanaged heap, private bytes, stacks (thread leaks), shared files and much more that needs to be tracked.
VMMap also has a command-line interface, which makes it possible to create snapshots at regular intervals that you can examine later. If you have memory growth, you can find out which type of memory is leaking; different leak types need different debugging tools and approaches.
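To see the gap between what the GC considers live and what the process has committed, a rough in-process snapshot can be logged at intervals. This is only a sketch that complements a tool like VMMap, not a replacement for it; the MemorySnapshot name and the output format are made up here, while the GC and Process members are standard framework APIs.

```csharp
using System;
using System.Diagnostics;

public static class MemorySnapshot
{
    // Returns one log line comparing the managed heap with process-level counters.
    public static string Capture()
    {
        using (Process process = Process.GetCurrentProcess())
        {
            long managedBytes = GC.GetTotalMemory(false);    // managed heap estimate, no forced collection
            long privateBytes = process.PrivateMemorySize64; // memory committed privately to the process
            long workingSet = process.WorkingSet64;          // physical memory currently in use

            return "managed=" + (managedBytes / 1024) + " KB, " +
                   "private=" + (privateBytes / 1024) + " KB, " +
                   "workingSet=" + (workingSet / 1024) + " KB, " +
                   "gen0=" + GC.CollectionCount(0) + ", " +
                   "gen2=" + GC.CollectionCount(2);
        }
    }
}
```

If the private bytes keep growing while the managed figure stays flat, the growth is probably outside the managed heap, which is the case where a managed-only profiler tells you little.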

I would not say that the garbage collector is infallible. There are times when it fails silently, and those cases are not straightforward. Memory streams are a common cause of memory leaks: you can open them in one context and they may never get closed, even though the usage is wrapped in a using statement (which defines a disposable object that should be cleaned up as soon as it falls out of scope). If you are experiencing crashes due to running out of memory, Windows does create dump files that you can sift through.
This is by no means fun or easy and is quite tedious but it tends to be your best bet.
Common areas where it is easy to create memory leaks are anything that uses the System.Drawing DLL, memory streams, and serious multi-threading.

If you use Entity Framework and a DI pattern, perhaps using Castle Windsor, you can easily get memory leaks.
The main thing to do is use the using( ){ } statement wherever you can so that objects are disposed automatically.
Also, turn off automatic change tracking in Entity Framework where you are only reading and not writing. It is best to isolate your writes: open a using() {} block at that point, get a dbContext (with tracking on), and write your data.
If you want to investigate what is on the heap, the best tool I've used is RedGate ANTS http://www.red-gate.com/products/dotnet-development/ants-memory-profiler/solving-memory-problems/getting-started It is not cheap, but it works.
However, if you use the using() {} pattern wherever you can (don't make a static or singleton DbContext, never keep one context alive through a massive loop of updates, and dispose of contexts as often as you can!), you will find that memory isn't often an issue.
Hope this helps.

Unless you're dealing with unmanaged code, I would be so bold as to say you don't have to worry about memory leaks. Any unreferenced object in managed code will be removed by the garbage collector, and if you do find a memory leak within the .NET Framework itself, you should consider yourself very lucky (well, unlucky).
However, you can still encounter ever-growing memory usage if references to objects are never released. For example, say you keep an internal log structure, and you just keep adding entries to a log list. Every entry is then still referenced from the log list and will therefore never be collected.
From my experience, you can definitely use Task Manager as an indicator of whether your system has growth issues: if the memory usage steadily keeps rising, you know you have an issue. If it grows to a point but eventually converges to a certain size, that indicates it has reached its operating threshold.
If you want a more detailed view of managed memory usage, you can download Process Explorer, developed by Microsoft. It is still quite blunt, but it gives a somewhat better statistical view than Task Manager.

Why might unmanaged memory account for over 60% of memory used by console application?

I'm profiling memory use with ANTS Memory Profiler 7.0 and noticed that unmanaged memory use is ~193MB (~62%) for a console application that does little more than populate some DTOs from 10 million or so records.
The help text for unmanaged memory says:
The memory is assigned to the parts of the application that aren't running as pure .NET code. This includes the common language runtime itself, graphics buffers and any unmanaged data accessed through P/Invoke or COM+
Why might this figure be so high?

You will inevitably use unmanaged code when accessing a database. The interface to the engine is always code that has been around for a long time, predating .NET and wrapped by managed classes that provide the interop. This is true for, say, SQL Server and any provider that piggy-backs onto OleDb or ODBC.
These managed classes will always implement IDisposable so you can release the resources consumed by the native provider early. Forgetting to do so is very common and rarely noticed, other than by seeing the process run "heavy", seemingly consuming a lot of handles and unmanaged memory for no good reason. This will especially be the case when the garbage collector does not run frequently enough, something you can see with Perfmon.exe. So beyond not using Dispose, part of the problem can be that you don't yet do enough work with these DTO objects to get enough GC churn.
Review your code and ensure you use Dispose() and the using statement where required.

C# Garbage Collection -> to C++ delete

I'm converting a C# project to C++ and have a question about deleting objects after use. In C# the GC of course takes care of deleting objects, but in C++ it has to be done explicitly using the delete keyword.
My question is: is it OK to just follow each object's usage throughout a method and then delete it as soon as it goes out of scope (i.e. method end or re-assignment)?
I know though that the GC waits for a certain size of garbage (~1MB) before deleting; does it do this because there is an overhead when using delete?
As this is a game I am creating, there will potentially be lots of objects being created and deleted every second, so would it be better to keep track of pointers that go out of scope, and delete them only once that size reaches 1MB?
(as a side note: later when the game is optimised, objects will be loaded once at startup so there is not much to delete during gameplay)

Your problem is that you are using pointers in C++.
This is a fundamental problem that you must fix, then all your problems go away. As chance would have it, I got so fed up with this general trend that I created a set of presentation slides on this issue. – (CC BY, so feel free to use them).
Have a look at the slides. While they are certainly not entirely serious, the fundamental message is still true: Don’t use pointers. But more accurately, the message should read: Don’t use delete.
In your particular situation you might find yourself with a lot of long-lived small objects. This is indeed a situation which a modern GC handles quite well, and which reference-counting smart pointers (shared_ptr) handle less efficiently. If (and only if!) this becomes a performance problem, consider switching to a small object allocator library.

You should be using RAII as much as possible in C++ so that you never have to explicitly delete anything.
Once you use RAII through smart pointers and your own resource-managing classes, every dynamic allocation you make will exist only as long as something can still reference it. You do not have to manage any resources explicitly.

Memory management in C# and C++ is completely different. You shouldn't try to mimic the behavior of .NET's GC in C++. In .NET allocating memory is super fast (basically moving a pointer) whereas freeing it is the heavy task. In C++ allocating memory isn't that lightweight for several reasons, mainly because a large enough chunk of memory has to be found. When memory chunks of different sizes are allocated and freed many times during the execution of the program the heap can get fragmented, containing many small "holes" of free memory. In .NET this won't happen because the GC will compact the heap. Freeing memory in C++ is quite fast, though.
Best practices in .NET don't necessarily work in C++. For example, pooling and reusing objects in .NET isn't recommended most of the time, because the objects get promoted to higher generations by the GC. The GC works best for short lived objects. On the other hand, pooling objects in C++ can be very useful to avoid heap fragmentation. Also, allocating a larger chunk of memory and using placement new can work great for many smaller objects that need to be allocated and freed frequently, as it can occur in games. Read up on general memory management techniques in C++ such as RAII or placement new.
Also, I'd recommend getting the books "Effective C++" and "More Effective C++".

Well, the simplest solution might be to just use garbage collection in
C++. The Boehm collector works well, for example. Still, there are
pros and cons (but porting code originally written in C# would be a
likely candidate for a case where the pros largely outweigh the cons.)
Otherwise, if you convert the code to idiomatic C++, there shouldn't be
that many dynamically allocated objects to worry about. Unlike C#, C++
has value semantics by default, and most of your short lived objects
should be simply local variables, possibly copied if they are returned,
but not allocated dynamically. In C++, dynamic allocation is normally
only used for entity objects, whose lifetime depends on external events;
e.g. a Monster is created at some random time, with a probability
depending on the game state, and is deleted at some later time, in
reaction to events which change the game state. In this case, you
delete the object when the monster ceases to be part of the game. In
C#, you probably have a dispose function, or something similar, for
such objects, since they typically have concrete actions which must be
carried out when they cease to exist—things like deregistering as
an Observer, if that's one of the patterns you're using. In C++, this
sort of thing is typically handled by the destructor, and instead of
calling dispose, you delete the object.

Substituting a shared_ptr everywhere you use a reference in C# would get you the closest approximation, probably for the lowest effort, when converting the code.
However, you specifically mention following an object's use through a method and deleting it at the end. A better approach is not to new up the object at all but simply to instantiate it inline, on the stack. In fact, with the new copy semantics being introduced, this approach is also an efficient way to deal with returned objects, so in almost every scenario there is no need to use pointers.

There are a lot more things to take into considerations when deallocating objects than just calling delete whenever it goes out of scope. You have to make sure that you only call delete once and only call it once all pointers to that object have gone out of scope. The garbage collector in .NET handles all of that for you.
The construct that corresponds most closely to that in C++ is tr1::shared_ptr<>, which keeps a reference count for the object and deallocates it when the count drops to zero. A first approach to get things running would be to turn all C# references into C++ tr1::shared_ptr<>. Then you can go to the places where this is a performance bottleneck (only after you've verified with a profiler that it is an actual bottleneck) and change to more efficient memory handling.

Garbage collection in C++ has been discussed a lot on SO.
Try reading through this:
Garbage Collection in C++

Where can I find more detail on disposal of COM resources in .NET?

The background for my question is this: I am converting a message processing app that uses many COM components from VB6 to C#. Many of the COM components in the application are fine-grained components that are used in high numbers and frequency within message processing loops. I am seeing a massive (and progressively growing) increase in memory usage when processing a set of test messages in the C# app as compared to the VB6 app. I used a memory profiler on the application that confirmed that the high memory usage was due to live instances of COM objects on the application's unmanaged heap. I know that these components are not being "leaked" due to live references because if I put a GC.Collect() at the core of the message processing loop, the memory usage is flat and nearly identical to the VB6 app (although the performance degrades horribly as one would expect).
I have read everything I can find on the multi-generation garbage collector in C#, runtime callable wrappers, unmanaged resource memory allocation, Marshal.ReleaseComObject(), Marshal.FinalReleaseComObject(), etc. None of it explains why the application is holding onto live COM objects in the unmanaged heap when the corresponding RCWs are eligible for garbage collection.
In some articles, I have seen allusions to the possibility that the actual implementation of the garbage collector in C# may involve optimizations such as not performing collection of all eligible objects in a particular generation. If this or something like it were true, it could explain why eligible RCWs and their corresponding COM objects are not collected and destroyed. Another explanation could be if the destruction of a COM object in the unmanaged heap is not directly tied to the collection of its corresponding RCW. I have not found anything that offers this degree of detail on how COM objects are handled in .NET. I need to understand this better because my app's memory usage is currently unacceptable. Any pointers or recommendations would be greatly appreciated.
Edit: I should add that I am quite familiar with finalizers (which are not documented to exist on RCWs) and the IDisposable interface (which RCWs do not implement). To the best of my understanding, Marshal.ReleaseComObject() is the proper method of explicitly "disposing" of a COM reference. I was careful to add that statement for every known usage of a COM object in my app and it resulted in no difference in memory usage.
Further, it is not clear why a lack of disposal or finalization code could be the problem when the addition of an explicit GC.Collect() results in no memory problems. Neither Dispose() nor the presence of a finalizer results in the actual collection of objects. The former permits an object to suppress its finalization step (if any), and the latter allows for the cleanup of unmanaged resources, of which none are exposed in an RCW.

Are you implementing IDisposable properly in the classes that instantiate your COM objects?
You need to implement IDisposable and release your COM RCWs in the Dispose() method. Then all code that instantiates classes that implement IDisposable should call Dispose(), either explicitly or through a using statement, like so:
    var first = new DisposableObject();
    // ... use first ...
    first.Dispose(); // explicit call; skipped if the code above throws
and
    using (var first = new DisposableObject())
    {
        // ... use first; Dispose() runs on exit, even if an exception is thrown ...
    }
IDisposable is the only way to get the CLR to dispose of these objects in a timely manner and to make sure you release COM references.
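What a class like DisposableObject might look like is not shown in the answer, so the following is only a sketch under that assumption: a small generic wrapper, here called ComHandle<T> (a made-up name), that owns one RCW and releases it in Dispose(). It checks Marshal.IsComObject first, so it is harmless when handed an ordinary managed object.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class ComHandle<T> : IDisposable where T : class
{
    private T instance;

    public ComHandle(T instance)
    {
        if (instance == null) throw new ArgumentNullException(nameof(instance));
        this.instance = instance;
    }

    // The wrapped object; throws once the handle has been disposed.
    public T Value
    {
        get
        {
            if (instance == null) throw new ObjectDisposedException("ComHandle");
            return instance;
        }
    }

    public bool IsDisposed => instance == null;

    public void Dispose()
    {
        T target = instance;
        instance = null;

        // Only real COM wrappers can be released; anything else is just dropped.
        if (target != null && Marshal.IsComObject(target))
        {
            Marshal.ReleaseComObject(target);
        }
    }
}
```

Usage would follow the using pattern shown above, for example using (var handle = new ComHandle<IMyComInterface>(CreateIt())) { ... handle.Value ... }, where IMyComInterface and CreateIt are placeholders for your own COM interop types.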

Use a finalizer (destructor) to clean up the memory used by the COM objects.
http://msdn.microsoft.com/en-us/library/66x5fx1b.aspx
Alternatively, if you want the objects to be cleaned up immediately, you can implement IDisposable and use a using statement in the code that instantiates the COM object.

What are ways to solve Memory Leaks in C#

I'm learning C#. From what I know, you have to set things up correctly to have the garbage collector actually delete everything as it should be. I'm looking for wisdom learned over the years from you, the intelligent.
I'm coming from a C++ background and am VERY used to code-smells and development patterns. I want to learn what code-smells are like in C#. Give me advice!
What are the best ways to get things deleted?
How can you figure out when you have "memory leaks"?
Edit: I am trying to develop a punch-list of "stuff to always do for memory management"
Thanks, so much.

In C#, the .NET Framework uses managed memory, and everything (except unmanaged resources that you allocate) is garbage collected.
It is safe to assume that managed types are always garbage collected. That includes arrays, classes and structures. Feel free to do int[] stuff = new int[32]; and forget about it.
If you open a file, database connection, or any other unmanaged resource in a class, implement the IDisposable interface and de-allocate the unmanaged resource in your Dispose method.
Any object whose class implements IDisposable should be explicitly closed, or used in a (I think cool) using block, like:
    using (StreamReader reader = new StreamReader("myfile.txt"))
    {
        // ... your code here ...
    }
Here .NET will dispose reader when execution leaves the { } scope.

The first thing with GC is that it is non-deterministic; if you want a resource cleaned up promptly, implement IDisposable and use using; that doesn't collect the managed memory, but can help a lot with unmanaged resources and onward chains.
In particular, things to watch out for:
lots of pinning (places a lot of restrictions on what the GC can do)
lots of finalizers (you don't usually need them; slows down GC)
static events - easy way to keep a lot of large object graphs alive ;-p
events on an inexpensive long-life object, that can see an expensive object that should have been cleaned up
"captured variables" accidentally keeping graphs alive
For investigating memory leaks... "SOS" is one of the easiest routes; you can use SOS to find all instances of a type, and what can see it, etc.

In general, the less you worry about memory allocation in C#, the better off you are. I would leave it to a profiler to tell me when I'm having issues with collection.
You can't create memory leaks in C# in the same way as you do in C++. The garbage collector will always "have your back". What you can do is create objects and hold references to them even though you never use them. That's a code smell to look out for.
Other than that:
Have some notion of how frequently collection will occur (for performance reasons)
Don't hold references to objects longer than you need
Dispose of objects that implement IDisposable as soon as you're done with them (use the using syntax)
Properly implement the IDisposable interface

The main sources of memory leaks I can think of are:
Keeping references to objects you don't need any more (usually in some sort of collection). Here you need to remember that everything you add to a collection that you hold a reference to will stay in memory.
Having circular references, e.g. having delegates registered with an event. Even though you don't explicitly reference an object, it can't get garbage collected, because one of its methods is registered as a delegate with an event. In these cases you need to remember to remove the delegate before discarding the reference.
Interoperating with native code and failing to free it. Even if you use managed wrappers that implement finalizers, often the CLR doesn't clean them up fast enough, because it doesn't understand the memory footprint. You should use the using(IDisposable ){} pattern.
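For the last point, one way to make the GC aware of a native footprint it cannot see is GC.AddMemoryPressure, paired with GC.RemoveMemoryPressure on release. The sketch below wraps a block of unmanaged memory; NativeBuffer is a hypothetical class, and for something like a COM object the byte count passed in would have to be your own estimate.

```csharp
using System;
using System.Runtime.InteropServices;

public sealed class NativeBuffer : IDisposable
{
    private IntPtr pointer;
    private readonly long size;

    public NativeBuffer(int sizeInBytes)
    {
        if (sizeInBytes <= 0) throw new ArgumentOutOfRangeException(nameof(sizeInBytes));

        pointer = Marshal.AllocHGlobal(sizeInBytes);
        size = sizeInBytes;

        // Tell the GC about memory it cannot see, so that it collects sooner.
        GC.AddMemoryPressure(size);
    }

    public bool IsReleased => pointer == IntPtr.Zero;

    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Safety net for callers that forget Dispose().
    ~NativeBuffer()
    {
        Release();
    }

    private void Release()
    {
        if (pointer == IntPtr.Zero) return;

        Marshal.FreeHGlobal(pointer);
        pointer = IntPtr.Zero;

        // Must mirror the amount passed to AddMemoryPressure.
        GC.RemoveMemoryPressure(size);
    }
}
```

Memory pressure only nudges the collector's scheduling; deterministic release through Dispose() remains the primary mechanism, and the finalizer is a fallback.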

One other thing to consider for memory management is if you are implementing any Observer patterns and not disposing of the references correctly.
For instance:
Object A watches Object B
Object B is disposed. If the reference from A to B is not removed properly, the GC will not properly collect the object: because the event handler is still assigned, the GC doesn't see it as an unused resource.
If you have a small set of objects you're working with, this may be irrelevant. However, if you're working with thousands of objects, this can cause a gradual increase in memory over the life of the application.
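A minimal sketch of the observer situation described above, with hypothetical Publisher and Subscriber classes: the publisher's event holds a delegate that references the subscriber, so the subscriber stays reachable until it unsubscribes, which it does here in Dispose().

```csharp
using System;

public sealed class Publisher
{
    public event EventHandler Changed;

    // Number of handlers currently attached; useful for spotting forgotten subscriptions.
    public int SubscriberCount =>
        Changed == null ? 0 : Changed.GetInvocationList().Length;

    public void RaiseChanged()
    {
        Changed?.Invoke(this, EventArgs.Empty);
    }
}

public sealed class Subscriber : IDisposable
{
    private Publisher publisher;

    public int Notifications { get; private set; }

    public Subscriber(Publisher publisher)
    {
        this.publisher = publisher;
        // From here on the publisher references this subscriber through the delegate.
        publisher.Changed += OnChanged;
    }

    private void OnChanged(object sender, EventArgs e)
    {
        Notifications++;
    }

    public void Dispose()
    {
        if (publisher == null) return;

        // Detach so the publisher no longer keeps this instance alive.
        publisher.Changed -= OnChanged;
        publisher = null;
    }
}
```

With thousands of short-lived subscribers attached to one long-lived publisher, skipping the unsubscribe step is exactly what produces the gradual growth described above.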
There are some great memory management software applications to monitor what's going on with the heap of your application. I found great benefit from utilizing .Net Memory Profiler.
HTH

I recommend using .NET Memory Profiler
.NET Memory Profiler is a powerful tool for finding memory leaks and optimizing the memory usage in programs written in C#, VB.NET or any other .NET Language.
.NET Memory Profiler will help you to:
View real-time memory and resource information
Easily identify memory leaks by collecting and comparing snapshots of .NET memory
Find instances that are not properly disposed
Get detailed information about unmanaged resource usage
Optimize memory usage
Investigate memory problems in production code
Perform automated memory testing
Retrieve information about native memory
Take a look at their video tutorials:
http://memprofiler.com/tutorials/

Others have already mentioned the importance of IDisposable, and some of the things to watch out for in your code.
I wanted to suggest some additional resources; I found the following invaluable when learning the details of .NET GC and how to trouble-shoot memory issues in .NET applications.
CLR via C# by Jeffrey Richter is an excellent book. Worth the purchase price just for the chapter on GC and memory.
This blog (by a Microsoft "ASP.NET Escalation Engineer") is often my go-to source for tips and tricks for using WinDbg, SOS, and for spotting certain types of memory leaks. Tess even designed .NET debugging demos/labs which will walk you through common memory issues and how to recognize and solve them.
Debugging Tools for Windows (WinDbg, SOS, etc)

You can use tools like CLR Profiler. It takes some time to learn how to use it correctly, but it is free. (It has helped me find memory leaks several times.)

The best way to ensure that objects get deleted, or in .NET lingo, garbage-collected, is to ensure that all root references (references that can be traced through methods and objects to the first method on a thread's call stack) to an object are set to null.
The GC cannot, and will not, collect an object if there are any rooted references to it, no matter whether it implements IDisposable or not.
Circular references impose no penalty and no possibility of memory leaks, as the GC marks which objects it has visited in the object graph. With delegates or event handlers, however, it is common to forget to remove the reference from an event to a target method, so the object that contains the target method can't be collected as long as the event is rooted.

What are the best ways to get things deleted?
NOTE: the following works only for types containing unmanaged resources. It doesn't help with purely managed types.
Probably the best method is to implement and follow the IDisposable pattern; and call the dispose method on all objects implementing it.
The 'using' statement is your best friend. Loosely put, it will call dispose for you on objects implementing IDisposable.
