RCW Finalizer Access Violation - c#

I am using COM interop for creating a managed plugin into an unmanaged application using VS2012/.NET 4.5/Win8.1. All the interop stuff seems to be going ok, but when I close the app I get an MDA exception telling me AV's have happened while Releasing COM objects the RCW's were holding onto during Finalizing.
This is the call stack:
clr.dll!MdaReportAvOnComRelease::ReportHandledException() + 0x91 bytes
clr.dll!**SafeRelease_OnException**() + 0x55 bytes
clr.dll!SafeReleasePreemp() + 0x312d5f bytes
clr.dll!RCW::ReleaseAllInterfaces() + 0xf3 bytes
clr.dll!RCW::ReleaseAllInterfacesCallBack() + 0x4f bytes
clr.dll!RCW::Cleanup() + 0x24 bytes
clr.dll!RCWCleanupList::ReleaseRCWListRaw() + 0x16 bytes
clr.dll!RCWCleanupList::ReleaseRCWListInCorrectCtx() + 0x9c bytes
clr.dll!RCWCleanupList::CleanupAllWrappers() + 0x2cd1b6 bytes
clr.dll!RCWCache::ReleaseWrappersWorker() + 0x277 bytes
clr.dll!AppDomain::ReleaseRCWs() + 0x120cb2 bytes
clr.dll!ReleaseRCWsInCaches() + 0x3f bytes
clr.dll!InnerCoEEShutDownCOM() + 0x46 bytes
clr.dll!WKS::GCHeap::**FinalizerThreadStart**() + 0x229 bytes
clr.dll!Thread::intermediateThreadProc() + 0x76 bytes
kernel32.dll!BaseThreadInitThunk() + 0xd bytes
ntdll.dll!RtlUserThreadStart() + 0x1d bytes
My guess is that the Application has already destroyed its COM objects, of which some references were passed to the managed plugin - and the call to the IUnknown::Release the RCW makes makes it go boom.
I can clearly see in the output window (VS) that the app has already started unloading some of it's dll's.
'TestHost.exe': Unloaded 'C:\Windows\System32\msls31.dll'
'TestHost.exe': Unloaded 'C:\Windows\System32\usp10.dll'
'TestHost.exe': Unloaded 'C:\Windows\System32\riched20.dll'
'TestHost.exe': Unloaded 'C:\Windows\System32\version.dll'
First-chance exception at 0x00000001400cea84 in VST3PluginTestHost.exe: 0xC0000005: Access violation reading location 0xffffffffffffffff.
First-chance exception at 0x00000001400cea84 in VST3PluginTestHost.exe: 0xC0000005: Access violation reading location 0xffffffffffffffff.
Managed Debugging Assistant 'ReportAvOnComRelease' has detected a problem in 'C:\Program Files\Steinberg\VST3PluginTestHost\VST3PluginTestHost.exe'.
Additional Information: An exception was caught but handled while releasing a COM interface pointer through Marshal.Release or Marshal.ReleaseComObject or implicitly after the corresponding RuntimeCallableWrapper was garbage collected. This is the result of a user refcount error or other problem with a COM object's Release. Make sure refcounts are managed properly. The COM interface pointer's original vtable pointer was 0x406975a8. While these types of exceptions are caught by the CLR, they can still lead to corruption and data loss so if possible the issue causing the exception should be addressed
So I though I would manage the lifetime my self and wrote a ComReference class that calls Marshal.ReleaseComObject. That did not work correctly and after reading up on it I have to agree that calling Marshal.ReleaseComObject in a scenrario where references are passed around freely, is not a good idea.
Marshal.ReleaseComObject Considered Dangerous
So the question is: Is there a way to manage this situation in order not to cause AV's when exiting the host application?

There are only three real solutions to this problem, and I think that interpretting the "Marshall.ReleaseComObject considered dangerous" article as "Don't use Marshall.ReleaseComObject" can mislead you. Your takeaway could just as easily have been "don't share RCWs freely".
Your three solutions are:
1: Change the execution of your host application to unload plugins before it unloads itself. That's easier said than done. If the plugin system of the host process includes a shutdown event, that would be a good place to deal with it. All of your services that are holding on to RCWs need to release them during shutdown.
2: Use Marshall.ReleaseComObject in a Dispose()-like pattern, ensuring that objects are only stored within a local scope in a manner similar to a using block. This is straight-forward to implement, allows you to release the COM references deterministically, and is generally a very good first approach.
3: Use a COM object broker that can hand out reference counted instances of RCWs and then release those objects when no one is using them. Ensure that every consumer of those objects clean-up prior to the application unloading.
Option #2 works fine as long as you don't store/share references to the managed RCW. I would use #2 up until you identify that your COM object has high activation costs and that caching/sharing is relevant.

This is a problem with the native COM reference counts. Your object is being Release()d from native code with refcount=1, it is destroyed, then the CLR comes along and tries Release() it. You need to track down where the reference count is going wrong. It crashes in the CLR because it runs cleanup after the native code is finished.
First step is to track down the type of object that isn't being counted properly. I did this by running gflags.exe against my .exe file and turn on "User mode stack traces". "Full page heap" may help also.
Run the application in windbg. Run .symfix. Run bp clr!SafeReleasePreemp "r rcx; gc"; g to log the interface pointers. When it crashes, the previous log entry should contain the interface pointer that was already destroyed. Run !heap -p -a [address of COM pointer] and it will print the stack of where it was released.
If you're unlucky, it won't crash right away, and the interface pointer that is causing problems won't be the most recent log. If you can run your native COM under the Debug configuration, it may help.
MS made the RCW header available. The members m_pIdentity (offset 0x88 on x64) and m_aInterfaceEntries (offset 0x8 on x64) are of interest. The RCW is in #rdx on entry to SafeReleasePreemp
Next step is to rerun with breakpoints on Interface::AddRef, Interface::QueryInterface, and Interface::Release to see which one is mismatched. _ATL_DEBUG_INTERFACES may help if you're using ATL.

Related

Crash in GC finalizer thread, what's the problem with "DestroyScout"?

I'm facing with a .Net server application, which crashes on an almost weekly basis on a problem in a "GC Finalizer Thread", more exactly at line 798 of "mscorlib.dll ...~DestroyScout()", according to Visual Studio.
Visual Studio also tries to open the file "DynamicILGenerator.gs". I don't have this file, but I've found a version of that file, where line 798 indeed is inside the destructor or the DestroyScout (whatever this might mean).
I have the following information in my Visual Studio environment:
Threads :
Not Flagged > 5892 0 Worker Thread GC Finalizer Thread mscorlib.dll!System.Reflection.Emit.DynamicResolver.DestroyScout.~DestroyScout
Call stack:
[Managed to Native Transition]
> mscorlib.dll!System.Reflection.Emit.DynamicResolver.DestroyScout.~DestroyScout() Line 798 C#
[Native to Managed Transition]
kernel32.dll!#BaseThreadInitThunk#12() Unknown
ntdll.dll!__RtlUserThreadStart() Unknown
ntdll.dll!__RtlUserThreadStart#8() Unknown
Locals (no way to be sure if that $exception object is correct):
+ $exception {"Exception of type 'System.ExecutionEngineException' was thrown."} System.ExecutionEngineException
this Cannot obtain value of the local variable or argument because it is not available at this instruction pointer,
possibly because it has been optimized away. System.Reflection.Emit.DynamicResolver.DestroyScout
Stack objects No CLR objects were found in the stack memory range of the current frame.
Source code of "DynamicILGenerator.cs", mentioning the DestroyScout class (line 798 is mentioned in comment):
private class DestroyScout
{
internal RuntimeMethodHandleInternal m_methodHandle;
[System.Security.SecuritySafeCritical] // auto-generated
~DestroyScout()
{
if (m_methodHandle.IsNullHandle())
return;
// It is not safe to destroy the method if the managed resolver is alive.
if (RuntimeMethodHandle.GetResolver(m_methodHandle) != null)
{
if (!Environment.HasShutdownStarted &&
!AppDomain.CurrentDomain.IsFinalizingForUnload())
{
// Somebody might have been holding a reference on us via weak handle.
// We will keep trying. It will be hopefully released eventually.
GC.ReRegisterForFinalize(this);
}
return;
}
RuntimeMethodHandle.Destroy(m_methodHandle); // <===== line 798
}
}
Watch window (m_methodHandle):
m_methodHandle Cannot obtain value of the local variable or argument because
it is not available at this instruction pointer,
possibly because it has been optimized away.
System.RuntimeMethodHandleInternal
General dump module information:
Dump Summary
------------
Dump File: Application_Server2.0.exe.5296.dmp : C:\Temp_Folder\Application_Server2.0.exe.5296.dmp
Last Write Time: 14/06/2022 19:08:30
Process Name: Application_Server2.0.exe : C:\Runtime\Application_Server2.0.exe
Process Architecture: x86
Exception Code: 0xC0000005
Exception Information: The thread tried to read from or write to a virtual address
for which it does not have the appropriate access.
Heap Information: Present
System Information
------------------
OS Version: 10.0.14393
CLR Version(s): 4.7.3920.0
Modules
-------
Module Name Module Path Module Version
----------- ----------- --------------
...
clr.dll C:\Windows\Microsoft.NET\Framework\v4.0.30319\clr.dll 4.7.3920.0
...
Be aware: the dump arrived on a Windows-Server 2016 computer, I'm investigating the dump on my Windows-10 environment (don't be mistaking on OS Version in the dump summary)!
Edit
What might the destroyscout be trying to destroy? That might be very interesting.
I don't know what exactly is causing this crash, but I can tell you what DestroyScout does.
It's related to creating dynamic methods. The class DynamicResolver needs to clean up related unmanaged memory, which is not tracked by GC. But it cannot be cleaned up until there are definitely no references to the method anymore.
However, because malicious (or outright weird) code can use a long WeakReference which can survive a GC, and therefore resurrect the reference to the dynamic method after its finalizer has run. Hence DestroyScout comes along with its strange GC.ReRegisterForFinalize code in order to ensure that it's the last reference to be destroyed.
It's explained in a comment in the source code
// We can destroy the unmanaged part of dynamic method only after the managed part is definitely gone and thus
// nobody can call the dynamic method anymore. A call to finalizer alone does not guarantee that the managed
// part is gone. A malicious code can keep a reference to DynamicMethod in long weak reference that survives finalization,
// or we can be running during shutdown where everything is finalized.
//
// The unmanaged resolver keeps a reference to the managed resolver in long weak handle. If the long weak handle
// is null, we can be sure that the managed part of the dynamic method is definitely gone and that it is safe to
// destroy the unmanaged part. (Note that the managed finalizer has to be on the same object that the long weak handle
// points to in order for this to work.) Unfortunately, we can not perform the above check when out finalizer
// is called - the long weak handle won't be cleared yet. Instead, we create a helper scout object that will attempt
// to do the destruction after next GC.
As to your crash, this is happening in internal code, and is causing an ExecutionEngineException. This most likely happens when there is memory corruption, when memory is used in a way it wasn't supposed to be.
Memory corruption can happen for a number of reasons. In order of likelihood:
Incorrect use of PInvoke to native Win32 functions (DllImport and asscociated marshalling).
Incorrect use of unsafe (including library classes such as Unsafe and Buffer which do the same thing).
Multi-threaded race conditions on objects which the Runtime does not expect to be used multi-threaded. This can cause such problems as torn reads and memory-barrier violations.
A bug in .NET itself. This can be the easiest to exclude: just upgrade to the latest build.
Consider submitting the crash report to Microsoft for investigation.
Edit from the author:
In order to submit a crash report to Microsoft, the following URL can be used: https://www.microsoft.com/en-us/unifiedsupport. Take into account that this is a paying service and that you might need to deliver your entire source code Microsoft in order to get a full analysis of your crash dump.

When I use Socket.IO, why I got an error An unhandled exception of type 'System.OutOfMemoryException'

I coded a program to get the screen shot and send to the server. Every time, I got a screenshot and turned into base64 then sent it using Socket.IO. (using SocketIOClient.dll)
Dictionary<string, string> image = new Dictionary<string, string>();
image.add("image", "");
private void windowMonitorTimer_Tick(object sender, EventArgs e)
{
image["image"] = windowMonitorManager.MonitorScreen();
client.getSocket().Emit("Shot", image);
}
windowMonitorManager.MonitorScreen() is for return a base64 string. If I do not use client.getSocket().Emit("Shot", image), the program could run correct, but if I add this line, the program stop like 2 seconds(send nearly 80 times) and give me the error :
An unhandled exception of type 'System.OutOfMemoryException' occurred in mscorlib.dll
If I do not send the string as long as this, just a short string "hello", it sends 1600 times then occurs the same problem.
Somebody knows how to debug this problem?
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
I try to test socket.Emit(), and find it has its limit.
For example, I send a string of 10000000, after 88 times, it occurs the out of memory problem.
If I send a string of 5000000, after 170 times, it occurs the same problem.
Out of memory is mostly an exception thrown when the process consumes much larger memory than what is allowed by default system (like 2 GB on a 32 bit system), on a 64 bit system, it is higher, but is still bound by the certain practical limit, it's not the theory value of 2^64, it differs from OS to OS and is also dependent on underlying RAM, but is large enough for a single process, now this situation can happens due to multiple reasons:
Memory leak (most prominent), mostly associated with the unmanaged code calls, if there's a handle or memory allocation that is not de-allocated or freed, over a period of time it leads to huge memory allocation for a process and thus the exception, when system cannot map any more.
Managed code can too leak and I have done that, when objects gets continuously created and they are not de-referenced, i.e they are still reachable in GC context, so you can lead to this scenario, I have done this in my code :)
This is not a null reference or a corruption, so taking direct stack trace will be of little use in this scenario, simply because you may get a different stack every time, it will be like the stack of the process threads, when exception happens and it would be mostly misleading, so do not try that way. Executing method of a thread doesn't mean it caused memory leak and it will be different for different threads.
How to debug:
Number of simple steps can be taken to narrow down, but before anything ensure that you have the debug version with valid pdb files for all the loaded binaries of the process.
To know whether is it a leak, monitor process "working set", "virtual bytes" counters either via task manager or preferably via perfmon, since it is much more accurate, also it provides visual graph.
Now a leak is a leak, so steps like increasing address space in a 32 bit system to 3 GB in place of default 2 GB can only help for sometime, but perfmon will tell you if there's a stabilization point, like in few cases, process needs 2.2 GB of memory, so 2 GB by default is not enough but 3 GB in boot.config and UserVA setting for fine tuning 3 GB will help avoiding exception.
If you are using 64 bit system then that's not a worry point, but ensure that your binary is compiled for X64 or any CPU, a 32 bit binary will run as a WOW process and will have limitation on 64 bit system too.
Also try a small utility handles from sysinternals, running it mutliple times in a batch for a process will provide details of a leaking handle, in terms of number of handles like file, mutex etc allocated
Once you have confirmed a genuine leak, not a settings or configuration or system issue, then comes the memory profilers. In free ones, you get lot of information through free tools like windbg, umdh and leakdiag, they infact point you to exact stack trace which is leaking. umdh and leakdiag are both very good tools, they let you know the leaking function. Leakdiag is more exhaustive than UMDH, but for runtime heap UMDH is good enough
Professional memory profilers like:
Dot Memory - http://www.jetbrains.com/dotmemory/
Ants - Red Gate - http://www.red-gate.com/products/dotnet-development/ants-memory-profiler/
are also very good, i personally find Dot memory much more helpful and it helps in quickly pointing to the leaking function or type, with little effort, provided you have correct symbol files. Both have free download version available
Mostly resolving out of memory exception is a gradual and iterative process, because this one could be hidden deep inside, leaking a chunk memory with every execution and bringing the complete process to knees. Let me know if you need help in using a specific tool, then we can see what more can be done to debug the issue further. Happy Debugging
Sounds to me like there's a bug in the SocketIOClient DLL. Without the DLL I cannot reproduce the problem, but tracking it down sounds easy.
Since C# is a garbage collected language, the only way to get an Out Of Memory (OOM) is if you have too much memory allocated that cannot be tracked down to a 'root object'. There are several kinds of root objects:
Static variables (or threadstatic)
Methods variables (locals / arguments) in your stack trace
All objects that you reference from these two (direct / indirect) will contribute to your memory pressure. If you allocate memory that's not available, the GC will first attempt to free memory before throwing an OOM; if not enough memory is free after the GC completed, an OOM will be thrown.
One obvious reason why this might happen is because you're running a 32-bit process, which is the default in Visual Studio nowadays. This can be fixed in the project properties. However, most processes don't need more than 2 GB of memory, so it's more likely that you 'leaked' memory somewhere. So let's break it down:
Locals or arguments that leak
Ways to solve these kind of OOM:
Open visual studio, ctrl d,e (or debug -> exceptions)
Click OutOfMemoryException -> check the 'throw' box
Run the program.
When the out of memory (OOM) hits, you browse through the nodes in the stack trace (or 'parallel stacks') and check the size of the variables. In most cases it's a single buffer or collection that causes the problem. F.ex. in your case a buffer might fill up with socket data, which is never emptied.
Static variables that leak
Other cases of OOM are usually buffers that granually fill up and have a reference to the main tree. The easiest way to find these is to use a memory profiler, like Red Gate / ANTS memory profiler. Run your program in the profiler, take some snapshots and check for 'large instances'.
In general, I usually try to avoid static variables alltogether, which solves a whole world of problems.
Oh and in this case...
Perhaps it's worth noting that there are a lot of good socket libraries out there... even though I don't know the specifics of SocketIOClient, you might want to consider using a widely supported, proven socket library like WCF/SOAP or Protobuf. There's a lot of material online on how to use these in just about any scenario, so if the problem is in SocketIOClient you might want to consider that...
I would guess you are running your timer too frequently and eating memory to an unsustainable point. Did you try lowering its frequency?
If lowering the frequency doesn't help, your code or the SocketIOClient.dll library may be leaking memory. I suggest that first you review the usage of that library to verify that you are not leaving resources open.

.Net Growing Memory Issue

I have an application which does a bunch of text parsing. After each pass it spits out some info into a database and clears out all the internal state.
My issue is the memeory allocated in Windows Task Mgr / Resource Monitor keeps growing and growing. I've done some profile using .Net Mem Profiler and it looks like it should be going down. Here is a screen shot from the profiler:
But in the Task Mgr after each pass the memory private working set increases. I'd like for the memory to grow as its used and then return to a normal level after each pass, that way I could keep this thing running.
Any advice on what to look for or any ideas what causes this?
A few things to check for are:
Event handlers keeping objects alive, make sure that if the object which subscribes to an event goes out of scope before the object publishing the event that it unsubscribes from the event to prevent the publishing object keep a reference to it.
Make sure that you call dispose on any object which implements IDisposable. Generally speaking objects which implement IDisposable contains resources which need special tidy up.
If you reference any com objects, make sure you release them properly.
You shouldn't ever have to call GC.Collect() within production code.
Task Mgr is not an accurate representation of the memory your application is actually using. Its more a representation of how much memory windows has allotted or planned for your application- if you app needs more, windows can expand this number, but if your app needs less, windows may not re-allocate this memory until another application actually needs it..
I think what you have above is a pretty accurate representation of what your application is actually doing (ie, its not leaking).
This forces .NET to collect all unused objects from memory, therefore reclaiming some of it:
GC.Collect();
GC.WaitForPendingFinalizers();
If you do have a memory leak, you can use the SOS debugging extension to try to find it. This article is also a very good example and a little more complete then what my answer will include.
You can use this in either VS or WinDbg, and the only difference is how you load the dll. For Visual Studio, first enable unmanaged debugging in the debug tab of your project's properties. When it comes time to load it, use .load SOS.dll in the Immediate Window. For WinDbg, either open the executable or attach to the process, and, to load it, use .loadby sos clr for .NET 4 or .loadby sos mscorwks for 2 or 3.5.
After letting the application run for a while, pause it (break all). Now you can load SOS. Upon success, enter !dumpheap -stat. This will list off how much memory each class is using. If this isn't enough to find the leak, the other article I linked goes into more depth on how to locate the memory leak.
One way to solve this is break up your work into separate executable programs. Once a program completes its work and ends, all its memory will be reclaimed by the system.

FileSystemWatcher and System Out of Memory Exception

A little context
There is a wpf based application which i left opened for 2-3 days without performing any activity throws out of memory exception , this is very Weird situation and does not happen all of the time. During this ideal activity , my application does nt perform any activity but just a file system watcher contineously watching a shared location , so i thought that would be a problem but i am not sure. Any suggestion is always welcomed.
Are you adding something to a list\collection when a FileSystemWatcher event happens? You could be doing this directly, or more possibly indirectly if it is non-obvious.
This could eventually lead to OOM and would be dependant on how many events there had been, so the time taken to reach OOM could be highly variable.
FileSystemWatcher on its own will not lead OOM. It maintains an internal buffer, but it will overwrite the buffer if file system event data is not taken via FSW events quickly enough.
So no, the FileSystemWatcher will not lead to OOM on its own - the internal buffer mechanism removes this possibility by design.
windbg (the debugger from Debugger Tools for Windows, included in the Windows SDK. itself included with VS these days) includes a command to dump statistics on the managed heap. Including what kind of types are allocated. This should help identify what objects are not being collected (probably via some reference that should be been cleared).
This should get you started: http://blogs.msdn.com/b/tess/archive/2005/11/25/496973.aspx

How to debug unreleased COM references from managed code?

I have been searching for a tool to debug unreleased COM references, that usually cause e.g. Word/Outlook processes to hang in memory in case the code does not call Marshal.ReleaseCOMObject on all COM instances correctly. (Outlook 2007 partially fixes this for outlook addins, but this is a generic question).
Is there a tool that would display at least a list of COM references (by type) held by managed code? Ideally, it would also display memory profiler-style object trees helping to debug where the reference increment occured.
Debugging at runtime is not that important as being able to attach to a hung process - because the problem typically occurs when the code is done with the COM interface and someone forgot to release something - the application (e.g. winword) hangs in memory even after the calling managed application quits.
If such tool does not exist, what is the (technical?) reason? It would be very useful for debugging a lot of otherwise very hard to find problems when working with COM interop.
Not quite the answer (or as simple as) you are looking for, but ReleaseCOMObject (as does the native Release which it wraps) returns an Integer which should indicate the number of outstanding references remaining. You could add code to your project to explicitly do an AddRef so you could then Release, and see the new count, and see if it is what you are expecting, etc.
Here is my answer to debugging COM references using debugger
Troubleshooting a COM+ application deadlock

Categories