I built an add-on to Microsoft Word. When the user clicks a button, it runs a number of processes that export a list of Microsoft Word documents to Filtered HTML. This works fine.
Where the code falls down is in processing large numbers of files. After the file conversions are done and I call the next function, the app crashes and I get this information from Visual Studio:
Managed Debugging Assistant 'DisconnectedContext' has detected a problem in 'C:\Program Files\Microsoft Office\root\Office16\WINWORD.EXE'.
Additional information: Transition into COM context 0x56255b88 for
this RuntimeCallableWrapper failed with the following error: System
call failed. (Exception from HRESULT: 0x80010100
(RPC_E_SYS_CALL_FAILED)). This is typically because the COM context
0x56255b88 where this RuntimeCallableWrapper was created has been
disconnected or it is busy doing something else. Releasing the
interfaces from the current COM context (COM context 0x56255cb0). This
may cause corruption or data loss. To avoid this problem, please
ensure that all COM contexts/apartments/threads stay alive and are
available for context transition, until the application is completely
done with the RuntimeCallableWrappers that represents COM components
that live inside them.
After some testing, I realized that if I simply remove all the code after the file conversions, there are no problems. To work around this, I placed the remainder of my code behind a second button.
The problem is I don't want to give the user two buttons. After reading various other threads, it sounds like my code has a memory or threading issue. The answers I am reading do not help me truly understand what to do next.
I feel like this is what I want to do:
1- Run conversion.
2- Close thread/cleanup memory issue from conversion.
3- Continue running code.
Unfortunately, I really don't know how to do #2 or if it is even possible. Your help is very much appreciated.
"... or it is busy doing something else"
The managed debugging assistant diagnostic you got is pretty gobbledygooky but that's the part of the message that accurately describes the real problem. You have a firehose problem, the 3rd most common issue associated with threading. The mishap is hard to diagnose because this goes wrong inside the Word plumbing and not your code.
Trying not to commit the same gobbledygook sin myself, what goes wrong is that the interop calls you make into the Office program are queued, waiting for their turn to get executed. The underlying "system call" that the error code hints at is PostMessage(). Wherever there is a queue, there is a risk that the queue gets too large. That happens when the producer (your program) is adding items to the queue far faster than the consumer (the Office program) removes them. The firehose problem. Unless the producer slows down, the queue will grow without bounds and something is going to fail if it is allowed to grow endlessly; at a minimum the process runs out of memory.
It is not allowed to get close to that problem. The underlying queue that PostMessage() uses is protected by the OS. Windows fails the call when the queue already contains 10,000 messages. That's a fatal error that RPC does not know how to recover from, or rather should not try to recover from. Something is amiss and it isn't pretty. It returns an error code to your program to tell you about it. That's RPC_E_SYS_CALL_FAILED. Nothing much better happens in your program, the CLR doesn't know how to recover from it either, nor does your code. So the show is over, the interop call you made got lost and was not executed by Word.
Finding a completely reliable workaround for this awkward problem is not that straightforward. Beware that this can happen on any interop call, so catching the exception and trying again is pretty drastically impractical. But do keep in mind that the Q+D fix is very simple. The plain problem is that your program is running too fast; slowing it down with a Thread.Sleep() or Task.Delay() call is quite crude but will always fix the issue. Well, assuming you delay enough.
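A minimal sketch of that crude slow-down, assuming the conversion is a loop over file paths. The Throttle class, the batch size and the pause length are all made up for illustration; nothing here comes from the asker's add-in, and the right delay has to be found by experiment.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class Throttle
{
    // Runs 'work' for each item and sleeps for 'pause' after every
    // 'batchSize' items, giving the consumer time to drain its queue.
    // Returns the number of pauses taken.
    public static int ForEachThrottled<T>(
        IEnumerable<T> items, Action<T> work, int batchSize, TimeSpan pause)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException("batchSize");

        int processed = 0;
        int pauses = 0;
        foreach (T item in items)
        {
            work(item);
            processed++;
            if (processed % batchSize == 0)
            {
                Thread.Sleep(pause); // crude, but it slows the producer down
                pauses++;
            }
        }
        return pauses;
    }
}
```

It could be called as Throttle.ForEachThrottled(paths, ConvertToFilteredHtml, 10, TimeSpan.FromMilliseconds(200)), where ConvertToFilteredHtml stands in for whatever method makes the interop calls for one document.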
I think, but don't know for a fact because nobody ever posts repro code, that this issue is also associated with using a console mode app or a worker thread in your program. If it is a console mode app then try applying the [STAThread] attribute to your Main() method. If it is a worker thread then call Thread.SetApartmentState() before starting the thread, but beware it is very important to also create the Application interface on that worker thread. Not otherwise a workaround for an add-in.
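For the worker-thread case, a sketch of what setting the apartment state looks like. StaRunner is a hypothetical helper name, and this is Windows-only since apartments are a COM concept. The Office Application object would have to be created inside the delegate, on the same thread.

```csharp
using System;
using System.Threading;

static class StaRunner
{
    // Runs 'work' on a dedicated STA thread and blocks until it finishes.
    // Any exception from 'work' is rethrown on the calling thread.
    public static void Run(Action work)
    {
        Exception failure = null;
        var thread = new Thread(() =>
        {
            try { work(); }
            catch (Exception ex) { failure = ex; }
        });
        thread.SetApartmentState(ApartmentState.STA); // must happen before Start()
        thread.Start();
        thread.Join();
        if (failure != null)
            throw new InvalidOperationException("Work on the STA thread failed.", failure);
    }
}
```

For a console app the equivalent is simply putting [STAThread] on Main().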
If neither of those workarounds is effective, or they are too impractical, then consider that you can automagically slow your program down, and ensure the queue is emptied, by occasionally reading something back from the Office program. Something silly, any property getter call will do. Necessarily you can't get the property value until the Office program catches up. That can still fail; there is also a 60 second time-out on the interop call. But that's something you can fix: you can call CoRegisterMessageFilter() in your program to install a callback that runs when the timeout trips. Very gobbledygooky as well, but the cut-and-paste code is readily available.
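The cut-and-paste code usually looks something like the sketch below, modelled on the widely circulated IOleMessageFilter sample. The class name MessageFilter is arbitrary, and the return values (retry when the server says "retry later", otherwise cancel) are that sample's choices, not something this answer prescribes. CoRegisterMessageFilter() only works on an STA thread, and whether installing a filter is appropriate inside an in-process add-in is not something confirmed here.

```csharp
using System;
using System.Runtime.InteropServices;

[ComImport, Guid("00000016-0000-0000-C000-000000000046"),
 InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IOleMessageFilter
{
    [PreserveSig]
    int HandleInComingCall(int dwCallType, IntPtr hTaskCaller,
                           int dwTickCount, IntPtr lpInterfaceInfo);
    [PreserveSig]
    int RetryRejectedCall(IntPtr hTaskCallee, int dwTickCount, int dwRejectType);
    [PreserveSig]
    int MessagePending(IntPtr hTaskCallee, int dwTickCount, int dwPendingType);
}

class MessageFilter : IOleMessageFilter
{
    // Install the filter for the current (STA) thread.
    public static void Register()
    {
        IOleMessageFilter oldFilter;
        CoRegisterMessageFilter(new MessageFilter(), out oldFilter);
    }

    // Remove the filter again when the interop work is done.
    public static void Revoke()
    {
        IOleMessageFilter oldFilter;
        CoRegisterMessageFilter(null, out oldFilter);
    }

    int IOleMessageFilter.HandleInComingCall(int dwCallType, IntPtr hTaskCaller,
                                             int dwTickCount, IntPtr lpInterfaceInfo)
    {
        return 0; // SERVERCALL_ISHANDLED
    }

    int IOleMessageFilter.RetryRejectedCall(IntPtr hTaskCallee, int dwTickCount,
                                            int dwRejectType)
    {
        if (dwRejectType == 2) // SERVERCALL_RETRYLATER: the server is busy
            return 99;         // a value below 100 means "retry immediately"
        return -1;             // anything else: cancel the call
    }

    int IOleMessageFilter.MessagePending(IntPtr hTaskCallee, int dwTickCount,
                                         int dwPendingType)
    {
        return 2; // PENDINGMSG_WAITDEFPROCESS
    }

    [DllImport("ole32.dll")]
    private static extern int CoRegisterMessageFilter(
        IOleMessageFilter newFilter, out IOleMessageFilter oldFilter);
}
```

Typical use is MessageFilter.Register() before the batch of interop calls and MessageFilter.Revoke() afterwards, both on the thread that makes the calls.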
Related
I have a strange (beginner) situation. I started developing an asynchronous application which communicates with external devices via a serial port.
The GUI and methods look good. I can communicate with a device on the serial port well for the first 'test lap'. But when I execute the same command in the same application (Visual Studio debug mode, my application not restarted) for a second time, the app freezes somewhere. I can't find the affected line/part/method etc.
A further strange experience: when I left the app in this frozen state for long enough, Visual Studio eventually raised an exception (after half an hour or a little more).
Managed Debugging Assistant 'DisconnectedContext' : 'Transition into COM context 0x16057d0 for this RuntimeCallableWrapper failed with the following error: System call failed.
(Exception from HRESULT: 0x80010100 (RPC_E_SYS_CALL_FAILED)). This is typically because the COM context 0x16057d0 where this RuntimeCallableWrapper was created has been disconnected or it is busy doing something else.
Releasing the interfaces from the current COM context (COM context 0x1605888). This may cause corruption or data loss.
To avoid this problem, please ensure that all COM contexts/apartments/threads stay alive and are available for context transition, until the application is completely done with the RuntimeCallableWrappers that represents COM components that live inside them.'
Additionally, the commands were sent in the background to the serial port, but the GUI and the code do not execute as in the first lap. Yes, I use the async / await Task.Run(() => ...) approach to try to avoid freezing the GUI. And when my tasks are done, I close the port to avoid unexpected hidden behavior. The request and core logic require closing the serial port.
When I build the code, the compiler does not indicate an error.
I tried using breakpoints and step-into in debug mode, without success. In other words, my app runs in debug mode and sends some commands and receives responses fine, but on a further manual test (lap) in the same app instance (not closed or terminated) something gets stuck, freezes, or stops responding, and I want to find exactly where... :(
So, can someone advise me on how I can find the root cause (line) of my problem? I am aware that it is quite hard to answer this question without an example; however, the code is complex. Therefore, I would like your help or advice.
You can try to run the app with my Runtime Flow extension. It may help you visualize what code is executed and where it hangs.
Is there a way to at least postpone termination of a managed app (by a few dozen milliseconds) and set some shared flag to give other threads a chance to terminate gracefully (the thread that hit the stack overflow obviously wouldn't execute anything further)? I'm contemplating using a JIT debugger or CLR hosting for this. I'm curious whether anybody has tried this before.
Why would I want to do something so wrong?:
Without going into too much detail, imagine this analogy: you are in a casino betting on roulette and suddenly find out that the roulette wheel is an unreliable fake. So you want to leave the casino immediately, BUT you would likely want to collect your bets from the table first.
Unfortunately I cannot use a separate process for this as there are very tight performance requirements.
Tried and didn't work:
.NET behavior for StackOverflowException (and contradicting info on MSDN) has been discussed several times on SO - to quickly sum up:
HandleProcessCorruptedStateExceptionsAttribute (e.g. on appdomain unhandled exception handler) doesn't work
ExecuteCodeWithGuaranteedCleanup doesn't work
legacyUnhandledExceptionPolicy doesn't work
There may be a few other ways to try to handle StackOverflowExceptions, but it seems apparent that the CLR terminates the whole process, as mentioned in this great answer by Hans Passant.
Considering to try:
JIT debugger - leave the thread with exception frozen, set some
shared flag (likely in pinned location) and thaw other threads for a
short time.
CLR hosting and setting unhandled exception policy
Do you have any other idea? Or any experience (successful/unsuccessful) with those two ways?
The word "fake" isn't quite the correct one for your casino analogy. There was a magnitude 9 earthquake and the casino building along with the roulette table, the remaining chips and the player disappeared in a giant cloud of smoke and dust.
The only shot you have at running code after an SOE is to stay far away from that casino, it has to run in another process. A "guard" process that starts your misbehaving program, it can use the Process.ExitCode to detect the crash. It will be -1073741571 (0xc00000fd). The process state is gone, you'll have to use one of the .NET out-of-process interop methods (like WCF, named pipes, sockets, memory-mapped file) to make the guard process aware of things that need to be done to clean up. This needs to be transactional, you cannot reason about the exact point in time that the crash occurred since it might have died while updating the guard.
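A sketch of the detection half of such a guard process, assuming Windows and a child program started from a path. The Guard class and its callback are invented for illustration, and the out-of-process channel for the cleanup data is deliberately left out.

```csharp
using System;
using System.Diagnostics;

static class Guard
{
    // STATUS_STACK_OVERFLOW (0xC00000FD) as a signed 32-bit exit code.
    public const int StackOverflowExitCode = unchecked((int)0xC00000FD);

    public static bool DiedOfStackOverflow(int exitCode)
    {
        return exitCode == StackOverflowExitCode;
    }

    // Starts the guarded program, waits for it to end, and invokes
    // 'onStackOverflow' if the exit code says the stack overflowed.
    // Returns the child's exit code.
    public static int RunGuarded(string exePath, string arguments, Action onStackOverflow)
    {
        var startInfo = new ProcessStartInfo(exePath, arguments) { UseShellExecute = false };
        using (Process child = Process.Start(startInfo))
        {
            if (child == null)
                throw new InvalidOperationException("Could not start " + exePath);

            child.WaitForExit();
            int exitCode = child.ExitCode;
            if (DiedOfStackOverflow(exitCode))
                onStackOverflow(); // clean up using whatever state the child reported
            return exitCode;
        }
    }
}
```

The callback runs in the guard, so it can only act on state the child managed to hand over before it died, which is why that hand-over needs to be transactional.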
Do beware that this is rarely worth the effort. Because an SOE is pretty indistinguishable from an everyday process abort. Like getting killed by Task Manager. Or the machine losing power. Or being subjected to the effects of an earthquake :)
A StackOverflowException is an immediate and critical exception from which the runtime cannot recover - that's why you can't catch it, or recover from it, or anything else. In order to run another method (whether that's a cleanup method or anything else), you have to be able to create a stack frame for that method, and the stack is already full (that's what a StackOverflowException means!). You can't run another method because running a method is what causes the exception in the first place!
Fortunately, though, this kind of exception is always caused by program structure. You should be able to diagnose and fix the error in your code: when you get the exception, you will see in your call stack that there's a loop of one or more methods recursing indefinitely. You need to identify what the faulty logic is and fix it, and that'll be a lot easier than trying to fix the unfixable exception.
I'm getting a strange error on a SharpDX program I made.
The program contains one form MainForm, which inherits from SharpDX.Windows.RenderForm (I'm doing Direct3D 9). I have some logic that kills the program by calling MainForm.Close(), and it works perfectly.
However, when I close the form with the X button, or by double clicking the top left corner of the screen, the program ends with code -1073610751 (0xc0020001).
This is a relatively minor annoyance, because it only happens when the program is finishing, so it doesn't really matter if it exits with an error, because it is actually finishing.
However, this error does not happen when I set a breakpoint at the last line of my Main(). If I do so, and then close the window as I explained, the breakpoint gets hit, and resuming ends the program with code 0.
Apart from SharpDX and one pure C DLL I am calling to one-shot process some data, I am not doing mixed code, or any other weird stuff.
I've looked around, but this code appears to be related to string bindings? Other people seem to have this problem when doing weird mixed C++/CLI stuff, but I'm not doing anything like that.
Any ideas? At least on how to get more precise information about this error code?
It is a very low-level RPC error. RPC is likely to be used in your program, since it is the underlying protocol on top of which COM runs. There are plenty of candidates: SharpDX itself uses the COM interop layer to make DirectX calls, and DirectX itself is very likely to make such calls to your video driver.
It is also the kind of error code you'd expect to get triggered if there's a shutdown-order problem. Like using a COM interface after it was already released. Shutting down a program cleanly can be a difficult problem to solve, especially when there are lots of threads. There are in any DirectX app. It is also very easy to ignore such a problem, even if it is known and recorded in somebody's bug database. Because, as you noted, the program otherwise shuts down okay without any nasty exceptions. RPC already prevented it from blowing up, you are seeing the error code it generated.
There's very little you can do yourself about this problem, this is code you did not write and you'll never find the programmer who did. If you see a first-chance exception notification in the Output window then you could enable the unmanaged debugger, use Debug + Exceptions and tick the Thrown checkbox for Win32 exception, enable the Microsoft Symbol server and you'll get a stack trace when the exception is thrown. Beware this will be in the bowels of native code with no source to look at. But it could pin-point the DLL that's causing the problem. Still nothing you can do to fix that DLL. I'd recommend a video driver update, the most common source of trouble. That's about as far as you can take it.
I'm getting close to desperate.. I am developing a field service application for Windows Mobile 6.1 using C# and quite some p/Invoking. (I think I'm referencing about 50 native functions)
Under normal circumstances this goes without any problem, but when I start stressing the GC I get a nasty 0xC0000005 error which seems uncatchable. In my test I'm rapidly closing and opening a dialog form (the form did make use of native functions, but for testing I commented these out) and after a while the Windows Mobile error reporter comes around to tell me that there was a fatal error in my application.
My code uses a try-catch around the Application.Run(masterForm); and hooks into the CurrentDomain.UnhandledException event, but the application still crashes. Even when I attach the debugger, Visual Studio just tells me "The remote connection to the device has been lost" when the exception occurs.
Since I didn't succeed in catching the exception in the managed environment, I tried to make sense of the Error Reporter log file. But this doesn't make any sense; the only consistent thing about the error is the application in which it occurs.
The thread in which the error occurs is unknown to me, the module in which the error occurs differs from time to time (I've seen my application.exe, WS2.dll, netcfagl3_5.dll and mscoree3_5.dll), and even the error code is not always the same (most of the time it's 0xC0000005, but I've also seen a 0x80000002 error, which is a warning, judging by the first byte?).
I tried debugging through BugTrap, but strangely enough this crashes with the same error code (0xC0000005). I tried to open the kdmp file with Visual Studio, but I can't seem to make any sense of it because it only shows me disassembly when I step into the error (unless I have the right .pdb files, which I don't). Same goes for WinDbg.
To make a long story short: I frankly don't have a single clue where to look for this error, and I'm hoping some bright soul on stackoverflow does. I'm happy to provide some code but at this moment I don't know which piece to provide..
Any help is greatly appreciated!
[EDIT May 3rd 2010]
As you can see in my comment to Hans I retested the whole program after I uncommented all P/Invokes, but that did not solve my problem. I tried reproducing the error with as little code as possible and eventually it looks like multi-threaded access is the one giving me all the problems.
In my application I have a user control that functions as a finger / flick scroll list. In this control I use a bitmap for each item in the list as a canvas. Drawing on this canvas is handled by a separate thread, and when I disable this thread the error seems to disappear. I'll do some more tests on this and will post the results here.
Catching this exception is not an option. It is the worst kind of heart attack a thread can suffer, the CPU has detected a serious problem and cannot continue running code. This is invariably caused by misbehaving unmanaged code, it sounds like you've got plenty of it running in your program. You need to focus on debugging that unmanaged code to get somewhere.
The two most common causes of an AV are
Heap corruption. The unmanaged code has written data to the heap improperly, destroying the structural integrity of the heap. Typically caused by overflowing the boundary of an allocated block of memory. Or using a heap block after it was freed. Very hard to diagnose, the exception will be raised long after the damage was done.
Stack corruption. Most typically caused by overflowing the boundaries of an array that was allocated on the stack. This can overwrite the values of other variables on the stack or destroy the function return address. A bit easier to diagnose, it tends to repeat well and has an immediate effect. One side-effect is that the debugger loses its ability to display the call stack right after the damage was done.
Heap corruption is the likely one and the hard one. This is most typically tackled by debugging the code in the debug build with a debug allocator that watches the integrity of the heap. The <crtdbg.h> header provides one. It's not a guaranteed approach, you can have some really nasty Heisenbugs that only rear their head in the Release build. Very few options available then, other than careful code review. Good luck, you'll need it.
It turns out to be an exception caused by Interlocked.
In my code there is an integer _drawThreadIsRunning which is set to 1 when the draw-thread is running, and set to 0 otherwise. I set this value using Interlocked:
if (Interlocked.Exchange(ref _drawThreadIsRunning, 1) == 0) { /* run thread */ }
When I change this line the whole thing works, so it seems that there is a thread-safety problem somewhere, but I can't figure it out (i.e. I don't want to waste more time figuring it out).
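For reference, a sketch of how a flag like _drawThreadIsRunning is commonly paired with a reset when the thread ends. This is not a diagnosis of the crash above, which was never pinned down; DrawWorker and its members are hypothetical names, and the code was written against desktop .NET rather than checked on the Compact Framework.

```csharp
using System;
using System.Threading;

sealed class DrawWorker
{
    private int _drawThreadIsRunning; // 0 = idle, 1 = running

    // Starts 'draw' on a background thread unless one is already running.
    // Returns true if this call started the thread.
    public bool TryStart(Action draw)
    {
        // Atomically flip 0 -> 1; a concurrent caller sees 1 and backs off.
        if (Interlocked.CompareExchange(ref _drawThreadIsRunning, 1, 0) != 0)
            return false;

        var thread = new Thread(() =>
        {
            try { draw(); }
            finally
            {
                // Always release the flag, even if draw() throws.
                Interlocked.Exchange(ref _drawThreadIsRunning, 0);
            }
        });
        thread.IsBackground = true;
        thread.Start();
        return true;
    }

    public bool IsRunning
    {
        // CompareExchange with identical values is an atomic read.
        get { return Interlocked.CompareExchange(ref _drawThreadIsRunning, 0, 0) == 1; }
    }
}
```

The flag only prevents two draw threads from running at once. It does nothing to protect the bitmaps themselves, so access to those from the UI thread and the draw thread still needs its own synchronization.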
Thanks for the help guys!
I'm writing an app for WM that handles incoming SMS events. I tried making it multi-threaded (using ThreadPool.QueueUserWorkItem) where the SmsMessage was passed along. However, I noticed that when I did that, the program would only handle the first SMS event. Afterwards, NO SMS messages were received by the device at all! But when the program exited, all the missed messages arrived.
Based on that, I'd guess that the answer to my question is that SmsMessage objects are NOT thread-safe, even though there's not really an indication that that's the case.
So what if we want to try and thread an SmsMessage Object? I'll try that out tonight by making a copy of the SmsMessage (probably by using the constructor w/ item id), or I'll create an empty one and copy the fields manually.
DISCOVERY:
I narrowed down my problem. I was able to get everything to work in a background thread when I copied the SmsMessage into my own object, taking care not to reference any of the SmsMessage's objects. SMSs went flying through with no problem.
However, when I set up a MessageInterceptor to launch an application and, within that application instance, use a background thread to send an SMS, the application works fine, but after it exits my code it crashes, displays "There was an error in yourapp.exe" and asks me if I want to send the crash data to MS. I could never figure out what that error was, but I found out that if I sent the SMS from the same thread that launched the application, everything worked fine.
So, threading when the app is open = fine, as long as you don't pass/use the SmsMessage
Threading when the app is externally launched = fine, as long as you don't send an SmsMessage in another thread.
According to the MSDN Library (Microsoft.WindowsMobile.PocketOutlook.SmsMessage):
Any public static (Shared in Visual
Basic) members of this type are
thread-safe. Any instance members are
not guaranteed to be thread-safe.
So the answer to your question is: it is not thread-safe.
EDIT: I'm glad you managed to get this thing working. I'm pretty sure it's some race condition or unmanaged resource handling issue deep down in the PocketOutlook library.
Personally, I think you should keep all your messaging-related code in a single thread, from the point the MessageInterceptor is created until the point it gets disposed, and only pass objects that are written by you so you know that they don't have dodgy unmanaged dependencies (that's kind of what you did, I think). This should be enough to avoid these problems.
What's the point of having 2 SMS interceptor threads anyway? It's not like a phone could receive 2 text messages at the same time.
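A sketch of the "only pass your own objects" idea: copy the fields you need into a plain class on the messaging thread and hand that copy to a single worker. SmsSnapshot and SnapshotQueue are invented names, and the queue is only correct with one consumer thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Plain managed copy of the message fields; holds no PocketOutlook objects.
sealed class SmsSnapshot
{
    public readonly string Sender;
    public readonly string Body;
    public readonly DateTime Received;

    public SmsSnapshot(string sender, string body, DateTime received)
    {
        Sender = sender;
        Body = body;
        Received = received;
    }
}

// Minimal hand-off queue: any thread may enqueue, ONE worker thread dequeues.
sealed class SnapshotQueue
{
    private readonly Queue<SmsSnapshot> _items = new Queue<SmsSnapshot>();
    private readonly object _gate = new object();
    private readonly AutoResetEvent _available = new AutoResetEvent(false);

    public void Enqueue(SmsSnapshot snapshot)
    {
        lock (_gate) { _items.Enqueue(snapshot); }
        _available.Set(); // wake the worker if it is waiting
    }

    // Blocks until an item is available.
    public SmsSnapshot Dequeue()
    {
        while (true)
        {
            lock (_gate)
            {
                if (_items.Count > 0) return _items.Dequeue();
            }
            _available.WaitOne();
        }
    }
}
```

In the message-received handler the copy would be built from the SmsMessage's own properties (something like the sender address, body and received time; the exact property names should be checked against the PocketOutlook documentation) and enqueued, so the SmsMessage itself never leaves the thread that received it.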
Without seeing more code, it's hard to guess what's happening. I strongly suspect that it was a problem with your threading code rather than with SmsMessage.
If you could explain the architecture of your application, that would help a lot. I wouldn't be surprised to find you'd got a deadlock somewhere which was blocking everything.