Handling rude application aborts in .NET - c#

I know I'm opening myself to a royal flaming by even asking this, but I thought I would see if StackOverflow has any solutions to a problem that I'm having...
I have a C# application that is failing at a client site in a way that I am unable to reproduce locally. Unfortunately, it is very difficult (impossible) for me to get any information at all that helps in isolating the source of the problem.
I have in place a rather extensive error monitoring framework which is watching for unhandled exceptions in all the usual places:
Backstop exception handler in threads I control
Application.ThreadException for WinForms exceptions
AppDomain.CurrentDomain.UnhandledException
All of these log detailed information to a place where I have access to it.
This has been very useful in the past to identify issues in production code, but has not been giving me any information at all about the current series of issues.
My best guess is that the core issue is one of the "rude" exception types (thread abort, out of memory, stack overflow, access violation, etc.) escalating to a rude shutdown that rips down the process before I have a chance to see what is going on.
Is there anything that I can be doing to snapshot information as my process is crashing that would be useful? Ideally, I would be able to write out my custom log format, but I would be happy if I could have a reliable way of ensuring that a crash dump is written somewhere.
I was hoping that I could implement a class deriving from CriticalFinalizerObject and have it spit out a last-chance error log when it is finalized, but that doesn't seem to be triggered in the StackOverflow scenario which I tested.
I am unable to use Windows Error Reporting and friends due to the lack of a code signing certificate.
I'm not trying to "recover" from arbitrary exceptions, I'm just trying to make a note of what went wrong on the way down.
Any ideas?

You could try creating a minidump file. This is a C++ API, but it should be possible to write a small C++ program that starts your application, keeps a handle to the process, waits on the process handle, and then uses the process handle to create a minidump when the application dies.
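The same native call can also be reached from C# through P/Invoke. Below is a minimal sketch, assuming the `MiniDumpWriteDump` export of `dbghelp.dll`; the `DumpWriter` class, its method names and the file naming scheme are illustrative and not part of the original answer. Note that a dump generally has to be captured while the target process still exists, so a watcher would need to trigger this before the process is fully torn down.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.IO;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

public static class DumpWriter
{
    // MINIDUMP_TYPE flags from dbghelp.h.
    private const uint MiniDumpNormal = 0x00000000;
    private const uint MiniDumpWithFullMemory = 0x00000002;

    [DllImport("dbghelp.dll", SetLastError = true)]
    private static extern bool MiniDumpWriteDump(
        IntPtr hProcess,
        uint processId,
        SafeFileHandle hFile,
        uint dumpType,
        IntPtr exceptionParam,
        IntPtr userStreamParam,
        IntPtr callbackParam);

    // Hypothetical naming scheme: <process name>_<UTC timestamp>.dmp
    public static string BuildDumpPath(string directory, string processName, DateTime utcNow)
    {
        string stamp = utcNow.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture);
        return Path.Combine(directory, processName + "_" + stamp + ".dmp");
    }

    // Writes a dump of a process that is still alive and returns the file path.
    public static string WriteDump(Process target, string directory, bool fullMemory)
    {
        string path = BuildDumpPath(directory, target.ProcessName, DateTime.UtcNow);
        using (var file = new FileStream(path, FileMode.Create, FileAccess.ReadWrite, FileShare.None))
        {
            uint type = fullMemory ? MiniDumpWithFullMemory : MiniDumpNormal;
            bool ok = MiniDumpWriteDump(target.Handle, (uint)target.Id, file.SafeFileHandle,
                                        type, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero);
            if (!ok)
            {
                int error = Marshal.GetLastWin32Error();
                throw new InvalidOperationException(
                    "MiniDumpWriteDump failed, error 0x" + error.ToString("X8"));
            }
        }
        return path;
    }
}
```

A launcher could call `DumpWriter.WriteDump(process, @"C:\dumps", false)` on a `Process` it started itself, since it then already holds a handle with sufficient access.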

If you have done what you claim:
Try-Catch on the Application.Run
Unhandled Domain Exceptions
Unhandled Thread Exceptions
Try Catch handlers in all threads
Then you would have caught the exception except perhaps if it is being thrown by a third party or COM component.
You certainly haven't given enough information.
What events does the client say lead up to the exception?
What COM or third party components do you use? (Do you properly instance and reference these components? Do you pass valid arguments to COM function calls?)
Do you make use of any unmanaged or unsafe code?
Are you positive that you have all throw-capable calls covered with try-catch?
I'm just saying that no one can offer you any helpful advice unless you post a heck of a lot more information, and even then we can probably only speculate as to the source of your problem.
Have a set of fresh eyes look at your code.
Some errors cannot be caught by logging.
See this similar question for more details:
StackOverflowException in .NET
Here's a link explaining asynchronous exceptions (and why you can't recover from them):
http://www.bluebytesoftware.com/blog/PermaLink.aspx?guid=c1898a31-a0aa-40af-871c-7847d98f1641

Related

C# : catch all errors/exceptions of a mixed managed/unmanaged process

I have a big and complex process that runs on a production environment that's basically a WPF user interface developed in C#. It also hosts threads and DLL's written in C++ unmanaged and managed code.
Typically, if an exception is raised, it's caught and the related stack dump is written to a log file for post-mortem debugging purposes. Unfortunately, from time to time, the application crashes without writing any information to the log, so we have no clue about what is causing the crash.
Does anybody know how to detect and eventually trace all the causes that make the application crash and are not detected with a simple try-catch block?
To give an example, I saw that StackOverflowException is not caught, and errors like 0xc0000374 coming from unmanaged code are not detected either. This is not a matter of debugging it. I know I can attach a debugger to the system and try to reproduce the problem. But as I said, this is a production system and I have to analyze issues coming from the field after the problem has occurred.
Unlike C# exception handling, a C++ try/catch does not catch hardware exceptions such as access violations or stack overflows, since C++ code runs unmanaged, directly on the CPU.
For post-crash analysis I would suggest using something like breakpad. breakpad will create a dump file which will give you very useful information such as a call-stack, running threads and stack/heap memory depending on your configuration.
This way you would not need to witness the crash happening or even try to reproduce it, which I know from experience can be very difficult. All you would need is a way to retrieve these crash dumps from your users' devices.
You can log exceptions by subscribing to the AppDomain.UnhandledException event. Its args.ExceptionObject argument is of type object and is designed not to be limited to C# exceptions, so you can call its ToString method to log it somewhere.
Also check MSDN docs for its limitations. For instance:
Starting with the .NET Framework 4, this event is not raised for exceptions that corrupt the state of the process, such as stack overflows or access violations, unless the event handler is security-critical and has the HandleProcessCorruptedStateExceptionsAttribute attribute.
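A minimal sketch of that subscription follows. The `LastChanceLogger` class, its method names and the log path parameter are hypothetical; only `AppDomain.CurrentDomain.UnhandledException`, `ExceptionObject` and `IsTerminating` come from the framework. Because `ExceptionObject` is typed as `object`, the sketch formats it without assuming it derives from `Exception`. Subject to the limitation quoted above, this handler will not see corrupted-state failures by default.

```csharp
using System;
using System.IO;

public static class LastChanceLogger
{
    // Formats whatever object was thrown; it is not required to be an Exception.
    public static string Describe(object exceptionObject, bool isTerminating)
    {
        string kind = exceptionObject == null ? "(null)" : exceptionObject.GetType().FullName;
        string text = exceptionObject == null ? "(no details)" : exceptionObject.ToString();
        return "Unhandled [" + kind + "] terminating=" + isTerminating
               + Environment.NewLine + text;
    }

    // Call once at startup, before any other application code runs.
    public static void Install(string logPath)
    {
        AppDomain.CurrentDomain.UnhandledException += (sender, e) =>
        {
            try
            {
                File.AppendAllText(logPath,
                    Describe(e.ExceptionObject, e.IsTerminating) + Environment.NewLine);
            }
            catch
            {
                // Nothing sensible left to do if logging itself fails.
            }
        };
    }
}
```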
Solved! I followed Mohamad Elghawi's suggestion and integrated breakpad. After struggling a lot to get it compiling, linking and working under Visual Studio 2008, I was able to catch critical system exceptions and generate a crash dump. I'm also able to generate a dump on demand, which is useful when the application gets stuck for some reason and I'm able to identify this with an external thread that monitors all the others.
Pay attention! The Visual Studio solution isn't included in the git repo, and, contrary to what some threads wrongly state, neither is the gyp tool. You have to download the gyp tool separately and work a bit on the .gyp files inside the breakpad tree in order to generate a proper solution. Furthermore, some include files and definitions are missing if you want to compile it in Visual Studio 2008, so you also have to take care of that.
Thanks guys !
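The "external thread that monitors all the others" mentioned in the accepted solution could be sketched roughly as below. This is not the poster's code: `HeartbeatMonitor` and its methods are hypothetical, and timestamps are passed in explicitly so the logic stays easy to test. Each worker reports a heartbeat periodically, and the monitor thread asks which workers have been silent for longer than a timeout before requesting a dump on demand.

```csharp
using System;
using System.Collections.Generic;

public sealed class HeartbeatMonitor
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, DateTime> _lastBeat = new Dictionary<string, DateTime>();

    // Called by each monitored thread whenever it makes progress.
    public void Beat(string workerName, DateTime utcNow)
    {
        lock (_gate)
        {
            _lastBeat[workerName] = utcNow;
        }
    }

    // Called by the monitoring thread; returns workers silent for longer than the timeout.
    public List<string> FindStalled(DateTime utcNow, TimeSpan timeout)
    {
        var stalled = new List<string>();
        lock (_gate)
        {
            foreach (KeyValuePair<string, DateTime> pair in _lastBeat)
            {
                if (utcNow - pair.Value > timeout)
                {
                    stalled.Add(pair.Key);
                }
            }
        }
        stalled.Sort(StringComparer.Ordinal);
        return stalled;
    }
}
```

When `FindStalled` returns a non-empty list, the monitor thread would be the place to trigger the dump-on-demand path.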

Exception control when releasing an application?

Possibly an obvious question to some but couldn't find a duplicate.
I'm packaging the final version of a Windows Forms solution I've been working on and am getting it ready for online distribution. What are the best practices when doing so? We've already had some trouble with packaging the installation file and have run into hurdles to test the program on different PCs, both 32 and 64-bit included.
More specifically, should "throw;" commands be commented out or left in the final release? Would this expose any of the inner workings of the solution itself?
A released application should not crash when an exception occurs. You will want to inform the user that something went wrong and log the exception, but you do not want to crash! Informing the user should be done in a friendly manner, not just by putting exception.ToString() into a message box.
It is a good practice to add Application.ThreadException or AppDomain.CurrentDomain.UnhandledException handlers to handle all exceptions in your Application. How exactly to do that, is answered in the following thread: Catch Application Exceptions in a Windows Forms Application
However, make sure that your application survives in a usable state, i.e. handle exceptions in a proper way for your application.
I usually add a preprocessor directive around the application-level exception handling, since I want exceptions to be thrown while debugging. For example:
#if !DEBUG
Application.ThreadException += new ThreadExceptionEventHandler(MyHandler);
#endif
It should also be mentioned, that if you have code pieces where you anticipate that Exception might occur, such as network communication error, you should handle those pieces explicitly. What I am saying is, we should not completely forget about exception handling, just because we configured an unhandled exception handler on the application level.
Keep all of your exception handling intact.
Add a handler in the starting form of the application, attaching to the Application.ThreadException event (and to AppDomain.CurrentDomain.UnhandledException for non-UI threads). This will fire if an exception propagates up the stack.
This is the point to inform the user that the application has crashed. Log the error here and then abort gracefully.
As for your point about revealing internals, that's up to you to decide. You can obfuscate the code if you wish, but releasing in Release build mode without providing the .PDB is the first step.
Ultimately, the DLL / EXE can be decompiled anyway, so it's up to you. Debug mode will reveal more than Release mode, but not much more.
Ideally, you should be catching anything that's thrown higher with throw;. Carefully check your code and try to ensure that thrown exceptions are dealt with appropriately. Unhandled exceptions are logged - you can see this information in the Windows Event Viewer. Depending on what details you put in them, unhandled exceptions could give clues as to the inner workings of your application. However, I would suggest that unhandled exceptions are a poor source of information, and that anyone who wanted to know how your application worked could simply disassemble it, unless you've obfuscated it.
Some exceptions cannot be caught by surrounding code with try/catch blocks, so your application should also implement an unhandled exception handler. This gives you the opportunity to show the user an error message and do something with the exception - log it, send it to support, discard it, etc.

How should I deal with errors in my C++ based dll?

I have created a C++ DLL and I am using it in a C# application. How should I report errors?
Use exceptions and throw my errors, or print them on std::cout or std::cerr, or something else? If I issue an exception inside my DLL, will my C# application be able to catch it? What is the best course of action on this regard?
Here's an example output from C# using PInvoke to call a method which throws std::exception.
ex = System.Runtime.InteropServices.SEHException (0x80004005):
External component has thrown an exception.
at ConsoleTester.Program.throw_exception()
at ConsoleTester.Program.Main(String[] args) in ConsoleTester.cs:line 18
Note: In this case throw_exception() is the exposed native method and it called a sub-method, but you can't see that part of the stack trace. All you get for deepest stack frame is the native boundary.
So, it isn't perfect, but it does work. Returning error codes is probably the most standard way to handle this, but if you're the only consumer of the library it probably won't make much difference.
Note: stdout/stderr is generally the worst way to handle errors. The exception is that it's not so bad to write a custom error handling object or set of routines that everything in the application can access when something goes wrong. (The output from such an interface might sometimes be stdout/stderr or a file or whatever is useful as configured.) Think log4j or log4net here...
Generally, logging is only part of error handling. You've still got to signal other parts of your application to respond to adverse conditions and (hopefully) recover from them. Here, only error codes or exceptions really work well (and exceptions are to be minimized from main program flow anyways).
Don't print errors on stdout or stderr! You need to return errors programmatically so the C# application has a chance to handle them. What if the host application is a GUI app?
Throwing exceptions from a C++ DLL is fraught with peril. Even if your application were C++, it would have to be compiled with the exact same compiler as the DLL, like @ebyrob said. Calling from C#, I'm not sure.
Best course of action is returning error codes.
It really depends on how strong the error is. Most libraries that I've seen will return a success or failure result value from their function calls that you can check for manually in your code when you use it. They usually provide another method that just retrieves the error message in case you want to see it.
Save thrown exceptions for the really big stuff that you can't continue without; this will force people using your library to fix those errors (or at the very least, see that there is a big problem).
I would not recommend printing anything in the console window, it is a very slow operation and having it in there forces anyone using your library to have that overhead with little option for optimization. If they want to print the error messages, they can just retrieve the error data from your library and print them out themselves.
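On the C# side, the "result code plus separate error-message function" convention described in these answers might look like the sketch below. Everything about the native library is an assumption made for illustration: `mylib.dll`, `mylib_do_work` and `mylib_last_error_message` are hypothetical exports, and a return value of 0 is assumed to mean success.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeLib
{
    // Hypothetical exports of a hypothetical native library; 0 is assumed to mean success.
    [DllImport("mylib.dll", CallingConvention = CallingConvention.Cdecl)]
    private static extern int mylib_do_work(int input, out int result);

    [DllImport("mylib.dll", CallingConvention = CallingConvention.Cdecl)]
    private static extern IntPtr mylib_last_error_message();

    // Turns a non-zero native result code into a managed exception.
    public static void ThrowIfFailed(int code, Func<string> describeError)
    {
        if (code == 0)
        {
            return;
        }
        string message = describeError() ?? "unknown native error";
        throw new InvalidOperationException(
            "Native call failed with code " + code + ": " + message);
    }

    // Managed wrapper: callers see an ordinary C# method that either returns or throws.
    public static int DoWork(int input)
    {
        int result;
        int code = mylib_do_work(input, out result);
        ThrowIfFailed(code, () => Marshal.PtrToStringAnsi(mylib_last_error_message()));
        return result;
    }
}
```

Keeping the exception on the managed side of the boundary means no C++ exception ever has to cross into the CLR.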

Try... Catch block infestation

I'm developing a suite of Excel add-ins for a company. I haven't done add-ins before, so I'm not terribly familiar with some of the intricacies. After delivering my first product, the user encountered errors that I didn't experience/encounter/notice during my testing. Additionally, I was having difficulty reproducing them from within Visual Studio's debug environment.
I wound up writing a lightweight logging class that received messages from various parts of the program. The program isn't huge, so it wasn't a whole lot of work. But what I did end up with was nearly every single line of code wrapped up in Try... Catch blocks so I could log things happening in the user's environment.
I think I implemented it decently enough. I tried to avoid wrapping calls to other classes or modules, instead putting the block inside the call so I could more accurately identify who was throwing. I also didn't swallow anything: I always rethrew the exception after I recorded the information I was interested in.
My question is, essentially, is this okay? Is there a better way to tackle this? Am I waaaay off base?
Quick Edit: Importantly, it did work. And I was able to nail down the bug and resolve it.
No, you are not way off base. I believe this is the only way to handle errors when writing Add-ins. I am selling an Outlook add-in myself which uses this pattern. A couple of notes though:
You only need to wrap the top-level methods, either exposed to the user interface directly or triggered by other events.
Make sure your logging routine traverses the Exception tree recursively, also logging InnerExceptions.
Instead of rethrowing the exception you might consider displaying some sort of error form instead.
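The second note, walking the exception tree recursively, can be sketched as follows. This is an illustrative helper, not the extension method the answerer published; the `ExceptionFormatter` name and the indentation format are made up here. It follows `InnerException` and also expands every branch of an `AggregateException`.

```csharp
using System;
using System.Text;

public static class ExceptionFormatter
{
    // Returns one indented entry per exception in the tree.
    public static string Format(Exception ex)
    {
        var sb = new StringBuilder();
        Append(sb, ex, 0);
        return sb.ToString();
    }

    private static void Append(StringBuilder sb, Exception ex, int depth)
    {
        if (ex == null)
        {
            return;
        }

        string indent = new string(' ', depth * 2);
        sb.Append(indent).Append(ex.GetType().FullName).Append(": ").AppendLine(ex.Message);
        if (ex.StackTrace != null)
        {
            sb.Append(indent).AppendLine(ex.StackTrace.Trim());
        }

        var aggregate = ex as AggregateException;
        if (aggregate != null)
        {
            // An AggregateException can carry several inner exceptions; log all of them.
            foreach (Exception inner in aggregate.InnerExceptions)
            {
                Append(sb, inner, depth + 1);
            }
        }
        else
        {
            Append(sb, ex.InnerException, depth + 1);
        }
    }
}
```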
And then a couple of comments to those notes:
I'm sure you understand this, but your comment "nearly every single line of code is wrapped(...)" made me want to underline this. But yes, all your code should eventually end up in a catch (System.Exception)-block so that you can log your Exception. I disagree completely with Greg saying this is "dangerous". What is dangerous is not handling your exceptions.
If you do this, I don't think you need to "avoid wrapping calls to other classes and modules", if I understand you correctly. I have published a convenient extension method, GetAsString, on GitHub that allows me to log what I need.
In Outlook, if an Exception bubbles up to Outlook itself, your Add-in might get disabled or even crash Outlook if it happens on a background thread. Isn't it the same in Excel? Therefore I go to great lengths not to let any exception out of my application. Of course you need to make sure your application can continue running after this, or allow for a graceful shutdown.
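A small wrapper makes it easier to apply this consistently to every top-level entry point (ribbon callbacks, event handlers and so on). The sketch below is a hypothetical helper, not code from the answer: `TopLevelGuard.Run` executes the entry point, hands any exception to a reporting callback, and never lets it escape to the host application.

```csharp
using System;

public static class TopLevelGuard
{
    // Runs an add-in entry point; returns false if it threw.
    // The exception is passed to 'report' and never propagates to the host.
    public static bool Run(Action entryPoint, Action<Exception> report)
    {
        try
        {
            entryPoint();
            return true;
        }
        catch (Exception ex)
        {
            try
            {
                report(ex);
            }
            catch
            {
                // A failing logger must not take the host down either.
            }
            return false;
        }
    }
}
```

A button handler would then reduce to something like `TopLevelGuard.Run(() => DoExport(), LogAndShowErrorForm)`, where both delegate names are placeholders.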

How to catch absolutely all exceptions / errors

I have a Windows service application, running under WinXPe, which sometimes fails with an error and displays a message box to the user:
"The instruction at “” referenced memory at “0x00000000”. The memory could not be “read.” Press OK to exit the program"
If the user clicks "OK", the service restarts.
I have tried to catch all unhandled exceptions by registering an event handler at AppDomain.CurrentDomain.UnhandledException. In the handler I log the exception details and exit the application.
But the error I mentioned above is NOT handled by "UnhandledException".
The application is heavily multithreaded, using System.Threading.Timer and System.Threading.Thread. It also uses some third-party libs, one of which uses native interop; I have no source for the native lib.
I tried to pin down the error with a debugger attached, but the error doesn't show up ;)
The application has to run several days before the error occurs.
I need a way to handle such an error.
Thanks
See Vectored Exception Handling
This is part of Windows SEH (Structured Exception Handling), and IIRC there are precious few errors that you could not at least be notified of in such a case.
You will probably want to write any handling code directly against the native Win32 API (in unsafe/unmanaged code), using pre-allocated (static?) buffers only, because many things will be unreliable at that moment in time.
Beware of/stay away from threading, locking primitives, memory allocations and disk I/O; preferably use Windows default APIs to, e.g., restart the process or produce a minidump, and things like that.
That error is not a managed exception. It's a lower level memory access violation. Essentially a NULL pointer access in native code.
This is something you're supposed to be completely protected from in managed code, so it's likely caused by one of your native libraries or the way you're using them. If the error only appears after a few days of execution, you might be best off first going through any native library calls, checking their signatures and making sure you pass them data that makes sense.
