How to catch absolutely all exceptions / errors - c#

I have a windows service application, running under WinXPe, which sometimes fails with an error and displays an message box to the user:
"The instruction at “”
referenced memory at “0x00000000”. The
memory could not be “read.” Press OK
to exit the program
If the user clicks "Ok" the service is restarting.
I have tried to catch all unhandled exceptions with registering a eventhandler at AppDomain.CurrentDomain.UnhandledException
in the handler I log the exception details and exit the application.
But the error I mentioned above is NOT handled from "UnhandledException".
The application is heavily multi threaded, using System.Threading.Timer and System.Threading.Thread. And it's using some third party libs, one of these libs are using native interop, I have no source of the native lib.
I tried to point out the error with an debugger attached, but the error doesn't show up ;)
The application has to run several days before the error occurs.
I need a way to handle such a error.
Thanks

See Vectored Exception Handling
This is part of windows SEH (Structured Exception Handling) and IIRC here is precious few errors that you could not at least be notified of in such a case.
You will probably want to write any handling code directly to the native WIN32 API (in unsafe/unmanaged code) and using pre-allocated (static?) buffers only, because there will be many things unreliable at that moment in time.
Beware of/stay away from threading, locking primitives, memory allocations, disk IO; preferrably use Windows default API's to, e.g. restart the process or produce a minidump and things like that

That error is not a managed exception. It's a lower level memory access violation. Essentially a NULL pointer access in native code.
This is something you're supposed to be completely protect from in managed code, so it's likely one of your native libraries or the way you're using them. If the error only appears after a few days of execution, you might be best off first going through any native library calls, checking their signatures and making sure you pass them data that makes sense.

Related

C# : catch all errors/exceptions of a mixed managed/unmanaged process

I have a big and complex process that runs on a production environment that's basically a WPF user interface developed in C#. It also hosts threads and DLL's written in C++ unmanaged and managed code.
Typically, if an exception raises, it's caught and the related stack dump is written in a log file for post-mortem debugging purposes. Unfortunately, from time to time, the application crashes without writing any information in the log so we have no clue about who's causing the crash.
Does anybody know how to detect and eventually trace all the causes that make the application crash and are not detected with a simple try-catch block?
To give an example I saw that StackOverflow Exception is not caught and also errors like 0xc0000374 coming from unmanaged code are not detected. This is not a matter of debugging it. I know I can attach a debugger to the system and try to reproduce the problem. But as I told this is a production system and I have to analyze issues coming from the field after the problem occurred.
Unlike C# exceptions, C++ exceptions do not catch hardware exceptions such as access violations or stack overflows since C++ apps run unmanaged and directly on the cpu.
For post-crash analysis I would suggest using something like breakpad. breakpad will create a dump file which will give you very useful information such as a call-stack, running threads and stack/heap memory depending on your configuration.
This way you would not need to witness the crash happening or even try to reproduce it which I know from experience can be very difficult. All you would need is a way to retrieve these crash dumps from your users devices.
You can log exception by subscribing to AppDomain.UnhandledException event. Its args.ExceptionObject argument is of type object and is designed not to be limited by C# exceptions, so you can call ToString method to log it somewhere.
Also check MSDN docs for its limitations. For instance:
Starting with the .NET Framework 4, this event is not raised for exceptions that corrupt the state of the process, such as stack overflows or access violations, unless the event handler is security-critical and has the HandleProcessCorruptedStateExceptionsAttribute attribute.
Solved ! I followed Mohamad Elghawi suggestion and I integrated breakpad. After I struggled a lot in order to make it compiling, linking and working under Visual Studio 2008, I was able to catch critical system exceptions and generate a crash dump. I'm also able to generate a dump on demand, this is useful when the application stuck for some reason and I'm able to identify this issue with an external thread that monitors all the others.
Pay attention ! The visual studio solution isn't included in the git repo and the gyp tool, in contradiction as wrongly mentioned in some threads, it's also not there. You have to download the gyp tool separately and work a bit on the .gyp files inside the breadpad three in order to generate a proper solution. Furthermore some include files and definitions are missing if you want to compile it in Visual Studio 2008 so you have also to manage this.
Thanks guys !

How should I deal with errors in my C++ based dll?

I have created a C++ DLL and I am using it in a C# application. How should I report errors?
Use exceptions and throw my errors, or print them on std::cout or std::cerr, or something else? If I issue an exception inside my DLL, will my C# application be able to catch it? What is the best course of action on this regard?
Here's an example output from C# using PInvoke to call a method which throws std::exception.
ex = System.Runtime.InteropServices.SEHException (0x80004005):
External component has thrown an exception.
at ConsoleTester.Program.throw_exception()
at ConsoleTester.Program.Main(String[] args) in ConsoleTester.cs:line 18
Note: In this case throw_exception() is the exposed native method and it called a sub-method, but you can't see that part of the stack trace. All you get for deepest stack frame is the native boundary.
So, it isn't perfect, but it does work. Returning error codes is probably the most standard way to handle this, but if you're the only consumer of the library it probably won't make much difference.
Note: stdin/stdout is generally the worst way to handle errors. The exception being that it's not so bad to write a custom error handling object or set of routines that everything in the application can access when something goes wrong. (The output from such an interface might sometimes be stdin/stdout or a file or whatever is useful as configured) think log4j or log4net here...
Generally, logging is only part of error handling. You've still got to signal other parts of your application to respond to adverse conditions and (hopefully) recover from them. Here, only error codes or exceptions really work well (and exceptions are to be minimized from main program flow anyways).
Don't print errors on stdout or stderr! You need to return errors programatically so the C# application has a chance to handle them. What if the host application is a GUI app?
Throwing exceptions from a C++ DLL is fraught with peril. Even if your application was C++, it would have to be compiled with the exact same compiler as the DLL, like #ebyrob said. Calling from C#, I'm not sure.
Best course of action is returning error codes.
It really depends on how strong the error is. Most libraries that I've seen will return a success or failure result value from their function calls that you can check for manually in your code when you use it. They usually provide another method that just retrieves the error message in case you want to see it.
Save throw exceptions for the really big stuff that you can't continue without, this will force people using your library to fix those errors (or at the very least, see that there is a big problem).
I would not recommend printing anything in the console window, it is a very slow operation and having it in there forces anyone using your library to have that overhead with little option for optimization. If they want to print the error messages, they can just retrieve the error data from your library and print them out themselves.

How to get information on a Buffer Overflow Exception in a mixed application?

In all WPF application I develop there is a global exception handler subscribed to AppDomain.CurrentDomain.UnhandledException which logs everything it can find and then shows a dialog box telling the user to contact the author, where the log file is etc. This works extremely well and both clients and me are very happy with it because it allows fixing problems fast.
However during development of a mixed WPF / C# / CLI / C++ application there are sometimes application crashes that do not make it to the aforementioned exception handler. Instead, a standard windows dialog box saying "XXX hast stopped working" pops up. In the details it shows eg
Problem Event Name: BEX
Application Name: XXX.exe
Fault Module Name: clr.dll
...
This mostly happens when calling back a managed function from within umanaged code, and when that function updates the screen. I didn't took me long to figure out the root cause of problem, but only because I can reproduce the crash at my machine and hook up the debugger: in all occasions the native thread was still at the point of calling the function pointer to the managed delegate that calls directly into C#/WPF code.
The real problem would be when this happens on a client machine: given that usually clients are not the best error reporters out there it migh take very, very long to figure out what is wrong when all they can provide me are the details above.
The question: what can I do to get more information for crashes like these? Is there a way to get an exception like this call a custom error handler anyway? Or get a process dump? When loading the symbols for msvcr100_clr0004.dll and clr.dll (loaded on the thread where the break occurrs), the call stack is like this:
msvcr100_clr0400.dll!__crt_debugger_hook()
clr.dll!___report_gsfailure() + 0xeb bytes
clr.dll!_DoJITFailFast#0() + 0x8 bytes
clr.dll!CrawlFrame::CheckGSCookies() + 0x2c3b72 bytes
Can I somehow hook some native C++ code into __crt_debugger_hook() (eg for writing a minidump)? Which leads me to an additional question: how does CheckGSCookies behave on a machine with no debugger installed, would it still call the same code?
update some clarification on the code: native C++ calls a CLI delegate (to which a native function pointer is acquired using GetFunctionPointerForDelegate which in turn calls a C# System.Action. This Action updates a string (bound to a WPF label) and raises a propertychanged event. This somehow invokes a buffer overflow (when updating very fast) in an unnamed thread that was not created directly in my code.
update looking into SetUnhandledExceptionFilter, which didn't do anything at first, I found this nifty article explaining how to catch any exception. It works, and I was able to write a minidump in an exception filter installed by using that procedure. The dump gives basically the same information as hooking the debugger: the real problem seems to be that the Status string is being overwritten (by being called from the native thread) while at the same time being read from the ui thread. All nice ans such, but it does require dll hooking which is not my favorite method to solve things. Another way would still be nice.
The .NET 4 version of the CLR has protection against buffer overflow attacks. The basic scheme is that the code writes a "cookie" at the end of an array or stack frame, then checks later if that cookie still has the original value. If it has changed, the runtime assumes that malware has compromised the program state and immediately aborts the program. This kind of abort does not go through the usual unhandled exception mechanism, that would be too exploitable.
The odds that your program is really being attacked are small of course. Much more likely is that your native code has a pointer bug and scribbles junk into a CLR structure or stack frame. Debugging that isn't easy, the damage is usually done well before the crash. One approach is to comment out code until the crash disappears.

How to catch exceptions from another program (for logging)?

I am working on a tool that monitors a number of applications and ensures they are always running and in a clean state. Some of these applications have unhandled exceptions which do occur periodically and present the 'send crash report' window. I do not have the source code to these applications.
Is there any mechanism I could use to catch the exceptions, or simply identify their exception type, as well as identify the application's main executable file that threw the exception.
I'm not trying to do anything crazy like catch and handle it on the applications behalf, I'm simply trying to capture the exception type, log it and then restart the application.
Trapping unhandled exceptions requires calling SetUnhandledExceptionFilter() in the process. That's going to be difficult to do if you don't have source code, although it is technically possible by injecting a DLL into the process. This however cannot be done with managed code, you can't get the CLR initialized properly.
The default unhandled exception handler that Windows installs will always run WerFault.exe, the Windows Error Reporting tool. That can be turned off but that's a system setting. Expecting your user or admin to do this is not realistic. Only after WER runs will the JIT debugger get a shot at it.
I recommend a simpler approach, one that's also much more selective. Use the Process class to get the program you're interested in protecting started. When the Exited event fires, use the ExitCode property to find out how it terminated. Any negative value is a sure sign that the process died on an unhandled exception, the exit code matches the exception code. You can use the EventLog class to read the event message that WER writes to the Windows event log. And you can restart it the same way you got it started.
Without modifying the source of the application or injecting a DLL into the process I do not believe this is possible in a reliable fashion. What you're attempting to do is inspect type information across a process boundary. This is not easy to achieve for a number of reasons
Getting the name of the exception implies executing code in the target process. The name can be generated a number of ways (.ToString, or .GetType().Name) but all involve executing some method
You want to do this after the application has already crashed and hence may be in a very bad state. Consider trying to execute code if memory was corrupted (perhaps even corrupting the in memory definitions of the type).
The crash could occur in native code and not involve any managed data
If you want to monitor application crashes system wide, you can register yourself as a just-in-time debugger. You can edit the registry to specify which debugger to run when an application crashes. The example they give is Doctor Watson, but it could be your application instead.

Handling rude application aborts in .NET

I know I'm opening myself to a royal flaming by even asking this, but I thought I would see if StackOverflow has any solutions to a problem that I'm having...
I have a C# application that is failing at a client site in a way that I am unable to reproduce locally. Unfortunately, it is very difficult (impossible) for me to get any information that at all helps in isolating the source of the problem.
I have in place a rather extensive error monitoring framework which is watching for unhandled exceptions in all the usual places:
Backstop exception handler in threads I control
Application.ThreadException for WinForms exceptions
AppDomain.CurrentDomain.UnhandledException
Which logs detailed information in a place where I have access to them.
This has been very useful in the past to identify issues in production code, but has not been giving me any information at about the current series of issues.
My best guess is that the core issue is one of the "rude" exception types (thread abort, out of memory, stack overflow, access violation, etc.) that are escalating to a rude shutdown that are ripping down the process before I have a chance to see what is going on.
Is there anything that I can be doing to snapshot information as my process is crashing that would be useful? Ideally, I would be able to write out my custom log format, but I would be happy if I could have a reliable way of ensuring that a crash dump is written somewhere.
I was hoping that I could implement class deriving from CriticalFinalizerObject and have it spit a last-chance error log out when it is disposing, but that doesn't seem to be triggered in the StackOverflow scenario which I tested.
I am unable to use Windows Error Reporting and friends due to the lack of a code signing certificate.
I'm not trying to "recover" from arbitrary exceptions, I'm just trying to make a note of what went wrong on the way down.
Any ideas?
You could try creating a minidump file. This is a C++ API, but it should be possible to write a small C++ program that starts your application keeps a handle to the process, waits on the process handle, and then uses the process handle to create a minidump when the application dies.
If you have done what you claim:
Try-Catch on the Application.Run
Unhandled Domain Exceptions
Unhandled Thread Exceptions
Try Catch handlers in all threads
Then you would have caught the exception except perhaps if it is being thrown by a third party or COM component.
You certainly haven't given enough information.
What events does the client say leads up to the exception?
What COM or third party components do you use? (Do you properly instance and reference these components? Do you pass valid arguments to COM function calls?)
Do you make use of any un-managed - un-safe code?
Are you positive that you have all throw-capable calls covered with try-catch?
I'm just saying that no-one can offer you any helpful advice unless you post a heck of lot more information and even at that we probably can only speculate as to the source of you problem.
Have a set of fresh eyes look at your code.
Some errors cannot be caught by logging.
See this similar question for more details:
StackOverflowException in .NET
Here's a link explaining asynchronous exceptions (and why you can't recover from them):
http://www.bluebytesoftware.com/blog/PermaLink.aspx?guid=c1898a31-a0aa-40af-871c-7847d98f1641

Categories