I have a multithreaded .Net C# application, it uses Direct3D 9/10 and XAudio2. (Direct3D is accessed by only one thread, same for XAudio2. Direct3D isn't the problem cause the error is manifest in either DX9 or DX10 mode without any change in its behaviour.)
Sometimes (there are some areas that gives this problem randomly) this application crash in a rather unspectacular way. Even if the application is started through visual studio with debugger it crash without giving ANY kind of exception. (It start by saying "applicationname.svchost.exe is crashed, etc..etc..Do you want to debug?", if I press yes it tells me "you cannot debug an application already closed.)
There is no way to find out what's the cause of the crash? Cause i've run out of ideas, the debugger isn't giving me any information at all. Without an exception I can't even do a stacktrace or a dump. :P (I'm supposing is a synchronization problem (even thought in that area I'm only doing sequential work...), but hey why isn't launching an exception? :|)
In the areas where the problem occurs I'm unloading a reloading a series of classes related to a novel (in a sequential core thread, so I doubt it can be an issue) and starting a new music through XAudio2. (BTW, what are the multithreading consideration about XAudio2? Is it safe to call from multiple thread?)
Thanks for the help.
P.S. There is a software to attach to mine to monitor all the calls and tells me what's the last call before the crash?
You should try using Windbg, analyzing the crash dump should point you to the problem, if your suspicion is right and it is a synchronization problem, the cause of the problem may be hard to spot.
Have you checked Event Logs in your Windows Administration Panel?
All error of any kind are always logged in this section with minimal details.
One time I had an application that was crashing without exceptions and the only help I found was the Event Log Viewer where I discovered that the source of the crash was a StackOverflowException.
Related
I have a big and complex process that runs on a production environment that's basically a WPF user interface developed in C#. It also hosts threads and DLL's written in C++ unmanaged and managed code.
Typically, if an exception raises, it's caught and the related stack dump is written in a log file for post-mortem debugging purposes. Unfortunately, from time to time, the application crashes without writing any information in the log so we have no clue about who's causing the crash.
Does anybody know how to detect and eventually trace all the causes that make the application crash and are not detected with a simple try-catch block?
To give an example I saw that StackOverflow Exception is not caught and also errors like 0xc0000374 coming from unmanaged code are not detected. This is not a matter of debugging it. I know I can attach a debugger to the system and try to reproduce the problem. But as I told this is a production system and I have to analyze issues coming from the field after the problem occurred.
Unlike C# exceptions, C++ exceptions do not catch hardware exceptions such as access violations or stack overflows since C++ apps run unmanaged and directly on the cpu.
For post-crash analysis I would suggest using something like breakpad. breakpad will create a dump file which will give you very useful information such as a call-stack, running threads and stack/heap memory depending on your configuration.
This way you would not need to witness the crash happening or even try to reproduce it which I know from experience can be very difficult. All you would need is a way to retrieve these crash dumps from your users devices.
You can log exception by subscribing to AppDomain.UnhandledException event. Its args.ExceptionObject argument is of type object and is designed not to be limited by C# exceptions, so you can call ToString method to log it somewhere.
Also check MSDN docs for its limitations. For instance:
Starting with the .NET Framework 4, this event is not raised for exceptions that corrupt the state of the process, such as stack overflows or access violations, unless the event handler is security-critical and has the HandleProcessCorruptedStateExceptionsAttribute attribute.
Solved ! I followed Mohamad Elghawi suggestion and I integrated breakpad. After I struggled a lot in order to make it compiling, linking and working under Visual Studio 2008, I was able to catch critical system exceptions and generate a crash dump. I'm also able to generate a dump on demand, this is useful when the application stuck for some reason and I'm able to identify this issue with an external thread that monitors all the others.
Pay attention ! The visual studio solution isn't included in the git repo and the gyp tool, in contradiction as wrongly mentioned in some threads, it's also not there. You have to download the gyp tool separately and work a bit on the .gyp files inside the breadpad three in order to generate a proper solution. Furthermore some include files and definitions are missing if you want to compile it in Visual Studio 2008 so you have also to manage this.
Thanks guys !
I have an application (written in C# / ClickOnce) which, for the most part, works fine; it has no memory leaks and runs reliably and is stable for days at a time.
However, it also utilises MEF (so plugins/extensions can be dynamically added to the core assembly). Again, this 'works' currently, but if an exception/fatal error occurs in an externally linked assembly/plugin, it will crash the entire application.
After some recent testing, I found that the application had crashed after around 14 hours of [successful] operation.
With that in mind, my question is really two-fold:
a) is it possible to catch any unhanded exception a plugin (or the main application) may throw, so it can at least output info for debugging assistance?
b) I can't be sure if it was the plugin or the main application which failed. Therefore, I cannot think where to start debugging/tracing the issue. How does one go about finding a bug which only occurs after such a long period of time?
Thanks for any input.
As noted in the comments (I guess I should have read those before I started typing up an answer...)
When an application throws an exception that is unhandled, it fires the UnhandledException event just before death and you can subscribe to that in order to log any details that will help you figure out what happened.
See this SO post for an example on how to use it (includes ThreadException).
I have a windows application that calls an external .dll. After a while, there were fatal errors being brought to my attention that had to do with user marshaling. There was a source online that with that particular error I was to change my target to x86 rather than AnyCPU. I did so, and now whenever I let the app run, it will exit debug mode and crash the application. But if I set a break point immediately after the .dll call, and step over each line until I receive control of the application again, it doesn't crash. Is there anything specific that could be causing this? Has does one debug this issue?
Thanks!
Stepping code solving an issue is often a symptom of timing problems in the original code. If an external resource loads asynchronously, it will not show up on the stack of the current thread in the debugger, but will be able to be called. Stepping over code induces a delay in the flow.
Thank you all for you suggestions! Fortunately, I ended up getting it to work (with minimal understanding as to why it works) but changing the build target to specifically x86 machines rather than "AnyCPU." This was suggested by a website and can no longer find :\ Hope this helps others than run into a similar issue!
I consider the most common cause of this sort of thing to be uninitialized variables. They pick up whatever was in memory and the presence of a debugger can easily change what's in the unused part of the stack--the memory that's going to become local variables when the next routine is called. Check the DLLs code.
Note that your "fix" makes me even more suspect that this is the real answer.
(Then there's also the really crazy case of a problem with the debugger. Long ago I hit a case where the debugger had no problem loading an invalid value into a segment register if you were single stepping.)
I'm getting close to desperate.. I am developing a field service application for Windows Mobile 6.1 using C# and quite some p/Invoking. (I think I'm referencing about 50 native functions)
On normal circumstances this goes without any problem, but when i start stressing the GC i'm getting a nasty 0xC0000005 error witch seems uncatchable. In my test i'm rapidly closing and opening a dialog form (the form did make use of native functions, but for testing i commented these out) and after a while the Windows Mobile error reporter comes around to tell me that there was an fatal error in my application.
My code uses a try-catch around the Application.Run(masterForm); and hooks into the CurrentDomain.UnhandledException event, but the application still crashes. Even when i attach the debugger, visual studio just tells me "The remote connection to the device has been lost" when the exception occurs..
Since I didn't succeed to catch the exception in the managed environment, I tried to make sense out of the Error Reporter log file. But this doesn't make any sense, the only consistent this about the error is the application where it occurs in.
The thread where the application occurs in is unknown to me, the module where the error occurs differs from time to time (I've seen my application.exe, WS2.dll, netcfagl3_5.dll and mscoree3_5.dll), even the error code is not always the same. (most of the time it's 0xC0000005, but i've also seen an 0X80000002 error, which is a warning accounting the first byte?)
I tried debugging through bugtrap, but strangely enough this crashes with the same error code (0xC0000005). I tried to open the kdmp file with visual studio, but i can't seem to make any sense out of this because it only shows me disassembler code when i step into the error (unless i have the right .pbb files, which i don't). Same goes for WinDbg.
To make a long story short: I frankly don't have a single clue where to look for this error, and I'm hoping some bright soul on stackoverflow does. I'm happy to provide some code but at this moment I don't know which piece to provide..
Any help is greatly appreciated!
[EDIT May 3rd 2010]
As you can see in my comment to Hans I retested the whole program after I uncommented all P/Invokes, but that did not solve my problem. I tried reproducing the error with as little code as possible and eventually it looks like multi-threaded access is the one giving me all the problems.
In my application I have a usercontrol that functions as a finger / flick scroll list. In this control I use a bitmap for each item in the list as a canvas. Drawing on this canvas is handled by a separate thread and when i disable this thread, the error seems to disappear.. I'll do some more tests on this and will post the results here.
Catching this exception is not an option. It is the worst kind of heart attack a thread can suffer, the CPU has detected a serious problem and cannot continue running code. This is invariably caused by misbehaving unmanaged code, it sounds like you've got plenty of it running in your program. You need to focus on debugging that unmanaged code to get somewhere.
The two most common causes of an AV are
Heap corruption. The unmanaged code has written data to the heap improperly, destroying the structural integrity of the heap. Typically caused by overflowing the boundary of an allocated block of memory. Or using a heap block after it was freed. Very hard to diagnose, the exception will be raised long after the damage was done.
Stack corruption. Most typically caused by overflowing the boundaries of an array that was allocated on the stack. This can overwrite the values of other variables on the stack or destroy the function return address. A bit easier to diagnose, it tends to repeat well and has an immediate effect. One side-effect is that the debugger loses its ability to display the call stack right after the damage was done.
Heap corruption is the likely one and the hard one. This is most typically tackled by debugging the code in the debug build with a debug allocator that watches the integrity of the heap. The <crtdbg.h> header provides one. It's not a guaranteed approach, you can have some really nasty Heisenbugs that only rear their head in the Release build. Very few options available then, other than careful code review. Good luck, you'll need it.
It turns out to be an exception caused by Interlocked.
In my code there is an integer _drawThreadIsRunning which is set to 1 when the draw-thread is running, and set to 0 otherwise. I set this value using Interlocked:
if (Interlocked.Exchange(ref _drawThreadIsRunning, 1) == 0) { /* run thread */ }
When i change this line the whole thing works, so it seems that there is a problem with threadsafety somewhere, but i can't figure it out. (ie. i don't want to waste more time figuring it out)
Thanks for the help guys!
we have a dotnet 2.0 desktop winforms app and it seems to randomly crash. no stack trace, vent log, or anything. it just dissapears.
There are a few theories:
machine simply runs out of resources. some people have said you will always get a window handle exception or a gdi exception but others say it might simply cause crashes.
we are using wrappers around non managed code for 2 modules. exceptions inside either of these modules could cause this behavior.
again, this is not reproducible so i wanted to see if there were any suggestions on how to debug better or anything i can put on the machine to "catch" the crash before it happens to help us understand whats going on.
Your best bet is to purchase John Robbins' book "Debugging Microsoft .NET 2.0 Applications". Your question can go WAY deeper than we have room to type here.
Sounds for me like you need to log at first - maybe you can attach with PostSharp a logger to your methods (see Log4PostSharp) . This will certainly slow you down a lot and produce tons of messages. But you should be able to narrow the problematic code down ... Attach there more logs - remove others. Maybe you can stress-test this parts, later. If the suspect parts are small enough you might even do a code review there.
I know, your question was about debugging - but this could be an approach, too.
You could use Process Monitor from SysInternals (now a part of Microsoft) filtered down to just what you want to monitor. This would give you a good place to start. Just start with a tight focus or make sure you have plenty of space for the log file.
I agree with Boydski. But I also offer this suggestion. Take a close look at the threads if you're doing multi-threading. I had an error like this once that took a long long time to figure out and actually ended up with John Robbins on the phone helping out with it. It turned out to be improper exception handling with threads.
Run your app, with pdb files, and attach WinDbg, make run.
Whem crash occur WinDbg stop app.
Execute this command to generate dump file :
.dump /ma c:\myapp.dmp
An auxiliary tool of analysis is ADPlus
Or try this :
Capturing user dumps using Performance alert
How to use ADPlus to troubleshoot "hangs" and "crashes"
Debugging on the Windows Platform
Do you have a try/catch block around the line 'Application.Run' call that starts the GUI thread? If not do add one and put some logging in for any exceptions thrown there.
You can also wrie up the Application.ThreadException event and log that in case that gives you some more hints.
you should use a global exception handler:
http://msdn.microsoft.com/en-us/library/system.windows.forms.application.threadexception.aspx
I my past I have this kind of behaviour primarily in relation to COM objects not being released and/or threading issues. Dive into the suggestions that you have gotten here already, but I would also suggest to look into whether you properly release the non-managed objects, so that they do not leak memory.