I'm developing a suite of Excel add-ins for a company. I haven't done add-ins before, so I'm not terribly familiar with some of the intricacies. After delivering my first product, the user encountered errors that I didn't experience/encounter/notice during my testing. Additionally, I was having difficulty reproducing them from within Visual Studios debug environment.
I wound up writing a light weight logging class that received messages from various parts of the program. The program isn't huge, so it wasn't a whole lot of work. But what I did end up with was nearly every single line of code wrapped up in Try... Catch blocks so I could log things happening in the users environment.
I think I implemented it decently enough, I tried to avoid wrapping calls to other classes or modules and instead putting the block inside the call, so I could more accurately identify who was throwing, and I didn't swallow anything, I always threw the exception after I recorded the information I was interested in.
My question is, essentially, is this okay? Is there a better way to tackle this? Am I waaaay off base?
Quick Edit: Importantly, it did work. And I was able to nail down the bug and resolve it.
No, you are not way off base. I believe this is the only way to handle errors when writing Add-ins. I am selling an Outlook add-in myself which uses this pattern. A couple of notes though:
You only need to wrap the top-level methods, either exposed to the user interface directly or triggered by other events.
Make sure your logging routine traverses the Exception tree recursively, also logging InnerExceptions.
Instead of rethrowing the exception you might consider displaying some sort of error form instead.
And then a couple of comments to those notes:
I'm sure you understand this, but your comment "nearly every single line of code is wrapped(...)" made me want to underline this. But yes, all your code should eventually end up in a catch (System.Exception)-block so that you can log your Exception. I disagree completely with Greg saying this is "dangerous". What is dangerous is not handling your exceptions.
If you do this I don't think you need to "avoid wrapping calls to other classes and modules", if I understand you correctly. I have a published a convenient extension method GetAsString that allows me to log what I need at github.
In Outlook, if an Exception bubbles up to Outlook itself, your Add-in might get disabled or even crash Outlook if it happens on a background thread. Isn't it the same in Excel? Therefore I go to great lengths not to let any exception out of my application. Of course you need to make sure your application can continue running after this, or allow for a graceful shutdown.
Related
I have read several times that using
catch (Exception ex)
{
Logger.LogError(ex);
}
without re throwing is wrong, because you may be hiding exceptions that you don't know about from the rest of the code.
However, I am writing a WCF service and am finding myself doing this in several places in order to ensure that the service does not crash. (SOA states that clients should not know or care about internal service errors since they unaware of the service implementation)
For instance, my service reads data from the file system. Since the file system is unpredictable I am trapping all exceptions from the read code. This might be due to bad data, permission problems, missing files etc etc. The client doesn't care, it just gets a "Data not available" response and the real reason is logged in the service log. I don't care either, I just know there was a problem reading and I don't want to crash.
Now I can understand there may be exceptions thrown unrelated to the file system. eg. maybe I'm out of memory and trying to create a read buffer has thrown an exception. The fact remains however, that the memory problem is still related to the read. I tried to read and was unable to. Maybe there is still enough memory around for the rest of the service to run. Do I rethrow the memory exception and crash the service even though it won't cause a problem for anything else?
I do appreciate the general idea of only catching exceptions you can deal with, but surely if you have an independent piece of code that can fail without affecting anything else, then it's ok to trap any errors generated by that code? Surely it's no different from having an app wide exception handler?
EDIT: To clarify, the catch is not empty, the exception is logged. Bad example code by me, sorry. Have changed now.
I wouldn't say that your service works as expected if there are permission problems on the disk. imho a service returning "Data not available" is even worse than a service returning "Error".
imagine that you are the user of your service. You make a call to it and it returns "No data". You know that you're call is correct, but since you don't get any data you'll assume that the problem is yours and go back investigating. Lots of hours can be spent in this way.
Which is better? Treating the error as an error, or lie to your users?
Update
What the error depends on doesn't really matter. Access problems should be dealt with. A disk that fails sometimes should be mirrored etc etc. SOA puts more responsibilities on you as a developer. Hiding errors doesn't make them go away.
SOA should pass errors. It may not be a detailed error, but it should be enough for the client to understand that the server had a problem. Based on that, the client may try again later on, or just log the error or inform the service desk that a manual action might need to be taken.
If you return "No data", you won't give your users a chance to treat the error as they see fit.
Update2
I do use catch all in my top level. I log exceptions to the event log (which is being monitored).
Your original question didn't have anything in the catch (which it do now). It's fine as long as you remember to monitor that log (so that you can correct those errors).
If you are going to adopt this strategy you are going to make it very hard for deployment teams to work out why the client fails to work. At the minimum log something somewhere.
One of the main issues becomes that you will never know that something went wrong. It isn't only your clients / consumers that have the error hidden from them, it is you as the service developer yourself.
There's absolutely no problem with that code. Make sure the user gets a nice message, like "Data not available due to an internal problem. The issue has been logged and the problem will be dealt with." And everybody's fine.
It's only a problem if you eat and swallow the exception so that nobody in the world will ever fix it.
Exceptions need always to be dealt with, either by writing special code or by just fixing the path that results in the exception. You can't always anticipate all errors, but you can commit to fixing them as soon as you become aware of them.
I believe the idea is to prevent missing errors. Consider this: The service returns fine but is not doing as expected. You have to go and search through event logs, file logs etc. to find a potential error. If you put a trace write in there, you can identify hopefully the cause, but likely the area of hte issue and a timestamp to correlate errors with other logs.
If you use .NET Trace, you can implement in code and not turn it on until required. Then to debug you can turn it on without having to recompile code.
WCF Services can use FaultException to pass exceptions back to the client. I have found this useful when building a WCF n-tier application server.
It's commendable to ensure that your service does not crash, in a production environment. When debugging, you need the application to crash, for obvious reasons.
However, from your explanations, I get the impression that you're catching the exceptions you expect to be thrown, and catch the rest with an empty catch block, meaning you'll never know what happened.
It's correct to catch all exceptions as a last resort, but please log those too, they are no less interesting than those you expected. After catching and logging all exceptions related to network, I/O, security, authentication, timeout, bad data, whatever, log the rest.
In general you can say that every exception occurs in a specific circumstance. For debugging purposes it is usefull to have as much information as possible about the failure so the person who is about to fix the bug know where to look and can fix the code in a minimum amount of time, which save his boss some money.
Besides that, throwing and catching specific exceptions will make your code more readable/understable and easier to maintain. Catching a general exception will fail on this point in every point of view. Therefore, also make sure that, when throwing an exception, that is is well documented (e.g. in XML comments so the caller knows the method can throw the specific exception) and has a clear name on what went wrong. Also include only information in the exception that is directly related to the problem (like id's and such).
In vb.net, a somewhat better approach would be "Catch Ex As Exception When Not IsEvilException(Ex)", where "IsEvilException" is a Boolean function which checks whether Ex is something like OutOfMemoryException, ExecutionEngineException, etc. The semantics of catching and rethrowing an exception are somewhat different from the semantics of leaving an exception uncaught; if a particular "catch" statement would do nothing with the exception except rethrow "as is", it would be better not to catch it in the first place.
Since C# does not allow exception filtering, you're probably stuck with a few not-very-nice choices:
Explicitly catch and rethrow any "evil" types of exceptions, and then use a blanket catch for the rest
Catch all exceptions, rethrow all the evil ones "as-is", and then have your logic handle the rest
Catch all exceptions and have your logic handle them without regard for the evil ones, figuring that if the CPU is on fire it will cause enough other exceptions to be raised elsewhere to bring the program down before persistent data gets corrupted.
One problem with exception handling in both Java and .net is that it binds tightly three concepts which are actually somewhat orthogonal:
What condition or action triggered the exception
Should the exception be acted upon
Should the exception be considered resolved
Note that sometimes an exception will involve two or more types of state corruption (e.g. an attempt was made to update an object from information at a stream, and something goes wrong reading the stream, leaving both the stream and the object being updated in bad states). Code to handle either condition should act upon the exception, but the exception shouldn't be considered resolved until code for both has been run. Unfortunately, there is no standard pattern for coding such behavior. One could add a virtual read-only "Resolved" property to any custom exceptions one implements, and rethrow any such exceptions if "Resolved" returns false, but no existing exceptions will support such a property.
If something is wrong, then it is wrong. It should crash.
You can never assure proper operations if something is crashing in the code. Even if that means that a page/form will always crash as soon as it is visited, you can never assure that things will keep working, or any changes be functionally committed.
And if a data resource crashes and instead returns a resource with 0 entries, the user will just be incredibly confused by something not returning any results when there should be obvious results. Valueable time will be spent by the user trying to find out what (s)he did wrong.
The only times I would recommend catching all exceptions (and logging them, not ignoring them), is if the code that is being ran is from plugins/compiled code from users, or some weird COM library, that you have no control over yourself.
I've found myself doing too much error handling with try\catch statements and getting my code ugly with that. You guys have any technique or framework to make this more elegant? (In c# windows forms or asp.net).
You need to read up on structured exception handling. If you're using as many exception handlers as it sounds then you're doing it wrong.
Exception handling isn't like checking return values. You are supposed to handle some exceptions in limited, key spots in your code not all over the place. Remeber that exceptions "bubble up" the call stack!
Here is a good and well-reviewed CodeProject article on exception best practices.
Java land had pretty the same problem. You just look at method and you can't at a first glance understand what it is doing, because all you see is try/catch blocks. Take a 30-40 line method and throw away all try statements and catch blocks and you might end up with 5-6 lines of pure application logic. This isn't such a big problem with C# as it has unchecked exceptions, but it gets really ugly in Java code. The funny thing is the try/catch blocks were intended to solve the very same problem in the first place. Back then it was caused by errno/errstr madness.
What the Java guys usually do is based on how do you typically handle exception. Most of the time you can't really do anything to correct the problem. You just notify the user that whatever he was trying to do didn't work, put back application in a certain state and maybe log and exception with complete stacktrace to log file.
Since you handle all the exceptions like this, the solution is to have a catch-all exception handler, which sits on top of application stack and catches all exceptions that are thrown and propagated up the stack. With ASP.NET you might use something like this:
http://www.developer.com/net/asp/article.php/961301/Global-Exception-Handling-with-ASPNET.htm
At the same time you are free to override that global handler by placing try/catch block in your code, where you feel something can be done, to correct the problem.
Just adding a Try Catch does not solve the problem. This topic a too big to handle as one question. You need to do some reading.
http://msdn.microsoft.com/en-us/library/8ey5ey87%28VS.71%29.aspx
http://www.codeproject.com/KB/architecture/exceptionbestpractices.aspx
Also FXCop, and VS Team System will warn you on some design issues.
Such heavy reliance on exception handling (in any language) does suggest that the mechanism is being misused. I always understood that exception handling was designed to trap the truly exceptional, unforeseeable event. It is not designed to handle (for instance) invalid data entry by a user - this is normal operation and your design and application coding must deal with such normal processing.
http://msdn.microsoft.com/en-us/library/ff664698(v=PandP.50).aspx
Check out Microsoft's Exception Handling Application Block. It has an intial learning curve, but is good stuff once you get it figured out.
Which is the best place to handle the exceptions ? BLL, DAL or PL ?
Should I allow the methods in the DAL and BLL to throw the exceptions up the chain and let the PL handle them? or should I handle them at the BLL ?
e.g
If I have a method in my DAL that issues "ExecuteNonQuery" and updates some records, and due to one or more reason, 0 rows are affected. Now, how should I let my PL know that whether an exception happened or there really was no rows matched to the condition. Should I use "try catch" in my PL code and let it know through an exception, or should I handle the exception at DAL and return some special code like (-1) to let the PL differentiate between the (exception) and (no rows matched condition i.e. zero rows affected) ?
It makes no sense to let an exception that is thrown in the DAL bubble up to the PL - how is the user supposed to react if the database connection could not be established?
Catch and handle exceptions early, if you can handle them. Do not just swallow them without outputting a hint or a log message - this will lead to severe difficulties and bugs that are hard to track.
The short answer is it depends!
You should only ever handle an exception if you can do something useful with it. The 'something useful' again depends on what you are doing. You may want to log the details of the exception although this isn't really handling it and you should really re-throw the exception after logging in most circumstances. You may want to wrap the exception in some other (possibly custom) exception in order to add more information to the exception. As #mbeckish touches on, you may want to try to recover from the exception by retrying the operation for example - you should be careful not to retry forever however. Finally (excuse the pun) you may want to use a finally block to clean up any resources such as an open DB connection. What you choose to do with the exception will influence where you handle it. It is likely that there isn't a great deal of useful things that can be done with many exceptions other than to report to the user that an error has occurred in which case it would be more than acceptable to handle the exception in the UI layer and report the problem to the user (you should probably log the exception as well, further down your layers).
When throwing exceptions yourself, you should only ever throw exceptions in 'exceptional'circumstances as there is a big overhead in throwing exceptions. In your example you suggest you may be thinking of throwing an exception if no records are updated by your operation. Is this really exceptional? A better thing to do in this situation would be to return the number of records updated - this may still be an error condition that needs to be reported to the user, but isn't exceptional like the command failing because the connecction to the DB has gone down.
This is a reasonable article on exception handling best practices.
This is a huge topic with lots of unneeded controversy (people with loud voices giving bad info!) If you're willing to deal with that, follow s1mm0t's advice, it is mostly agreeable.
However if you want a one-word answer, put them in the PL. Serious. If you can get away with it put your error handling in a global exception handler (all errors should log and give a code to look up the log in production for security reasons (esp if web), but give the full details back during development for speed reasons).
Edit: (clarification) you have to deal with some errors everywhere - but this is not an 'every function' norm. Most of the time let them bubble up to the PL and handle .NET's global error with your own code: log the full call stack from there via a common routine that is accessible from all 3 layers via event handlers (see EDIT at bottom of message). This means you will not have try/catch sprinkled thru all your code; just sections you expect and error and can handle it right there, or, non-critical sections, with which you log the error and inform the user of unavailable functionality (this is even more rare and for super-reliable/critical programs)
Aside from that, when working with limited-resource items, I often use the 'using' keyword or try/finally/end try without the catch. for multithreading lock/mutex/re-entry prevention flags/etc. you also need try/finally in ALL cases so your program still works (especially stateful apps).
If you're using exceptions improperly (e.g., to deal with non-bugs when you should be using the IF statement or checking it an iffy operation will work before you try it), this philosophy will fall apart more.
A side note, in thick client apps especially when there is the possibility of losing significant amounts or the users' input, you may be better with more try/catches where you attempt to save the data (marked as not-yet-valid of course).
EDIT: another need for at least having the logging routine in the PL - this will work differently depending on the platform. An app we're working on shares the BLL/DAL with 3 PL versions: an ASP.Net version, a winforms version, and a console app batch mode regression testing version. The logging routine that is called is actually in the BLL (the DAL only throws errors or totally handles any it gets or re-throws them). However this raises an event that is handled by the PL; on the web it puts it in the server's log and does web-style error message display (friendly message for production); in WinForms a special message window appears with tech support info, etc. and logs the error behind the scenes (developers can do something 'secret' to see the full info). And of course in the testing version it is a much simpler process but different as well.
Not sure how I'd have done that in the BLL except for passing a parameter 'what platform,' but since it doesn't include winforms or asp libraries that the logging depends on, that still would be a trick.
The layer that knows what to do to set things right should be the layer that handles the exception.
For example, if you decide to handle deadlock errors by retrying the query a certain number of times, then you could build that into your DAL. If it continues to fail, then you might want to let the exception bubble up to the next layer, which can then decide if it knows how to properly handle this exception.
All layers in your application should manage exceptions gracefullly. This is know as a cross cutting corncern, because it appears in all your layers.
I belive that using a framework like Enterprise Exception Block with unity, you will end up with a better code overall.
Take a look at this post
http://msdn.microsoft.com/en-us/library/ff664698(v=PandP.50).aspx
It will take sometime to master it, but there are lots of examples and screencast around there.
How to handle exceptions depends on technical and business needs. For complex or highly important database updates I include out params that pass a small list of known errors backup to the DL. This way, known error scenarios can be programmatically solved in some cases. In other cases the error needs to be logged and the user should be notified of an error.
I make a practice of notifying a human being of errors. Sure, logging will give us detailed information, but it's no replacement for the response time of a human being. Not only that, but why force developers to watch system logs just to see if things are going south? Talk about unnecessary cost.
If you have time to define potential errors/exceptions and programmatically solve them, then by all means do it. Many times errors/exceptions are unexpected. That's why it is important to be prepared for that unexpected and what better way to do that than involving a human being.
Overall, one should be on the defensive when planning exception handling. Programs grow or they die. A part of growing is introducing bugs. So don't spin your wheels trying to kill them all.
The question to you is where is the exception relevant? If it is a data access exception it should be caught in the DAL. If it is a logic exception it should be caught in the BLL. If it is a presentation exception then in the PL.
For instance if your DAL throws an exception it should return a null or a false or whatever the case may be to your BLL. Your BLL should know what to do if the DAL returns a null, maybe it passes it right through, maybe it tries calling another function, etc. The same goes with your PL if the BLL passes through a null from the DAL or returns something specific of its own then the presentation layer should be able to notify the end user that there was an issue.
Of course you won't get the verbose exception messages, but that is a good thing as far as your users are concerned. You should have a flexible logging system to catch these exceptions and report them to a database or an ip:port or whatever you decide.
Essentially you need to think in terms of separation of concerns if the concern is a data issue or a logic issue it should be handled accordingly.
I know I'm opening myself to a royal flaming by even asking this, but I thought I would see if StackOverflow has any solutions to a problem that I'm having...
I have a C# application that is failing at a client site in a way that I am unable to reproduce locally. Unfortunately, it is very difficult (impossible) for me to get any information that at all helps in isolating the source of the problem.
I have in place a rather extensive error monitoring framework which is watching for unhandled exceptions in all the usual places:
Backstop exception handler in threads I control
Application.ThreadException for WinForms exceptions
AppDomain.CurrentDomain.UnhandledException
Which logs detailed information in a place where I have access to them.
This has been very useful in the past to identify issues in production code, but has not been giving me any information at about the current series of issues.
My best guess is that the core issue is one of the "rude" exception types (thread abort, out of memory, stack overflow, access violation, etc.) that are escalating to a rude shutdown that are ripping down the process before I have a chance to see what is going on.
Is there anything that I can be doing to snapshot information as my process is crashing that would be useful? Ideally, I would be able to write out my custom log format, but I would be happy if I could have a reliable way of ensuring that a crash dump is written somewhere.
I was hoping that I could implement class deriving from CriticalFinalizerObject and have it spit a last-chance error log out when it is disposing, but that doesn't seem to be triggered in the StackOverflow scenario which I tested.
I am unable to use Windows Error Reporting and friends due to the lack of a code signing certificate.
I'm not trying to "recover" from arbitrary exceptions, I'm just trying to make a note of what went wrong on the way down.
Any ideas?
You could try creating a minidump file. This is a C++ API, but it should be possible to write a small C++ program that starts your application keeps a handle to the process, waits on the process handle, and then uses the process handle to create a minidump when the application dies.
If you have done what you claim:
Try-Catch on the Application.Run
Unhandled Domain Exceptions
Unhandled Thread Exceptions
Try Catch handlers in all threads
Then you would have caught the exception except perhaps if it is being thrown by a third party or COM component.
You certainly haven't given enough information.
What events does the client say leads up to the exception?
What COM or third party components do you use? (Do you properly instance and reference these components? Do you pass valid arguments to COM function calls?)
Do you make use of any un-managed - un-safe code?
Are you positive that you have all throw-capable calls covered with try-catch?
I'm just saying that no-one can offer you any helpful advice unless you post a heck of lot more information and even at that we probably can only speculate as to the source of you problem.
Have a set of fresh eyes look at your code.
Some errors cannot be caught by logging.
See this similar question for more details:
StackOverflowException in .NET
Here's a link explaining asynchronous exceptions (and why you can't recover from them):
http://www.bluebytesoftware.com/blog/PermaLink.aspx?guid=c1898a31-a0aa-40af-871c-7847d98f1641
What is Environment.FailFast?
How is it useful?
It is used to kill an application. It's a static method that will instantly kill an application without being caught by any exception blocks.
Environment.FastFail(String) can
actually be a great debugging tool.
For example, say you have an
application that is just downright
giving you some weird output. You have
no idea why. You know it's wrong, but
there are just no exceptions bubbling
to the surface to help you out. Well,
if you have access to Visual Studio
2005's Debug->Exceptions... menu item,
you can actually tell Visual Studio to
allow you to see those first chance
exceptions. If you don't have that,
however you can put
Environment.FastFail(String) in an
exception, and use deductive reasoning
and process of elimination to find out
where your problem in.
Reference
It also creates a dump and event viewer entry, which might be useful.
It's a way to immediately exit your application without throwing an exception.
Documentation is here.
Might be useful in some security or data-critical contexts.
Failfast can be used in situations where you might be endangering the user's data. Say in a database engine, when you detect a corruption of your internal data structures, the only sane course of action is to halt the process as quickly as possible, to avoid writing garbage to the database and risk corrupting it and lose the user's data. This is one possible scenario where failfast is useful.
Another use is to catch programmer errors. Say you are writing a library and some function accepts a pointer that cannot be null in any circumstance, that is, if it's null, you are clearly in presence of a programmer error. You can return an error like E_POINTER or throw some InvalidArgument exception and hope someone notices, but you'll get their attention better by failing fast :-)
Note that I'm not restricting the example to pointers, you can generalize to any parameter or condition that should never happen. Failing fast ultimately results in better quality apps, as many bugs no longer go unnoticed.
Finally, failing fast helps with capturing the state of the process as faithfully as possible (as a memory dump gets created), in particular when failing fast immediately upon detecting an unrecoverable error or a really unexpected condition.
If the process was allowed to continue, say the 'finally' clauses would run or the stack would be unwound, and things would get destroyed or disposed-of, before a memory dump is taken, then the state of the process might be altered in such as way that makes it much more difficult to diagnose the root cause of the problem.
It kills the application and even skips try/finally blocks.
From .NET Framework Design Guidelines on Exception Throwing:
✓ CONSIDER terminating the process by calling System.Environment.FailFast (.NET Framework 2.0 feature) instead of throwing an exception if your code encounters a situation where it is unsafe for further execution.
Joe Duffy discusses failing fast and the discipline to make it useful, here.
http://joeduffyblog.com/2014/10/13/if-youre-going-to-fail-do-it-fast/
Essentially, he's saying that for programming bugs - i.e. unexpected errors that are the fault of the programmer and not the programme user or other inputs or situations that can be reasonable expected to be bad - then deciding to always fail fast for unexpected errors has been seen to improve code quality.
I think since its an optional team decision and discipline, use of this API in C# is rare since in reality we're all mostly writing LoB apps for 12 people in HR or an online shop at best.
So for us, we'd maybe use this when we want deny the consumer of our API the opportunity of making any further moves.
An unhandled exception that is thrown (or rethrown) within a Task won't take effect until the Task is garbage-collected, at some perhaps-random time later.
This method lets you crash the process now -- see this answer.