Quick and simple question:
Is this:
private static void SetupConnection()
{
try
{
TcpClient client = new TcpClient(myServer, myPort);
//Do whatever...
}
catch (SocketException)
{
//Server is closed. Retry in 10 minutes.
Thread.Sleep(600000);
SetupConnection();
}
a viable alternative to this:
private static void SetupConnection()
{
while (true)
{
try
{
TcpClient client = new TcpClient(myServer, myPort);
//Do whatever...
break;
}
catch (SocketException)
{
//Server is closed. Retry in 10 minutes.
Thread.Sleep(600000);
}
}
}
While the second one looks "cleaner" I am still rather curious if the first one is also acceptable – and if it isn't, then why not?
Recursion is bad in this case, because if your program runs for too long and connection is retried, you will eventually hit a StackOverflowException.
Why Recursion is Bad ?
You will need to understand what is call stack.
This is a stack which is maitained in the memory. Every time a new method is called, that method's reference and parmaeters are added in this stack. This stack will be maintained in memory (RAM).
With recursion, if you keep on calling method without any boundary condition, the stack will keep on growing.
After some time, it would be in a state where it cannot accept any further entries because there is not enough memory to hold it.
That's when you will get "Stack Overflow Exception".
Can we rewrite every recursive algorithm without using recursion ?
Yes you can. The aproach is generally called as "Iterative approach".
In every case, you can use an auxillary stack / list / group of variables - to hold the parameters you were using in recursion.
Then you can iterate over these variables, until you reach the boundary condition.
That way, you can achieve the same result without calling your method again and again.
It is always better to use this approach.
Then why people write recursive algorithms ?
Sometimes, recursive algorithm is very easy to read. Iterative approach code may not be easy to read and that's why people try to write recursive code many times.
You can decide whether to use iterative approach or recursive approach based on two things:
Input samples
Code maintainability
If you still want to write recursive method, DO NOT forget to add the boundary condition after which recursion will stop.
For ex. Tree travesal code (pre order, in order , post order) is easier to understand if you write recursive algorithm.
This will work fine as long as you have some limit on number of nodes / levels in the tree. If you already know that your tree is very huge, probably you would go for iterative aproach.
Hope this helps you to understand these approaches better.
We inherited a somewhat poorly-designed WCF service that we want to improve. One problem with it is that it has over a hundred methods (on two different interfaces), most of which we suspect are not used. We decided to put some logging on each of the methods to track when and how they're called. To make the tracing code refactor-friendly and typo-proof, we implemented it like so:
public void LogUsage()
{
try
{
MethodBase callingMethod = new StackTrace().GetFrame(1).GetMethod();
string interfaceName = callingMethod.DeclaringType.GetInterfaces()[0].Name;
_loggingDao.LogUsage(interfaceName, callingMethod.Name, GetClientAddress(), GetCallingUrl());
}
catch (Exception exception)
{
_legacyLogger.Error("Error in usage tracking", exception);
}
}
LogUsage() is then called at the start of each method we want to trace.
The service is very high traffic, on the order of 500,000+ calls/day. 99.95% of the time, this code executes beautifully. But the other 0.05% of the time, GetInterfaces() returns an empty (but not null) array.
Why would GetInterfaces() occasionally return inconsistent results?
This may seem so trivial - a 0.05% error rate is something we can usually only dream of. But the whole point is to identify all the service touchpoints, and if this error is always coming out of one (or a few) method calls, then our tracing is incomplete. I've tried to reproduce this error in my development environment by calling each and every method on the service, but to no avail.
StackTrace is notoriously unreliable, especially in multi-threaded environments. Or rather, it is highly reliable, but isn't very practical at times. Asking for the 'last method that was called' can have unexpected results. Try logging the DeclaringType. You might be surprised what you find there. Note that while this is a 0.05% failure rate now, it might easily increase with the complexity of your application.
In order to properly implement reusable tracing code, you'll need to rely on the .NET 4.5 feature Caller Information, by using a dynamic proxy (e.g. Castle Dynamic Proxy), or by using an AOP framework such as PostSharp. Alternatively, you can just code tracing by hand.
From Erik Lippert (who works on the C# compiler team for MS) in response to Getting Type T from a StackFrame:
The stack frame does not actually tell you who called your method. The
stack frame tells you where control is going to return to. The stack
frame is the reification of continuation. The fact that who called the
method and where control will return to are almost always the same
thing is the source of your confusion, but I assure you that they need
not be the same.
The whole post is worth reading...
I would like to write a vulnerable program, to better understand Stack Overflow (causes) in c#, and also for educational purposes. Basically, I "just" want a stack overflow, that overwrites the EIP, so I get control over it and can point to my own code.
My problem is: Which objects do use the stack as memory location?
For example: the Program parses a text file with recursive bytewise reading until a line break is found (yeah, I think nobody would do this, but as this is only for learning...). Currently, I'm appending a string with the hex value of chars in a text file. This string is a field of an object that is instanciated after calling main().
Using WinDbg, I got these values after the stack has overflown from (nearly) endless recursion:
(14a0.17e0): Break instruction exception - code 80000003 (first chance)
eax=00000000 ebx=00000000 ecx=0023f618 edx=778570b4 esi=fffffffe edi=00000000
eip=778b04f6 esp=0023f634 ebp=0023f660 iopl=0
BTW I'm using a Win7x86 AMD machine, if this is from interest.
I've seen many C++ examples causing a stack overflow using strcpy, is there any similar method in c#?
Best Regards,
NoMad
edit: I use this code to cause the stack overflow.
class FileTest
{
FileStream fs = new FileStream("test.txt", FileMode.Open, FileAccess.Read);
string line = String.Empty;
public FileTest()
{
Console.WriteLine(ReadTillBreak());
}
private string ReadTillBreak()
{
int b = 0;
b = fs.ReadByte();
line += (char)b;
if (b != 13)
ReadTillBreak();
return line;
}
}
Is it possible to overflow the stack and write into the eip with the line string (so, content of test.txt)?
The reason you can do exploit stack corrupts in C and C++ is because you handle memory yourself and the language allows you to do all sorts of crazy stuff. C# runs in an environment that is specifically designed to prevent a lot of these problems. I.e. while you can easily generate a stack overflow in C# there's no way that you can modify the control flow of the program that way using managed code.
The way exploits against managed environments usually work is by breaking out of the sandbox so to speak. As long as the code runs in the sandbox there are a lot of these tricks that will simply not work.
If you want to learn about stack corruption I suggest you stick to C or C++.
I'm not entirely clear on you descriptions of what you have tried. Stack overflows do not generally "overwrite the EIP".
To cause a stack overflow, the most straight forward way is something like this.
void RecursiveMethod()
{
RecursiveMethod();
}
Since each call to this method stores the return address on the stack, calling it endlessly like this without returning will eventually use up all stack space. Of course, modern Windows applications have tons of stack space so it could take a while. You could increase the amount of stack usage for each call by adding arguments or local variables within the method.
I've been poking around mscorlib to see how the generic collection optimized their enumerators and I stumbled on this:
// in List<T>.Enumerator<T>
public bool MoveNext()
{
List<T> list = this.list;
if ((this.version == list._version) && (this.index < list._size))
{
this.current = list._items[this.index];
this.index++;
return true;
}
return this.MoveNextRare();
}
The stack size is 3, and the size of the bytecode should be 80 bytes. The naming of the MoveNextRare method got me on my toes and it contains an error case as well as an empty collection case, so obviously this is breaching separation of concern.
I assume the MoveNext method is split this way to optimize stack space and help the JIT, and I'd like to do the same for some of my perf bottlenecks, but without hard data, I don't want my voodoo programming turning into cargo-cult ;)
Thanks!
Florian
If you're going to think about ways in which List<T>.Enumerator is "odd" for the sake of performance, consider this first: it's a mutable struct. Feel free to recoil with horror; I know I do.
Ultimately, I wouldn't start mimicking optimisations from the BCL without benchmarking/profiling what difference they make in your specific application. It may well be appropriate for the BCL but not for you; don't forget that the BCL goes through the whole NGEN-alike service on install. The only way to find out what's appropriate for your application is to measure it.
You say you want to try the same kind of thing for your performance bottlenecks: that suggests you already know the bottlenecks, which suggests you've got some sort of measurement in place. So, try this optimisation and measure it, then see whether the gain in performance is worth the pain of readability/maintenance which goes with it.
There's nothing cargo-culty about trying something and measuring it, then making decisions based on that evidence.
Separating it into two functions has some advantages:
If the method were to be inlined, only the fast path would be inlined and the error handling would still be a function call. This prevents inlining from costing too much extra space. But 80 bytes of IL is probably still above the threshold for inlining (it was once documented as 32 bytes, don't know if it's changed since .NET 2.0).
Even if it isn't inlined, the function will be smaller and fit within the CPU's instruction cache more easily, and since the slow path is separate, it won't have to be fetched into cache every time the fast path is.
It may help the CPU branch predictor optimize for the more common path (returning true).
I think that MoveNextRare is always going to return false, but by structuring it like this it becomes a tail call, and if it's private and can only be called from here then the JIT could theoretically build a custom calling convention between these two methods that consists of just a jmp instruction with no prologue and no duplication of epilogue.
So, I know that try/catch does add some overhead and therefore isn't a good way of controlling process flow, but where does this overhead come from and what is its actual impact?
Three points to make here:
Firstly, there is little or NO performance penalty in actually having try-catch blocks in your code. This should not be a consideration when trying to avoid having them in your application. The performance hit only comes into play when an exception is thrown.
When an exception is thrown in addition to the stack unwinding operations etc that take place which others have mentioned you should be aware that a whole bunch of runtime/reflection related stuff happens in order to populate the members of the exception class such as the stack trace object and the various type members etc.
I believe that this is one of the reasons why the general advice if you are going to rethrow the exception is to just throw; rather than throw the exception again or construct a new one as in those cases all of that stack information is regathered whereas in the simple throw it is all preserved.
I'm not an expert in language implementations (so take this with a grain of salt), but I think one of the biggest costs is unwinding the stack and storing it for the stack trace. I suspect this happens only when the exception is thrown (but I don't know), and if so, this would be decently sized hidden cost every time an exception is thrown... so it's not like you are just jumping from one place in the code to another, there is a lot going on.
I don't think it's a problem as long as you are using exceptions for EXCEPTIONAL behavior (so not your typical, expected path through the program).
Are you asking about the overhead of using try/catch/finally when exceptions aren't thrown, or the overhead of using exceptions to control process flow? The latter is somewhat akin to using a stick of dynamite to light a toddler's birthday candle, and the associated overhead falls into the following areas:
You can expect additional cache misses due to the thrown exception accessing resident data not normally in the cache.
You can expect additional page faults due to the thrown exception accessing non-resident code and data not normally in your application's working set.
for example, throwing the exception will require the CLR to find the location of the finally and catch blocks based on the current IP and the return IP of every frame until the exception is handled plus the filter block.
additional construction cost and name resolution in order to create the frames for diagnostic purposes, including reading of metadata etc.
both of the above items typically access "cold" code and data, so hard page faults are probable if you have memory pressure at all:
the CLR tries to put code and data that is used infrequently far from data that is used frequently to improve locality, so this works against you because you're forcing the cold to be hot.
the cost of the hard page faults, if any, will dwarf everything else.
Typical catch situations are often deep, therefore the above effects would tend to be magnified (increasing the likelihood of page faults).
As for the actual impact of the cost, this can vary a lot depending on what else is going on in your code at the time. Jon Skeet has a good summary here, with some useful links. I tend to agree with his statement that if you get to the point where exceptions are significantly hurting your performance, you have problems in terms of your use of exceptions beyond just the performance.
Contrary to theories commonly accepted, try/catch can have significant performance implications, and that's whether an exception is thrown or not!
It disables some automatic optimisations (by design), and in some cases injects debugging code, as you can expect from a debugging aid. There will always be people who disagree with me on this point, but the language requires it and the disassembly shows it so those people are by dictionary definition delusional.
It can impact negatively upon maintenance. This is actually the most significant issue here, but since my last answer (which focused almost entirely on it) was deleted, I'll try to focus on the less significant issue (the micro-optimisation) as opposed to the more significant issue (the macro-optimisation).
The former has been covered in a couple of blog posts by Microsoft MVPs over the years, and I trust you could find them easily yet StackOverflow cares so much about content so I'll provide links to some of them as filler evidence:
Performance implications of try/catch/finally (and part two), by Peter Ritchie explores the optimisations which try/catch/finally disables (and I'll go further into this with quotes from the standard)
Performance Profiling Parse vs. TryParse vs. ConvertTo by Ian Huff states blatantly that "exception handling is very slow" and demonstrates this point by pitting Int.Parse and Int.TryParse against each other... To anyone who insists that TryParse uses try/catch behind the scenes, this ought to shed some light!
There's also this answer which shows the difference between disassembled code with- and without using try/catch.
It seems so obvious that there is an overhead which is blatantly observable in code generation, and that overhead even seems to be acknowledged by people who Microsoft value! Yet I am, repeating the internet...
Yes, there are dozens of extra MSIL instructions for one trivial line of code, and that doesn't even cover the disabled optimisations so technically it's a micro-optimisation.
I posted an answer years ago which got deleted as it focused on the productivity of programmers (the macro-optimisation).
This is unfortunate as no saving of a few nanoseconds here and there of CPU time is likely to make up for many accumulated hours of manual optimisation by humans. Which does your boss pay more for: an hour of your time, or an hour with the computer running? At what point do we pull the plug and admit that it's time to just buy a faster computer?
Clearly, we should be optimising our priorities, not just our code! In my last answer I drew upon the differences between two snippets of code.
Using try/catch:
int x;
try {
x = int.Parse("1234");
}
catch {
return;
}
// some more code here...
Not using try/catch:
int x;
if (int.TryParse("1234", out x) == false) {
return;
}
// some more code here
Consider from the perspective of a maintenance developer, which is more likely to waste your time, if not in profiling/optimisation (covered above) which likely wouldn't even be necessary if it weren't for the try/catch problem, then in scrolling through source code... One of those has four extra lines of boilerplate garbage!
As more and more fields are introduced into a class, all of this boilerplate garbage accumulates (both in source and disassembled code) well beyond reasonable levels. Four extra lines per field, and they're always the same lines... Were we not taught to avoid repeating ourselves? I suppose we could hide the try/catch behind some home-brewed abstraction, but... then we might as well just avoid exceptions (i.e. use Int.TryParse).
This isn't even a complex example; I've seen attempts at instantiating new classes in try/catch. Consider that all of the code inside of the constructor might then be disqualified from certain optimisations that would otherwise be automatically applied by the compiler. What better way to give rise to the theory that the compiler is slow, as opposed to the compiler is doing exactly what it's told to do?
Assuming an exception is thrown by said constructor, and some bug is triggered as a result, the poor maintenance developer then has to track it down. That might not be such an easy task, as unlike the spaghetti code of the goto nightmare, try/catch can cause messes in three dimensions, as it could move up the stack into not just other parts of the same method, but also other classes and methods, all of which will be observed by the maintenance developer, the hard way! Yet we are told that "goto is dangerous", heh!
At the end I mention, try/catch has its benefit which is, it's designed to disable optimisations! It is, if you will, a debugging aid! That's what it was designed for and it's what it should be used as...
I guess that's a positive point too. It can be used to disable optimizations that might otherwise cripple safe, sane message passing algorithms for multithreaded applications, and to catch possible race conditions ;) That's about the only scenario I can think of to use try/catch. Even that has alternatives.
What optimisations do try, catch and finally disable?
A.K.A
How are try, catch and finally useful as debugging aids?
they're write-barriers. This comes from the standard:
12.3.3.13 Try-catch statements
For a statement stmt of the form:
try try-block
catch ( ... ) catch-block-1
...
catch ( ... ) catch-block-n
The definite assignment state of v at the beginning of try-block is the same as the definite assignment state of v at the beginning of stmt.
The definite assignment state of v at the beginning of catch-block-i (for any i) is the same as the definite assignment state of v at the beginning of stmt.
The definite assignment state of v at the end-point of stmt is definitely assigned if (and only if) v is definitely assigned at the end-point of try-block and every catch-block-i (for every i from 1 to n).
In other words, at the beginning of each try statement:
all assignments made to visible objects prior to entering the try statement must be complete, which requires a thread lock for a start, making it useful for debugging race conditions!
the compiler isn't allowed to:
eliminate unused variable assignments which have definitely been assigned to before the try statement
reorganise or coalesce any of it's inner-assignments (i.e. see my first link, if you haven't already done so).
hoist assignments over this barrier, to delay assignment to a variable which it knows won't be used until later (if at all) or to pre-emptively move later assignments forward to make other optimisations possible...
A similar story holds for each catch statement; suppose within your try statement (or a constructor or function it invokes, etc) you assign to that otherwise pointless variable (let's say, garbage=42;), the compiler can't eliminate that statement, no matter how irrelevant it is to the observable behaviour of the program. The assignment needs to have completed before the catch block is entered.
For what it's worth, finally tells a similarly degrading story:
12.3.3.14 Try-finally statements
For a try statement stmt of the form:
try try-block
finally finally-block
• The definite assignment state of v at the beginning of try-block is the same as the definite assignment state of v at the beginning of stmt.
• The definite assignment state of v at the beginning of finally-block is the same as the definite assignment state of v at the beginning of stmt.
• The definite assignment state of v at the end-point of stmt is definitely assigned if (and only if) either:
o v is definitely assigned at the end-point of try-block
o v is definitely assigned at the end-point of finally-block
If a control flow transfer (such as a goto statement) is made that begins within try-block, and ends outside of try-block, then v is also considered definitely assigned on that control flow transfer if v is definitely assigned at the end-point of finally-block. (This is not an only if—if v is definitely assigned for another reason on this control flow transfer, then it is still considered definitely assigned.)
12.3.3.15 Try-catch-finally statements
Definite assignment analysis for a try-catch-finally statement of the form:
try try-block
catch ( ... ) catch-block-1
...
catch ( ... ) catch-block-n
finally finally-block
is done as if the statement were a try-finally statement enclosing a try-catch statement:
try {
try
try-block
catch ( ... ) catch-block-1
...
catch ( ... ) catch-block-n
}
finally finally-block
In my experience the biggest overhead is in actually throwing an exception and handling it. I once worked on a project where code similar to the following was used to check if someone had a right to edit some object. This HasRight() method was used everywhere in the presentation layer, and was often called for 100s of objects.
bool HasRight(string rightName, DomainObject obj) {
try {
CheckRight(rightName, obj);
return true;
}
catch (Exception ex) {
return false;
}
}
void CheckRight(string rightName, DomainObject obj) {
if (!_user.Rights.Contains(rightName))
throw new Exception();
}
When the test database got fuller with test data, this lead to a very visible slowdown while openening new forms etc.
So I refactored it to the following, which - according to later quick 'n dirty measurements - is about 2 orders of magnitude faster:
bool HasRight(string rightName, DomainObject obj) {
return _user.Rights.Contains(rightName);
}
void CheckRight(string rightName, DomainObject obj) {
if (!HasRight(rightName, obj))
throw new Exception();
}
So in short, using exceptions in normal process flow is about two orders of magnitude slower then using similar process flow without exceptions.
Not to mention if it's inside a frequently-called method it may affect the overall behavior of the application.
For example, I consider the use of Int32.Parse as a bad practice in most cases since it throws exceptions for something that can be caught easily otherwise.
So to conclude everything written here:
1) Use try..catch blocks to catch unexpected errors - almost no performance penalty.
2) Don't use exceptions for excepted errors if you can avoid it.
I wrote an article about this a while back because there were a lot of people asking about this at the time. You can find it and the test code at http://www.blackwasp.co.uk/SpeedTestTryCatch.aspx.
The upshot is that there is a tiny amount of overhead for a try/catch block but so small that it should be ignored. However, if you are running try/catch blocks in loops that are executed millions of times, you may want to consider moving the block to outside of the loop if possible.
The key performance issue with try/catch blocks is when you actually catch an exception. This can add a noticeable delay to your application. Of course, when things are going wrong, most developers (and a lot of users) recognise the pause as an exception that is about to happen! The key here is not to use exception handling for normal operations. As the name suggests, they are exceptional and you should do everything you can to avoid them being thrown. You should not use them as part of the expected flow of a program that is functioning correctly.
I made a blog entry about this subject last year.
Check it out. Bottom line is that there is almost no cost for a try block if no exception occurs - and on my laptop, an exception was about 36μs. That might be less than you expected, but keep in mind that those results where on a shallow stack. Also, first exceptions are really slow.
It is vastly easier to write, debug, and maintain code that is free of compiler error messages, code-analysis warning messages, and routine accepted exceptions (particularly exceptions that are thrown in one place and accepted in another). Because it is easier, the code will on average be better written and less buggy.
To me, that programmer and quality overhead is the primary argument against using try-catch for process flow.
The computer overhead of exceptions is insignificant in comparison, and usually tiny in terms of the application's ability to meet real-world performance requirements.
I really like Hafthor's blog post, and to add my two cents to this discussion, I'd like to say that, it's always been easy for me to have the DATA LAYER throw only one type of exception (DataAccessException). This way my BUSINESS LAYER knows what exception to expect and catches it. Then depending on further business rules (i.e. if my business object participates in the workflow etc), I may throw a new exception (BusinessObjectException) or proceed without re/throwing.
I'd say don't hesitate to use try..catch whenever it is necessary and use it wisely!
For example, this method participates in a workflow...
Comments?
public bool DeleteGallery(int id)
{
try
{
using (var transaction = new DbTransactionManager())
{
try
{
transaction.BeginTransaction();
_galleryRepository.DeleteGallery(id, transaction);
_galleryRepository.DeletePictures(id, transaction);
FileManager.DeleteAll(id);
transaction.Commit();
}
catch (DataAccessException ex)
{
Logger.Log(ex);
transaction.Rollback();
throw new BusinessObjectException("Cannot delete gallery. Ensure business rules and try again.", ex);
}
}
}
catch (DbTransactionException ex)
{
Logger.Log(ex);
throw new BusinessObjectException("Cannot delete gallery.", ex);
}
return true;
}
We can read in Programming Languages Pragmatics by Michael L. Scott that the nowadays compilers do not add any overhead in common case, this means, when no exceptions occurs. So every work is made in compile time.
But when an exception is thrown in run-time, compiler needs to perform a binary search to find the correct exception and this will happen for every new throw that you made.
But exceptions are exceptions and this cost is perfectly acceptable. If you try to do Exception Handling without exceptions and use return error codes instead, probably you will need a if statement for every subroutine and this will incur in a really real time overhead. You know a if statement is converted to a few assembly instructions, that will performed every time you enter in your sub-routines.
Sorry about my English, hope that it helps you. This information is based on cited book, for more information refer to Chapter 8.5 Exception Handling.
Let us analyse one of the biggest possible costs of a try/catch block when used where it shouldn't need to be used:
int x;
try {
x = int.Parse("1234");
}
catch {
return;
}
// some more code here...
And here's the one without try/catch:
int x;
if (int.TryParse("1234", out x) == false) {
return;
}
// some more code here
Not counting the insignificant white-space, one might notice that these two equivelant pieces of code are almost exactly the same length in bytes. The latter contains 4 bytes less indentation. Is that a bad thing?
To add insult to injury, a student decides to loop while the input can be parsed as an int. The solution without try/catch might be something like:
while (int.TryParse(...))
{
...
}
But how does this look when using try/catch?
try {
for (;;)
{
x = int.Parse(...);
...
}
}
catch
{
...
}
Try/catch blocks are magical ways of wasting indentation, and we still don't even know the reason it failed! Imagine how the person doing debugging feels, when code continues to execute past a serious logical flaw, rather than halting with a nice obvious exception error. Try/catch blocks are a lazy man's data validation/sanitation.
One of the smaller costs is that try/catch blocks do indeed disable certain optimizations: http://msmvps.com/blogs/peterritchie/archive/2007/06/22/performance-implications-of-try-catch-finally.aspx. I guess that's a positive point too. It can be used to disable optimizations that might otherwise cripple safe, sane message passing algorithms for multithreaded applications, and to catch possible race conditions ;) That's about the only scenario I can think of to use try/catch. Even that has alternatives.