Replacing while-loop with recursion - c#

Quick and simple question:
Is this:
private static void SetupConnection()
{
try
{
TcpClient client = new TcpClient(myServer, myPort);
//Do whatever...
}
catch (SocketException)
{
//Server is closed. Retry in 10 minutes.
Thread.Sleep(600000);
SetupConnection();
}
}
a viable alternative to this:
private static void SetupConnection()
{
while (true)
{
try
{
TcpClient client = new TcpClient(myServer, myPort);
//Do whatever...
break;
}
catch (SocketException)
{
//Server is closed. Retry in 10 minutes.
Thread.Sleep(600000);
}
}
}
While the second one looks "cleaner" I am still rather curious if the first one is also acceptable – and if it isn't, then why not?

Recursion is a poor fit in this case: every failed attempt adds another stack frame that is not released until a connection finally succeeds, so if the server stays down long enough you will eventually hit a StackOverflowException.
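As a minimal sketch of the loop-based alternative, here is a retry helper that also puts an upper bound on the number of attempts. The names Retry and UntilSuccess are hypothetical, and the connection attempt is passed in as a delegate returning true on success (an assumption made so the sketch runs without a real server; in the question it would wrap the TcpClient constructor and catch SocketException).

```csharp
using System;
using System.Threading;

public static class Retry
{
    // Calls attempt() until it returns true or maxAttempts is reached.
    // Returns the number of the successful attempt, or -1 if all failed.
    public static int UntilSuccess(Func<bool> attempt, int maxAttempts, TimeSpan delay)
    {
        for (int i = 1; i <= maxAttempts; i++)
        {
            if (attempt())
                return i;            // same stack depth as on the very first try

            if (i < maxAttempts)
                Thread.Sleep(delay); // wait before the next attempt
        }
        return -1;
    }
}
```

However many times the loop runs, the call stack stays at the same depth, which is the property the recursive version lacks.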

Why is recursion bad here?
To answer that, you need to understand the call stack.
The call stack is a region of memory maintained for each thread. Every time a method is called, a frame holding its return address, parameters and local variables is pushed onto this stack; the frame is popped again when the method returns.
With recursion, if you keep calling the method without ever reaching a boundary condition, the stack keeps growing.
After some time, it reaches a state where it cannot accept any further frames, because the stack has a fixed maximum size (commonly about 1 MB per thread by default on Windows).
That is when you get a StackOverflowException.
Can we rewrite every recursive algorithm without using recursion?
Yes, you can. This is generally called the "iterative approach".
In every case, you can use an auxiliary stack, list or group of variables to hold the parameters you were passing in the recursion.
You then iterate over these variables until you reach the boundary condition.
That way, you achieve the same result without calling your method again and again.
This approach is the safer choice whenever the recursion depth is unbounded or not known in advance.
Then why do people write recursive algorithms?
Sometimes a recursive algorithm is much easier to read. The iterative version may be harder to follow, which is why people often prefer to write recursive code.
You can decide between the iterative and the recursive approach based on two things:
Input samples
Code maintainability
If you still want to write a recursive method, DO NOT forget to add the boundary condition at which the recursion stops.
For example, tree traversal code (pre-order, in-order, post-order) is easier to understand when written recursively.
This works fine as long as there is some limit on the number of nodes / levels in the tree. If you already know that your tree can be very deep, you would probably go for the iterative approach.
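To make the comparison concrete, here is a small sketch of a pre-order traversal written both ways. The Node and Traversal types are made up for this illustration; the iterative version keeps its pending nodes in a heap-allocated Stack<T> instead of on the call stack.

```csharp
using System.Collections.Generic;

public sealed class Node
{
    public int Value;
    public Node Left, Right;

    public Node(int value, Node left = null, Node right = null)
    {
        Value = value;
        Left = left;
        Right = right;
    }
}

public static class Traversal
{
    // Recursive pre-order: short, but uses one call-stack frame per tree level.
    public static void PreOrderRecursive(Node node, List<int> output)
    {
        if (node == null) return;   // boundary condition
        output.Add(node.Value);
        PreOrderRecursive(node.Left, output);
        PreOrderRecursive(node.Right, output);
    }

    // Iterative pre-order: same visiting order, constant call-stack depth.
    public static List<int> PreOrderIterative(Node root)
    {
        var output = new List<int>();
        var pending = new Stack<Node>();
        if (root != null) pending.Push(root);

        while (pending.Count > 0)
        {
            Node node = pending.Pop();
            output.Add(node.Value);
            // Push right first so that the left child is popped (visited) first.
            if (node.Right != null) pending.Push(node.Right);
            if (node.Left != null) pending.Push(node.Left);
        }
        return output;
    }
}
```

Both methods visit the nodes in the same order; only the place where the "where do I go next" bookkeeping lives is different.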
Hope this helps you to understand these approaches better.

How to safely iterate over an IAsyncEnumerable to send a collection downstream for message processing in batches

I've watched the chat on LINQ with IAsyncEnumerable, which gave me some insight into extension methods for IAsyncEnumerables, but frankly it wasn't detailed enough for a real-world application, especially at my experience level. I understand that samples and documentation for IAsyncEnumerable don't really exist yet.
I'm trying to read from a file, do some transformation on the stream, returning an IAsyncEnumerable, and then send those objects downstream once an arbitrary number of objects has been obtained, like:
await foreach (var data in ProcessBlob(downloadedFile))
{
//todo add data to List<T> called listWithPreConfiguredNumberOfElements
if (listWithPreConfiguredNumberOfElements.Count == preConfiguredNumber)
await _messageHandler.Handle(listWithPreConfiguredNumberOfElements);
//repeat the behaviour till all the elements in the IAsyncEnumerable returned by ProcessBlob are sent downstream to the _messageHandler.
}
My understanding from reading on the matter so far is that the await foreach line is working on data that employs the use of Tasks (or ValueTasks), so we don't have a count up front. I'm also hesitant to use a List variable and just do a length-check on that as sharing that data across threads doesn't seem very thread-safe.
I'm using the System.Linq.Async package in the hope that I can use a relevant extension method. I can see some promise in TakeWhile, but I don't fully understand how thread-safe my intended approach is, which is making me lose confidence.
Any help or push in the right direction would be massively appreciated, thank you.
There is an operator Buffer that does what you want, in the package System.Interactive.Async.
// Projects each element of an async-enumerable sequence into consecutive
// non-overlapping buffers which are produced based on element count information.
public static IAsyncEnumerable<IList<TSource>> Buffer<TSource>(
this IAsyncEnumerable<TSource> source, int count);
This package contains operators like Amb, Throw, Catch, Defer, Finally etc that do not have a direct equivalent in Linq, but they do have an equivalent in System.Reactive. This is because IAsyncEnumerables are conceptually closer to IObservables than to IEnumerables (because both have a time dimension, while IEnumerables are timeless).
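If you would rather not take the package dependency, a count-based buffer is small enough to hand-roll. The following is only a sketch of the idea, not the library's implementation; InBatches, Numbers and BatchSizes are hypothetical names, and Numbers stands in for ProcessBlob from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncBatching
{
    // Groups the source into consecutive, non-overlapping lists of up to 'size' items.
    // Note: as an async iterator, the argument check runs on first enumeration.
    public static async IAsyncEnumerable<List<T>> InBatches<T>(
        this IAsyncEnumerable<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        var batch = new List<T>(size);
        await foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == size)
            {
                yield return batch;
                batch = new List<T>(size); // fresh list, so the consumer may keep the old one
            }
        }
        if (batch.Count > 0)
            yield return batch;            // trailing partial batch
    }

    // Stand-in for a real asynchronous source.
    public static async IAsyncEnumerable<int> Numbers(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            await Task.Yield();
            yield return i;
        }
    }

    // Consumes the batches and records how large each one was.
    public static async Task<List<int>> BatchSizes(int count, int size)
    {
        var sizes = new List<int>();
        await foreach (var batch in Numbers(count).InBatches(size))
            sizes.Add(batch.Count);
        return sizes;
    }
}
```

With 7 items and a batch size of 3, the consumer sees batches of 3, 3 and 1 items, so the leftover elements are not lost.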
I'm also hesitant to use a List variable and just do a length-check on that as sharing that data across threads doesn't seem very thread-safe.
You need to think in terms of execution flows, not threads, when dealing with async. Since you are await-ing the processing step, there isn't actually a concurrency problem in accessing the list: regardless of which threads are used, the list is only ever accessed by one execution flow at a time.
If you are still concerned, you could new a list per batch, but that is probably overkill. What you do need, however, is two additions - a reset between batches, and a final processing step:
var listWithPreConfiguredNumberOfElements = new List<YourType>(preConfiguredNumber);
await foreach (var data in ProcessBlob(downloadedFile)) // CAF?
{
listWithPreConfiguredNumberOfElements.Add(data);
if (listWithPreConfiguredNumberOfElements.Count == preConfiguredNumber)
{
await _messageHandler.Handle(listWithPreConfiguredNumberOfElements); // CAF?
listWithPreConfiguredNumberOfElements.Clear(); // reset for a new batch
// (replace this with a "new" if you're still concerned about concurrency)
}
}
if (listWithPreConfiguredNumberOfElements.Any())
{ // process any stragglers
await _messageHandler.Handle(listWithPreConfiguredNumberOfElements); // CAF?
}
You might also choose to use ConfigureAwait(false) in the three spots marked // CAF?

Why does the C# compiler not even warn about endless recursion?

A legacy app is in an endless loop at startup; I don't know why/how yet (code obfuscation contest candidate), but regarding the method that's being called over and over (which is called from several other methods), I thought, "I wonder if one of the methods that calls this is also calling another method that also calls it?"
I thought: "Nah, the compiler would be able to figure that out, and not allow it, or at least emit a warning!"
So I created a simple app to prove that would be the case:
public partial class Form1 : Form
{
public Form1()
{
InitializeComponent();
}
private void button1_Click(object sender, EventArgs e)
{
method1();
}
private void button2_Click(object sender, EventArgs e)
{
method2();
}
private void method1()
{
MessageBox.Show("method1 called, which will now call method2");
method2();
}
private void method2()
{
MessageBox.Show("method2 called, which will now call method1");
// Note to self: Write an article entitled, "Copy-and-Paste Considered Harmful"
method1();
}
}
...but no! It compiles just fine. Why wouldn't the compiler flag this code as questionable at best? If either button is mashed, you are in never-never land!
Okay, sometimes you may want an endless loop (pacemaker code, etc.), but still I think a warning should be emitted.
As you said, sometimes people want infinite loops. And the .NET JIT compiler can apply tail-call optimization in some cases, so you might not even get a stack overflow from endless recursion like yours.
For the general case, deciding whether a program is going to terminate at some point or get stuck in an infinite loop is impossible. This is the halting problem, which is undecidable. All a compiler can possibly find are some special cases where it is easy to decide.
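A small illustration of why this is hard even for tiny methods: the recursion below counts the steps of the Collatz sequence. Whether it terminates for every positive input is a famous open problem, so no compiler could be expected to warn (or not warn) about it correctly. The class and method names are made up for this sketch.

```csharp
using System;

public static class Collatz
{
    // Counts the steps needed to reach 1 by halving even numbers and
    // mapping odd numbers n to 3n + 1. Nobody has proven that this
    // recursion ends for every n >= 1 (the Collatz conjecture).
    public static int Steps(long n)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        if (n == 1) return 0;                               // boundary condition
        return 1 + Steps(n % 2 == 0 ? n / 2 : 3 * n + 1);
    }
}
```

For example, Collatz.Steps(6) returns 8 (6, 3, 10, 5, 16, 8, 4, 2, 1), but a static analyzer has no general way to know that every other input also reaches 1.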
That's not an endless loop, but an endless recursion. And this is much worse, since it can lead to a stack overflow. Endless recursion is not desired in most languages, unless you are programming malware. Endless loops, however, are often intentional. Services typically run in endless loops.
In order to detect this kind of situation, the compiler would have to analyze the code by following the method calls; however the C# compiler limits this process to the immediate code within the current method. Here, uninitialized or unused variables can be tracked and unreachable code can be detected, for instance. There is a tradeoff to make between the compiling speed and the depth of static analysis and optimizations.
Also it is hardly possible to know the real intention of the programmer.
Imagine that you wrote a method that is perfectly legal. Suddenly, because you are calling this method from another place, your compiler complains and tells you that your method is no longer legal. I can already see the flood of posts on SO like: "My method compiled yesterday. Today it does not compile any more. But I didn't change it".
To put it very simply: it's not the compiler's job to question your coding patterns.
You could very well write a Main method that does nothing but throw an Exception. It's a far easier pattern to detect and a much more stupid thing to do; yet the compiler will happily allow your program to compile, run, crash and burn.
With that being said, since technically an endless loop / recursion is perfectly legal as far as the compiler is concerned, there's no reason why it should complain about it.
Actually, it would be very hard to figure out at compile time that the loop can't ever be broken at runtime. An exception could be thrown, user interaction could happen, a state might change somewhere on a specific thread, on a port you are monitoring, etc. There are far too many possibilities for any code analysis tool out there to establish, without any doubt, that a specific recursing code segment will inevitably cause an overflow at runtime.
I think the right way to prevent these situations is through unit testing organization. The more code paths you are covering in your tests, the less likely you are to ever face such a scenario.
Because it's nearly impossible to detect!
In the example you gave, it is obvious (to us) that the code will loop forever. But the compiler just sees a function call, it doesn't necessarily know at the time what calls that function, what conditional logic could change the looping behavior etc.
For example, with this slight change you aren't in an infinite loop anymore:
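private bool method1called = false;
private void method1()
{
    MessageBox.Show("method1 called, which will now call method2");
    if (!method1called)
    {
        // Set the flag before calling method2; if it were set afterwards,
        // the nested call to method1 would still see false and recurse forever.
        method1called = true;
        method2();
    }
}
private void method2()
{
    MessageBox.Show("method2 called, which will now call method1");
    method1();
}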
Without actually running the program, how would you know that it isn't looping? I could potentially see a warning for while (true), but that has enough valid use cases that it also makes sense to not put a warning in for it.
A compiler is just parsing the code and translating to IL (for .NET anyways). You can get limited information like variables not being assigned while doing that (especially since it has to generate the symbol table anyways) but advanced detection like this is generally left to code analysis tools.
I found this in the Wikipedia article on infinite loops: http://en.wikipedia.org/wiki/Infinite_loop#Intentional_looping
There are a few situations when this is desired behavior. For example, the games on cartridge-based game consoles typically have no exit condition in their main loop, as there is no operating system for the program to exit to; the loop runs until the console is powered off.
Antique punchcard-reading unit record equipment would literally halt once a card processing task was completed, since there was no need for the hardware to continue operating, until a new stack of program cards were loaded.
By contrast, modern interactive computers require that the computer constantly be monitoring for user input or device activity, so at some fundamental level there is an infinite processing idle loop that must continue until the device is turned off or reset. In the Apollo Guidance Computer, for example, this outer loop was contained in the Exec program, and if the computer had absolutely no other work to do it would loop running a dummy job that would simply turn off the "computer activity" indicator light.
Modern computers also typically do not halt the processor or motherboard circuit-driving clocks when they crash. Instead they fall back to an error condition displaying messages to the operator, and enter an infinite loop waiting for the user to either respond to a prompt to continue, or to reset the device.
Hope this helps.

Is it better to use if/then/else to flip a boolean, or negation?

Can I switch a boolean with one statement as effectively as with an if/then/else ?
Found this in another piece of code that is going into my app...
private void whatever()
{
////
//// a bunch of stuff
////
if (SomeBooleanValue)
{
SomeBooleanValue = false;
}
else
{
SomeBooleanValue = true;
}
}
Out of curiosity, I tried this...
private void whatever_whatever()
{
////
//// the same stuff
////
SomeBooleanValue = !SomeBooleanValue;
}
...and walked through it in debug, and it appears that I get the same result.
Is there a good reason to use the if/then/else instead of the single line way ?
Is there a good reason to use the if/then/else instead of the single line way
Not any that I can think of. Using the ! operator is cleaner and more intuitive for most programmers.
The 1-line way is perfectly fine, and the only reason why you'd use the if/ else structure is if you were doing other things aside from just toggling the boolean.
I would say the second one is better since it is more readable and compact (if/then/else IMO just adds unnecessary lines of code). That would be the only (but strong!) reason to prefer one over the other.
Due to compiler optimizations, it will be the same as using the ! operator, which is easier to read for other programmers.
However, if the branch is not optimized away, it has a potential cost.
To improve performance, the CPU tries to predict the execution path ahead of time. For conditional (if/else) statements, it predicts the result of the condition and speculatively starts executing the predicted branch. If it predicts incorrectly, that speculative work is discarded and the correct branch is executed instead, which decreases performance.
http://en.wikipedia.org/wiki/Branch_predictor
Please, please, please write the code as a negation; it is the most succinct and concise expression of intent. If you write the code the first way, every reader of your code is going to waste time wondering why you did NOT write the simple negation.
Most code will be read, by you and others, many more times than it is written. Writing code that eases the reading process is, almost by definition, good code.

How does StartCoroutine / yield return pattern really work in Unity?

I understand the principle of coroutines. I know how to get the standard StartCoroutine / yield return pattern to work in C# in Unity, e.g. invoke a method returning IEnumerator via StartCoroutine and in that method do something, do yield return new WaitForSeconds(1); to wait a second, then do something else.
My question is: what's really going on behind the scenes? What does StartCoroutine really do? What IEnumerator is WaitForSeconds returning? How does StartCoroutine return control to the "something else" part of the called method? How does all this interact with Unity's concurrency model (where lots of things are going on at the same time without use of coroutines)?
The oft referenced Unity3D coroutines in detail link is dead. Since it is mentioned in the comments and the answers I am going to post the contents of the article here. This content comes from this mirror.
Unity3D coroutines in detail
Many processes in games take place over the course of multiple frames. You’ve got ‘dense’ processes, like pathfinding, which work hard each frame but get split across multiple frames so as not to impact the framerate too heavily. You’ve got ‘sparse’ processes, like gameplay triggers, that do nothing most frames, but occasionally are called upon to do critical work. And you’ve got assorted processes between the two.
Whenever you’re creating a process that will take place over multiple frames – without multithreading – you need to find some way of breaking the work up into chunks that can be run one-per-frame. For any algorithm with a central loop, it’s fairly obvious: an A* pathfinder, for example, can be structured such that it maintains its node lists semi-permanently, processing only a handful of nodes from the open list each frame, instead of trying to do all the work in one go. There’s some balancing to be done to manage latency – after all, if you’re locking your framerate at 60 or 30 frames per second, then your process will only take 60 or 30 steps per second, and that might cause the process to just take too long overall. A neat design might offer the smallest possible unit of work at one level – e.g. process a single A* node – and layer on top a way of grouping work together into larger chunks – e.g. keep processing A* nodes for X milliseconds. (Some people call this ‘timeslicing’, though I don’t).
Still, allowing the work to be broken up in this way means you have to transfer state from one frame to the next. If you’re breaking an iterative algorithm up, then you’ve got to preserve all the state shared across iterations, as well as a means of tracking which iteration is to be performed next. That’s not usually too bad – the design of an ‘A* pathfinder class’ is fairly obvious – but there are other cases, too, that are less pleasant. Sometimes you’ll be facing long computations that are doing different kinds of work from frame to frame; the object capturing their state can end up with a big mess of semi-useful ‘locals,’ kept for passing data from one frame to the next. And if you’re dealing with a sparse process, you often end up having to implement a small state machine just to track when work should be done at all.
Wouldn’t it be neat if, instead of having to explicitly track all this state across multiple frames, and instead of having to multithread and manage synchronization and locking and so on, you could just write your function as a single chunk of code, and mark particular places where the function should ‘pause’ and carry on at a later time?
Unity – along with a number of other environments and languages – provides this in the form of Coroutines.
How do they look?
In “Unityscript” (Javascript):
function LongComputation()
{
while(someCondition)
{
/* Do a chunk of work */
// Pause here and carry on next frame
yield;
}
}
In C#:
IEnumerator LongComputation()
{
while(someCondition)
{
/* Do a chunk of work */
// Pause here and carry on next frame
yield return null;
}
}
How do they work?
Let me just say, quickly, that I don’t work for Unity Technologies. I’ve not seen the Unity source code. I’ve never seen the guts of Unity’s coroutine engine. However, if they’ve implemented it in a way that is radically different from what I’m about to describe, then I’ll be quite surprised. If anyone from UT wants to chime in and talk about how it actually works, then that’d be great.
The big clues are in the C# version. Firstly, note that the return type for the function is IEnumerator. And secondly, note that one of the statements is yield return. This means that yield must be a keyword, and as Unity’s C# support is vanilla C# 3.5, it must be a vanilla C# 3.5 keyword. Indeed, here it is in MSDN – talking about something called ‘iterator blocks.’ So what’s going on?
Firstly, there’s this IEnumerator type. The IEnumerator type acts like a cursor over a sequence, providing two significant members: Current, which is a property giving you the element the cursor is presently over, and MoveNext(), a function that moves to the next element in the sequence. Because IEnumerator is an interface, it doesn’t specify exactly how these members are implemented; MoveNext() could just add one to Current, or it could load the new value from a file, or it could download an image from the Internet and hash it and store the new hash in Current… or it could even do one thing for the first element in the sequence, and something entirely different for the second. You could even use it to generate an infinite sequence if you so desired. MoveNext() calculates the next value in the sequence (returning false if there are no more values), and Current retrieves the value it calculated.
Ordinarily, if you wanted to implement an interface, you’d have to write a class, implement the members, and so on. Iterator blocks are a convenient way of implementing IEnumerator without all that hassle – you just follow a few rules, and the IEnumerator implementation is generated automatically by the compiler.
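To see what the compiler saves you from, here is a sketch of the same sequence (1 up to a limit) implemented twice: once as a hand-written IEnumerator class and once as an iterator block. The names CountingEnumerator, Counting, UpTo and Sum are invented for this illustration; the code generated by the compiler for the iterator block is more elaborate than the hand-written class, but it behaves the same way from the caller's side.

```csharp
using System.Collections;

// Hand-written cursor over the sequence 1, 2, ..., limit.
public sealed class CountingEnumerator : IEnumerator
{
    private readonly int limit;
    private int current;    // 0 means "before the first element"

    public CountingEnumerator(int limit) { this.limit = limit; }

    public object Current { get { return current; } }

    public bool MoveNext()
    {
        if (current >= limit) return false;   // no more values
        current++;
        return true;
    }

    public void Reset() { current = 0; }
}

public static class Counting
{
    // The same sequence as an iterator block: the compiler generates
    // the IEnumerator implementation for us.
    public static IEnumerator UpTo(int limit)
    {
        for (int i = 1; i <= limit; i++)
            yield return i;
    }

    // Drives any IEnumerator of ints through MoveNext()/Current.
    public static int Sum(IEnumerator e)
    {
        int total = 0;
        while (e.MoveNext())
            total += (int)e.Current;
        return total;
    }
}
```

Counting.Sum gives the same result for new CountingEnumerator(4) and for Counting.UpTo(4), because the caller only ever sees MoveNext() and Current.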
An iterator block is a regular function that (a) returns IEnumerator, and (b) uses the yield keyword. So what does the yield keyword actually do? It declares what the next value in the sequence is – or that there are no more values. The point at which the code encounters a yield return X or yield break is the point at which IEnumerator.MoveNext() should stop; a yield return X causes MoveNext() to return true and Current to be assigned the value X, while a yield break causes MoveNext() to return false.
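A tiny experiment makes these rules visible. In the sketch below (YieldDemo and Steps are hypothetical names), the iterator block writes to a log so that you can see exactly how far it runs on each MoveNext() call.

```csharp
using System.Collections;
using System.Collections.Generic;

public static class YieldDemo
{
    public static IEnumerator Steps(bool stopEarly, List<string> log)
    {
        log.Add("first chunk");
        yield return 1;          // MoveNext() returns true, Current is 1

        log.Add("second chunk");
        if (stopEarly)
            yield break;         // MoveNext() returns false, the sequence ends here

        yield return 2;          // MoveNext() returns true, Current is 2
        log.Add("third chunk");  // runs on the following MoveNext(), which then returns false
    }
}
```

Calling YieldDemo.Steps(...) on its own runs none of the body; the log stays empty until the first MoveNext(). Each MoveNext() then executes just one chunk, up to the next yield statement.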
Now, here’s the trick. It doesn’t have to matter what the actual values returned by the sequence are. You can call MoveNext() repeatedly, and ignore Current; the computations will still be performed. Each time MoveNext() is called, your iterator block runs to the next ‘yield’ statement, regardless of what expression it actually yields. So you can write something like:
IEnumerator TellMeASecret()
{
PlayAnimation("LeanInConspiratorially");
while(playingAnimation)
yield return null;
Say("I stole the cookie from the cookie jar!");
while(speaking)
yield return null;
PlayAnimation("LeanOutRelieved");
while(playingAnimation)
yield return null;
}
and what you’ve actually written is an iterator block that generates a long sequence of null values, but what’s significant is the side-effects of the work it does to calculate them. You could run this coroutine using a simple loop like this:
IEnumerator e = TellMeASecret();
while(e.MoveNext()) { }
Or, more usefully, you could mix it in with other work:
IEnumerator e = TellMeASecret();
while(e.MoveNext())
{
// If they press 'Escape', skip the cutscene
if(Input.GetKeyDown(KeyCode.Escape)) { break; }
}
It’s all in the timing
As you’ve seen, each yield return statement must provide an expression (like null) so that the iterator block has something to actually assign to IEnumerator.Current. A long sequence of nulls isn’t exactly useful, but we’re more interested in the side-effects. Aren’t we?
There’s something handy we can do with that expression, actually. What if, instead of just yielding null and ignoring it, we yielded something that indicated when we expect to need to do more work? Often we’ll need to carry straight on the next frame, sure, but not always: there will be plenty of times where we want to carry on after an animation or sound has finished playing, or after a particular amount of time has passed. Those while(playingAnimation) yield return null; constructs are a bit tedious, don’t you think?
Unity declares the YieldInstruction base type, and provides a few concrete derived types that indicate particular kinds of wait. You’ve got WaitForSeconds, which resumes the coroutine after the designated amount of time has passed. You’ve got WaitForEndOfFrame, which resumes the coroutine at a particular point later in the same frame. You’ve got the Coroutine type itself, which, when coroutine A yields coroutine B, pauses coroutine A until after coroutine B has finished.
What does this look like from a runtime point of view? As I said, I don’t work for Unity, so I’ve never seen their code; but I’d imagine it might look a little bit like this:
List<IEnumerator> unblockedCoroutines;
List<IEnumerator> shouldRunNextFrame;
List<IEnumerator> shouldRunAtEndOfFrame;
SortedList<float, IEnumerator> shouldRunAfterTimes;
foreach(IEnumerator coroutine in unblockedCoroutines)
{
if(!coroutine.MoveNext())
// This coroutine has finished
continue;
if(!(coroutine.Current is YieldInstruction))
{
// This coroutine yielded null, or some other value we don't understand; run it next frame.
shouldRunNextFrame.Add(coroutine);
continue;
}
if(coroutine.Current is WaitForSeconds)
{
WaitForSeconds wait = (WaitForSeconds)coroutine.Current;
shouldRunAfterTimes.Add(Time.time + wait.duration, coroutine);
}
else if(coroutine.Current is WaitForEndOfFrame)
{
shouldRunAtEndOfFrame.Add(coroutine);
}
else { /* similar stuff for other YieldInstruction subtypes */ }
}
unblockedCoroutines = shouldRunNextFrame;
It’s not difficult to imagine how more YieldInstruction subtypes could be added to handle other cases – engine-level support for signals, for example, could be added, with a WaitForSignal("SignalName") YieldInstruction supporting it. By adding more YieldInstructions, the coroutines themselves can become more expressive – yield return new WaitForSignal("GameOver") is nicer to read than while(!Signals.HasFired("GameOver")) yield return null, if you ask me, quite apart from the fact that doing it in the engine could be faster than doing it in script.
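The scheduling idea can be tried outside Unity with a toy scheduler. This is a sketch under stated assumptions, not Unity's implementation: ToyScheduler, WaitForFrames and ToyDemo are invented types, time is counted in whole frames rather than seconds, and a coroutine only starts running on the first Tick() after Start().

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical instruction: "resume me this many frames from now".
public sealed class WaitForFrames
{
    public readonly int Frames;
    public WaitForFrames(int frames) { Frames = frames; }
}

public sealed class ToyScheduler
{
    private sealed class Entry
    {
        public IEnumerator Routine;
        public int FramesToSkip;
    }

    private readonly List<Entry> entries = new List<Entry>();

    public int ActiveCount { get { return entries.Count; } }

    public void Start(IEnumerator routine)
    {
        entries.Add(new Entry { Routine = routine });
    }

    // Call once per frame.
    public void Tick()
    {
        int i = 0;
        while (i < entries.Count)
        {
            Entry entry = entries[i];
            if (entry.FramesToSkip > 0)
            {
                entry.FramesToSkip--;       // still waiting
                i++;
                continue;
            }
            if (!entry.Routine.MoveNext())
            {
                entries.RemoveAt(i);        // coroutine has finished
                continue;
            }
            // null or an unknown value means "run again next frame".
            var wait = entry.Routine.Current as WaitForFrames;
            entry.FramesToSkip = (wait != null && wait.Frames > 1) ? wait.Frames - 1 : 0;
            i++;
        }
    }
}

public static class ToyDemo
{
    public static IEnumerator Routine(List<string> log)
    {
        log.Add("a");
        yield return null;                  // resume on the next frame
        log.Add("b");
        yield return new WaitForFrames(3);  // resume three frames later
        log.Add("c");
    }
}
```

Ticking the scheduler five times after starting ToyDemo.Routine logs "a" on the first tick, "b" on the second and "c" on the fifth, after which the coroutine is removed. The scheduler never inspects the body of the routine; it only looks at what each MoveNext() call leaves in Current.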
A couple of non-obvious ramifications
There’s a couple of useful things about all this that people sometimes miss that I thought I should point out.
Firstly, yield return is just yielding an expression – any expression – and YieldInstruction is a regular type. This means you can do things like:
YieldInstruction y;
if(something)
y = null;
else if(somethingElse)
y = new WaitForEndOfFrame();
else
y = new WaitForSeconds(1.0f);
yield return y;
The specific lines yield return new WaitForSeconds(), yield return new WaitForEndOfFrame(), etc, are common, but they’re not actually special forms in their own right.
Secondly, because these coroutines are just iterator blocks, you can iterate over them yourself if you want – you don’t have to have the engine do it for you. I’ve used this for adding interrupt conditions to a coroutine before:
IEnumerator DoSomething()
{
/* ... */
}
IEnumerator DoSomethingUnlessInterrupted()
{
IEnumerator e = DoSomething();
bool interrupted = false;
// Stop when interrupted, or when DoSomething() itself has finished
while(!interrupted && e.MoveNext())
{
yield return e.Current;
interrupted = HasBeenInterrupted();
}
}
Thirdly, the fact that you can yield on other coroutines can sort of allow you to implement your own YieldInstructions, albeit not as performantly as if they were implemented by the engine. For example:
IEnumerator UntilTrueCoroutine(Func<bool> fn)
{
while(!fn()) yield return null;
}
Coroutine UntilTrue(Func<bool> fn)
{
return StartCoroutine(UntilTrueCoroutine(fn));
}
IEnumerator SomeTask()
{
/* ... */
yield return UntilTrue(() => _lives < 3);
/* ... */
}
however, I wouldn’t really recommend this – the cost of starting a Coroutine is a little heavy for my liking.
Conclusion
I hope this clarifies a little some of what’s really happening when you use a Coroutine in Unity. C#’s iterator blocks are a groovy little construct, and even if you’re not using Unity, maybe you’ll find it useful to take advantage of them in the same way.
The first heading below is a straight answer to the question. The two headings after are more useful for the everyday programmer.
Possibly Boring Implementation Details of Coroutines
Coroutines are explained in Wikipedia and elsewhere. Here I'll just provide some details from a practical point of view. IEnumerator, yield, etc. are C# language features that are used for somewhat of a different purpose in Unity.
To put it very simply, an IEnumerator claims to have a collection of values that you can request one by one, kind of like a List. In C#, a function with a signature to return an IEnumerator does not have to actually create and return one, but can let C# provide an implicit IEnumerator. The function then can provide the contents of that returned IEnumerator in the future in a lazy fashion, through yield return statements. Every time the caller asks for another value from that implicit IEnumerator, the function executes till the next yield return statement, which provides the next value. As a byproduct of this, the function pauses until the next value is requested.
In Unity, we don't use these to provide future values, we exploit the fact that the function pauses. Because of this exploitation, a lot of things about coroutines in Unity do not make sense (What does IEnumerator have to do with anything? What is yield? Why new WaitForSeconds(3)? etc.). What happens "under the hood" is, the values you provide through the IEnumerator are used by StartCoroutine() to decide when to ask for the next value, which determines when your coroutine will unpause again.
Your Unity Game is Single Threaded (*)
Coroutines are not threads. There is one main loop of Unity and all those functions that you write are being called by the same main thread in order. You can verify this by placing a while(true); in any of your functions or coroutines. It will freeze the whole thing, even the Unity editor. This is evidence that everything runs in one main thread. This link that Kay mentioned in his above comment is also a great resource.
(*) Unity calls your functions from one thread. So, unless you create a thread yourself, the code that you wrote is single threaded. Of course Unity does employ other threads and you can create threads yourself if you like.
A Practical Description of Coroutines for Game Programmers
Basically, when you call StartCoroutine(MyCoroutine()), it's exactly like a regular function call to MyCoroutine(), until the first yield return X, where X is something like null, new WaitForSeconds(3), StartCoroutine(AnotherCoroutine()), break, etc. This is when it starts differing from a function. Unity "pauses" that function right at that yield return X line, goes on with other business and some frames pass, and when it's time again, Unity resumes that function right after that line. It remembers the values for all the local variables in the function. This way, you can have a for loop that loops every two seconds, for example.
When Unity will resume your coroutine depends on what X was in your yield return X. For example, if you used yield return new WaitForSeconds(3);, it resumes after 3 seconds have passed. If you used yield return StartCoroutine(AnotherCoroutine()), it resumes after AnotherCoroutine() is completely done, which enables you to nest behaviors in time. If you just used a yield return null;, it resumes right at the next frame.
It couldn't be simpler:
Unity (and all game engines) are frame based.
The whole entire point, the whole raison d'etre of Unity, is that it is frame based. The engine does things "each frame" for you. (Animates, renders objects, does physics, and so on.)
You might ask .. "Oh, that's great. What if I want the engine to do something for me each frame? How do I tell the engine to do such-and-such in a frame?"
The answer is ...
That's exactly what a "coroutine" is for.
It's just that simple.
A note on the "Update" function...
Quite simply, anything you put in "Update" is done every frame. For per-frame work it is effectively the same as the coroutine-yield syntax (Unity resumes a coroutine that yielded null shortly after Update in the same frame).
void Update()
{
    // This happens every frame.
    // If you want Unity to do something of "yours" in each frame,
    // put it in here.
}
...in a coroutine...
while (true)
{
// This happens every frame.
// If you want Unity to do something of "yours" in each frame,
// put it in here.
yield return null;
}
There is absolutely no difference.
Threads have utterly no connection to frames/coroutines, in any way. There is no connection whatsoever.
The frames in a game engine have utterly no connection to threads, in any way. They are completely, totally, utterly, unrelated issues.
(You often hear that "Unity is single-threaded!" Note that even that statement is very confused. Frames/coroutines just have absolutely no connection at all to threading. If Unity was multithreaded, hyperthreaded, or ran on a quantum computer!! ... it would just have no connection whatsoever to frames/coroutines. It is a completely, totally, absolutely, unrelated issue.)
So in summary...
So, Coroutines/yield are simply how you access the frames in Unity. That's it.
(And indeed, it's absolutely the same as the Update() function provided by Unity.)
That's all there is to it, it's that simple.
Why IEnumerator?
Couldn't be simpler: IEnumerator returns things "over and over".
(That list of things can either have a specific length, such as "10 things," or the list can go on forever.)
Thus, self-evidently, an IEnumerator is what you would use.
Anywhere in .Net you want to "return over and over," IEnumerator exists for this purpose.
All frame-based computing, with .Net, of course uses IEnumerator to return each frame. What else could it use?
(If you are new to C#, note that IEnumerator is also used for returning "ordinary" things one by one, such as simply the items in an array, etc.)
I have dug into this lately and wrote a post here - http://eppz.eu/blog/understanding-ienumerator-in-unity-3d/ - that sheds light on the internals (with dense code examples), the underlying IEnumerator interface, and how it is used for coroutines.
Using collection enumerators for this purpose still seems a bit weird to me. It is the inverse of what enumerators feel designed for. The point of enumerators is the returned value on every access, but the point of coroutines is the code in between the value returns. The actual returned value is pointless in this context.
On Unity 2017+, you can use the native C# async/await keywords for async code. Before async/await was added to the language in C# 5.0 (2012), C# had no native way to implement async code.
Unity had to use a workaround for async code. They achieved this by exploiting the C# iterators, which was a popular async technique at the time.
A look at C# Iterators
Let's say you have this code:
IEnumerable SomeNumbers() {
yield return 3;
yield return 5;
yield return 8;
}
If you run it through a loop, treating it as if it were an array, you will get 3 5 8:
// Output: 3 5 8
foreach (int number in SomeNumbers()) {
Console.Write(number + " ");
}
If you are not familiar with iterators (most languages have them to implement lists and collections), they behave like an array. The difference is that the values are generated on demand by the iterator's code instead of being stored up front.
How do they work?
When looping through an iterator in C#, we use MoveNext to go to the next value.
In the example, we are using foreach, which calls this method under the hood.
When we call MoveNext, the iterator executes everything until its next yield. The parent caller gets the value returned by yield. Then, the iterator code pauses, waiting for the next MoveNext call.
Because of their "lazy" capability, C# programmers used iterators to run async code.
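The MoveNext behaviour described above can be watched directly by driving the iterator by hand. This is a small sketch; the class name MoveNextDemo and the Console.WriteLine tracing lines inside SomeNumbers are additions for illustration.

```csharp
using System;
using System.Collections;

public static class MoveNextDemo
{
    public static IEnumerable SomeNumbers()
    {
        Console.WriteLine("  (iterator: running up to the first yield)");
        yield return 3;
        Console.WriteLine("  (iterator: resumed after the first yield)");
        yield return 5;
        yield return 8;
    }

    public static void Main()
    {
        // Calling SomeNumbers() runs none of its body yet.
        IEnumerator iterator = SomeNumbers().GetEnumerator();

        // Roughly what foreach does for us under the hood.
        while (iterator.MoveNext())
        {
            Console.WriteLine(iterator.Current);
        }
    }
}
```

Each MoveNext call runs the body only as far as the next yield return, so the tracing lines appear interleaved with the values rather than all at the start.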
Asynchronous Programming in C# using Iterators
Before 2012, using iterators was a popular hack to perform asynchronous operations in C#.
Example - Asynchronous download function:
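IEnumerable DownloadAsync(string url) {
// GetResponseAsync, ReadToEndAsync and ExecuteAsync are used as in the article
// linked below; they are assumed to come from that article's helper library.
WebRequest req = HttpWebRequest.Create(url);
var response = req.GetResponseAsync();
yield return response; // pause until the response has arrived
Stream resp = response.Result.GetResponseStream();
var html = resp.ReadToEndAsync().ExecuteAsync();
yield return html; // pause until the body has been read
Console.WriteLine(html.Result);
}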
PS: The code above is from this excellent, yet old, article about Async programming using iterators:
http://tomasp.net/blog/csharp-async.aspx/
Should I use async instead of StartCoroutine?
As of 2021, the official Unity docs use coroutines in their examples and not async.
Also, the community seems to be more in favor of coroutines instead of async:
Developers are familiar with coroutines;
Coroutines are integrated with Unity;
And others;
I recommend this Unity lecture from 2019, "Best practices: Async vs. coroutines - Unite Copenhagen 2019": https://youtu.be/7eKi6NKri6I
PS: This is an old question from 2012, but I'm answering it because it is still relevant in 2021.
The basic functions in Unity that you get automatically are the Start() function and the Update() function, so coroutines are essentially functions just like Start() and Update(). Any old function func() can be called the same way a coroutine can be called. Unity has obviously set certain boundaries for coroutines that make them different from regular functions.
One difference is instead of
void func()
You write
IEnumerator func()
for coroutines.
And the same way you can control the time in normal functions with code lines like
Time.deltaTime
A coroutine has a specific handle on the way time can be controlled.
yield return new WaitForSeconds(2f); // pause for two seconds
Although this is not the only thing possible to do inside of an IEnumerator/Coroutine, it is one of the useful things that Coroutines are used for. You would have to research Unity's scripting API to learn other specific uses of Coroutines.
StartCoroutine is a method to call an IEnumerator function. It is similar to calling a simple void function; the difference is that you use it on IEnumerator functions. This type of function is unique in that it lets you use the special yield statement; note that you must return something. That's as far as I know.
Here is a simple method I wrote in Unity that makes the Game Over text flicker:
public IEnumerator GameOver()
{
while (true)
{
_gameOver.text = "GAME OVER";
yield return new WaitForSeconds(Random.Range(1.0f, 3.5f));
_gameOver.text = "";
yield return new WaitForSeconds(Random.Range(0.1f, 0.8f));
}
}
I then started it from outside the IEnumerator itself:
public void UpdateLives(int currentlives)
{
if (currentlives < 1)
{
_gameOver.gameObject.SetActive(true);
StartCoroutine(GameOver());
}
}
As you can see, that is how I used the StartCoroutine() method.
Hope I helped somehow. I am a beginner myself, so whether you correct me or appreciate me, any type of feedback would be great.

What is the real overhead of try/catch in C#?

So, I know that try/catch does add some overhead and therefore isn't a good way of controlling process flow, but where does this overhead come from and what is its actual impact?
Three points to make here:
Firstly, there is little or NO performance penalty in actually having try-catch blocks in your code. This should not be a consideration when trying to avoid having them in your application. The performance hit only comes into play when an exception is thrown.
When an exception is thrown, in addition to the stack unwinding operations that take place (which others have mentioned), you should be aware that a whole bunch of runtime/reflection-related work happens in order to populate the members of the exception class, such as the stack trace object and the various type members.
I believe that this is one of the reasons why the general advice if you are going to rethrow the exception is to just throw; rather than throw the exception again or construct a new one as in those cases all of that stack information is regathered whereas in the simple throw it is all preserved.
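The difference between the two rethrow styles can be observed in the stack trace itself. The following is a minimal sketch; RethrowDemo, Fail and the two Trace methods are hypothetical names, and NoInlining is applied only so that the Fail frame reliably appears in the trace.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RethrowDemo
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Fail()
    {
        throw new InvalidOperationException("boom");
    }

    // "throw;" keeps the stack trace captured at the original throw site.
    public static string TraceAfterThrow()
    {
        try
        {
            try { Fail(); }
            catch (InvalidOperationException) { throw; }
        }
        catch (InvalidOperationException ex) { return ex.StackTrace; }
        return null;
    }

    // "throw ex;" resets the trace so that it starts at the rethrow point.
    public static string TraceAfterThrowEx()
    {
        try
        {
            try { Fail(); }
            catch (InvalidOperationException ex) { throw ex; }
        }
        catch (InvalidOperationException ex) { return ex.StackTrace; }
        return null;
    }
}
```

Printing both results shows the Fail frame in the first trace and not in the second, which is the practical reason for preferring a bare throw; when rethrowing.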
I'm not an expert in language implementations (so take this with a grain of salt), but I think one of the biggest costs is unwinding the stack and storing it for the stack trace. I suspect this happens only when the exception is thrown (but I don't know), and if so, this would be decently sized hidden cost every time an exception is thrown... so it's not like you are just jumping from one place in the code to another, there is a lot going on.
I don't think it's a problem as long as you are using exceptions for EXCEPTIONAL behavior (so not your typical, expected path through the program).
Are you asking about the overhead of using try/catch/finally when exceptions aren't thrown, or the overhead of using exceptions to control process flow? The latter is somewhat akin to using a stick of dynamite to light a toddler's birthday candle, and the associated overhead falls into the following areas:
You can expect additional cache misses due to the thrown exception accessing resident data not normally in the cache.
You can expect additional page faults due to the thrown exception accessing non-resident code and data not normally in your application's working set.
for example, throwing the exception will require the CLR to find the location of the finally and catch blocks based on the current IP and the return IP of every frame until the exception is handled plus the filter block.
additional construction cost and name resolution in order to create the frames for diagnostic purposes, including reading of metadata etc.
both of the above items typically access "cold" code and data, so hard page faults are probable if you have memory pressure at all:
the CLR tries to put code and data that is used infrequently far from data that is used frequently to improve locality, so this works against you because you're forcing the cold to be hot.
the cost of the hard page faults, if any, will dwarf everything else.
Typical catch situations are often deep, therefore the above effects would tend to be magnified (increasing the likelihood of page faults).
As for the actual impact of the cost, this can vary a lot depending on what else is going on in your code at the time. Jon Skeet has a good summary here, with some useful links. I tend to agree with his statement that if you get to the point where exceptions are significantly hurting your performance, you have problems in terms of your use of exceptions beyond just the performance.
Contrary to theories commonly accepted, try/catch can have significant performance implications, and that's whether an exception is thrown or not!
It disables some automatic optimisations (by design), and in some cases injects debugging code, as you can expect from a debugging aid. There will always be people who disagree with me on this point, but the language requires it and the disassembly shows it so those people are by dictionary definition delusional.
It can impact negatively upon maintenance. This is actually the most significant issue here, but since my last answer (which focused almost entirely on it) was deleted, I'll try to focus on the less significant issue (the micro-optimisation) as opposed to the more significant issue (the macro-optimisation).
The former has been covered in a couple of blog posts by Microsoft MVPs over the years, and I trust you could find them easily yet StackOverflow cares so much about content so I'll provide links to some of them as filler evidence:
Performance implications of try/catch/finally (and part two), by Peter Ritchie explores the optimisations which try/catch/finally disables (and I'll go further into this with quotes from the standard)
Performance Profiling Parse vs. TryParse vs. ConvertTo by Ian Huff states blatantly that "exception handling is very slow" and demonstrates this point by pitting Int.Parse and Int.TryParse against each other... To anyone who insists that TryParse uses try/catch behind the scenes, this ought to shed some light!
There's also this answer which shows the difference between disassembled code with- and without using try/catch.
It seems so obvious that there is an overhead which is blatantly observable in code generation, and that overhead even seems to be acknowledged by people who Microsoft value! Yet I am, repeating the internet...
Yes, there are dozens of extra MSIL instructions for one trivial line of code, and that doesn't even cover the disabled optimisations so technically it's a micro-optimisation.
I posted an answer years ago which got deleted as it focused on the productivity of programmers (the macro-optimisation).
This is unfortunate as no saving of a few nanoseconds here and there of CPU time is likely to make up for many accumulated hours of manual optimisation by humans. Which does your boss pay more for: an hour of your time, or an hour with the computer running? At what point do we pull the plug and admit that it's time to just buy a faster computer?
Clearly, we should be optimising our priorities, not just our code! In my last answer I drew upon the differences between two snippets of code.
Using try/catch:
int x;
try {
x = int.Parse("1234");
}
catch {
return;
}
// some more code here...
Not using try/catch:
int x;
if (int.TryParse("1234", out x) == false) {
return;
}
// some more code here
Consider from the perspective of a maintenance developer, which is more likely to waste your time, if not in profiling/optimisation (covered above) which likely wouldn't even be necessary if it weren't for the try/catch problem, then in scrolling through source code... One of those has four extra lines of boilerplate garbage!
As more and more fields are introduced into a class, all of this boilerplate garbage accumulates (both in source and disassembled code) well beyond reasonable levels. Four extra lines per field, and they're always the same lines... Were we not taught to avoid repeating ourselves? I suppose we could hide the try/catch behind some home-brewed abstraction, but... then we might as well just avoid exceptions (i.e. use Int.TryParse).
This isn't even a complex example; I've seen attempts at instantiating new classes in try/catch. Consider that all of the code inside of the constructor might then be disqualified from certain optimisations that would otherwise be automatically applied by the compiler. What better way to give rise to the theory that the compiler is slow, as opposed to the compiler is doing exactly what it's told to do?
Assuming an exception is thrown by said constructor, and some bug is triggered as a result, the poor maintenance developer then has to track it down. That might not be such an easy task, as unlike the spaghetti code of the goto nightmare, try/catch can cause messes in three dimensions, as it could move up the stack into not just other parts of the same method, but also other classes and methods, all of which will be observed by the maintenance developer, the hard way! Yet we are told that "goto is dangerous", heh!
At the end I mention, try/catch has its benefit which is, it's designed to disable optimisations! It is, if you will, a debugging aid! That's what it was designed for and it's what it should be used as...
I guess that's a positive point too. It can be used to disable optimizations that might otherwise cripple safe, sane message passing algorithms for multithreaded applications, and to catch possible race conditions ;) That's about the only scenario I can think of to use try/catch. Even that has alternatives.
What optimisations do try, catch and finally disable?
A.K.A
How are try, catch and finally useful as debugging aids?
they're write-barriers. This comes from the standard:
12.3.3.13 Try-catch statements
For a statement stmt of the form:
try try-block
catch ( ... ) catch-block-1
...
catch ( ... ) catch-block-n
The definite assignment state of v at the beginning of try-block is the same as the definite assignment state of v at the beginning of stmt.
The definite assignment state of v at the beginning of catch-block-i (for any i) is the same as the definite assignment state of v at the beginning of stmt.
The definite assignment state of v at the end-point of stmt is definitely assigned if (and only if) v is definitely assigned at the end-point of try-block and every catch-block-i (for every i from 1 to n).
In other words, at the beginning of each try statement:
all assignments made to visible objects prior to entering the try statement must be complete, which requires a thread lock for a start, making it useful for debugging race conditions!
the compiler isn't allowed to:
eliminate unused variable assignments which have definitely been assigned to before the try statement
reorganise or coalesce any of its inner-assignments (i.e. see my first link, if you haven't already done so).
hoist assignments over this barrier, to delay assignment to a variable which it knows won't be used until later (if at all) or to pre-emptively move later assignments forward to make other optimisations possible...
A similar story holds for each catch statement; suppose within your try statement (or a constructor or function it invokes, etc) you assign to that otherwise pointless variable (let's say, garbage=42;), the compiler can't eliminate that statement, no matter how irrelevant it is to the observable behaviour of the program. The assignment needs to have completed before the catch block is entered.
For what it's worth, finally tells a similarly degrading story:
12.3.3.14 Try-finally statements
For a try statement stmt of the form:
try try-block
finally finally-block
• The definite assignment state of v at the beginning of try-block is the same as the definite assignment state of v at the beginning of stmt.
• The definite assignment state of v at the beginning of finally-block is the same as the definite assignment state of v at the beginning of stmt.
• The definite assignment state of v at the end-point of stmt is definitely assigned if (and only if) either:
o v is definitely assigned at the end-point of try-block
o v is definitely assigned at the end-point of finally-block
If a control flow transfer (such as a goto statement) is made that begins within try-block, and ends outside of try-block, then v is also considered definitely assigned on that control flow transfer if v is definitely assigned at the end-point of finally-block. (This is not an only if—if v is definitely assigned for another reason on this control flow transfer, then it is still considered definitely assigned.)
12.3.3.15 Try-catch-finally statements
Definite assignment analysis for a try-catch-finally statement of the form:
try try-block
catch ( ... ) catch-block-1
...
catch ( ... ) catch-block-n
finally finally-block
is done as if the statement were a try-finally statement enclosing a try-catch statement:
try {
try
try-block
catch ( ... ) catch-block-1
...
catch ( ... ) catch-block-n
}
finally finally-block
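As a concrete reading of the quoted rule for try-catch statements (12.3.3.13), here is a small sketch of what the compiler's definite assignment analysis accepts. DefiniteAssignmentDemo and ParseOrDefault are hypothetical names; the sketch only illustrates the compile-time check and makes no claim about optimisations.

```csharp
using System;

public static class DefiniteAssignmentDemo
{
    public static int ParseOrDefault(string text)
    {
        int v; // not definitely assigned yet
        try
        {
            v = int.Parse(text);   // assigned at the end-point of try-block
        }
        catch (FormatException)
        {
            v = -1;                // assigned at the end-point of catch-block
        }
        // v is definitely assigned here because the try block and every
        // catch block assigned it. Deleting the assignment in the catch
        // block makes the next line fail to compile with error CS0165
        // (use of unassigned local variable).
        return v;
    }
}
```

For example, ParseOrDefault("12") returns 12 and ParseOrDefault("abc") returns -1.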
In my experience the biggest overhead is in actually throwing an exception and handling it. I once worked on a project where code similar to the following was used to check if someone had a right to edit some object. This HasRight() method was used everywhere in the presentation layer, and was often called for 100s of objects.
bool HasRight(string rightName, DomainObject obj) {
try {
CheckRight(rightName, obj);
return true;
}
catch (Exception ex) {
return false;
}
}
void CheckRight(string rightName, DomainObject obj) {
if (!_user.Rights.Contains(rightName))
throw new Exception();
}
When the test database got fuller with test data, this led to a very visible slowdown while opening new forms etc.
So I refactored it to the following, which - according to later quick 'n dirty measurements - is about 2 orders of magnitude faster:
bool HasRight(string rightName, DomainObject obj) {
return _user.Rights.Contains(rightName);
}
void CheckRight(string rightName, DomainObject obj) {
if (!HasRight(rightName, obj))
throw new Exception();
}
So in short, using exceptions in normal process flow is about two orders of magnitude slower than using similar process flow without exceptions.
Not to mention if it's inside a frequently-called method it may affect the overall behavior of the application.
For example, I consider the use of Int32.Parse as a bad practice in most cases since it throws exceptions for something that can be caught easily otherwise.
So to conclude everything written here:
1) Use try..catch blocks to catch unexpected errors - almost no performance penalty.
2) Don't use exceptions for expected errors if you can avoid it.
I wrote an article about this a while back because there were a lot of people asking about this at the time. You can find it and the test code at http://www.blackwasp.co.uk/SpeedTestTryCatch.aspx.
The upshot is that there is a tiny amount of overhead for a try/catch block but so small that it should be ignored. However, if you are running try/catch blocks in loops that are executed millions of times, you may want to consider moving the block to outside of the loop if possible.
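When moving a try/catch block out of a loop, keep in mind that the behaviour changes as well as the cost. The sketch below contrasts the two placements; LoopDemo and both method names are invented for this illustration.

```csharp
using System;

public static class LoopDemo
{
    // try/catch entered on every iteration; one bad item does not stop the loop.
    public static int SumTryInside(string[] items)
    {
        int sum = 0;
        foreach (string item in items)
        {
            try { sum += int.Parse(item); }
            catch (FormatException) { /* skip this item and carry on */ }
        }
        return sum;
    }

    // try/catch entered once; the first bad item ends the whole loop.
    public static int SumTryOutside(string[] items)
    {
        int sum = 0;
        try
        {
            foreach (string item in items) sum += int.Parse(item);
        }
        catch (FormatException) { /* stop at the first bad item */ }
        return sum;
    }
}
```

For the input { "1", "x", "3" } the first version returns 4 and the second returns 1, so the move is only a pure optimisation when a failure should abort the loop anyway.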
The key performance issue with try/catch blocks is when you actually catch an exception. This can add a noticeable delay to your application. Of course, when things are going wrong, most developers (and a lot of users) recognise the pause as an exception that is about to happen! The key here is not to use exception handling for normal operations. As the name suggests, they are exceptional and you should do everything you can to avoid them being thrown. You should not use them as part of the expected flow of a program that is functioning correctly.
I made a blog entry about this subject last year.
Check it out. Bottom line is that there is almost no cost for a try block if no exception occurs - and on my laptop, an exception was about 36μs. That might be less than you expected, but keep in mind that those results were on a shallow stack. Also, first exceptions are really slow.
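A figure like this can be reproduced roughly with a Stopwatch. The following is a quick sketch and not the code from the blog entry; ExceptionCost and MicrosecondsPerThrow are hypothetical names, and the result depends heavily on machine, runtime, stack depth and whether a debugger is attached.

```csharp
using System;
using System.Diagnostics;

public static class ExceptionCost
{
    public static double MicrosecondsPerThrow(int iterations)
    {
        // Warm up: the first exception is much slower than later ones.
        try { throw new InvalidOperationException(); }
        catch (InvalidOperationException) { }

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            try { throw new InvalidOperationException(); }
            catch (InvalidOperationException) { }
        }
        watch.Stop();

        // Average cost of one throw plus catch, in microseconds.
        return watch.Elapsed.TotalMilliseconds * 1000.0 / iterations;
    }

    public static void Main()
    {
        Console.WriteLine($"{MicrosecondsPerThrow(10000):F2} µs per throw/catch");
    }
}
```

Because the throw happens directly inside the measuring method, this measures the shallow-stack case; throwing from several calls deeper would be expected to cost more.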
It is vastly easier to write, debug, and maintain code that is free of compiler error messages, code-analysis warning messages, and routine accepted exceptions (particularly exceptions that are thrown in one place and accepted in another). Because it is easier, the code will on average be better written and less buggy.
To me, that programmer and quality overhead is the primary argument against using try-catch for process flow.
The computer overhead of exceptions is insignificant in comparison, and usually tiny in terms of the application's ability to meet real-world performance requirements.
I really like Hafthor's blog post, and to add my two cents to this discussion, I'd like to say that, it's always been easy for me to have the DATA LAYER throw only one type of exception (DataAccessException). This way my BUSINESS LAYER knows what exception to expect and catches it. Then depending on further business rules (i.e. if my business object participates in the workflow etc), I may throw a new exception (BusinessObjectException) or proceed without re/throwing.
I'd say don't hesitate to use try..catch whenever it is necessary and use it wisely!
For example, this method participates in a workflow...
Comments?
public bool DeleteGallery(int id)
{
try
{
using (var transaction = new DbTransactionManager())
{
try
{
transaction.BeginTransaction();
_galleryRepository.DeleteGallery(id, transaction);
_galleryRepository.DeletePictures(id, transaction);
FileManager.DeleteAll(id);
transaction.Commit();
}
catch (DataAccessException ex)
{
Logger.Log(ex);
transaction.Rollback();
throw new BusinessObjectException("Cannot delete gallery. Ensure business rules and try again.", ex);
}
}
}
catch (DbTransactionException ex)
{
Logger.Log(ex);
throw new BusinessObjectException("Cannot delete gallery.", ex);
}
return true;
}
We can read in Programming Language Pragmatics by Michael L. Scott that modern compilers do not add any overhead in the common case, that is, when no exception occurs. All of the work is done at compile time.
But when an exception is thrown at run time, the runtime needs to perform a binary search to find the correct handler, and this happens for every throw.
But exceptions are exceptional, and this cost is perfectly acceptable. If you try to handle errors without exceptions and return error codes instead, you will probably need an if statement after every subroutine call, and this incurs real run-time overhead. An if statement is converted to a few assembly instructions, which are executed every time you call your subroutines.
Sorry about my English; I hope this helps. This information is based on the cited book; for more information, refer to Chapter 8.5, Exception Handling.
Let us analyse one of the biggest possible costs of a try/catch block when used where it shouldn't need to be used:
int x;
try {
x = int.Parse("1234");
}
catch {
return;
}
// some more code here...
And here's the one without try/catch:
int x;
if (int.TryParse("1234", out x) == false) {
return;
}
// some more code here
Not counting the insignificant white-space, one might notice that these two equivalent pieces of code are almost exactly the same length in bytes. The latter contains 4 bytes less indentation. Is that a bad thing?
To add insult to injury, a student decides to loop while the input can be parsed as an int. The solution without try/catch might be something like:
while (int.TryParse(...))
{
...
}
But how does this look when using try/catch?
try {
for (;;)
{
x = int.Parse(...);
...
}
}
catch
{
...
}
Try/catch blocks are magical ways of wasting indentation, and we still don't even know the reason it failed! Imagine how the person doing debugging feels, when code continues to execute past a serious logical flaw, rather than halting with a nice obvious exception error. Try/catch blocks are a lazy man's data validation/sanitation.
One of the smaller costs is that try/catch blocks do indeed disable certain optimizations: http://msmvps.com/blogs/peterritchie/archive/2007/06/22/performance-implications-of-try-catch-finally.aspx. I guess that's a positive point too. It can be used to disable optimizations that might otherwise cripple safe, sane message passing algorithms for multithreaded applications, and to catch possible race conditions ;) That's about the only scenario I can think of to use try/catch. Even that has alternatives.
