Good morning,
Say I have a class ClassA, an operator + which sums two objects of type ClassA, an implicit conversion from int to ClassA, and that I want to overload the operator ++... Supposing the code for + is rather long, but that the sum of a ClassA and 1 is a very particular case of it, which option is better?
1. Implement ++ using + and the implicit conversion already defined.
2. Repeat part of the + code, which simplifies a lot when adding just 1.
My idea is that (2) is better, since it saves the creation of a new ClassA object by the implicit conversion, which can be quite useful if the ++ operator is used, for example, in a for loop. Also, speed is a must.
Thank you very much.
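For reference, here is a minimal sketch of the two options, assuming a hypothetical ClassA that wraps a single int (the real class and its long + implementation are not shown in the question):

public class ClassA
{
    private readonly int value;

    public ClassA(int value) { this.value = value; }

    // The implicit conversion from int mentioned in the question.
    public static implicit operator ClassA(int i)
    {
        return new ClassA(i);
    }

    public static ClassA operator +(ClassA a, ClassA b)
    {
        // Imagine the long, general-purpose implementation here.
        return new ClassA(a.value + b.value);
    }

    // Option 1: reuse + and the implicit conversion. The literal 1 is
    // converted first, allocating a temporary ClassA on every increment.
    public static ClassA operator ++(ClassA a)
    {
        return a + 1;
    }

    // Option 2 (shown as a comment, since only one ++ overload can exist):
    // duplicate the simplified logic and skip the temporary object.
    // public static ClassA operator ++(ClassA a)
    // {
    //     return new ClassA(a.value + 1);
    // }
}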
You've answered your own question. If speed is a must, then go with the second, faster option (it's a good idea to benchmark it to make sure it really is significantly faster though).
Otherwise, go with the first option, since less code is better (and staying DRY doubly so). Less code means fewer potential bugs, less to maintain, less to write, and less to read. If the code largely duplicates another section of code, then you'd have to keep the two in sync as you make changes -- this invites trouble, as it's easy to forget to update one (and even if you always remember to change both places, since they're not exactly identical it's possible to correctly update one section and incorrectly update the other).
Make sure speed really is a must though before making your final decision -- you don't want premature optimization.
Either way is acceptable. It sounds like the second way is what you're already leaning towards, so try it. In fact, try it both ways and measure the time it takes to increment a million times. Benchmarking is always the way to make these decisions.
In case you haven't done any benchmarking before, the simplest way is to create a System.Diagnostics.Stopwatch and start/stop it around the relevant code. You can then write the elapsed time to a console.
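A minimal sketch of that approach, assuming the hypothetical ClassA sketched above:

using System;
using System.Diagnostics;

class Program
{
    static void Main()
    {
        ClassA a = new ClassA(0);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < 1000000; i++)
        {
            a++; // the operator being measured
        }
        sw.Stop();

        Console.WriteLine("Elapsed: " + sw.ElapsedMilliseconds + " ms");
    }
}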
My opinion is that if +1 really is a special case that is much simpler, write the special ++ implementation. You can always comment out the specialized version and fall back to + 1 later if you want to keep your code small.
Otherwise, it will be too easy to forget about this special optimization six months down the road when you are trying to optimize.
Premature optimization means optimizing before you know that you need to, not optimizing when you have a clear rationale for doing so. Where to draw the line is difficult, however; you'd need to decide how much simpler the ++ code must be before it's worth putting in.
Related
Please ignore code readability in this question.
In terms of performance, should the following code be written like this:
int maxResults = criteria.MaxResults;
if (maxResults > 0)
{
while (accounts.Count > maxResults)
accounts.RemoveAt(maxResults);
}
or like this:
if (criteria.MaxResults > 0)
{
while (accounts.Count > criteria.MaxResults)
accounts.RemoveAt(criteria.MaxResults);
}
?
Edit: criteria is a class, and MaxResults is a simple integer property (i.e., public int MaxResults { get { return _maxResults; } }).
Does the C# compiler treat MaxResults as a black box and evaluate it every time? Or is it smart enough to figure out that I've got 3 calls to the same property with no modification of that property between the calls? What if MaxResults were a field?
One of the laws of optimization is precalculation, so I instinctively wrote this code like the first listing, but I'm curious if this kind of thing is being done for me automatically (again, ignore code readability).
(Note: I'm not interested in hearing the 'micro-optimization' argument, which may be valid in the specific case I've posted. I'd just like some theory behind what's going on or not going on.)
First off, the only way to actually answer performance questions is to actually try it both ways and test the results in realistic conditions.
That said, the other answers which say that "the compiler" does not do this optimization because the property might have side effects are both right and wrong. The problem with the question (aside from the fundamental problem that it simply cannot be answered without actually trying it and measuring the result) is that "the compiler" is actually two compilers: the C# compiler, which compiles to MSIL, and the JIT compiler, which compiles IL to machine code.
The C# compiler never ever does this sort of optimization; as noted, doing so would require that the compiler peer into the code being called and verify that the result it computes does not change over the lifetime of the caller's code. The C# compiler does not do so.
The JIT compiler might. No reason why it couldn't. It has all the code sitting right there. It is completely free to inline the property getter, and if the jitter determines that the inlined property getter returns a value that can be cached in a register and re-used, then it is free to do so. (If you don't want it to do so because the value could be modified on another thread then you already have a race condition bug; fix the bug before you worry about performance.)
Whether the jitter actually does inline the property fetch and then enregister the value, I have no idea. I know practically nothing about the jitter. But it is allowed to do so if it sees fit. If you are curious about whether it does so or not, you can either (1) ask someone who is on the team that wrote the jitter, or (2) examine the jitted code in the debugger.
And finally, let me take this opportunity to note that computing results once, storing the result and re-using it is not always an optimization. This is a surprisingly complicated question. There are all kinds of things to optimize for:
execution time
executable code size -- this has a major effect on execution time because big code takes longer to load, increases the working set size, puts pressure on processor caches, RAM and the page file. Small slow code is often in the long run faster than big fast code in important metrics like startup time and cache locality.
register allocation -- this also has a major effect on execution time, particularly in architectures like x86 which have a small number of available registers. Enregistering a value for fast re-use can mean that there are fewer registers available for other operations that need optimization; perhaps optimizing those operations instead would be a net win.
and so on. It gets real complicated real fast.
In short, you cannot possibly know whether writing the code to cache the result rather than recomputing it is actually (1) faster, or (2) better performing. Better performance does not always mean making execution of a particular routine faster. Better performance is about figuring out what resources are important to the user -- execution time, memory, working set, startup time, and so on -- and optimizing for those things. You cannot do that without (1) talking to your customers to find out what they care about, and (2) actually measuring to see if your changes are having a measurable effect in the desired direction.
If MaxResults is a property then no, it will not optimize it, because the getter may have complex logic, say:
private int _maxResults;
public int MaxResults {
get { return _maxResults++; }
set { _maxResults = value; }
}
See how the behavior would change if the compiler in-lined that getter into your code?
If there's no logic...either method you wrote is fine, it's a very minute difference and all about how readable it is TO YOU (or your team)...you're the one looking at it.
Your two code samples are only guaranteed to have the same result in single-threaded environments, which .Net isn't, and if MaxResults is a field (not a property). The compiler can't assume, unless you use the synchronization features, that criteria.MaxResults won't change during the course of your loop. If it's a property, it can't assume that using the property doesn't have side effects.
Eric Lippert points out quite correctly that it depends a lot on what you mean by "the compiler". The C# -> IL compiler? Or the IL -> machine code (JIT) compiler? And he's right to point out that the JIT may well be able to optimize the property getter, since it has all of the information (whereas the C# -> IL compiler doesn't, necessarily). It won't change the situation with multiple threads, but it's a good point nonetheless.
It will be called and evaluated every time. The compiler has no way of determining if a method (or getter) is deterministic and pure (no side effects).
Note that actual evaluation of the property may be inlined by the JIT compiler, making it effectively as fast as a simple field.
It's good practice to make property evaluation an inexpensive operation. If you do some heavy calculation in the getter, consider caching the result manually, or changing it to a method.
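A minimal sketch of the manual-caching idea, where ComputeMaxResults() stands in for a hypothetical heavy calculation (it is not part of the question):

private int? _maxResultsCache;

public int MaxResults
{
    get
    {
        if (_maxResultsCache == null)
        {
            // The expensive work runs only on the first access.
            _maxResultsCache = ComputeMaxResults();
        }
        return _maxResultsCache.Value;
    }
}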
Why not test it?
Just set up two console apps, make each loop 10 million times, and compare the results... Remember to run them as properly released apps that have been installed properly, or else you cannot guarantee that you are not just running the MSIL.
Really, you are probably going to get about 5 answers saying 'you shouldn't worry about optimisation'. They clearly do not write routines that need to be as fast as possible before being readable (e.g. games).
If this piece of code is part of a loop that is executed billions of times, then this optimisation could be worthwhile. For instance, MaxResults could be an overridden property, and so you may need to consider the cost of virtual method calls.
Really, the ONLY way to answer any of these questions is to figure out whether this is a piece of code that will benefit from optimisation. Then you need to know what kinds of things are increasing the time to execute. Really, us mere mortals cannot do this a priori, and so we have to simply try 2-3 different versions of the code and then test them.
If criteria is a class type, I doubt it would be optimized, because another thread could always change that value in the meantime. For structs I'm not sure, but my gut feeling is that it won't be optimized, but I think it wouldn't make much difference in performance in that case anyhow.
Would it be better for performance to use if statements over and over again, or to use them once and use delegates to call functions based on the output from the IF statements? I want to say that the answer is obviously delegates, but I'm not sure if going to different methods over and over is faster or slower than many IF statements that do the same thing. I hope I explained it right.
PS The framework I need to know this for is XNA, if it matters.
You have your trade-offs. The best answer was commented already, and that is to profile both and then figure it out. IF statements may take more CPU because they have to do the comparisons again and again. On the other hand, using delegates takes more memory, and it's another object you need to keep around.
Personally, what I've liked doing best (when applicable; I don't know the full context of your question) is turning your IF ELSE statements into a switch-case. This works really well for state machines and other repetitive processes, plus you eliminate all that branching that comes with IFs. However, this assumes that the values you are checking for are all relatively close in range, or else you'll be causing a lot of pain for the compiler.
That is a useful technique, sure. Clearly, for only a few if's delegates don't make sense because they are costly and the CPU cannot easily predict the branch target. And for very many if's a one-time initialized delegate makes sense.
It is unclear where the break-even point is. That one needs to be measured.
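A minimal sketch of the one-time-initialized delegate approach mentioned above (HandleA and HandleB are hypothetical handlers):

using System;

class Dispatcher
{
    private readonly Action handler;

    public Dispatcher(bool useA)
    {
        // The condition is tested once, here, instead of on every call.
        handler = useA ? new Action(HandleA) : new Action(HandleB);
    }

    public void Tick()
    {
        handler(); // one indirect call, no repeated comparisons
    }

    static void HandleA() { /* ... */ }
    static void HandleB() { /* ... */ }
}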
Good afternoon,
I am writing a simple lexer which is basically a modified version of this one. After getting each token I need to perform slight modifications and re-analyse it to re-check its type. Also, of course, after the lexical analysis I need to re-use the whole token list to do a kind of "parsing" on it. My question is whether using IEnumerable<Token> and yield return statements in the lexer can make the whole program's performance slower... Would it be preferable to use a List<Token>, build the list iteratively, and use a normal return statement? And what about iterating through the IEnumerable/List -- which one is faster?
Thank you very much.
You are asking the wrong question; you should be worried far more about the cost of Regex. Enumerating the tokens will be a very small fraction of that; there's just no point in optimizing code that could be twice as fast but only improves program perf by 1%.
Write the code, profile it, and you'll know what to do for version 2. Given that these kinds of tools run at 'human time' (there is no perceptible difference if a program that needs 20 milliseconds takes twice as long), the most likely result is "nothing needs to be done".
It's possible that it will have some performance impact - but it also allows the iterator to be built lazily.
Personally I'd write the code in the most readable way and measure its performance, and only then start worrying about micro-optimizing this sort of thing. Test it one way, test it the other way, see how much readability you lose (if any) by using the most performant solution, and how much speed you actually gain.
Note that there's a very slight performance benefit to iterating over an expression which is known to be of type List<T> vs iterating over an IEnumerable<T> which happens to be implemented by List<T>, as List<T> implements the iterator itself using a mutable struct... basically you'll end up with a boxed value if you use the higher abstraction layer, but in that particular case I would almost certainly prefer using the right abstraction over the tiny performance improvement.
IEnumerable and yield return statements are converted into a GetEnumerator() method and an enumerator implementation in the IL code.
Although yield return has its merits in terms of doing some additional work for each token as it is returned during enumeration, I'd stick to building the List and returning it, as that incurs fewer method calls and therefore should be faster.
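For concreteness, a minimal sketch of the two shapes being compared, using string in place of the question's Token type so the example stands alone:

using System;
using System.Collections.Generic;

static class LexerShapes
{
    // Lazy: tokens are produced one at a time, as the caller iterates.
    public static IEnumerable<string> LexLazily(string input)
    {
        foreach (string token in input.Split(' '))
            yield return token;
    }

    // Eager: the whole list is built up front and returned once.
    public static List<string> LexEagerly(string input)
    {
        var tokens = new List<string>();
        foreach (string token in input.Split(' '))
            tokens.Add(token);
        return tokens;
    }
}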
By now, I'm sure you'll see that you're trying to optimize prematurely, which is, according to many, the root of all evil.
However, if you REALLY want to speed this up, regular expressions seem an expensive way to do it. Every time you do a Regex.Match(), you're scanning the string again, which results in at least as many scans as you have tokens.
If you know the boundaries that define a token ('{' and '}', for example), you could scan the string once to build the enumerable of tokens (with yield, or list, I don't think that'll make much difference). The caller can then rebuild the string, looking up the values to replace the tokens with.
Of course, this would only work with simple "search and replace" type tokens. More complex ones would require something more sophisticated, such as a regex. Perhaps you could extend the TokenDefinition to specify whether the match is a simple one or a regex one. This would cut down the number of regular expressions performed, but still keep the flexibility required.
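A minimal sketch of the single-scan idea, under the stated assumption that '{' and '}' delimit tokens (the question's real token definitions are not shown here):

using System.Collections.Generic;

static class SimpleScanner
{
    // Walks the string once, yielding the text between each '{' and '}'.
    public static IEnumerable<string> ScanTokens(string input)
    {
        int i = 0;
        while (i < input.Length)
        {
            int open = input.IndexOf('{', i);
            if (open < 0) yield break;
            int close = input.IndexOf('}', open + 1);
            if (close < 0) yield break;
            yield return input.Substring(open + 1, close - open - 1);
            i = close + 1;
        }
    }
}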
Does anyone have advice for using params in C# for method argument passing? I'm contemplating making overloads for the first 6 arguments and then a 7th using the params feature. My reasoning is to avoid the extra array allocation the params feature requires. This is for some high-performance utility methods. Any advice? Is it a waste of code to create all the overloads?
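For reference, a minimal sketch of the pattern being considered, with hypothetical Process methods (this mirrors what String.Concat does, as noted in the answers below):

public static class Util
{
    // Dedicated overloads: no array allocation for the common arities.
    public static void Process(int a) { /* ... */ }
    public static void Process(int a, int b) { /* ... */ }
    public static void Process(int a, int b, int c) { /* ... */ }

    // Fallback: callers passing more arguments pay for one array allocation.
    public static void Process(params int[] args)
    {
        foreach (int a in args) { /* ... */ }
    }
}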
Honestly, I'm a little bothered by everyone shouting "premature optimization!" Here's why.
What you say makes perfect sense, particularly as you have already indicated you are working on a high-performance library.
Even BCL classes follow this pattern. Consider all the overloads of string.Format or Console.WriteLine.
This is very easy to get right. The whole premise behind the movement against premature optimization is that when you do something tricky for the purposes of optimizing performance, you're liable to break something by accident and make your code less maintainable. I don't see how that's a danger here; it should be very straightforward what you're doing, to yourself as well as any future developer who may deal with your code.
Also, even if you profiled the results of both approaches and saw only a very small difference in speed, there's still the issue of memory allocation. Creating a new array for every method call entails allocating more memory that will need to be garbage collected later. And in some scenarios where "nearly" real-time behavior is desired (such as algorithmic trading, the field I'm in), minimizing garbage collections is just as important as maximizing execution speed.
So, even if it earns me some downvotes: I say go for it.
(And to those who claim "the compiler surely already does something like this"--I wouldn't be so sure. Firstly, if that were the case, I fail to see why BCL classes would follow this pattern, as I've already mentioned. But more importantly, there is a very big semantic difference between a method that accepts multiple arguments and one that accepts an array. Just because one can be used as a substitute for the other doesn't mean the compiler would, or should, attempt such a substitution).
Yes, that's the strategy that the .NET Framework uses. String.Concat() would be a good example. It has overloads for up to 4 strings, plus a fallback one that takes a params string[]. Pretty important here: Concat needs to be fast, and it is there to help the user fall into the pit of success when he uses the + operator instead of a StringBuilder.
The code duplication you'll get is the price. You'd profile them to see if the speedup is worth the maintenance headache.
Fwiw: there are plenty of micro-optimizations like this in the .NET framework. Somewhat necessary because the designers could not really predict how their classes were going to be used. String.Concat() is just as likely to be used in a tight inner loop that is critical to program perf as, say, a config reader that only runs once at startup. As the end-user of your own code, you typically have the luxury of not having to worry about that. The reverse is also true, the .NET framework code is remarkably free of micro-optimizations when it is unlikely that their benefit would be measurable. Like providing overloads when the core code is slow anyway.
You can always pass Tuple as a parameter, or if the types of the parameters are always the same, an IList<T>.
As other answers and comments have said, you should only optimize after:
Ensuring correct behavior.
Determining the need to optimize.
My point is, if your method is capable of accepting an unlimited number of parameters, then the logic inside it works in an array style. So, having overloads for a limited number of parameters wouldn't help -- unless you can implement the limited-parameter versions in a whole different way that is much faster.
For example, if you're handing the parameters to a Console.WriteLine, there's a hidden array creation in there too, so either way you end up having an array.
And, sorry to bother Dan Tao, but I also feel this is premature optimization, because you need to know what difference it would make to have overloads with a limited number of parameters. If your application is that performance-critical, you'd need to implement both ways, run a test, and compare execution times.
Don't even think about performance at this stage. Create whatever overloads will make your code easier to write and easier to understand at 4am two years from now. Sometimes that means params, sometimes that means avoiding it.
After you've got something that works, figure out if these are a performance problem. It's not hard to make the parameters more complicated, but if you add unnecessary complexity now, you'll never make them less so later.
You can try something like this to benchmark the performance so you have some concrete numbers to make decisions with.
In general, object allocation is slightly faster than in C/C++ and deletion is much, much faster for small objects -- until you have tens of thousands of them being made per second. Here's an old article regarding memory allocation performance.
I had someone advise me to avoid repeatedly calling String.Length, because it was recalculated each time I called it. I had assumed that String.Length ran in O(1) time. Is String.Length more complex than that?
That's bad advice - String.Length is indeed O(1). It's not like strlen in C.
Admittedly it's not guaranteed in the docs as far as I can tell, but the immutability of strings makes it a pretty silly thing not to make O(1). (And not just O(1), but a very fast constant time too.)
Frankly if someone is giving that sort of advice, I would become a bit more skeptical about other advice they may provide too...
String.Length is O(1). The reason people tell you not to call it in a loop is because it's a property access, which is the same as a method call. In reality one extra method call rarely makes any significant difference.
As always, don't start going through your code caching all calls to String.Length unless your profiler says it's the source of a performance problem.
Recall that strings are immutable. System.String.Length never changes.
No it does not recalculate. String type is immutable.
To take this further, per the .NET Framework design guidelines, a value that is essentially static and non-volatile in nature should be exposed as a property. If the value is volatile in nature and needs recalculation on every call, it should be made available as a method instead.
You can use this rule of thumb when judging whether a property is likely to take enough processing cycles to deserve your attention.
As others have said, String.Length is a constant-time property. If you really care about performance (or have a significant number of iterations), you could assign its value to a local integer variable once and read that many times (in a loop, etc.). This would give the optimizer a better chance of allocating the value to a CPU register. Accessing a property is a much more expensive operation than reading a stack variable or a register.
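A minimal sketch of that suggestion (the string source and loop body are just placeholders):

using System;

class LengthHoisting
{
    static void Main()
    {
        string s = Console.ReadLine() ?? "";  // any string source will do
        int len = s.Length;                   // read the property once

        for (int i = 0; i < len; i++)
        {
            Console.Write(s[i]);              // placeholder loop body
        }
    }
}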
According to the internal comments, the String.Length property is a single instruction that does not run a for loop. Therefore, it is an O(1) operation.