Does anyone have advice for using the params in C# for method argument passing. I'm contemplating making overloads for the first 6 arguments and then a 7th using the params feature. My reasoning is to avoid the extra array allocation the params feature require. This is for some high performant utility methods. Any advice? Is it a waste of code to create all the overloads?
Honestly, I'm a little bothered by everyone shouting "premature optimization!" Here's why.
What you say makes perfect sense, particularly as you have already indicated you are working on a high-performance library.
Even BCL classes follow this pattern. Consider all the overloads of string.Format or Console.WriteLine.
This is very easy to get right. The whole premise behind the movement against premature optimization is that when you do something tricky for the purposes of optimizing performance, you're liable to break something by accident and make your code less maintainable. I don't see how that's a danger here; it should be very straightforward what you're doing, to yourself as well as any future developer who may deal with your code.
Also, even if you profiled the results of both approaches and saw only a very small difference in speed, there's still the issue of memory allocation. Creating a new array for every method call entails allocating more memory that will need to be garbage collected later. And in some scenarios where "nearly" real-time behavior is desired (such as algorithmic trading, the field I'm in), minimizing garbage collections is just as important as maximizing execution speed.
So, even if it earns me some downvotes: I say go for it.
(And to those who claim "the compiler surely already does something like this"--I wouldn't be so sure. Firstly, if that were the case, I fail to see why BCL classes would follow this pattern, as I've already mentioned. But more importantly, there is a very big semantic difference between a method that accepts multiple arguments and one that accepts an array. Just because one can be used as a substitute for the other doesn't mean the compiler would, or should, attempt such a substitution).
Yes, that's the strategy that the .NET framework uses. String.Concat() would be a good example. It has overloads for up to 4 strings, plus a fallback one that takes a params string[]. Pretty important here, Concat needs to be fast and is there to help the user fall in the pit of success when he uses the + operator instead of a StringBuilder.
The code duplication you'll get is the price. You'd profile them to see if the speedup is worth the maintenance headache.
Fwiw: there are plenty of micro-optimizations like this in the .NET framework. Somewhat necessary because the designers could not really predict how their classes were going to be used. String.Concat() is just as likely to be used in a tight inner loop that is critical to program perf as, say, a config reader that only runs once at startup. As the end-user of your own code, you typically have the luxury of not having to worry about that. The reverse is also true, the .NET framework code is remarkably free of micro-optimizations when it is unlikely that their benefit would be measurable. Like providing overloads when the core code is slow anyway.
You can always pass Tuple as a parameter, or if the types of the parameters are always the same, an IList<T>.
As other answers and comments have said, you should only optimize after:
Ensuring correct behavior.
Determining the need to optimize.
My point is, if your method is capable of getting unlimited number of parameters, then the logic inside it works in an array-style. So, having overloads for limited number of parameters wouldn't be helping. Unless, you can implement limited number of parameters in a whole different way that is much faster.
For example, if you're handing the parameters to a Console.WriteLine, there's a hidden array creation in there too, so either way you end up having an array.
And, sorry for bothering Dan Tao, I also feel like it is premature optimization. Because you need to know what difference would it make to have overloads with limited number of parameters. If your application is that much performance-critical, you'd need to implement both ways and try to run a test and compare execution times.
Don't even think about performance at this stage. Create whatever overloads will make your code easier to write and easier to understand at 4am two years from now. Sometimes that means params, sometimes that means avoiding it.
After you've got something that works, figure out if these are a performance problem. It's not hard to make the parameters more complicated, but if you add unnecessary complexity now, you'll never make them less so later.
You can try something like this to benchmark the performance so you have some concrete numbers to make decisions with.
In general, object allocation is slightly faster than in C/C++ and deletion is much, much faster for small objects -- until you have tens of thousands of them being made per second. Here's an old article regarding memory allocation performance.
Related
I have one question. Instead of writing big method(including big business logic), i preferred to divide this method in small methods and call them in one method because for me it looks so neat and easy to maintain. But my Team Lead said that "Don't write small methods and call them in one because it consumes more memory while you call small methods." Is that correct ?
Please suggest what should i do in this case ? and once again thank you for your valuable time
There are many factors that come into play here. More context of your project would be required to give any strict conclusions.
Generally speaking though, C#, VB and managed languages in general were devised to prioritize developer productivity over performance. In that light, worrying about method call memory consumption seems questionable.
Additionally, IL-based languages (C#, VB, ...) use a JIT that compiles the intermediate code to CPU-specific assembly in runtime. JIT's unit of work is a method. The bigger the method, the less optimizations JIT can do. Therefore a big method may yield worse performance than many small methods doing the same work. In addition, JIT can also do an optimization called inlining where a small method code is generated inside its caller, eliding the function call altogether.
Function call takes very little memory by C#/VB's terms. Unless you're working in a very constrained environment (e.g. embedded), such optimization doesn't really make sense, especially when not backed by any reasonable arguments.
You are both mistaken.
OOP is built on a concept of divide and conquer so you should divide your method into small methods for the sake of reuse ability and maintainability.
About the memory consumed, I don't think it will consume more memory but this may happen when you create methods for each small task.
So yes divide them into small methods only if need to, with respect of resources and sharing variables.
Would it be better for performance to use if statements over and over again, or to use them once and use delegates to call functions based on the output from the IF statements? I want to say that the answer is obviously delegates, but I'm not sure if going to different methods over and over is faster or slower than many IF statements that do the same thing. I hope I explained it right.
PS The framework I need to know this for is XNA, if it matters.
You have your trade offs. The best answer was commented already and that is to profile both and then figure it out. IF statements may take more CPU because it has to do the comparisons again and again. On the other hand using delegates takes more memory and it's another object you need to keep around.
Personally what I've like doing the best (when applicable, don't know the full context of your question) is turning your IF ELSE statements into a switch-case. This works really well for state machines and other repetitive processes plus you eliminate all that branching that comes with IFs. However, this is assuming that the values your are checking for are all relatively close in range or else you'll be causing a lot of pain for the compiler.
That is a useful technique, sure. Clearly, for only a few if's delegates don't make sense because they are costly and the CPU cannot easily predict the branch target. And for very many if's a one-time initialized delegate makes sense.
It is unclear where the break-even point is. That one needs to be measured.
What is better/preferred - Creating a method of 2 lines which accepts a web control as a parameter, operates on it and is called from 3-4 places within the same code file or writing those 2 lines at the 3-4 places and not creating the method?
P.S. The control I am referring here is a textbox.
All it is passing is a reference. There will be no significant cost to this whatsoever. If the method is small and linear, the JIT may even choose to inline it - but ultimately, this is not going to make any difference.
Stick with the method approach - then you only have one place to maintain.
For maintainability it is better to break out the lines to a method.
Performance wise you will not notice any difference at all.
Unless you're working on code that needs to run in a Nuclear Powerplant or a NASA landrover on Mars, you're always better of writing code that is easier for YOU to maintain! And that means refactoring your code so you never repeat yourself.
Theoretically its of course faster to have the instructions inline and not call a method, but in practice it far outweighs the cons maintaining it.
The DRY (Don't Repeat Yourself) principle says that you create the method and call it the 3-4 times. The performance hit is so minimal that it really isn't worth thinking about unless the consequence of those billionths of a second outweigh the additional overhead of maintaining the code 3-4 times over.
In short, unless you can really really really justify the additional maintenance over grabbing every last processor cycle* overhead then create the method.
(*) Given this is a textbox and therefore most likely a business application your biggest performance worries more likely are databases and webservices.
Careful where the method that modifies the control resides. If the method is part of another class, then passing controls to it will break the encapsulation of the class that owns the control.
I find a lot of cases where I think to myself that I could use relfection to solve a problem, but I usually don't because I hear a lot along the lines of "don't use reflection, it's too inefficient".
Now I'm in a position where I have a problem where I can't find any other solution than to use reflection with new T(), as outlined in this question & answer.
So I'm wondering if somebody can tell me reflection's specific intended usage, and if there's a set of guidelines to indicate when it's appropriate and when it isn't?
It is often "fast enough", and if you need faster (for tight loops etc) you can do meta-programming with Expression or ILGenerator (perhaps via DynamicMethod), to make extremely fast code (including some tricks you can't do in C#).
Reflection is more commonly used for framework/library scenarios, where the library by definition knows nothing about the caller, and must work based on configuration, attributes or patterns.
If there's one thing that I hate hearing it's "don't use reflection, it's too inefficient".
Too inefficient for what? If you're writing a console application that's run once a month and isn't time critical, does it really matter if it takes 30 seconds instead of 28, because of you using reflection?
Guidelines for when it's inappropriate to use are ones that only you can really put together as they're heavily dependent on what you're doing and how efficient/performant alternatives are.
A useful abstraction for code efficiency is to partition it in three categories of time, each about 3 orders of magnitude apart.
First is human-time. There's a lot you can do when you only need to keep a person happy with the performance of your code. Humans cannot perceive the difference between code that needs 10 milliseconds or 20 milliseconds, both look instant. And a human is forgiving when a program needs 6 seconds instead of 5, roughly 3 billion machine instructions more. Common examples of programs that run at human-time are compilers and point-and-click designers. Using reflection is never a problem.
Then there is I/O-time. When your program needs to hit the disk or the network. I/O is slow, restricted by mechanical motion in the case of the disk, bandwidth and latency in the case of a network. You can always tell when I/O is the bottleneck, your program is running but it isn't driving up the CPU load much. The operating system is constantly blocking the thread, making it wait until the I/O request is complete.
Reflection operates at I/O-time. To retrieve type data, the CLR must read the assembly metadata. And when that wasn't done before, your program will cause a page-fault, requiring the operating system to read the data from disk. What follows is that, roughly, reflection can make I/O bound code only twice as slow. Usually better because after the first perf hit, the metadata is cached and can be retrieved a lot quicker. Reflection is thus often an acceptable trade-off. The canonical examples are serialization and dbase ORMs.
Then there's machine-time. The raw performance of a CPU core is stupendous. A property getter can execute in somewhere between 0 and 1/2 a nanosecond. This does not compare favorably with, say, PropertyInfo.GetValue(). Both will keep the CPU busy, you'll see the CPU load for the core at 100%. But GetValue() costs hundreds if not thousands of machine code instructions. Not counting the time needed to page in the metadata. While not much an incremental time, it builds up fast when you loop.
If you cannot classify your reflection code in the human-time or I/O-time categories then reflection is unlikely to be an appropriate substitute for regular code.
The key to keeping reflection from slowing down your program is to not use it inside a loop. If you want to read a property from an object during startup (happens once), use reflection. You want to read a property from a list of 10,000 objects of unknown type, use reflection to get the property getter delegate once (search term: PropertyInfo.GetGetMethod), then call the delegate 10,000 types. There are plenty of examples of this on StackOverflow.
Reflection is not inefficient. It is less efficient than direct calls. So personnaly I use reflection when there's no equivalent compile time safe method. IMHO the problem with reflection is not so much the efficiency but the fragility of the code as it uses magic strings which are very refactor unfriendly.
I use it for plugin architecture - looking through assemblies in the plugin folder for methods marked with a custom attribute indicating info about the plugin - and in a logging framework. The framework detects a custom attribute on the assembly itself which holds information about the author of the assembly, the project, version information, and other tags that are logged along with everything in the stack trace.
Going to give away a 'trade secret', but it's a good one. The framework allows you to tag each method or class with a 'Story ref', e.g.
[StoryRef(Ref="ImportCSV1")]
...and the idea is it would integrate into our agile project management framework: if there were any exceptions thrown within that class/method, the logging method would use reflection to check for a StoryRef attribute in the stack trace, and if so that would be logged as an exception against that story. In the PM software you could see exceptions by Story (a story is like an extreme/agile use case).
I think that's a valid use, at least! Basically, when it just seems the most neat, and appropriate way to do it, I use reflection. Nothing else really comes into it - I can't think of an occasion you'd be using reflection to make that many calls that efficiency would come into it.
So I'm wondering if somebody can tell
me reflection's specific intended
usage, and if there's a set of
guidelines to indicate when it's
appropriate and when it isn't?
A bad example of reflection is this one from Wikipedia:
//Without reflection
Foo foo = new Foo();
foo.Hello();
//With reflection
Type t = Type.GetType("FooNamespace.Foo");
object foo = Activator.CreateInstance(t);
t.InvokeMember("Hello", BindingFlags.InvokeMethod, null, foo, null);
Here, there is no advantage to using reflection: The non-reflection-using code is not only more efficient, but easier to understand.
Good uses of reflection are things like serialization and object-relational mapping, which are easy to implement if you have a list of a class's properties, but otherwise require a custom-written function for each class.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
It seems like optimization is a lost art these days. Wasn't there a time when all programmers squeezed every ounce of efficiency from their code? Often doing so while walking five miles in the snow?
In the spirit of bringing back a lost art, what are some tips that you know of for simple (or perhaps complex) changes to optimize C#/.NET code? Since it's such a broad thing that depends on what one is trying to accomplish it'd help to provide context with your tip. For instance:
When concatenating many strings together use StringBuilder instead. See link at the bottom for caveats on this.
Use string.Compare to compare two strings instead of doing something like string1.ToLower() == string2.ToLower()
The general consensus so far seems to be measuring is key. This kind of misses the point: measuring doesn't tell you what's wrong, or what to do about it if you run into a bottleneck. I ran into the string concatenation bottleneck once and had no idea what to do about it, so these tips are useful.
My point for even posting this is to have a place for common bottlenecks and how they can be avoided before even running into them. It's not even necessarily about plug and play code that anyone should blindly follow, but more about gaining an understanding that performance should be thought about, at least somewhat, and that there's some common pitfalls to look out for.
I can see though that it might be useful to also know why a tip is useful and where it should be applied. For the StringBuilder tip I found the help I did long ago at here on Jon Skeet's site.
It seems like optimization is a lost art these days.
There was once a day when manufacture of, say, microscopes was practiced as an art. The optical principles were poorly understood. There was no standarization of parts. The tubes and gears and lenses had to be made by hand, by highly skilled workers.
These days microscopes are produced as an engineering discipline. The underlying principles of physics are extremely well understood, off-the-shelf parts are widely available, and microscope-building engineers can make informed choices as to how to best optimize their instrument to the tasks it is designed to perform.
That performance analysis is a "lost art" is a very, very good thing. That art was practiced as an art. Optimization should be approached for what it is: an engineering problem solvable through careful application of solid engineering principles.
I have been asked dozens of times over the years for my list of "tips and tricks" that people can use to optimize their vbscript / their jscript / their active server pages / their VB / their C# code. I always resist this. Emphasizing "tips and tricks" is exactly the wrong way to approach performance. That way leads to code which is hard to understand, hard to reason about, hard to maintain, that is typically not noticably faster than the corresponding straightforward code.
The right way to approach performance is to approach it as an engineering problem like any other problem:
Set meaningful, measurable, customer-focused goals.
Build test suites to test your performance against these goals under realistic but controlled and repeatable conditions.
If those suites show that you are not meeting your goals, use tools such as profilers to figure out why.
Optimize the heck out of what the profiler identifies as the worst-performing subsystem. Keep profiling on every change so that you clearly understand the performance impact of each.
Repeat until one of three things happens (1) you meet your goals and ship the software, (2) you revise your goals downwards to something you can achieve, or (3) your project is cancelled because you could not meet your goals.
This is the same as you'd solve any other engineering problem, like adding a feature -- set customer focused goals for the feature, track progress on making a solid implementation, fix problems as you find them through careful debugging analysis, keep iterating until you ship or fail. Performance is a feature.
Performance analysis on complex modern systems requires discipline and focus on solid engineering principles, not on a bag full of tricks that are narrowly applicable to trivial or unrealistic situations. I have never once solved a real-world performance problem through application of tips and tricks.
Get a good profiler.
Don't bother even trying to optimize C# (really, any code) without a good profiler. It actually helps dramatically to have both a sampling and a tracing profiler on hand.
Without a good profiler, you're likely to create false optimizations, and, most importantly, optimize routines that aren't a performance problem in the first place.
The first three steps to profiling should always be 1) Measure, 2) measure, and then 3) measure....
Optimization guidelines:
Don't do it unless you need to
Don't do it if it's cheaper to throw new hardware at the problem instead of a developer
Don't do it unless you can measure the changes in a production-equivalent environment
Don't do it unless you know how to use a CPU and a Memory profiler
Don't do it if it's going to make your code unreadable or unmaintainable
As processors continue to get faster the main bottleneck in most applications isn't CPU, it's bandwidth: bandwidth to off-chip memory, bandwidth to disk and bandwidth to net.
Start at the far end: use YSlow to see why your web site is slow for end-users, then move back and fix you database accesses to be not too wide (columns) and not too deep (rows).
In the very rare cases where it's worth doing anything to optimize CPU usage be careful that you aren't negatively impacting memory usage: I've seen 'optimizations' where developers have tried to use memory to cache results to save CPU cycles. The net effect was to reduce the available memory to cache pages and database results which made the application run far slower! (See rule about measuring.)
I've also seen cases where a 'dumb' un-optimized algorithm has beaten a 'clever' optimized algorithm. Never underestimate how good compiler-writers and chip-designers have become at turning 'inefficient' looping code into super efficient code that can run entirely in on-chip memory with pipelining. Your 'clever' tree-based algorithm with an unwrapped inner loop counting backwards that you thought was 'efficient' can be beaten simply because it failed to stay in on-chip memory during execution. (See rule about measuring.)
When working with ORMs be aware of N+1 Selects.
List<Order> _orders = _repository.GetOrders(DateTime.Now);
foreach(var order in _orders)
{
Print(order.Customer.Name);
}
If the customers are not eagerly loaded this could result in several round trips to the database.
Don't use magic numbers, use enumerations
Don't hard-code values
Use generics where possible since it's typesafe & avoids boxing & unboxing
Use an error handler where it's absolutely needed
Dispose, dispose, dispose. CLR wound't know how to close your database connections, so close them after use and dispose of unmanaged resources
Use common-sense!
OK, I have got to throw in my favorite: If the task is long enough for human interaction, use a manual break in the debugger.
Vs. a profiler, this gives you a call stack and variable values you can use to really understand what's going on.
Do this 10-20 times and you get a good idea of what optimization might really make a difference.
If you identify a method as a bottleneck, but you don't know what to do about it, you are essentially stuck.
So I'll list a few things. All of these things are not silver bullets and you will still have to profile your code. I'm just making suggestions for things you could do and can sometimes help. Especially the first three are important.
Try solving the problem using just (or: mainly) low-level types or arrays of them.
Problems are often small - using a smart but complex algorithm does not always make you win, especially if the less-smart algorithm can be expressed in code that only uses (arrays of) low level types. Take for example InsertionSort vs MergeSort for n<=100 or Tarjan's Dominator finding algorithm vs using bitvectors to naively solve the data-flow form of the problem for n<=100. (the 100 is of course just to give you some idea - profile!)
Consider writing a special case that can be solved using just low-level types (often problem instances of size < 64), even if you have to keep the other code around for larger problem instances.
Learn bitwise arithmetic to help you with the two ideas above.
BitArray can be your friend, compared to Dictionary, or worse, List. But beware that the implementation is not optimal; You can write a faster version yourself. Instead of testing that your arguments are out of range etc., you can often structure your algorithm so that the index can not go out of range anyway - but you can not remove the check from the standard BitArray and it is not free.
As an example of what you can do with just arrays of low level types, the BitMatrix is a rather powerful structure that can be implemented as just an array of ulongs and you can even traverse it using an ulong as "front" because you can take the lowest order bit in constant time (compared with the Queue in Breadth First Search - but obviously the order is different and depends on the index of the items rather than purely the order in which you find them).
Division and modulo are really slow unless the right hand side is a constant.
Floating point math is not in general slower than integer math anymore (not "something you can do", but "something you can skip doing")
Branching is not free. If you can avoid it using a simple arithmetic (anything but division or modulo) you can sometimes gain some performance. Moving a branch to outside a loop is almost always a good idea.
People have funny ideas about what actually matters. Stack Overflow is full of questions about, for example, is ++i more "performant" than i++. Here's an example of real performance tuning, and it's basically the same procedure for any language. If code is simply written a certain way "because it's faster", that's guessing.
Sure, you don't purposely write stupid code, but if guessing worked, there would be no need for profilers and profiling techniques.
The truth is that there is no such thing as the perfect optimised code. You can, however, optimise for a specific portion of code, on a known system (or set of systems) on a known CPU type (and count), a known platform (Microsoft? Mono?), a known framework / BCL version, a known CLI version, a known compiler version (bugs, specification changes, tweaks), a known amount of total and available memory, a known assembly origin (GAC? disk? remote?), with known background system activity from other processes.
In the real world, use a profiler, and look at the important bits; usually the obvious things are anything involving I/O, anything involving threading (again, this changes hugely between versions), and anything involving loops and lookups, but you might be surprised at what "obviously bad" code isn't actually a problem, and what "obviously good" code is a huge culprit.
Tell the compiler what to do, not how to do it. As an example, foreach (var item in list) is better than for (int i = 0; i < list.Count; i++) and m = list.Max(i => i.value); is better than list.Sort(i => i.value); m = list[list.Count - 1];.
By telling the system what you want to do it can figure out the best way to do it. LINQ is good because its results aren't computed until you need them. If you only ever use the first result, it doesn't have to compute the rest.
Ultimately (and this applies to all programming) minimize loops and minimize what you do in loops. Even more important is to minimize the number of loops inside your loops. What's the difference between an O(n) algorithm and an O(n^2) algorithm? The O(n^2) algorithm has a loop inside of a loop.
I don't really try to optimize my code but at times I will go through and use something like reflector to put my programs back to source. It is interesting to then compare what I wrong with what the reflector will output. Sometimes I find that what I did in a more complicated form was simplified. May not optimize things but helps me to see simpler solutions to problems.