I have a static class with properties to store user's inputs:
public static class UserData
{
public static double UserInput1 { get; set; }
}
And I have nested methods that need the user's inputs
public static double Foo()
{
[...]
var input1 = UserData.UserInput1;
var bar = Bar();
[...]
}
private static double Bar()
{
var input1 = UserData.UserInput1;
[...]
}
The positive thing is that I do not have to pass all user inputs to Foo(), then to Bar() (and to further nested methods within Bar()).
The negative thing is that I have to get UserData.UserInput1 and other user inputs very often. I could change the code to get the user inputs only once:
public static double Foo()
{
[...]
var input1 = UserData.UserInput1;
var bar = Bar(input1);
[...]
}
private static double Bar(double input1)
{
[...]
}
Which one is faster?
The second one is faster than the first, because you avoid fetching the static property from UserData over and over.
Static state is not great when we talk about performance, because every access has to go through the property getter to shared static storage. By passing the input values as parameters, that repeated access is avoided and slightly better performance is achieved.
But both options are ok. It's more important to focus on code readability and maintainability rather than performance unless you are working on a critical performance issue.
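If you really want numbers for your specific code, a minimal benchmark sketch along these lines would settle it. This assumes the BenchmarkDotNet NuGet package (not part of the original snippets) and reduces Bar to a trivial body:
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public class StaticVsParameterBenchmark
{
    [Benchmark]
    public double ViaStaticProperty()
    {
        // the callee reads the static property itself
        return BarStatic();
    }

    [Benchmark]
    public double ViaParameter()
    {
        // read the static property once and pass it down
        var input1 = UserData.UserInput1;
        return BarParam(input1);
    }

    private static double BarStatic() => UserData.UserInput1 * 2;
    private static double BarParam(double input1) => input1 * 2;
}

public class Program
{
    public static void Main() => BenchmarkRunner.Run<StaticVsParameterBenchmark>();
}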
Which one is faster?
Using static mutable state in this way will be way slower in the long run, because you will spend a bunch of time trying to find and fix bugs. That time could be better spent doing things that actually help performance, like profiling and optimizing code.
Try to make methods that compute anything take the required inputs as parameters, and make the input fields properties of the associated UI class. This should help keep the code simple and understandable.
Accessing a static property will be translated to an indirect memory access. Passing a parameter to a method might be free if the parameter is already in a register, or might involve a bit more work if it needs to be loaded, moved or passed on the stack. But we are talking about single-digit cycles here; optimization at this level should only be done in super tight loops that run many millions of times each second, and then you should typically ensure that all methods can be inlined, side-stepping the problem.
If you're worried about such micro-optimizations (which you generally wouldn't need), consider using inlining.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
https://learn.microsoft.com/en-us/dotnet/api/system.runtime.compilerservices.methodimploptions?view=net-7.0
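Applied to the code from the question it would look something like the sketch below (the body of Bar is a placeholder, since the original is elided; whether the JIT actually inlines is still up to it):
using System.Runtime.CompilerServices;

public static class Calculations
{
    // asking the JIT to inline removes the call overhead entirely,
    // which makes the parameter-vs-static question largely moot
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    private static double Bar(double input1)
    {
        return input1 * 2; // placeholder for the question's elided body
    }

    public static double Foo()
    {
        var input1 = UserData.UserInput1;
        return Bar(input1);
    }
}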
PS: using one over the other, or using AggressiveInlining, will not save you anywhere close to the 1ms you are hoping for, under non-extreme/farfetched scenarios.
This is purely a language matter, because I know that this may (and possibly even should) be solved in a different way.
We have a property Prop, which in its getter has some side effects. How to "call" this property in a "nice" way to trigger these side effects?
One way:
object dummy = this.Prop;
But this doesn't seem to be a nice solution, because it involves creating an unnecessary variable. I tried with:
(() => this.Prop)();
But it doesn't compile. Is there short and clean way to do it?
If you create a variable, you'll then get code complaining that it's unused, which can be annoying.
For benchmarking cases, I've sometimes added a generic Consume() extension method, which just does nothing:
public static void Consume<T>(this T ignored)
{
}
You can then write:
this.Prop.Consume();
and the compiler will be happy. Another alternative would be to have a method which accepts a Func<T>:
public static void Consume<T>(Func<T> function)
{
function();
}
Then call it as:
Consume(() => this.Prop);
I rarely face this situation outside tests (both benchmarks, and "I should be able to call the property without an exception being thrown" test) but every so often it can be useful, e.g. to force a class to be initialized. Any time you find yourself wanting this, it's worth considering whether this would be more appropriate as a method.
I'm testing performance differences using various lambda expression syntaxes. If I have a simple method:
public IEnumerable<Item> GetItems(int point)
{
return this.items.Where(i => i.IsApplicableFor(point));
}
then there's some variable lifting going on here related to the point parameter, because it's a free variable from the lambda's perspective. If I were to call this method a million times, would it be better to keep it as it is or change it in some way to improve its performance?
What options do I have, and which ones are actually feasible? As I understand it, I have to get rid of free variables so the compiler won't have to create a closure class and instantiate it on every call to this method. This instantiation usually takes a significant amount of time compared to non-closure versions.
The thing is I would like to come up with some sort of lambda writing guidelines that would generally work, because it seems I'm wasting some time every time I write a heavily hit lambda expression. I have to manually test it to make sure it will work, because I don't know what rules to follow.
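For reference, the "lifting" in the original version amounts to the compiler generating a small class behind the scenes. A hand-written sketch of roughly what it produces (the names are illustrative, and the Item type here is a stand-in for the real one):
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    // stand-in for the question's Item type
    public bool IsApplicableFor(int point) => point > 0;
}

public class ItemSource
{
    private readonly List<Item> items = new List<Item>();

    // roughly what the compiler generates for: items.Where(i => i.IsApplicableFor(point))
    private sealed class DisplayClass
    {
        public int point;
        public bool Predicate(Item i) => i.IsApplicableFor(this.point);
    }

    public IEnumerable<Item> GetItems(int point)
    {
        var closure = new DisplayClass { point = point }; // one small allocation per call
        return this.items.Where(closure.Predicate);       // delegate bound to the closure instance
    }
}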
Alternative method
& example console application code
I've also written a different version of the same method that doesn't need any variable lifting (at least I think it doesn't, but you guys who understand this let me know if that's the case):
public IEnumerable<Item> GetItems(int point)
{
Func<int, Func<Item, bool>> buildPredicate = p => i => i.IsApplicableFor(p);
return this.items.Where(buildPredicate(point));
}
Check out Gist here. Just create a console application and copy the whole code into Program.cs file inside namespace block. You will see that the second example is much much slower even though it doesn't use free variables.
A contradictory example
The reason why I would like to construct some lambda best usage guidelines is that I've met this problem before and to my surprise that one turned out to be working faster when a predicate builder lambda expression was used.
Now explain that then. I'm completely lost here because it may as well turn out I won't be using lambdas at all when I know I have some heavy use method in my code. But I would like to avoid such situation and get to the bottom of it all.
Edit
Your suggestions don't seem to work
I've tried implementing a custom lookup class that internally works similar to what compiler does with a free variable lambda. But instead of having a closure class I've implemented instance members that simulate a similar scenario. This is the code:
private int Point { get; set; }
private bool IsItemValid(Item item)
{
return item.IsApplicableFor(this.Point);
}
public IEnumerable<TItem> GetItems(int point)
{
this.Point = point;
return this.items.Where(this.IsItemValid);
}
Interestingly enough this works just as slowly as the slow version. I don't know why, because it doesn't seem to do anything more than the fast one does. It reuses the same functionality because these additional members are part of the same object instance. Anyway, I'm now extremely confused!
I've updated Gist source with this latest addition, so you can test for yourself.
What makes you think that the second version doesn't require any variable lifting? You're defining the Func with a Lambda expression, and that's going to require the same bits of compiler trickery that the first version requires.
Furthermore, you're creating a Func that returns a Func, which bends my brain a little bit and will almost certainly require re-evaluation with each call.
I would suggest that you compile this in release mode and then use ILDASM to examine the generated IL. That should give you some insight into what code is generated.
Another test that you should run, which will give you more insight, is to make the predicate call a separate function that uses a variable at class scope. Something like:
private DateTime dayToCompare;
private bool LocalIsDayWithinRange(TItem i)
{
return i.IsDayWithinRange(dayToCompare);
}
public override IEnumerable<TItem> GetDayData(DateTime day)
{
dayToCompare = day;
return this.items.Where(i => LocalIsDayWithinRange(i));
}
That will tell you if hoisting the day variable is actually costing you anything.
Yes, this requires more code and I wouldn't suggest that you use it. As you pointed out in your response to a previous answer that suggested something similar, this creates what amounts to a closure using local variables. The point is that either you or the compiler has to do something like this in order to make things work. Beyond writing the pure iterative solution, there is no magic you can perform that will prevent the compiler from having to do this.
My point here is that "creating the closure" in my case is a simple variable assignment. If this is significantly faster than your version with the Lambda expression, then you know that there is some inefficiency in the code that the compiler creates for the closure.
I'm not sure where you're getting your information about having to eliminate the free variables, and the cost of the closure. Can you give me some references?
Your second method runs 8 times slower than the first for me. As @DanBryant says in the comments, this is to do with constructing and calling the delegate inside the method, not to do with variable lifting.
Your question is confusing as it reads to me like you expected the second sample to be faster than the first. I also read it as the first is somehow unacceptably slow due to 'variable lifting'. The second sample still has a free variable (point) but it adds additional overhead - I don't understand why you'd think it removes the free variable.
As the code you have posted confirms, the first sample above (using a simple inline predicate) performs just 10% slower than a simple foreach loop - from your code:
foreach (TItem item in this.items)
{
if (item.IsDayWithinRange(day))
{
yield return item;
}
}
So, in summary:
The foreach loop is the simplest approach and is "best case".
The inline predicate is slightly slower, due to some additional overhead.
Constructing and calling a Func that returns a Func within each iteration is significantly slower than either.
I don't think any of this is surprising. The 'guideline' is to use an inline predicate - if it performs poorly, simplify by moving to a straight loop.
I profiled your benchmark for you and determined many things:
First of all, it spends half its time on the line return this.GetDayData(day).ToList(); calling ToList. If you remove that and instead manually iterate over the results, you can measure the relative differences between the methods.
Second, because IterationCount = 1000000 and RangeCount = 1, you are timing the initialization of the different methods rather than the amount of time it takes to execute them. This means your execution profile is dominated by creating the iterators, escaping variable records, and delegates, plus the hundreds of subsequent gen0 garbage collections that result from creating all that garbage.
Third, the "slow" method is really slow on x86, but about as fast as the "fast" method on x64. I believe this is due to how the different JITters create delegates. If you discount the delegate creation from the results, the "fast" and "slow" methods are identical in speed.
Fourth, if you actually invoke the iterators a significant number of times (on my computer, targeting x64, with RangeCount = 8), "slow" is actually faster than "foreach" and "fast" is faster than all of them.
In conclusion, the "lifting" aspect is negligible. Testing on my laptop shows that capturing a variable like you do requires an extra 10ns every time the lambda gets created (not every time it is invoked), and that includes the extra GC overhead. Furthermore, while creating the iterator in your "foreach" method is somewhat faster than creating the lambdas, actually invoking that iterator is slower than invoking the lambdas.
If the few extra nanoseconds required to create delegates is too much for your application, consider caching them. If you require parameters to those delegates (i.e. closures), consider creating your own closure classes such that you can create them once and then just change the properties when you need to reuse their delegates. Here's an example:
public class SuperFastLinqRangeLookup<TItem> : RangeLookupBase<TItem>
where TItem : RangeItem
{
public SuperFastLinqRangeLookup(DateTime start, DateTime end, IEnumerable<TItem> items)
: base(start, end, items)
{
// create delegate only once
predicate = i => i.IsDayWithinRange(day);
}
DateTime day;
Func<TItem, bool> predicate;
public override IEnumerable<TItem> GetDayData(DateTime day)
{
this.day = day; // set captured day to correct value
return this.items.Where(predicate);
}
}
When a LINQ expression that uses deferred execution executes within the same scope that encloses the free variables it references, the compiler should detect that and not create a closure over the lambda, because it's not needed.
The way to verify that would be by testing it using something like this:
public class Test
{
public static void ExecuteLambdaInScope()
{
// here, the lambda executes only within the scope
// of the referenced variable 'add'
var items = Enumerable.Range(0, 100000).ToArray();
int add = 10; // free variable referenced from lambda
Func<int,int> f = x => x + add;
// measure how long this takes:
var array = items.Select( f ).ToArray();
}
static Func<int,int> GetExpression()
{
int add = 10;
return x => x + add; // this needs a closure
}
static void ExecuteLambdaOutOfScope()
{
// here, the lambda executes outside the scope
// of the referenced variable 'add'
Func<int,int> f = GetExpression();
var items = Enumerable.Range(0, 100000).ToArray();
// measure how long this takes:
var array = items.Select( f ).ToArray();
}
}
I would love to write code like this:
class Zebra
{
public lazy int StripeCount
{
get { return ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce(); }
}
}
EDIT: Why? I think it looks better than:
class Zebra
{
private Lazy<int> _StripeCount;
public Zebra()
{
this._StripeCount = new Lazy<int>(() => ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce());
}
public int StripeCount
{
get { return this._StripeCount.Value; }
}
}
The first time you call the property, it would run the code in the get block, and afterward would just return the value from it.
My questions:
What costs would be involved with adding this kind of keyword to the language?
What situations would this be problematic in?
Would you find this useful?
I'm not starting a crusade to get this into the next version of the language, but I am curious what kind of considerations a feature such as this should have to go through.
I am curious what kind of considerations a feature such as this should have to go through.
First off, I write a blog about this subject, amongst others. See my old blog:
http://blogs.msdn.com/b/ericlippert/
and my new blog:
http://ericlippert.com
for many articles on various aspects of language design.
Second, the C# design process is now open for view to the public, so you can see for yourself what the language design team considers when vetting new feature suggestions. See https://github.com/dotnet/roslyn/ for details.
What costs would be involved with adding this kind of keyword to the language?
It depends on a lot of things. There are, of course, no cheap, easy features. There are only less expensive, less difficult features. In general, the costs are those involving designing, specifying, implementing, testing, documenting and maintaining the feature. There are more exotic costs as well, like the opportunity cost of not doing a better feature, or the cost of choosing a feature that interacts poorly with future features we might want to add.
In this case the feature would probably be simply making the "lazy" keyword a syntactic sugar for using Lazy<T>. That's a pretty straightforward feature, not requiring a lot of fancy syntactic or semantic analysis.
What situations would this be problematic in?
I can think of a number of factors that would cause me to push back on the feature.
First off, it is not necessary; it's merely a convenient sugar. It doesn't really add new power to the language. The benefits don't seem to be worth the costs.
Second, and more importantly, it enshrines a particular kind of laziness into the language. There is more than one kind of laziness, and we might choose wrong.
How is there more than one kind of laziness? Well, think about how it would be implemented. Properties are already "lazy" in that their values are not calculated until the property is called, but you want more than that; you want a property that is called once, and then the value is cached for the next time. By "lazy" essentially you mean a memoized property. What guarantees do we need to put in place? There are many possibilities:
Possibility #1: Not threadsafe at all. If you call the property for the "first" time on two different threads, anything can happen. If you want to avoid race conditions, you have to add synchronization yourself.
Possibility #2: Threadsafe, such that two calls to the property on two different threads both call the initialization function, and then race to see who fills in the actual value in the cache. Presumably the function will return the same value on both threads, so the extra cost here is merely in the wasted extra call. But the cache is threadsafe, and doesn't block any thread. (Because the threadsafe cache can be written with low-lock or no-lock code.)
Code to implement thread safety comes at a cost, even if it is low-lock code. Is that cost acceptable? Most people write what are effectively single-threaded programs; does it seem right to add the overhead of thread safety to every single lazy property call whether it's needed or not?
Possibility #3: Threadsafe such that there is a strong guarantee that the initialization function will only be called once; there is no race on the cache. The user might have an implicit expectation that the initialization function is only called once; it might be very expensive and two calls on two different threads might be unacceptable. Implementing this kind of laziness requires full-on synchronization where it is possible that one thread blocks indefinitely while the lazy method is running on another thread. It also means there could be deadlocks if there's a lock-ordering problem with the lazy method.
That adds even more cost to the feature, a cost that is borne equally by people who do not take advantage of it (because they are writing single-threaded programs).
So how do we deal with this? We could add three features: "lazy not threadsafe", "lazy threadsafe with races" and "lazy threadsafe with blocking and maybe deadlocks". And now the feature just got a whole lot more expensive and way harder to document. This produces an enormous user education problem. Every time you give a developer a choice like this, you present them with an opportunity to write terrible bugs.
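For reference, the three possibilities described above map fairly closely onto the LazyThreadSafetyMode values that the existing Lazy<T> class exposes; a small sketch:
using System;
using System.Threading;

static class LazyModes
{
    static int Compute() => 42;

    // #1: no thread-safety guarantees at all
    static readonly Lazy<int> Unsafe =
        new Lazy<int>(Compute, LazyThreadSafetyMode.None);

    // #2: several threads may each run the initializer; one result is published, nobody blocks
    static readonly Lazy<int> Racy =
        new Lazy<int>(Compute, LazyThreadSafetyMode.PublicationOnly);

    // #3: the initializer runs exactly once; other threads block until it finishes (the default)
    static readonly Lazy<int> Blocking =
        new Lazy<int>(Compute, LazyThreadSafetyMode.ExecutionAndPublication);
}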
Third, the feature seems weak as stated. Why should laziness be applied merely to properties? It seems like this could be applied generally through the type system:
lazy int x = M(); // doesn't call M()
lazy int y = x + x; // doesn't add x + x
int z = y * y; // now M() is called once and cached.
// x + x is computed and cached
// y * y is computed
We try to not do small, weak features if there is a more general feature that is a natural extension of it. But now we're talking about really serious design and implementation costs.
Would you find this useful?
Personally? Not really useful. I write lots of simple low-lock lazy code mostly using Interlocked.Exchange. (I don't care if the lazy method gets run twice and one of the results discarded; my lazy methods are never that expensive.) The pattern is straightforward, I know it to be safe, there are never extra objects allocated for the delegate or the locks, and if I have something a little more complex I can always use Lazy<T> to do the work for me. It would be a small convenience.
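A rough sketch of that kind of low-lock pattern (using Interlocked.CompareExchange on a reference-typed cache, with a made-up StripeData type; the initializer may run more than once, but only one result is ever published):
using System.Threading;

class Zebra
{
    private StripeData cache; // must be a reference type for this pattern

    public StripeData Stripes
    {
        get
        {
            if (this.cache == null)
            {
                var computed = ExpensiveComputation();
                // publish only if nobody beat us to it; a losing result is simply discarded
                Interlocked.CompareExchange(ref this.cache, computed, null);
            }
            return this.cache;
        }
    }

    private static StripeData ExpensiveComputation() => new StripeData();
}

class StripeData { }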
The system library already has a class that does what you want: System.Lazy<T>
I'm sure it could be integrated into the language, but as Eric Lippert will tell you adding features to a language is not something to take lightly. Many things have to be considered, and the benefit/cost ratio needs to be very good. Since System.Lazy already handles this pretty well, I doubt we will see this anytime soon.
Do you know about the Lazy<T> class that was added in .Net 4.0?
http://sankarsan.wordpress.com/2009/10/04/laziness-in-c-4-0-lazyt/
Have you tried this / Do you mean this?
private Lazy<int> MyExpensiveCountingValue = new Lazy<int>(new Func<int>(()=> ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce()));
public int StripeCount
{
get
{
return MyExpensiveCountingValue.Value;
}
}
EDIT:
after your post edit I would add that your idea is definitely more elegant, but it still has the same functionality!
This is unlikely to be added to the C# language because you can easily do it yourself, even without Lazy<T>.
A simple, but not thread-safe, example:
class Zebra
{
private int? stripeCount;
public int StripeCount
{
get
{
if (this.stripeCount == null)
{
this.stripeCount = ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce();
}
return this.stripeCount.Value;
}
}
}
If you don't mind using a post-compiler, CciSharp has this feature:
class Zebra {
[Lazy] public int StripeCount {
get { return ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce(); }
}
}
Have a look at the Lazy<T> type. Also ask Eric Lippert about adding things like this to the language, he would no doubt have a view.
I have always wondered how delegates can be useful and why we should use them. Other than being type safe and all those advantages in the Visual Studio documentation, what are the real-world uses of delegates?
I already found one and it's very targeted.
using System;
namespace HelloNamespace {
class Greetings{
public static void DisplayEnglish() {
Console.WriteLine("Hello, world!");
}
public static void DisplayItalian() {
Console.WriteLine("Ciao, mondo!");
}
public static void DisplaySpanish() {
Console.WriteLine("Hola, imundo!");
}
}
delegate void delGreeting();
class HelloWorld {
static void Main(string [] args) {
int iChoice=int.Parse(args[0]);
delGreeting [] arrayofGreetings={
new delGreeting(Greetings.DisplayEnglish),
new delGreeting(Greetings.DisplayItalian),
new delGreeting(Greetings.DisplaySpanish)};
arrayofGreetings[iChoice-1]();
}
}
}
But this doesn't show me exactly the advantages of using delegates rather than a conditional "if ... { }" that parses the argument and runs the method.
Does anyone know why it's better to use delegate here rather than "if ... { }". Also do you have other examples that demonstrate the usefulness of delegates.
Thanks!
Delegates are a great way of injecting functionality into a method. They greatly help with code reuse because of this.
Think about it, lets say you have a group of related methods that have almost the same functionality but vary on just a few lines of code. You could refactor all of the things these methods have in common into one single method, then you could inject the specialised functionality in via a delegate.
Take for example all of the IEnumerable extension methods used by LINQ. All of them define common functionality but need a delegate passing to them to define how the return data is projected, or how the data is filtered, sorted, etc...
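As a sketch of that idea (with made-up names), a family of near-identical filtering methods collapses into one method that takes the varying condition as a delegate:
using System;
using System.Collections.Generic;

public class Item
{
    public decimal Price { get; set; } // illustrative property
}

public static class ItemQueries
{
    // one shared method; the varying condition is injected as a delegate
    public static IEnumerable<Item> GetItems(IEnumerable<Item> items, Func<Item, bool> condition)
    {
        foreach (var item in items)
        {
            if (condition(item))
            {
                yield return item;
            }
        }
    }
}

// Usage: the caller supplies only the line of code that differs, e.g.
// var expensive = ItemQueries.GetItems(allItems, i => i.Price > 100);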
The most common real-world everyday use of delegates that I can think of in C# would be event handling. When you have a button on a WinForm, and you want to do something when the button is clicked, then what you do is you end up registering a delegate function to be called by the button when it is clicked.
All of this happens for you automatically behind the scenes in the code generated by Visual Studio itself, so you might not see where it happens.
A real-world case that might be more useful to you would be if you wanted to make a library that people can use that will read data off an Internet feed, and notify them when the feed has been updated. By using delegates, then programmers who are using your library would be able to have their own code called whenever the feed is updated.
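A minimal sketch of that feed-notification idea, with the delegate exposed as an event (the FeedReader type and its members are made up for illustration):
using System;

public class FeedReader
{
    // consumers of the library register their own handlers here
    public event Action<string> FeedUpdated;

    // called internally whenever new data arrives
    private void OnNewData(string latestEntry)
    {
        FeedUpdated?.Invoke(latestEntry);
    }
}

// Library user:
// var reader = new FeedReader();
// reader.FeedUpdated += entry => Console.WriteLine("New item: " + entry);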
Lambda expressions
Delegates were mostly used in conjunction with events, but dynamic languages showed they have a much broader use. That's why delegates were underused up until C# 3.0, when we got lambda expressions. It's very easy to do something using a lambda expression (which compiles down to a delegate).
Now imagine you have an IEnumerable of strings. You can easily define a delegate (using a lambda expression or any other way) and apply it to every element (trimming excess spaces, for instance), without writing any loop statements. Of course your delegates may do even more complex tasks.
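For instance, the trimming case could look like this, with the per-element work expressed as a delegate and no explicit loop over the source:
using System;
using System.Collections.Generic;
using System.Linq;

class TrimExample
{
    static void Main()
    {
        IEnumerable<string> names = new[] { "  alice ", "bob  ", "  carol" };

        // the lambda compiles down to a Func<string, string> delegate
        Func<string, string> trim = s => s.Trim();

        foreach (var name in names.Select(trim))
        {
            Console.WriteLine(name); // alice, bob, carol
        }
    }
}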
I will try to list some examples that are beyond a simple if-else scenario:
Implementing call backs. For example you are parsing an XML document and want a particular function to be called when a particular node is encountered. You can pass delegates to the functions.
Implementing the strategy design pattern. Assign the delegate to the required algorithm/strategy implementation (see the sketch after this list).
Anonymous delegates in the case where you want some functionality to be executed on a separate thread (and this function does not have anything to send back to the main program).
Event subscription as suggested by others.
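A small sketch of the strategy-pattern item above, with the algorithm chosen by assigning a delegate (the class and member names are made up):
using System;

class PriceCalculator
{
    // the "strategy" is just a delegate field; swap it to change the algorithm
    private Func<decimal, decimal> discountStrategy = price => price; // default: no discount

    public void UseSeasonalDiscount()  => discountStrategy = price => price * 0.9m;
    public void UseClearanceDiscount() => discountStrategy = price => price * 0.5m;

    public decimal FinalPrice(decimal listPrice) => discountStrategy(listPrice);
}

// var calc = new PriceCalculator();
// calc.UseSeasonalDiscount();
// Console.WriteLine(calc.FinalPrice(200m)); // 180.0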
Delegates are simply .Net's implementation of first class functions and allow the languages using them to provide Higher Order Functions.
The principal benefit of this style is that common aspects can be abstracted out into a function which does just what it needs to do (for example traversing a data structure) and is provided another function (or functions) that it calls to do the work as it goes along.
The canonical functional examples are map and fold which can be changed to do all sorts of things by the provision of some other operation.
If you want to sum a list of T's and have some function add which takes two T's and adds them together then (via partial application) fold add 0 becomes sum. fold multiply 1 would become the product, fold max 0 the maximum. In all these examples the programmer need not think about how to iterate over the input data, need not worry about what to do if the input is empty.
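In C# terms, fold is Enumerable.Aggregate; the sum/product/max examples above look roughly like this:
using System;
using System.Linq;

class FoldExamples
{
    static void Main()
    {
        var numbers = new[] { 3, 1, 4, 1, 5 };

        int sum     = numbers.Aggregate(0, (acc, x) => acc + x);          // "fold add 0"
        int product = numbers.Aggregate(1, (acc, x) => acc * x);          // "fold multiply 1"
        int max     = numbers.Aggregate(0, (acc, x) => Math.Max(acc, x)); // "fold max 0"

        Console.WriteLine($"{sum} {product} {max}"); // 14 60 5
    }
}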
These are simple examples (though they can be surprisingly powerful when combined with others) but consider tree traversal (a more complex task) all of that can be abstracted away behind a treefold function. Writing of the tree fold function can be hard, but once done it can be re-used widely without having to worry about bugs.
This is similar in concept and design to the addition of foreach loop constructs to traditional imperative languages: the idea is that you don't have to write the loop control yourself (which introduces the chance of off-by-one errors and adds verbosity that obscures what you are doing to each entry behind how you are getting each entry). Higher-order functions simply let you separate the traversal of a structure from what you do while traversing, extensibly, within the language itself.
It should be noted that delegates in C# have been largely superseded by lambdas, because the compiler can simply treat a lambda as a less verbose delegate if it wants, but is also free to pass the expression the lambda represents through to the function it is given, allowing (often complex) restructuring or re-targeting of the intent into some other domain, like database queries via LINQ to SQL.
A principal benefit of the .NET delegate model over C-style function pointers is that a delegate is actually a tuple (two pieces of data): the function to call and the optional object on which to call it. This lets you pass around functions with state, which is even more powerful, because the compiler can use this to construct classes behind your back(1), instantiate a new instance of such a class and place local variables into it, thus allowing closures.
(1) it doesn't have to always do this, but for now that is an implementation detail
In your example your greetings all do the same thing (print a string), so what you actually need is just an array of strings.
If you like to gain use of delegates in Command pattern, imagine you have:
public static void ShakeHands()
{ ... }
public static void HowAreYou()
{ ... }
public static void FrenchKissing()
{ ... }
You can substitute a method with the same signature, but different actions.
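As a sketch (restating the three methods with trivial bodies so it compiles), picking which one runs becomes a matter of assigning a delegate rather than branching:
using System;

class Greeter
{
    public static void ShakeHands()    { Console.WriteLine("shaking hands"); }
    public static void HowAreYou()     { Console.WriteLine("how are you?"); }
    public static void FrenchKissing() { Console.WriteLine("..."); }

    static void Main()
    {
        // any of the three methods fits the parameterless Action signature
        Action greeting = HowAreYou;
        greeting();            // calls HowAreYou

        greeting = ShakeHands; // swap the behaviour without an if/else chain
        greeting();            // calls ShakeHands
    }
}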
You picked way too simple an example; my advice would be to go and find the book C# in Depth.
Here's a real-world example. I often use delegates when wrapping some sort of external call. For instance, we have an old app server (that I wish would just go away) which we connect to through .NET remoting. I'll call the app server in a delegate from a 'safecall' function like this:
private delegate T AppServerDelegate<T>();
private T processAppServerRequest<T>(AppServerDelegate<T> delegate_) {
try {
return delegate_();
}
catch {
//Do a bunch of standard error handling here which will be
//the same for all appserver calls.
throw; // rethrow so that every code path either returns or throws
}
}
//Wrapped public call to AppServer
public int PostXYZRequest(string requestData1, string requestData2,
int pid, DateTime latestRequestTime){
return processAppServerRequest<int>(
delegate {
return _appSvr.PostXYZRequest(
requestData1,
requestData2,
pid,
latestRequestTime);
});
}
Obviously the error handling is done a bit better than that but you get the rough idea.
Delegates are used to "call" code in other classes (which might not necessarily be in the same class, the same .cs file, or even the same assembly).
In your example, delegates can simply be replaced by if statements like you pointed out.
However, delegates are pointers to functions that "live" somewhere else in the code, somewhere you may not have easy access to, for organizational reasons for instance.
Delegates and related syntactic sugar have significantly changed the C# world (2.0+)
Delegates are type-safe function pointers - so you use delegates anywhere you want to invoke/execute a code block at a future point of time.
Broad sections I can think of
Callbacks/Event handlers: do this when EventX happens. Or do this when you are ready with the results from my async method call.
myButton.Click += delegate { Console.WriteLine("Robbery in progress. Call the cops!"); }
LINQ: selection, projection etc. of elements where you want to do something with each element before passing it down the pipeline. e.g. Select all numbers that are even, then return the square of each of those
var list = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }
.Where(delegate(int x) { return ((x % 2) == 0); })
.Select(delegate(int x) { return x * x; });
// results in 4, 16, 36, 64, 100
Another use that I find a great boon is if I wish to perform the same operation, pass the same data or trigger the same action in multiple instances of the same object type.
In .NET, delegates are also needed when updating the UI from a background thread. As you cannot update controls from a thread different from the one that created them, you need to invoke the update code within the creating thread's context (mostly using this.Invoke).
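A typical WinForms sketch of that pattern, with an illustrative statusLabel control: the background thread hands the update back to the UI thread through a delegate passed to Invoke.
using System;
using System.Windows.Forms;

public class MainForm : Form
{
    private readonly Label statusLabel = new Label(); // illustrative control created on the UI thread

    // may be called from a background thread
    private void ReportProgress(string message)
    {
        if (statusLabel.InvokeRequired)
        {
            // marshal the update back to the UI thread via a delegate
            statusLabel.Invoke(new Action(() => statusLabel.Text = message));
        }
        else
        {
            statusLabel.Text = message;
        }
    }
}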