Does a lambda create a new instance everytime it is invoked? - c#

I'm curious to know whether a Lambda (when used as delegate) will create a new instance every time it is invoked, or whether the compiler will figure out a way to instantiate the delegate only once and pass in that instance.
More specifically, I'm wanting to create an API for an XNA game that I can use a lambda to pass in a custom call back. Since this will be called in the Update method (which is called many times per second) it would be pretty bad if it newed up an instance everytime to pass in the delegate.
InputManager.GamePads.ButtonPressed(Buttons.A, s => s.MoveToScreen<NextScreen>());

Yes, it will cache them when it can:
using System;
class Program {
static void Main(string[] args) {
var i1 = test(10);
var i2 = test(20);
System.Console.WriteLine(object.ReferenceEquals(i1, i2));
}
static Func<int, int> test(int x) {
Func<int, int> inc = y => y + 1;
Console.WriteLine(inc(x));
return inc;
}
}
It creates a static field, and if it's null, populates it with a new delegate, otherwise returns the existing delegate.
Outputs 10, 20, true.

I was interested by your question because I had just assumed that this kind of thing would always generate a new object and hence to be avoided in code which is called frequently.
I do something similar so I thought I would use ildasm to find out what exactly is going on behind the scenes. In my case it turned out that a new object was getting created each time the delegate was called, I won't post my code because it is fairly complex and not very easy to understand out of context. This conflicts with the answer provided by MichaelGG, I suspect because in his example he makes use of static functions. I would suggest you try it for yourself before designing everything one way and later on finding out that you have a problem. ildasm is the way to go (http://msdn.microsoft.com/en-us/library/f7dy01k1.aspx), look out for any "newobj" lines, you don't want those.
Also worth using CLR Profile to find out if your lambda functions are allocating memory (https://github.com/MicrosoftArchive/clrprofiler). It says it's for framework 2.0 but it also works for 3.5 and it's the latest version that is available.

Related

C# self reference a method

can I self reference from within a method, without using the method name (for code maintenance reasons this may cause a future bug)
void SelfRef(string _Input){
if (_Input == "route1"){
//route1 calls route 2
SelfRef("route2"); //call self
}else if (_Input == "route2"){
//route2 ends
}
}
i would like to not write the word "SelfRef" again and make the function immune to future changes?
I would like to not write the word "SelfRef" again and make the function immune to future changes?
As others have said, you should simply use renaming tools if you are planning to rename a method.
But it is an interesting challenge to make a recursive function that does not refer to the name of the function. Here's a way to do it:
delegate Action<A> Recursive<A>(Recursive<A> r);
static Action<A> AnonymousRecursion<A>(Func<Action<A>, Action<A>> f)
{
Recursive<A> rec = r => a => { f(r(r))(a); };
return rec(rec);
}
Action<string> SelfRef = AnonymousRecursion<string>(
f =>
input =>
{
if (input == "route1")
f("route2");
// and so on
});
Notice how field SelfRef nowhere refers to SelfRef in its body, and yet the recursion is very straightforward; you simply recurse on f instead of SelfRef.
However I submit to you that this code is far, far harder to understand and maintain than simply writing a straightforward recursive method.
Yes, it's possible:
void Foo(int bar)
{
Console.WriteLine(bar);
if(bar < 10)
MethodBase.GetCurrentMethod().Invoke(this, new object[] {++bar});
else
Console.WriteLine("Finished");
}
but please never ever use code like this (it's ugly, it's slow, it's hard to understand, it will not work if the method is inlined).
Since you probably use an IDE like Visual Studio, renaming a method should never ever be an issue; and even if you rename the method manually you probably hit a compile time error.
You can create extension like this
public static void RecursivelyCall(this object thisObject, object [] param, [System.Runtime.CompilerServices.CallerMemberName] string methodName = "")
{
Type thisType = thisObject.GetType();
MethodInfo theMethod = thisType.GetMethod(methodName);
theMethod.Invoke(thisObject, param);
}
public static void RecursivelyCall(this object thisObject, object param, [System.Runtime.CompilerServices.CallerMemberName] string methodName = "")
{
Type thisType = thisObject.GetType();
MethodInfo theMethod = thisType.GetMethod(methodName);
theMethod.Invoke(thisObject, new object[] {param});
}
So you can use it for recursive call
private void Rec(string a)
{
this.RecursivelyCall(a);
}
But, to be honest, i don't think it is a good idea, because
function immune to future changes
doesn't worth losing code readability.
Now let's just make it clear that the technical term for "self-referencing" is recursion. And let's start looking at the problem.
You want to write a recursive method but don't want to "mention" the method name because of maintainability reasons. When I saw this I was like, "what editor are you using?". Maintainability reasons? You mean when you change the name of the method the code breaks?
These problems ca be easily fixed by using an IDE. I suggest you to use Visual Studio Community 2015. It is free and offers a wide range of features. If you want to rename a method, do this:
Right click the method name.
Select "Rename" in the context menu
Type whatever name you want it to be.
Press Enter
And you will see that magically, all the references to the method has changed their names!
So you don't need to turn the whole recursive method into a loop or something. You just need to use the right editor!
Self referencing (probably via reflection) and recursing seems quite long winded when a loop would be sufficient here:
void SelfRef(string _Input){
while(true)
{
if (_Input == "route1"){
_Input = "route2"
continue;
}else if (_Input == "route2"){
//route2 ends
break;
}
//break?
}
}
There's no construct in C# to refer to the method that you're in and call it with different parameters. You could examine the call stack to find the method name and use reflection to invoke it, but it seems pointless.
The only type of bug that you would be "immune" to is renaming the method. In that case, the build will fail and you will quickly have an indication that something is wrong.
It is possible using reflection using MethodBase.GetCurrentMethod() and it's the fastest way that I know of. Certainly faster than using Stacktraces.
You can change your code as follows:
void SelfRef(string _Input){
if (_Input == "route1"){
//route1 calls route 2
MethodBase.GetCurrentMethod().Invoke(this,new []{"route2"}); //call self
}else if (_Input == "route2"){
//route2 ends
}
}
Don't forget using System.Reflection namespace.
You are using recursion, not self reference
A function (method) calling itself is called as a recursive function and this concept is called as recursion. What you essentially have, is a recursive function called as SelfRef that takes a string as a parameter. It calls itself if the parameter is equal to "route1" which is a string.
Method reference is passing a method to another method
Method references (as called in other languages) is called as delegates in C# and that is used to pass a function itself to another function. This is mostly used to implement callbacks, i.e., event handlers. For example, you might see this code in events.
public class DelegateExample
{
public delegate void MyDelegate();
public void PrintMessage(MyDelegate d)
{
d();
}
public void PrintHello()
{
Console.WriteLine("Hello.");
}
public void PrintWorld()
{
Console.WriteLine("World.");
}
public static void Main(string[] args)
{
PrintDelegate(new MyDelegate(PrintHello));
PrintDelegate(new MyDelegate(PrintWorld));
}
}
There you see, you are passing a function as a delegate.
Use enums if you know what routes are available already
Never use strings for communicating between objects or withing the program, only use them for interacting with the user. The best bet (if the routes are pre-known) at the time of writing the program, then use enumerations for this purpose.
Calling a function by name does not cause a code-maintenance problem.
All modern development environments (e.g. Visual Studio) will automatically update every usage of that function's name, when someone renames it.
If you want to make the function immune from future changes, keep the source code to yourself.

Determining when a lambda will be compiled to an instance method

Foreword: I am trying to describe the scenario very precisely here. The TL;DR version is 'how do I tell if a lambda will be compiled into an instance method or a closure'...
I am using MvvmLight in my WPF projects, and that library recently changed to using WeakReference instances in order to hold the actions that are passed into a RelayCommand. So, effectively, we have an object somewhere which is holding a WeakReference to an Action<T>.
Now, since upgrading to the latest version, some of our commands stopped working. And we had some code like this:
ctor(Guid token)
{
Command = new RelayCommand(x => Messenger.Default.Send(x, token));
}
This caused a closure (please correct me if I'm not using the correct term) class to be generated - like this:
[CompilerGenerated]
private sealed class <>c__DisplayClass4
{
public object token;
public void <.ctor>b__0(ReportType x)
{
Messenger.Default.Send<ReportTypeSelected>(new ReportTypeSelected(X), this.token);
}
}
This worked fine previously, as the action was stored within the RelayCommand instance, and was kept alive whether it was compiled to an instance method or a closure (i.e. using the '<>DisplayClass' syntax).
However, now, because it is held in a WeakReference, the code only works if the lambda specified is compiled into an instance method. This is because the closure class is instantiated, passed into the RelayCommand and virtually instantly garbage collected, meaning that when the command came to be used, there was no action to perform. So, the above code has to be modified. Changing it to the following causes that, for instance:
Guid _token;
ctor(Guid token)
{
_token = token;
Command = new RelayCommand(x => Messenger.Default.Send(x, _token));
}
This causes the compiled code to result in a member - like the following:
[CompilerGenerated]
private void <.ctor>b__0(ReportType x)
{
Messenger.Default.Send<ReportTypeSelected>(new ReportTypeSelected(X), this._token);
}
Now the above is all fine, and I understand why it didn't work previously, and how changing it caused it to work. However, what I am left with is something which means the code I write now has to be stylistically different based on a compiler decision which I am not privy to.
So, my question is - is this a documented behaviour in all circumstances - or could the behaviour change based on future implementations of the compiler? Should I just forget trying to use lambdas and always pass an instance method into the RelayCommand? Or should I have a convention whereby the action is always cached into an instance member:
Action<ReportTypeSelected> _commandAction;
ctor(Guid token)
{
_commandAction = x => Messenger.Default.Send(x, token);
Command = new RelayCommand(_commandAction);
}
Any background reading pointers are also gratefully accepted!
Whether you will end up with a new class or an instance method on the current class is an implementation detail you should not rely on.
From the C# specification, chapter 7.15.2 (emphasis mine):
It is explicitly unspecified whether there is any way to execute the block of an anonymous function other than through evaluation and invocation of the lambda-expression or anonymous-method-expression. In particular, the compiler may choose to implement an anonymous function by synthesizing one or more named methods or types.
-> Even the fact that it generates any methods at all is not specified.
Given the circumstances, I would go with named methods instead of anonymous ones. If that's not possible, because you need to access variables from the method that registers the command, you should go with the code you showed last.
In my opinion the decision to change RelayCommand to use WeakReference was a poor one. It created a lot more problems than it solved.
As soon as the lambda references any free variables (aka capture), then this will happen as it needs a common location (aka storage class/closure) to reference (and/or assign to) them.
An exercise for the reader is to determine why these storage classes cannot just be static.

C# Lambda performance issues/possibilities/guidelines

I'm testing performance differences using various lambda expression syntaxes. If I have a simple method:
public IEnumerable<Item> GetItems(int point)
{
return this.items.Where(i => i.IsApplicableFor(point));
}
then there's some variable lifting going on here related to point parameter because it's a free variable from lambda's perspective. If I would call this method a million times, would it be better to keep it as it is or change it in any way to improve its performance?
What options do I have and which ones are actually feasible? As I understand it is I have to get rid of free variables so compiler won't have to create closure class and instantiate it on every call to this method. This instantiation usually takes significant amount of time compared to non-closure versions.
The thing is I would like to come up with some sort of lambda writing guidelines that would generally work, because it seems I'm wasting some time every time I write a heavily hit lambda expression. I have to manually test it to make sure it will work, because I don't know what rules to follow.
Alternative method
& example console application code
I've also written a different version of the same method that doesn't need any variable lifting (at least I think it doesn't, but you guys who understand this let me know if that's the case):
public IEnumerable<Item> GetItems(int point)
{
Func<int, Func<Item, bool>> buildPredicate = p => i => i.IsApplicableFor(p);
return this.items.Where(buildPredicate(point));
}
Check out Gist here. Just create a console application and copy the whole code into Program.cs file inside namespace block. You will see that the second example is much much slower even though it doesn't use free variables.
A contradictory example
The reason why I would like to construct some lambda best usage guidelines is that I've met this problem before and to my surprise that one turned out to be working faster when a predicate builder lambda expression was used.
Now explain that then. I'm completely lost here because it may as well turn out I won't be using lambdas at all when I know I have some heavy use method in my code. But I would like to avoid such situation and get to the bottom of it all.
Edit
Your suggestions don't seem to work
I've tried implementing a custom lookup class that internally works similar to what compiler does with a free variable lambda. But instead of having a closure class I've implemented instance members that simulate a similar scenario. This is the code:
private int Point { get; set; }
private bool IsItemValid(Item item)
{
return item.IsApplicableFor(this.Point);
}
public IEnumerable<TItem> GetItems(int point)
{
this.Point = point;
return this.items.Where(this.IsItemValid);
}
Interestingly enough this works just as slow as the slow version. I don't know why, but it seems to do nothing else than the fast one. It reuses the same functionality because these additional members are part of the same object instance. Anyway. I'm now extremely confused!
I've updated Gist source with this latest addition, so you can test for yourself.
What makes you think that the second version doesn't require any variable lifting? You're defining the Func with a Lambda expression, and that's going to require the same bits of compiler trickery that the first version requires.
Furthermore, you're creating a Func that returns a Func, which bends my brain a little bit and will almost certainly require re-evaluation with each call.
I would suggest that you compile this in release mode and then use ILDASM to examine the generated IL. That should give you some insight into what code is generated.
Another test that you should run, which will give you more insight, is to make the predicate call a separate function that uses a variable at class scope. Something like:
private DateTime dayToCompare;
private bool LocalIsDayWithinRange(TItem i)
{
return i.IsDayWithinRange(dayToCompare);
}
public override IEnumerable<TItem> GetDayData(DateTime day)
{
dayToCompare = day;
return this.items.Where(i => LocalIsDayWithinRange(i));
}
That will tell you if hoisting the day variable is actually costing you anything.
Yes, this requires more code and I wouldn't suggest that you use it. As you pointed out in your response to a previous answer that suggested something similar, this creates what amounts to a closure using local variables. The point is that either you or the compiler has to do something like this in order to make things work. Beyond writing the pure iterative solution, there is no magic you can perform that will prevent the compiler from having to do this.
My point here is that "creating the closure" in my case is a simple variable assignment. If this is significantly faster than your version with the Lambda expression, then you know that there is some inefficiency in the code that the compiler creates for the closure.
I'm not sure where you're getting your information about having to eliminate the free variables, and the cost of the closure. Can you give me some references?
Your second method runs 8 times slower than the first for me. As #DanBryant says in comments, this is to do with constructing and calling the delegate inside the method - not do do with variable lifting.
Your question is confusing as it reads to me like you expected the second sample to be faster than the first. I also read it as the first is somehow unacceptably slow due to 'variable lifting'. The second sample still has a free variable (point) but it adds additional overhead - I don't understand why you'd think it removes the free variable.
As the code you have posted confirms, the first sample above (using a simple inline predicate) performs jsut 10% slower than a simple for loop - from your code:
foreach (TItem item in this.items)
{
if (item.IsDayWithinRange(day))
{
yield return item;
}
}
So, in summary:
The for loop is the simplest approach and is "best case".
The inline predicate is slightly slower, due to some additional overhead.
Constructing and calling a Func that returns Func within each iteration is significantly slower than either.
I don't think any of this is surprising. The 'guideline' is to use an inline predicate - if it performs poorly, simplify by moving to a straight loop.
I profiled your benchmark for you and determined many things:
First of all, it spends half its time on the line return this.GetDayData(day).ToList(); calling ToList. If you remove that and instead manually iterate over the results, you can measure relative the differences in the methods.
Second, because IterationCount = 1000000 and RangeCount = 1, you are timing the initialization of the different methods rather than the amount of time it takes to execute them. This means your execution profile is dominated by creating the iterators, escaping variable records, and delegates, plus the hundreds of subsequent gen0 garbage collections that result from creating all that garbage.
Third, the "slow" method is really slow on x86, but about as fast as the "fast" method on x64. I believe this is due to how the different JITters create delegates. If you discount the delegate creation from the results, the "fast" and "slow" methods are identical in speed.
Fourth, if you actually invoke the iterators a significant number of times (on my computer, targetting x64, with RangeCount = 8), "slow" is actually faster than "foreach" and "fast" is faster than all of them.
In conclusion, the "lifting" aspect is negligible. Testing on my laptop shows that capturing a variable like you do requires an extra 10ns every time the lambda gets created (not every time it is invoked), and that includes the extra GC overhead. Furthermore, while creating the iterator in your "foreach" method is somewhat faster than creating the lambdas, actually invoking that iterator is slower than invoking the lambdas.
If the few extra nanoseconds required to create delegates is too much for your application, consider caching them. If you require parameters to those delegates (i.e. closures), consider creating your own closure classes such that you can create them once and then just change the properties when you need to reuse their delegates. Here's an example:
public class SuperFastLinqRangeLookup<TItem> : RangeLookupBase<TItem>
where TItem : RangeItem
{
public SuperFastLinqRangeLookup(DateTime start, DateTime end, IEnumerable<TItem> items)
: base(start, end, items)
{
// create delegate only once
predicate = i => i.IsDayWithinRange(day);
}
DateTime day;
Func<TItem, bool> predicate;
public override IEnumerable<TItem> GetDayData(DateTime day)
{
this.day = day; // set captured day to correct value
return this.items.Where(predicate);
}
}
When a LINQ expression that uses deferred execution executes within the same scope that encloses the free variables it references, the compiler should detect that and not create a closure over the lambda, because it's not needed.
The way to verify that would be by testing it using something like this:
public class Test
{
public static void ExecuteLambdaInScope()
{
// here, the lambda executes only within the scope
// of the referenced variable 'add'
var items = Enumerable.Range(0, 100000).ToArray();
int add = 10; // free variable referenced from lambda
Func<int,int> f = x => x + add;
// measure how long this takes:
var array = items.Select( f ).ToArray();
}
static Func<int,int> GetExpression()
{
int add = 10;
return x => x + add; // this needs a closure
}
static void ExecuteLambdaOutOfScope()
{
// here, the lambda executes outside the scope
// of the referenced variable 'add'
Func<int,int> f = GetExpression();
var items = Enumerable.Range(0, 100000).ToArray();
// measure how long this takes:
var array = items.Select( f ).ToArray();
}
}

Uses of delegates in C# (or other languages)

I have always wondered how delegates can be useful and why shall we use them? Other then being type safe and all those advantages in Visual Studio Documentation, what are real world uses of delegates.
I already found one and it's very targeted.
using System;
namespace HelloNamespace {
class Greetings{
public static void DisplayEnglish() {
Console.WriteLine("Hello, world!");
}
public static void DisplayItalian() {
Console.WriteLine("Ciao, mondo!");
}
public static void DisplaySpanish() {
Console.WriteLine("Hola, imundo!");
}
}
delegate void delGreeting();
class HelloWorld {
static void Main(string [] args) {
int iChoice=int.Parse(args[0]);
delGreeting [] arrayofGreetings={
new delGreeting(Greetings.DisplayEnglish),
new delGreeting(Greetings.DisplayItalian),
new delGreeting(Greetings.DisplaySpanish)};
arrayofGreetings[iChoice-1]();
}
}
}
But this doesn't show me exactly the advantages of using delegates rather than a conditional "If ... { }" that parses the argument and run the method.
Does anyone know why it's better to use delegate here rather than "if ... { }". Also do you have other examples that demonstrate the usefulness of delegates.
Thanks!
Delegates are a great way of injecting functionality into a method. They greatly help with code reuse because of this.
Think about it, lets say you have a group of related methods that have almost the same functionality but vary on just a few lines of code. You could refactor all of the things these methods have in common into one single method, then you could inject the specialised functionality in via a delegate.
Take for example all of the IEnumerable extension methods used by LINQ. All of them define common functionality but need a delegate passing to them to define how the return data is projected, or how the data is filtered, sorted, etc...
The most common real-world everyday use of delegates that I can think of in C# would be event handling. When you have a button on a WinForm, and you want to do something when the button is clicked, then what you do is you end up registering a delegate function to be called by the button when it is clicked.
All of this happens for you automatically behind the scenes in the code generated by Visual Studio itself, so you might not see where it happens.
A real-world case that might be more useful to you would be if you wanted to make a library that people can use that will read data off an Internet feed, and notify them when the feed has been updated. By using delegates, then programmers who are using your library would be able to have their own code called whenever the feed is updated.
Lambda expressions
Delegates were mostly used in conjunction with events. But dynamic languages showed their much broader use. That's why delegates were underused up until C# 3.0 when we got Lambda expressions. It's very easy to do something using Lambda expressions (that generates a delegate method)
Now imagine you have a IEnumerable of strings. You can easily define a delegate (using Lambda expression or any other way) and apply it to run on every element (like trimming excess spaces for instance). And doing it without using loop statements. Of course your delegates may do even more complex tasks.
I will try to list some examples that are beyond a simple if-else scenario:
Implementing call backs. For example you are parsing an XML document and want a particular function to be called when a particular node is encountered. You can pass delegates to the functions.
Implementing the strategy design pattern. Assign the delegate to the required algorithm/ strategy implementation.
Anonymous delegates in the case where you want some functionality to be executed on a separate thread (and this function does not have anything to send back to the main program).
Event subscription as suggested by others.
Delegates are simply .Net's implementation of first class functions and allow the languages using them to provide Higher Order Functions.
The principle benefit of this sort of style is that common aspects can be abstracted out into a function which does just what it needs to do (for example traversing a data structure) and is provided another function (or functions) that it asks to do something as it goes along.
The canonical functional examples are map and fold which can be changed to do all sorts of things by the provision of some other operation.
If you want to sum a list of T's and have some function add which takes two T's and adds them together then (via partial application) fold add 0 becomes sum. fold multiply 1 would become the product, fold max 0 the maximum. In all these examples the programmer need not think about how to iterate over the input data, need not worry about what to do if the input is empty.
These are simple examples (though they can be surprisingly powerful when combined with others) but consider tree traversal (a more complex task) all of that can be abstracted away behind a treefold function. Writing of the tree fold function can be hard, but once done it can be re-used widely without having to worry about bugs.
This is similar in concept and design to the addition of foreach loop constructs to traditional imperative languages, the idea being that you don't have to write the loop control yourself (since it introduces the chance of off by one errors, increases verbosity that gets in the way of what you are doing to each entry instead showing how you are getting each entry. Higher order functions simply allow you to separate the traversal of a structure from what to do while traversing extensibly within the language itself.
It should be noted that delegates in c# have been largely superseded by lambdas because the compiler can simply treat it as a less verbose delegate if it wants but is also free to pass through the expression the lambda represents to the function it is passed to to allow (often complex) restructuring or re-targeting of the desire into some other domain like database queries via Linq-to-Sql.
A principle benefit of the .net delegate model over c-style function pointers is that they are actually a tuple (two pieces of data) the function to call and the optional object on which the function is to be called. This allows you to pass about functions with state which is even more powerful. Since the compiler can use this to construct classes behind your back(1), instantiate a new instance of this class and place local variables into it thus allowing closures.
(1) it doesn't have to always do this, but for now that is an implementation detail
In your example your greating are the same, so what you actually need is array of strings.
If you like to gain use of delegates in Command pattern, imagine you have:
public static void ShakeHands()
{ ... }
public static void HowAreYou()
{ ... }
public static void FrenchKissing()
{ ... }
You can substitute a method with the same signature, but different actions.
You picked way too simple example, my advice would be - go and find a book C# in Depth.
Here's a real world example. I often use delegates when wrapping some sort of external call. For instance, we have an old app server (that I wish would just go away) which we connect to through .Net remoting. I'll call the app server in a delegate from a 'safecall ' function like this:
private delegate T AppServerDelegate<T>();
private T processAppServerRequest<T>(AppServerDelegate<T> delegate_) {
try{
return delegate_();
}
catch{
//Do a bunch of standard error handling here which will be
//the same for all appserver calls.
}
}
//Wrapped public call to AppServer
public int PostXYZRequest(string requestData1, string requestData2,
int pid, DateTime latestRequestTime){
processAppServerRequest<int>(
delegate {
return _appSvr.PostXYZRequest(
requestData1,
requestData2,
pid,
latestRequestTime);
});
Obviously the error handling is done a bit better than that but you get the rough idea.
Delegates are used to "call" code in other classes (that might not necessarily be in the same, class, or .cs or even the same assembly).
In your example, delegates can simply be replaced by if statements like you pointed out.
However, delegates are pointers to functions that "live" somewhere in the code where for organizational reasons for instance you don't have access to (easily).
Delegates and related syntactic sugar have significantly changed the C# world (2.0+)
Delegates are type-safe function pointers - so you use delegates anywhere you want to invoke/execute a code block at a future point of time.
Broad sections I can think of
Callbacks/Event handlers: do this when EventX happens. Or do this when you are ready with the results from my async method call.
myButton.Click += delegate { Console.WriteLine("Robbery in progress. Call the cops!"); }
LINQ: selection, projection etc. of elements where you want to do something with each element before passing it down the pipeline. e.g. Select all numbers that are even, then return the square of each of those
var list = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }
.Where(delegate(int x) { return ((x % 2) == 0); })
.Select(delegate(int x) { return x * x; });
// results in 4, 16, 36, 64, 100
Another use that I find a great boon is if I wish to perform the same operation, pass the same data or trigger the same action in multiple instances of the same object type.
In .NET, delegates are also needed when updating the UI from a background thread. As you can not update controls from thread different from the one that created the controls, you need to invoke the update code withing the creating thread's context (mostly using this.Invoke).

Can I detect whether I've been given a new object as a parameter?

Short Version
For those who don't have the time to read my reasoning for this question below:
Is there any way to enforce a policy of "new objects only" or "existing objects only" for a method's parameters?
Long Version
There are plenty of methods which take objects as parameters, and it doesn't matter whether the method has the object "all to itself" or not. For instance:
var people = new List<Person>();
Person bob = new Person("Bob");
people.Add(bob);
people.Add(new Person("Larry"));
Here the List<Person>.Add method has taken an "existing" Person (Bob) as well as a "new" Person (Larry), and the list contains both items. Bob can be accessed as either bob or people[0]. Larry can be accessed as people[1] and, if desired, cached and accessed as larry (or whatever) thereafter.
OK, fine. But sometimes a method really shouldn't be passed a new object. Take, for example, Array.Sort<T>. The following doesn't make a whole lot of sense:
Array.Sort<int>(new int[] {5, 6, 3, 7, 2, 1});
All the above code does is take a new array, sort it, and then forget it (as its reference count reaches zero after Array.Sort<int> exits and the sorted array will therefore be garbage collected, if I'm not mistaken). So Array.Sort<T> expects an "existing" array as its argument.
There are conceivably other methods which may expect "new" objects (though I would generally think that to have such an expectation would be a design mistake). An imperfect example would be this:
DataTable firstTable = myDataSet.Tables["FirstTable"];
DataTable secondTable = myDataSet.Tables["SecondTable"];
firstTable.Rows.Add(secondTable.Rows[0]);
As I said, this isn't a great example, since DataRowCollection.Add doesn't actually expect a new DataRow, exactly; but it does expect a DataRow that doesn't already belong to a DataTable. So the last line in the code above won't work; it needs to be:
firstTable.ImportRow(secondTable.Rows[0]);
Anyway, this is a lot of setup for my question, which is: is there any way to enforce a policy of "new objects only" or "existing objects only" for a method's parameters, either in its definition (perhaps by some custom attributes I'm not aware of) or within the method itself (perhaps by reflection, though I'd probably shy away from this even if it were available)?
If not, any interesting ideas as to how to possibly accomplish this would be more than welcome. For instance I suppose if there were some way to get the GC's reference count for a given object, you could tell right away at the start of a method whether you've received a new object or not (assuming you're dealing with reference types, of course--which is the only scenario to which this question is relevant anyway).
EDIT:
The longer version gets longer.
All right, suppose I have some method that I want to optionally accept a TextWriter to output its progress or what-have-you:
static void TryDoSomething(TextWriter output) {
// do something...
if (output != null)
output.WriteLine("Did something...");
// do something else...
if (output != null)
output.WriteLine("Did something else...");
// etc. etc.
if (output != null)
// do I call output.Close() or not?
}
static void TryDoSomething() {
TryDoSomething(null);
}
Now, let's consider two different ways I could call this method:
string path = GetFilePath();
using (StreamWriter writer = new StreamWriter(path)) {
TryDoSomething(writer);
// do more things with writer
}
OR:
TryDoSomething(new StreamWriter(path));
Hmm... it would seem that this poses a problem, doesn't it? I've constructed a StreamWriter, which implements IDisposable, but TryDoSomething isn't going to presume to know whether it has exclusive access to its output argument or not. So the object either gets disposed prematurely (in the first case), or doesn't get disposed at all (in the second case).
I'm not saying this would be a great design, necessarily. Perhaps Josh Stodola is right and this is just a bad idea from the start. Anyway, I asked the question mainly because I was just curious if such a thing were possible. Looks like the answer is: not really.
No, basically.
There's really no difference between:
var x = new ...;
Foo(x);
and
Foo(new ...);
and indeed sometimes you might convert between the two for debugging purposes.
Note that in the DataRow/DataTable example, there's an alternative approach though - that DataRow can know its parent as part of its state. That's not the same thing as being "new" or not - you could have a "detach" operation for example. Defining conditions in terms of the genuine hard-and-fast state of the object makes a lot more sense than woolly terms such as "new".
Yes, there is a way to do this.
Sort of.
If you make your parameter a ref parameter, you'll have to have an existing variable as your argument. You can't do something like this:
DoSomething(ref new Customer());
If you do, you'll get the error "A ref or out argument must be an assignable variable."
Of course, using ref has other implications. However, if you're the one writing the method, you don't need to worry about them. As long as you don't reassign the ref parameter inside the method, it won't make any difference whether you use ref or not.
I'm not saying it's good style, necessarily. You shouldn't use ref or out unless you really, really need to and have no other way to do what you're doing. But using ref will make what you want to do work.
No. And if there is some reason that you need to do this, your code has improper architecture.
Short answer - no there isn't
In the vast majority of cases I usually find that the issues that you've listed above don't really matter all that much. When they do you could overload a method so that you can accept something else as a parameter instead of the object you are worried about sharing.
// For example create a method that allows you to do this:
people.Add("Larry");
// Instead of this:
people.Add(new Person("Larry"));
// The new method might look a little like this:
public void Add(string name)
{
Person person = new Person(name);
this.add(person); // This method could be private if neccessary
}
I can think of a way to do this, but I would definitely not recommend this. Just for argument's sake.
What does it mean for an object to be a "new" object? It means there is only one reference keeping it alive. An "existing" object would have more than one reference to it.
With this in mind, look at the following code:
class Program
{
static void Main(string[] args)
{
object o = new object();
Console.WriteLine(IsExistingObject(o));
Console.WriteLine(IsExistingObject(new object()));
o.ToString(); // Just something to simulate further usage of o. If we didn't do this, in a release build, o would be collected by the GC.Collect call in IsExistingObject. (not in a Debug build)
}
public static bool IsExistingObject(object o)
{
var oRef = new WeakReference(o);
#if DEBUG
o = null; // In Debug, we need to set o to null. This is not necessary in a release build.
#endif
GC.Collect();
GC.WaitForPendingFinalizers();
return oRef.IsAlive;
}
}
This prints True on the first line, False on the second.
But again, please do not use this in your code.
Let me rewrite your question to something shorter.
Is there any way, in my method, which takes an object as an argument, to know if this object will ever be used outside of my method?
And the short answer to that is: No.
Let me venture an opinion at this point: There should not be any such mechanism either.
This would complicate method calls all over the place.
If there was a method where I could, in a method call, tell if the object I'm given would really be used or not, then it's a signal to me, as a developer of that method, to take that into account.
Basically, you'd see this type of code all over the place (hypothetical, since it isn't available/supported:)
if (ReferenceCount(obj) == 1) return; // only reference is the one we have
My opinion is this: If the code that calls your method isn't going to use the object for anything, and there are no side-effects outside of modifying the object, then that code should not exist to begin with.
It's like code that looks like this:
1 + 2;
What does this code do? Well, depending on the C/C++ compiler, it might compile into something that evaluates 1+2. But then what, where is the result stored? Do you use it for anything? No? Then why is that code part of your source code to begin with?
Of course, you could argue that the code is actually a+b;, and the purpose is to ensure that the evaluation of a+b isn't going to throw an exception denoting overflow, but such a case is so diminishingly rare that a special case for it would just mask real problems, and it would be really simple to fix by just assigning it to a temporary variable.
In any case, for any feature in any programming language and/or runtime and/or environment, where a feature isn't available, the reasons for why it isn't available are:
It wasn't designed properly
It wasn't specified properly
It wasn't implemented properly
It wasn't tested properly
It wasn't documented properly
It wasn't prioritized above competing features
All of these are required to get a feature to appear in version X of application Y, be it C# 4.0 or MS Works 7.0.
Nope, there's no way of knowing.
All that gets passed in is the object reference. Whether it is 'newed' in-situ, or is sourced from an array, the method in question has no way of knowing how the parameters being passed in have been instantiated and/or where.
One way to know if an object passed to a function (or a method) has been created right before the call to the function/method is that the object has a property that is initialized with the timestamp passed from a system function; in that way, looking at that property, it would be possible to resolve the problem.
Frankly, I would not use such method because
I don't see any reason why the code should now if the passed parameter is an object right created, or if it has been created in a different moment.
The method I suggest depends from a system function that in some systems could not be present, or that could be less reliable.
With the modern CPUs, which are a way faster than the CPUs used 10 years ago, there could be the problem to use the right value for the threshold value to decide when an object has been freshly created, or not.
The other solution would be to use an object property that is set to a a value from the object creator, and that is set to a different value from all the methods of the object.
In this case the problem would be to forget to add the code to change that property in each method.
Once again I would ask to myself "Is there a really need to do this?".
As a possible partial solution if you only wanted one of an object to be consumed by a method maybe you could look at a Singleton. In this way the method in question could not create another instance if it existed already.

Categories