Using Anonymous Delegates with .NET ThreadPool.QueueUserWorkItem - c#

I was going to post a question, but figured it out ahead of time and decided to post the question and the answer - or at least my observations.
When using an anonymous delegate as the WaitCallback, where ThreadPool.QueueUserWorkItem is called in a foreach loop, it appears that the same single loop value is passed to every thread.
List< Thing > things = MyDb.GetTheThings();
foreach( Thing t in things )
{
    localLogger.DebugFormat( "About to queue thing [{0}].", t.Id );
    ThreadPool.QueueUserWorkItem(
        delegate
        {
            try
            {
                // 't' is the captured loop variable, not a copy of its value.
                WorkWithOneThing( t );
            }
            finally
            {
                Cleanup();
                localLogger.DebugFormat( "Thing [{0}] has been queued and run by the delegate.", t.Id );
            }
        });
}
For a collection of 16 Thing instances in 'things', I observed that every Thing passed to WorkWithOneThing was the last item in the list.
I suspect this is because the delegate is accessing the 't' outer variable. Note that I also experimented with passing the Thing as a parameter to the anonymous delegate, but the behavior remained incorrect.
When I refactored the code to use a named WaitCallback method and passed the Thing 't' to the method, voilà ... the i'th instance of 'things' was correctly passed into WorkWithOneThing.
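A minimal sketch of what that refactoring might look like, for anyone who wants to try it. The Thing class, the WorkItem holder and the ProcessAll helper are made up for illustration and are not from the original code; the only real API used is ThreadPool.QueueUserWorkItem(WaitCallback, object) plus CountdownEvent and ConcurrentBag to collect results.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class QueueWithStateDemo
{
    // Hypothetical stand-in for the Thing type in the question.
    public sealed class Thing
    {
        public Thing(int id) { Id = id; }
        public int Id { get; private set; }
    }

    // Everything one work item needs travels in its own state object,
    // so nothing is shared through a captured loop variable.
    private sealed class WorkItem
    {
        public Thing Thing;
        public ConcurrentBag<int> Processed;
        public CountdownEvent Done;
    }

    // Named method whose signature matches WaitCallback.
    private static void WorkWithOneThing(object state)
    {
        var item = (WorkItem)state;
        try
        {
            item.Processed.Add(item.Thing.Id);
        }
        finally
        {
            item.Done.Signal();
        }
    }

    // Queues one work item per Thing, waits for all of them,
    // and returns the processed ids in ascending order.
    public static List<int> ProcessAll(IList<Thing> things)
    {
        var processed = new ConcurrentBag<int>();
        using (var done = new CountdownEvent(things.Count))
        {
            foreach (Thing t in things)
            {
                var item = new WorkItem { Thing = t, Processed = processed, Done = done };
                ThreadPool.QueueUserWorkItem(new WaitCallback(WorkWithOneThing), item);
            }
            done.Wait();
        }
        return processed.OrderBy(id => id).ToList();
    }
}
```

With 16 things whose ids are 1 to 16, ProcessAll should hand back all 16 distinct ids, because each callback receives its own state object instead of reading a shared variable.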
A lesson in parallelism I guess. I also imagine that the Parallel.For family addresses this, but that library was not an option for us at this point.
Hope this saves someone else some time.
Howard Hoffman

This is correct, and describes how C# captures outside variables inside closures. It's not directly an issue about parallelism, but rather about anonymous methods and lambda expressions.
This question discusses this language feature and its implications in detail.

This is a common occurrence when using closures and is especially evident when constructing LINQ queries. The closure references the variable, not its contents, therefore, to make your example work, you can just specify a variable inside the loop that takes the value of t and then reference that in the closure. This will ensure each version of your anonymous delegate references a different variable.
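Here is a small sketch of the fix described above, using a plain for loop so the result is deterministic and no threads are involved. The class and method names are made up for the example. Note that from C# 5 onward the foreach loop variable is a fresh variable per iteration, so the original foreach code behaves as expected on newer compilers; a for loop still shares one variable across all iterations.

```csharp
using System;
using System.Collections.Generic;

public static class ClosureCaptureDemo
{
    // Every delegate captures the single shared loop variable 'i'.
    public static List<int> CaptureLoopVariable(int count)
    {
        var callbacks = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            callbacks.Add(() => i);
        }
        return Run(callbacks);
    }

    // Each delegate captures its own 'copy', declared inside the loop body.
    public static List<int> CaptureLocalCopy(int count)
    {
        var callbacks = new List<Func<int>>();
        for (int i = 0; i < count; i++)
        {
            int copy = i;
            callbacks.Add(() => copy);
        }
        return Run(callbacks);
    }

    // Invokes the delegates only after the loop has finished.
    private static List<int> Run(List<Func<int>> callbacks)
    {
        var results = new List<int>();
        foreach (Func<int> callback in callbacks)
        {
            results.Add(callback());
        }
        return results;
    }
}
```

CaptureLoopVariable(3) returns 3, 3, 3 (the value of i after the loop ended), while CaptureLocalCopy(3) returns 0, 1, 2.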

Below is a link detailing why that happens. It's written for VB but C# has the same semantics.
http://blogs.msdn.com/jaredpar/archive/2007/07/26/closures-in-vb-part-5-looping.aspx

Related

C#: Anonymous method vs Named method

I'm new to SO and to programming, and I'm picking up C# jargon bit by bit.
After Googling for a while, this is what I've gathered about methods:
A method is a block of statements that supports code reuse.
It can also be overloaded with different signatures, for example:
drawShape(2pts), drawShape(3pts) etc...
An anonymous method is a block of statements with no name.
(It may be premature to ask, but in what situations do we come across
anonymous methods? Any articles or samples would help.)
Named method: here's a link, but in the end I didn't get what a named method actually is...
Can anyone explain what a "named" method is, and where we use anonymous methods?
A named method is a method you can call by its name (e.g. it is a function that has a name). For example, you have defined a function to add two numbers:
int f(int x, int y)
{
return x+y;
}
You would call this method by its name like so: f(1, 2);.
An anonymous method is a method that is passed as an argument to a function without needing a name. Such methods can be constructed at runtime or produced from a lambda expression at compile time.
These methods are often used in LINQ queries, for example:
int maxSmallerThan10 = array.Where(x => x < 10).Max();
The expression x => x < 10 is called a lambda expression and its result is an anonymous function that will be run by the method Where.
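To make the comparison concrete, here is a small sketch that passes the same predicate to Where in three ways: as a named method, as an anonymous method written with the delegate keyword, and as a lambda. The class and method names are invented for the example.

```csharp
using System;
using System.Linq;

public static class PredicateFormsDemo
{
    // Named method: it has a name and can be reused anywhere in scope.
    private static bool IsSmallerThan10(int x)
    {
        return x < 10;
    }

    // Returns the same maximum computed three ways.
    // Note: Max() throws if no element passes the filter.
    public static int[] MaxSmallerThan10(int[] array)
    {
        int viaNamed = array.Where(IsSmallerThan10).Max();
        int viaAnonymous = array.Where(delegate(int x) { return x < 10; }).Max();
        int viaLambda = array.Where(x => x < 10).Max();
        return new[] { viaNamed, viaAnonymous, viaLambda };
    }
}
```

For the input 3, 12, 7, 25 all three forms give 7. The only difference is where the code for the predicate lives and whether it has a name.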
If you are a beginner, I would suggest you first read about more basic stuff. Check out the following links:
http://www.completecsharptutorial.com/
http://www.csharp-station.com/tutorial.aspx
http://www.homeandlearn.co.uk/csharp/csharp.html
Let's start from a simple method.
void MyMethod()
{
Console.WriteLine("Inside MyMethod"); //Write to output
}
The above method is a named-method which just writes Inside MyMethod to the output window.
Anonymous methods are used in special scenarios (when working with delegates) where the method body is usually short and you don't give the method a name.
For example: delegate { Console.WriteLine("Inside MyMethod"); }
Just start writing some simple programs and in the due course, when you use delegates or some advanced concepts, you will yourself learn. :)
Explanation by Analogy
Normally when we tell stories we refer to people by name:
"Freddie"
"Who's Freddie?"
"You know, Freddie, Freddie from Sales - the male guy with the red hair, who burned the building down...?"
In reality nobody cares who the person is, which department he works in, etc. It's not like we'll refer to him ever again. We want to be able to say: "Some guy burned down our building". All the other stuff (hair color, name etc.) is irrelevant and/or can be inferred.
What does this have to do with c#?
Typically in c# you would have to define a method if you want to use it: you must tell the compiler (typically):
what it is called,
and what goes into it (parameters + their types),
as well as what should come out (return type),
and whether it is something you can do in the privacy of your home or whether you can do it in public. (scope)
When you do that with methods, you are basically using named methods. But writing them out: that's a lot of effort. Especially if all of that can be inferred and you're never going to use it again.
That's basically where anonymous methods come in. It's like a disposable method - something quick and dirty - it reduces the amount you have to type in. That's basically the purpose of them.
Anonymous methods and anonymous functions, which seem to be the same thing, are basically delegates. As the link you point to (http://msdn.microsoft.com/en-us/library/bb882516.aspx) describes, anonymous methods provide a simplified way to pass a method to be executed by another method, like a callback.
Another way to see it is to think about lambda expressions.
A named method, by contrast, is any ordinary method.
From MSDN:
A delegate can be associated with a named method. When you instantiate a delegate by using a named method, the method is passed as a parameter. This is called using a named method. Delegates constructed with a named method can encapsulate either a static method or an instance method. Named methods are the only way to instantiate a delegate in earlier versions of C#. However, in a situation where creating a new method is unwanted overhead, C# enables you to instantiate a delegate and immediately specify a code block that the delegate will process when it is called. The block can contain either a lambda expression or an anonymous method.
and
In versions of C# before 2.0, the only way to declare a delegate was to use named methods. C# 2.0 introduced anonymous methods and in C# 3.0 and later, lambda expressions supersede anonymous methods as the preferred way to write inline code. However, the information about anonymous methods in this topic also applies to lambda expressions. There is one case in which an anonymous method provides functionality not found in lambda expressions. Anonymous methods enable you to omit the parameter list. This means that an anonymous method can be converted to delegates with a variety of signatures. This is not possible with lambda expressions. For more information specifically about lambda expressions, see Lambda Expressions (C# Programming Guide). Creating anonymous methods is essentially a way to pass a code block as a delegate parameter. By using anonymous methods, you reduce the coding overhead in instantiating delegates because you do not have to create a separate method.
So in answer to your question about when to use anonymous methods, then MSDN says: in a situation where creating a new method is unwanted overhead.
In my experience it's more down to a question of code reuse and readability.
Links:
http://msdn.microsoft.com/en-us/library/98dc08ac.aspx
http://msdn.microsoft.com/en-us/library/0yw3tz5k.aspx
Hope that helps

Are Lambda expressions in C# closures?

Are lambda expressions (and to a degree, anonymous functions) closures?
My understanding of closures is that they are functions that are treated as objects, which seems to be an accurate representation of what anonymous functions and lambda expressions do.
And is it correct to call them closures? I understand that closures came about (or became popular) through Lisp dialects, but is it also a general programming term?
Thanks for any clarification that you can provide!
A lambda may be implemented using a closure, but it is not itself necessarily a closure.
A closure is "a function together with a referencing environment for the non-local variables of that function.".
When you make a lambda expression that uses variables defined outside of the method, then the lambda must be implemented using a closure. For example:
int i = 42;
Action lambda = () => { Console.WriteLine(i); };
In this case, the compiler generated method must have access to the variable (i) defined in a completely different scope. In order for this to work, the method it generates is a "function together with the referencing environment" - basically, it's creating a "closure" to retrieve access to the variable.
However, this lambda:
Action lambda2 = () => { Console.WriteLine("Foo"); };
does not rely on any "referencing environment", since it's a fully contained method. In this case, the compiler generates a normal static method, and there is no closure involved at all.
In both cases, the lambda is creating a delegate ("function object"), but it's only creating a closure in the first case, as the lambda doesn't necessarily need to "capture" the referencing environment in all cases.
Reed's answer is correct; I would just add few additional details:
lambda expressions and anonymous methods both have closure semantics; that is, they "capture" their outer variables and extend the lifetimes of those variables.
anonymous function is the term we use when we mean a lambda expression or an anonymous method. Yes, that is confusing. Sorry. It was the best we could come up with.
a function that can be treated as an object is just a delegate. What makes a lambda a closure is that it captures its outer variables.
lambda expressions converted to expression trees also have closure semantics, interestingly enough. And implementing that correctly was a pain in the neck, I tell you!
"this" is considered an "outer variable" for the purpose of creating a closure even though "this" is not a variable.
It's "closure" not "clojure."
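A quick sketch of the "capture and extend the lifetime" point, in case it helps. MakeCounter is a made-up name. The local variable count would normally disappear when MakeCounter returns, but because the lambda captures it, it lives on for as long as the delegate does.

```csharp
using System;

public static class CounterClosureDemo
{
    // Returns a delegate that closes over the local variable 'count'.
    public static Func<int> MakeCounter()
    {
        int count = 0;
        return () => ++count;
    }
}
```

Calling the returned delegate repeatedly gives 1, 2, 3 and so on. A second call to MakeCounter produces an independent counter that starts again at 1, since each call captures its own count variable.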
That is not what a closure is. A closure is basically a representation of a function together with any non-local variables that the function consumes.
In that sense, lambdas are not closures, but they do cause closures to be generated by the compiler if they close over any variables.
If you use ILDASM on an assembly that contains a lambda that closes over some variables, you will see in that assembly a compiler-generated class that represents the function and those variables that were closed over. That is the closure.
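As a rough illustration of that generated class, the sketch below writes one by hand next to the lambda version. This is an approximation under my own naming: the real generated class has a compiler-chosen name and its exact shape is an implementation detail. The point is only that the captured local becomes a field on an object, and both the method and the lambda body read that field.

```csharp
using System;

public static class ClosureLoweringDemo
{
    // Hand-written stand-in for the compiler-generated closure class.
    private sealed class DisplayClass
    {
        public int i;                       // the captured local becomes a field
        public int Read() { return i; }     // the lambda body becomes a method
    }

    // Version written with a lambda that closes over 'i'.
    public static int WithLambda()
    {
        int i = 42;
        Func<int> read = () => i;
        i = 43;                             // the lambda sees this change
        return read();
    }

    // Roughly equivalent version with the closure object spelled out.
    public static int WithExplicitClass()
    {
        var closure = new DisplayClass();
        closure.i = 42;
        Func<int> read = closure.Read;
        closure.i = 43;
        return read();
    }
}
```

Both methods return 43, which shows that the variable itself is shared, not a snapshot of its value at the time the lambda was created.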
When you say
functions that are treated as objects,
that's normally just "function object" (in C# we'd say "delegate") and is common in functional programming.
Yes. Closures typically capture variables from the outer scope. Lambdas can do that. However if your lambda does not capture anything, it is not a closure.

C# - Syntactic sugar for out parameters?

Let us say for a moment that C# allowed multiple return values in the most pure sense, where we would expect to see something like:
string sender = message.GetSender();
string receiver = message.GetReceiver();
compacted to:
string sender, receiver = message.GetParticipants();
In that case, I do not have to understand the return values of the method until I actually make the method call. Perhaps I rely on Intellisense to tell me what return value(s) I'm dealing with, or perhaps I'm searching for a method that returns what I want from a class I am unfamiliar with.
Similarly, we have something like this, currently, in C#:
string receiver;
string sender = message.GetParticipants(out receiver);
where the argument to GetParticipants is an out string parameter. However, this is a bit different than the above because it means I have to preempt with, or at least go back and write, code that creates a variable to hold the result of the out parameter. This is a little counterintuitive.
My question is, is there any syntactic sugar in current C#, that allows a developer to make this declaration in the same line as the method call? I think it would make development a (tiny) bit more fluid, and also make the code more readable if I were doing something like:
string sender = message.GetParticipants(out string receiver);
to show that receiver was being declared and assigned on the spot.
No, there isn't currently any syntactic sugar around this. I haven't heard of any intention to introduce any either.
I can't say I use out parameters often enough for it really to be a significant concern for me (there are other features I'd rather the C# team spent their time on) but I agree it's a bit annoying.
.NET 4 will be adding a Tuple concept, which deals with this. Unfortunately, the C# language isn't going to provide any language support for "destructuring bind".
Personally, I like the inconvenience introduced when using out parameters. It helps me to think about whether my method is really doing what it should be or if I've crammed too much functionality into it. That said, perhaps dynamic typing in C# 4.0/.NET 4 will address some of your concerns.
dynamic participant = message.GetParticipants();
var sender = participant.Sender;
var recipient = participant.Recipient;
where
public object GetParticipants()
{
return new { Sender = ..., Recipient = ... };
}
You can also return a Tuple<T,U> or something similar. However, since you want to return two strings, it might get confusing.
I use the Tuples structs of the BclExtras library which is very handy (found it on SO, thank you JaredPar!).
I don't think such functionality exists, but if it were implemented in a way similar to arrays in Perl, that could actually be useful.
In Perl you can assign an array to a list of variables in parentheses. So you can, for example, do this:
($user, $password) = split(/:/,$data);
Where this bugs me the most: since there's no overload of (say) DateTime.TryParse that doesn't take an out parameter, you can't write
if (!DateTime.TryParse(s, out d))
{
return new ValidationError("{0} isn't a valid date", s);
}
without declaring d. I don't know if this is a problem with out parameters or just with how the TryParse method is implemented, but it's annoying.
This syntactic sugar is now available in the Roslyn preview as seen here (called declaration expressions).
int.TryParse(s, out var x);
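For readers on a newer compiler: inline out variable declarations, along with tuple return types and deconstruction, shipped in C# 7.0. The sketch below shows both against the question's scenario. The Message class and its two GetParticipants variants are hypothetical, written only to have something to call.

```csharp
using System;

public static class OutVariableDemo
{
    // Hypothetical message type, modelled on the question's example.
    public sealed class Message
    {
        private readonly string _sender;
        private readonly string _receiver;

        public Message(string sender, string receiver)
        {
            _sender = sender;
            _receiver = receiver;
        }

        // Classic shape: one return value plus one out parameter.
        public string GetParticipants(out string receiver)
        {
            receiver = _receiver;
            return _sender;
        }

        // Tuple shape: both values come back as one return value.
        public (string Sender, string Receiver) GetParticipantsTuple()
        {
            return (_sender, _receiver);
        }
    }

    // 'receiver' is declared right inside the call.
    public static string DescribeWithOut(Message message)
    {
        string sender = message.GetParticipants(out string receiver);
        return sender + " -> " + receiver;
    }

    // Deconstruction declares both variables in one statement.
    public static string DescribeWithTuple(Message message)
    {
        (string sender, string receiver) = message.GetParticipantsTuple();
        return sender + " -> " + receiver;
    }

    // The TryParse pattern without a separate declaration line.
    public static int ParseOrDefault(string s, int fallback)
    {
        return int.TryParse(s, out int value) ? value : fallback;
    }
}
```

Both Describe methods produce the same string for the same message, and ParseOrDefault("42", -1) gives 42 while ParseOrDefault("abc", -1) gives -1.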
At best you would have to use var rather than an explicit type, unless you want to restrict all multiple return values to be of the same type (not likely practical). You would also be limiting the scope of the variable; currently you can declare a variable at a higher scope and initialize it in an out parameter. With this approach, the variable would go out of scope in the same block as its assignment. Obviously this is usable in some cases, but I wouldn't want to enforce this as the general rule. Obviously you could leave the 'out' option in place, but chances are people are going to code for one approach or the other.
I think this is not what you want.
You may have come across a piece of code where you would have liked that. But variables popping out of nowhere because they have been introduced in the parameter list would be a personal nightmare (to me :) ).
Multiple return values have grave downsides from the point of view of portability/maintainability. If you make a function that returns two strings and you now want it to return three, you will have to change all the code that uses this function.
A returned record type, however, usually plays nicely in such common scenarios.
You may be opening Pandora's box ;-)
For line compacting:
string s1, s2; s1 = foo.bar(out s2);
Lines can be any length, so you could pack some common stuff into one.
Just try to live with the semicolons.
Try the following code
Participants p = message.GetParticipants();
log(p.sender,p.receiver);

When not to use lambda expressions [closed]

A lot of questions are being answered on Stack Overflow, with members specifying how to solve these real world/time problems using lambda expressions.
Are we overusing it, and are we considering the performance impact of using lambda expressions?
I found a few articles that explore the performance impact of lambdas vs anonymous delegates vs for/foreach loops, with different results:
Anonymous Delegates vs Lambda Expressions vs Function Calls Performance
Performance of foreach vs. List.ForEach
.NET/C# Loop Performance Test (FOR, FOREACH, LINQ, & Lambda).
DataTable.Select is faster than LINQ
What should be the evaluation criteria when choosing the appropriate solution? Except for the obvious reason that it's more concise code and readable when using lambda.
Even though I will focus on point one, I'll begin by giving my 2 cents on the whole issue of performance. Unless the differences are big or usage is intensive, I usually don't bother about microseconds that, when added up, don't amount to any difference visible to the user. I emphasize that I only disregard them for methods that are not called intensively. Where I do have special performance considerations is in the way I design the application itself. I care about caching, about the use of threads, about clever ways to call methods (whether to make several calls or to try to make only one call), whether to pool connections or not, etc. In fact I usually don't focus on raw performance, but on scalability. I don't care if it runs better by a tiny slice of a nanosecond for a single user, but I care a lot about being able to load the system with large numbers of simultaneous users without noticing the impact.
Having said that, here goes my opinion about point 1. I love anonymous methods. They give me great flexibility and code elegance. The other great feature of anonymous methods is that they allow me to directly use local variables from the containing method (from a C# perspective, not from an IL perspective, of course). They often spare me loads of code. When do I use anonymous methods? Every single time the piece of code I need isn't needed elsewhere. If it is used in two different places, I don't like copy-paste as a reuse technique, so I'll use a plain ol' delegate. So, just like shoosh answered, it isn't good to have code duplication. In theory there are no performance differences, as anonymous methods are C# tricks, not IL stuff.
Most of what I think about anonymous methods applies to lambda expressions, as the latter can be used as a compact syntax to represent anonymous methods. Let's assume the following method:
public static void DoSomethingMethod(string[] names, Func<string, bool> myExpression)
{
Console.WriteLine("Lambda used to represent an anonymous method");
foreach (var item in names)
{
if (myExpression(item))
Console.WriteLine("Found {0}", item);
}
}
It receives an array of strings and for each one of them, it will call the method passed to it. If that method returns true, it will say "Found...". You can call this method the following way:
string[] names = {"Alice", "Bob", "Charles"};
DoSomethingMethod(names, delegate(string p) { return p == "Alice"; });
But, you can also call it the following way:
DoSomethingMethod(names, p => p == "Alice");
There is no difference in IL between the two, and the one using the lambda expression is much more readable. Once again, there is no performance impact, as these are all C# compiler tricks (not JIT compiler tricks). Just as I don't feel we are overusing anonymous methods, I don't feel we are overusing lambda expressions to represent anonymous methods. Of course, the same logic applies to repeated code: don't use lambdas, use regular delegates. There are other restrictions leading you back to anonymous methods or plain delegates, like out or ref argument passing.
The other nice thing about lambda expressions is that the exact same syntax doesn't have to represent an anonymous method. Lambda expressions can also represent... you guessed it, expressions. Take the following example:
public static void DoSomethingExpression(string[] names, System.Linq.Expressions.Expression<Func<string, bool>> myExpression)
{
Console.WriteLine("Lambda used to represent an expression");
BinaryExpression bExpr = myExpression.Body as BinaryExpression;
if (bExpr == null)
return;
Console.WriteLine("It is a binary expression");
Console.WriteLine("The node type is {0}", bExpr.NodeType.ToString());
Console.WriteLine("The left side is {0}", bExpr.Left.NodeType.ToString());
Console.WriteLine("The right side is {0}", bExpr.Right.NodeType.ToString());
if (bExpr.Right.NodeType == ExpressionType.Constant)
{
ConstantExpression right = (ConstantExpression)bExpr.Right;
Console.WriteLine("The value of the right side is {0}", right.Value.ToString());
}
}
Notice the slightly different signature. The second parameter receives an expression and not a delegate. The way to call this method would be:
DoSomethingExpression(names, p => p == "Alice");
Which is exactly the same as the call we made when creating an anonymous method with a lambda. The difference here is that we are not creating an anonymous method, but creating an expression tree. It is due to these expression trees that we can then translate lambda expressions to SQL, which is what Linq 2 SQL does, for instance, instead of executing stuff in the engine for each clause like the Where, the Select, etc. The nice thing is that the calling syntax is the same whether you're creating an anonymous method or sending an expression.
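A compact sketch of that difference, assuming nothing beyond System.Linq.Expressions. The same lambda text is assigned once to a delegate type and once to an expression type. The delegate can only be invoked. The expression can be inspected as data and, if needed, compiled into a delegate afterwards. The class and method names are made up.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionVsDelegateDemo
{
    // Builds both forms from identical lambda text and summarizes
    // what can be observed about each.
    public static string Describe()
    {
        // Compiled straight to executable code.
        Func<string, bool> asDelegate = p => p == "Alice";

        // Compiled to a data structure describing the code.
        Expression<Func<string, bool>> asTree = p => p == "Alice";

        var body = (BinaryExpression)asTree.Body;
        Func<string, bool> compiled = asTree.Compile();

        return body.NodeType + "," + body.Left.NodeType + "," + body.Right.NodeType
            + "," + asDelegate("Alice") + "," + compiled("Bob");
    }
}
```

Describe() returns "Equal,Parameter,Constant,True,False": the tree exposes an equality node with a parameter on the left and a constant on the right, which is the kind of structure a provider such as LINQ to SQL walks when it translates a query.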
My answer will not be popular.
I believe lambdas are the better choice 99% of the time, for three reasons.
First, there is ABSOLUTELY nothing wrong with assuming your developers are smart. Other answers have an underlying premise that every developer but you is stupid. Not so.
Second, lambdas (et al) are a modern syntax, and tomorrow they will be more commonplace than they already are today. Your project's code should flow from current and emerging conventions.
Third, writing code "the old fashioned way" might seem easier to you, but it's not easier to the compiler. This is important, legacy approaches have little opportunity to be improved as the compiler is rev'ed. Lambdas (et al) which rely on the compiler to expand them can benefit as the compiler deals with them better over time.
To sum up:
Developers can handle it
Everyone is doing it
There's future potential
Again, I know this will not be a popular answer. And believe me "Simple is Best" is my mantra, too. Maintenance is an important aspect to any source. I get it. But I think we are overshadowing reality with some cliché rules of thumb.
// Jerry
Code duplication.
If you find yourself writing the same anonymous function more than once, it shouldn't be one.
Well, when we are talking about delegate usage, there shouldn't be any difference between lambdas and anonymous methods: they are the same, just with different syntax. And named methods (used as delegates) are also identical from the runtime's viewpoint. The difference, then, is between using delegates and using inline code, i.e.
list.ForEach(s=>s.Foo());
// vs.
foreach(var s in list) { s.Foo(); }
(where I would expect the latter to be quicker)
And equally, if you are talking about anything other than in-memory objects, lambdas are one of your most powerful tools in terms of maintaining type checking (rather than parsing strings all the time).
Certainly, there are cases when a simple foreach with code will be faster than the LINQ version, as there will be fewer invokes to do, and invokes cost a small but measurable time. However, in many cases, the code is simply not the bottleneck, and the simpler code (especially for grouping, etc) is worth a lot more than a few nanoseconds.
Note also that in .NET 4.0 there are additional Expression nodes for things like loops, commas, etc. The language doesn't support them, but the runtime does. I mention this only for completeness: I'm certainly not saying you should use manual Expression construction where foreach would do!
I'd say that the performance differences are usually so small (and in the case of loops, obviously, if you look at the results of the 2nd article (btw, Jon Skeet has a similar article here)) that you should almost never choose a solution for performance reasons alone, unless you are writing a piece of software where performance is absolutely the number one non-functional requirement and you really have to do micro-optimizations.
When to choose what? I guess it depends on the situation but also on the person. Just as an example, some people prefer List.ForEach over a normal foreach loop. I personally prefer the latter, as it is usually more readable, but who am I to argue against this?
Rules of thumb:
Write your code to be natural and readable.
Avoid code duplications (lambda expressions might require a little extra diligence).
Optimize only when there's a problem, and only with data to back up what that problem actually is.
Any time the lambda simply passes its arguments directly to another function. Don't create a lambda for function application.
Example:
var coll = new ObservableCollection<int>();
myInts.ForEach(x => coll.Add(x));
Is nicer as:
var coll = new ObservableCollection<int>();
myInts.ForEach(coll.Add);
The main exception is where C#'s type inference fails for whatever reason (and there are plenty of times that's true).
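A self-contained version of the two calls above, as a sketch. It assumes myInts is a List<int> (ForEach is a List<T> method) and wraps each variant in a made-up helper so the results can be compared.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class MethodGroupDemo
{
    // Wraps coll.Add in a lambda that only forwards its argument.
    public static List<int> CopyWithLambda(List<int> myInts)
    {
        var coll = new ObservableCollection<int>();
        myInts.ForEach(x => coll.Add(x));
        return new List<int>(coll);
    }

    // Passes coll.Add directly as a method group.
    public static List<int> CopyWithMethodGroup(List<int> myInts)
    {
        var coll = new ObservableCollection<int>();
        myInts.ForEach(coll.Add);
        return new List<int>(coll);
    }
}
```

Both helpers copy the input in order, so for 1, 2, 3 each returns 1, 2, 3. The method group form simply skips the extra forwarding lambda.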
If you need recursion, don't use lambdas, or you'll end up getting very distracted!
Lambda expressions are cool. Compared with the older delegate syntax they have a few advantages: they can be converted to either anonymous functions or expression trees, parameter types are inferred from the declaration, and they are cleaner and more concise. I see no real value in avoiding a lambda expression when you need an anonymous function. One small advantage of the earlier style is that you can omit the parameter declaration entirely if the parameters are not used, like:
Action<int> a = delegate { }; //takes one argument, but no argument specified
This is useful when you have to declare an empty delegate that does nothing, but it is not a strong enough reason to avoid lambdas.
Lambdas let you write quick anonymous methods. That makes lambdas pointless wherever anonymous methods are pointless, i.e. where named methods make more sense. Compared with named methods, anonymous methods can be disadvantageous (this is not specific to lambda expressions, but since lambdas are now the usual way to write anonymous methods, it is relevant):
because they tend to lead to logic duplication (reuse is difficult)
when it is unnecessary to write one, like:
//this is unnecessary
Func<string, int> f = x => int.Parse(x);
//this is enough
Func<string, int> f = int.Parse;
since writing anonymous iterator block is impossible.
Func<IEnumerable<int>> f = () => { yield return 0; }; //impossible
since recursive lambdas require one more line of quirkiness, like
Func<int, int> f = null;
f = x => (x <= 1) ? 1 : x * f(x - 1);
well, since reflection is kinda messier, but that is moot isn't it?
Apart from point 3, the rest are not strong reasons not to use lambdas.
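The recursion quirk from the list above, wrapped in a runnable sketch (the class and method names are mine). The delegate variable has to be declared and assigned null first, because the lambda body refers to f and the compiler would otherwise report use of an unassigned local.

```csharp
using System;

public static class RecursiveLambdaDemo
{
    // Computes n! with a self-referencing lambda.
    public static int Factorial(int n)
    {
        Func<int, int> f = null;                 // declared first so the lambda can refer to it
        f = x => (x <= 1) ? 1 : x * f(x - 1);    // captures the variable 'f', not its value
        return f(n);
    }
}
```

Factorial(5) returns 120. Because the lambda captures the variable f itself, reassigning f later would change what the recursive call invokes, which is part of why this pattern can get distracting.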
Also see this thread about what is disadvantageous about Func/Action delegates, since often they are used along with lambda expressions.

C# compiler and caching of local variables

EDIT: Oops - as rightly pointed out, there'd be no way to know whether the constructor for the class in question is sensitive to when or how many times it is called, or whether the object's state is changed during the method, so it would have to be created from scratch each time. Ignore the Dictionary and just consider delegates created in-line during the course of a method :-)
Say I have the following method with Dictionary of Type to Action local variable.
void TakeAction(Type type)
{
// Random types chosen for example.
var actions = new Dictionary<Type, Action>()
{
{typeof(StringBuilder), () =>
{
// ..
}},
{typeof(DateTime), () =>
{
// ..
}}
};
actions[type].Invoke();
}
The Dictionary will always be the same when the method is called. Can the C# compiler notice this, only create it once and cache it somewhere for use in future calls to the method? Or will it simply be created from scratch each time? I know it could be a field of the containing class, but it seems neater to me for a thing like this to be contained in the method that uses it.
How should the C# compiler know that it's "the same" dictionary every time? You explicitly create a new dictionary every time. C# does not support static local variables, so you have to use a field. There's nothing wrong with that, even if no other method uses the field.
It would be bad if the C# compiler did things like that. What if the constructor of the variable uses random input? :)
Short answer: no.
Slightly longer answer: I believe it will cache the result of creating a delegate from a lambda expression which doesn't capture anything (including "this") but that's a pretty special case.
Correct way to change your code: declare a private static readonly variable for the dictionary.
private static readonly Dictionary<Type, Action> Actions =
    new Dictionary<Type, Action>()
    {
        { typeof(StringBuilder), () => ... },
        { typeof(DateTime), () => ... },
    };
void TakeAction(Type type)
{
    Actions[type].Invoke();
}
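A filled-in sketch of the same idea, in case a complete version is useful. The lambda bodies and the returned strings are invented, and I have used Func<string> instead of Action purely so the result of each call is easy to observe. The dictionary is built once, when the type is initialized, and every call to TakeAction reuses it.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CachedActionsDemo
{
    // Built once by the static initializer, then shared by all calls.
    private static readonly Dictionary<Type, Func<string>> Actions =
        new Dictionary<Type, Func<string>>
        {
            { typeof(StringBuilder), () => "handled StringBuilder" },
            { typeof(DateTime), () => "handled DateTime" },
        };

    // Looks up the handler for the type; unknown types get a fallback
    // message instead of a KeyNotFoundException.
    public static string TakeAction(Type type)
    {
        Func<string> action;
        return Actions.TryGetValue(type, out action) ? action() : "no action registered";
    }
}
```

TakeAction(typeof(DateTime)) returns "handled DateTime", and a type that is not in the table, such as typeof(int), returns "no action registered".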
For any compiler to be able to do this, it would have to have some way to have guarantees for the following issues:
Constructing two objects the exact same way produces identical objects, in any way, except for their location in memory. This would mean that the object constructed the second time would be no different from the first one, as opposed to, say, caching an object of the Random type.
Interacting with the object does not change its state. This would mean that caching the object would be safe and would not change the behavior of subsequent calls. This would for instance rule out modifying the dictionary in any way.
The reason for this is that the compiler would have to be able to guarantee that the object it constructed the first time would be equally usable the next time around without having to recreate it.
Now, the fact that C# and .NET do not have mechanisms for making these guarantees is probably not the reason why the compiler doesn't support this kind of optimization, but these would have to be implemented first. There could also be other such guarantees the compiler would need before it could do this that I don't know of.
The change that Jon Skeet has suggested is basically the way to say that I know that these two guarantees hold for my code, and thus you take control over the situation yourself.
That dictionary will be created anew each time; otherwise, for example, you could put other things into the dictionary, and the intent would be lost.
