Related
Question
How does a delegate store a reference to a function? The source code appears to refer to it as an Object, and the manner in which it invokes the method seems redacted from the source code. Can anyone explain how C# is handling this?
Original Post
It seems I'm constantly fighting the abstractions C# imposes on its programmers. One that's been irking me is the obfuscation of Functions/Methods. As I understand it, all methods are in fact anonymous methods assigned to properties of a class. This is the reason why no function is prefixed by a datatype. For example...
void foo() { ... }
... would be written in Javascript as...
Function foo = function():void { ... };
In my experience, Anonymous functions are typically bad form, but here it's replete throughout the language standard. Because you cannot define a function with its datatype (and apparently the implication/handling is assumed by the compiler), how does one store a reference to a method if the type is never declared?
I'm trying very hard to avoid Delegates and its variants (Action & Func), both because...
it is another abstraction from what's actually happening
the unnecessary overhead required to instantiate these classes (which in turn carry their own pointers to the methods being called).
Looking at the source code for the Delegate.cs, it appears to refer to the reference of a function as simply Object (see lines 23-25).
If these really are objects, how are we calling them? According to the delegate.cs trail, it dead-ends on the following path:
Delegate.cs:DynamicInvoke() > DynamicInvokeImpl() > methodinfo.cs:UnsafeInvoke() > UnsafeInvokeInternal() > RuntimeMethodHandle.InvokeMethod() > runtimehandles.cs:InvokeMethod()
internal extern static object InvokeMethod(object target, object[] arguments, Signature sig, bool constructor);
This really doesn't explain how its invoked if indeed the method is an object. It feels as though this is not code at all, and the actual code called has been redacted from source repository.
Your help is appreciated.
Response to Previous Comments
#Amy: I gave an example immediately after that statement to explain what I meant. If a function were prefixed by a datatype, you could write a true anonymous function, and store it as a property to an Object such as:
private Dictionary<string, Function> ops = new Dictionary<string, Function> {
{"foo", int (int a, int b) { return a + b } }
};
As it stands, C# doesn't allow you to write true anonymous functions, and walls that functionality off behind Delegates and Lambda expressions.
#500 Internal server error: I already explained what I was trying to do. I even bolded it. You assume there's any ulterior motive here; I'm simply trying to understand how C# stores a reference to a method. I even provided links to the source code so that others could read the code for themselves and help answer the question.
#Dialecticus: Obviously if I already found the typical answer on Google, the only other place to find the answer I'm looking for would be here. I realize this is outside the knowledge of most C# developers, and that's why I've provided the source code links. You don't have to reply if you don't know the answer.
While I'm not fully understanding your insights about "true anonymous functions", "not prefixed by a data type" etc, I can explain you how applications written in C# call methods.
First of all, there is no such a thing "function" in C#. Each and every executable entity in C# is in fact a method, that means, it belongs to a class. Even if you define lambdas or anonymous functions like this:
collection.Where(item => item > 0);
the C# compiler creates a compiler-generated class behind the scenes and puts the lambda body return item > 0 into a compiler-generated method.
So assuming you have this code:
class Example
{
public static void StaticMethod() { }
public void InstanceMethod() { }
public Action Property { get; } = () => { };
}
static class Program
{
static void Main()
{
Example.StaticMethod();
var ex = new Example();
ex.InstanceMethod();
ex.Property();
}
}
The C# compiler will create an IL code out of that. The IL code is not executable right away, it needs to be run in a virtual machine.
The IL code will contain a class Example with two methods (actually, four - a default constructor and the property getter method will be automatically generated) and a compiler-generated class containing a method whose body is the body of the lambda expression.
The IL code of Main will look like this (simplified):
call void Example::StaticMethod()
newobj instance void Example::.ctor()
callvirt instance void Example::InstanceMethod()
callvirt instance class [mscorlib]System.Action Example::get_Prop()
callvirt instance void [mscorlib]System.Action::Invoke()
Notice those call and callvirt instructions: these are method calls.
To actually execute the called methods, their IL code needs to be compiled into machine code (CPU instructions). This occurs in the virtual machine called .NET Runtime. There are several of them like .NET Framework, .NET Core, Mono etc.
A .NET Runtime contains a JIT (just-in-time) compiler. It converts the IL code to the actually executable code during the execution of your program.
When the .NET Runtime first encounters the IL code "call method StaticMethod from class Example", it first looks in the internal cache of already compiled methods. When there are no matches (which means this is the first call of that method), the Runtime asks the JIT compiler to create such a compiled-and-ready-to-run method using the IL code. The IL code is converted into a sequence of CPU operations and stored in the process' memory. A pointer to that compiled code is stored in the cache for future reuse.
This all will happen behind the call or callvirt IL instructions (again, simplified).
Once this happened, the Runtime is ready to execute the method. The CPU gets the compiled code's first operation address as the next operation to execute and goes on until the code returns. Then, the Runtime takes over again and proceeds with next IL instructions.
The DynamicInvoke method of the delegates does the same thing: it instructs the Runtime to call a method (after some additional arguments checks etc). The "dead end" you mention RuntimeMethodHandle.InvokeMethod is an intrinsic call to the Runtime directly. The parameters of this method are:
object target - the object on which the delegate invokes the instance method (this parameter).
object[] arguments - the arguments to pass to the method.
Signature sig - the actual method to call, Signature is an internal class that provides the connection between the managed IL code and native executable code.
bool constructor - true if this is a constructor call.
So in summary, methods are not represented as objects in C# (while you of course can have a delegate instance that is an object, but it doesn't represent the executable method, it rather provides an invokable reference to it).
Methods are called by the Runtime, the JIT compiler makes the methods executable.
You cannot define a global "function" outside of classes in C#. You could get a direct native pointer to the compiled (jitted) method code and probably even call it manually by directly manipulating own process' memory. But why?
You clearly misunderstand main differences between script languages, C/C++ and C#.
I guess the main difficulty is that there is no such thing as a function in C#. At all.
C#7 introduced the new feature "a local function", but that is not what a function in JS is.
All pieces of code are methods.
That name is intentionally different from function or a procedure to emphasize the fact that all executable code in C# belongs to a class.
Anonymous methods and lambdas are just a syntax sugar.
A compiler will generate a real method in the same (or a nested) class, where the method with anonymous method declaration belongs to.
This simple article explains it. You can take the examples, compile them and check the generated IL code yourself.
So all the methods (anonymous or not) do belong to a class. It's impossible to answer your updated question, besides saying It does not store a reference to a function, as there is no such thing in C#.
How does one store a reference to a method?
Depending on what you mean by reference, it can be either
An instance of MethodInfo class, used to reference reflection information for a method,
RuntimeMethodHandle (obtainable via RuntimeMethodInfo.MethodHandle) stores a real memory pointer to a JITed method code
A delegate, that is very different from just a memory pointer, but logically could be used to "pass a method reference to another method" .
I believe you are looking for the MethodInfo option, it has a MethodInfo.Invoke method which is very much alike Function..apply function in JS. You have already seen in the Delegate source code how that class is used.
If by "reference" you mean the C-style function pointer, it is in RuntimeMethodHandle struct. You should never use it without solid understanding how a particular .Net platform implementation and a C# compiler work.
Hopefully it clarifies things a bit.
A delegate is simply a pointer(memory location to jump to) to a method with the specified parameters and return type. Any Method that matches the signature(Parameters and return type) is eligible to fulfill the role, irrespective of the defined object. Anonymous simply means the delegate is not named.
Most times the type is implied(if it is not you will get a compiler error):
C# is a strongly typed language. That means every expression (including delegates) MUST have a return type(including void) as well as strongly typed parameters(if any). Generics were created to permit explicit types to be used within general contexts, such as Lists.
To put it another way, delegates are the type-safe managed version of C++ callbacks.
Delegates are helpful in eliminating switch statements by allowing the code to jump to the proper handler without testing any conditions.
A delegate is similar to a Closure in Javascript terminology.
In your response to Amy, you are attempting to equate a loosely typed language like JS, and a strongly typed language C#. In C# it is not possible to pass an arbitrary(loosely-typed) function anywhere. Lambdas and delegates are the only way to guarantee type safety.
I would recommend trying F#, if you are looking to pass functions around.
EDIT:
If you are trying to mimic the behavior of Javascipt, I would try looking at using inheritance through Interfaces. I can mimic multiple inheritance, and be type safe at the same time. But, be aware that it cannot fully supplant Javascript's dependency injection model.
As you probably found out C# doesn't have the concept of a function as in your JavaScript example.
C# is a statically typed language and the only way you can use function pointers is by using the built in types (Func,Action) or custom delegates.(I'm talking about safe,strongly typed pointers)
Javascript is a dynamic language that's why you can do what you describe
If you are willing to lose type safety, you can use the "dynamic" features of C# or refection to achieve what you want like in the following examples (Don't do this,use Func/Action)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
namespace ConsoleApp1
{
class Program
{
private static Dictionary<string, Func<int, int, int>> FuncOps = new Dictionary<string, Func<int, int, int>>
{
{"add", (a, b) => a + b},
{"subtract", (a, b) => a - b}
};
//There are no anonymous delegates
//private static Dictionary<string, delegate> DelecateOps = new Dictionary<string, delegate>
//{
// {"add", delegate {} }
//};
private static Dictionary<string, dynamic> DynamicOps = new Dictionary<string, dynamic>
{
{"add", new Func<int, int, int>((a, b) => a + b)},
{"subtract", new Func<int, int, int>((a, b) => a - b)},
{"inverse", new Func<int, int>((a) => -a )} //Can't do this with Func
};
private static Dictionary<string, MethodInfo> ReflectionOps = new Dictionary<string, MethodInfo>
{
{"abs", typeof(Math).GetMethods().Single(m => m.Name == "Abs" && m.ReturnParameter.ParameterType == typeof(int))}
};
static void Main(string[] args)
{
Console.WriteLine(FuncOps["add"](3, 2));//5
Console.WriteLine(FuncOps["subtract"](3, 2));//1
Console.WriteLine(DynamicOps["add"](3, 2));//5
Console.WriteLine(DynamicOps["subtract"](3, 2));//1
Console.WriteLine(DynamicOps["inverse"](3));//-3
Console.WriteLine(ReflectionOps["abs"].Invoke(null, new object[] { -1 }));//1
Console.ReadLine();
}
}
}
one more example that you shouldn't use
delegate object CustomFunc(params object[] paramaters);
private static Dictionary<string, CustomFunc> CustomParamsOps = new Dictionary<string, CustomFunc>
{
{"add", parameters => (int) parameters[0] + (int) parameters[1]},
{"subtract", parameters => (int) parameters[0] - (int) parameters[1]},
{"inverse", parameters => -((int) parameters[0])}
};
Console.WriteLine(CustomParamsOps["add"](3, 2)); //5
Console.WriteLine(CustomParamsOps["subtract"](3, 2)); //1
Console.WriteLine(CustomParamsOps["inverse"](3)); //-3
I will provide a really short and simplified answer compared to the others. Everything in C# (classes, variables, properties, structs, etc) has a backed with tons of things your programs can hook into. This network of backend stuff slightly lowers the speed of C# when compared to "deeper" languages like C++, but also gives programmers a lot more tools to work with and makes the language easier to use. In this backend is included things like "garbage collection," which is a feature that automatically deletes objects from memory when there are no variables left that reference them. Speaking of reference, the whole system of passing objects by reference, which is default in C#, is also managed in the backend. In C#, Delegates are possible because of features in this backend that allow for something called "reflection."
From Wikipedia:
Reflection is the ability of a computer program to examine,
introspect, and modify its own structure and behavior at runtime.
So when C# compiles and it finds a Delegate, it is just going to make a function, and then store a reflective reference to that function in the variable, allowing you to pass it around and do all sorts of cool stuff with it. You aren't actually storing the function itself in the variable though, you are storing a reference, which is kinda like an address that points you to where the function is stored in RAM.
I’m trying to pass a COM object from C# code to Perl.
At the moment I’m wrapping my Perl code with PerlNET (PDK 9.4; ActiveState) and I have defined a simple subroutine (+ required pod declaration) in Perl to pass objects from C# to the wrapped Perl module.
It seems that the objects I pass are not recognized correctly as COM objects.
An example:
In C# (.NET 4.0), the ScriptControl is used to load a simple class from a file written in VBScript.
var host = new ScriptControl();
host.Language = "VBScript";
var text = File.ReadAllText("TestScript.vbs");
host.AddCode(text);
dynamic obj = host.Run("GetTestClass");
What I get (obj) is of type System.__ComObject. When I pass it to my Perl/PerlNET assembly and try to call method Xyz() in Perl I get the following (runtime) exception:
Can't locate public method Xyz() for System.__ComObject
If, however, I do more or less the same thing in Perl, it works. (In the following case, passing only the contents of my .vbs file as parameter.)
I can even use the script control :
sub UseScriptControl {
my ($self, $text) = #_;
my $script = Win32::OLE->new('ScriptControl');
$script->{Language} = 'VBScript';
$script->AddCode($text);
my $obj = $script->Run('GetTestClass');
$obj->Xyz();
}
Now, calling Xyz() on obj works fine (using Win32::OLE).
In both cases I use:
use strict;
use Win32;
use Win32::OLE::Variant;
Another approach:
I can invoke methods by using InvokeMember of class System.Type if I specify exactly which overload I want to use and which types I’m passing:
use PerlNET qw(typeof);
typeof($obj)->InvokeMember("Xyz",
PerlNET::enum("System.Reflection.BindingFlags.InvokeMethod"),
PerlNET::null("System.Reflection.Binder"),
$obj,
"System.Object[]"->new());
Using this approach would mean rewriting the whole wrapped Perl module. And using this syntax..
Now I am wondering if I am losing both the advantages of the dynamic keyword in .NET 4.0 and the dynamic characteristics of Perl (with Win32::OLE) by using PerlNET with COM objects.
It seems like my preferred solution boils down to some way of mimicking the behaviour of the dynamic keyword in C#/.NET 4.0.
Or, better, finding some way of converting the passed COM object to something that will be recognized as compatible with Win32::OLE. Maybe extract some information of the __ComObject for it to be identified correctly as COM object.
I have to add that I posted to the PDK discussion site too (but didn’t get any response yet): http://community.activestate.com/node/18247
I also posted it to PerlMonks - as I'm not quite sure if this is more a Perl or C#/.NET question:
http://www.perlmonks.org/?node_id=1146244
I would greatly appreciate any help - or advise on where to look further.
I am trying to learn how to write a simple scripting language on top of DLR, by playing with a very old DLR example called ToyScript. However ToyScript does not seem to support the following structure of a script, which I would like to use in my implementation :
print b()
def b() {
return 1
}
It raises an exception, exactly as in most statically compiled languages.'
If the script follows a "static languages paradigm" :
def b() {
return 1
}
print b()
ToyScript works without problems.
My question is : how the former should be done in DLR ?
[Obviously I am looking for a description of a solution, and not for a solution itself :)]
There are a few possible implementations. The first is to require an execution to create a function. With this way, you cannot invoke a function before a function is created with an execution. The second way is to create the all the functions when you parse the code and execute the global scripts. With this way, function declaration can appear anywhere in the code because the functions are already created before any execution. The draw back is that you need to create all the functions no matter you invoke them or not. Then there is an in-between way; when you parse the code for the first time, you store the abstract syntax tree (AST) of the functions in a function table. Then when you want to invoke a function, look for the function declaration in the function table and then compile or interpret from the AST. Compare the following two JavaScript snippets and you will have a good idea.
console.log(b());
function b() {
return 1;
}
and
console.log(b());
var b = function() {
return 1;
}
public void Finalise()
ProcessFinalisation(true);
Doesn't compile, but the correct version:
public void Finalise()
{
ProcessFinalisation(true);
}
Compiles fine (of course).
If I am allowed if's without brackets when the following code has only one line:
if(true)
CallMethod();
Why is the same not allowed for methods with one following line? Is there a technical reason?
The obvious answer is the language spec; for reasoning... I guess mainly simplicity - it just wasn't worth the overhead of sanity-checking the spec and compiler for the tiny tiny number of single-statement methods. In particular, I can potentially see issues with generic constraints, etc (i.e. where T : IBlah, new() on the end of the signature).
Note that not using the braces can sometimes lead to ambiguities, and in some places is frowned upon. I'm a bit more pragmatic than that personally, but each to their own.
It might also be of interest that C# inside razor does not allow usage without explicit braces. At all (i.e. even for if etc).
Marc is basically right. To expand on his answer a bit: there are a number of places where C# requires a braced block of statements rather than allowing a "naked" statement. They are:
the body of a method, constructor, destructor, property accessor, event accessor or indexer accessor.
the block of a try, catch, finally, checked, unchecked or unsafe region.
the block of a statement lambda or anonymous method
the block of an if or loop statement if the block directly contains a local variable declaration. (That is, "while (x != 10) int y = 123;" is illegal; you've got to brace the declaration.)
In each of these cases it would be possible to come up with an unambiguous grammar (or heuristics to disambiguate an ambiguous grammar) for the feature where a single unbraced statement is legal. But what would the point be? In each of those situations you are expecting to see multiple statements; single statements are the rare, unlikely case. It seems like it is not realy worth it to make the grammar unambiguous for these very unlikely cases.
Since C# 6.0 you can declare:
void WriteToConsole(string word) => Console.WriteLine(word)
And then call it as usual:
public static void Main()
{
var word = "Hello World!";
WriteToConsole(word);
}
Short answer: C# is styled after C, and C mandates that functions be braced because of how C function declarations used to be.
Long version with history: Back in K&R C, functions were declared like this:
int function(arg1, arg2)
int arg1;
int arg2;
{ code }
Of course you couldn't have unbraced functions in that arrangement. ANSI C mandated the syntax we all know and love:
int function(int arg1, int arg2)
{ code }
but didn't allow unbraced functions, because they would cause havoc with older compilers that only knew K&R syntax [and support for K&R declarations was still required].
Time went on, and years later C# was designed around C [or C++, same difference in terms of syntax] and, since C didn't allow unbraced functions, neither did C#.
Are these two essentially the same thing? They look very similar to me.
Did lambda expression borrow its idea from Ruby?
Ruby actually has 4 constructs that are all extremely similar
The Block
The idea behind blocks is sort of a way to implement really light weight strategy patterns. A block will define a coroutine on the function, which the function can delegate control to with the yield keyword. We use blocks for just about everything in ruby, including pretty much all the looping constructs or anywhere you would use using in c#. Anything outside the block is in scope for the block, however the inverse is not true, with the exception that return inside the block will return the outer scope. They look like this
def foo
yield 'called foo'
end
#usage
foo {|msg| puts msg} #idiomatic for one liners
foo do |msg| #idiomatic for multiline blocks
puts msg
end
Proc
A proc is basically taking a block and passing it around as a parameter. One extremely interesting use of this is that you can pass a proc in as a replacement for a block in another method. Ruby has a special character for proc coercion which is &, and a special rule that if the last param in a method signature starts with an &, it will be a proc representation of the block for the method call. Finally, there is a builtin method called block_given?, which will return true if the current method has a block defined. It looks like this
def foo(&block)
return block
end
b = foo {puts 'hi'}
b.call # hi
To go a little deeper with this, there is a really neat trick that rails added to Symbol (and got merged into core ruby in 1.9). Basically, that & coercion does its magic by calling to_proc on whatever it is next to. So the rails guys added a Symbol#to_proc that would call itself on whatever is passed in. That lets you write some really terse code for any aggregation style function that is just calling a method on every object in a list
class Foo
def bar
'this is from bar'
end
end
list = [Foo.new, Foo.new, Foo.new]
list.map {|foo| foo.bar} # returns ['this is from bar', 'this is from bar', 'this is from bar']
list.map &:bar # returns _exactly_ the same thing
More advanced stuff, but imo that really illustrates the sort of magic you can do with procs
Lambdas
The purpose of a lambda is pretty much the same in ruby as it is in c#, a way to create an inline function to either pass around, or use internally. Like blocks and procs, lambdas are closures, but unlike the first two it enforces arity, and return from a lambda exits the lambda, not the containing scope. You create one by passing a block to the lambda method, or to -> in ruby 1.9
l = lambda {|msg| puts msg} #ruby 1.8
l = -> {|msg| puts msg} #ruby 1.9
l.call('foo') # => foo
Methods
Only serious ruby geeks really understand this one :) A method is a way to turn an existing function into something you can put in a variable. You get a method by calling the method function, and passing in a symbol as the method name. You can re bind a method, or you can coerce it into a proc if you want to show off. A way to re-write the previous method would be
l = lambda &method(:puts)
l.call('foo')
What is happening here is that you are creating a method for puts, coercing it into a proc, passing that in as a replacement for a block for the lambda method, which in turn returns you the lambda
Feel free to ask about anything that isn't clear (writing this really late on a weeknight without an irb, hopefully it isn't pure gibberish)
EDIT: To address questions in the comments
list.map &:bar Can I use this syntax
with a code block that takes more than
one argument? Say I have hash = { 0 =>
"hello", 1 => "world" }, and I want to
select the elements that has 0 as the
key. Maybe not a good example. – Bryan
Shen
Gonna go kind of deep here, but to really understand how it works you need to understand how ruby method calls work.
Basically, ruby doesn't have a concept of invoking a method, what happens is that objects pass messages to each other. The obj.method arg syntax you use is really just sugar around the more explicit form, which is obj.send :method, arg, and is functionally equivalent to the first syntax. This is a fundamental concept in the language, and is why things like method_missing and respond_to? make sense, in the first case you are just handling an unrecognized message, the second you are checking to see if it is listening for that message.
The other thing to know is the rather esoteric "splat" operator, *. Depending on where its used, it actually does very different things.
def foo(bar, *baz)
In a method call, if it is the last parameter, splat will make that parameter glob up all additional parameters passed in to the function (sort of like params in C#)
obj.foo(bar, *[biz, baz])
When in a method call (or anything else that takes argument lists), it will turn an array into a bare argument list. The snippet below is equivilent to the snippet above.
obj.foo(bar, biz, baz)
Now, with send and * in mind, Symbol#to_proc is basically implemented like this
class Symbol
def to_proc
Proc.new { |obj, *args| obj.send(self, *args) }
end
end
So, &:sym is going to make a new proc, that calls .send :sym on the first argument passed to it. If any additional args are passed, they are globbed up into an array called args, and then splatted into the send method call.
I notice that & is used in three
places: def foo(&block), list.map
&:bar, and l = lambda &method(:puts).
Do they share the same meaning? –
Bryan Shen
Yes, they do. An & will call to_proc on what ever it is beside. In the case of the method definition it has a special meaning when on the last parameter, where you are pulling in the co-routine defined as a block, and turning that into a proc. Method definitions are actually one of the most complex parts of the language, there are a huge amount of tricks and special meanings that can be in the parameters, and the placement of the parameters.
b = {0 => "df", 1 => "kl"} p b.select
{|key, value| key.zero? } I tried to
transform this to p b.select &:zero?,
but it failed. I guess that's because
the number of parameters for the code
block is two, but &:zero? can only
take one param. Is there any way I can
do that? – Bryan Shen
This should be addressed earlier, unfortunately you can't do it with this trick.
"A method is a way to turn an existing
function into something you can put in
a variable." why is l = method(:puts)
not sufficient? What does lambda &
mean in this context? – Bryan Shen
That example was exceptionally contrived, I just wanted to show equivalent code to the example before it, where I was passing a proc to the lambda method. I will take some time later and re-write that bit, but you are correct, method(:puts) is totally sufficient. What I was trying to show is that you can use &method(:puts) anywhere that would take a block. A better example would be this
['hello', 'world'].each &method(:puts) # => hello\nworld
l = -> {|msg| puts msg} #ruby 1.9:
this doesn't work for me. After I
checked Jörg's answer, I think it
should be l = -> (msg) {puts msg}. Or
maybe i'm using an incorrect version
of Ruby? Mine is ruby 1.9.1p738 –
Bryan Shen
Like I said in the post, I didn't have an irb available when I was writing the answer, and you are right, I goofed that (spend the vast majority of my time in 1.8.7, so I am not used to the new syntax yet)
There is no space between the stabby bit and the parens. Try l = ->(msg) {puts msg}. There was actually a lot of resistance to this syntax, since it is so different from everything else in the language.
C# vs. Ruby
Are these two essentially the same thing? They look very similar to me.
They are very different.
First off, lambdas in C# do two very different things, only one of which has an equivalent in Ruby. (And that equivalent is, surprise, lambdas, not blocks.)
In C#, lambda expression literals are overloaded. (Interestingly, they are the only overloaded literals, as far as I know.) And they are overloaded on their result type. (Again, they are the only thing in C# that can be overloaded on its result type, methods can only be overloaded on their argument types.)
C# lambda expression literals can either be an anonymous piece of executable code or an abstract representation of an anonymous piece of executable code, depending on whether their result type is Func / Action or Expression.
Ruby doesn't have any equivalent for the latter functionality (well, there are interpreter-specific non-portable non-standardized extensions). And the equivalent for the former functionality is a lambda, not a block.
The Ruby syntax for a lambda is very similar to C#:
->(x, y) { x + y } # Ruby
(x, y) => { return x + y; } // C#
In C#, you can drop the return, the semicolon and the curly braces if you only have a single expression as the body:
->(x, y) { x + y } # Ruby
(x, y) => x + y // C#
You can leave off the parentheses if you have only one parameter:
-> x { x } # Ruby
x => x // C#
In Ruby, you can leave off the parameter list if it is empty:
-> { 42 } # Ruby
() => 42 // C#
An alternative to using the literal lambda syntax in Ruby is to pass a block argument to the Kernel#lambda method:
->(x, y) { x + y }
lambda {|x, y| x + y } # same thing
The main difference between those two is that you don't know what lambda does, since it could be overridden, overwritten, wrapped or otherwise modified, whereas the behavior of literals cannot be modified in Ruby.
In Ruby 1.8, you can also use Kernel#proc although you should probably avoid that since that method does something different in 1.9.
Another difference between Ruby and C# is the syntax for calling a lambda:
l.() # Ruby
l() // C#
I.e. in C#, you use the same syntax for calling a lambda that you would use for calling anything else, whereas in Ruby, the syntax for calling a method is different from the syntax for calling any other kind of callable object.
Another difference is that in C#, () is built into the language and is only available for certain builtin types like methods, delegates, Actions and Funcs, whereas in Ruby, .() is simply syntactic sugar for .call() and can thus be made to work with any object by just implementing a call method.
procs vs. lambdas
So, what are lambdas exactly? Well, they are instances of the Proc class. Except there's a slight complication: there are actually two different kinds of instances of the Proc class which are subtly different. (IMHO, the Proc class should be split into two classes for the two different kinds of objects.)
In particular, not all Procs are lambdas. You can check whether a Proc is a lambda by calling the Proc#lambda? method. (The usual convention is to call lambda Procs "lambdas" and non-lambda Procs just "procs".)
Non-lambda procs are created by passing a block to Proc.new or to Kernel#proc. However, note that before Ruby 1.9, Kernel#proc creates a lambda, not a proc.
What's the difference? Basically, lambdas behave more like methods, procs behave more like blocks.
If you have followed some of the discussions on the Project Lambda for Java 8 mailinglists, you might have encountered the problem that it is not at all clear how non-local control-flow should behave with lambdas. In particular, there are three possible sensible behaviors for return (well, three possible but only two are really sensible) in a lambda:
return from the lambda
return from the method the lambda was called from
return from the method the lambda was created in
That last one is a bit iffy, since in general the method will have already returned, but the other two both make perfect sense, and neither is more right or more obvious than the other. The current state of Project Lambda for Java 8 is that they use two different keywords (return and yield). Ruby uses the two different kinds of Procs:
procs return from the calling method (just like blocks)
lambdas return from the lambda (just like methods)
They also differ in how they handle argument binding. Again, lambdas behave more like methods and procs behave more like blocks:
you can pass more arguments to a proc than there are parameters, in which case the excess arguments will be ignored
you can pass less arguments to a proc than there are parameters, in which case the excess parameters will be bound to nil
if you pass a single argument which is an Array (or responds to to_ary) and the proc has multiple parameters, the array will be unpacked and the elements bound to the parameters (exactly like they would in the case of destructuring assignment)
Blocks: lightweight procs
A block is essentially a lightweight proc. Every method in Ruby has exactly one block parameter, which does not actually appear in its parameter list (more on that later), i.e. is implicit. This means that on every method call you can pass a block argument, whether the method expects it or not.
Since the block doesn't appear in the parameter list, there is no name you can use to refer to it. So, how do you use it? Well, the only two things you can do (not really, but more on that later) is call it implicitly via the yield keyword and check whether a block was passed via block_given?. (Since there is no name, you cannot use the call or nil? methods. What would you call them on?)
Most Ruby implementations implement blocks in a very lightweight manner. In particular, they don't actually implement them as objects. However, since they have no name, you cannot refer to them, so it's actually impossible to tell whether they are objects or not. You can just think of them as procs, which makes it easier since there is one less different concept to keep in mind. Just treat the fact that they aren't actually implemented as blocks as a compiler optimization.
to_proc and &
There is actually a way to refer to a block: the & sigil / modifier / unary prefix operator. It can only appear in parameter lists and argument lists.
In a parameter list, it means "wrap up the implicit block into a proc and bind it to this name". In an argument list, it means "unwrap this Proc into a block".
def foo(&bar)
end
Inside the method, bar is now bound to a proc object that represents the block. This means for example that you can store it in an instance variable for later use.
baz(&quux)
In this case, baz is actually a method which takes zero arguments. But of course it takes the implicit block argument which all Ruby methods take. We are passing the contents of the variable quux, but unroll it into a block first.
This "unrolling" actually works not just for Procs. & calls to_proc on the object first, to convert it to a proc. That way, any object can be converted into a block.
The most widely used example is Symbol#to_proc, which first appeared sometime during the late 90s, I believe. It became popular when it was added to ActiveSupport from where it spread to Facets and other extension libraries. Finally, it was added to the Ruby 1.9 core library and backported to 1.8.7. It's pretty simple:
class Symbol
def to_proc
->(recv, *args) { recv.send self, *args }
end
end
%w[Hello StackOverflow].map(&:length) # => [5, 13]
Or, if you interpret classes as functions for creating objects, you can do something like this:
class Class
def to_proc
-> *args { new *args }
end
end
[1, 2, 3].map(&Array) # => [[nil], [nil, nil], [nil, nil, nil]]
Methods and UnboundMethods
Another class to represent a piece of executable code, is the Method class. Method objects are reified proxies for methods. You can create a Method object by calling Object#method on any object and passing the name of the method you want to reify:
m = 'Hello'.method(:length)
m.() #=> 5
or using the method reference operator .::
m = 'Hello'.:length
m.() #=> 5
Methods respond to to_proc, so you can pass them anywhere you could pass a block:
[1, 2, 3].each(&method(:puts))
# 1
# 2
# 3
An UnboundMethod is a proxy for a method that hasn't been bound to a receiver yet, i.e. a method for which self hasn't been defined yet. You cannot call an UnboundMethod, but you can bind it to an object (which must be an instance of the module you got the method from), which will convert it to a Method.
UnboundMethod objects are created by calling one of the methods from the Module#instance_method family, passing the name of the method as an argument.
u = String.instance_method(:length)
u.()
# NoMethodError: undefined method `call' for #<UnboundMethod: String#length>
u.bind(42)
# TypeError: bind argument must be an instance of String
u.bind('Hello').() # => 5
Generalized callable objects
Like I already hinted at above: there's not much special about Procs and Methods. Any object that responds to call can be called and any object that responds to to_proc can be converted to a Proc and thus unwrapped into a block and passed to a method which expects a block.
History
Did lambda expression borrow its idea from Ruby?
Probably not. Most modern programming languages have some form of anonymous literal block of code: Lisp (1958), Scheme, Smalltalk (1974), Perl, Python, ECMAScript, Ruby, Scala, Haskell, C++, D, Objective-C, even PHP(!). And of course, the whole idea goes back to Alonzo Church's λ-calculus (1935 and even earlier).
Not exactly. But they're very similar. The most obvious difference is that in C# a lambda expression can go anywhere where you might have a value that happens to be a function; in Ruby you only have one code block per method call.
They both borrowed the idea from Lisp (a programming language dating back to the late 1950s) which in turn borrowed the lambda concept from Church's Lambda Calculus, invented in the 1930s.