Can someone please break down what a delegate is into a simple, short and terse explanation that encompasses both the purpose and general benefits? I've tried to wrap my head around this and it's just not sinking in.
I have a function:
public long GiveMeTwoTimesTwo()
{
return 2 * 2;
}
This function sucks. What if I want 3 * 3?
public long GiveMeThreeTimesThree()
{
return 3 * 3;
}
Too much typing. I'm lazy!
public long SquareOf(int n)
{
    // cast before multiplying so a large n doesn't overflow int arithmetic
    return (long)n * n;
}
My SquareOf function doesn't care what n is. It will operate properly for any n passed in. It doesn't know exactly what number n is, but it does know that n is an integer. You can't pass "Haha not an integer" into SquareOf.
Here's another function:
public void DoSomethingRad()
{
int x = 4;
long y = SquareOf(x);
Console.WriteLine(y);
}
Contrary to its name, DoSomethingRad doesn't actually do anything rad. However, it does write SquareOf(4), which is 16. Can we change it to be less boring?
public void DoSomethingRad(int numberToSquare)
{
long y = SquareOf(numberToSquare);
Console.WriteLine(y);
}
DoSomethingRad is clearly still pretty fail. But at least now we can pass in a number to square, so it won't write 16 every time. (It'll write 1, or 4, or 9, or 16, or... zzzz still kinda boring).
It'd be nice if there was a way to change what happens to the number passed in. Maybe we don't want to square it; maybe we want to cube it, or subtract it from 69 (number chosen at random from my head).
On further inspection, it seems as though the only part of SquareOf that DoSomethingRad cares about is that we can give it an integer (numberToSquare) and that it gives us a long (because we put its return value in y and y is a long).
public long CubeOf(int n)
{
    // cast first so the multiplications happen in long arithmetic
    return (long)n * n * n;
}
public void DoSomethingLeet(int numberToSquare)
{
long y = CubeOf(numberToSquare);
Console.WriteLine(y);
}
See how similar DoSomethingLeet is to DoSomethingRad? If only there was a way to pass in behavior (DoX()) instead of just data (int n)...
So now if we want to write a square of a number, we can DoSomethingRad and if we want to write the cube of a number, we can DoSomethingLeet. So if we want to write the number subtracted from 69, do we have to make another method, DoSomethingCool? No, because that takes too damn much typing (and more importantly, it hinders our ability to alter interesting behavior by changing only one aspect of our program).
So we arrive at:
public void Radlicious(int doSomethingToMe, Func<int, long> doSomething)
{
long y = doSomething(doSomethingToMe);
Console.WriteLine(y);
}
We can call this method by writing this:
Radlicious(77, SquareOf);
Func<int, long> is a special kind of delegate. It stores behavior that accepts integers and spits out longs. We're not sure what the method it points to is going to do with any given integer we pass; all we know is that, whatever happens, we are going to get a long back.
We don't have to give any parameters to SquareOf because Func<int, long> describes behavior, not data. Calling Radlicious(77, SquareOf) just gives Radlicious the general behavior of SquareOf ("I take a number and return its square"), not what SquareOf will do to any specific integer.
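Any behavior with the same shape fits, including the cube and subtract-from-69 ideas from earlier (a hedged aside: lambdas like the second call need C# 3 or later):

Radlicious(77, CubeOf);         // writes 456533
Radlicious(77, n => 69L - n);   // writes -8 (there's our subtract-from-69 idea)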
Now if you have understood what I am saying, then you have already one-upped me, for I myself don't really get this stuff.
* END ANSWER, BEGIN WANDERING IDIOCY *
I mean, it seems like ints could be perceived as just really boring behavior:
static int Nine()
{
return 9;
}
That said, the line between data and behavior appears to blur: what is normally perceived as data is really just boring-ass behavior.
Of course, one could imagine super "interesting" behavior, that takes all sorts of abstract parameters, but requires a ton of information to be able to call it. What if it required us to provide the source code that it would compile and run for us?
Well, then our abstraction seems to have gotten us all the way back to square one. We have behavior so abstract it requires the entire source code of our program to determine what it's going to do. This is fully indeterminate behavior: the function can do anything, but it has to be provided with everything to determine what it does. On the other hand, fully determinate behavior, such as Nine(), doesn't need any additional information, but can't do anything other than return 9.
So what? I don't know.
In the simplest possible terms, it's essentially a pointer to a method.
You can have a variable that holds a delegate type (just like you would have an int variable that can hold an int type). You can execute the method that the delegate points to by simply calling your variable like a function.
This allows you to have variable functions just like you might have variable data. Your object can accept delegates from other objects and call them, without having to define all the possible functions itself.
This comes in very handy when you want an object to do things based on user specified criteria. For example, filtering a list based on a user-defined true/false expression. You can let the user specify the delegate function to use as a filter to evaluate each list item against.
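For instance, here is a minimal sketch of that filtering idea (all names invented for illustration):

using System;
using System.Collections.Generic;

class FilterDemo
{
    // The caller supplies the true/false test as a delegate.
    static List<int> Filter(List<int> items, Func<int, bool> test)
    {
        var result = new List<int>();
        foreach (int item in items)
        {
            if (test(item))            // invoking the delegate like a function
                result.Add(item);
        }
        return result;
    }

    static void Main()
    {
        var numbers = new List<int> { 1, 5, 10, 15 };
        var bigOnes = Filter(numbers, n => n >= 10);   // user-specified criteria
        Console.WriteLine(string.Join(", ", bigOnes)); // 10, 15
    }
}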
A delegate is a pointer to a method. You can then use your delegate as a parameter of other methods.
Here is a link to a simple tutorial.
The question I had was 'So, why would I want to do that?' You won't really 'get it' until you solve a programming problem with them.
It's interesting that no-one has mentioned one of the key benefits of delegation - it's preferable to sub-classing when you realise that inheritance is not a magic bullet and usually creates more problems than it solves. It is the basis of many design patterns, most notably the strategy pattern.
A delegate instance is a reference to a method. The reason they are useful is that you can create a delegate that is tied to a particular method on a particular instance of a type. The delegate instance allows you to invoke that method on that particular instance even if the object on which you will invoke the method has left your lexical scope.
The most common use for delegate instances like this is to support the concept of callbacks at the language level.
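A hedged sketch of that callback idea (the class and method names are hypothetical):

using System;

class Worker
{
    private Action<string> _callback; // delegate stored for later use

    public void Start(Action<string> callback)
    {
        _callback = callback; // the caller's method may leave scope now...
    }

    public void Finish()
    {
        _callback("done");    // ...but we can still invoke it here
    }
}

class Program
{
    static void Main()
    {
        var w = new Worker();
        w.Start(msg => Console.WriteLine("Callback got: " + msg));
        w.Finish(); // prints "Callback got: done"
    }
}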
It simply references a method. Delegates come in very handy when working across threads.
Here is an example right out of my code.
//Start our advertisement thread
rotator = new Thread(initRotate);
rotator.Priority = ThreadPriority.Lowest;
rotator.Start();
#region Ad Rotation
private delegate void ad();
private void initRotate()
{
ad ad = new ad(adHelper);
while (true)
{
this.Invoke(ad); // marshal the adHelper call onto the UI thread
Thread.Sleep(30000);
}
}
private void adHelper()
{
List<string> tmp = Lobby.AdRotator.RotateAd();
picBanner.ImageLocation = tmp[0].ToString();
picBanner.Tag = tmp[1].ToString();
}
#endregion
If you didn't use a delegate, you wouldn't be able to cross the thread boundary and call the Lobby.AdRotator function.
Like others have said, a delegate is a reference to a function. One of the more beneficial uses (IMO) is events. When you subscribe to an event you register a function for the event to invoke, and delegates are perfect for this task.
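A minimal sketch of that event wiring (the Button type here is invented, not the WinForms one):

using System;

class Button
{
    // An event is backed by a delegate type; EventHandler is built into .NET.
    public event EventHandler Clicked;

    public void SimulateClick()
    {
        if (Clicked != null)
            Clicked(this, EventArgs.Empty); // invoke every registered delegate
    }
}

class Program
{
    static void Main()
    {
        var button = new Button();
        button.Clicked += OnClicked; // registering stores a delegate to OnClicked
        button.SimulateClick();      // prints "Button was clicked"
    }

    static void OnClicked(object sender, EventArgs e)
    {
        Console.WriteLine("Button was clicked");
    }
}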
In the most basic terms, a delegate is just a variable that contains (a reference to) a function. Delegates are useful because they allow you to pass a function around as a variable without any concern for "where" the function actually came from.
It's important to note, of course, that the function isn't being copied when it's being bundled up in a variable; it's just being bound by reference. For example:
class Foo
{
public string Bar
{
get;
set;
}
public void Baz()
{
Console.WriteLine(Bar);
}
}
Foo foo = new Foo();
Action someDelegate = foo.Baz;

foo.Bar = "Hello, world";
someDelegate(); // Produces "Hello, world".
In most simplest terms, the responsibility to execute a method is delegated to another object. Say the president of some nation dies and the president of USA is supposed to be present for the funeral with condolences message. If the president of USA is not able to go, he will delegate this responsibility to someone either the vice-president or the secretary of the state.
Same goes in code. A delegate is a type, it is an object which is capable of executing the method.
For example:
class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public string GetPersonName(Person person)
    {
        return person.FirstName + person.LastName;
    }

    // Calling the method without the use of a delegate
    public void PrintName()
    {
        Console.WriteLine(GetPersonName(this));
    }

    // Using a delegate:
    // declare a delegate type which matches the method's signature
    public delegate string personNameDelegate(Person person);

    public void PrintNameUsingDelegate()
    {
        // instantiate
        personNameDelegate pnd = new personNameDelegate(GetPersonName);
        // invoke
        Console.WriteLine(pnd(this));
    }
}
The GetPersonName method is called through the personNameDelegate instance (pnd above).
Alternatively, we can have the PrintNameUsingDelegate method take a delegate as a parameter.
public void PrintNameUsingDelegate(personNameDelegate pnd, Person person)
{
    Console.WriteLine(pnd(person));
}
The advantage is that if someone wants to print the name as lastname_firstname, they just have to wrap that method in a personNameDelegate and pass it to this function. No further code change is required.
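For instance, a sketch of such a wrapper (assuming the FirstName/LastName properties from the example above; somePerson is hypothetical):

// Another method matching the personNameDelegate signature.
public string GetPersonNameReversed(Person person)
{
    return person.LastName + "_" + person.FirstName;
}

// Same printing method, different behavior - no other changes needed:
// PrintNameUsingDelegate(GetPersonNameReversed, somePerson);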
Delegates are specifically important in:
Events
Asynchronous calls
LINQ (as lambda expressions; see the sketch below)
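For instance, every lambda you hand to LINQ is compiled to a delegate (a minimal sketch):

using System;
using System.Linq;

class LinqDelegates
{
    static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5 };

        // The lambda n => n % 2 == 0 becomes a Func<int, bool> delegate
        // that Where invokes for each element.
        var evens = numbers.Where(n => n % 2 == 0);

        Console.WriteLine(string.Join(", ", evens)); // 2, 4
    }
}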
If you were going to delegate a task to someone, the delegate would be the person who receives the work.
In programming, it's a reference to the block of code which actually knows how to do something. Often this is a pointer to the function or method which will handle some item.
In the simplest terms I can come up with: a delegate pushes the burden of work into the hands of a class that pretty much knows what to do. Think of it as a kid who doesn't want to grow up to be exactly like his big brother but still needs his guidance and orders. Instead of inheriting all the methods from his brother (i.e. subclassing), he just makes his brother do the work, or does something that requires actions to be taken by the big brother. When you get into protocols, the big brother defines what is absolutely required, or he might give you the flexibility to choose what you want to make him do in certain events (i.e. informal and formal protocols as outlined in Objective-C).
The absolute benefit of this concept is that you do not need to create a subclass. If you want something to fall in line and follow orders when an event happens, the delegate allows an existing class to hold its hand and give orders if necessary.
I have written an application in C# which at present includes nearly 400 mathematical functions, each with its own integer ID. To access function code, my application currently feeds the function's ID to a huge 'switch' block which has one 'case' block per function: total, nearly 400 'case' blocks. This seems to work seamlessly and quite fast, but is there a more efficient way of matching code blocks to function identifiers?
Sounds like you want a Dictionary<string, Func<double, double>> or something similar...
// Better, use an *enum* of function IDs...
private static readonly Dictionary<int, Func<double, double>> Functions =
new Dictionary<int, Func<double, double>>
{
{ 0, Math.Sqrt },
{ 100, x => x * 2 },
...
};
Note that just as this refers to Math.Sqrt, it could also refer to your own methods - you don't have to put all the logic in the collection initializer. I would probably break any function which wasn't just a single expression into a method.
You'd then apply it like this:
public double Apply(int functionId, double value)
{
Func<double, double> function;
if (!Functions.TryGetValue(functionId, out function))
{
// Throw an exception or whatever
}
return function(value);
}
Note that I've used a dictionary here to be general purpose; if the function IDs are 0...n, you could use an array instead.
As noted in the comment at the start, it would be neater to use an enum of function IDs than just integers.
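A hedged sketch of that enum version (the enum members are invented; substitute your real function set):

// Hypothetical function IDs.
public enum FunctionId
{
    Sqrt = 0,
    Double = 100,
}

private static readonly Dictionary<FunctionId, Func<double, double>> Functions =
    new Dictionary<FunctionId, Func<double, double>>
    {
        { FunctionId.Sqrt, Math.Sqrt },
        { FunctionId.Double, x => x * 2 },
    };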
A better solution would be to encapsulate each function in its own class and then use the strategy pattern to swap in the function required.
Having each function in its own class has the advantage that it is simpler to see the code for the function in one place and maintains good separation of concerns, with each class only responsible for the code to implement that function. It will make testing those functions easier as well.
It also would mean that adding functions in the future could be more flexible, as you would have the opportunity of adding functions from external sources without having to recompile.
Another benefit of this approach could be that as the functions would all implement an interface, you could scan your assemblies for those implementations at load time and thus automatically have new functions added to the list of available functions without any extra effort.
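A minimal sketch of that strategy-pattern arrangement (the interface and class names are invented):

using System;
using System.Collections.Generic;

// Each math function gets its own class implementing this interface.
public interface IMathFunction
{
    int Id { get; }
    double Evaluate(double x);
}

public class SquareFunction : IMathFunction
{
    public int Id { get { return 1; } }
    public double Evaluate(double x) { return x * x; }
}

// The registry swaps functions in and out by ID.
public class FunctionRegistry
{
    private readonly Dictionary<int, IMathFunction> _functions =
        new Dictionary<int, IMathFunction>();

    public void Register(IMathFunction f) { _functions[f.Id] = f; }

    public double Apply(int id, double x) { return _functions[id].Evaluate(x); }
}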
As far as I know switch statements that contain more than a few items are already optimized by the compiler to use a lookup table internally so the performance gain by switching to e.g. a Dictionary should be minimal. As always though, you should measure both approaches to make a decision.
Further, regarding the potential for using a Dictionary: this fellow observes that having six or more string cases and a default case results in the compiler generating a Dictionary internally. So, why not just use a Dictionary directly and skip that conversion during compilation?
It sounds like you are writing your code like this (pseudocode):
function doMath(a, b, functionID) {
    switch (functionID) {
        case 1:
            // addition function
            return a + b;
        case 2:
            // subtraction function
            return a - b;
    }
}
when you want to write it like this:
function add(a,b) {
return a + b;
}
function subtract(a,b) {
return a - b;
}
or am I misunderstanding your requirements?
public void Finalise()
ProcessFinalisation(true);
Doesn't compile, but the correct version:
public void Finalise()
{
ProcessFinalisation(true);
}
Compiles fine (of course).
If I am allowed if's without brackets when the following code has only one line:
if(true)
CallMethod();
Why is the same not allowed for methods with one following line? Is there a technical reason?
The obvious answer is the language spec; for reasoning... I guess mainly simplicity - it just wasn't worth the overhead of sanity-checking the spec and compiler for the tiny tiny number of single-statement methods. In particular, I can potentially see issues with generic constraints, etc (i.e. where T : IBlah, new() on the end of the signature).
Note that not using the braces can sometimes lead to ambiguities, and in some places is frowned upon. I'm a bit more pragmatic than that personally, but each to their own.
It might also be of interest that C# inside Razor does not allow usage without explicit braces. At all (i.e. even for if etc.).
Marc is basically right. To expand on his answer a bit: there are a number of places where C# requires a braced block of statements rather than allowing a "naked" statement. They are:
the body of a method, constructor, destructor, property accessor, event accessor or indexer accessor.
the block of a try, catch, finally, checked, unchecked or unsafe region.
the block of a statement lambda or anonymous method
the block of an if or loop statement if the block directly contains a local variable declaration. (That is, "while (x != 10) int y = 123;" is illegal; you've got to brace the declaration.)
In each of these cases it would be possible to come up with an unambiguous grammar (or heuristics to disambiguate an ambiguous grammar) for the feature where a single unbraced statement is legal. But what would the point be? In each of those situations you are expecting to see multiple statements; single statements are the rare, unlikely case. It seems like it is not really worth it to make the grammar unambiguous for these very unlikely cases.
Since C# 6.0 you can declare:
static void WriteToConsole(string word) => Console.WriteLine(word);
And then call it as usual:
public static void Main()
{
var word = "Hello World!";
WriteToConsole(word);
}
Short answer: C# is styled after C, and C mandates that functions be braced because of how C function declarations used to be.
Long version with history: Back in K&R C, functions were declared like this:
int function(arg1, arg2)
int arg1;
int arg2;
{ code }
Of course you couldn't have unbraced functions in that arrangement. ANSI C mandated the syntax we all know and love:
int function(int arg1, int arg2)
{ code }
but didn't allow unbraced functions, because they would cause havoc with older compilers that only knew K&R syntax [and support for K&R declarations was still required].
Time went on, and years later C# was designed around C [or C++, same difference in terms of syntax] and, since C didn't allow unbraced functions, neither did C#.
Are these two essentially the same thing? They look very similar to me.
Did lambda expression borrow its idea from Ruby?
Ruby actually has 4 constructs that are all extremely similar
The Block
The idea behind blocks is sort of a way to implement really lightweight strategy patterns. A block defines a coroutine on the function, which the function can delegate control to with the yield keyword. We use blocks for just about everything in Ruby, including pretty much all the looping constructs and anywhere you would use using in C#. Anything outside the block is in scope for the block; the inverse, however, is not true, with the exception that return inside the block will return from the outer scope. They look like this
def foo
yield 'called foo'
end
#usage
foo {|msg| puts msg} #idiomatic for one liners
foo do |msg| #idiomatic for multiline blocks
puts msg
end
Proc
A proc is basically a block that you can pass around as a parameter. One extremely interesting use of this is that you can pass a proc in as a replacement for a block in another method. Ruby has a special character for proc coercion, &, and a special rule that if the last param in a method signature starts with an &, it will be a proc representation of the block for the method call. Finally, there is a built-in method called block_given?, which will return true if the current method has a block defined. It looks like this
def foo(&block)
return block
end
b = foo {puts 'hi'}
b.call # hi
To go a little deeper with this, there is a really neat trick that Rails added to Symbol (and which got merged into core Ruby in 1.9). Basically, that & coercion does its magic by calling to_proc on whatever it is next to. So the Rails guys added a Symbol#to_proc that would send itself to whatever is passed in. That lets you write some really terse code for any aggregation-style function that is just calling a method on every object in a list
class Foo
def bar
'this is from bar'
end
end
list = [Foo.new, Foo.new, Foo.new]
list.map {|foo| foo.bar} # returns ['this is from bar', 'this is from bar', 'this is from bar']
list.map &:bar # returns _exactly_ the same thing
More advanced stuff, but imo that really illustrates the sort of magic you can do with procs
Lambdas
The purpose of a lambda is pretty much the same in Ruby as it is in C#: a way to create an inline function to either pass around or use internally. Like blocks and procs, lambdas are closures, but unlike the first two they enforce arity, and return from a lambda exits the lambda, not the containing scope. You create one by passing a block to the lambda method, or to -> in Ruby 1.9
l = lambda {|msg| puts msg} #ruby 1.8
l = -> {|msg| puts msg} #ruby 1.9
l.call('foo') # => foo
Methods
Only serious Ruby geeks really understand this one :) A method is a way to turn an existing function into something you can put in a variable. You get a method by calling the method function and passing in a symbol as the method name. You can rebind a method, or you can coerce it into a proc if you want to show off. A way to re-write the previous example would be
l = lambda &method(:puts)
l.call('foo')
What is happening here is that you are creating a method for puts, coercing it into a proc, passing that in as a replacement for a block for the lambda method, which in turn returns you the lambda
Feel free to ask about anything that isn't clear (writing this really late on a weeknight without an irb, hopefully it isn't pure gibberish)
EDIT: To address questions in the comments
list.map &:bar - Can I use this syntax with a code block that takes more than one argument? Say I have hash = { 0 => "hello", 1 => "world" }, and I want to select the elements that have 0 as the key. Maybe not a good example. – Bryan Shen
Gonna go kind of deep here, but to really understand how it works you need to understand how ruby method calls work.
Basically, Ruby doesn't have a concept of invoking a method; what happens is that objects pass messages to each other. The obj.method arg syntax you use is really just sugar around the more explicit form, which is obj.send :method, arg, and is functionally equivalent to the first syntax. This is a fundamental concept in the language, and is why things like method_missing and respond_to? make sense: in the first case you are just handling an unrecognized message, in the second you are checking to see if the object is listening for that message.
The other thing to know is the rather esoteric "splat" operator, *. Depending on where it's used, it actually does very different things.
def foo(bar, *baz)
In a method call, if it is the last parameter, splat will make that parameter glob up all additional parameters passed in to the function (sort of like params in C#)
obj.foo(bar, *[biz, baz])
When in a method call (or anything else that takes argument lists), it will turn an array into a bare argument list. The snippet below is equivilent to the snippet above.
obj.foo(bar, biz, baz)
Now, with send and * in mind, Symbol#to_proc is basically implemented like this
class Symbol
def to_proc
Proc.new { |obj, *args| obj.send(self, *args) }
end
end
So, &:sym is going to make a new proc, that calls .send :sym on the first argument passed to it. If any additional args are passed, they are globbed up into an array called args, and then splatted into the send method call.
I notice that & is used in three places: def foo(&block), list.map &:bar, and l = lambda &method(:puts). Do they share the same meaning? – Bryan Shen
Yes, they do. An & will call to_proc on whatever it is beside. In the case of the method definition it has a special meaning when on the last parameter, where you are pulling in the coroutine defined as a block and turning it into a proc. Method definitions are actually one of the most complex parts of the language; there are a huge number of tricks and special meanings that can be in the parameters and the placement of the parameters.
b = {0 => "df", 1 => "kl"}; p b.select {|key, value| key.zero? } - I tried to transform this to p b.select &:zero?, but it failed. I guess that's because the number of parameters for the code block is two, but &:zero? can only take one param. Is there any way I can do that? – Bryan Shen
This should be addressed earlier, unfortunately you can't do it with this trick.
"A method is a way to turn an existing
function into something you can put in
a variable." why is l = method(:puts)
not sufficient? What does lambda &
mean in this context? – Bryan Shen
That example was exceptionally contrived, I just wanted to show equivalent code to the example before it, where I was passing a proc to the lambda method. I will take some time later and re-write that bit, but you are correct, method(:puts) is totally sufficient. What I was trying to show is that you can use &method(:puts) anywhere that would take a block. A better example would be this
['hello', 'world'].each &method(:puts) # => hello\nworld
l = -> {|msg| puts msg} #ruby 1.9: this doesn't work for me. After I checked Jörg's answer, I think it should be l = -> (msg) {puts msg}. Or maybe I'm using an incorrect version of Ruby? Mine is ruby 1.9.1p738 – Bryan Shen
Like I said in the post, I didn't have an irb available when I was writing the answer, and you are right, I goofed that (I spend the vast majority of my time in 1.8.7, so I am not used to the new syntax yet).
There is no space between the stabby bit and the parens. Try l = ->(msg) {puts msg}. There was actually a lot of resistance to this syntax, since it is so different from everything else in the language.
C# vs. Ruby
Are these two essentially the same thing? They look very similar to me.
They are very different.
First off, lambdas in C# do two very different things, only one of which has an equivalent in Ruby. (And that equivalent is, surprise, lambdas, not blocks.)
In C#, lambda expression literals are overloaded. (Interestingly, they are the only overloaded literals, as far as I know.) And they are overloaded on their result type. (Again, they are the only thing in C# that can be overloaded on its result type, methods can only be overloaded on their argument types.)
C# lambda expression literals can either be an anonymous piece of executable code or an abstract representation of an anonymous piece of executable code, depending on whether their result type is Func / Action or Expression.
Ruby doesn't have any equivalent for the latter functionality (well, there are interpreter-specific non-portable non-standardized extensions). And the equivalent for the former functionality is a lambda, not a block.
The Ruby syntax for a lambda is very similar to C#:
->(x, y) { x + y } # Ruby
(x, y) => { return x + y; } // C#
In C#, you can drop the return, the semicolon and the curly braces if you only have a single expression as the body:
->(x, y) { x + y } # Ruby
(x, y) => x + y // C#
You can leave off the parentheses if you have only one parameter:
-> x { x } # Ruby
x => x // C#
In Ruby, you can leave off the parameter list if it is empty:
-> { 42 } # Ruby
() => 42 // C#
An alternative to using the literal lambda syntax in Ruby is to pass a block argument to the Kernel#lambda method:
->(x, y) { x + y }
lambda {|x, y| x + y } # same thing
The main difference between those two is that you don't know what lambda does, since it could be overridden, overwritten, wrapped or otherwise modified, whereas the behavior of literals cannot be modified in Ruby.
In Ruby 1.8, you can also use Kernel#proc although you should probably avoid that since that method does something different in 1.9.
Another difference between Ruby and C# is the syntax for calling a lambda:
l.() # Ruby
l() // C#
I.e. in C#, you use the same syntax for calling a lambda that you would use for calling anything else, whereas in Ruby, the syntax for calling a method is different from the syntax for calling any other kind of callable object.
Another difference is that in C#, () is built into the language and is only available for certain builtin types like methods, delegates, Actions and Funcs, whereas in Ruby, .() is simply syntactic sugar for .call() and can thus be made to work with any object by just implementing a call method.
procs vs. lambdas
So, what are lambdas exactly? Well, they are instances of the Proc class. Except there's a slight complication: there are actually two different kinds of instances of the Proc class which are subtly different. (IMHO, the Proc class should be split into two classes for the two different kinds of objects.)
In particular, not all Procs are lambdas. You can check whether a Proc is a lambda by calling the Proc#lambda? method. (The usual convention is to call lambda Procs "lambdas" and non-lambda Procs just "procs".)
Non-lambda procs are created by passing a block to Proc.new or to Kernel#proc. However, note that before Ruby 1.9, Kernel#proc creates a lambda, not a proc.
What's the difference? Basically, lambdas behave more like methods, procs behave more like blocks.
If you have followed some of the discussions on the Project Lambda for Java 8 mailinglists, you might have encountered the problem that it is not at all clear how non-local control-flow should behave with lambdas. In particular, there are three possible sensible behaviors for return (well, three possible but only two are really sensible) in a lambda:
return from the lambda
return from the method the lambda was called from
return from the method the lambda was created in
That last one is a bit iffy, since in general the method will have already returned, but the other two both make perfect sense, and neither is more right or more obvious than the other. The current state of Project Lambda for Java 8 is that they use two different keywords (return and yield). Ruby uses the two different kinds of Procs:
procs return from the calling method (just like blocks)
lambdas return from the lambda (just like methods)
They also differ in how they handle argument binding. Again, lambdas behave more like methods and procs behave more like blocks:
you can pass more arguments to a proc than there are parameters, in which case the excess arguments will be ignored
you can pass fewer arguments to a proc than there are parameters, in which case the excess parameters will be bound to nil
if you pass a single argument which is an Array (or responds to to_ary) and the proc has multiple parameters, the array will be unpacked and the elements bound to the parameters (exactly like they would in the case of destructuring assignment)
Blocks: lightweight procs
A block is essentially a lightweight proc. Every method in Ruby has exactly one block parameter, which does not actually appear in its parameter list (more on that later), i.e. is implicit. This means that on every method call you can pass a block argument, whether the method expects it or not.
Since the block doesn't appear in the parameter list, there is no name you can use to refer to it. So, how do you use it? Well, the only two things you can do (not really, but more on that later) is call it implicitly via the yield keyword and check whether a block was passed via block_given?. (Since there is no name, you cannot use the call or nil? methods. What would you call them on?)
Most Ruby implementations implement blocks in a very lightweight manner. In particular, they don't actually implement them as objects. However, since they have no name, you cannot refer to them, so it's actually impossible to tell whether they are objects or not. You can just think of them as procs, which makes it easier since there is one fewer concept to keep in mind. Just treat the fact that they aren't actually implemented as procs as a compiler optimization.
to_proc and &
There is actually a way to refer to a block: the & sigil / modifier / unary prefix operator. It can only appear in parameter lists and argument lists.
In a parameter list, it means "wrap up the implicit block into a proc and bind it to this name". In an argument list, it means "unwrap this Proc into a block".
def foo(&bar)
end
Inside the method, bar is now bound to a proc object that represents the block. This means for example that you can store it in an instance variable for later use.
baz(&quux)
In this case, baz is actually a method which takes zero arguments. But of course it takes the implicit block argument which all Ruby methods take. We are passing the contents of the variable quux, but unroll it into a block first.
This "unrolling" actually works not just for Procs. & calls to_proc on the object first, to convert it to a proc. That way, any object can be converted into a block.
The most widely used example is Symbol#to_proc, which first appeared sometime during the late 90s, I believe. It became popular when it was added to ActiveSupport from where it spread to Facets and other extension libraries. Finally, it was added to the Ruby 1.9 core library and backported to 1.8.7. It's pretty simple:
class Symbol
def to_proc
->(recv, *args) { recv.send self, *args }
end
end
%w[Hello StackOverflow].map(&:length) # => [5, 13]
Or, if you interpret classes as functions for creating objects, you can do something like this:
class Class
def to_proc
-> *args { new *args }
end
end
[1, 2, 3].map(&Array) # => [[nil], [nil, nil], [nil, nil, nil]]
Methods and UnboundMethods
Another class to represent a piece of executable code, is the Method class. Method objects are reified proxies for methods. You can create a Method object by calling Object#method on any object and passing the name of the method you want to reify:
m = 'Hello'.method(:length)
m.() #=> 5
or using the method reference operator .: (note that this operator appeared in Ruby 2.7 preview releases but was removed again before 2.7 shipped):
m = 'Hello'.:length
m.() #=> 5
Methods respond to to_proc, so you can pass them anywhere you could pass a block:
[1, 2, 3].each(&method(:puts))
# 1
# 2
# 3
An UnboundMethod is a proxy for a method that hasn't been bound to a receiver yet, i.e. a method for which self hasn't been defined yet. You cannot call an UnboundMethod, but you can bind it to an object (which must be an instance of the module you got the method from), which will convert it to a Method.
UnboundMethod objects are created by calling one of the methods from the Module#instance_method family, passing the name of the method as an argument.
u = String.instance_method(:length)
u.()
# NoMethodError: undefined method `call' for #<UnboundMethod: String#length>
u.bind(42)
# TypeError: bind argument must be an instance of String
u.bind('Hello').() # => 5
Generalized callable objects
Like I already hinted at above: there's not much special about Procs and Methods. Any object that responds to call can be called and any object that responds to to_proc can be converted to a Proc and thus unwrapped into a block and passed to a method which expects a block.
History
Did lambda expression borrow its idea from Ruby?
Probably not. Most modern programming languages have some form of anonymous literal block of code: Lisp (1958), Scheme, Smalltalk (1974), Perl, Python, ECMAScript, Ruby, Scala, Haskell, C++, D, Objective-C, even PHP(!). And of course, the whole idea goes back to Alonzo Church's λ-calculus (1935 and even earlier).
Not exactly. But they're very similar. The most obvious difference is that in C# a lambda expression can go anywhere where you might have a value that happens to be a function; in Ruby you only have one code block per method call.
They both borrowed the idea from Lisp (a programming language dating back to the late 1950s) which in turn borrowed the lambda concept from Church's Lambda Calculus, invented in the 1930s.
There is a lot of talk about monads these days. I have read a few articles / blog posts, but I can't go far enough with their examples to fully grasp the concept. The reason is that monads are a functional language concept, and thus the examples are in languages I haven't worked with (since I haven't used a functional language in depth). I can't grasp the syntax deeply enough to follow the articles fully ... but I can tell there's something worth understanding there.
However, I know C# pretty well, including lambda expressions and other functional features. I know C# only has a subset of functional features, and so maybe monads can't be expressed in C#.
However, surely it is possible to convey the concept? At least I hope so. Maybe you can present a C# example as a foundation, and then describe what a C# developer would wish he could do from there but can't because the language lacks functional programming features. This would be fantastic, because it would convey the intent and benefits of monads. So here's my question: What is the best explanation you can give of monads to a C# 3 developer?
Thanks!
(EDIT: By the way, I know there are at least 3 "what is a monad" questions already on SO. However, I face the same problem with them ... so this question is needed imo, because of the C#-developer focus. Thanks.)
Most of what you do in programming all day is combining some functions together to build bigger functions from them. Usually you have not only functions in your toolbox but also other things like operators, variable assignments and the like, but generally your program combines together lots of "computations" to bigger computations that will be combined together further.
A monad is some way to do this "combining of computations".
Usually your most basic "operator" to combine two computations together is ;:
a; b
When you say this you mean "first do a, then do b". The result a; b is basically again a computation that can be combined together with more stuff.
This is a simple monad: it is a way of combining small computations into bigger ones. The ; says "do the thing on the left, then do the thing on the right".
Another thing that can be seen as a monad in object-oriented languages is the . operator. Often you find things like this:
a.b().c().d()
The . basically means "evaluate the computation on the left, and then call the method on the right on the result of that". It is another way to combine functions/computations together, a little more complicated than ;. And the concept of chaining things together with . is a monad, since it's a way of combining two computations together to a new computation.
Another fairly common monad, that has no special syntax, is this pattern:
rv = socket.bind(address, port);
if (rv == -1)
return -1;
rv = socket.connect(...);
if (rv == -1)
return -1;
rv = socket.send(...);
if (rv == -1)
return -1;
A return value of -1 indicates failure, but there is no real way to abstract out this error checking, even if you have lots of API-calls that you need to combine in this fashion. This is basically just another monad that combines the function calls by the rule "if the function on the left returned -1, do return -1 ourselves, otherwise call the function on the right". If we had an operator >>= that did this thing we could simply write:
socket.bind(...) >>= socket.connect(...) >>= socket.send(...)
It would make things more readable and help to abstract out our special way of combining functions, so that we don't need to repeat ourselves over and over again.
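A hedged C# sketch of that rule, written as an extension method standing in for >>= (all names here are invented):

using System;

static class ErrorChaining
{
    // "Bind" for the -1-means-failure convention: if the step on the left
    // failed, short-circuit; otherwise run the step on the right.
    public static int Then(this int rv, Func<int> next)
    {
        return rv == -1 ? -1 : next();
    }
}

// Hypothetical usage, collapsing the repeated checks from above:
//   int rv = Bind(address, port).Then(() => Connect()).Then(() => Send());
//   // rv is -1 if any step failed, with the checking hidden inside Then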
And there are many more ways to combine functions/computations that are useful as a general pattern and can be abstracted in a monad, enabling the user of the monad to write much more concise and clear code, since all the book-keeping and management of the used functions is done in the monad.
For example the above >>= could be extended to "do the error checking and then call the right side on the socket that we got as input", so that we don't need to explicitly specify socket lots of times:
new socket() >>= bind(...) >>= connect(...) >>= send(...);
The formal definition is a bit more complicated since you have to worry about how to get the result of one function as an input to the next one, if that function needs that input and since you want to make sure that the functions you combine fit into the way you try to combine them in your monad. But the basic concept is just that you formalize different ways to combine functions together.
It has been a year since I posted this question. After posting it, I delved into Haskell for a couple of months. I enjoyed it tremendously, but I placed it aside just as I was ready to delve into Monads. I went back to work and focused on the technologies my project required.
And last night, I came and re-read these responses. Most importantly, I re-read the specific C# example in the text comments of the Brian Beckman video someone mentions above. It was so completely clear and illuminating that I’ve decided to post it directly here.
Because of this comment, not only do I feel like I understand exactly what Monads are … I realize I’ve actually written some things in C# that are Monads … or at least very close, and striving to solve the same problems.
So, here’s the comment – this is all a direct quote from the comment here by sylvan:
This is pretty cool. It's a bit abstract though. I can imagine people
who don't know what monads are already get confused due to the lack of
real examples.
So let me try to comply, and just to be really clear I'll do an
example in C#, even though it will look ugly. I'll add the equivalent
Haskell at the end and show you the cool Haskell syntactic sugar which
is where, IMO, monads really start getting useful.
Okay, so one of the easiest Monads is called the "Maybe monad" in
Haskell. In C# the Maybe type is called Nullable<T>. It's basically
a tiny class that just encapsulates the concept of a value that is
either valid and has a value, or is "null" and has no value.
A useful thing to stick inside a monad for combining values of this
type is the notion of failure. I.e. we want to be able to look at
multiple nullable values and return null as soon as any one of them
is null. This could be useful if you, for example, look up lots of
keys in a dictionary or something, and at the end you want to process
all of the results and combine them somehow, but if any of the keys
are not in the dictionary, you want to return null for the whole
thing. It would be tedious to manually have to check each lookup for
null and return, so we can hide this checking inside the bind
operator (which is sort of the point of monads, we hide book-keeping
in the bind operator which makes the code easier to use since we can
forget about the details).
Here's the program that motivates the whole thing (I'll define the
Bind later, this is just to show you why it's nice).
class Program
{
static Nullable<int> f(){ return 4; }
static Nullable<int> g(){ return 7; }
static Nullable<int> h(){ return 9; }
static void Main(string[] args)
{
Nullable<int> z =
f().Bind( fval =>
g().Bind( gval =>
h().Bind( hval =>
new Nullable<int>( fval + gval + hval ))));
Console.WriteLine(
"z = {0}", z.HasValue ? z.Value.ToString() : "null" );
Console.WriteLine("Press any key to continue...");
Console.ReadKey();
}
}
Now, ignore for a moment that there already is support for doing this
for Nullable in C# (you can add nullable ints together and you get
null if either is null). Let's pretend that there is no such feature,
and it's just a user-defined class with no special magic. The point is
that we can use the Bind function to bind a variable to the contents
of our Nullable value and then pretend that there's nothing strange
going on, and use them like normal ints and just add them together. We
wrap the result in a nullable at the end, and that nullable will
either be null (if any of f, g or h returns null) or it will be
the result of summing f, g, and h together. (This is analogous
to how we can bind a row in a database to a variable in LINQ, and do
stuff with it, safe in the knowledge that the Bind operator will
make sure that the variable will only ever be passed valid row
values).
You can play with this and change any of f, g, and h to return
null and you will see that the whole thing will return null.
So clearly the bind operator has to do this checking for us, and bail
out returning null if it encounters a null value, and otherwise pass
along the value inside the Nullable structure into the lambda.
Here's the Bind operator:
public static Nullable<B> Bind<A,B>( this Nullable<A> a, Func<A,Nullable<B>> f )
where B : struct
where A : struct
{
return a.HasValue ? f(a.Value) : null;
}
The types here are just like in the video. It takes an M a
(Nullable<A> in C# syntax for this case), and a function from a to
M b (Func<A, Nullable<B>> in C# syntax), and it returns an M b
(Nullable<B>).
The code simply checks if the nullable contains a value and if so
extracts it and passes it onto the function, else it just returns
null. This means that the Bind operator will handle all the
null-checking logic for us. If and only if the value that we call
Bind on is non-null then that value will be "passed along" to the
lambda function, else we bail out early and the whole expression is
null. This allows the code that we write using the monad to be
entirely free of this null-checking behaviour, we just use Bind and
get a variable bound to the value inside the monadic value (fval,
gval and hval in the example code) and we can use them safe in the
knowledge that Bind will take care of checking them for null before
passing them along.
There are other examples of things you can do with a monad. For
example you can make the Bind operator take care of an input stream
of characters, and use it to write parser combinators. Each parser
combinator can then be completely oblivious to things like
back-tracking, parser failures etc., and just combine smaller parsers
together as if things would never go wrong, safe in the knowledge that
a clever implementation of Bind sorts out all the logic behind the
difficult bits. Then later on maybe someone adds logging to the monad,
but the code using the monad doesn't change, because all the magic
happens in the definition of the Bind operator, the rest of the code
is unchanged.
Finally, here's the implementation of the same code in Haskell (--
begins a comment line).
-- Here's the data type, it's either nothing, or "Just" a value
-- this is in the standard library
data Maybe a = Nothing | Just a
-- The bind operator for Nothing
Nothing >>= f = Nothing
-- The bind operator for Just x
Just x >>= f = f x
-- the "unit", called "return"
return = Just
-- The sample code using the lambda syntax
-- that Brian showed
z = f >>= ( \fval ->
g >>= ( \gval ->
h >>= ( \hval -> return (fval+gval+hval ) ) ) )
-- The following is exactly the same as the three lines above
z2 = do
fval <- f
gval <- g
hval <- h
return (fval+gval+hval)
As you can see the nice do notation at the end makes it look like
straight imperative code. And indeed this is by design. Monads can be
used to encapsulate all the useful stuff in imperative programming
(mutable state, IO etc.) and used using this nice imperative-like
syntax, but behind the curtains, it's all just monads and a clever
implementation of the bind operator! The cool thing is that you can
implement your own monads by implementing >>= and return. And if
you do so those monads will also be able to use the do notation,
which means you can basically write your own little languages by just
defining two functions!
A monad is essentially deferred processing. If you are trying to write code that has side effects (e.g. I/O) in a language that does not permit them, and only allows pure computation, one dodge is to say, "Ok, I know you won't do side effects for me, but can you please compute what would happen if you did?"
It's sort of cheating.
Now, that explanation will help you understand the big picture intent of monads, but the devil is in the details. How exactly do you compute the consequences? Sometimes, it isn't pretty.
The best way to give an overview of the how for someone used to imperative programming is to say that it puts you in a DSL wherein operations that look syntactically like what you are used to outside the monad are used instead to build a function that would do what you want if you could (for example) write to an output file. Almost (but not really) as if you were building code in a string to later be eval'd.
You can think of a monad as a C# interface that classes have to implement. This is a pragmatic answer that ignores all the category theoretical math behind why you'd want to choose to have these declarations in your interface and ignores all the reasons why you'd want to have monads in a language that tries to avoid side effects, but I found it to be a good start as someone who understands (C#) interfaces.
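A rough sketch of that interface idea (hedged: C# generics can't abstract over the type constructor itself, so this is only a per-type approximation):

using System;

// Each monadic type would expose a "bind" with this shape...
public interface IMonad<A>
{
    IMonad<B> Bind<B>(Func<A, IMonad<B>> f);
}
// ...plus a "return"/unit that wraps a plain A, conventionally written as a
// static factory method on the implementing type (interfaces can't require one).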
See my answer to "What is a monad?"
It begins with a motivating example, works through the example, derives an example of a monad, and formally defines "monad".
It assumes no knowledge of functional programming and it uses pseudocode with function(argument) := expression syntax with the simplest possible expressions.
This C# program is an implementation of the pseudocode monad. (For reference: M is the type constructor, feed is the "bind" operation, and wrap is the "return" operation.)
using System.IO;
using System;
class Program
{
public class M<A>
{
public A val;
public string messages;
}
public static M<B> feed<A, B>(Func<A, M<B>> f, M<A> x)
{
M<B> m = f(x.val);
m.messages = x.messages + m.messages;
return m;
}
public static M<A> wrap<A>(A x)
{
M<A> m = new M<A>();
m.val = x;
m.messages = "";
return m;
}
public class T {};
public class U {};
public class V {};
public static M<U> g(V x)
{
M<U> m = new M<U>();
m.messages = "called g.\n";
return m;
}
public static M<T> f(U x)
{
M<T> m = new M<T>();
m.messages = "called f.\n";
return m;
}
static void Main()
{
V x = new V();
M<T> m = feed<U, T>(f, feed(g, wrap<V>(x)));
Console.Write(m.messages);
}
}