Ever since I started using Java, it has aggravated me that it doesn't support implicit conversions from numeric types to booleans, so you can't do things like:
if (flags & 0x80) { ... }
instead you have to go through this lunacy:
if ((flags & 0x80) != 0) { ... }
It's the same with null and objects. Every other C-like language I know, including JavaScript, allows it, so I thought Java was just moronic, but I've just discovered that C# is the same (at least for numbers; I don't know about null/objects):
http://msdn.microsoft.com/en-us/library/c8f5xwh7(VS.71).aspx
Microsoft changed it on purpose from C++, so why? Clearly I'm missing something. Why change (what I thought was) the most natural thing in the world to make it longer to type? What on Earth is wrong with it?
For clarity: it makes the following classic mistake simply illegal:
int x = ...;
if (x = 0) // in C: assign 0 to x and always evaluate to false
.... // never executed
Note: most modern C / C++ compilers will give a warning (but not an error) on this straightforward pattern, but there are many possible variations, and it can creep up on you.
Both Java and C# abandoned implicit conversions to booleans to reduce the chance of programmer error.
For example, many programmers would accidentally write:
if( x = 5 ) { ... }
instead of:
if( x == 5 ) { ... }
This of course results in completely different behavior: the first statement performs an assignment (which, in C, always evaluates as true, since 5 is nonzero), while the second performs a comparison. In the past, developers would sometimes write such comparisons in reverse, constant first, to avoid the pitfall, since:
if( 5 = x ) { ... } // doesn't compile.
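In C# the dangerous form simply fails to compile, because an int is not a bool. A quick sketch (the exact diagnostics here are from memory):

int x = 0;
if (x = 5) { }        // error CS0029: Cannot implicitly convert type 'int' to 'bool'

bool flag = false;
if (flag = true) { }  // still compiles: the assignment yields a bool,
                      // though the compiler warns that the condition is constant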
Now, in C#, you can still create implicit conversion operators to bool for your own types - although it is rarely advisable, since most developers don't expect it:
public class MyValue
{
    public int Value { get; set; }

    public static implicit operator bool( MyValue mb )
    {
        return mb.Value != 0;
    }
}

MyValue x = new MyValue() { Value = 10 };
if( x ) { ... } // perfectly legal; the compiler applies the implicit conversion
Maybe they felt that being more explicit was more in line with a strongly typed language.
You've got it backward.
It's actually C that has no boolean type: if (and every other conditional statement) expects an int, not a boolean. An int value of 0 is treated as false and any other value as true.
Some people find this ambiguous, and it can lead to many errors, as others have pointed out. Because of this, the Java designers opted to allow only boolean types in conditional statements. And when Microsoft decided to implement MS-Java (AKA C#), they borrowed this design principle.
If you don't like it, you can program in a variety of languages that do not have this restriction.
Implicit conversion of any int value (such as (flags & 0x80)) to a boolean implies a language-defined mapping from int values to booleans. C did this, and it caused a huge amount of confusion and a lot of programmer error. There is no good reason why a zero int value should ALWAYS mean false (or true), and there are plenty of good reasons why you might want to leave that decision to the programmer. For these reasons, implicit conversion to boolean has been abandoned by most modern languages.
If typing seven extra characters every time you do a bit test constitutes 'lunacy', you may be in the wrong profession. If you are doing bit tests on an int extremely frequently, you might want to consider whether you are prematurely optimizing to save memory.
Even the most experienced programmers have problems with an implicit conversion to boolean. I for one appreciate this little feature.
Some programming languages do no automatic coercion at all. An integer, for example, can only be compared to another integer; assignment to a non-integer variable results in an error. Such is the hallmark of a strongly-typed language.
That Java does any coercion is a convenience for you and breaks the strong-typing model.
Mapping the entire range of integers -- or the even larger range of floats -- onto the two boolean values is fraught with disagreement over arbitrary assignment of "truthness" and "falseness".
What values map onto false and true? If you're C, only zero maps to false and all other values are true. If you're the bash shell, it's reversed.
How should negative values be mapped?
When you try to automatically convert a double to an integer, Java flags this as a "loss of precision" error. By analogy, converting a number to a boolean should also result in a loss of precision. Instead, Java chose to not syntactically support it.
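A quick C# sketch of the same rule (my example; the Java diagnostic is analogous):

double d = 3.5;
int i = d;       // compile-time error: no implicit conversion from double to int
int j = (int)d;  // fine: the cast makes the loss of precision explicit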
Today I discovered a very strange behavior with C# function overloading. The problem occurs when I have a method with two overloads, one accepting Object and the other accepting an Enum of any type. When I pass 0 as the parameter, the Enum version of the method is called. When I use any other integer value, the Object version is called. I know this can be easily fixed by using explicit casting, but I want to know why the compiler behaves that way. Is this a bug or just some strange language rule I don't know about?
The code below explains the problem (checked with runtime 2.0.50727)
Thanks for any help on this,
Grzegorz Kyc
class Program
{
    enum Bar
    {
        Value1,
        Value2,
        Value3
    }

    static void Main(string[] args)
    {
        Foo(0);
        Foo(1);
        Console.ReadLine();
    }

    static void Foo(object a)
    {
        Console.WriteLine("object");
    }

    static void Foo(Bar a)
    {
        Console.WriteLine("enum");
    }
}
It may be that you're not aware that there's an implicit conversion from a constant¹ of 0 to any enum:
Bar x = 0; // Implicit conversion
Now, the conversion from 0 to Bar is more specific than the conversion from 0 to object, which is why the Foo(Bar) overload is used.
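To make the resolution concrete (the cast in the last call is my addition, showing one way to force the other overload):

Foo(0);          // prints "enum": the literal 0 converts implicitly to Bar,
                 // which is a more specific match than object
Foo(1);          // prints "object": 1 has no implicit conversion to Bar
Foo((object)0);  // prints "object": the cast rules out the Bar overload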
Does that clear everything up?
¹ There's actually a bug in the Microsoft C# compiler which lets it be any zero constant, not just an integer:
const decimal DecimalZero = 0.0m;
...
Bar x = DecimalZero;
It's unlikely that this will ever be fixed, as it could break existing working code. I believe Eric Lippert has a couple of blog posts that go into much more detail.
The C# specification section 6.1.3 (C# 4 spec) has this to say about it:
An implicit enumeration conversion permits the decimal-integer-literal 0 to be converted to any enum-type and to any nullable-type whose underlying type is an enum-type. In the latter case the conversion is evaluated by converting to the underlying enum-type and wrapping the result (§4.1.10).
That actually suggests that the bug isn't just in allowing the wrong type, but allowing any constant 0 value to be converted rather than only the literal value 0.
EDIT: It looks like the "constant" part was partially introduced in the C# 3 compiler. Previously it was some constant values, now it looks like it's all of them.
I know I have read somewhere else that the .NET system always treats zero as a valid enumeration value, even if it actually isn't. I will try to find some reference for this...
OK, well I found this, which quotes the following and attributes it to Eric Gunnerson:
Enums in C# do dual purpose. They are used for the usual enum use, and they're also used for bit fields. When I'm dealing with bit fields, you often want to AND a value with the bit field and check if it's true.
Our initial rules meant that you had to write:
if ((myVar & MyEnumName.ColorRed) != (MyEnumName) 0)
which we thought was difficult to read. One alternative was to define a zero entry:
if ((myVar & MyEnumName.ColorRed) != MyEnumName.NoBitsSet)
which was also ugly.
We therefore decided to relax our rules a bit, and permit an implicit conversion from the literal zero to any enum type, which allows you to write:
if ((myVar & MyEnumName.ColorRed) != 0)
which is why PlayingCard(0, 0) works.
So it appears that the whole reason behind this was to simply allow equating to zero when checking flags without having to cast the zero.
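For illustration, here is a minimal flags-style enum where the relaxed rule pays off (the names are mine, echoing the quote above):

[Flags]
enum MyEnumName
{
    NoBitsSet = 0,
    ColorRed  = 1,
    ColorBlue = 2
}

MyEnumName myVar = MyEnumName.ColorRed | MyEnumName.ColorBlue;

if ((myVar & MyEnumName.ColorRed) != 0)  // legal only because the literal 0
{                                        // converts to any enum type
    // the red bit is set
}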
After migrating my project from VS2013 to VS2015, it no longer builds. A compilation error occurs in the following LINQ statement:
static void Main(string[] args)
{
    decimal a, b;
    IEnumerable<dynamic> array = new string[] { "10", "20", "30" };
    var result = (from v in array
                  where decimal.TryParse(v, out a) && decimal.TryParse("15", out b) && a <= b // Error here
                  orderby decimal.Parse(v)
                  select v).ToArray();
}
The compiler returns an error:
Error CS0165 Use of unassigned local variable 'b'
What causes this issue? Is it possible to fix it through a compiler setting?
What causes this issue?
Looks like a compiler bug to me. At least, it did. Although the decimal.TryParse(v, out a) and decimal.TryParse("15", out b) expressions are evaluated dynamically, I expected the compiler to still understand that by the time it reaches a <= b, both a and b are definitely assigned. Even with the weirdnesses you can come up with in dynamic typing, I'd expect to only ever evaluate a <= b after evaluating both of the TryParse calls.
However, it turns out that through operator and conversion trickery, it's entirely feasible to have an expression A && B && C which evaluates A and C but not B, if you're cunning enough. See the Roslyn bug report for Neal Gafter's ingenious example.
Making that work with dynamic is even harder - the semantics involved when the operands are dynamic are harder to describe, because in order to perform overload resolution, you need to evaluate operands to find out what types are involved, which can be counter-intuitive. However, again Neal has come up with an example which shows that the compiler error is required... this isn't a bug, it's a bug fix. Huge amounts of kudos to Neal for proving it.
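To see how A && B && C can evaluate A and C but skip B, here is a self-contained sketch in the spirit of Neal Gafter's example (the Flipper type and its alternating truth value are my construction, not the code from the bug report):

using System;

class Flipper
{
    int tests;

    // For "x && y", C# first asks "is x false?" via operator false.
    // This implementation alternates its answer on every test.
    public static bool operator false(Flipper f) { return ++f.tests % 2 == 1; }
    public static bool operator true(Flipper f) { return f.tests % 2 == 0; }

    // When x is not "false", "x && y" evaluates x & y.
    public static Flipper operator &(Flipper x, Flipper y) { return y; }

    static Flipper Touch(string name) { Console.WriteLine(name); return new Flipper(); }

    static void Main()
    {
        Flipper a = new Flipper();
        // Parsed as (a && Touch("B")) && Touch("C").
        // 1st &&: operator false(a) returns true,  so Touch("B") is skipped and the result is a.
        // 2nd &&: operator false(a) returns false, so Touch("C") IS evaluated.
        Flipper r = a && Touch("B") && Touch("C");  // prints only "C"
    }
}

If B were the expression assigning b and C the one reading it, b would be read while unassigned, which is exactly what the compiler now guards against.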
Is it possible to fix it through compiler settings?
No, but there are alternatives which avoid the error.
Firstly, you could stop it from being dynamic - if you know that you'll only ever use strings, then you could use IEnumerable<string> or give the range variable v a type of string (i.e. from string v in array). That would be my preferred option.
If you really need to keep it dynamic, just give b a value to start with:
decimal a, b = 0m;
This won't do any harm - we know that actually your dynamic evaluation won't do anything crazy, so you'll still end up assigning a value to b before you use it, making the initial value irrelevant.
Additionally, it seems that adding parentheses works too:
where decimal.TryParse(v, out a) && (decimal.TryParse("15", out b) && a <= b)
That changes the point at which various pieces of overload resolution are triggered, and happens to make the compiler happy.
There is one issue still remaining - the spec's rules on definite assignment with the && operator need to be clarified to state that they only apply when the && operator is being used in its "regular" implementation with two bool operands. I'll try to make sure this is fixed for the next ECMA standard.
This does appear to be a bug, or at the least a regression, in the Roslyn compiler. The following bug has been filed to track it:
https://github.com/dotnet/roslyn/issues/4509
In the meantime, Jon's excellent answer has a couple of workarounds.
Since I got schooled so hard in the bug report, I'm going to try to explain this myself.
Imagine T is some user-defined type with an implicit cast to bool that alternates between false and true, starting with false. As far as the compiler knows, the dynamic first argument to the first && might evaluate to that type, so it has to be pessimistic.
If, then, it let the code compile, this could happen:
When the dynamic binder evaluates the first &&, it does the following:
Evaluate the first argument
It's a T - implicitly cast it to bool.
Oh, it's false, so we don't need to evaluate the second argument.
Make the result of the && evaluate as the first argument. (No, not false, for some reason.)
When the dynamic binder evaluates the second &&, it does the following:
Evaluate the first argument.
It's a T - implicitly cast it to bool.
Oh, it's true, so evaluate the second argument.
... Oh crap, b isn't assigned.
In spec terms, in short, there are special "definite assignment" rules that let us say not only whether a variable is "definitely assigned" or "not definitely assigned", but also if it is "definitely assigned after false statement" or "definitely assigned after true statement".
These exist so that when dealing with && and || (and ! and ?? and ?:) the compiler can examine whether variables may be assigned in particular branches of a complex boolean expression.
However, these only work while the expressions' types remain boolean. When part of the expression is dynamic (or a non-boolean static type) we can no longer reliably say that the expression is true or false - the next time we cast it to bool to decide which branch to take, it may have changed its mind.
Update: this has now been resolved and documented:
The definite assignment rules implemented by previous compilers for dynamic expressions allowed some cases of code that could result in variables being read that are not definitely assigned. See https://github.com/dotnet/roslyn/issues/4509 for one report of this.
...
Because of this possibility the compiler must not allow this program to be compiled if val has no initial value. Previous versions of the compiler (prior to VS2015) allowed this program to compile even if val has no initial value. Roslyn now diagnoses this attempt to read a possibly uninitialized variable.
This is not a bug. See https://github.com/dotnet/roslyn/issues/4509#issuecomment-130872713 for an example of how a dynamic expression of this form can leave such an out variable unassigned.
I've recently been coding a lot in Objective-C while also working on several C# projects. In this process, I've found that I miss things in both directions.
In particular, when I code in C# I find I miss the short null check syntax of Objective C.
Why do you suppose in C# you can't check an object for null with a syntax like:
if (maybeNullObject) // works in Objective C, but not C# :(
{
...
}
I agree that if (maybeNullObject != null) is clearer, but writing it out all the time feels not only tedious but overly verbose. In addition, I believe the if (maybeNullObject) syntax is generally understood by most developers (JavaScript, Obj-C, and I assume others support it).
I throw this out as a question assuming that perhaps there is a specific reason C# disallows the if (maybeNullObject) syntax. I would think however that the compiler could easily convert an object expression such as if (maybeNullObject) automatically (or automagically) to if (maybeNullObject != null).
Great reference to this question is How an idea becomes a C# language feature?.
Edit
The short null check syntax that I am suggesting would only apply to objects. The short null check would not apply to primitives and types like bool?.
Because if statements in C# are strict: they take only boolean values, nothing else, and there are no further levels of "truthiness". 0 and null are their own animals, and no implicit conversion to bool exists for them.
The compiler could "easily convert" almost any expression to a boolean, but that can cause subtle problems (believe me...) and a conscious decision was made to disallow these implicit conversions.
IMO this was a good choice. You are essentially asking for a one-off implicit conversion where the compiler assumes that, if the expression does not return a boolean result, the programmer must have wanted a null check. Aside from being a very narrow feature, it is purely syntactic sugar and provides little to no appreciable benefit. As Eric Lippert would say, every feature has a cost...
You are asking for a feature which adds needless complexity to the language (yes, it is complex, because a type may define an implicit conversion to bool; in that case, which check is performed?) only to save you from typing != null once in a while.
EDIT:
Example of how to define an implicit conversion to bool, for @Sam (too long for comments).
class Foo
{
    public int SomeVar;

    public Foo( int i )
    {
        SomeVar = i;
    }

    public static implicit operator bool( Foo f )
    {
        return f.SomeVar != 0;
    }
}

static void Main()
{
    var f = new Foo(1);
    if( f )
    {
        Console.Write( "It worked!" );
    }
}
One potential collision is with a reference object that defines an implicit conversion to bool.
There is no delineation for the compiler between if(myObject) checking for null or checking for true.
The intent is to leave no ambiguity. You may find it tedious, but that shorthand has been responsible for a number of bugs over the years. C# rightly has a dedicated boolean type, and it was a conscious decision not to make 0 mean false and any other value true.
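A sketch of that ambiguity, reusing the Foo type from the earlier answer (my illustration):

Foo f = null;
// If if(f) meant "f != null", this would simply skip the block.
// But Foo defines an implicit conversion to bool, so the compiler
// calls that operator instead; it dereferences f.SomeVar and
// throws a NullReferenceException at runtime.
if (f) { ... }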
You could write an extension method against System.Object, perhaps called IsNull()?
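A minimal sketch of that idea (IsNull is a hypothetical name, not anything in the BCL); note that extension methods, unlike instance methods, can be invoked on a null receiver:

public static class ObjectExtensions
{
    public static bool IsNull(this object obj)
    {
        return obj == null;
    }
}

// usage:
// if (maybeNullObject.IsNull()) { ... }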
Of course, that's still an extra 8 or 9 characters on top of the code you'd have to write for the extension class. I think most people are happy with the clarity that an explicit null test brings.
There is a lot of talk about monads these days. I have read a few articles / blog posts, but I can't go far enough with their examples to fully grasp the concept. The reason is that monads are a functional language concept, and thus the examples are in languages I haven't worked with (since I haven't used a functional language in depth). I can't grasp the syntax deeply enough to follow the articles fully ... but I can tell there's something worth understanding there.
However, I know C# pretty well, including lambda expressions and other functional features. I know C# only has a subset of functional features, and so maybe monads can't be expressed in C#.
However, surely it is possible to convey the concept? At least I hope so. Maybe you can present a C# example as a foundation, and then describe what a C# developer would wish he could do from there but can't because the language lacks functional programming features. This would be fantastic, because it would convey the intent and benefits of monads. So here's my question: What is the best explanation you can give of monads to a C# 3 developer?
Thanks!
(EDIT: By the way, I know there are at least 3 "what is a monad" questions already on SO. However, I face the same problem with them ... so this question is needed imo, because of the C#-developer focus. Thanks.)
Most of what you do in programming all day is combining functions to build bigger functions out of them. Usually your toolbox contains not only functions but also other things like operators, variable assignments and the like, but generally your program combines lots of "computations" into bigger computations that will be combined further.
A monad is some way to do this "combining of computations".
Usually your most basic "operator" to combine two computations together is ;:
a; b
When you say this you mean "first do a, then do b". The result a; b is basically again a computation that can be combined together with more stuff.
This is a simple monad; it is a way of combining small computations into bigger ones. The ; says "do the thing on the left, then do the thing on the right".
Another thing that can be seen as a monad in object-oriented languages is the . operator. Often you find things like this:
a.b().c().d()
The . basically means "evaluate the computation on the left, and then call the method on the right on the result of that". It is another way to combine functions/computations together, a little more complicated than ;. And the concept of chaining things together with . is a monad, since it's a way of combining two computations together to a new computation.
Another fairly common monad, that has no special syntax, is this pattern:
rv = socket.bind(address, port);
if (rv == -1)
    return -1;

rv = socket.connect(...);
if (rv == -1)
    return -1;

rv = socket.send(...);
if (rv == -1)
    return -1;
A return value of -1 indicates failure, but there is no real way to abstract out this error checking, even if you have lots of API calls that you need to combine in this fashion. This is basically just another monad that combines the function calls by the rule "if the function on the left returned -1, return -1 ourselves; otherwise call the function on the right". If we had an operator >>= that did this thing, we could simply write:
socket.bind(...) >>= socket.connect(...) >>= socket.send(...)
It would make things more readable and help to abstract out our special way of combining functions, so that we don't need to repeat ourselves over and over again.
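In C#, one rough way to approximate that >>= for the "-1 means failure" convention is an extension method; a sketch under invented names:

using System;

static class ErrorCheck
{
    // "bind" for the convention that -1 signals failure:
    // run the next step only if the previous one succeeded.
    public static int Then(this int rv, Func<int> next)
    {
        return rv == -1 ? -1 : next();
    }
}

// usage, assuming bind/connect/send each return an int status:
// int rv = socket.bind(address, port)
//                .Then(() => socket.connect(...))
//                .Then(() => socket.send(...));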
And there are many more ways to combine functions/computations that are useful as a general pattern and can be abstracted in a monad, enabling the user of the monad to write much more concise and clear code, since all the book-keeping and management of the used functions is done in the monad.
For example the above >>= could be extended to "do the error checking and then call the right side on the socket that we got as input", so that we don't need to explicitly specify socket lots of times:
new socket() >>= bind(...) >>= connect(...) >>= send(...);
The formal definition is a bit more complicated since you have to worry about how to get the result of one function as an input to the next one, if that function needs that input and since you want to make sure that the functions you combine fit into the way you try to combine them in your monad. But the basic concept is just that you formalize different ways to combine functions together.
It has been a year since I posted this question. After posting it, I delved into Haskell for a couple of months. I enjoyed it tremendously, but I placed it aside just as I was ready to delve into Monads. I went back to work and focused on the technologies my project required.
And last night, I came back and re-read these responses. Most importantly, I re-read the specific C# example in the text comments of the Brian Beckman video someone mentions above. It was so completely clear and illuminating that I've decided to post it directly here.
Because of this comment, not only do I feel like I understand exactly what Monads are … I realize I’ve actually written some things in C# that are Monads … or at least very close, and striving to solve the same problems.
So, here’s the comment – this is all a direct quote from the comment here by sylvan:
This is pretty cool. It's a bit abstract though. I can imagine people
who don't know what monads are already get confused due to the lack of
real examples.
So let me try to comply, and just to be really clear I'll do an
example in C#, even though it will look ugly. I'll add the equivalent
Haskell at the end and show you the cool Haskell syntactic sugar which
is where, IMO, monads really start getting useful.
Okay, so one of the easiest Monads is called the "Maybe monad" in
Haskell. In C# the Maybe type is called Nullable<T>. It's basically
a tiny class that just encapsulates the concept of a value that is
either valid and has a value, or is "null" and has no value.
A useful thing to stick inside a monad for combining values of this
type is the notion of failure. I.e. we want to be able to look at
multiple nullable values and return null as soon as any one of them
is null. This could be useful if you, for example, look up lots of
keys in a dictionary or something, and at the end you want to process
all of the results and combine them somehow, but if any of the keys
are not in the dictionary, you want to return null for the whole
thing. It would be tedious to manually have to check each lookup for
null and return, so we can hide this checking inside the bind
operator (which is sort of the point of monads, we hide book-keeping
in the bind operator which makes the code easier to use since we can
forget about the details).
Here's the program that motivates the whole thing (I'll define the
Bind later, this is just to show you why it's nice).
class Program
{
    static Nullable<int> f() { return 4; }
    static Nullable<int> g() { return 7; }
    static Nullable<int> h() { return 9; }

    static void Main(string[] args)
    {
        Nullable<int> z =
            f().Bind( fval =>
                g().Bind( gval =>
                    h().Bind( hval =>
                        new Nullable<int>( fval + gval + hval ))));

        Console.WriteLine(
            "z = {0}", z.HasValue ? z.Value.ToString() : "null" );
        Console.WriteLine("Press any key to continue...");
        Console.ReadKey();
    }
}
Now, ignore for a moment that there already is support for doing this
for Nullable in C# (you can add nullable ints together and you get
null if either is null). Let's pretend that there is no such feature,
and it's just a user-defined class with no special magic. The point is
that we can use the Bind function to bind a variable to the contents
of our Nullable value and then pretend that there's nothing strange
going on, and use them like normal ints and just add them together. We
wrap the result in a nullable at the end, and that nullable will
either be null (if any of f, g or h returns null) or it will be
the result of summing f, g, and h together. (This is analogous
to how we can bind a row in a database to a variable in LINQ, and do
stuff with it, safe in the knowledge that the Bind operator will
make sure that the variable will only ever be passed valid row
values).
You can play with this and change any of f, g, and h to return
null and you will see that the whole thing will return null.
So clearly the bind operator has to do this checking for us, and bail
out returning null if it encounters a null value, and otherwise pass
along the value inside the Nullable structure into the lambda.
Here's the Bind operator:
public static Nullable<B> Bind<A,B>( this Nullable<A> a, Func<A, Nullable<B>> f )
    where A : struct
    where B : struct
{
    return a.HasValue ? f(a.Value) : null;
}
The types here are just like in the video. It takes an M a
(Nullable<A> in C# syntax for this case), and a function from a to
M b (Func<A, Nullable<B>> in C# syntax), and it returns an M b
(Nullable<B>).
The code simply checks if the nullable contains a value and if so
extracts it and passes it onto the function, else it just returns
null. This means that the Bind operator will handle all the
null-checking logic for us. If and only if the value that we call
Bind on is non-null then that value will be "passed along" to the
lambda function, else we bail out early and the whole expression is
null. This allows the code that we write using the monad to be
entirely free of this null-checking behaviour, we just use Bind and
get a variable bound to the value inside the monadic value (fval,
gval and hval in the example code) and we can use them safe in the
knowledge that Bind will take care of checking them for null before
passing them along.
There are other examples of things you can do with a monad. For
example you can make the Bind operator take care of an input stream
of characters, and use it to write parser combinators. Each parser
combinator can then be completely oblivious to things like
back-tracking, parser failures etc., and just combine smaller parsers
together as if things would never go wrong, safe in the knowledge that
a clever implementation of Bind sorts out all the logic behind the
difficult bits. Then later on maybe someone adds logging to the monad,
but the code using the monad doesn't change, because all the magic
happens in the definition of the Bind operator, the rest of the code
is unchanged.
Finally, here's the implementation of the same code in Haskell (--
begins a comment line).
-- Here's the data type, it's either nothing, or "Just" a value
-- this is in the standard library
data Maybe a = Nothing | Just a
-- The bind operator for Nothing
Nothing >>= f = Nothing
-- The bind operator for Just x
Just x >>= f = f x
-- the "unit", called "return"
return = Just
-- The sample code using the lambda syntax
-- that Brian showed
z = f >>= ( \fval ->
g >>= ( \gval ->
h >>= ( \hval -> return (fval+gval+hval ) ) ) )
-- The following is exactly the same as the three lines above
z2 = do
fval <- f
gval <- g
hval <- h
return (fval+gval+hval)
As you can see the nice do notation at the end makes it look like
straight imperative code. And indeed this is by design. Monads can be
used to encapsulate all the useful stuff in imperative programming
(mutable state, IO etc.) and used using this nice imperative-like
syntax, but behind the curtains, it's all just monads and a clever
implementation of the bind operator! The cool thing is that you can
implement your own monads by implementing >>= and return. And if
you do so those monads will also be able to use the do notation,
which means you can basically write your own little languages by just
defining two functions!
A monad is essentially deferred processing. If you are trying to write code that has side effects (e.g. I/O) in a language that does not permit them, and only allows pure computation, one dodge is to say, "Ok, I know you won't do side effects for me, but can you please compute what would happen if you did?"
It's sort of cheating.
Now, that explanation will help you understand the big picture intent of monads, but the devil is in the details. How exactly do you compute the consequences? Sometimes, it isn't pretty.
The best way to give someone used to imperative programming an overview of the "how" is to say that a monad puts you in a DSL: operations that look syntactically like the ones you are used to outside the monad are used instead to build up a function that would do what you want if you could (for example) write to an output file. Almost (but not really) as if you were building code in a string to be eval'd later.
You can think of a monad as a C# interface that classes have to implement. This is a pragmatic answer that ignores all the category theoretical math behind why you'd want to choose to have these declarations in your interface and ignores all the reasons why you'd want to have monads in a language that tries to avoid side effects, but I found it to be a good start as someone who understands (C#) interfaces.
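C# cannot express that interface precisely (it would require higher-kinded types, i.e. abstracting over the container itself), but an approximate sketch of what I mean looks like this (my formulation):

using System;

// Approximation only: the real constraint, that Bind must return the
// same kind of monad rather than just any IMonad, is not expressible
// in C#'s type system.
public interface IMonad<T>
{
    IMonad<U> Bind<U>(Func<T, IMonad<U>> f);
}

// Each implementation also needs a way to wrap a plain value,
// e.g. a static factory like: public static MyMonad<T> Return<T>(T value)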
See my answer to "What is a monad?"
It begins with a motivating example, works through the example, derives an example of a monad, and formally defines "monad".
It assumes no knowledge of functional programming and it uses pseudocode with function(argument) := expression syntax with the simplest possible expressions.
This C# program is an implementation of the pseudocode monad. (For reference: M is the type constructor, feed is the "bind" operation, and wrap is the "return" operation.)
using System.IO;
using System;

class Program
{
    // The monadic type: a value together with accumulated log messages.
    public class M<A>
    {
        public A val;
        public string messages;
    }

    // "feed" is the bind operation: apply f to the wrapped value and
    // concatenate the messages accumulated so far with the new ones.
    public static M<B> feed<A, B>(Func<A, M<B>> f, M<A> x)
    {
        M<B> m = f(x.val);
        m.messages = x.messages + m.messages;
        return m;
    }

    // "wrap" is the return operation: lift a plain value into M with an empty log.
    public static M<A> wrap<A>(A x)
    {
        M<A> m = new M<A>();
        m.val = x;
        m.messages = "";
        return m;
    }

    public class T {};
    public class U {};
    public class V {};

    public static M<U> g(V x)
    {
        M<U> m = new M<U>();
        m.messages = "called g.\n";
        return m;
    }

    public static M<T> f(U x)
    {
        M<T> m = new M<T>();
        m.messages = "called f.\n";
        return m;
    }

    static void Main()
    {
        V x = new V();
        M<T> m = feed<U, T>(f, feed(g, wrap<V>(x)));
        Console.Write(m.messages); // prints "called g." then "called f."
    }
}