Readability: a=b=c or a=c; b=c? - C#

I have a class which has a group of integers, say
class Foo
{
    int a;
    int b;
    int c;
    int d;
    ....
    string s;
}
Now the question is: for the best readability, should the init() method for Foo look like
void init()
{
    a = b = c = d = 1; // for some reason they are initialized to 1
    s = "abc";
}
or
void init()
{
    a = 1;
    b = 1;
    c = 1;
    d = 1;
    s = "abc";
}
?
The reason for the string in the class is to hint that other groups of the same kind might be present, and of course the class might grow as requirements change.
EDIT: before this question goes too far, the intention of this question was simple:
In Effective C++ Item 12 (prefer initialization to assignment in constructors), Scott Meyers uses a chained assignment instead of a=c; b=c;. I am sure he knows when to use what, but I also remember that the books I have read recommended declaring int a; int b; separately, which suggests separate statements in the similar case of assignments. In my program I have a similar situation: a group of related built-in types needs to be initialized, and I have found that a chained assignment does make it easier to read, especially if the class has many other instance variables of different types. This seems to contradict the books I read and my memory, hence the question.

I happen to prefer the chained version, but it's completely a matter of preference.
Please note, however, that
a = b = c = 0;
is equivalent to:
c = 0;
b = c;
a = b;
and not
a = 0;
b = 0;
c = 0;
(not that it should matter to you which assignment happens first)
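For plain variables, of course, both forms leave the same values behind; a quick sanity check:

int a, b, c;
a = b = c = 1;                               // parsed right-to-left: a = (b = (c = 1))
Console.WriteLine("{0} {1} {2}", a, b, c);   // prints: 1 1 1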

My personal preference is a=b=c=d for the following reasons:
It is concise, saves lines
It conveys the concept that (a/b/c/d) are initialized to the same thing, that they are related
However, caveat:
Don't do that if a/b/c/d are not related (and just happens to be initialized to 1). You'll reduce the readability of your code. Example:
a=c=1; // Foo-function related
b=d=1; // Bar-function related
Chaining assignments like this reduces your flexibility in the future to assign different initial values to the variables, because then you'll have to break the chain up again.
Nevertheless, my personal recommendation is to chain assignments on variables that are related in concept/usage. In actual practice, the need to change an assignment doesn't come up often, so caveat #2 should not typically pose a problem.
Edit: My recommendation may go against published guidelines. See the comments.

I guess it is a matter of opinion which is most readable. (Clearly so ... otherwise you wouldn't be asking.)
However, Oracle's "Code Conventions for the Java™ Programming Language" clearly says to use separate assignment statements:
10.4 Variable Assignments. "Avoid assigning several variables to the same value in a single statement. It is hard to read."
My opinion?
Follow your project's prescribed / agreed style rules, even if you don't like them [1].
If your project doesn't (yet) have prescribed / agreed style rules:
Try to persuade the other members to adopt the most widely used applicable style rules.
If you can't persuade them / come to a consensus, then just do this informally for the major chunks of code that you write for the project [1].
[1] ... or get out.

Related

Is it good practice to use the same Variable names for both method call and method signature parameters?

for example:
int a;
int b;
int value = getValue(a,b);
private int getValue(int a, int b)
{
    int value = a + b;
    return value;
}
Is the above practical, or is it considered bad practice that would cause problems later in development?
I realize that it's a contrived example to demonstrate what you're asking, but your example does contain a naming problem which I'll point out:
int a; // <---- right here
int b; // <---- and here
int value = getValue(a,b); // <--- and a little here
private int getValue(int a, int b)
{
    int value = a + b;
    return value;
}
The problem isn't in whether or not the variable names match or don't match what they're called in the method. The problem is that the variable names aren't called anything meaningful. This is considerably more of an issue than what you're asking.
Let's re-factor your method to make the example slightly less contrived...
int a;
int b;
int value = GetSum(a,b);
private int GetSum(int firstValue, int secondValue)
{
    return firstValue + secondValue;
}
The method is a bit cleaner now and more intuitively reflects its purpose. Now we re-ask the question... Should a and b be renamed to match the ones in the method?
Most likely not. The names in the method have been changed to indicate their context. The method is getting a sum of two values, the first one and the second one. So what is the context of a and b? Are they also known only as the first one and the second one? Or do they convey some other meaning that's not readily available? Something like:
int milesToFirstDestination;
int milesToSecondDestination;
or:
int heightOfPersonInInches;
int heightOfStepstoolInInches;
or any other example of two values which would need to be summed for some purpose. If we added that context to the variable names then we most certainly wouldn't want to add it to the method. The method should be as general-purpose as possible, performing a single task without any concern outside of that task.
In short, it's neither good nor bad practice, because it's not something to even consider. There may be times where, by coincidence alone, the names are the same. (This can often happen with private helper methods, for example.) But they're not the same as a result of a standard or practice to be followed, but rather as a result of coincidentally having the same meaning.
Do you mean is it a good practice to always name a variable used as the argument of a method (at the call site) in the same way as the parameter of the method in the method signature? (Your example is unclear, wouldn't compile, and doesn't contain any parameters...)
No - you absolutely don't need to do that. In many cases you're calling a general purpose method which has no clue about your context - but you should name your variables in your calling method in a way which is meaningful in that context.
Only in very limited circumstances. Consider, for instance, when you want the minimum of two quantities. In the calling code, you'll know what those two quantities are. But in a general Min(a,b) method, it doesn't know or care about what those quantities mean.
If it were generally true, then each variable name could only be used once in each program. You would no longer need parameters to be passed to methods, and every variable would be global (assuming single-threaded code).
We try not to write programs like that any more. For starters, it makes writing recursive code a lot less understandable.
is it good practice to use the same names for both method call and method signature parameters?
I would not have a rule that says you must or should always do this. First, it presents practical problems. Imagine:
private int Square(int n) { return n * n; }
What are you going to do here:
int a = 3;
int b = 4;
int cSquared = Square(a) + Square(b);
It's not possible, and it doesn't even make any sense, to give both a and b the same name, and to use the name n. What makes more sense is to give the parameters names that make sense in the context they are being used. So here, thinking of the Pythagorean theorem as a^2 + b^2 = c^2, we would use a and b as the local variable names. But in a different context, another name might make more sense. For example:
int length = 17;
int areaOfSquare = Square(length);
Again, it makes more sense to use a name that makes sense in the context where the method is being called. Not to use the same name in every context.

Definition/Initialization of Global Variables: Common Practice? (Geared towards C#)

I posted a similar question a couple of days ago, but that one was geared more towards *.Designer.cs files. This question is about the declaration and initialization of global variables within a class. To my knowledge, it's almost common practice (aside from the *.Designer.cs files, it seems) to place all global variables at the beginning of a class definition, followed by the rest of the code in any order (I prefer getters and setters, then constructors, then events, then misc functions). Well, I've seen it done, and have done it myself, where a global variable is set at declaration.
And I'm not referring to:
ClassA clA = new ClassA();
I'm referring to:
private int nViewMode = (int)Constants.ViewMode.Default;
Now, I've heard people say (and I can agree with it on some levels) that the initialization of such variables, i.e. those that don't require a new statement when declared, should be done in constructors or initialization functions. However, when they said that, they may have meant that the previous statements were fine, but not the following:
Wrong Way
private int nTotal = 100;
private int nCount = 10;
private int nDifference = nTotal - nCount;
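// (note: this initializer won't even compile in C#; an instance field initializer cannot reference another instance field)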
Possible Right Way
private int nTotal = 100;
private int nCount = 10;
private int nDifference = 0;
ClassConstructor() // the class's constructor
{
    nDifference = nTotal - nCount;
}
My questions are:
What is the most common/standard practice in such a situation?
What are the pros and cons of either?
Are these questions only relevant for some languages and not others?
My last question occurred to me as I was typing this up, and here's the reason. In Visual Studio 2008 it seems I can place breakpoints on global variable declarations, while I don't think I could when I wrote C++ in college. Also, I believe that in college you couldn't initialize a variable using one declared immediately before it, but then again, that was in C++. So I'm not sure if these questions are only valid for MSVS products (we used Borland in college), newer compilers, or what not. If anyone has any insight, it's appreciated. Thanks.
I believe this has been covered many times before, but there really isn't an answer other than: Whatever you do end up doing, make sure it's consistent with the code in other places in your project.
I personally prefer initializing default values outside of the constructor, unless they are calculated differently based on which constructor is used. That way, if another constructor comes along, there is no need to repeat the initialization code.
In the case of nDifference, perhaps a Property that encapsulates the logic would make more sense so:
If it doesn't get used, nDifference doesn't need to get calculated every time a new instance of the class is created.
It indicates the logic for nDifference should always be the same, regardless of which constructor is used.
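A minimal sketch of that property suggestion, reusing the question's names (the containing class is left out):

private int nTotal = 100;
private int nCount = 10;

// Computed on demand: never stale, and no constructor code needed.
private int nDifference
{
    get { return nTotal - nCount; }
}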
The C# language specification guarantees that field initializers run in the textual order in which they appear in the class declaration. This means that it is perfectly OK to have complex expressions in the variable initializers of static field declarations. (Instance field initializers, on the other hand, cannot reference other instance fields.)
If the initial value of a field depends on a previous value, then they should probably be kept together to avoid accidental re-ordering.
class Demo1
{
    static int x = y + 10; // x == 10
    static int y = 5;
}

class Demo2
{
    static int y = 5;
    static int x = y + 10; // x == 15
}
As others have stated, I would prefer to have initializers common to all instances (regardless of the chosen constructor) to occur in the declarations.
This order-sensitivity applies only to static variable initializers. Constant initialization occurs at compile time, and the values are calculated in an order that ensures they are initialized correctly (circular references among constants, unlike variables, are not allowed).
class Demo3
{
    const int x = y + 10; // Evaluated second. x == 15
    const int y = 5;      // Evaluated first.
}
You should really consider whether a calculated value needs to be stored at all, since in many cases it can be calculated at the time it is used.
class Demo4
{
    int y = 5;
    int x { get { return y + 10; } }
}
Personally, I like having the ability to initialize member variables where I declare them in C#, particularly if the only reason you were going to write an explicit constructor was to initialize them.
In older C# dialects (we're still on 2.0 where I work), there's a consistency argument if you're populating a member Dictionary or something similar in a constructor, since the collection-initializer syntax didn't show up until later. In that case, you could argue that you want to keep all of your initialization together. Likewise, if you're initializing some members based on constructor arguments, maybe it makes more sense to keep all the initialization together rather than assigning some members where they're declared and others in the constructor. But if you have more than one constructor and you don't repeat yourself, you're going to end up with some initialization in one place and the rest in another anyway, so you're probably better off assigning things where you declare them.
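To illustrate the contrast (a sketch; the field and constructor names are made up, and System.Collections.Generic is assumed):

// C# 3.0 and later: populate the field at its declaration.
private Dictionary<string, int> lookup = new Dictionary<string, int>
{
    { "alice", 1 },
    { "bob", 2 }
};

// C# 2.0: the same setup has to live in a constructor.
private Dictionary<string, int> lookup2 = new Dictionary<string, int>();

public MyType()
{
    lookup2.Add("alice", 1);
    lookup2.Add("bob", 2);
}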
I prefer to initialise all fields in the constructor rather than at point of declaration. (The only exception I make to this is for static fields where I find the addition of a static constructor overkill.) My reasons are that I like to have all of the construction logic in one place and secondly to avoid the debugger jumping around confusingly when stepping through the code. However, this is just my preference and you are free to come up with whatever convention you are most comfortable with.
As others have already stated, think your convention through carefully and apply it consistently.

Closures for anonymous methods in C# 3.0

Why do closures exist for anonymous methods? Why not just pass state into the method, without the overhead of generating a new class with the closure variables copied in? Isn't this just a throwback to "making everything global?"
Someone talk me down, I feel like I'm missing something here...
Purely, convenience... you don't know how much state you are going to need when defining, for example, a Predicate<T> - consider:
List<int> data = new List<int> {1,2,3,4,5,6,7,8,9,10};
int min = int.Parse(Console.ReadLine()), max = int.Parse(Console.ReadLine());
List<int> filtered = data.FindAll(i => i >= min && i <= max);
Here we've passed two additional bits of state into the Predicate<T> (min and max), but we can't define List<T>.FindAll(Predicate<T>) to know about them, as that is a caller detail.
The alternative is to write the classes ourselves, but that is hard, even if we are lazy:
class Finder
{
    public int min, max;
    public bool CheckItem(int i) { return i >= min && i <= max; }
}
...
Finder finder = new Finder();
finder.min = int.Parse(Console.ReadLine());
finder.max = int.Parse(Console.ReadLine());
List<int> filtered = data.FindAll(finder.CheckItem);
I don't know about you, but I prefer the version with the closure... especially when you consider how complex it gets when you have multiple contextual levels. I want the compiler to worry about it.
Consider also how often you use such constructs, especially for things like LINQ: you wouldn't want to have to do it any other way...
The overhead of creating a new class probably isn't something to worry about. It's just a convenient way for the compiler to make bound variables (those captured by the closure) available on the heap. They are still only accessible within the scope of the closure, so they're not at all "global" in the traditional sense.
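A small demonstration that captured variables are shared rather than copied (which is why they have to live on the heap):

int count = 0;
Action bump = () => count++;   // captures the variable 'count' itself, not its value
bump();
bump();
Console.WriteLine(count);      // prints 2: the local and the lambda share one variable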
I believe the reason for their existence is mainly convenience for the programmer. In fact, it's purely that as far as I'm concerned. You could emulate the behaviour of a closure pretty well before they existed in C#, but you don't get any of the simplicity and syntactical sugar that C# 3.0 offers. The entire point about closures is that you don't need to pass the variables in the parent scope to the function, because they're automatically bound. It's much easier and cleaner for the programmer to work with, really, if you consider that the alternative is true global variables.
At least one thought is that closures exist in other languages such as JavaScript. So they probably included closures to be in line with people's prior experience with anonymous methods.
Because you can take this:

var divisor = ...;
for (int x = 0; x < 10000000; x++)
{
    source[x] = source[x] / divisor;
}

and easily turn it into this (the lambda silently captures both source and divisor):

var divisor = ...;
Parallel.For(0, 10000000, x =>
{
    source[x] = source[x] / divisor;
});
Some methods require a specific signature. For example:
public void set_name_on_click(string name)
{
    button.Click += (s, e) => { button.Text = name; };
}
A full closure solves this very neatly. You really don't want to mess with the anonymous methods signature.
Because the compiler takes care of it: the easiest way to "pass state into the method" is to auto-generate a class which encapsulates that state.
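To make that concrete, here is roughly the class the compiler generates for the button example above (all names here are invented; the real type gets a mangled name like <>c__DisplayClass0):

class NameClosure
{
    public MyForm self;    // the lambda reads 'button' through 'this', so 'this' is captured
    public string name;    // the captured parameter

    public void Handler(object s, EventArgs e)
    {
        self.button.Text = name;
    }
}

// set_name_on_click then conceptually becomes:
//   var c = new NameClosure { self = this, name = name };
//   button.Click += c.Handler;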

Why doesn't C# support local static variables like C does? [closed]

Why doesn't C# have local static variables like C? I miss that!!
Because they screwed up, and left out a useful feature to suit themselves.
All the arguments about how you should code, and what's smart, and you should reconsider your way of life, are pompous defensive excuses.
Sure, C# is pure, and whatchamacallit-oriented. That's why they auto-generate persistent locals for lambda functions. It's all so complicated. I feel so dumb.
Loop-scope statics are useful and important in many cases.
The short, real answer is that you have to move local statics into class scope and live with class-namespace pollution in C#. Take your complaint to city hall.
The MSDN blog entry from 2004, "Why doesn't C# support static method variables?", deals with the exact question asked in the original post:
There are two reasons C# doesn't have this feature.
First, it is possible to get nearly the same effect by having a
class-level static, and adding method statics would require increased
complexity.
Second, method level statics are somewhat notorious for causing
problems when code is called repeatedly or from multiple threads, and
since the definitions are in the methods, it's harder to find the
definitions.
[author: Eric Gunnerson]
(The same blog entry is in Microsoft's own archive. Archive.org preserved the comments; Microsoft's archive didn't.)
State is generally part of an object or part of a type, not part of a method. (The exception being captured variables, of course.)
If you want the equivalent of a local static variable, either create an instance variable or a static variable - and consider whether the method itself should actually be part of a different type with that state.
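For example, the classic C idiom of a counter held in a function-local static maps to a class-level static field next to the method that uses it. A minimal sketch (the IdGenerator and NextId names are made up):

class IdGenerator
{
    // In C this would be 'static int id' inside the function body.
    private static int id = 0;

    public static int NextId()
    {
        return ++id;
    }
}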
I'm not nearly as familiar with C as I am C#, but I believe you can accomplish everything you could with a local static, by using a class level static that is only used for one method. Obviously, this comes with some syntactic change, but I believe you can get whatever functionality you need.
Additionally, Eric Lippert answers questions like this on his blog a lot. Generally answered in this way: "I am asked "why doesn't C# implement feature X?" all the time. The answer is always the same: because no one ever designed, specified, implemented, tested, documented and shipped that feature." Essentially his answers generally boil down to, it costs money to add any feature, and therefore, many potential features are not implemented because they have not come out on the positive side of the cost benefit analysis.
So you want to use a static local variable in your method? Congratulations! You made another step towards becoming a real programmer.
Don't listen to all the people telling you that static locals are not "clean", that they impede "readability" and could lead to subtle and hard-to-find "bugs". Nonsense! They just say that because they are wannabe programmers! Lots of them are probably even toying around with an esoteric functional programming language during their free-time. Can you believe it? What a bunch of hipsters!
Real programmers embrace a paradigm I like to call SDD - Side effect Driven Design. Here are some of its most important laws:
Don't be predictable! Never return the same thing from a method twice - even if it's being called with the exact same arguments!
Screw purity - let's get dirty! State, by nature, craves changing, because it is an insatiable monoid in the category of polyamorous endofunctors, i.e. it likes to be touched by as many collaborators as possible. Never miss out on an opportunity to do it the favor!
Among the tools used to code in a side-effect-driven manner are, of course, static local variables. However, as you noticed, C# does not support them. Why? Because over the last two decades Microsoft has been infiltrated by so-called Clean Coders who favor maintainability over flexibility and control. Can you even remember the last time you saw our beloved blue screen? Now guess whose fault that is!
But fear not! Real developers don't have to suffer from those poor design decisions. As has been mentioned before it is possible to have local variables that are kind of static with the help of lambdas.
However, the provided solution wasn't quite satisfactory. Using the previous answer our almost-SDD-compliant code would look something like this:
var inc = Increment();
var zero = inc();
var one = inc();
or
var zero = Increment()();
But that's just silly. Even a wannabe developer can see that Increment() is not a normal method and will get suspicious. A real programmer, on the other hand, can make it even more SDD-like. He or she knows that we can make a property or field look like a method by giving it the type Func<T>! We just have to initialize it by executing a lambda that in turn initializes the counter and returns another lambda incrementing the captured counter!
Here it is in proper SDD code:
public Func<int> Increment = new Func<Func<int>>(() =>
{
    var num = 0;
    return () => num++;
}).Invoke();
(You think the above kinda looks like an IIFE? Yes, you are right and you should be ashamed of yourself.)
Now every time you call Increment() it will return something different:
var zero = Increment();
var one = Increment();
Of course you also can make it so that the counter survives the lifetime of your instance.
That'll show them wannabe programmers!
C# is a component-oriented language and doesn't have the concept of variables outside the scope of a class or method. Variables within a method cannot be declared static either, as you may be accustomed to doing in C. However, you can always use a class-level static variable as a substitute.
As a general practice, there are usually ways to solve programming problems in C# without resorting to using method-level statics. State is generally something you should design into classes and types, not methods.
Logically, yes. It would be the same as a class-level static member that was only used in that one method. However, a method-level static member would be more encapsulated. If the data stored in a member is only meant to be used by a single method, it should only be accessible by that single method.
However, you CAN achieve almost exactly the same effect in C# by creating a nested class.
Because static local variables are tied to the method, and the method is shared amongst all instances.
I've had to correct myself and other programmers who expect it to be unique per class instance using the method.
However, if you make it a static class, or static instance of a class, it's syntactically clear whether there's an instance per container-class, or one instance at all.
If you don't use these, it becomes easier to refactor later as well.
I think the need for local statics is just as easily solved by creating public static fields on the class. Very little logical change, don't you think?
If you think it would be a big logical change, I'd be interested to hear how.
class MyClass
{
    public static float MaxDepthInches = 3;

    private float CurrentFingerDepth; // assumed field; the original snippet left it undeclared

    private void PickNose()
    {
        if (CurrentFingerDepth < MyClass.MaxDepthInches)
        {
            CurrentFingerDepth++;
        }
    }
}
You can use a nested class as a workaround for this. Since C# limits the scope of static variables to classes, you can use a nested class as a narrower scope.
For example:
public class Foo
{
    public int Increment()
    {
        return IncrementInternal.Increment();
    }

    private static class IncrementInternal
    {
        private static int counter = 0;

        public static int Increment()
        {
            return counter++;
        }
    }
}
Here Foo exposes the Increment method, but supports it via the private nested class IncrementInternal, which holds the static variable as a member. And of course, counter is not visible in the context (other methods) of Foo.
BTW, if you want access to the Foo context (other members and methods) inside IncrementInternal.Increment, you can pass this as a parameter to IncrementInternal.Increment when you call it from Foo.
To keep the scope as small as possible, my suggestion is to create a nested class for each such method. And because the need is probably not so common, the number of nested classes should stay small enough to remain maintainable.
I think it is cleaner than anonymous functions or IIFE.
You can see a live demo here.
I don't see much added benefit to local statics: if you keep your classes single-purpose and small, there is little problem with the global static pollution that the naysayers like to complain about. But here is just one other alternative.
using System;
using System.Collections;

public class Program
{
    delegate bool DoWork();

    public static void Main()
    {
        DoWork work = Foo().GetEnumerator().MoveNext;
        work();
        work();
        work();
    }

    public static IEnumerable Foo()
    {
        int static_x = 10;
        /*
            do some other static stuff....
        */
    main:
        // repetitive housework
        Console.WriteLine(static_x);
        static_x++;
        yield return true;
        goto main;
    }
}
If you can imagine some sort of Lippert/Farnsworth hybrid entity announcing GOOD NEWS EVERYONE!, C# 6.0 allows the using static statement. This effectively allows you to import static class methods (and, it seems, properties and members as well) into the global scope.
In short, you can do something like this:
using NUnit.Framework;
using static Fizz.Buzz;

class Program
{
    [Test]
    public void Main()
    {
        Method();
        int z = Z;
        object y = Y;
        Y = new object();
    }
}

namespace Fizz
{
    class Buzz
    {
        public static void Method()
        {
        }

        public static int Z;
        public static object Y { get; set; }
    }
}
While this is only available in C# 6.0, from what I understand the generated assemblies should be compatible with previous .NET platforms (correct me if I'm wrong).
You can simulate it using a delegate... Here is my sample code:
public Func<int> Increment()
{
    int num = 0;
    return new Func<int>(() =>
    {
        return num++;
    });
}
You can call it like this:
Func<int> inc = Increment();
inc();

In C#, What is a monad?

There is a lot of talk about monads these days. I have read a few articles / blog posts, but I can't go far enough with their examples to fully grasp the concept. The reason is that monads are a functional language concept, and thus the examples are in languages I haven't worked with (since I haven't used a functional language in depth). I can't grasp the syntax deeply enough to follow the articles fully ... but I can tell there's something worth understanding there.
However, I know C# pretty well, including lambda expressions and other functional features. I know C# only has a subset of functional features, and so maybe monads can't be expressed in C#.
However, surely it is possible to convey the concept? At least I hope so. Maybe you can present a C# example as a foundation, and then describe what a C# developer would wish he could do from there but can't because the language lacks functional programming features. This would be fantastic, because it would convey the intent and benefits of monads. So here's my question: What is the best explanation you can give of monads to a C# 3 developer?
Thanks!
(EDIT: By the way, I know there are at least 3 "what is a monad" questions already on SO. However, I face the same problem with them ... so this question is needed imo, because of the C#-developer focus. Thanks.)
Most of what you do in programming all day is combining some functions together to build bigger functions from them. Usually you have not only functions in your toolbox but also other things like operators, variable assignments and the like, but generally your program combines together lots of "computations" to bigger computations that will be combined together further.
A monad is some way to do this "combining of computations".
Usually your most basic "operator" to combine two computations together is ;:
a; b
When you say this you mean "first do a, then do b". The result of a; b is basically again a computation that can be combined with more stuff.
This is a simple monad: it is a way of combining small computations into bigger ones. The ; says "do the thing on the left, then do the thing on the right".
Another thing that can be seen as a monad in object-oriented languages is the . operator. Often you find things like this:
a.b().c().d()
The . basically means "evaluate the computation on the left, and then call the method on the right on the result of that". It is another way to combine functions/computations, a little more complicated than ;. And the concept of chaining things together with . is a monad, since it's a way of combining two computations into a new computation.
Another fairly common monad, that has no special syntax, is this pattern:
rv = socket.bind(address, port);
if (rv == -1)
    return -1;

rv = socket.connect(...);
if (rv == -1)
    return -1;

rv = socket.send(...);
if (rv == -1)
    return -1;
A return value of -1 indicates failure, but there is no real way to abstract out this error checking, even if you have lots of API calls that you need to combine in this fashion. This is basically just another monad that combines the function calls by the rule "if the function on the left returned -1, return -1 ourselves; otherwise call the function on the right". If we had an operator >>= that did this, we could simply write:
socket.bind(...) >>= socket.connect(...) >>= socket.send(...)
It would make things more readable and help to abstract out our special way of combining functions, so that we don't need to repeat ourselves over and over again.
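A hedged C# sketch of that >>= idea for the "-1 means failure" convention (the Then name is invented for illustration):

using System;

static class ErrorChain
{
    // "Bind" for this convention: if the previous step already failed,
    // skip the next one and propagate the failure.
    public static int Then(this int rv, Func<int> next)
    {
        return rv == -1 ? -1 : next();
    }
}

// Usage, given int-returning calls:
//   int rv = socket.bind(address, port)
//       .Then(() => socket.connect(...))
//       .Then(() => socket.send(...));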
And there are many more ways to combine functions/computations that are useful as a general pattern and can be abstracted in a monad, enabling the user of the monad to write much more concise and clear code, since all the book-keeping and management of the used functions is done in the monad.
For example the above >>= could be extended to "do the error checking and then call the right side on the socket that we got as input", so that we don't need to explicitly specify socket lots of times:
new socket() >>= bind(...) >>= connect(...) >>= send(...);
The formal definition is a bit more complicated since you have to worry about how to get the result of one function as an input to the next one, if that function needs that input and since you want to make sure that the functions you combine fit into the way you try to combine them in your monad. But the basic concept is just that you formalize different ways to combine functions together.
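To make that formal definition concrete in C#, here is the smallest possible monad, the "identity" monad, which wraps a value and does no book-keeping at all (all names are invented; real monads put useful book-keeping where this one has none):

using System;

class Id<T>
{
    public T Value;
}

static class IdentityMonad
{
    // unit / "return": put a plain value into the monad
    public static Id<T> Wrap<T>(T value)
    {
        return new Id<T> { Value = value };
    }

    // bind / ">>=": feed the wrapped value to the next computation
    public static Id<U> Bind<T, U>(this Id<T> m, Func<T, Id<U>> next)
    {
        return next(m.Value);
    }
}

// Usage:
//   IdentityMonad.Wrap(1)
//       .Bind(x => IdentityMonad.Wrap(x + 2))
//       .Bind(x => IdentityMonad.Wrap(x * 10))
//       .Value   // == 30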
It has been a year since I posted this question. After posting it, I delved into Haskell for a couple of months. I enjoyed it tremendously, but I placed it aside just as I was ready to delve into Monads. I went back to work and focused on the technologies my project required.
And last night, I came and re-read these responses. Most importantly, I re-read the specific C# example in the text comments of the Brian Beckman video someone mentions above. It was so completely clear and illuminating that I’ve decided to post it directly here.
Because of this comment, not only do I feel like I understand exactly what Monads are … I realize I’ve actually written some things in C# that are Monads … or at least very close, and striving to solve the same problems.
So, here’s the comment – this is all a direct quote from the comment here by sylvan:
This is pretty cool. It's a bit abstract though. I can imagine people
who don't know what monads are already get confused due to the lack of
real examples.
So let me try to comply, and just to be really clear I'll do an
example in C#, even though it will look ugly. I'll add the equivalent
Haskell at the end and show you the cool Haskell syntactic sugar which
is where, IMO, monads really start getting useful.
Okay, so one of the easiest Monads is called the "Maybe monad" in
Haskell. In C# the Maybe type is called Nullable<T>. It's basically
a tiny class that just encapsulates the concept of a value that is
either valid and has a value, or is "null" and has no value.
A useful thing to stick inside a monad for combining values of this
type is the notion of failure. I.e. we want to be able to look at
multiple nullable values and return null as soon as any one of them
is null. This could be useful if you, for example, look up lots of
keys in a dictionary or something, and at the end you want to process
all of the results and combine them somehow, but if any of the keys
are not in the dictionary, you want to return null for the whole
thing. It would be tedious to manually have to check each lookup for
null and return, so we can hide this checking inside the bind
operator (which is sort of the point of monads, we hide book-keeping
in the bind operator which makes the code easier to use since we can
forget about the details).
Here's the program that motivates the whole thing (I'll define the
Bind later, this is just to show you why it's nice).
class Program
{
    static Nullable<int> f() { return 4; }
    static Nullable<int> g() { return 7; }
    static Nullable<int> h() { return 9; }

    static void Main(string[] args)
    {
        Nullable<int> z =
            f().Bind(fval =>
            g().Bind(gval =>
            h().Bind(hval =>
                new Nullable<int>(fval + gval + hval))));

        Console.WriteLine(
            "z = {0}", z.HasValue ? z.Value.ToString() : "null");
        Console.WriteLine("Press any key to continue...");
        Console.ReadKey();
    }
}
Now, ignore for a moment that there already is support for doing this
for Nullable in C# (you can add nullable ints together and you get
null if either is null). Let's pretend that there is no such feature,
and it's just a user-defined class with no special magic. The point is
that we can use the Bind function to bind a variable to the contents
of our Nullable value and then pretend that there's nothing strange
going on, and use them like normal ints and just add them together. We
wrap the result in a nullable at the end, and that nullable will
either be null (if any of f, g or h returns null) or it will be
the result of summing f, g, and h together. (This is analogous
to how we can bind a row in a database to a variable in LINQ, and do
stuff with it, safe in the knowledge that the Bind operator will
make sure that the variable will only ever be passed valid row
values.)
You can play with this and change any of f, g, and h to return
null and you will see that the whole thing will return null.
So clearly the bind operator has to do this checking for us, and bail
out returning null if it encounters a null value, and otherwise pass
along the value inside the Nullable structure into the lambda.
Here's the Bind operator:
public static Nullable<B> Bind<A, B>(this Nullable<A> a, Func<A, Nullable<B>> f)
    where B : struct
    where A : struct
{
    return a.HasValue ? f(a.Value) : null;
}
The types here are just like in the video. It takes an M a
(Nullable<A> in C# syntax for this case), and a function from a to
M b (Func<A, Nullable<B>> in C# syntax), and it returns an M b
(Nullable<B>).
The code simply checks if the nullable contains a value and if so
extracts it and passes it onto the function, else it just returns
null. This means that the Bind operator will handle all the
null-checking logic for us. If and only if the value that we call
Bind on is non-null then that value will be "passed along" to the
lambda function, else we bail out early and the whole expression is
null. This allows the code that we write using the monad to be
entirely free of this null-checking behaviour, we just use Bind and
get a variable bound to the value inside the monadic value (fval,
gval and hval in the example code) and we can use them safe in the
knowledge that Bind will take care of checking them for null before
passing them along.
There are other examples of things you can do with a monad. For
example you can make the Bind operator take care of an input stream
of characters, and use it to write parser combinators. Each parser
combinator can then be completely oblivious to things like
back-tracking, parser failures etc., and just combine smaller parsers
together as if things would never go wrong, safe in the knowledge that
a clever implementation of Bind sorts out all the logic behind the
difficult bits. Then later on maybe someone adds logging to the monad,
but the code using the monad doesn't change, because all the magic
happens in the definition of the Bind operator, the rest of the code
is unchanged.
Finally, here's the implementation of the same code in Haskell (--
begins a comment line).
-- Here's the data type, it's either nothing, or "Just" a value
-- this is in the standard library
data Maybe a = Nothing | Just a
-- The bind operator for Nothing
Nothing >>= f = Nothing
-- The bind operator for Just x
Just x >>= f = f x
-- the "unit", called "return"
return = Just
-- The sample code using the lambda syntax
-- that Brian showed
z = f >>= ( \fval ->
g >>= ( \gval ->
h >>= ( \hval -> return (fval+gval+hval ) ) ) )
-- The following is exactly the same as the three lines above
z2 = do
fval <- f
gval <- g
hval <- h
return (fval+gval+hval)
As you can see the nice do notation at the end makes it look like
straight imperative code. And indeed this is by design. Monads can be
used to encapsulate all the useful stuff in imperative programming
(mutable state, IO etc.) and used using this nice imperative-like
syntax, but behind the curtains, it's all just monads and a clever
implementation of the bind operator! The cool thing is that you can
implement your own monads by implementing >>= and return. And if
you do so those monads will also be able to use the do notation,
which means you can basically write your own little languages by just
defining two functions!
A monad is essentially deferred processing. If you are trying to write code that has side effects (e.g. I/O) in a language that does not permit them, and only allows pure computation, one dodge is to say, "Ok, I know you won't do side effects for me, but can you please compute what would happen if you did?"
It's sort of cheating.
Now, that explanation will help you understand the big picture intent of monads, but the devil is in the details. How exactly do you compute the consequences? Sometimes, it isn't pretty.
The best way to give an overview of the "how" to someone used to imperative programming is to say that a monad puts you in a DSL where operations that look syntactically like what you are used to outside the monad are instead used to build a function that would do what you want if you could (for example) write to an output file. Almost (but not really) as if you were building code in a string to later be eval'd.
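A tiny sketch of that "compute what would happen" idea: represent an effect as a deferred Action, compose descriptions of effects, and only run them at the end.

using System;

class Deferred
{
    static void Main()
    {
        // Build descriptions of effects; nothing prints here.
        Func<string, Action> putLine = s => () => Console.WriteLine(s);
        Action program = putLine("hello") + putLine("world"); // compose two effects

        // Only now do the side effects actually happen.
        program();
    }
}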
You can think of a monad as a C# interface that classes have to implement. This is a pragmatic answer that ignores all the category theoretical math behind why you'd want to choose to have these declarations in your interface and ignores all the reasons why you'd want to have monads in a language that tries to avoid side effects, but I found it to be a good start as someone who understands (C#) interfaces.
See my answer to "What is a monad?"
It begins with a motivating example, works through the example, derives an example of a monad, and formally defines "monad".
It assumes no knowledge of functional programming and it uses pseudocode with function(argument) := expression syntax with the simplest possible expressions.
This C# program is an implementation of the pseudocode monad. (For reference: M is the type constructor, feed is the "bind" operation, and wrap is the "return" operation.)
using System.IO;
using System;

class Program
{
    public class M<A>
    {
        public A val;
        public string messages;
    }

    public static M<B> feed<A, B>(Func<A, M<B>> f, M<A> x)
    {
        M<B> m = f(x.val);
        m.messages = x.messages + m.messages;
        return m;
    }

    public static M<A> wrap<A>(A x)
    {
        M<A> m = new M<A>();
        m.val = x;
        m.messages = "";
        return m;
    }

    public class T { };
    public class U { };
    public class V { };

    public static M<U> g(V x)
    {
        M<U> m = new M<U>();
        m.messages = "called g.\n";
        return m;
    }

    public static M<T> f(U x)
    {
        M<T> m = new M<T>();
        m.messages = "called f.\n";
        return m;
    }

    static void Main()
    {
        V x = new V();
        M<T> m = feed<U, T>(f, feed(g, wrap<V>(x)));
        Console.Write(m.messages);
    }
}
