What's the difference between functors and "generics" - c#

I'm looking at OCaml's functors. They look pretty much identical to the so-called generic objects in C++/C#/Java. If you ignore Java's type erasure for now, and ignore the implementation details of C++ templates (I'm interested in the language feature), functors are quite identical to generics.
If I understand it correctly, a functor gives you a new set of functions from a type you provide, so that for example
List<MyClass>.GetType() != List<MyOtherClass>.GetType()
But you could roughly rewrite OCaml's
module Set =
  functor (Elt: ORDERED_TYPE) ->
    struct
      type element = Elt.t
      type set = element list
      let empty = []
      let rec add x s =
        match s with
          [] -> [x]
        | hd::tl ->
            match Elt.compare x hd with
              Equal   -> s         (* x is already in s *)
            | Less    -> x :: s    (* x is smaller than all elements of s *)
            | Greater -> hd :: add x tl
      let rec member x s =
        match s with
          [] -> false
        | hd::tl ->
            match Elt.compare x hd with
              Equal   -> true      (* x belongs to s *)
            | Less    -> false     (* x is smaller than all elements of s *)
            | Greater -> member x tl
    end;;
into C#
class Set<T> where T : ISortable
{
    private List<T> l = new List<T>();
    static public Set<T> empty = new Set<T>();
    public bool IsMember(T x) { return l.IndexOf(x) > -1; }
    public void Add(T x) { l.Add(x); }
}
Sure, there's a slight difference, since a functor affects a module (which is just a bunch of type, function and value definitions, similar to C#'s namespace).
But is it just it? Are functors merely generics applied to namespaces? Or is there any significant difference between functors and generics which I'm missing?
Even if functors are just generics-for-namespace, what's the significant advantage of that approach? Classes can also be used as ad-hoc namespaces using nested classes.

But is it just it? Are functors merely generics applied to namespaces?
Yes, I think one can treat functors as "namespaces with generics", and that by itself would be very welcome in C++, where the only option left is to use classes with all static members, which becomes pretty ugly soon. Compared to C++ templates, one huge advantage is the explicit signature on module parameters (this is what I believe C++0x concepts could become, but oops).
Also modules are quite different from namespaces (consider multiple structural signatures, abstract and private types).
Even if functors are just generics-for-namespace, what's the significant advantage of that approach? Classes can also be used as ad-hoc namespaces using nested classes.
Not sure whether it qualifies for significant, but namespaces can be opened, while class usage is explicitly qualified.
All in all, I think there is no obvious "significant advantage" of functors alone; it is just a different approach to code modularization (ML style), and it fits nicely with the core language. I'm not sure whether comparing the module system in isolation from the language makes much sense.
PS: C++ templates and C# generics are also quite different, so comparing against them as a whole feels a little strange.

If I understand it correctly, functor gives you a new set of functions from a type you provide
More generally, functors map modules to modules. Your Set example maps a module adhering to the ORDERED_TYPE signature to a module implementing a set. The ORDERED_TYPE signature requires a type and a comparison function. Therefore, your C# is not equivalent because it parameterizes the set only over the type and not over the comparison function. So your C# can only implement one set type for each element type whereas the functor can implement many set modules for each element module, e.g. in ascending and descending order.
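The point about parameterizing over the comparison as well as the type can be sketched in C++, where a template argument can be a small struct that bundles both. This is only an approximation of a functor, not an equivalent: the names IntAscending, IntDescending and OrderedSet are made up for illustration, and unlike ORDERED_TYPE the requirements on Elt are not written down in any signature.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical "ordered type" policies: each bundles an element type
// with a comparison, loosely like a module matching ORDERED_TYPE.
struct IntAscending {
    using type = int;
    static bool less(int a, int b) { return a < b; }
};

struct IntDescending {
    using type = int;
    static bool less(int a, int b) { return a > b; }
};

// Plays the role of the functor: maps an ordered-type policy to a set type.
template <typename Elt>
class OrderedSet {
public:
    using element = typename Elt::type;

    // Insert x, keeping elements sorted by Elt::less and free of duplicates.
    void add(const element& x) {
        auto pos = std::lower_bound(items_.begin(), items_.end(), x, Elt::less);
        if (pos == items_.end() || Elt::less(x, *pos)) {
            items_.insert(pos, x);
        }
    }

    // True if an element equivalent to x (under Elt::less) is present.
    bool member(const element& x) const {
        auto pos = std::lower_bound(items_.begin(), items_.end(), x, Elt::less);
        return pos != items_.end() && !Elt::less(x, *pos);
    }

    const std::vector<element>& elements() const { return items_; }

private:
    std::vector<element> items_;
};
```

OrderedSet<IntAscending> and OrderedSet<IntDescending> are two distinct set types over the same element type int, which is what the single-parameter C# Set<T> above cannot express.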
Even if functors are just generics-for-namespace, what's the significant advantage of that approach?
One major advantage of a higher-order module system is the ability to gradually refine interfaces. In OOP, everything is public or private (or sometimes protected or internal etc.). With modules, you can gradually refine module signatures at will giving more public access closer to the internals of a module and abstracting more and more of it away as you get further from that part of the code. I find that to be a considerable benefit.
Two examples where higher-order module systems shine compared to OOP are parameterizing data structure implementations over each other and building extensible graph libraries. See the section on "Structural abstraction" in Chris Okasaki's PhD thesis for examples of data structures parameterized over other data structures, e.g. a functor that converts a queue into a catenable list. See OCaml's excellent ocamlgraph library and the paper Designing a Generic Graph Library using ML Functors for an example of extensible and reusable graph algorithms using functors.
Classes can also be used as ad-hoc namespaces using nested classes.
In C#, you cannot parameterize classes over other classes. In C++, you can do some things like inheriting from a class passed in via a template.
Also, you can curry functors.

Functors in SML are generative, so the abstract types produced by an application of a functor at one point in a program are not the same as the abstract types produced by the same application (i.e. same functor, same argument) at another point.
For example, in:
structure IntMap1 = MakeMap(Int)
(* ... some other file in some other persons code: *)
structure IntMap2 = MakeMap(Int)
You can't take a map produced by a function in IntMap1 and use it with a function from IntMap2, because IntMap1.t is a different abstract type to IntMap2.t.
In practice this means if your library has a function producing an IntMap.t then you must also supply the IntMap structure as part of your library, and if the user of your library wants to use his own (or another library's) IntMap then he has to convert the values from your IntMap to his IntMap - even though they are already structurally equivalent.
The alternative is to make your library a functor itself, and require the user of the library to apply that functor with their choice of IntMap. This also requires the user of the library to do more work than is ideal, especially when your library uses not only IntMap but also other kinds of Map, various Sets, and others.
With generics, OTOH, it is quite easy to write a library producing a Map, and have that value work with other libraries' functions that take a Map.

I just found a source that may help you with your problem - as OCaml has a different meaning for functors:
http://books.google.de/books?id=lfTv3iU0p8sC&pg=PA160&lpg=PA160&dq=ocaml+functor&source=bl&ots=vu0sdIB3ja&sig=OhGGcBdaIUR-3-UU05W1VoXQPKc&hl=de&ei=u2e8SqqCNI7R-Qa43IHSCw&sa=X&oi=book_result&ct=result&resnum=9#v=onepage&q=ocaml%20functor&f=false
Still, I find it confusing if the same word is used for different concepts.
I don't know if OCaml has a different meaning - but normally a Functor is a "Function object" (see here: http://en.wikipedia.org/wiki/Function_object). This is totally different to generics (see here: http://en.wikipedia.org/wiki/Generic_programming)
A function object is an object that can be used as a function. Generics are a way to parametrize objects. Generics are kind of orthogonal to inheritance (which specializes objects). Generics introduce type safety and reduce the need for casting. Functors are an improved function pointer.

Related

How to avoid writing repetitive code for different numeric types in .NET

I am trying to write a generic Vector2 type which would suit float, double, etc. types and use arithmetic operations. Is there any chance to do it in C#, F#, Nemerle or any other more or less mature .NET language?
I need a solution with
(1) good performance (same as I would have writing separate Vector2Float, Vector2Double, etc. classes),
(2) which would allow code to look nice (I do not want to emit code for each class at run-time)
(3) and which would do as much compile-time checking as possible.
For reasons 1 and 3 I would not like to use dynamics. Right now I am checking F# and Nemerle.
UPD: I expect to have a lot of mathematical code for this type. However, I would prefer to put the code in extension methods if it is possible.
UPD2: 'etc' types include int(which I actually doubt I would use) and decimal(which I think I might use, but not now). Using extension methods is just a matter of taste - if there are good reasons not to, please tell.
As mentioned by Daniel, F# has a feature called statically resolved type parameters, which goes beyond what you can do with normal .NET generics in C#. The trick is that if you mark a function as inline, F# generates specialized code automatically (a bit like C++ templates), and then you can use more powerful features of the F# type system to write generic math.
For example, if you write a simple add function and make it inline:
let inline add x y = x + y;;
The type inference prints the following type:
val inline add :
x: ^a -> y: ^b -> ^c
when ( ^a or ^b) : (static member ( + ) : ^a * ^b -> ^c)
You can see that the inferred type is fairly complex - it specifies a member constraint that requires one of the two arguments to define a + member (and this is also supported by standard .NET types). The good thing is that this can be fully inferred, so you will rarely have to write the ugly type definitions.
As mentioned in the comments, I wrote an article Writing generic numeric code that goes into more details of how to do this in F#. I don't think this can be easily done in C# and the inline functions that you write in F# should only be called from F# (calling them from C# would essentially use dynamic). But you can definitely write your generic numerical computations in F#.
This more directly addresses your previous question. You can't put a static member constraint on a struct, but you can put it on a static Create method.
[<Struct>]
type Vector2D<'a> private (x: 'a, y: 'a) =
static member inline Create<'a when 'a : (static member (+) : 'a * 'a -> 'a)>(x, y) = Vector2D<'a>(x, y)
C# alone will not help you in achieving that, unfortunately. Emitting structs at run-time wouldn't help you much either since your program couldn't statically refer to them.
If you really can't afford to duplicate the code, then as far as I know, "offline" code generation is the only way to go about this. Instead of generating the code at runtime, use AssemblyBuilder and friends to create an on-disk assembly with your Vector2 types, or generate a string of C# code to be fed to the compiler. I believe some of the native library wrappers take this route (e.g. OpenTK, SharpDX). You can then use ilmerge if you want to merge those types into one of your hand-coded libraries.
I'm assuming you must be coming from a C++ background where this is easily achieved using templates. However, you should ask yourself whether you actually need Vector2 types based on integral, decimal and other "exotic" numeric types. You probably won't be able to parameterize the rest of your code based on a specific Vector2 either so the effort might not be worth it.
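For comparison, here is a minimal sketch of what the C++ template version alluded to above might look like. The names Vector2 and dot are illustrative, and the only assumption about T is that + and * are defined for it; nothing here is a .NET API.

```cpp
// Hypothetical generic 2D vector; T only needs + and * to be defined.
template <typename T>
struct Vector2 {
    T x;
    T y;
};

// Component-wise addition; instantiated separately for each T that is used.
template <typename T>
Vector2<T> operator+(const Vector2<T>& a, const Vector2<T>& b) {
    return Vector2<T>{a.x + b.x, a.y + b.y};
}

// Dot product of two vectors with the same component type.
template <typename T>
T dot(const Vector2<T>& a, const Vector2<T>& b) {
    return a.x * b.x + a.y * b.y;
}
```

Because each instantiation is compiled for its concrete T, Vector2<float> and Vector2<double> behave like hand-written separate types, which is the property the F# inline approach approximates on .NET.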
Look into inline functions and Statically Resolved Type Parameters.
As I understand it, you want strict typing at compile time, but you don't care what happens at run time.
Nemerle currently doesn't support this construction in the form you want.
But it supports macros and allows you to write DSLs that generate arbitrary code.
For instance, you can write a macro that analyzes this code and transforms it to the correct type.
def vec = vector { [1,2] };
Assuming we have or create a type VectorInt the code could be translated to
def vec = VectorInt(1,2);
Of course you can write any code inside and transform it to any code you want :)
Operators can be implemented as usual operators of the class.
Nemerle also allows you to define any operators like F#.
Make use of generics; this also makes the code type safe.
More info on generics: http://msdn.microsoft.com/en-us/library/512aeb7t.aspx
But you also have data structures such as List and Dictionary available.
It sounds like you want operator overloading; there are a lot of examples of this. There is not really a good way to allow only decimal, float and such. The only thing you can do is restrict to struct, but that's not exactly what you want.

Why there is no something like IMonad<T> in upcoming .NET 4.0

... with all those new (and not so new if we count IEnumerable) monad-related stuff?
interface IMonad<T>
{
SelectMany/Bind();
Return/Unit();
}
That would allow writing functions that operate on any monadic type. Or is it not so critical?
Think about what the signature for IMonad<T>'s methods would have to be. In Haskell the Monad typeclass is defined as
class Monad m where
  (>>=) :: m a -> (a -> m b) -> m b
  return :: a -> m a
It's tricky to translate this directly to a C# interface because you need to be able to reference the specific implementing subtype ("m a" or ISpecificMonad<a>) within the definition of the general IMonad interface. OK, instead of trying to have (for example) IEnumerable<T> implement IMonad<T> directly, we'll try factoring the IMonad implementation out into a separate object which can be passed, along with the specific monad type instance, to whatever needs to treat it as a monad (this is "dictionary-passing style"). This will be IMonad<TMonad> and TMonad here will be not the T in IEnumerable<T>, but IEnumerable<T> itself. But wait -- this can't work either, because the signature of Return<T> for example has to get us from any type T to a TMonad<T>, for any TMonad<>. IMonad would have to be defined as something like
interface IMonad<TMonad<>> {
    TMonad<T> Unit<T>(T x);
    TMonad<U> SelectMany<T, U>(TMonad<T> x, Func<T, TMonad<U>> f);
}
using a hypothetical C# feature that would allow us to use type constructors (like TMonad<>) as generic type parameters. But of course C# does not have this feature (higher-kinded polymorphism). You can reify type constructors at runtime (typeof(IEnumerable<>)) but can't refer to them in type signatures without giving them parameters. So besides the -100 points thing, implementing this "properly" would require not just adding another ordinary interface definition, but deep additions to the type system.
That's why the ability to have query comprehensions over your own types is kind of hacked on (they just "magically" work if the right magic method names with the right signatures are there) instead of using the interface mechanism etc.
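As a point of comparison, C++ template template parameters do let a type constructor be passed as a parameter, so the hypothetical IMonad<TMonad<>> above can be approximated there. The sketch below is illustrative only: Maybe, Monad, unit, bind and twice_incremented are invented names, and the "interface" is checked when a template is instantiated rather than declared up front as a C# interface would be.

```cpp
#include <vector>

// Minimal hypothetical option type used as one example of a type constructor.
template <typename T>
struct Maybe {
    bool has_value;
    T value;
};

// Monad "interface" indexed by a type constructor M, not by M<T>.
template <template <typename...> class M>
struct Monad;  // specialized once per type constructor

template <>
struct Monad<Maybe> {
    template <typename T>
    static Maybe<T> unit(T x) { return Maybe<T>{true, x}; }

    template <typename T, typename F>
    static auto bind(const Maybe<T>& m, F f) -> decltype(f(m.value)) {
        decltype(f(m.value)) empty{};  // value-initialized: has_value == false
        if (!m.has_value) return empty;
        return f(m.value);
    }
};

template <>
struct Monad<std::vector> {
    template <typename T>
    static std::vector<T> unit(T x) { return std::vector<T>{x}; }

    // Apply f to every element and concatenate the resulting vectors.
    template <typename T, typename F>
    static auto bind(const std::vector<T>& xs, F f) -> decltype(f(xs.front())) {
        decltype(f(xs.front())) out;
        for (const T& x : xs) {
            auto ys = f(x);
            out.insert(out.end(), ys.begin(), ys.end());
        }
        return out;
    }
};

// Written once, works for any M that has a Monad<M> specialization.
template <template <typename...> class M>
M<int> twice_incremented(const M<int>& m) {
    auto inc = [](int x) { return Monad<M>::unit(x + 1); };
    return Monad<M>::bind(Monad<M>::bind(m, inc), inc);
}
```

Calling twice_incremented<Maybe>(...) and twice_incremented<std::vector>(...) reuses the same function body for both monads, which is the kind of function the question was hoping an IMonad<T> interface would enable.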
Monads simply aren't important to .NET programmers. Without even knowing monads exist you can still build the LINQ framework. More importantly, it wouldn't look any different. It doesn't matter if you think in terms of monads (Haskell), expression tree rewriting (Lisp), set-based operations (SQL), or using map/reduce to create new types (Ruby, Python), the end result is going to be the same.
In fact, I would go so far as to say that monads are downright useless to .NET developers. Every time I see a library for .NET based on monads, it is invariably both more verbose and less comprehensible than straight C# or VB code. The reason is simple: languages like C# and VB are built on much, much more powerful building blocks than languages like Haskell.
Haskell in particular needs to use monads for everything because that is all they have. The same goes for macros in Lisp or dynamic typing in JavaScript. When you have a one-trick pony, that trick has to be pretty damn good.
On LINQ, Monads, and the Blindness of Power

What are first-class objects in Java and C#?

When I started OO programming many years ago I gained the impression that variables (if that is the right word) were either "primitives" (int, double, etc.) or first-class objects (String, JPane, etc.). This is reinforced by a recent answer on primitives in Java and C# (@Daniel Pryden: Are primitive types different in Java and C#?). However, I don't know whether C# ValueTypes are primitives, objects or some other beast such as second-class objects. I see that SO has only one use of the first-class tag, so maybe it is no longer a useful term.
I did not find the Wikipedia article useful ("This article is in need of attention from an expert on the subject."). I'd be grateful for a taxonomy and current usage of terms, primarily related to Java and C# (though maybe other languages will shed enlightenment).
Clarification: I'd like to understand the term first-class and what its range of use is.
The notion of "first-class citizen" or "first-class element" in a programming language was introduced by British computer scientist Christopher Strachey in the 1960s in the context of first-class functions. The most famous formulation of this principle is probably in Structure and Interpretation of Computer Programs by Harold Abelson and Gerald Jay Sussman:
They may be named by variables.
They may be passed as arguments to procedures.
They may be returned as the results of procedures.
They may be included in data structures.
Basically, it means that you can do with this programming language element everything that you can do with all other elements in the programming language.
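The four criteria above can be checked against function values in C++ with a short sketch. The names IntFn, apply_twice, make_adder and make_table are made up for illustration, and std::function is used only as one convenient way to hold a callable.

```cpp
#include <functional>
#include <map>
#include <string>

using IntFn = std::function<int(int)>;

// Passed as an argument to a procedure.
int apply_twice(const IntFn& f, int x) { return f(f(x)); }

// Returned as the result of a procedure.
IntFn make_adder(int n) {
    return [n](int x) { return x + n; };
}

// Included in a data structure.
std::map<std::string, IntFn> make_table() {
    std::map<std::string, IntFn> table;
    table["inc"] = make_adder(1);
    table["double"] = [](int x) { return x * 2; };
    return table;
}
```

The remaining criterion, being named by a variable, is just IntFn add_five = make_adder(5); after which add_five can be called or passed on like any other value.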
The problem is that "first class object" is not a well defined concept.
The normal usage is that someone says that an "object" is a class of thing that should have all of the properties X, Y and Z. But there are other things that don't have all of those properties, but they are sort of object-ish. So we'll call the former "first class" objects and the rest not "first class" ... and maybe not objects.
The problem is that there are any number of views on the properties that a thing needs to have to make it a "first class" object. And no prospect of the people with opposing views coming to a consensus. (For example, a Javascript language expert might argue strenuously that an object is only first class if it is template-based.)
The only really solid insights about "first-classness" will be those that you can glean from the respective language specifications for Java and C#. And they only really apply within the scope of the respective languages / type systems ... and not across multiple languages.
So "first class Java object" or "first class C# object" might be meaningful, but "first class object" taken out of context is not.
Well that's my opinion ...
In .NET you don't have primitive types vs classes. Instead, you have structs vs classes, but structs share many of the features of classes (such as the ability to have properties and methods), and inherit from the Object class as well.
When you write int in C#, for example, it is just a language shortcut for the Int32 struct. You can do for example int i=int.Parse("34"), or even string s=1234.ToString(). In order to assign struct instances to variables of type Object, there is the boxing/unboxing mechanism.
In Java, on the other hand, you do have the primitive types vs classes dichotomy. So for example to perform operations on a variable of type int, you must use the auxiliary Integer class. That's one of the things that I don't like about Java compared to .NET.
EDIT. When you read about "first-class objects" (or classes), it means "fully-powered objects", that is, classes that have the same capabilities as any other system classes or user-made classes. This is to distinguish from "limited primitive types".
For each primitive data type in Java, the core class library provides a wrapper class that represents it as a Java object. For example, the Integer class wraps the int data type, and the Double class wraps the double data type.
On the other hand, all primitive data types in C# are objects in the System namespace. For each data type, a short name, or alias, is provided. For instance, int is the short name for System.Int32 and double is the short form of System.Double.
The list of C# data types and their aliases is provided in the following table. As you can see, the first eight of these correspond to the primitive types available in Java. Note, however, that Java's boolean is called bool in C#.
From : http://msdn.microsoft.com/en-us/library/ms228360%28VS.80,lightweight%29.aspx
http://onjava.com/onjava/2003/05/21/delegates.html
In other words, C# methods are first-class objects because we can pass them to another method. We can use methods like any other values (strings, numbers, user-created objects).
Another example of first-class objects that you will find in C#, but which is uncommon in other languages, is Expressions.
Frankly, I have no idea of what a "first-class object" is...
But I first found usage of a similar idiom in Lua documentation and mailing list, saying that functions are first-class citizens, or first-class values.
I'll let one of the authors of Lua explain what it is: Programming in Lua : 6 - More about Functions
It means that, in Lua, a function is a value with the same rights as conventional values like numbers and strings. Functions can be stored in variables (both global and local) and in tables, can be passed as arguments, and can be returned by other functions.
Somehow, this definition applies to objects in Java: you can store them in variables, in arrays, use them as function parameters and return them, use them as key of HashMap and other collections, etc.
Not sure if that's how the term is used for objects, but at least it makes sense... :-)
In a language like C, objects have to be made from scratch, using some tricks (re-creating C++, somehow...), so they are not first-class: you have to pass pointers around to manipulate them.
When we talk about "first-class objects", by "objects" we mean concepts of the language, not the objects that we create in that language. That is why there are also terms like "first-class citizens".
So, for example, Java has the following concepts: Java-objects, Java-primitives, fields, methods and others (by Java-objects I mean anything that is an instance of the Object type). I'd say that in Java both Java-objects and Java-primitives are first-class citizens in the language.
In C# we have some additional concepts that we can "test" for first-class properties. For example, delegates. We can assign a delegate to a variable (give it a name), pass it to a method as an argument, return it from a method, and incorporate it in data structures (have a Dictionary of delegates, for example). So I think we can say that delegates are first-class objects in C#. You can continue for other concepts of C# - events, properties...
Functional languages have the concept of a "function", and of course it is a first-class citizen in any functional language. I'd say that we can call a language a functional language if it has "function" as a first-class concept (name, pass, return, incorporate...).
So, if a language brings in some concepts, we can "measure" the power of those concepts in the language itself.

C# generics compared to C++ templates [duplicate]

Possible Duplicate:
What are the differences between Generics in C# and Java… and Templates in C++?
What are the differences between C# generics compared to C++ templates? I understand that they do not solve exactly the same problem, so what are the pros and cons of both?
You can consider C++ templates to be an interpreted, functional programming language disguised as a generics system. If this doesn't scare you, it should :)
C# generics are very restricted; you can parameterize a class on a type or types, and use those types in methods. So, to take an example from MSDN, you could do:
public class Stack<T>
{
    T[] m_Items;
    public void Push(T item)
    {...}
    public T Pop()
    {...}
}
And now you can declare a Stack<int> or Stack<SomeObject> and it'll store objects of that type, safely (i.e., no worries about putting SomeOtherObject in by mistake).
Internally, the .NET runtime will specialize it into variants for fundamental types like int, and a variant for object types. This allows the representation for Stack<byte> to be much smaller than that of Stack<SomeObject>, for example.
C++ templates allow a similar use:
template<typename T>
class Stack
{
    T *m_Items;
public:
    void Push(const T &item)
    {...}
    T Pop()
    {...}
};
This looks similar at first glance, but there are a few important differences. First, instead of one variant for each fundamental type and one for all object types, there is one variant for each type it's instantiated against. That can be a lot of types!
The next major difference is (on most C++ compilers) it will be compiled in each translation unit it's used in. That can slow down compiles a lot.
Another interesting attribute of C++'s templates is that they can be applied to things other than classes - and when they are, their arguments can be automatically detected. For example:
template<typename T>
T min(const T &a, const T &b) {
    return a > b ? b : a;
}
The type T will be automatically determined by the context the function is used in.
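A small sketch of that deduction in action, using the same function body under the illustrative name smaller so that it cannot be confused with std::min:

```cpp
#include <string>

// Same shape as the min template above; the name "smaller" is only
// chosen to avoid any clash with std::min.
template <typename T>
T smaller(const T& a, const T& b) {
    return a > b ? b : a;
}
```

A call such as smaller(3, 5) deduces T as int, and smaller(std::string("pear"), std::string("apple")) deduces std::string. A mixed call like smaller(1, 2.5) does not compile, because the two arguments deduce different types for T; in that case T has to be given explicitly, as in smaller<double>(1, 2.5).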
These attributes can be used to good ends, at the expense of your sanity. Because a C++ template is recompiled for each type it's used against, and the implementation of a template is always available to the compiler, C++ can do very aggressive inlining on templates. Add to that the automatic detection of template values in functions, and you can make anonymous pseudo-functions in C++, using boost::lambda. Thus, an expression like:
_1 + _2 + _3
Produces an object with a seriously scary type, which has an operator() which adds up its arguments.
There are plenty of other dark corners of the C++ template system - it's an extremely powerful tool, but can be painful to think about, and sometimes hard to use - particularly when it gives you a twenty-page long error message. The C# system is much simpler - less powerful, but easier to understand and harder to abuse.
http://blogs.msdn.com/csharpfaq/archive/2004/03/12/88913.aspx
Roughly, much of the difference has to do with the fact that templates are resolved at compile-time, and generics are resolved at runtime.
Extensive answer on Stack Overflow: What are the differences between Generics in C# and Java... and Templates in C++?
This blog entry from Eric Gunnerson covers this topic quite well.
The biggest immediate difference is that templates are a compile time feature whereas generics are a runtime feature.
This looks like a handy reference.
http://msdn.microsoft.com/en-us/library/c6cyy67b.aspx

What problems does reflection solve?

I went through all the posts on reflection but couldn't find the answer to my question.
What were the problems in the programming world before .NET reflection came along, and how did it solve those problems?
Please explain with an example.
It should be stated that .NET reflection isn't revolutionary - the concepts have been around in other frameworks.
Reflection in .NET has 2 facets:
Investigating type information
Without some kind of reflection / introspection API, it becomes very hard to perform things like serialization. Rather than having this provided at runtime (by inspecting the properties/fields/etc), you often need code-generation instead, i.e. code that explicitly knows how to serialize each of your types. Tedious, and painful if you want to serialize something that doesn't have a twin.
Likewise, there is nowhere to store additional metadata about properties etc, so you end up having lots of additional code, or external configuration files. Something as simple as being able to associate a friendly name with a property (via an attribute) is a huge win for UI code.
Metaprogramming
.NET reflection also provides a mechanism to create types (etc) at runtime, which is hugely powerful for some specific scenarios; the alternatives are:
essentially running a parser/logic tree at runtime (rather than compiling the logic at runtime into executable code) - much slower
yet more code generation - yay!
I think to understand the need for reflection in .NET, we need to go back to before .NET. After all, modern languages like Java and C# do not have a history BF (before reflection).
C++ arguably has had the most influence on C# and Java. But C++ did not originally have reflection and we coded without it and we managed to get by. Occasionally we had a void pointer and would use a cast to force it into whatever type we wanted. The problem here was that the cast could fail with terrible consequences:
double CalculateSize(void* rectangle) {
    return ((Rect*)rectangle)->getWidth() * ((Rect*)rectangle)->getHeight();
}
Now there are plenty of arguments why you shouldn't have coded yourself into this problem in the first place. But the problem is not much different from .NET 1.1 with C# when we didn't have generics:
Hashtable shapes = new Hashtable();
....
double CalculateSize(object shape) {
return ((Rect)shape).Width * ((Rect)shape).Height;
}
However, when the C# example fails it does so with an exception rather than a potential core dump.
When reflection was added to C++ (known as Run Time Type Identification or RTTI), it was hotly debated. In Stroustrup's book The Design and Evolution of C++, he lists the following arguments against RTTI, in that some people:
Declared the support unnecessary
Declared the new style inherently evil ("against the spirit of C++")
Deemed it too expensive
Thought it too complicated and confusing
Saw it as the beginning of an avalanche of new features
But it did allow us to query the type of objects, or features of objects. For example (using C#)
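Hashtable shapes = new Hashtable();
....
double CalculateSize(object shape) {
    if (shape is Rect) {
        return ((Rect)shape).Width * ((Rect)shape).Height;
    }
    else if (shape is Circle) {
        return Math.Pow(((Circle)shape).Radius, 2.0) * Math.PI;
    }
    // Every code path must return or throw, so reject unknown shapes explicitly.
    throw new ArgumentException("Unknown shape type");
}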
Of course, with proper planning this example should never need to occur.
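The same kind of type query can be written with C++ RTTI itself, using dynamic_cast. This is a sketch under assumptions: the Shape, Rect and Circle types below are invented for the example, and returning -1.0 for an unknown shape is just one possible convention.

```cpp
// Hypothetical shape hierarchy; the virtual destructor makes Shape
// polymorphic, which dynamic_cast requires.
struct Shape {
    virtual ~Shape() {}
};

struct Rect : Shape {
    double width, height;
    Rect(double w, double h) : width(w), height(h) {}
};

struct Circle : Shape {
    double radius;
    explicit Circle(double r) : radius(r) {}
};

// Returns the area, or -1.0 for a shape type this function does not know.
double CalculateSize(const Shape* shape) {
    const double pi = 3.14159265358979323846;
    // dynamic_cast yields a null pointer when the runtime type does not match.
    if (const Rect* r = dynamic_cast<const Rect*>(shape)) {
        return r->width * r->height;
    }
    if (const Circle* c = dynamic_cast<const Circle*>(shape)) {
        return pi * c->radius * c->radius;
    }
    return -1.0;
}
```

Unlike the void* cast shown earlier, a mismatched type here produces a null pointer that can be tested, rather than undefined behaviour.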
So, real world situations where I've needed it include:
Accessing objects from shared memory, all I have is a pointer and I need to decide what to do with it.
Dynamically loading assemblies, think about NUnit where it loads every assembly and uses reflection to determine which classes are test fixtures.
Having a mixed bag of objects in a Hashtable and wanting to process them differently in an enumerator.
Many others...
So, I would go as far as to argue that Reflection has not enabled the ability to do something that couldn't be done before. However, it does make some types of problems easier to code, clearer to reader, shorter to write, etc.
Of course that's just my opinion, I could be wrong.
I once wanted to have unit tests in a text file that could be modified by a non-technical user in the format in C++:
MyObj Function args //textfile.txt
But I couldn't find a way to read in a string and then have the code create an object instance of the type represented by the string without reflection which C++ doesn't support.
char *str; //read in some type from a text file say the string is "MyObj"
str *obj; //cast a pointer as type MyObj
obj = new str; //create a MyObj
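Without reflection, the usual workaround in C++ is a hand-maintained table from type names to factory functions. The sketch below assumes a common base class; TestObject, registry, register_type and create are hypothetical names, not part of any library.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical common base class for everything creatable by name.
struct TestObject {
    virtual ~TestObject() {}
    virtual std::string name() const = 0;
};

struct MyObj : TestObject {
    std::string name() const override { return "MyObj"; }
};

using Factory = std::function<std::unique_ptr<TestObject>()>;

// Maps type names to factories. Every type has to be registered by hand,
// which is the bookkeeping that reflection would otherwise do.
std::map<std::string, Factory>& registry() {
    static std::map<std::string, Factory> factories;
    return factories;
}

void register_type(const std::string& type_name, Factory factory) {
    registry()[type_name] = std::move(factory);
}

// Returns a new object for a registered name, or nullptr if the name is unknown.
std::unique_ptr<TestObject> create(const std::string& type_name) {
    auto it = registry().find(type_name);
    if (it == registry().end()) return nullptr;
    return it->second();
}
```

After a call such as register_type("MyObj", [] { return std::unique_ptr<TestObject>(new MyObj()); }), a string read from the text file can be passed to create. The cost compared with reflection is that forgetting to register a type is only discovered at run time.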
Another use might be to have a generic copy function that could copy the members of a class without knowing them in advance.
It helps a lot when you are using C# attributes like [Obsolete] or [Serializable] in your code. Frameworks like NUnit use reflection on classes and containing methods to understand which methods are tests, setup, teardown, etc.
