Why are they termed differently?
Isn't the nature/purpose/place of a function identical to that of a method?
Functions stand alone and methods are members of a class. That's all. (it's really all the same.)
A method is a function that is attached to an object.
JavaScript functions are usually referred to as methods when used in the context of an object.
e.g. document.getElementById('foo') uses the getElementById method.
Function is a very old term meaning a segment of code that accomplishes some task (does some function, if you will).
Method is a relatively newer term (came with OO programming) that means a function that belongs to a specific instance of an object.
All methods are functions, but not all functions are methods.
See this related question from the programmers stackexchange.
You usually find the term function when dealing with procedural code. OOP uses the term method but they are the same thing.
A function is a piece of code that is called by name. It can be passed data to operate on (ie. the parameters) and can optionally return data (the return value).
All data that is passed to a function is explicitly passed.
A method is a piece of code that is called by name that is associated with an object. In most respects it is identical to a function except for two key differences.
It is implicitly passed the object for which it was called
It is able to operate on data that is contained within the class (remembering that an object is an instance of a class - the class is the definition, the object is an instance of that data)
Also, another answer: Difference between function and method?
Well, its all about the names, but generally functions and methods are the same thing, and of course have the same purpose.
It all began with the first programming languages, where they where called functions, but as more higher level programming languages arose, i guess they thought to name them methods even if they are and serve the sasme purpose.
EDIT:
Functions are not part of any object or class.
Methods are a memeber of an object or class.
Related
I have been navigating through the ASP.NET MVC source code and I notice that in a lot of places, there are function parameters of type Func<T> that have been named with the suffix thunk. And I've been thunking why that might be?
As far I can remember, in the early days, I did come across the term thunk and understood it to mean a "port from one CPU architecture to another." So, if you had a 16-bit DLL, and you wanted to use it in a 32-bit environment, you'd have to create a thunk to marshal all those integers and doubles between the two platforms.
But what is the motivation behind naming these Funcs with such a name? For e.g. here is one such method.
A ctor of the System.Web.Mvc.FormCollection class.
internal FormCollection(ControllerBase controller,
Func<NameValueCollection> validatedValuesThunk,
Func<NameValueCollection> unvalidatedValuesThunk)
{
base.Add(((controller == null)
||
controller.ValidateRequest)
?
validatedValuesThunk()
:
unvalidatedValuesThunk());
}
And here's another instance. This is a private field in the ControllerActionInvoker class.
private Func<ControllerContext, ActionDescriptor,
IEnumerable<Filter>> _getFiltersThunk;
There are loads of such methods in the MVC code.
From wikipedia:
In computer programming, a thunk is a subroutine that is created,
often automatically, to assist a call to another subroutine. Thunks
are primarily used to represent an additional calculation that a
subroutine needs to execute, or to call a routine that does not
support the usual calling mechanism.
The use of "thunk" in the FormCollection method seems to fit this definition.
There is no single canonical meaning for the word "thunk". In this instance, I think it simply refers to a function object. Personally, I don't use the word "thunk" in that way, but like I said, there's no commonly-agreed meaning for that word.
While thinking a little bit about programming in Java/C# I wondered about how methods which belong to objects are represented in memory and how this fact does concern multi threading.
Is a method instantiated for each object in memory seperately or do
all objects of the same type share one instance of the method?
If the latter, how does the executing thread know which object's
attributes to use?
Is it possible to modify the code of a method in
C# with reflection for one, and only one object of many objects of
the same type?
Is a static method which does not use class attributes always thread safe?
I tried to make up my mind about these questions, but I'm very unsure about their answers.
Each method in your source code (in Java, C#, C++, Pascal, I think every OO and procedural language...) has only one copy in binaries and in memory.
Multiple instances of one object have separate fields but all share the same method code. Technically there is a procedure that takes a hidden this parameter to provide an illusion of executing a method on an object. In reality you are calling a procedure and passing structure (a bag of fields) to it along with other parameters. Here is a simple Java object and more-or-less equivalent pseudo-C code:
class Foo {
private int x;
int mulBy(int y) {
return x * y
}
}
Foo foo = new Foo()
foo.mulBy(3)
is translated to this pseude-C code (the encapsulation is forced by the compiler and runtime/VM):
struct Foo {
int x = 0;
}
int Foo_mulBy(Foo *this, int y) {
return this->x * y;
}
Foo* foo = new Foo();
Foo_mulBy(foo, 3)
You have to draw a difference between code and local variables and parameters it operates on (the data). Data is stored on call stack, local to each thread. Code can be executed by multiple threads, each thread has its own copy of instruction pointer (place in the method it currently executes). Also because this is a parameter, it is thread-local, so each thread can operate on a different object concurrently, even though it runs the same code.
That being said you cannot modify a method of only one instance because the method code is shared among all instances.
The Java specifications don't dictate how to do memory layout, and different implementations can do whatever they like, providing it meets the spec where it matters.
Having said that, the mainstream Oracle JVM (HotSpot) works off of things called oops - Ordinary Object Pointers. These consist of two words of header followed by the data which comprises the instance member fields (stored inline for primitive types, and as pointers for reference member fields).
One of the two header words - the class word - is a pointer to a klassOop. This is a special type of oop which holds pointers to the instance methods of the class (basically, the Java equivalent of a C++ vtable). The klassOop is kind-of a VM-level representation of the Class object corresponding to the Java type.
If you're curious about the low-level detail, you can find out a lot more by looking in the OpenJDK source for the definition of some of the oop types (klassOop is a good place to start).
tl;dr Java holds one blob of code for each method of each type. The blobs of code are shared among each instance of the type, and hidden this pointers are used to know which instance's members to use.
I am going to try to answer this in the context of C#.There are basically 3 different types of Methods
virtual
non-virtual
static
When your code is executed, you basically have two kinds of objects that are formed on the heap.
The object corresponding to the type of the object. This is called Type Object. This holds the type object pointer, the sync block index, the static fields and the method table.
The object corresponding to the object itself, which contains all the non static fields.
In response to your questions,
Is a method instantiated for each object in memory seperately or do all objects of the same type share one instance of the method?
This is a wrong way of understanding objects. All methods are per type only. Look at it this way. A method is just a set of instructions. The first time you call a particular method, the IL code is JITed into native instructions and saved in memory. The next time this is called, the address is picked up from the method table and the same instructions are executed again.
2.If the latter, how does the executing thread know which object's attributes to use?
Each static method call on a Type results in looking up the method table from the corresponding Type Object and finding the address of the JITed instruction. In case of methods that are not static, the the relevant object on which the method is called is maintained on the thread's local stack. Basically, you get the nearest object on the stack. That is always the object on which we want the method to be called.
3.Is it possible to modify the code of a method in C# with reflection for one, and only one object of many objects of the same type?
No, It is not possible now. (And I am thankful for that). The reason is that reflection only allows code inspection. If you figure out what some method actually means, there is no way you are going to be able to change the code in the same assembly.
I want to know that why the name of constructor is always same as that of class name and how its get invoked implicitly when we create object of that class. Can anyone please explain the flow of execution in such situation?
I want to know that why the name of constructor is always same as that of class name
Because this syntax does not require any new keywords. Aside from that, there is no good reason.
To minimize the number of new keywords, I didn't use an explicit syntax like this:
class X {
constructor();
destructor();
}
Instead, I chose a declaration syntax that mirrored the use of constructors.
class X {
X();
~X();
This may have been overly clever. [The Design And Evolution Of C++, 3.11.2 Constructor Notation]
Can anyone please explain the flow of execution in such situation?
The lifetime of an object can be summarized like this:
allocate memory
call constructor
use object
call destructor/finalizer
release memory
In Java, step 1 always allocates from the heap. In C#, classes are allocated from the heap as well, whereas the memory for structs is already available (either on the stack in the case of non-captured local structs or within their parent object/closure). Note that knowing these details is generally not necessary or very helpful. In C++, memory allocation is extremely complicated, so I won't go into the details here.
Step 5 depends on how the memory was allocated. Stack memory is automatically released as soon as the method ends. In Java and C#, heap memory is implicitly released by the Garbage Collector at some unknown time after it is no longer needed. In C++, heap memory is technically released by calling delete. In modern C++, delete is rarely called manually. Instead, you should use RAII objects such as std::string, std::vector<T> and std::shared_ptr<T> that take care of that themselves.
Why? Because the designers of the different languages you mention decided to make them that way. It is entirely possible for someone to design an OOP language where constructors do not have to have the same name as the class (as commented, this is the case in python).
It is a simple way to distinguish constructors from other functions and makes the constructing of a class in code very readable, so makes sense as a language design choice.
The mechanism is slightly different in the different languages, but essentially this is just a method call assisted by language features (the new keyword in java and c#, for example).
The constructor gets invoked by the runtime whenever a new object is created.
Seem to me that having sepearte keywords for declaring constructor(s) would be "better", as it would remove the otherwise unnecessary dependency to the name of the class itself.
Then, for instance, the code inside the class could be copied as the body of another without having to make changes regarding name of the constructor(s). Why one would want to do this I don't know (possibly during some code refactoring process), but the point is one always strives for independency between things and here the language syntax goes against that, I think.
Same for destructors.
One of the good reasons for constructor having the same name is their expressiveness. For example, in Java you create an object like,
MyClass obj = new MyClass(); // almost same in other languages too
Now, the constructor is defined as,
class MyClass {
public MyClass () {... }
}
So the statement above very well expresses that, you are creating an object and while this process the constructor MyClass() is called.
Now, whenever you create an object, it always calls its constructor. If that class is extending some other Base class, then their constructor will be called first and so on. All these operations are implicit. First the memory for the object is allocated (on heap) and then the constructor is called to initialize the object. If you don't provide a constructor, compiler will generate one for your class.
In C++, strictly speaking constructors do not have names at all. 12.1/1 in the standard states, "Constructors do not have names", it doesn't get much clearer than that.
The syntax for declaring and defining constructors in C++ uses the name of the class. There has to be some way of doing that, and using the name of the class is concise, and easy to understand. C# and Java both copied C++'s syntax, presumably because it would be familiar to at least some of the audience they were targeting.
The precise flow of execution depends what language you're talking about, but what the three you list have in common is that first some memory is assigned from somewhere (perhaps allocated dynamically, perhaps it's some specific region of stack memory or whatever). Then the runtime is responsible for ensuring that the correct constructor or constructors are called in the correct order, for the most-derived class and also base classes. It's up to the implementation how to ensure this happens, but the required effects are defined by each of those languages.
For the simplest possible case in C++, of a class that has no base classes, the compiler simply emits a call to the constructor specified by the code that creates the object, i.e. the constructor that matches any arguments supplied. It gets more complicated once you have a few virtual bases in play.
I want to know that why the name of constructor is always same as that
of class name
So that it can be unambigously identified as the constructor.
and how its get invoked implicitly when we create object of that class.
It is invoked by the compiler because it has already been unambiguously identified because of its naming sheme.
Can anyone please explain the flow of execution in such situation?
The new X() operator is called.
Memory is allocated, or an exception is thrown.
The constructor is called.
The new() operator returns to the caller.
the question is why designers decided so?
Naming the constructor after its class is a long-established convention dating back at least to the early days of C++ in the early 1980s, possibly to its Simula predecessor.
The convention for the same name of the constructor as that of the class is for programming ease, constructor chaining, and consistency in the language.
For example, consider a scenario where you want to use Scanner class, now what if the JAVA developers named the constructor as xyz!
Then how will you get to know that you need to write :
Scanner scObj = new xyz(System.in) ;
which could've have been really weird, right! Or, rather you might have to reference a huge manual to check constructor name of each class so as to get object created, which is again meaningless if you could have a solution of the problem by just naming constructors same as that of the class.
Secondly, the constructor is itself created by the compiler if you don't explicitly provide it, then what could be the best name for constructor could automatically be chosen by the compiler so it is clear to the programmer! Obviously, the best choice is to keep it the same as that of the class.
Thirdly, you may have heard of constructor chaining, then while chaining the calls among the constructors, how the compiler will know what name you have given to the constructor of the chained class! Obviously, the solution to the problem is again same, KEEP NAME OF THE CONSTRUCTOR SAME AS THAT OF THE CLASS.
When you create the object, you invoke the constructor by calling it in your code with the use of new keyword(and passing arguments if needed) then all superclass constructors are invoked by chaining the calls which finally gives the object.
Thanks for Asking.
I'm interesting in how CLR implementes the calls like this:
abstract class A {
public abstract void Foo<T, U, V>();
}
A a = ...
a.Foo<int, string, decimal>(); // <=== ?
Is this call cause an some kind of hash map lookup by type parameters tokens as the keys and compiled generic method specialization (one for all reference types and the different code for all the value types) as the values?
I didn't find much exact information about this, so much of this answer is based on the excellent paper on .Net generics from 2001 (even before .Net 1.0 came out!), one short note in a follow-up paper and what I gathered from SSCLI v. 2.0 source code (even though I wasn't able to find the exact code for calling virtual generic methods).
Let's start simple: how is a non-generic non-virtual method called? By directly calling the method code, so the compiled code contains direct address. The compiler gets the method address from the method table (see next paragraph). Can it be that simple? Well, almost. The fact that methods are JITed makes it a little more complicated: what is actually called is either code that compiles the method and only then executes it, if it wasn't compiled yet; or it's one instruction that directly calls the compiled code, if it already exists. I'm going to ignore this detail further on.
Now, how is a non-generic virtual method called? Similar to polymorphism in languages like C++, there is a method table accessible from the this pointer (reference). Each derived class has its own method table and its methods there. So, to call a virtual method, get the reference to this (passed in as a parameter), from there, get the reference to the method table, look at the correct entry in it (the entry number is constant for specific function) and call the code the entry points to. Calling methods through interfaces is slightly more complicated, but not interesting for us now.
Now we need to know about code sharing. Code can be shared between two “instances” of the same method, if reference types in type parameters correspond to any other reference types, and value types are exactly the same. So, for example C<string>.M<int>() shares code with C<object>.M<int>(), but not with C<string>.M<byte>(). There is no difference between type type parameters and method type parameters. (The original paper from 2001 mentions that code can be shared also when both parameters are structs with the same layout, but I'm not sure this is true in the actual implementation.)
Let's make an intermediate step on our way to generic methods: non-generic methods in generic types. Because of code sharing, we need to get the type parameters from somewhere (e.g. for calling code like new T[]). For this reason, each instantiation of generic type (e.g. C<string> and C<object>) has its own type handle, which contains the type parameters and also method table. Ordinary methods can access this type handle (technically a structure confusingly called MethodTable, even though it contains more than just the method table) from the this reference. There are two types of methods that can't do that: static methods and methods on value types. For those, the type handle is passed in as a hidden argument.
For non-virtual generic methods, the type handle is not enough and so they get different hidden argument, MethodDesc, that contains the type parameters. Also, the compiler can't store the instantiations in the ordinary method table, because that's static. So it creates a second, different method table for generic methods, which is indexed by type parameters, and gets the method address from there, if it already exists with compatible type parameters, or creates a new entry.
Virtual generic methods are now simple: the compiler doesn't know the concrete type, so it has to use the method table at runtime. And the normal method table can't be used, so it has to look in the special method table for generic methods. Of course, the hidden parameter containing type parameters is still present.
One interesting tidbit learned while researching this: because the JITer is very lazy, the following (completely useless) code works:
object Lift<T>(int count) where T : new()
{
if (count == 0)
return new T();
return Lift<List<T>>(count - 1);
}
The equivalent C++ code causes the compiler to give up with a stack overflow.
Yes. The code for specific type is generated at the runtime by CLR and keeps a hashtable (or similar) of implementations.
Page 372 of CLR via C#:
When a method that uses generic type
parameters is JIT-compiled, the CLR
takes the method's IL, substitutes the
specified type arguments, and then
creates native code that is specific
to that method operating on the
specified data types. This is exactly
what you want and is one of the main
features of generics. However, there
is a downside to this: the CLR keeps
generating native code for every
method/type combination. This is
referred to as code explosion. This
can end up increasing the
application's working set
substantially, thereby hurting
performance.
Fortunately, the CLR has some
optimizations built into it to reduce
code explosion. First, if a method is
called for a particular type argument,
and later, the method is called again
using the same type argument, the CLR
will compile the code for this
method/type combination just once. So
if one assembly uses List,
and a completely different assembly
(loaded in the same AppDomain) also
uses List, the CLR will
compile the methods for List
just once. This reduces code explosion
substantially.
EDIT
I now came across I now came across https://msdn.microsoft.com/en-us/library/sbh15dya.aspx which clearly states that generics when using reference types are reusing the same code, thus I would accept that as the definitive authority.
ORIGINAL ANSWER
I am seeing here two disagreeing answers, and both have references to their side, so I will try to add my two cents.
First, Clr via C# by Jeffrey Richter published by Microsoft Press is as valid as an msdn blog, especially as the blog is already outdated, (for more books from him take a look at http://www.amazon.com/Jeffrey-Richter/e/B000APH134 one must agree that he is an expert on windows and .net).
Now let me do my own analysis.
Clearly two generic types that contain different reference type arguments cannot share the same code
For example, List<TypeA> and List<TypeB>> cannot share the same code, as this would cause the ability to add an object of TypeA to List<TypeB> via reflection, and the clr is strongly typed on genetics as well, (unlike Java in which only the compiler validates generic, but the underlying JVM has no clue about them).
And this does not apply only to types, but to methods as well, since for example a generic method of type T can create an object of type T (for example nothing prevents it from creating a new List<T>), in which case reusing the same code would cause havoc.
Furthermore the GetType method is not overridable, and it in fact always return the correct generic type, prooving that each type argument has indeed its own code.
(This point is even more important than it looks, as the clr and jit work based on the type object created for that object, by using GetType () which simply means that for each type argument there must be a separate object even for reference types)
Another issue that would result from code reuse, as the is and as operators will no longer work correctly, and in general all types of casting will have serious problems.
NOW TO ACTUAL TESTING:
I have tested it by having a generic type that contaied a static member, and than created two object with different type parameters, and the static fields were clrearly not shared, clearly prooving that code is not shared even for reference types.
EDIT:
See http://blogs.msdn.com/b/csharpfaq/archive/2004/03/12/how-do-c-generics-compare-to-c-templates.aspx on how it is implemented:
Space Use
The use of space is different between C++ and C#. Because C++
templates are done at compile time, each use of a different type in a
template results in a separate chunk of code being created by the
compiler.
In the C# world, it's somewhat different. The actual implementations
using a specific type are created at runtime. When the runtime creates
a type like List, the JIT will see if that has already been
created. If it has, it merely users that code. If not, it will take
the IL that the compiler generated and do appropriate replacements
with the actual type.
That's not quite correct. There is a separate native code path for
every value type, but since reference types are all reference-sized,
they can share their implementation.
This means that the C# approach should have a smaller footprint on
disk, and in memory, so that's an advantage for generics over C++
templates.
In fact, the C++ linker implements a feature known as “template
folding“, where the linker looks for native code sections that are
identical, and if it finds them, folds them together. So it's not a
clear-cut as it would seem to be.
As one can see the CLR "can" reuse the implementation for reference types, as do current c++ compilers, however there is no guarantee on that, and for unsafe code using stackalloc and pointers it is probably not the case, and there might be other situations as well.
However what we do have to know that in CLR type system, they are treated as different types, such as different calls to static constructors, separate static fields, separate type objects, and a object of a type argument T1 should not be able to access a private field of another object with type argument T2 (although for an object of the same type it is indeed possible to access private fields from another object of the same type).
I never seem to understand why we need delegates?
I know they are immutable reference types that hold reference of a method but why can't we just call the method directly, instead of calling it via a delegate?
Thanks
Simple answer: the code needing to perform the action doesn't know the method to call when it's written. You can only call the method directly if you know at compile-time which method to call, right? So if you want to abstract out the idea of "perform action X at the appropriate time" you need some representation of the action, so that the method calling the action doesn't need to know the exact implementation ahead of time.
For example:
Enumerable.Select in LINQ can't know the projection you want to use unless you tell it
The author of Button didn't know what you want the action to be when the user clicks on it
If a new Thread only ever did one thing, it would be pretty boring...
It may help you to think of delegates as being like single-method interfaces, but with a lot of language syntax to make them easy to use, and funky support for asynchronous execution and multicasting.
Of course you can call method directly on the object but consider following scenarios:
You want to call series of method by using single delegate without writing lot of method calls.
You want to implement event based system elegantly.
You want to call two methods same in signature but reside in different classes.
You want to pass method as a parameter.
You don't want to write lot of polymorphic code like in LINQ , you can provide lot of implementation to the Select method.
Because you may not have the method written yet, or you have designed your class in such a way that a user of it can decide what method (that user wrote) the user wants your class to execute.
They also make certain designs cleaner (for example, instead of a switch statement where you call different methods, you call the delegate passed in) and easier to understand and allow for extending your code without changing it (think OCP).
Delegates are also the basis of the eventing system - writing and registering event handlers without delegates would be much harder than it is with them.
See the different Action and Func delegates in Linq - it would hardly be as useful without them.
Having said that, no one forces you to use delegates.
Delegates supports Events
Delegates give your program a way to execute methods without having to know precisely what those methods are at compile time
Anything that can be done with delegates can be done without them, but delegates provide a much cleaner way of doing them. If one didn't have delegates, one would have to define an interface or abstract base class for every possible function signature containing a function Invoke(appropriate parameters), and define a class for each function which was to be callable by pseudo-delegates. That class would inherit the appropriate interface for the function's signature, would contain a reference to the class containing the function it was supposed to represent, and a method implementing Invoke(appropriate parameters) which would call the appropriate function in the class to which it holds a reference. If class Foo has two methods Foo1 and Foo2, both taking a single parameter, both of which can be called by pseudo-delegates, there would be two extra classes created, one for each method.
Without compiler support for this technique, the source code would have to be pretty heinous. If the compiler could auto-generate the proper nested classes, though, things could be pretty clean. Dispatch speed for pseudo-delegates would probably generally be slower than with conventional delegates, but if pseudo-delegates were an interface rather than an abstract base class, a class which only needs to make a pseudo-delegate for one method of a given signature could implement the appropriate pseudo-delegate interface itself; the class instance could then be passed to any code expecting a pseudo-delegate of that signature, avoiding any need to create an extra object. Further, while the number of classes one would need when using pseudo-delegates would be greater than when using "real" delegates, each pseudo-delegate would only need to hold a single object instance.
Think of C/C++ function pointers, and how you treat javascript event-handling functions as "data" and pass them around. In Delphi language also there is procedural type.
Behind the scenes, C# delegate and lambda expressions, and all those things are essentially the same idea: code as data. And this constitutes the very basis for functional programming.
You asked for an example of why you would pass a function as a parameter, I have a perfect one and thought it might help you understand, it is pretty academic but shows a use. Say you have a ListResults() method and getFromCache() method. Rather then have lots of checks if the cache is null etc. you can just pass any method to getCache and then only invoke it if the cache is empty inside the getFromCache method:
_cacher.GetFromCache(delegate { return ListResults(); }, "ListResults");
public IEnumerable<T> GetFromCache(MethodForCache item, string key, int minutesToCache = 5)
{
var cache = _cacheProvider.GetCachedItem(key);
//you could even have a UseCache bool here for central control
if (cache == null)
{
//you could put timings, logging etc. here instead of all over your code
cache = item.Invoke();
_cacheProvider.AddCachedItem(cache, key, minutesToCache);
}
return cache;
}
You can think of them as a construct similar with pointers to functions in C/C++. But they are more than that in C#. Details.