Today I was trying to wrap my head around immutable objects that reference each other. I came to the conclusion that you can't possibly do that without using lazy evaluation but in the process I wrote this (in my opinion) interesting code.
public class A
{
    public string Name { get; private set; }
    public B B { get; private set; }
    public A()
    {
        B = new B(this);
        Name = "test";
    }
}
public class B
{
    public A A { get; private set; }
    public B(A a)
    {
        //a.Name is null
        A = a;
    }
}
What I find interesting is that I cannot think of another way to observe an object of type A in a state that is not yet fully constructed, and that includes threads. Why is this even valid? Are there any other ways to observe the state of an object that is not fully constructed?
Why is this even valid?
Why do you expect it to be invalid?
Because a constructor is supposed to guarantee that the code it contains is executed before outside code can observe the state of the object.
Correct. But the compiler is not responsible for maintaining that invariant. You are. If you write code that breaks that invariant, and it hurts when you do that, then stop doing that.
Are there any other ways to observe the state of an object that is not fully constructed?
Sure. For reference types, all of them involve somehow passing "this" out of the constructor, obviously, since the only user code that holds the reference to the storage is the constructor. Some ways the constructor can leak "this" are:
Put "this" in a static field and reference it from another thread.
Make a method call or constructor call and pass "this" as an argument.
Make a virtual call. This is particularly nasty if the virtual method is overridden by a derived class, because then it runs before the derived class ctor body runs.
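To make the virtual-call case concrete, here is a minimal sketch. The Report and DetailedReport classes are hypothetical names invented for illustration; they are not from the question.

```csharp
using System;

class Report
{
    public string Title { get; private set; }

    public Report()
    {
        // Virtual call: dispatches to the most derived override,
        // even though the derived constructor body has not run yet.
        Title = BuildTitle();
    }

    protected virtual string BuildTitle() { return "report"; }
}

class DetailedReport : Report
{
    private readonly string language;

    public DetailedReport()
    {
        language = "en"; // runs only after Report() has finished
    }

    protected override string BuildTitle()
    {
        // When called from Report(), 'language' still holds its default (null).
        return "report in " + (language ?? "<not yet assigned>");
    }
}
```

Reading new DetailedReport().Title gives "report in <not yet assigned>": the override observed the object before the derived constructor body had assigned the field.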
I said that the only user code that holds a reference is the ctor, but of course the garbage collector also holds a reference. Therefore, another interesting way in which an object can be observed to be in a half-constructed state is if the object has a destructor, and the constructor throws an exception (or gets an asynchronous exception like a thread abort; more on that later.) In that case, the object is about to be dead and therefore needs to be finalized, but the finalizer thread can see the half-initialized state of the object. And now we are back in user code that can see the half-constructed object!
Destructors are required to be robust in the face of this scenario. A destructor must not depend on any invariant of the object set up by the constructor being maintained, because the object being destroyed might never have been fully constructed.
Another crazy way that a half-constructed object could be observed by outside code is of course if the destructor sees the half-initialized object in the scenario above, and then copies a reference to that object to a static field, thereby ensuring that the half-constructed, half-finalized object is rescued from death. Please do not do that. Like I said, if it hurts, don't do it.
If you're in the constructor of a value type then things are basically the same, but there are some small differences in the mechanism. The language requires that a constructor call on a value type create a temporary variable that only the ctor has access to, mutate that variable, and then do a struct copy of the mutated value to the actual storage. That ensures that if the constructor throws, then the final storage is not in a half-mutated state.
Note that since struct copies are not guaranteed to be atomic, it is possible for another thread to see the storage in a half-mutated state; use locks correctly if you are in that situation. Also, it is possible for an asynchronous exception like a thread abort to be thrown halfway through a struct copy. These non-atomicity problems arise regardless of whether the copy is from a ctor temporary or a "regular" copy. And in general, very few invariants are maintained if there are asynchronous exceptions.
In practice, the C# compiler will optimize away the temporary allocation and copy if it can determine that there is no way for that scenario to arise. For example, if the new value is initializing a local that is not closed over by a lambda and not in an iterator block, then S s = new S(123); just mutates s directly.
For more information on how value type constructors work, see:
Debunking another myth about value types
And for more information on how C# language semantics try to save you from yourself, see:
Why Do Initializers Run In The Opposite Order As Constructors? Part One
Why Do Initializers Run In The Opposite Order As Constructors? Part Two
I seem to have strayed from the topic at hand. In a struct you can of course observe an object to be half-constructed in the same ways -- copy the half-constructed object to a static field, call a method with "this" as an argument, and so on. (Obviously calling a virtual method on a more derived type is not a problem with structs.) And, as I said, the copy from the temporary to the final storage is not atomic and therefore another thread can observe the half-copied struct.
Now let's consider the root cause of your question: how do you make immutable objects that reference each other?
Typically, as you've discovered, you don't. If you have two immutable objects that reference each other then logically they form a directed cyclic graph. You might consider simply building an immutable directed graph! Doing so is quite easy. An immutable directed graph consists of:
An immutable list of immutable nodes, each of which contains a value.
An immutable list of immutable node pairs, each of which has the start and end point of a graph edge.
Now the way you make nodes A and B "reference" each other is:
A = new Node("A");
B = new Node("B");
G = Graph.Empty.AddNode(A).AddNode(B).AddEdge(A, B).AddEdge(B, A);
And you're done, you've got a graph where A and B "reference" each other.
The problem, of course, is that you cannot get to B from A without having G in hand. Having that extra level of indirection might be unacceptable.
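For what it's worth, the graph described above could be sketched roughly like this. It is only one possible shape: the array-copying strategy, the private constructor and the Successors method are my assumptions, not part of the answer, and a real implementation would probably use a persistent collection instead of copying arrays.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

sealed class Node
{
    public string Value { get; }
    public Node(string value) { Value = value; }
}

sealed class Graph
{
    public static readonly Graph Empty =
        new Graph(new Node[0], new Tuple<Node, Node>[0]);

    private readonly Node[] nodes;
    private readonly Tuple<Node, Node>[] edges;

    private Graph(Node[] nodes, Tuple<Node, Node>[] edges)
    {
        this.nodes = nodes;
        this.edges = edges;
    }

    // Each "mutation" copies the affected array and returns a new graph.
    public Graph AddNode(Node node)
    {
        return new Graph(nodes.Concat(new[] { node }).ToArray(), edges);
    }

    public Graph AddEdge(Node start, Node end)
    {
        return new Graph(nodes, edges.Concat(new[] { Tuple.Create(start, end) }).ToArray());
    }

    // Hypothetical helper: neighbours can only be found by asking the graph.
    public IEnumerable<Node> Successors(Node node)
    {
        return edges.Where(e => ReferenceEquals(e.Item1, node)).Select(e => e.Item2);
    }
}
```

With G built as in the snippet above, G.Successors(A) yields B and G.Successors(B) yields A, which shows the extra level of indirection: neither node knows about the other without the graph.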
Yes, this is the only way for two immutable objects to refer to each other - at least one of them must see the other in a not-fully-constructed way.
It's generally a bad idea to let this escape from your constructor but in cases where you're confident of what both constructors do, and it's the only alternative to mutability, I don't think it's too bad.
"Fully constructed" is defined by your code, not by the language.
This is a variation on calling a virtual method from the constructor.
The general guideline is: don't do that.
To correctly implement the notion of "fully constructed", don't pass this out of your constructor.
Indeed, leaking the this reference out during the constructor will allow you to do this; it may cause problems if methods get invoked on the incomplete object, obviously. As for "other ways to observe the state of an object that is not fully constructed":
invoke a virtual method in a constructor; the subclass constructor will not have been called yet, so an override may try to access incomplete state (fields declared or initialized in the subclass, etc)
reflection, perhaps using FormatterServices.GetUninitializedObject (which creates an object without calling the constructor at all)
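A small sketch of the second point. The Account class is a made-up example. I am using RuntimeHelpers.GetUninitializedObject here, which is the variant available on .NET Core and .NET 5+; on .NET Framework the call named above, FormatterServices.GetUninitializedObject, plays the same role.

```csharp
using System;
using System.Runtime.CompilerServices;

class Account
{
    public string Owner { get; private set; }

    public Account() { Owner = "unknown"; }
}

static class UninitializedDemo
{
    public static Account CreateWithoutConstructor()
    {
        // Allocates a zeroed instance; the Account() constructor never runs.
        return (Account)RuntimeHelpers.GetUninitializedObject(typeof(Account));
    }
}
```

The object returned by CreateWithoutConstructor() has Owner == null, a state that no constructor of Account could ever produce.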
If you consider the initialization order
Derived static fields
Derived static constructor
Derived instance fields
Base static fields
Base static constructor
Base instance fields
Base instance constructor
Derived instance constructor
clearly, through up-casting (or a virtual call), you can access the object BEFORE the derived instance constructor has run. This is the reason you shouldn't call virtual methods from constructors: an override could easily access derived fields that have not been initialized yet, because the constructor of the derived class has not yet brought the object into a "consistent" state.
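The order listed above can be checked with a small experiment. This is a sketch only: InitLog is a hypothetical helper that records each step, and both classes declare explicit static constructors so that the timing of type initialization is precise.

```csharp
using System;
using System.Collections.Generic;

static class InitLog
{
    public static readonly List<string> Steps = new List<string>();

    // Returns a dummy value so it can be used in field initializers.
    public static int Record(string step) { Steps.Add(step); return 0; }
}

class Base
{
    static int baseStatic = InitLog.Record("Base static fields");
    int baseInstance = InitLog.Record("Base instance fields");

    static Base() { InitLog.Record("Base static constructor"); }
    public Base() { InitLog.Record("Base instance constructor"); }
}

class Derived : Base
{
    static int derivedStatic = InitLog.Record("Derived static fields");
    int derivedInstance = InitLog.Record("Derived instance fields");

    static Derived() { InitLog.Record("Derived static constructor"); }
    public Derived() { InitLog.Record("Derived instance constructor"); }
}
```

After a single new Derived(), InitLog.Steps contains the eight entries in the same order as the list above.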
You can avoid the problem by instantiating B last in your constructor:
public A()
{
    Name = "test";
    B = new B(this);
}
If what you suggest was not possible, then A would not be immutable.
Edit: fixed, thanks to leppie.
The principle is: don't let your this reference escape from the constructor body.
Another way to observe this problem is by calling virtual methods inside the constructor.
As noted, the compiler has no means of knowing at what point an object has been constructed well enough to be useful; it therefore assumes that a programmer who passes this from a constructor will know whether an object has been constructed well enough to satisfy his needs.
I would add, however, that for objects which are intended to be truly immutable, one must avoid passing this to any code which will examine the state of a field before it has been assigned its final value. This implies that this not be passed to arbitrary outside code, but does not imply that there is anything wrong with having an object under construction pass itself to another object for the purpose of storing a back-reference which will not actually be used until after the first constructor has completed.
If one were designing a language to facilitate the construction and use of immutable objects, it may be helpful for it to declare methods as being usable only during construction, only after construction, or either; fields could be declared as being non-dereferenceable during construction and read-only afterward; parameters could likewise be tagged to indicate that they should be non-dereferenceable. Under such a system, it would be possible for a compiler to allow the construction of data structures which referred to each other, but where no property could ever change after it was observed. As to whether the benefits of such static checking would outweigh the cost, I'm not sure, but it might be interesting.
Incidentally, a related feature which would be helpful would be the ability to declare parameters and function returns as ephemeral, returnable, or (the default) persistable. If a parameter or function return were declared ephemeral, it could not be copied to any field nor passed as a persistable parameter to any method. Additionally, passing an ephemeral or returnable value as a returnable parameter to a method would cause the return value of the function to inherit the restrictions of that value (if a function has two returnable parameters, its return value would inherit the more restrictive constraint from its parameters). A major weakness with Java and .net is that all object references are promiscuous; once outside code gets its hands on one, there's no telling who may end up with it. If parameters could be declared ephemeral, it would more often be possible for code which held the only reference to something to know it held the only reference, and thus avoid needless defensive copy operations. Additionally, things like closures could be recycled if the compiler could know that no references to them existed after they returned.
[WebMethod]
public static string GetAge(List<int> input1, string input2)
{
    Cust cu = Cust.CalAge(input1, input2);
    cu.agemodify = true;
    // (the snippet in the question ends here, without a return statement)
}
An AJAX call calls this Webservice.
CalAge is a static method inside class Cust.
agemodify is a boolean field inside class Cust.
I understand that all non static fields/local variables have different copies on the stack per thread even in a static method. Is that correct? So these are thread safe and are not shared resources. agemodify is therefore thread safe?
My understanding is that List<int> is thread safe as a parameter to a static method because for int, immutability doesn't apply. Is that correct? I know List<T> isn't threadsafe. What about just <T>s as parameters if T is immutable or Objects if they are immutable? e.g. public static int GenMethod<TFirst,TSecond>(), GetMethod(List<object> o1, object o2). Just Object vs List<Object>. I don't want to use System.Collections.Concurrent.
My understanding is that because Webmethod is static, any classes further down the stack need not be instantiated, and therefore, might as well be static because there will only be one Webmethod "instance" throughout. Is that correct? Is it better to have a dedicated webservice (wcf/asmx) wrt thread safety instead of static webmethods on an aspx page?
Unfortunately, your question is bordering on being too broad. You seem to be conflating several different and completely unrelated concepts, as if they somehow relate to each other in a meaningful way even though they don't. I'm skeptical that in the context of a Stack Overflow answer, it would be possible to unravel all of the misconceptions represented here.
That said, I'll make a try. At the very least, I hope that with the below, there is enough information that you can go back to review the primary sources (e.g. MSDN, other documentation) and correct your understanding. In the best case, perhaps the below is sufficient to get you entirely back on track.
I understand that all non static fields/local variables have different copies on the stack per thread even in a static method. Is that correct? So these are thread safe and are not shared resources. agemodify is therefore thread safe?
"Is that correct?" — No. It's borderline gibberish. A local variable declared in a method has a new instance, and hence a new copy of whatever value was passed or initialized in the method, for each call to the method. So, yes in the case of the local variable, each time the method is called in different threads, there are new copies, but this has nothing to do with the threads. It's all about the method call.
As for non-static fields, these may or may not be stored on the stack†, so any conclusion based on an assumption that they are stored on the stack is necessarily wrong. For reference types you only get a new copy for each instance of the object. For value types, you get a new copy of each field for each instance of the value type as well, but since you get a new instance of the value type each time it's assigned to a new variable, there can be many more copies. But again, nothing to do with threading, and nothing to do with the stack (since value types can exist either on the stack or on the heap).
"agemodify is therefore thread safe?" — There is not enough code to answer that question. You would need to provide a good Minimal, Complete, and Verifiable code example for anyone to answer that definitively. If the Cust type is a reference type, and the CalAge() method always creates a new instance of the type, or the Cust type is a value type, then I would say that yes, the field is thread safe.
Otherwise, not necessarily.
† For that matter, that local variables are stored on the stack is just an implementation detail. It's not likely to change for regular local variables, but there's no requirement in C# that they be handled that way. They could be stored anywhere, as long as the compiler preserves the semantics of a local variable, and indeed this is exactly what happens in a variety of contexts in C#, such as captured local variables, and local variables found in special methods like iterator methods and async methods.
My understanding is that List<int> is thread safe as a parameter to a static method because for int immutability doesn't apply. Is that correct? I know List<T> isn't threadsafe. What about just <T>s as parameters if T is immutable or Objects if they are immutable? e.g. public static int GenMethod<TFirst,TSecond>(),
GetMethod(List<object> o1, object o2). Just Object vs List<Object>. I don't want to use System.Collections.Concurrent.
Again, all that mostly doesn't make any sense. The value of the input1 variable, being a local variable, is inherently thread safe because a new copy exists for each call to the method. But that's just the variable. The List<int> object itself is not affected by this at all, and is decidedly not thread-safe. Only if each thread executing the method gets a unique instance of the List<int> object could you say that would be thread-safe.
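The difference between the variable and the object it refers to can be seen without any threads at all. A minimal sketch; ParameterDemo and its two methods are hypothetical names, not from the question:

```csharp
using System;
using System.Collections.Generic;

static class ParameterDemo
{
    public static void Reassign(List<int> input)
    {
        // Only this call's copy of the reference changes;
        // the caller's variable still points at the original list.
        input = new List<int> { 99 };
    }

    public static void Mutate(List<int> input)
    {
        // The list object itself is modified, and every holder
        // of a reference to it (on any thread) sees the change.
        input.Add(99);
    }
}
```

Calling Reassign(list) leaves the caller's list untouched, while Mutate(list) adds an element to it. If two threads pass the same List<int> instance to Mutate at the same time, they are sharing that object, and that is where the thread-safety problem lives.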
As far as values of a type parameter T go, there is nothing about generics that would make these inherently thread-safe. As far as thread-safety is concerned, T is just another type. The fact that you don't know the type when the generic type or method is compiled is irrelevant, and does not in any way help make values of that type thread-safe.
My understanding is that because Webmethod is static, any classes further down the stack need not be instantiated, and therefore, might as well be static because there will only be one Webmethod "instance" throughout. Is that correct? Is it better to have a dedicated webservice (wcf/asmx) wrt thread safety instead of static webmethods on an aspx page?
Sorry, more mostly meaningless combinations of words. What is meant by "classes further down the stack"? If you call a method that needs to use a string value, for example, do you really think that means you get to use some mythical static version of the string type, just because the caller is static?
No, the rules about what needs to be static and what doesn't don't have much to do with the caller. The only thing the caller affects is that a non-static caller wouldn't need to provide an explicit instance reference to use other non-static members in the same class, because there's an implicit this reference in that context. Otherwise, a static method calling instance members still needs an instance reference. And if you can make members that the static method uses also be static, that is better because then you don't have to provide that instance.
But just because the caller itself is static, that doesn't automatically mean you get to avoid instantiating objects within the call context. If you need to access instance members of some type, you'll still need an instance of that type.
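As a sketch of that last point: Cust comes from the question, but the members shown here (AgeModified, Create, MarkModified) and the Service class are placeholders I made up, since the real types were not posted.

```csharp
using System;

class Cust
{
    public bool AgeModified { get; set; }

    // Static: callable without an instance, but has no 'this'.
    public static Cust Create()
    {
        return new Cust();
    }

    // Instance: always needs an object to operate on.
    public void MarkModified()
    {
        AgeModified = true;
    }
}

static class Service
{
    public static bool Handle()
    {
        Cust cu = Cust.Create(); // the static caller still creates an instance
        cu.MarkModified();       // and must name it explicitly
        return cu.AgeModified;
    }
}
```

Here every call to Service.Handle() works on its own fresh Cust, so nothing is shared between concurrent calls. That property comes from creating a new instance per call, not from the caller being static.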
As far as the web service-specific stuff goes, that all depends on the nature of the values being used in your method. There's not enough context in your question to provide any sort of definitive discussion regarding that.
I have a really complex object (many different properties and also objects which contain same class objects in them and also backwards reference to parent objects) as global in a static class, which is initialized by a static constructor for 1 time and I want it to stay so and never change after that. This object is used many many times in my code in different places, whereas sometimes a clone is made out of it, with care never to change anything in its original reference (and properties, subproperties, etc). However, I guess I made a mistake somewhere and I removed some of its subproperties. I can find where the mistake is by step by step debugging, but it will cost me much time. Is there a way to lock the whole thing (not just the reference, but the whole object with all its properties, no matter how deep they are) not to be altered again after it is initialized for the first time?
I tried looking at the readonly modifier, but I guess it won't suit me because it constrains only the reference to the object and not everything that comes under it.
Also private won't suit me for the same reason.
Is there a better way to do this?
Is there a way to lock the whole thing
There is no way to prevent mutation of an object (or object graph) that is mutable. Put differently: If the object can be modified (such as if it has a public field that isn't readonly or if it has a property that has a setter), there is no way to prevent it from being modified.
I tried looking at the readonly modifier, but I guess it won't suit me because it constrains only the reference to the object and not everything that comes under it.
Correct. When a field declaration includes a readonly modifier, assignments to the fields introduced by the declaration can only occur as part of the declaration or in a constructor in the same class. (Source: msdn)
However, you can design the type so it is immutable or at least so it can't be modified through public fields, properties or methods.
You might also consider returning clones of the object graph instead of the original object graph. If a clone gets modified, the original object graph is still unmodified.
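A rough sketch of what "design the type so it is immutable" can look like for a nested structure. The Settings type is hypothetical; the point is only that every property is get-only and child collections are copied and wrapped in a read-only view.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

sealed class Settings
{
    public string Name { get; }
    public ReadOnlyCollection<Settings> Children { get; }

    public Settings(string name, IEnumerable<Settings> children)
    {
        Name = name;
        // Copy first, so later changes to the caller's list cannot leak in,
        // then expose the copy only through a read-only wrapper.
        Children = new ReadOnlyCollection<Settings>(children.ToList());
    }
}
```

With a type shaped like this, an accidental removal of a sub-item fails immediately (no public Remove, and the IList<T> view throws NotSupportedException) instead of silently corrupting the global object.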
Sounds like you want a singleton?
Build a function in the class that contains this object:
function getComplexObject()
    // is m_object NULL?
    //     instantiate object.
    // return m_object.
And make m_object the property of the class.
Then always use this function to get the object. This way you can be sure it'll only be created once.
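In C#, the pseudo-code above might be written with Lazy<T>, which takes care of the null check. This is a sketch under assumptions: ComplexObject and Globals are placeholder names, and the counter exists only to make the single creation visible.

```csharp
using System;

sealed class ComplexObject
{
    public static int InstancesCreated; // demo counter only

    public ComplexObject() { InstancesCreated++; }
}

static class Globals
{
    // Lazy<T> runs the factory at most once, even under concurrent access
    // (its default mode is LazyThreadSafetyMode.ExecutionAndPublication).
    private static readonly Lazy<ComplexObject> instance =
        new Lazy<ComplexObject>(() => new ComplexObject());

    public static ComplexObject GetComplexObject()
    {
        return instance.Value;
    }
}
```

Note that this only guarantees the object is created once. It does nothing to stop callers from mutating the returned object if its type is mutable.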
I have a Struct with a field in it that loses its value. I can declare the field static and that solves the problem. I can also just change struct to class (changing nothing else) and that also solves the problem. I was just wondering why this is?
Structs are passed by value. In other words, when you pass a struct, you're passing a copy of its value. So if you take a copy of the value and change it, then the original will appear unchanged. You changed the copy, not the original.
Without seeing your code I cannot be sure, but I figure this is what's happening.
This doesn't happen for classes, because a class variable holds a reference: the copy that gets passed still refers to the same object.
It's worth mentioning that this is why structs should be immutable -- that is, that once they're created, they do not change their value. Operations that provide modified versions return new structs.
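Since the question did not include code, here is a minimal sketch of the effect with made-up types (PointStruct, PointClass and CopyDemo are hypothetical names):

```csharp
using System;

struct PointStruct { public int X; }
class PointClass { public int X; }

static class CopyDemo
{
    // Receives a copy of the struct; the caller's value is unaffected.
    public static void Bump(PointStruct p) { p.X++; }

    // Receives a copy of the reference; both refer to the same object.
    public static void Bump(PointClass p) { p.X++; }

    // Receives the caller's variable itself.
    public static void BumpByRef(ref PointStruct p) { p.X++; }
}
```

After Bump(s) on a PointStruct s, s.X is still 0; after BumpByRef(ref s) it is 1. After Bump(c) on a PointClass c, c.X is 1 straight away.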
EDIT: In the comments below, @supercat suggests that mutable properties can be more convenient. However, property setters on structs can cause weird failures too. Here's an example that can catch you by surprise unless you deeply understand how structs work. For me, it's reason enough to avoid mutable structs altogether.
Consider the following types:
struct Rectangle {
public double Left { get; set; }
}
class Shape {
public Rectangle Bounds { get; private set; }
}
Ok, now imagine this code:
myShape.Bounds.Left = 100;
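Perhaps surprisingly, this does not change myShape at all. (The C# compiler actually rejects the statement with error CS1612, because the value returned by the Bounds getter is not a variable; the reasoning below shows why it could not work even if it were allowed.) Why? Let's rewrite the code in a longer, roughly equivalent form: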
var bounds = myShape.Bounds;
bounds.Left = 100;
It's easier to see here how the value of Bounds is copied to a local variable, and then its value is changed. However at no point is the original value in Shape updated.
This is pretty compelling evidence to make all public structs immutable. If you know what you're doing, mutable structs can be handy, but personally I only really use them in that form as private nested classes.
As @supercat points out, the alternative is a little unsightly:
myShape.Bounds = new Rectangle(100, myShape.Bounds.Top,
myShape.Bounds.Width, myShape.Bounds.Height);
Sometimes it's more convenient to add helper methods:
myShape.Bounds = myShape.Bounds.WithLeft(100);
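A sketch of how such a helper could look on an immutable version of Rectangle. The Top, Width and Height properties and the constructor order follow the snippet above; MoveLeftEdgeTo is a hypothetical method I added so that the private setter on Bounds can stay private.

```csharp
using System;

struct Rectangle
{
    public double Left { get; }
    public double Top { get; }
    public double Width { get; }
    public double Height { get; }

    public Rectangle(double left, double top, double width, double height)
    {
        Left = left;
        Top = top;
        Width = width;
        Height = height;
    }

    // Returns a modified copy; the value it is called on is untouched.
    public Rectangle WithLeft(double left)
    {
        return new Rectangle(left, Top, Width, Height);
    }
}

class Shape
{
    public Rectangle Bounds { get; private set; }

    public void MoveLeftEdgeTo(double left)
    {
        // Replace the whole value rather than mutating a copy of it.
        Bounds = Bounds.WithLeft(left);
    }
}
```

Because the struct has no setters, the mistake from the earlier example cannot even be written, and the only way to change Bounds is to assign a whole new value.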
When a struct is passed by value, the system will make a copy of the struct for the callee, so it can see its contents, and perhaps modify its own copy, but cannot affect the fields in the caller's copy. It's also possible to pass structs by ref, in which case the callee will be able to work with the caller's copy of the struct, modifying it if desired, and even pass it by ref to other functions which could do likewise. Note that the only way the called function can make the caller's copy of the struct available to other functions, though, is to pass it by ref, and the called function can't return until all functions to which it has passed the struct by ref have also returned. Thus, the caller can be assured that any changes which might occur to the structure as a consequence of the function call will have occurred by the time it returns.
This behavior is different from class objects; if a function passes a mutable class object to another function, it has no way of knowing if or when that other function will cause that object to be mutated immediately or at any future time, even after the function has finished running. The only way one can ever be sure that any mutable object won't be mutated by outside code is to be the sole holder of that object from the moment of its creation until its abandonment.
While one who is not used to value semantics may initially be "surprised" at the fact that passing a struct by value simply gives the called function a copy of it, and that assigning one struct storage location to another simply copies the contents of the struct, the guarantees that value types offer can be very useful. If Point is a structure, one can know that a statement like MyPoints[5].X += 1; (assuming MyPoints is an array) will affect MyPoints[5].X but will not affect any other Point. One can further be assured that the only way MyPoints[5].X will change is if either MyPoints gets replaced with another array, or something writes to MyPoints[5]. By contrast, if Point were a class and MyPoints[5] had ever been exposed to the outside world, the only way of knowing whether the aforementioned statement would affect field/property X of any other storage locations of type Point would be to examine every single storage location of type Point or Object that existed anywhere within the code to see if it pointed to the same instance as MyPoints[5]. Since there's no way for code to examine all of the storage locations of a particular type, such assurance would be impossible if MyPoints[5] had ever been exposed to the outside world.
There is one annoying wrinkle with structs, though: generally, the system will only allow structures to be passed by ref if the called code is allowed to write to the structure in question. Struct method calls and property getters, however, receive this as a ref parameter but do not have the above restriction. Instead, when invoking a struct method or property getter on a read-only structure, the system will make a copy of the structure, pass that copy by ref to the method or property getter, and then discard it. Since the system has no way of knowing whether a method or property getter will try to mutate this, it won't complain in such cases--it will just generate silly code. If one avoids mutating this in anything other than property setters (the system won't allow the use of property setters on read-only structures), however, one can avoid problems.
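The wrinkle is easiest to see with a readonly field. A minimal sketch; Counter and Holder are hypothetical types made up for the illustration:

```csharp
using System;

struct Counter
{
    public int Value;

    public void Increment() { Value++; } // mutates 'this'
}

class Holder
{
    public readonly Counter ReadOnlyCounter;
    public Counter WritableCounter;

    public void IncrementBoth()
    {
        // Read-only location: the method runs on a temporary copy,
        // and the copy is discarded afterwards.
        ReadOnlyCounter.Increment();

        // Writable location: the method runs on the field itself.
        WritableCounter.Increment();
    }
}
```

After IncrementBoth(), ReadOnlyCounter.Value is still 0 while WritableCounter.Value is 1, and the compiler gives no warning about the call that had no effect.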
While thinking a little bit about programming in Java/C# I wondered about how methods which belong to objects are represented in memory and how this fact does concern multi threading.
Is a method instantiated for each object in memory separately or do all objects of the same type share one instance of the method?
If the latter, how does the executing thread know which object's attributes to use?
Is it possible to modify the code of a method in C# with reflection for one, and only one object of many objects of the same type?
Is a static method which does not use class attributes always thread safe?
I tried to make up my mind about these questions, but I'm very unsure about their answers.
Each method in your source code (in Java, C#, C++, Pascal, I think every OO and procedural language...) has only one copy in binaries and in memory.
Multiple instances of one object have separate fields but all share the same method code. Technically there is a procedure that takes a hidden this parameter to provide an illusion of executing a method on an object. In reality you are calling a procedure and passing structure (a bag of fields) to it along with other parameters. Here is a simple Java object and more-or-less equivalent pseudo-C code:
class Foo {
    private int x;
    int mulBy(int y) {
        return x * y;
    }
}
Foo foo = new Foo();
foo.mulBy(3);
is translated to this pseudo-C code (the encapsulation is enforced by the compiler and runtime/VM):
struct Foo {
    int x = 0;
}
int Foo_mulBy(Foo *this, int y) {
    return this->x * y;
}
Foo* foo = new Foo();
Foo_mulBy(foo, 3);
You have to draw a distinction between code and the local variables and parameters it operates on (the data). Data is stored on the call stack, local to each thread. Code can be executed by multiple threads; each thread has its own instruction pointer (the place in the method it is currently executing). Also, because this is a parameter, it is thread-local, so each thread can operate on a different object concurrently, even though it runs the same code.
That being said you cannot modify a method of only one instance because the method code is shared among all instances.
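In C# the hidden this parameter can be made visible with an "open instance" delegate, which binds to an instance method but takes the receiver as an ordinary first argument. A sketch, using a C# counterpart of the Foo class (the constructor parameter and the HiddenThisDemo class are my additions):

```csharp
using System;
using System.Reflection;

class Foo
{
    private readonly int x;

    public Foo(int x) { this.x = x; }

    public int MulBy(int y) { return x * y; }
}

static class HiddenThisDemo
{
    // One delegate, bound to the single shared copy of Foo.MulBy.
    // The Foo argument plays the role of 'this', much like the first
    // parameter of Foo_mulBy(Foo *this, int y) in the pseudo-C code.
    public static readonly Func<Foo, int, int> MulBy =
        (Func<Foo, int, int>)Delegate.CreateDelegate(
            typeof(Func<Foo, int, int>),
            typeof(Foo).GetMethod("MulBy"));
}
```

HiddenThisDemo.MulBy(new Foo(2), 3) returns 6 and HiddenThisDemo.MulBy(new Foo(5), 3) returns 15: same code, different object passed in as the receiver.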
The Java specifications don't dictate how to do memory layout, and different implementations can do whatever they like, providing it meets the spec where it matters.
Having said that, the mainstream Oracle JVM (HotSpot) works off of things called oops - Ordinary Object Pointers. These consist of two words of header followed by the data which comprises the instance member fields (stored inline for primitive types, and as pointers for reference member fields).
One of the two header words - the class word - is a pointer to a klassOop. This is a special type of oop which holds pointers to the instance methods of the class (basically, the Java equivalent of a C++ vtable). The klassOop is kind-of a VM-level representation of the Class object corresponding to the Java type.
If you're curious about the low-level detail, you can find out a lot more by looking in the OpenJDK source for the definition of some of the oop types (klassOop is a good place to start).
tl;dr Java holds one blob of code for each method of each type. The blobs of code are shared among each instance of the type, and hidden this pointers are used to know which instance's members to use.
I am going to try to answer this in the context of C#. There are basically 3 different types of methods:
virtual
non-virtual
static
When your code is executed, you basically have two kinds of objects that are formed on the heap.
The object corresponding to the type of the object. This is called Type Object. This holds the type object pointer, the sync block index, the static fields and the method table.
The object corresponding to the object itself, which contains all the non static fields.
In response to your questions,
Is a method instantiated for each object in memory separately or do all objects of the same type share one instance of the method?
This is a wrong way of understanding objects. All methods are per type only. Look at it this way. A method is just a set of instructions. The first time you call a particular method, the IL code is JITed into native instructions and saved in memory. The next time this is called, the address is picked up from the method table and the same instructions are executed again.
2. If the latter, how does the executing thread know which object's attributes to use?
Each static method call on a Type results in looking up the method table from the corresponding Type Object and finding the address of the JITed instruction. In case of methods that are not static, the relevant object on which the method is called is maintained on the thread's local stack. Basically, you get the nearest object on the stack. That is always the object on which we want the method to be called.
3. Is it possible to modify the code of a method in C# with reflection for one, and only one object of many objects of the same type?
No, it is not possible now. (And I am thankful for that.) The reason is that reflection only allows code inspection. Even if you figure out what some method actually does, there is no way you are going to be able to change the code in the same assembly.
I want to know why the name of a constructor is always the same as the class name, and how it gets invoked implicitly when we create an object of that class. Can anyone please explain the flow of execution in such a situation?
I want to know why the name of a constructor is always the same as the class name
Because this syntax does not require any new keywords. Aside from that, there is no good reason.
To minimize the number of new keywords, I didn't use an explicit syntax like this:
class X {
    constructor();
    destructor();
}
Instead, I chose a declaration syntax that mirrored the use of constructors.
class X {
    X();
    ~X();
}
This may have been overly clever. [The Design And Evolution Of C++, 3.11.2 Constructor Notation]
Can anyone please explain the flow of execution in such a situation?
The lifetime of an object can be summarized like this:
1. allocate memory
2. call constructor
3. use object
4. call destructor/finalizer
5. release memory
In Java, step 1 always allocates from the heap. In C#, classes are allocated from the heap as well, whereas the memory for structs is already available (either on the stack in the case of non-captured local structs or within their parent object/closure). Note that knowing these details is generally not necessary or very helpful. In C++, memory allocation is extremely complicated, so I won't go into the details here.
Step 5 depends on how the memory was allocated. Stack memory is automatically released as soon as the method ends. In Java and C#, heap memory is implicitly released by the Garbage Collector at some unknown time after it is no longer needed. In C++, heap memory is technically released by calling delete. In modern C++, delete is rarely called manually. Instead, you should use RAII objects such as std::string, std::vector<T> and std::shared_ptr<T> that take care of that themselves.
Why? Because the designers of the different languages you mention decided to make them that way. It is entirely possible for someone to design an OOP language where constructors do not have to have the same name as the class (as noted in the comments, this is the case in Python).
It is a simple way to distinguish constructors from other functions, and it makes constructing an object very readable in code, so it makes sense as a language design choice.
The mechanism is slightly different in the different languages, but essentially this is just a method call assisted by language features (the new keyword in Java and C#, for example).
The constructor gets invoked by the runtime whenever a new object is created.
It seems to me that having separate keywords for declaring constructors would be "better", as it would remove the otherwise unnecessary dependency on the name of the class itself.
Then, for instance, the code inside the class could be copied as the body of another class without having to change the names of the constructors. Why one would want to do this I don't know (possibly during some refactoring), but the point is that one always strives for independence between things, and here the language syntax goes against that, I think.
Same for destructors.
One good reason for constructors having the same name as the class is expressiveness. For example, in Java you create an object like this:
MyClass obj = new MyClass(); // almost same in other languages too
Now, the constructor is defined as,
class MyClass {
public MyClass () {... }
}
So the statement above clearly expresses that you are creating an object and that, in the process, the constructor MyClass() is called.
Now, whenever you create an object, its constructor is always called. If the class extends some other base class, then the base class constructor is called first, and so on up the hierarchy. All these operations are implicit. First the memory for the object is allocated (on the heap) and then the constructor is called to initialize the object. If you don't provide a constructor, the compiler will generate one for your class.
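A small sketch of that implicit ordering, assuming three hypothetical classes (Base, Derived, Leaf) and a shared log that exists only for this illustration:

```java
// All names here are hypothetical and exist only for this illustration.
class ConstructionLog {
    static final StringBuilder EVENTS = new StringBuilder();
}

class Base {
    Base() {
        ConstructionLog.EVENTS.append("Base;");
    }
}

class Derived extends Base {
    Derived() {
        // The compiler inserts an implicit super() call before this line.
        ConstructionLog.EVENTS.append("Derived;");
    }
}

// No constructor declared: the compiler supplies a default one that calls super().
class Leaf extends Derived {
}

class ChainOrderDemo {
    public static void main(String[] args) {
        new Leaf();
        System.out.println(ConstructionLog.EVENTS); // prints Base;Derived;
    }
}
```

Leaf declares no constructor at all, yet creating it still runs the Base and Derived constructors, base class first.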
In C++, strictly speaking, constructors do not have names at all. 12.1/1 in the standard states, "Constructors do not have names"; it doesn't get much clearer than that.
The syntax for declaring and defining constructors in C++ uses the name of the class. There has to be some way of doing that, and using the name of the class is concise, and easy to understand. C# and Java both copied C++'s syntax, presumably because it would be familiar to at least some of the audience they were targeting.
The precise flow of execution depends what language you're talking about, but what the three you list have in common is that first some memory is assigned from somewhere (perhaps allocated dynamically, perhaps it's some specific region of stack memory or whatever). Then the runtime is responsible for ensuring that the correct constructor or constructors are called in the correct order, for the most-derived class and also base classes. It's up to the implementation how to ensure this happens, but the required effects are defined by each of those languages.
For the simplest possible case in C++, of a class that has no base classes, the compiler simply emits a call to the constructor specified by the code that creates the object, i.e. the constructor that matches any arguments supplied. It gets more complicated once you have a few virtual bases in play.
I want to know why the name of a constructor is always the same as the class name
So that it can be unambiguously identified as the constructor.
and how it gets invoked implicitly when we create an object of that class.
The compiler emits the call, because the constructor has already been unambiguously identified by its naming scheme.
Can anyone please explain the flow of execution in such a situation?
The new X() operator is called.
Memory is allocated, or an exception is thrown.
The constructor is called.
The new() operator returns to the caller.
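The steps above can be traced with a short Java sketch. Widget, its TRACE buffer and the fail flag are hypothetical names invented for this example. The sketch assumes nothing beyond standard Java rules: field initializers run before the constructor body, and if the constructor throws, the new expression never yields a reference.

```java
// Hypothetical class for illustration only.
class Widget {
    static final StringBuilder TRACE = new StringBuilder();

    // Runs after memory is allocated and before the constructor body.
    private final int size = note("field initializer", 1);

    Widget(boolean fail) {
        note("constructor body", 0);
        if (fail) {
            throw new IllegalStateException("construction failed");
        }
    }

    // Records an event and passes the value through.
    private static int note(String event, int value) {
        TRACE.append(event).append(';');
        return value;
    }
}

class NewFlowDemo {
    public static void main(String[] args) {
        Widget ok = new Widget(false);
        Widget.TRACE.append("new returned;");

        Widget broken = null;
        try {
            broken = new Widget(true); // the assignment never happens
        } catch (IllegalStateException e) {
            Widget.TRACE.append("new threw;");
        }
        System.out.println(Widget.TRACE);
        System.out.println(broken == null); // prints true
    }
}
```

In the failing case the caller only ever sees the exception; the half-built object is not handed back through the new expression.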
The question is: why did the designers decide so?
Naming the constructor after its class is a long-established convention dating back at least to the early days of C++ in the early 1980s, possibly to its Simula predecessor.
The convention for the same name of the constructor as that of the class is for programming ease, constructor chaining, and consistency in the language.
For example, consider a scenario where you want to use the Scanner class. What if the Java developers had named its constructor xyz?
Then how would you know that you need to write:
Scanner scObj = new xyz(System.in) ;
That would have been really weird, right? Otherwise you would have to consult a huge manual to look up the constructor name of each class just to create an object, which is pointless when the problem is solved by simply naming constructors after their class.
Secondly, the compiler itself generates a constructor if you don't explicitly provide one. What name should the compiler choose so that it is clear to the programmer? Obviously, the best choice is the name of the class.
Thirdly, you may have heard of constructor chaining. When chaining calls among constructors, how would the compiler know what name you gave to the constructor of the chained class? Obviously, the solution is again the same: KEEP THE NAME OF THE CONSTRUCTOR THE SAME AS THAT OF THE CLASS.
When you create the object, you invoke the constructor by calling it in your code with the new keyword (passing arguments if needed). All superclass constructors are then invoked by chaining the calls, which finally yields the object.
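As a rough illustration of explicit chaining, here is a sketch with two hypothetical classes, Shape and Rectangle. In Java, chained calls use the keywords this(...) and super(...) rather than a constructor name.

```java
// Hypothetical classes for illustration only.
class Shape {
    final String label;

    Shape(String label) {
        this.label = label;
    }
}

class Rectangle extends Shape {
    final int width;
    final int height;

    // Convenience constructor: delegates to the two-argument one via this(...).
    Rectangle(int side) {
        this(side, side);
    }

    Rectangle(int width, int height) {
        // Shape has no no-argument constructor, so the superclass call must be explicit.
        super("rectangle");
        this.width = width;
        this.height = height;
    }

    int area() {
        return width * height;
    }
}

class ChainingDemo {
    public static void main(String[] args) {
        Rectangle square = new Rectangle(3);
        System.out.println(square.label);  // prints rectangle
        System.out.println(square.area()); // prints 9
    }
}
```

Calling new Rectangle(3) runs Shape(String), then the two-argument Rectangle constructor, then returns through the one-argument constructor.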
Thanks for Asking.