Why do C# and Java require everything to be in a class?

It seemed like this question should have been asked before, but searching found nothing.
I've always wondered what's the point of making us put every bit of code inside a class or interface. I seem to remember that there were some advantages to requiring a main() function like C, but nothing for classes. Languages like Python are, in a way, even more object oriented than Java since they don't have primitives, but you can put code wherever you want.
Is this some sort of "misinterpretation" of OOP? After all, you can write procedural code like you would in C and put it inside a class, but it won't be object oriented.

I think the goal of requiring that everything be enclosed in classes is to minimize the number of concepts you need to deal with in the language. In C# or Java, you only need to understand the object model. That model is fairly complex, but it consists only of classes with members and instances of classes (objects).
I think this is a very important goal that most languages try to follow in one way or another. If C# had some global code (for example, to allow interactive evaluation and specification of the startup code without a Main method), you'd have one additional concept to learn (top-level code). The choice made by C#/Java is of course just one way to get the simplicity.
Of course, it is a question whether this is the right choice. For example:
In functional languages, programs are structured using types (type declarations) and expressions. The body of the program is simply an expression that is evaluated, which is a lot simpler than a class with a Main method, and it also enables interactive scripting (as in Python).
In Erlang (and similar languages), a program is structured as concurrently executing processes, with one main process that starts the other processes. This is a dramatically different approach, but it makes good sense for some types of applications.
In general, every language has some way of looking at the world and modelling it and uses this point of view when looking at everything. This works well in some scenarios, but I think that none of the models is fully universal. That may be a reason why languages that mix multiple paradigms are quite popular today.
As a side note, I think that the use of a Main method is a somewhat debatable choice (probably inherited from C/C++). I would suppose that a cleaner object-oriented solution would be to start the program by creating an instance of some Main class.
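To make the contrast concrete, here is a minimal Java sketch of both styles. The class names (ProceduralStyle, Application) are made up for illustration, and the "instance as entry point" part is only a simulation: Java itself still needs a static main, so some launcher would have to create the object and call run() on it.

```java
// Procedural style: the class is only a container for static functions,
// much like a C translation unit with a main() function.
final class ProceduralStyle {
    static int sum(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));
    }
}

// Hypothetical object-oriented entry point: the program is an object that
// a launcher would instantiate and then run.
final class Application implements Runnable {
    private final int[] values;
    private int result;

    Application(int[] values) {
        this.values = values.clone();
    }

    @Override
    public void run() {
        result = ProceduralStyle.sum(values);
    }

    int result() {
        return result;
    }
}
```

A launcher would then do something like `new Application(new int[] {1, 2, 3}).run()`, which is roughly what "starting the program by creating an instance" would look like if the runtime did it for you.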

C# was not designed for "programming in the small". Rather, it was designed for component-oriented programming. That is, for programming scenarios where teams of people are developing interdependent software components that are going to be released in multiple versions over time.
The emphasis on programming-in-the-large and away from programming-in-the-small means that sometimes there is a whole lot of 'ceremony' around small programs. Using this, class that, main blah blah blah, all to write 'hello world'.
The property "a one line program is one line long" would be nice to have in C#. We're considering allowing code outside of classes in small programs as a possible feature in a hypothetical future version of C#; if you have constructive opinions pro or con on such a feature, feel free to send them to me via the contact link on my blog.

I think the idea with Java was that a top-level class would represent a single unit of code that would be compiled to a separate .class file. The idea was that these discrete, self-contained units of code could then be easily shared and combined into many projects (just as a carpenter can combine basic parts like nuts, bolts, pieces of wood, etc., to make a variety of items). The class was seen as the smallest, most basic atomic unit, and thus everything should be a part of a class to make it easier to assemble larger programs from these parts.
One can argue that object-oriented programming's promise of easily composable code didn't work out very well, but back when Java was being designed, the goal of OOP was to create little units (classes) that could easily be combined to make unique programs.
I imagine C# had some of the same goals in mind.

Related

How to share business concepts across different programming languages?

We develop a distributed system built from components implemented in different programming languages (C++, C# and Python) and communicating one with another across a network.
All the components in the system operate with the same business concepts and communicate one with another also in terms of these concepts.
As a result, we struggle heavily with the following two challenges:
Keeping the representation of our business concepts in these three languages in sync
Serialization / deserialization of our business concepts across these languages
A naive solution for this problem would be just to define the same data structures (and the serialization code) three times (for C++, C# and Python).
Unfortunately, this solution has serious drawbacks:
It creates a lot of “code duplication”
It requires a huge amount of cross-language integration tests to keep everything in sync
Another solution we considered is based on frameworks like ProtoBufs or Thrift. These frameworks have an internal language in which the business concepts are defined, and the representation of these concepts in C++, C# and Python (together with the serialization logic) is then auto-generated by the framework.
While this solution doesn’t have the above problems, it has another drawback: the code generated by these frameworks couples together the data structures representing the underlying business concepts and the code needed to serialize/deserialize these data-structures.
We feel that this pollutes our code base – any code in our system that uses these auto-generated classes is now “familiar” with this serialization/deserialization logic (a serious abstraction leak).
We can work around it by wrapping the auto-generated code in our own classes / interfaces, but this brings us back to the drawbacks of the naive solution.
Can anyone recommend a solution that gets around the described problems?

Lev, you may want to look at ICE. It provides object-oriented IDL with mapping to all the languages you use (C++, Python, .NET (all .NET languages, not just C# as far as I understand)). Although ICE is a middle-ware framework, you don't have to follow all its policies.
Specifically in your situation you may want to define the interfaces of your components in ICE IDL and maintain them as part of the code. You can then generate code as part of your build routine and work from there. Or you can use more of the power that ICE gives you.
ICE supports C++ STL data structures and it supports inheritance, so it should give you a sufficiently powerful formalism to build your system gradually over time with a good degree of maintainability.

Well, once upon a time MS tried to solve this with IDL. Well, actually it tried to solve a bit more than defining data structures, but, anyway, that's all in the past because no one in their right mind would go the COM route these days.
One option to look at is SWIG, which is supposed to be able to port data structures as well as actual invocation across languages. I haven't done this myself, but there's a chance it won't couple the serialization and data structures as tightly as protobufs.
However, you should really consider whether the aforementioned coupling is such a bad thing after all. What would be the ideal solution for you? Supposedly it's something that does two things: it generates compatible data structures across multiple languages based on one definition and it also provides the serialization code to stitch them together - but in a separate abstraction layer. The idea being that if one day you decide to use a different serialization method you could just switch out that layer without having to redefine all your data structures. So consider that - how realistic is it really to expect to some day switch out only the serialization code without touching the interfaces at all? In most cases the serialization format is the most permanent design choice, since you usually have issues with backwards compatibility, etc. - so how much are you willing to pay right now in development cost in order to be able to theoretically pull that off in the future?
Now let's assume for a second that such a tool exists, one that separates data structure generation from serialization. And let's say that after 2 years you decide you need a completely different serialization method. Unless this tool also supports pluggable serialization formats, you would need to develop that layer anyway in order to stitch your existing structures to the new serialization solution, and that's about as much work as just choosing a new package altogether. So the only really viable solution that would answer your requirements is something that not only supports data type definition and code generation across all your languages, and is not only serialization agnostic, but also has a ready-made implementation of the future serialization format you would want to switch to. If it's only agnostic to the serialization format, you'd still have the task of implementing it on your own, in all languages, which isn't really less work than redefining some data structures.
So my point is that there's a reason serialization and data type definition so often go together: it's simply the most common use case. I would take a long look at what exactly you wish to achieve with the abstraction level you require, think about how much work developing such a solution would entail, and decide whether it's worth it. I'm certain that there are tools that do this, btw, just probably the expensive proprietary kind that cost $10k per license. The same argument applies there in my opinion: it's probably just over-engineering.

All the components in the system operate with the same business concepts and communicate one with another also in terms of these concepts.
If I understand you correctly, you have split up your system into different parts communicating through well-defined interfaces. But your interfaces share data structures you call "business concepts" (hard to understand without seeing an example), and since those interfaces have to be built for all three of your languages, you have problems keeping them "in sync".
When keeping interfaces in sync becomes a problem, it seems obvious that your interfaces are too broad. There are different possible reasons for that, with different solutions.
Possible reason 1: you overgeneralized your interface concept. If that's the case, redesign here: throw generalization overboard and create interfaces which are only as broad as they have to be.
Possible reason 2: the parts written in different languages are not dealing with separate business cases; you may have a "horizontal" partition between them, but not a vertical one. If that's the case, you cannot avoid the broadness of your interfaces.
Code generation may be the right approach here if reason 2 is your problem. If existing code generators don't suit your needs, why don't you just write your own? Define the interfaces, for example, as classes in C#, introduce some meta attributes, and use reflection in your code generator to extract the information again when generating the corresponding C++, Python and also the "real-to-be-used" C# code. If you need different variants with or without serialization, generate them too. A working generator should not take more than a couple of days of effort (YMMV depending on your requirements).
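As a rough illustration of the reflection-based generator idea, here is a small sketch. It is written in Java rather than the C# suggested above, it skips the meta attributes entirely, and the TradeOrder class, the PythonGenerator name and the type mapping are all assumptions made up for the example. It reads the fields of a definition class and emits a Python dataclass as text.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;

// Hypothetical business concept, used here as the single source of truth.
final class TradeOrder {
    String symbol;
    int quantity;
    double limitPrice;
}

final class PythonGenerator {
    // Assumed mapping from Java field types to Python type hints.
    private static final Map<Class<?>, String> PY_TYPES = Map.of(
            String.class, "str",
            int.class, "int",
            double.class, "float",
            boolean.class, "bool");

    // Emits the source text of a Python dataclass mirroring the given class.
    // Note: getDeclaredFields() does not formally guarantee declaration
    // order, so a real generator might sort or annotate the fields.
    static String generate(Class<?> type) {
        StringBuilder out = new StringBuilder();
        out.append("from dataclasses import dataclass\n\n");
        out.append("@dataclass\n");
        out.append("class ").append(type.getSimpleName()).append(":\n");
        for (Field field : type.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // only instance data belongs to the concept
            }
            String pyType = PY_TYPES.getOrDefault(field.getType(), "object");
            out.append("    ").append(field.getName())
               .append(": ").append(pyType).append("\n");
        }
        return out.toString();
    }
}
```

Calling `PythonGenerator.generate(TradeOrder.class)` returns the text of a dataclass with the three fields. A C++ or C# emitter would follow the same pattern with a different type mapping and template.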

I agree with Tristan Reid (wrapping the business logic).
Actually, some months ago I faced the same problem, and then I happened to discover the book "The Art Of Unix Programming" (freely available online). What grabbed my attention was the philosophy of separating policy from mechanism (i.e. interfaces from engines). Modern programming environments such as the .NET platform try to integrate everything under a single domain. At the time I was asked to develop a web application that had to satisfy the following requirements:
It had to be easily adapted to future trends of User Interfaces without having to change the core algorithms.
It had to be accessible by means of different interfaces: web, command line and desktop GUI.
It had to run on Windows and Linux.
I bet on developing the mechanism (engines) completely in C/C++, using native OS libraries (POSIX or WinAPI) and good open source libraries (postgresql, xml, etc.). I developed the engine modules as command-line programs and I eventually implemented 2 interfaces: web (with PHP + jQuery) and desktop (.NET Framework). Neither interface had anything to do with the mechanisms: they simply launched the core module executables by calling functions such as CreateProcess() on Windows or fork() on UNIX, and used pipes to monitor their processes.
I'm not saying the UNIX programming philosophy is good for all purposes, but I have been applying it since then with good results, and maybe it will work for you too. Choose one language for implementing the mechanism and then use another that makes interface design easy.

You can wrap your business logic as a web service and call it from all three languages - just a single implementation.

You could model these data structures using tools like a UML modeler (Enterprise Architect comes to mind as it can generate code for all 3.) and then generate code for each language directly from the model.
Though I would look closely at a previous comment about using XSD.

I would accomplish that by using some kind of meta-information about your domain entities (either XML or DSL, depending on complexity) and then go for code generation for each language. That would reduce (manual) code duplication.

Why is there no declarative immutability in C#?

Why did the designers of C# not allow for something like this?
public readonly class ImmutableThing
{
...
}
One of the most important ways to achieve safe multithreading is the use of immutable objects/classes, yet there is no way to declare a class as immutable. I know I can make it immutable by proper implementation, but having this enforced by the class declaration would make it so much easier and safer. Commenting a class as immutable is a "door prop" solution at best.
One look at a class declaration and you would instantly know it was immutable. If you had to modify someone else's code, you would know a class does not allow changes by intent. I can only see advantages here, but I can't believe no one thought about this before. So why is it not supported?
EDIT
Some say this is not very important feature but that does not really convince me. Multicore processors showed up because increasing performance by frequency hit a wall. Supercomputers are heavily multiprocessor machines. Parallel processing is more and more important and is one of the main ways to improve performance. The support for multithreading and parallel processing in .NET is significant (various lock types, thread pool, tasks, async calls, concurrent collections, blocking collection, parallel foreach, PLINQ and so on) and it seems to me everything that helps you write parallel code more easily gives an edge. Even if it's non trivial to implement.

Basically, because it's complicated - and as usr wrote, features need a lot of work in various ways before they're ready to ship. (It's easy being an armchair language designer - I'm sure it's incredibly difficult to really do it, in a language with millions of developers with critical code bases which can't be broken by changes.)
It's tricky for a compiler to verify that a type is visibly-immutable without being overly restrictive in some cases. As an example, String is actually mutable within mscorlib, but the code of other types (e.g. StringBuilder) has been written very carefully to avoid the outside world ever seeing that mutability.
Eric Lippert has written a lot on immutability - it's a complex topic which would/will need a lot of work to turn into a practical language feature. It's also quite hard to retrofit onto a language and framework which didn't have it to start with. I'd love C# to at least make it easier to write immutable types, and I suspect the team has spent quite a while thinking about it - whether they'll ever be happy enough with their ideas to turn it into a production language feature is a different matter.

Features need to be designed, implemented, tested, documented, deployed and supported. That's why we get the most important features first, and the less important ones late or never.
Your proposal is ok, but there is an easy workaround (as you said). Therefore it is not an "urgent" feature.
There is also a thing called representational immutability, where state mutations inside the object are allowed but are never made visible to the outside. Example: a lazily-calculated field. This would not be possible under your proposal, because the compiler could never prove that the class is immutable to the outside when its fields are routinely written to.
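For illustration, here is a minimal sketch of such a lazily-calculated field, written in Java (the Name class is hypothetical). The object looks immutable from the outside, yet one private field is written after construction. java.lang.String caches its hash code in a similar way.

```java
final class Name {
    private final String first;
    private final String last;

    // Lazily computed cache; 0 means "not computed yet". This field is
    // mutable, but no caller can ever observe the mutation.
    private int hash;

    Name(String first, String last) {
        this.first = first;
        this.last = last;
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // Recomputing is harmless: every thread derives the same value
            // from the final fields, so a racy write is benign.
            h = 31 * first.hashCode() + last.hashCode();
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Name)) {
            return false;
        }
        Name that = (Name) other;
        return first.equals(that.first) && last.equals(that.last);
    }
}
```

A compiler enforcing "no field is ever written after construction" would reject this class, even though two equal Name instances always behave identically to every caller.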

Techniques for translating code by hand

In case I want to translate a program's code in a different programming language, I have these options: to use special software or to do it on my own (please tell me if there are other ways).
Personally I'm a beginner in C++, C# and Java. I'm not familiar with other languages. I'd like to try to do a translation of one of my C++ programs to C# and/or Java. But before doing that I'd like to learn about a technique or two about translating. I'd like to learn it from someone who is familiar with such thing.
So, can you tell me techniques for translating code from one programming language to another? And if there is something that I should know before I start translating, please tell me.

1. Learn both languages. Understand each language's idioms and philosophy. Acquaint yourself with common coding styles and design patterns.
2. Read your source program, and understand what it is doing.
3. Think about how you would express the same overall goals in the target language, using the knowledge and experience gained from (1).
4. Rewrite the program in the target language.
(Counter-example: Your source is Java, and you see String a = new String();. You open C++ and say, std::string * a = new std::string;. Wrong, go back to (1).)

Some of the worst C# code I've seen was written by programmers that were thinking in C or C++ while writing C# syntax.
C# and Java are fundamentally different from C++. I don't like the term "translating" because this is the wrong attitude. You actually need to rewrite the program after you've learned how to think in the terms of the new language.

There are tools, but I have never found one that's perfect. Even if the code is convertible syntactically, libraries would be a problem. I prefer to have deep enough knowledge of both languages (and, of course, of the project being converted) and convert everything by hand. The advantage is that you can create a clone of the project using each language's best techniques (which may be either impossible or not optimal in other languages). The disadvantages, however, are that you might make mistakes during the conversion (causing an incompatible end result) and that it's slower.

If you are into Test Driven Development then you have all the tests for the original implementation. Just keep coding in the new language until you pass all the tests once again. Don't worry too much about Kerrek SB's counterexample. If and when you run into problems like that, just write more tests.
On the other hand, if you don't already have a suite of tests, then why are you wasting time on porting when you should be testing?

Isn't it a headache to "translate" it yourself if there is software that can do it for you? Unless there is a special purpose for having the skill of translating from one language to another...
Note that not all languages have the same computational power, therefore it may not always work.
Perhaps it would be better to stay focused on one main language.

Using a DSL to generate C# Code

Currently the project I'm working with does not have completely fixed models (due to an external influence) and hence I'd like some flexibility in writing them. Currently they are replicated across three different layers of the application (db, web api and client) and each has similar logic in it (ie. validation).
I was wondering if there is an approach that would allow me to write a model file (say in ruby), and then have it convert that model into the necessary c# files. Currently it seems I'm just writing a lot of boilerplate code that may change at any stage, whereas this generated approach would allow me to focus on much more important things.
Does anyone have a recommendation for something like this, a dsl/language I can do this in, and does anyone have any experience regarding something like this?

This can be easily done with ANTLR. If the output is similar enough you can simply use the text templating mechanism—otherwise it can generate an abstract syntax tree for you to traverse.

I have seen a system that used partial classes and partial methods to allow for regeneration of code without affecting custom code. The "rules engine", if you will, was completely generated from a Visio state diagram. This is basically a poor man's workflow, but very easy to modify. The Visio diagram was exported to XML, which was read in using PowerShell and T4 to generate the classes.
The above example is of an external DSL, i.e. external to the programming language that the application runs in. You could, on the other hand, create an internal DSL, which is implemented and used in a programming language.
This and the previous article on DSLs from Code-Magazine are quite good.
In the above link Neal Ford shows you how to create an internal DSL in C# using a fluent interface.
One thing he hasn't mentioned yet is that you can put the attribute [EditorBrowsable(EditorBrowsableState.Never)] on your methods so that they don't appear in IntelliSense. This means that you can hide the non-DSL (if you will) methods on the class from the user of the DSL, making the fluent API much more discoverable.
You can see a fluent interface being written live in this video series by Daniel Cazzulino on writing an IoC container with TDD
On the subject of external DSLs, you also have the option of Oslo (CTP at the moment), which is quite powerful in its ability to let you create external DSLs that can be executed directly rather than used for code generation, which, come to think of it, isn't really much of a DSL at all.
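To give a feel for the internal DSL / fluent interface idea mentioned above, here is a small sketch. It is in Java rather than C#, and the Validation class and its rule names are invented for the example, not taken from the linked article or video. Each method returns the object itself, so calls chain into something that reads like a tiny language.

```java
import java.util.ArrayList;
import java.util.List;

final class Validation {
    private final String field;
    private final List<String> rules = new ArrayList<>();

    private Validation(String field) {
        this.field = field;
    }

    // Entry point of the DSL.
    static Validation forField(String field) {
        return new Validation(field);
    }

    // Each rule method records itself and returns this to allow chaining.
    Validation required() {
        rules.add("required");
        return this;
    }

    Validation maxLength(int limit) {
        rules.add("maxLength(" + limit + ")");
        return this;
    }

    // Renders the collected rules; a real DSL might generate code or
    // build a validator object here instead.
    String describe() {
        return field + ": " + String.join(", ", rules);
    }
}
```

Usage: `Validation.forField("name").required().maxLength(50).describe()` yields "name: required, maxLength(50)".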

I think you are on the right track.
What I usually do in a situation like this is design a simple language that captures my needs and write an LL(1) (recursive descent) parser for it.
If the language has to have non-trivial C# syntax in it, I can either quote that, or just wrap it in brackets that I can recognize, and just pass it through to the output code.
I can either have it generate a parse tree structure, and generate say 3 different kinds of code from that, or I can just have it generate code on the fly, either using a mode variable with 3 values, or just simultaneously write code to 3 different output files.
There's more than one way to do it. If you are afraid of writing parsers (as some programmers are), there is lots of help elsewhere on SO.
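Here is a rough sketch of what such a recursive descent parser could look like. Everything in it is an assumption for illustration: the tiny model language (model Name { field: type; ... }), the ModelCompiler name, and the choice of Java as the implementation language. It generates the C# code on the fly while parsing, which is the second option described above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Assumed grammar:
//   model := "model" IDENT "{" field* "}"
//   field := IDENT ":" IDENT ";"
final class ModelCompiler {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private ModelCompiler(String source) {
        // Crude tokenizer: identifiers and punctuation; whitespace and any
        // other characters are silently skipped.
        Matcher m = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*|[{}:;]").matcher(source);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    static String toCSharp(String source) {
        return new ModelCompiler(source).parseModel();
    }

    private String next() {
        if (pos >= tokens.size()) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        return tokens.get(pos++);
    }

    private void expect(String token) {
        String actual = next();
        if (!actual.equals(token)) {
            throw new IllegalArgumentException(
                    "expected '" + token + "' but found '" + actual + "'");
        }
    }

    private String identifier() {
        String token = next();
        char first = token.charAt(0);
        if (!Character.isLetter(first) && first != '_') {
            throw new IllegalArgumentException("expected identifier, found '" + token + "'");
        }
        return token;
    }

    // One method per grammar rule; each emits C# text as it parses.
    private String parseModel() {
        expect("model");
        String name = identifier();
        expect("{");
        StringBuilder out = new StringBuilder("public class " + name + "\n{\n");
        while (pos < tokens.size() && !tokens.get(pos).equals("}")) {
            out.append(parseField());
        }
        expect("}");
        return out.append("}\n").toString();
    }

    private String parseField() {
        String name = identifier();
        expect(":");
        String type = identifier();
        expect(";");
        return "    public " + type + " " + name + " { get; set; }\n";
    }
}
```

For example, `ModelCompiler.toCSharp("model Person { Name: string; Age: int; }")` produces a C# class Person with two auto-properties. Emitting the db and web api variants would mean either building a parse tree first or writing to several outputs at once, as described above.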

Comparing C# and Java

I learned Java in college, and then I was hired by a C# shop and have used that ever since. I spent my first week realizing that the two languages were almost identical, and the next two months figuring out the little differences. For the most part, I was noticing the things that Java had that C# doesn't, and thus was mostly frustrated (example: enum types which are full-fledged classes, not just integers with a fresh coat of paint). I have since come to appreciate the C# world, but I can't say I knew Java well enough to really contrast the two, so I'm curious to get a community cross-section.
What are the relative merits and weaknesses of C# and Java? This includes everything from language structure to available IDEs and server software.

Comparing and contrasting the two languages can be quite difficult, as in many ways it is the libraries you use with each language that best showcase the advantages of one over the other.
So I'll try to list out as many things I can remember or that have already been posted and note who I think has the advantage:
GUI development (thick or thin). C# combined with .NET is currently the better choice.
Automated data source binding. C# has a strong lead with LINQ, and a wealth of third-party libraries also gives it the edge
SQL connections. Java
Auto-boxing. Both languages provide it, but C# properties provide a better design for it with regard to setters and getters
Annotation/Attributes. C# attributes are a stronger and clearer implementation
Memory management - Java VM in all the testing I have done is far superior to CLR
Garbage collection - Java is another clear winner here. Unmanaged code with the C#/.NET framework makes this a nightmare, especially when working with GUIs.
Generics - I believe the two languages are basically tied here... I've seen good points showing either side being better. My gut feeling is that Java is better, but I have nothing logical to base it on. Also, I've used C# generics a lot and Java generics only a few times...
Enumerations. Java all the way, C# implementation is borked as far as I'm concerned.
XML - Toss up here. The XML and serialization capabilities you get with .NET natively beat what you get with eclipse/Java out of the box. But there are lots of libraries for both products to help with XML... I've tried a few and was never really happy with any of them. I've stuck with native C# XML combined with some custom libraries I made on my own and I'm used to it, so it's hard to give this a fair comparison at this point...
IDE - Eclipse is better than Visual Studio for non-GUI work. So Java wins for non-GUI and Visual Studio wins for GUI...
Those are all the items I can think of for the moment... I'm sure you could literally pick hundreds of items to compare and contrast between the two. Hopefully this list is a cross section of the more commonly used features...

One difference is that C# can work with Windows better. The downside of this is that it doesn't work well with anything but Windows (except maybe with Mono, which I haven't tried).

Another thing to keep in mind, you may also want to compare their respective VMs.
Comparing the CLR and Java VM will give you another way to differentiate between the two.
For example, if doing heavy multithreading, the Java VM has a stronger memory model than the CLR (.NET's equivalent).

C# has a better GUI with WPF, something that Java has traditionally been poor at.
C# has LINQ which is quite good.
Otherwise the 2 are practically the same - how do you think they created such a large class library so quickly when .NET first came out? Things have changed slightly since then, but fundamentally, C# could be called MS-Java.

Don't take this as anything more than an opinion, but personally I can't stand Java's GUI. It's just close enough to Windows but not quite, so it gets into an uncanny valley area where it's just really upsetting to me.
C# (and other .Net languages, I suppose) allow me to make programs that perfectly blend into Windows, and that makes me happy.
Of course, it's moot if we're not talking about developing a desktop application...

Java:
Enums in Java kick so much ass, it's not even funny.
Java supports generic variance
C#:
C# is no longer limited to Windows (Mono).
The lack of the keyword internal in Java is rather disappointing.
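For readers who haven't used the two Java features named above, here is a short sketch (the Shape and Variance names are made up for the example). It shows an enum whose constants carry state and override behavior, and generic variance expressed at the use site with wildcards.

```java
import java.util.List;

// A Java enum is a full class: constants can have fields, constructors
// and per-constant method bodies.
enum Shape {
    SQUARE(4) {
        @Override
        double area(double size) {
            return size * size;
        }
    },
    TRIANGLE(3) {
        @Override
        double area(double size) {
            return Math.sqrt(3) / 4 * size * size; // equilateral triangle
        }
    };

    private final int sides;

    Shape(int sides) {
        this.sides = sides;
    }

    int sides() {
        return sides;
    }

    abstract double area(double size);
}

final class Variance {
    // Covariant use: accepts List<Integer>, List<Double>, and so on.
    static double sum(List<? extends Number> numbers) {
        double total = 0;
        for (Number n : numbers) {
            total += n.doubleValue();
        }
        return total;
    }

    // Contravariant use: accepts List<Integer>, List<Number>, List<Object>.
    static void fill(List<? super Integer> sink) {
        sink.add(1);
        sink.add(2);
    }
}
```

For instance, `Shape.SQUARE.area(2.0)` evaluates the square's own method body, and `Variance.sum(List.of(1, 2, 3))` accepts a List<Integer> where a List<? extends Number> is expected.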

You said:
enum types which are full-fledged classes, not just integers with a fresh coat of paint
Have you actually looked at the output? If you compile an application with enums in it and then read the CIL, you'll see that an enum is actually a sealed class deriving from System.Enum.
Tools such as Red-Gate (formerly Lutz Roeder's) Reflector will disassemble it as close to the original C# as possible, so it may not be easily visible what is actually happening under the hood.

As Elizabeth Barrett Browning said: How do I love thee? Let me count the ways.
Please excuse the qualitative (vs. quantitative) aspect of this post.
Comparing these 2 languages (and their associated run-times) is very difficult. Comparisons can be at many levels and focus on many different aspects (such as GUI development mentioned in earlier posts). Preference between them is often personal and not just technical.
C# was originally based on Java (and the CLR on the JRE) but, IMHO, has, in general, gone beyond Java in its features, expressiveness and possibly utility. Being controlled by one company (vs. a committee), C# can move forward faster than Java can. The differences ebb and flow across releases with Java often playing catch up (such as the recent addition of lambdas to Java which C# has had for a long time). Neither language is a super-set of the other in all aspects as both have features (and foibles) the other lacks.
A detailed side-by-side comparison would likely take several hundred pages. But my conclusion is that for most modern business-related programming tasks they are similar in power and utility. The most critical difference is probably in portability. Java runs on nearly all popular platforms, while C# runs mostly only on Windows-based platforms (ignoring Mono, which has not been widely successful). Java, because of its portability, arguably has a larger developer community and thus more third-party library and framework support.
If you feel the need to select between them, your best criterion is your platform of interest. If all your work will run only on Windows systems, IMHO, C#/CLR, with its richer language and its ability to directly interact with Windows' native APIs, is a clear winner. If you need cross-system portability then Java/JRE is a clear winner.
PS. If you need more portable job skills, then IMHO Java is also a winner.
