C# build speed vs native C++

Has anyone made a comparison of build times between similarly sized C++ and C# projects?
We are trying to convince our IT drone that a multi-million line C++ project with heavy template usage compiles slower than a similarly sized C# project, and that the extra money spent on memory and SSD disks is well spent.

That is definitely a common and really important problem with C++. I think it's the most urgent thing to fix in the next version of C++, maybe using Modules. It's a language design problem, linked to the way C++ compilation units are organized.
Anyway, there are efforts to fix this by making compilers faster. The most important effort currently is Clang.
For sources on compilation time, there are tons (google: unity build, c++ compilation time).
But I think a demonstration would be better. Take Boost (http://boost.org) and compile it. It's a big set of libraries, so it compares easily with a big C# codebase.
About using an SSD to speed up C++ compilation, here is a study: http://exdream.com/Blog/post/2009/05/03/Visual-Studio-compile-times-on-different-disk-drives-and-SSDs.aspx

But I believe that unless you use Ngen.exe on your project, every C# project goes through the two-stage compilation process we already know (C# to IL at build time, IL to native code at run time). C++, on the other hand, compiles in one step, with no JITing.
I guess there is a bit of difference in build times. Perhaps Jon Skeet has the perfect answer.

I have never done a comparison like this on similar projects, where by similar you probably mean the same line count. But what I have usually noticed in my experience is that C++ compiles much slower. I suppose that's because it goes through more steps during compilation, one of which is linking, which can be pretty slow.

Related

Why compile into intermediate code?

Why do ActionScript, Java, C#, etc. compile into intermediate code?
I'm aware of cross-platform benefits of using intermediate code.
The question is: What is the benefit of compiling to intermediate code, compared to scripts (JS, Python, PHP, Perl, etc.) that are interpreted?
Is it just for code obfuscation? Or what?
In addition what is the benefit compared to compiling to native code?
It is much faster to parse and JIT-compile IL code than to parse a high-level language like Java or especially C# (which has more features).
It also allows developers to use new language features without updating anything on the end users' machines (e.g., LINQBridge).
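A minimal sketch of that point, assuming the LINQBridge library (which backfills System.Linq for .NET 2.0): the C# 3 code below compiles down to ordinary IL method calls, so nothing new is required on the end user's machine:

```csharp
using System.Linq; // supplied by LINQBridge when targeting .NET 2.0

static class Demo
{
    // C# 3 lambdas and extension methods, running on the old runtime.
    static int[] EvensSquared(int[] xs)
    {
        return xs.Where(x => x % 2 == 0)
                 .Select(x => x * x)
                 .ToArray();
    }
}
```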
First of all, historically Java was targeted at mobile and embedded devices with very limited CPU and memory, and the JVM on such devices couldn't do the kind of intensive optimization that modern JITs do now. So the Java engineers decided to move optimization to the compile phase.
Next, having "byte code"/IL allows you to have many compilers from different languages targeting that byte code, and only one JVM/JIT. That is much more efficient than creating a separate VM+JIT for each language. Remember, you have Java, JRuby, Jython, Groovy, etc. in the JVM world, and C#, VB.NET, F#, etc. in the .NET world, with only one runtime/VM for each.
Intermediate code is very similar to assembly in that it contains a limited set of instructions. The runtime is then able to reliably and consistently operate on this (somewhat) small set of instructions without having to worry about parsing the language, etc. It can therefore gain performance improvements and make optimizations.
In .NET, compilation to MSIL enables interoperability between languages such as C# and VB.NET.
Binary code instructions are very simple and atomic, much more so than programming statements can be.
For example, z = a + b - (c * d) would need to be translated into loading four values into registers, then an add, a multiply, and a subtract, and then a write into another memory location. So we have at least 8 instructions or so just for that line!
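To make that concrete, here is a minimal sketch (names and locals are illustrative) of that statement in C#, with roughly the stack-machine IL the compiler emits shown in comments:

```csharp
static class IlDemo
{
    static int Compute(int a, int b, int c, int d)
    {
        int z = a + b - (c * d);
        // Roughly the IL emitted for the line above (inspect with ildasm):
        //   ldarg.0    // push a
        //   ldarg.1    // push b
        //   add        // a + b
        //   ldarg.2    // push c
        //   ldarg.3    // push d
        //   mul        // c * d
        //   sub        // (a + b) - (c * d)
        //   stloc.0    // store the result into z
        return z;
    }
}
```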
Intermediate code is the best of both worlds: cross-platform, but also closer to the way machine code is normally written.
In addition to SLaks's answer, compiling to IL enables a degree of cross-language interoperability that generally doesn't exist in interpreted languages.
This advantage can be huge for new languages. Scala's only been around since 2003, and it's already gained an immense amount of traction. Ruby, on the other hand, hasn't spread very far beyond being used for Rails apps in its 1.5 decades of existence. This is at least in part because Scala is bytecode-compatible with all pre-existing Java code and libraries, which gives it a huge leg up: Its community can focus most of its effort on the language itself, and potential adopters don't have to worry about going through any special contortions (or, worse yet, replacing their entire codebase) in order to start using Scala. F#'s story is almost identical, but for the other major managed environment.
Meanwhile Ruby doesn't speak with code from other languages quite so easily, so its community has to sink a lot more effort into developing Ruby-specific libraries and frameworks, and its potential users have to be a lot more willing to commit to a large-scale platform shift in order to use it.

C++/CLI performance compared to Native C++?

Good morning,
I am writing a spell checker which, in this case, is performance-critical. Since I am planning to connect to a DB and build the GUI in C#, I wrote an edit-distance calculation routine in C and compiled it to a DLL, which I call from C# via DllImport. The problem is that I think (though I may be wrong) that marshalling words one by one from String to char * is causing a lot of overhead. So I thought about using C++/CLI, so that I can work with the .NET String type directly... My question is then: how does C++/CLI performance compare to native C code for heavy mathematical calculations and array access?
Thank you very much.
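For context, the interop setup the question describes boils down to something like this (a sketch; the DLL name and entry point are hypothetical). With CharSet.Ansi, every call converts both System.String arguments to char*:

```csharp
using System.Runtime.InteropServices;

static class NativeSpell
{
    // Each call marshals both strings from System.String to char*.
    [DllImport("editdist.dll", CharSet = CharSet.Ansi)]
    public static extern int EditDistance(string a, string b);
}
```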
C++/CLI will have to do some kind of marshaling too.
Like all performance-related problems, you should measure and optimize. Are you sure C# is not going to be fast enough for your purposes? Don't underestimate the optimizations the JIT compiler is going to do, and don't speculate about the overhead of a language implementation solely because it is managed, without trying it. If that's not enough, have you considered unsafe C# code (with pointers) before trying unmanaged code?
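A minimal sketch of that unsafe-C# suggestion (not a tuned implementation): pin the strings and walk them through char* pointers, so nothing is marshalled or copied per call. Compile with /unsafe:

```csharp
using System;

static class Levenshtein
{
    public static unsafe int Distance(string s, string t)
    {
        // Pin both strings and access their characters via raw pointers.
        fixed (char* ps = s)
        fixed (char* pt = t)
        {
            int n = s.Length, m = t.Length;
            int[] prev = new int[m + 1];
            int[] curr = new int[m + 1];
            for (int j = 0; j <= m; j++) prev[j] = j;
            for (int i = 1; i <= n; i++)
            {
                curr[0] = i;
                for (int j = 1; j <= m; j++)
                {
                    int cost = ps[i - 1] == pt[j - 1] ? 0 : 1;
                    curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1),
                                       prev[j - 1] + cost);
                }
                int[] tmp = prev; prev = curr; curr = tmp; // reuse the rows
            }
            return prev[m];
        }
    }
}
```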
Regarding the performance profile of C++/CLI, it really depends on how it's used. If you compile to managed code (CIL) with /clr:pure, it's not going to be very different from C#. Native C++ functions in C++/CLI will have similar performance characteristics to plain C++. Passing objects between the native C++ and CLI environments will have some overhead.
I would not expect the bottleneck to be the DllImport itself.
I have written programs which call a DllImport function several hundred times per second, and it works just fine. You will pay a small performance penalty, but it is small.
Don't assume you know what needs to be optimized. Let sampling tell you.
I've done a couple of spelling correctors, and the way I did it (outlined here) was to organize the dictionary as a trie in memory, and search upon that.
If the number of words is large, the size of the trie can be much reduced by sharing common suffixes.
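A minimal sketch of that trie idea (illustrative, not the answerer's actual code): each node maps a character to a child node, and a flag marks where complete words end:

```csharp
using System.Collections.Generic;

class TrieNode
{
    public readonly Dictionary<char, TrieNode> Children =
        new Dictionary<char, TrieNode>();
    public bool IsWord; // true if a dictionary word ends at this node
}

class Trie
{
    private readonly TrieNode _root = new TrieNode();

    public void Add(string word)
    {
        TrieNode node = _root;
        foreach (char c in word)
        {
            TrieNode next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new TrieNode();
                node.Children[c] = next;
            }
            node = next;
        }
        node.IsWord = true;
    }

    public bool Contains(string word)
    {
        TrieNode node = _root;
        foreach (char c in word)
            if (!node.Children.TryGetValue(c, out node))
                return false;
        return node.IsWord;
    }
}
```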

Managed code (C#) vs MATLAB and C++ for speed

I am about to start developing an edge detection system (once I've read through a couple of books, which I'm doing at good speed), but one thing I am wondering about is the speed of something like MATLAB (which can compile code to C++) vs AForge.NET for edge detection.
Is unmanaged code generally faster?
Thanks
Unmanaged code can be faster for some things, but whether it will make a difference in your case is impossible to say without testing. You could try coding a small but representative subset of the problem in both C# and C++ to measure the difference. Of course, making it "representative" (in terms of performance characteristics) is the challenge.
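A sketch of what the C# side of such a representative micro-benchmark could look like (the kernel is a stand-in, not a full edge detector); the same loop can then be ported to C++ and timed the same way:

```csharp
using System;
using System.Diagnostics;

class Bench
{
    static void Main()
    {
        const int w = 1920, h = 1080;
        byte[] src = new byte[w * h];
        byte[] dst = new byte[w * h];
        new Random(42).NextBytes(src); // deterministic fake image data

        var sw = Stopwatch.StartNew();
        for (int y = 1; y < h - 1; y++)
            for (int x = 1; x < w - 1; x++)
            {
                // Simple horizontal gradient as a stand-in for edge detection.
                int gx = src[y * w + x + 1] - src[y * w + x - 1];
                dst[y * w + x] = (byte)Math.Min(255, Math.Abs(gx));
            }
        sw.Stop();
        Console.WriteLine("Kernel pass: " + sw.ElapsedMilliseconds + " ms");
    }
}
```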
Your first consideration should be to select the most productive language/environment from the set of languages and environments that provide acceptable performance. Productivity thoroughly trumps performance for the vast majority of computing problems.
Another contender, which is highly productive, is Python with numpy and scipy extensions. It can make use of LAPACK, which largely mitigates any performance concerns over using Python.
Firstly, yes, unmanaged code is generally faster, but how much faster? It depends. In your case, it shouldn't make much difference.
I can tell you this because very recently a friend of mine developed an edge detection system (to implement a motion detection system, I guess) using the AForge.NET library, and I don't remember him complaining about performance.

C# compilation time for large projects (compared to C++)

I often hear people praise the compilation speed of C#. So far I have only made a few tiny applications, and indeed I noticed that compilation was very fast. However, I was wondering if this still holds for large applications. Do big C# projects compile faster than C++ projects of a similar size?
Yes, C# normally compiles a lot faster. Not always fast enough, though. My biggest C# codebase, maybe a million lines of code across lots of projects, took about an hour to compile. But I suspect much of this time is due to Visual Studio's poor build system.
Compile time for C++, on the other hand, is usually much longer, but it is also much more dependent on how you organize your code. Poor handling of header file dependencies can easily increase compilation time by several orders of magnitude.
C++ is so slow to compile because header files have to be reread and reparsed every time they are included. Due to the way "#defines" work, it is very hard for a compiler to automatically pre-compile all header files. (Modula-2 did a much better job of this.) Having hundreds of headers read for each C++ file that is compiled is normal on a lot of C++ projects.
Sometimes incremental C++ compiles can be a lot faster than C#. If you have all your C++ header files (and design) in a very good state (see books like Large-Scale C++ Software Design and Effective C++), you can make a change to the implementation of a class that is used by most of the system and only have one DLL recompile.
As C# does not have separate header files, whenever you change the implementation of a class, all uses of the class get recompiled, even if the public interface of the class has not changed. This can be reduced in C# by using "interface-based programming" and "dependency injection", etc., but it is still a pain.
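A minimal sketch of that mitigation (project layout is illustrative): the application depends only on a small, stable contracts assembly, so implementation churn in the engine assembly does not ripple into the rest of the build:

```csharp
// Contracts assembly -- small and rarely changed.
public interface ISpellChecker
{
    bool IsCorrect(string word);
}

// Engine assembly -- references Contracts; free to change internally.
public sealed class TrieSpellChecker : ISpellChecker
{
    public bool IsCorrect(string word)
    {
        return word.Length > 0; // placeholder for a real dictionary lookup
    }
}

// App assembly -- references only Contracts; the engine is injected.
public sealed class Editor
{
    private readonly ISpellChecker _checker;
    public Editor(ISpellChecker checker) { _checker = checker; }
    public bool Check(string word) { return _checker.IsCorrect(word); }
}
```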
However, on the whole I find that C# compiles fast enough, but large C++ projects are so slow to compile that I find myself not wanting to add a method to a "base class" because of the rebuild time.
Having lots of Visual Studio projects with a handful of classes in each can slow down C# builds a lot. Combining related projects together and then "trusting" the developers not to use classes that are private to a namespace can at times be a great benefit. (NDepend can be used to check for people breaking the rules.)
(When trying to speed up C++ compiles I have found FileMon very useful. On one project I worked on, the STL was added to a header file and the build got a lot slower; just adding the STL to the precompiled header file made a big difference. So track your build time and investigate when it gets slower.)
As far as I can tell from my own experience: yes, C# projects compile a lot faster than C++ projects, even for large applications.
This can be explained by the fact that C# is a less complicated language than C++, and that C# is translated to IL (which can be optimized and translated to machine code later on) while C++ is translated straight to machine language.
It is also my observation that C# is significantly faster to compile than C++. One of the main reasons is of course templates, which don't need to live in headers in C# since there are no headers. But heavy use of templates (as in nearly any modern C++ library, such as Boost) kills the compile time in C++.

Could managed code (specifically .NET) ever become 'unmanaged'?

Recently I was talking with a friend of mine who had started a C++ class a couple of months ago (his first exposure to programming). We got onto the topic of C# and .NET generally, and he made the point to me that he felt it was 'doomed' for all of the commonly cited issues (low speed, breakable bytecode, etc.). I agreed with him on all those issues, but I held back from saying it was doomed, only because I felt that, in time, languages like C# could instead become native code (if Microsoft so chose to change the implementation of .NET from a bytecode, JIT runtime environment to one which compiles directly to native code like your C++ program does).
My question is, am I out to lunch here? I mean, it may take a lot of work (and may break too many things), but there isn't some type of magical barrier which prevents C# code from being compiled natively (if one wanted to do it), right? There was a time where C++ was considered a very high-level language (which it still is, but not as much as in the past) yet now it's the bedrock (along with C) for Microsoft's native APIs. The idea that .NET could one day be on the same level as C++ in that respect seems only to be a matter of time and effort to me, not some fundamental flaw in the design of the language.
EDIT: I should add that if native compilation of .NET is possible, why does Microsoft choose not to go that route? Why have they chosen the JIT bytecode path?
Java uses bytecode. C#, while it uses IL as an intermediate step, has always compiled to native code. IL is never directly interpreted for execution as Java bytecode is. You can even pre-compile the IL before distribution, if you really want to (hint: performance is normally better in the long run if you don't).
The idea that C# is slow is laughable. Some of the WinForms components are slow, but if you know what you're doing, C# itself is a very speedy language. In this day and age it generally comes down to the algorithm anyway; language choice won't help you if you implement a bad bubble sort. If C# helps you use more efficient algorithms from a higher level (and in my experience it generally does), that will trump any of the other speed concerns.
Based on your edit, I also want to explain the (typical) compilation path again.
C# is compiled to IL. This IL is distributed to local machines. A user runs the program, and that program is then JIT-compiled to native code for that machine once. The next time the user runs the program on that machine they're running a fully-native app. There is also a JIT optimizer that can muddy things a bit, but that's the general picture.
The reason you do it this way is to allow individual machines to make compile-time optimizations appropriate to that machine. You end up with faster code on average than if you distributed the same fully-compiled app to everyone.
Regarding decompilation:
The first thing to note is that you can pre-compile to native code before distribution if you really want to. At this point you're close to the same level as if you had distributed a native app. However, that won't stop a determined individual.
It also largely misunderstands the economics at play. Yes, someone might perhaps reverse-engineer your work. But this assumes that all the value of the app is in the technology. It's very common for a programmer to over-value the code, and undervalue the execution of the product: interface design, marketing, connecting with users, and on-going innovation. If you do all of that right, a little extra competition will help you as much as it hurts by building up demand in your market. If you do it wrong, hiding your algorithm won't save you.
If you're more worried about your app showing up on warez sites, you're even more misguided. It'll show up there anyway. A much better strategy is to engage those users.
At the moment, the biggest impediment to adoption (imo) is that the framework redistributable has become mammoth in size. Hopefully they'll address that in a relatively near release.
Are you suggesting that the fact that C# is managed code is a design flaw??
C# can be natively compiled using tools such as NGEN, and the Mono (open-source .NET framework) team has developed full AOT (ahead-of-time) compilation, which allows C# to run on the iPhone. However, full compilation is cumbersome because it destroys cross-platform compatibility, and some machine-specific optimizations cannot be done. It is also important to note that .NET is not an interpreted language but a JIT (just-in-time) compiled one, which means it ends up running natively on the machine.
Dude, FYI, you can always compile your C# assemblies into a native image using ngen.exe.
And are you suggesting .NET is a flawed design? It was .NET which brought MS back into the game from their crappy VB 5, VB 6, COM days. It was one of their biggest bets.
Java does the same stuff - so are you suggesting Java too is a mistake?
Regarding big vendors - please note .NET has been hugely successful across companies of all sizes (except for those open-source guys - nothing wrong with that). All these companies have made significant investments in the .NET framework.
And comparing C# speed with C++ is a crazy idea, according to me. Does C++ give you a managed environment along with a world-class, powerful framework?
And you can always obfuscate your assemblies if you are so paranoid about decompilation.
It's not about C++ vs. C#, or managed vs. unmanaged. Both are equally good and equally powerful in their own domains.
C# could be natively compiled but it is unlikely the base class library will ever go there. On the flip side, I really don't see much advantage to moving beyond JIT.
It certainly could, but the real question is why? I mean, sure, it can be slow(er), but most of the time any major differences in performance come down to design problems (wrong algorithms, thread contention, hogging resources, etc.) rather than issues with the language. As for the "breakable" bytecode, it doesn't really seem to be a huge concern of most companies, considering adoption rates.
What it really comes down to is, what's the best tool for the job? For some, it's C++; for others, Java; for others, C#, or Python, or Erlang.
Doomed? Because of supposed performance issues?
How about comparing the price of:
programmer's hour
hardware components
If you have performance issues with an application, it's much cheaper to just buy better hardware than to give up the benefits of a higher-abstraction language by switching to a lower-level one (and I don't have anything against C++, I've been a C++ developer for a long time).
How about comparing maintenance problems when trying to find memory leaks in C++ code compared to garbage-collected C# code?
"Hardware is Cheap, Programmers are Expensive": http://www.codinghorror.com/blog/archives/001198.html
