C# compilation time for large projects (compared to C++)

I often hear people praise the compilation speed of C#. So far I have only made a few tiny applications, and indeed I noticed that compilation was very fast. However, I was wondering if this still holds for large applications. Do big C# projects compile faster than C++ projects of a similar size?

Yes, C# normally compiles a lot faster. Not always fast enough, though. My biggest C# codebase, with maybe a million lines of code spread over lots of projects, took about an hour to compile. But I suspect much of this time is due to Visual Studio's poor build system.
Compile time for C++, on the other hand, is usually much longer, but it is also much more dependent on how you organize your code. Poor handling of header file dependencies can easily increase compilation time by several orders of magnitude.

C++ is so slow to compile because the header files have to be reread and reparsed every time they are included. Due to the way “#defines” work, it is very hard for a compiler to automatically pre-compile all header files. (Modula-2 did a much better job of this.) Having hundreds of headers read for each C++ file that is compiled is normal on a lot of C++ projects.
Sometimes incremental C++ compiles can be a lot faster than C#. If you have all your C++ header files (and design) in a very good state (see books like Large-Scale C++ Software Design and Effective C++), you can make a change to the implementation of a class that is used by most of the system and only have one DLL recompile.
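One common way to keep headers in that kind of state is the Pimpl ("pointer to implementation") idiom. The sketch below is only an illustration, not necessarily what the author used: the class name Widget and the widget.h / widget.cpp split are hypothetical, and both "files" are shown in one listing so that it compiles on its own.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- widget.h (hypothetical): the only part clients would include ----
// No private data members appear here, so changing them never forces
// client translation units to recompile.
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                                // defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;

    std::string describe() const;

private:
    struct Impl;                              // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// ---- widget.cpp (hypothetical): the only file rebuilt on an implementation change ----
struct Widget::Impl {
    std::string name;
    int revision = 0;                         // adding a field here leaves widget.h untouched
};

Widget::Widget(std::string name) : impl_(std::make_unique<Impl>()) {
    assert(impl_ != nullptr);
    impl_->name = std::move(name);
}

Widget::~Widget() = default;
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;

std::string Widget::describe() const {
    return "Widget(" + impl_->name + ")";
}
```

The price is an extra heap allocation and one pointer indirection per object, which is usually acceptable for classes that are used by most of the system.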
As C# does not have separate header files, whenever you change the implementation of a class, all uses of the class get recompiled, even if the public interface of the class has not changed. This can be reduced in C# by using “interface-based programming”, “dependency injection”, etc. But it is still a pain.
However, on the whole I find that C# compiles fast enough, whereas large C++ projects are so slow to compile that I find myself not wanting to add a method to a “base class” because of the rebuild time.
Having lots of Visual Studio projects with a handful of classes in each can slow down C# builds a lot. Combining related projects together and then “trusting” the developers not to use classes that are private to a namespace can at times have a great benefit. (NDepend can be used to check for people breaking the rules.)
(When trying to speed up C++ compiles I have found FileMon very useful. On one project I worked on, the STL was added to a header file and the build got a lot slower. Just adding the STL to the precompiled header file made a big difference! Therefore, track your build time and investigate when it gets slower.)

As far as I can tell from my own experience, yes, C# compiles a lot faster than C++ projects. Even for large applications.
This can be explained by the fact that C# is less complicated as a language than C++, and that C# is translated to IL (which can be optimized and translated later on to machine code) and C++ is translated immediately to machine language.

It is also my observation that C# is significantly faster to compile than C++. One of the main reasons is of course templates: C# generics don't need to live in headers, as there are no headers, whereas heavy use of templates (in almost any modern C++ library, such as Boost) is killing the compile time in C++.
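One standard C++11 mitigation for the template cost is explicit instantiation: the header declares the instantiation with extern template, and a single source file defines it, so other translation units do not regenerate the same code. The sketch below assumes a hypothetical Matrix template and a hypothetical matrix.h / matrix.cpp split, shown in one listing so it compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- matrix.h (hypothetical) ----
template <typename T>
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    T& at(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    T trace() const;   // sum of the main diagonal

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<T> data_;
};

template <typename T>
T Matrix<T>::trace() const {
    T sum{};
    for (std::size_t i = 0; i < rows_ && i < cols_; ++i) {
        sum += data_[i * cols_ + i];
    }
    return sum;
}

// Explicit instantiation declaration: includers are told not to
// instantiate Matrix<double> themselves.
extern template class Matrix<double>;

// ---- matrix.cpp (hypothetical) ----
// Explicit instantiation definition: the one place the code is generated.
template class Matrix<double>;
```

This only helps for the instantiations you can list in advance; templates from third-party libraries such as Boost still have to be parsed in every translation unit that includes them.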

Related

What is the efficiency of using CPLEX with C#.NET vs C++?

Provided our UI must be in C#.NET...
Is there a performance hit for using the C++ API for CPLEX vs the C#.NET API?
I'm wondering, because if possible I would like to avoid going across the managed/unmanaged boundary if possible, but I would like to be more informed about what is going on with the C#.NET API.
Does the C#.NET just implement the managed/unmanaged boundary to C++? Does it do it well?

We have been writing CPLEX + Concert code in C++ or C# and even some Java for about 18 years. These are many separate projects for many diverse customers. The preferred choice for many projects was originally C++ because we started before C# existed. Then C# was too new and wasn't widely trusted. But since about 2005, we have been preferring C# because it is easier to program in C#. We did one project in about 2004 where we had to write the CPLEX modelling code in C++, but had to use a C# database interface library (it was a complex project, and the reasoning behind those choices wouldn't make sense today). There was a definite overhead in using the interface between C# and C++, but those effects were much smaller than the differences between database access methods such as ADO.NET vs ODBC.
In practice, for most of our projects, the extra efficiency gains of using C++ are very small. Typically the created system will spend more than 90% (or 99%) of its time sat inside the CPLEX library calls. We have had many cases where the code in C++ or C# takes maybe 1 or 2 minutes to read and process the input data and create the CPLEX modelling variables and constraints, then 2-10 hours solving the problem inside cplex.solve(). So even if doing the processing outside of CPLEX in C++ instead of C# was ten times as fast, the overall timings aren't going to change a great deal.
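The arithmetic behind that claim can be sketched as follows. The function name and the sample figures (2 minutes of setup, 2 hours of solving) are illustrative assumptions picked from the ranges mentioned above, not measurements.

```cpp
#include <cassert>

// Overall speedup of a run when only the code outside the solver gets faster.
//   setup_minutes: time spent reading data and building the model
//   solve_minutes: time spent inside the solver call
//   setup_speedup: factor by which the setup code becomes faster
double overall_speedup(double setup_minutes, double solve_minutes, double setup_speedup) {
    assert(setup_speedup > 0.0);
    const double before = setup_minutes + solve_minutes;
    const double after = setup_minutes / setup_speedup + solve_minutes;
    return before / after;
}
```

With 2 minutes of setup and 120 minutes of solving, making the setup ten times faster takes the run from 122 minutes to 120.2 minutes: overall_speedup(2.0, 120.0, 10.0) is about 1.015, a saving of roughly 1.5%.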
If you are writing complex algorithmic stuff that builds and solves many (small) models (e.g. if you are doing column generation or LNS or similar) then there may be measurable benefits in C++, but probably not much in the wider picture.
I would go with whatever language you are more comfortable using. Getting the code right matters more than saving a few seconds.

The official ILOG documentation has this to say (emphasis mine):
Each call to a method of the API goes through a wrapping layer. This may result in a slight performance overhead while the model is created, compared to using the C++ API, depending on the number of API function calls. Since you call only few API functions to load and solve your model, the overhead is negligible in usual cases, but it may become important if you use the low-level Concert, CP Optimizer, or CPLEX® API for a complete model creation (for example, constructing a matrix line by line using INumExpr APIs or adding IConstraint objects one by one to an IModel using the API). It is therefore recommended to use the OPL language to model your problems whenever possible, and use only the low-level Concert APIs for the parts that need it (runtime additions, etc.).

Yes, C++ is faster. See:
http://www.codeproject.com/Articles/253444/PInvoke-Performance
http://msdn.microsoft.com/en-us/library/ky8kkddw.aspx
C# P/Invoke adds some overhead for error handling, etc.

C# build speed vs native C++

Has anyone made a comparison between build times for similar-sized C++ and C# projects?
We are trying to convince our IT drone that a multi-million-line C++ project with heavy template usage compiles slower than a similar-sized C# project, and that the extra money spent on memory and SSDs is well spent.

We are trying to convince our IT drone that a multi-million-line C++
project with heavy template usage compiles slower than a similar-sized
C# project, and that the extra money spent on memory and SSDs is
well spent.
That is certainly a common and really important problem with C++. I think it is the most urgent thing to fix in the next version of C++, maybe using modules. It is a language design problem, linked to the way C++ compilation units are organized.
Anyway, there are efforts to fix that by making compilers faster. The most important effort currently is Clang.
For sources about compilation time, there are tons (google: unity build, c++ compilation time).
But I think a demonstration would be better. Take Boost (http://boost.org) and compile it. It is a big set of libraries, so it can easily be compared with a big C# code base.
About using SSDs to speed up C++ compilation, here is a study: http://exdream.com/Blog/post/2009/05/03/Visual-Studio-compile-times-on-different-disk-drives-and-SSDs.aspx

But I believe that, unless you use Ngen.exe on your project, every C# project goes through a two-stage compilation process (compilation to IL, then JIT compilation at run time), as we already know. C++, by contrast, is compiled in one step (no JITing).
I guess there is a bit of a difference in build times. Perhaps Jon Skeet would have a perfect answer.

I have never done a comparison like this on similar projects, where by similar you probably mean the same line count. But what I have usually noticed in my experience is that C++ compiles much slower. I suppose that is because it goes through more steps during compilation, one of which is linking, which can be pretty slow.

Porting C++ to C#

C++ and C# are quite similar programming languages, in my opinion. If C++ code needs to be ported to a platform where C# is the only supported language, how much work will need to be done?
Should I expect that most of the C++ code will need to be rewritten in C#? Or, because of the language similarities, should the port be quite easy and straightforward?
I am aware, that it will depend on the application itself, but I am asking in general.

I have done a major port of a C++ application to C# recently. Overall I thought it was a pleasant experience. These are the steps that I took; they might or might not apply to your application.
1. Get the "scaffolding" in place in C#, design your architecture. This is the time to get in major architecture changes from the existing application if you choose to do so.
2. Get your tests in place. I can't over-emphasize this one. Since you are porting an existing application you should have a set of tests already in place that verify the correct behavior of your application. You can and should reuse these tests for your C# application. This is the one thing that gives you an edge when porting - you already know (and have written) many of the tests you want. Start by porting your test project.
3. Put in method stubs for your C# methods that reflect the existing C++ methods. Given the framework support in C#, some methods might not be needed at all anymore, or become much simpler - this is the time to decide.
4. Copy and paste. Yes, I used copy and paste for most of the C++ code - basically all the flow statements can be reused if you are careful. Once pasted, go through the code line by line; many things, like the use of pointers, must be rewritten to use an equivalent C# type.
5. Once you have rewritten a method in this way, do the obvious refactoring given the framework support / helper classes you might have been lacking in C++ but that are readily available in C#. Naming conventions for variables etc. can also be changed here; this should be straightforward given the built-in support for it in VS 2010.
6. Run your tests! Test early and often that the overall framework you have in place so far produces the exact same output as your C++ application, which you can use as a reference. This is also the time to add missing tests.
7. Refactor, refactor, refactor. Every application ages, and so, most likely, did your C++ application. Look closely at the underlying design, and simplify and remove as much as possible. Verify at each step by running your tests.
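To illustrate the copy-and-paste step, here is a minimal sketch of the kind of rewrite it involves, shown on the C++ side. The function names are hypothetical. The first version walks a buffer with raw pointers, which has no direct counterpart in safe C#; the second keeps the same control flow but indexes into a container, a form that should map almost line for line onto a C# array or List<int>.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original style: iterates with raw pointers.
int sum_positive_ptr(const int* begin, const int* end) {
    assert(begin <= end);
    int total = 0;
    for (const int* p = begin; p != end; ++p) {
        if (*p > 0) {
            total += *p;
        }
    }
    return total;
}

// Port-friendly style: identical flow statements, but index-based access.
int sum_positive_indexed(const std::vector<int>& values) {
    int total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] > 0) {
            total += values[i];
        }
    }
    return total;
}
```

Keeping both versions under the same tests for a while is one way to check that the rewritten form really produces the same output as the original.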

First things first: this is porting, not refactoring. I also think it's an extremely bad idea.
It is true that you could (with a lot of work) port C++ to unsafe C#, but saying that the syntax is very similar is a bit of a stretch. In fact, following the same line of reasoning, you could port C++ to any other C-derived language, and it would be equally painful.
Again, if you do it, expect a shedload of rework. It is more than likely going to take you longer than re-coding it from scratch using the existing code as a mere model, which is in my opinion a better and less messy option.

Just compile the C++ code with the /clr compiler option. That will translate the code to IL, so it can execute on almost any .NET-enabled platform. There are very few C++ constructs that cannot be translated; the code would have to use non-standard compiler extensions like __fastcall.
However, I suspect you will find that the platform requires verifiable code, which is the common reason why a platform would restrict code to a .NET-compliant language. I can only guess at this, since you didn't mention the execution environment. Native C++ translated to IL is not verifiable due to pointer manipulations. If that's the case, then you are looking at a pretty drastic rewrite.

I'd be interested to know where C# is the "only supported platform".
The problem of rewriting in a new language can be whether you need to rewrite every single part of the code and cannot use any of the old code at all. Sometimes it is best, even when doing a rewrite, to make it more of a refactor: rewrite some parts of the code, move others. The existing code is known to work and can be tricky to reproduce. And it takes time. There needs to be a good reason to do a full rewrite.
.NET supports a version of C++, and Visual Studio also comes with Visual C++ to build standard C++, so consider whether or not you can make this a phased transformation, and whether or not you really have to rewrite the whole thing.

Porting C++ code to C# will not be that hard, assuming that all your dependent libraries have existing C# counterparts. Lack of dependencies is the most likely pitfall. The core concepts of your program, such as inheritance, heap, references, data structures, should be fairly easily translatable.
This is assuming that you don't invoke any specific low level behaviour such as custom memory management, because C# does not really support that kind of thing and you could have a serious problem there.

Converting C (not C++) to C#

I have some old C 32 Bit DLLs that are using Oracle's Pro C precompiler library (proc.exe) to expose a hundred or so sproc/func calls to an even older VB6 GUI which references these functions through explicit Declare statements like so:
Declare Function ConnectToDB Lib "C:\windows\system32\EXTRACT32.DLL" (CXN As CXNdets, ERR As ERRdets) As Long
All the structures in the C header files are painstakingly replicated in the VB6 front end. At least the SQL is precompiled.
My question is: is it worth trying to impose a .NET interface (by conversion to an assembly) onto the C code and upgrade the VB6 to C#, or do you think I should just abandon the whole thing and start from scratch? As always, time is of the essence, hence my appeal for prior experience. I know that if I keep the Declares in .NET I will have to add lots of complicated marshalling decorations, which I'd like to avoid.
I've never had to convert C to .NET before, so my main question, if everything else is ignored, is: are there any porting limitations that make this inadvisable?

... At least the SQL is precompiled.
Is this the only reason you've got code in C? If so, my advice is to abandon that and simply rewrite the entire thing in C# (or even VB6 if that's what your app is written in) ... unless you've profiled it and can prove a measurable difference, you won't be getting any perf benefits from having sql/sproc calls in C. You will only get increased maintenance costs due to the complexity of having to maintain this interop bridge.

You should continue to use the DLL in .NET by creating an assembly around the Declares. That one assembly would probably be a little quicker to write in VB.NET than in C#. Then have your new UI reference that assembly. Once you have that going, you have bought yourself time to convert the C code to .NET. You do this by initially keeping the assembly and replacing the Declares with new .NET code. Soon you will have replaced everything and can refactor it to a different design.
The time killer is breaking behavior. The closer you can preserve the behavior of the original application, the faster the conversion will be. Remember, there is nothing wrong with referencing a traditional DLL. .NET is built on many layers of APIs which ultimately drill down to the traditional DLLs that continue to be used by Windows. Again, once you have the .NET UI working, you have more time to work on the core and bring everything into .NET.

I always advise extreme caution before setting out to rewrite anything. If you use a decent tool to upgrade the VB6 to .NET, it will convert the Declare statements automatically, so don't stress about them too much!
It's a common pitfall to start out optimistically rewriting a large piece of software, make good early progress fixing some of the well-known flaws in the old architecture, and then get bogged down in the functionality that you've just been taking for granted for years. At this point your management begin to get twitchy and everything can get very uncomfortable. I have been there and it's no fun. Sounds like your users are already twitchy, which is a bad sign.
...and here's a blog post by a Microsofty that agrees with me:
Many companies I worked with in the early days of .NET looked first at rewriting driven in part by a strong desire to improve the underlying architecture and code structures at the same time as they moved to .NET. Unfortunately many of those projects ran into difficulty and several were never completed. The problem they were trying to solve was too large
...and some official advice from Microsoft UK regarding migrating from VB6 to .NET
Performing a complete rewrite to .NET is far more costly and difficult to do well [than converting] ... we would only recommend this approach for a small number of situations.
Maybe your program is small, and you have a great understanding of the problems it solves, and you are great at estimating accurately and keeping your projects on track, and it will all be fine.

If you move from VB6 to VB.NET or C#, throw away the C code and use the appropriate ODP.NET classes or LINQ to access those stored procedures. Since the C layer (as I understand it) has no logic other than exposing the stored procedures, it's not useful anymore after the switch. By doing that, you get (at least) much better exception handling (i.e. exceptions at all instead of magic return codes), maintainability etc.
See also: Automatically create C# wrapper classes around stored procedures
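The difference between magic return codes and exceptions mentioned above can be sketched in a few lines. Everything here is hypothetical: legacy_connect stands in for a return-code style call such as the ones in the old DLL, and the code value -1 is invented for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a legacy call: returns 0 on success and a
// non-zero "magic" code on failure, which callers must remember to check.
long legacy_connect(const char* user) {
    return (user != nullptr && user[0] != '\0') ? 0L : -1L;
}

// Exception type that keeps the original code but cannot be silently ignored.
class LegacyError : public std::runtime_error {
public:
    LegacyError(const std::string& operation, long code)
        : std::runtime_error(operation + " failed with code " + std::to_string(code)),
          code_(code) {
        assert(code != 0);   // zero means success and must never be thrown
    }

    long code() const noexcept { return code_; }

private:
    long code_;
};

// Wrapper: converts the return code into an exception at the boundary.
void connect_or_throw(const std::string& user) {
    const long rc = legacy_connect(user.c_str());
    if (rc != 0) {
        throw LegacyError("legacy_connect", rc);
    }
}
```

With the wrapper, a forgotten check turns into an unhandled exception that names the failing operation instead of a silently ignored number.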

Could managed code (specifically .NET) ever become 'unmanaged'?

Recently I was talking with a friend of mine who had started a C++ class a couple of months ago (his first exposure to programming). We got onto the topic of C# and .NET generally, and he made the point to me that he felt it was 'doomed' because of all the commonly cited issues (low speed, breakable bytecode, etc.). I agreed with him on all those issues, but I held back from saying it was doomed, only because I felt that, in time, languages like C# could instead become native code (if Microsoft chose to change the implementation of .NET from a bytecode, JIT runtime environment to one which compiles directly to native code like your C++ program does).
My question is, am I out to lunch here? I mean, it may take a lot of work (and may break too many things), but there isn't some type of magical barrier which prevents C# code from being compiled natively (if one wanted to do it), right? There was a time where C++ was considered a very high-level language (which it still is, but not as much as in the past) yet now it's the bedrock (along with C) for Microsoft's native APIs. The idea that .NET could one day be on the same level as C++ in that respect seems only to be a matter of time and effort to me, not some fundamental flaw in the design of the language.
EDIT: I should add that if native compilation of .NET is possible, why does Microsoft choose not to go that route? Why have they chosen the JIT bytecode path?

Java uses bytecode. C#, while it uses IL as an intermediate step, has always compiled to native code. IL is never directly interpreted for execution as Java bytecode is. You can even pre-compile the IL before distribution, if you really want to (hint: performance is normally better in the long run if you don't).
The idea that C# is slow is laughable. Some of the winforms components are slow, but if you know what you're doing C# itself is a very speedy language. In this day and age it generally comes down to the algorithm anyway; language choice won't help you if you implement a bad bubble sort. If C# helps you use more efficient algorithms from a higher level (and in my experience it generally does) that will trump any of the other speed concerns.
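The point about algorithms can be made concrete by counting comparisons rather than timing anything. The sketch below, with hypothetical helper names, runs a plain bubble sort and std::sort on the same reversed input and reports how many element comparisons each one performs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Builds the input n, n-1, ..., 1 (a bad case for bubble sort).
std::vector<int> descending_input(std::size_t n) {
    std::vector<int> values(n);
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<int>(n - i);
    }
    return values;
}

// Plain bubble sort; returns the number of comparisons it made.
std::size_t bubble_sort_comparisons(std::vector<int> values) {
    std::size_t comparisons = 0;
    for (std::size_t pass = 0; pass + 1 < values.size(); ++pass) {
        for (std::size_t i = 0; i + 1 < values.size() - pass; ++i) {
            ++comparisons;
            if (values[i] > values[i + 1]) {
                std::swap(values[i], values[i + 1]);
            }
        }
    }
    assert(std::is_sorted(values.begin(), values.end()));
    return comparisons;
}

// std::sort with a counting comparator; returns the number of comparisons.
std::size_t std_sort_comparisons(std::vector<int> values) {
    std::size_t comparisons = 0;
    std::sort(values.begin(), values.end(), [&comparisons](int a, int b) {
        ++comparisons;
        return a < b;
    });
    return comparisons;
}
```

For n elements this bubble sort always makes n(n-1)/2 comparisons (499,500 for n = 1000), while std::sort is required to stay within O(n log n), a gap that no choice of language closes.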
Based on your edit, I also want to explain the (typical) compilation path again.
C# is compiled to IL. This IL is distributed to local machines. A user runs the program, and the JIT compiler translates it to native code for that machine, each method once, the first time it is called. From then on, within that run, the user is executing fully native code; unless a native image has been cached (for example with NGen), the JIT step is repeated the next time the program starts. There is also a JIT optimizer that can muddy things a bit, but that's the general picture.
The reason you do it this way is to allow individual machines to make compile-time optimizations appropriate to that machine. You end up with faster code on average than if you distributed the same fully-compiled app to everyone.
Regarding decompilation:
The first thing to note is that you can pre-compile to native code before distribution if you really want to. At this point you're close to the same level as if you had distributed a native app. However, that won't stop a determined individual.
It also largely misunderstands the economics at play. Yes, someone might perhaps reverse-engineer your work. But this assumes that all the value of the app is in the technology. It's very common for a programmer to over-value the code, and undervalue the execution of the product: interface design, marketing, connecting with users, and on-going innovation. If you do all of that right, a little extra competition will help you as much as it hurts by building up demand in your market. If you do it wrong, hiding your algorithm won't save you.
If you're more worried about your app showing up on warez sites, you're even more misguided. It'll show up there anyway. A much better strategy is to engage those users.
At the moment, the biggest impediment to adoption (imo) is that the framework redistributable has become mammoth in size. Hopefully they'll address that in a relatively near release.

Are you suggesting that the fact that C# is managed code is a design flaw??

C# can be natively compiled using a tool such as NGEN, and the Mono (open source .NET framework) team has developed full AOT (ahead-of-time) compilation, which allows C# to run on the iPhone. However, full compilation is cumbersome because it destroys cross-platform compatibility, and some machine-specific optimizations cannot be done. It is also important to note that .NET is not interpreted but JIT (just-in-time) compiled, which means it runs natively on the machine.

Dude, FYI, you can always compile your C# assemblies into a native image using ngen.exe.
And are you suggesting .NET is a flawed design? It was .NET that brought MS back into the game from their crappy VB 5, VB 6, COM days. It was one of their biggest bets.
Java does the same stuff, so are you suggesting Java is a mistake too?
Regarding big vendors: please note .NET has been hugely successful across companies of all sizes (except for those open source guys; nothing wrong with that). All these companies have made significant investments in the .NET framework.
And comparing C# speed with C++ is a crazy idea in my opinion. Does C++ give you a managed environment along with a world-class, powerful framework?
And you can always obfuscate your assemblies if you are so paranoid about decompilation.
It's not about C++ vs. C#, or managed vs. unmanaged. Both are equally good and equally powerful in their own domains.

C# could be natively compiled but it is unlikely the base class library will ever go there. On the flip side, I really don't see much advantage to moving beyond JIT.

It certainly could, but the real question is why? I mean, sure, it can be slow(er), but most of the time any major differences in performance come down to design problems (wrong algorithms, thread contention, hogging resources, etc.) rather than issues with the language. As for the "breakable" bytecode, it doesn't really seem to be a huge concern of most companies, considering adoption rates.
What it really comes down to is, what's the best tool for the job? For some, it's C++; for others, Java; for others, C#, or Python, or Erlang.

Doomed? Because of supposed performance issues?
How about comparing the price of:
programmer's hour
hardware components
If you have performance issues with applications, it's much cheaper to just buy yourself better hardware, compared to the benefits you lose in switching from a higher-abstraction language to a lower one (and I don't have anything against C++; I've been a C++ developer for a long time).
How about comparing maintenance problems when trying to find memory leaks in C++ code compared to garbage-collected C# code?
"Hardware is Cheap, Programmers are Expensive": http://www.codinghorror.com/blog/archives/001198.html
