Why is my C program slower than its C# equivalent?

I've wanted to do a small benchmark test between C and C#, so I wrote the following programs:
C:
#include <stdio.h>

int main()
{
    int i = 1;
    while (i <= 500000)
    {
        printf("%d", i);
        i++;
    }
    return 0;
}
C#:
using System;

class Program
{
    static void Main(string[] args)
    {
        int i = 1;
        while (i <= 500000)
        {
            Console.Write(i);
            i++;
        }
    }
}
And then I compiled them and ran both at the same time.
To my surprise, the C# program exited about 5 seconds earlier than the C program.
C is a language well-known for its high speed and great performance, so how could it be that in this case, the C# program outperformed the C program significantly in such a simple task? I thought that C# is much slower because it compiles in runtime, but it easily defeated the native executable that was compiled from C.
What is the reason for this, and does this also happen in other types of programs?
I ran these 2 programs on Windows 8.1 Pro 64-bit, if that matters, and compiled the C# program in C# Express 2010, and the C program in VS2013 Express for Desktop.
And please don't bash me for asking "such a common question" - yes, I've seen other answers, but they all dealt with complex things such as memory management and buffer size, and the programs here are very simple and don't deal with those things.
I've also tried replacing printf with puts, but that was just as slow.

Benchmarking how fast a programming language can write to the console is worthless. That's the bottleneck in both of these cases, and it's screwing up your results. The execution time of one of the programs, either the C or C# version, would vary wildly from one run to the next. So any comparisons between the two languages are meaningless.
If you really want to test the performance difference, one way to do it would be to write a complicated mathematical algorithm and benchmark that. Don't write the result to the console, though—you now know that would defeat the purpose! Also, time it using a high-resolution timer. In C#, you'd use the Stopwatch class; in C (at least on Windows), you'd call the QueryPerformanceCounter function. Timing with a real stopwatch is just too error-prone. Or at least it is for me. Either way, make sure you query these values outside of the code you're testing.
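For the C# side, a timing harness along those lines might look like this minimal sketch (DoWork is a made-up placeholder for whatever algorithm you actually want to measure):

using System;
using System.Diagnostics;

class Benchmark
{
    static void Main()
    {
        // Warm-up run so JIT compilation of DoWork doesn't count toward the measurement.
        DoWork(1000);

        Stopwatch sw = Stopwatch.StartNew();   // started outside the code under test
        long result = DoWork(50000000);
        sw.Stop();                             // stopped before any console output

        // Only touch the console after timing has finished.
        Console.WriteLine("Result: {0}, elapsed: {1} ms", result, sw.ElapsedMilliseconds);
    }

    // Placeholder for the "complicated mathematical algorithm" suggested above.
    static long DoWork(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++)
            sum += (long)i * i % 7;
        return sum;
    }
}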

This doesn't have anything to do with the C language or its compiler; it has everything to do with its runtime support library. The slowdown is caused by your use of printf().
The C Runtime Library (CRT) is badly stuck in the mud, modeling a runtime environment that harks back to the 1970s, when programmers used a teletype to talk to a computer running a very primitive operating system that didn't support goodies like threading, and that only talked in English.
That runtime environment is a drastic mismatch with modern operating systems, and considerable overhead is needed to emulate it. The C language in general has pretty poor support for Unicode, for example, so every string you output has to be painstakingly converted to the native operating system's string format, UTF-16 on Windows. The .NET Framework's System.String, by contrast, is already encoded in UTF-16. Even down to the lowest level, Windows requires strings to be zero-terminated, just like C strings, and every .NET string is automatically zero-terminated, even the ones you don't use for I/O. So no conversion is required; the framework can directly call the operating system function that outputs the string.
That's not where it ends. The notion of threading is completely alien to the CRT; it was never designed to consider the ramifications of multiple threads, say, modifying the locale or simultaneously writing text. So the CRT is peppered with low-level locks to bring this to a good end - overhead that is very rarely actually useful, but must be paid because the 1970s runtime model neither disallows nor accounts for it.
You can make your C program just as fast as your C# program - making it noticeably faster is a bit of a stretch - but you have to bypass the 1970s to do so. Using wprintf() instead already ought to make a noticeable difference, but you don't get parity until you call WriteConsoleW() or WriteFile() directly.
Do note that this does not actually matter at all. The text is scrolling across the screen far faster than a human can read it anyway. So it is not a real problem that anybody would consider fixing.

Related

Prime numbers C# and C++ [duplicate]

Or is it now the other way around?
From what I've heard there are some areas in which C# proves to be faster than C++, but I've never had the guts to test it by myself.
I thought some of you could explain these differences in detail or point me to the right place for information on this.
There is no strict reason why a bytecode-based language like C# or Java that has a JIT cannot be as fast as C++ code. However, C++ code used to be significantly faster for a long time, and still is today in many cases. This is mainly because the more advanced JIT optimizations are complicated to implement, and the really cool ones are only arriving now.
So C++ is faster, in many cases. But this is only part of the answer. The cases where C++ is actually faster are highly optimized programs, where expert programmers thoroughly optimized the hell out of the code. This is not only very time consuming (and thus expensive), but also commonly leads to errors due to over-optimization.
On the other hand, code running on a managed runtime gets faster in later versions of that runtime (.NET CLR or Java VM) without you doing anything. And there are a lot of useful optimizations JIT compilers can do that are simply impossible in languages with pointers. Also, some argue that garbage collection should generally be as fast as or faster than manual memory management, and in many cases it is. You can generally implement and achieve all of this in C++ or C, but it's going to be much more complicated and error-prone.
As Donald Knuth said, "premature optimization is the root of all evil". If you really know for sure that your application will mostly consist of very performance critical arithmetic, and that it will be the bottleneck, and it's certainly going to be faster in C++, and you're sure that C++ won't conflict with your other requirements, go for C++. In any other case, concentrate on first implementing your application correctly in whatever language suits you best, then find performance bottlenecks if it runs too slow, and then think about how to optimize the code. In the worst case, you might need to call out to C code through a foreign function interface, so you'll still have the ability to write critical parts in lower level language.
Keep in mind that it's relatively easy to optimize a correct program, but much harder to correct an optimized program.
Giving actual percentages of speed advantages is impossible; it largely depends on your code. In many cases, the programming language implementation isn't even the bottleneck. Take the benchmarks at http://benchmarksgame.alioth.debian.org/ with a great deal of scepticism, as these largely test arithmetic code, which is most likely not similar to your code at all.
I'm going to start by disagreeing with part of the accepted (and well-upvoted) answer to this question by stating:
There are actually plenty of reasons why JITted code will run slower than a properly optimized C++ (or other language without runtime overhead) program, including:
compute cycles spent on JITting code at runtime are by definition unavailable for use in program execution.
any hot paths in the JITter will be competing with your code for instruction and data cache in the CPU. We know that cache dominates when it comes to performance, and native languages like C++ do not have this type of contention, by design.
a run-time optimizer's time budget is necessarily much more constrained than that of a compile-time optimizer (as another commenter pointed out).
Bottom line: Ultimately, you will almost certainly be able to create a faster implementation in C++ than you could in C#.
Now, with that said, how much faster really isn't quantifiable, as there are too many variables: the task, problem domain, hardware, quality of implementations, and many other factors. You'll have to run tests on your scenario to determine the difference in performance, and then decide whether it is worth the additional effort and complexity.
This is a very long and complex topic, but I feel it's worth mentioning for the sake of completeness that C#'s runtime optimizer is excellent, and is able to perform certain dynamic optimizations at runtime that are simply not available to C++ with its compile-time (static) optimizer. Even with this, the advantage is still typically deeply in the native application's court, but the dynamic optimizer is the reason for the "almost certainly" qualifier given above.
--
In terms of relative performance, I was also disturbed by the figures and discussions I saw in some other answers, so I thought I'd chime in and at the same time, provide some support for the statements I've made above.
A huge part of the problem with those benchmarks is that you can't write C++ code as if you were writing C# and expect to get representative results (e.g., performing thousands of memory allocations in C++ is going to give you terrible numbers).
Instead, I wrote slightly more idiomatic C++ code and compared against the C# code #Wiory provided. The two major changes I made to the C++ code were:
used vector::reserve()
flattened the 2d array to 1d to achieve better cache locality (contiguous block)
C# (.NET 4.6.1)
private static void TestArray()
{
    const int rows = 5000;
    const int columns = 9000;

    DateTime t1 = System.DateTime.Now;
    double[][] arr = new double[rows][];
    for (int i = 0; i < rows; i++)
        arr[i] = new double[columns];
    DateTime t2 = System.DateTime.Now;
    Console.WriteLine(t2 - t1);

    t1 = System.DateTime.Now;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < columns; j++)
            arr[i][j] = i;
    t2 = System.DateTime.Now;
    Console.WriteLine(t2 - t1);
}
Run time (Release): Init: 124ms, Fill: 165ms
C++14 (Clang v3.8/C2)
#include <chrono>
#include <iostream>
#include <utility>
#include <vector>

auto ColMajorArray()
{
    constexpr size_t ROWS = 5000;
    constexpr size_t COLS = 9000;

    auto initStart = std::chrono::steady_clock::now();
    auto arr = std::vector<double>();
    // reserve() only sets capacity; strictly speaking the indexed fill below should use
    // resize() (or push_back), since writing past size() through operator[] is undefined behavior.
    arr.reserve(ROWS * COLS);
    auto initFinish = std::chrono::steady_clock::now();
    auto initTime = std::chrono::duration_cast<std::chrono::microseconds>(initFinish - initStart);

    auto fillStart = std::chrono::steady_clock::now();
    for (size_t i = 0, r = 0; r < ROWS; ++r)
    {
        for (size_t c = 0; c < COLS; ++c)
        {
            arr[i++] = static_cast<double>(r * c);
        }
    }
    auto fillFinish = std::chrono::steady_clock::now();
    auto fillTime = std::chrono::duration_cast<std::chrono::milliseconds>(fillFinish - fillStart);

    return std::make_pair(initTime, fillTime);
}
Run time (Release): Init: 398µs (yes, that's microseconds), Fill: 152ms
Total Run times: C#: 289ms, C++ 152ms (roughly 90% faster)
Observations
Changing the C# implementation to the same 1d array implementation yielded Init: 40ms, Fill: 171ms, Total: 211ms (C++ was still almost 40% faster); a sketch of that flattened C# version follows below.
It is much harder to design and write "fast" code in C++ than it is to write "regular" code in either language.
It's (perhaps) astonishingly easy to get poor performance in C++; we saw that with the unreserved-vector performance. And there are lots of pitfalls like this.
C#'s performance is rather amazing when you consider all that is going on at runtime. And that performance is comparatively easy to access.
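For reference, the flattened 1d C# version mentioned in the first observation might look roughly like the following sketch (same dimensions as above; the row-major index arithmetic mirrors the C++ code, and exact timings depend on your machine and runtime):

private static void TestFlatArray()
{
    const int rows = 5000;
    const int columns = 9000;

    DateTime t1 = System.DateTime.Now;
    // One contiguous allocation instead of 5000 separate row arrays.
    double[] arr = new double[rows * columns];
    DateTime t2 = System.DateTime.Now;
    Console.WriteLine(t2 - t1);

    t1 = System.DateTime.Now;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < columns; j++)
            arr[i * columns + j] = i;   // row-major index, same memory layout as the C++ vector
    t2 = System.DateTime.Now;
    Console.WriteLine(t2 - t1);
}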
More anecdotal data comparing the performance of C++ and C#: https://benchmarksgame.alioth.debian.org/u64q/compare.php?lang=gpp&lang2=csharpcore
The bottom line is that C++ gives you much more control over performance. Do you want to use a pointer? A reference? Stack memory? Heap? Dynamic polymorphism or eliminate the runtime overhead of a vtable with static polymorphism (via templates/CRTP)? In C++ you have to... er, get to make all these choices (and more) yourself, ideally so that your solution best addresses the problem you're tackling.
Ask yourself if you actually want or need that control, because even for the trivial example above, you can see that although there is a significant improvement in performance, it requires a deeper investment to access.
It's five oranges faster. Or rather: there can be no (correct) blanket answer. C++ is a statically compiled language (but then, there's profile guided optimization, too), C# runs aided by a JIT compiler. There are so many differences that questions like “how much faster” cannot be answered, not even by giving orders of magnitude.
In my experience (and I have worked a lot with both languages), the main problem with C# compared to C++ is high memory consumption, and I have not found a good way to control it. It was the memory consumption that would eventually slow down .NET software.
Another factor is that the JIT compiler cannot afford to spend too much time on advanced optimizations, because it runs at runtime and the end user would notice if it took too long. On the other hand, a C++ compiler has all the time it needs to do optimizations at compile time. This factor is much less significant than memory consumption, IMHO.
One particular scenario where C++ still has the upper hand (and will, for years to come) occurs when polymorphic decisions can be predetermined at compile time.
Generally, encapsulation and deferred decision-making are good things, because they make the code more dynamic, easier to adapt to changing requirements, and easier to use as a framework. This is why object-oriented programming in C# is very productive, and it can be summarized under the term "generalization". Unfortunately, this particular kind of generalization comes at a cost at run-time.
Usually, this cost is non-substantial but there are applications where the overhead of virtual method calls and object creation can make a difference (especially since virtual methods prevent other optimizations such as method call inlining). This is where C++ has a huge advantage because you can use templates to achieve a different kind of generalization which has no impact on runtime but isn't necessarily any less polymorphic than OOP. In fact, all of the mechanisms that constitute OOP can be modelled using only template techniques and compile-time resolution.
In such cases (and admittedly, they're often restricted to special problem domains), C++ wins against C# and comparable languages.
C++ (or C for that matter) gives you fine-grained control over your data structures. If you want to bit-twiddle, you have that option. Large managed Java or .NET apps (OWB, Visual Studio 2005) that use the internal data structures of the Java/.NET libraries carry that baggage with them. I've seen OWB designer sessions using over 400 MB of RAM, and BIDS for cube or ETL design getting into the hundreds of MB as well.
On a predictable workload (such as most benchmarks that repeat a process many times) a JIT can get you code that is optimised well enough that there is no practical difference.
IMO on large applications the difference is not so much the JIT as the data structures that the code itself is using. Where an application is memory-heavy you will get less efficient cache usage. Cache misses on modern CPUs are quite expensive. Where C or C++ really win is where you can optimise your usage of data structures to play nicely with the CPU cache.
For graphics the standard C# Graphics class is way slower than GDI accessed via C/C++.
I know this has nothing to do with the language per se, more with the total .NET platform, but Graphics is what is offered to the developer as a GDI replacement, and its performance is so bad I wouldn't even dare to do graphics with it.
We have a simple benchmark we use to see how fast a graphics library is, and that is simply drawing random lines in a window. C++/GDI is still snappy with 10000 lines while C#/Graphics has difficulty doing 1000 in real-time.
Garbage collection is the main reason Java and C# CANNOT be used for real-time systems.
When will the GC happen?
How long will it take?
This is non-deterministic.
We had to determine whether C# was comparable to C++ in performance, and I wrote some test programs for that (using Visual Studio 2005 for both languages). It turned out that without garbage collection, and considering only the language (not the framework), C# has basically the same performance as C++. Memory allocation is way faster in C# than in C++, and C# has a slight edge in determinism when data sizes grow beyond cache-line boundaries. However, all of this eventually had to be paid for, and there is a huge cost in the form of non-deterministic performance hits for C# due to garbage collection.
C/C++ can perform vastly better in programs where there are either large arrays or heavy looping/iteration over arrays (of any size). This is the reason that graphics are generally much faster in C/C++, because heavy array operations underlie almost all graphics operations. .NET is notoriously slow in array indexing operations due to all the safety checks, and this is especially true for multi-dimensional arrays (and, yes, rectangular C# arrays are even slower than jagged C# arrays).
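To make the jagged-versus-rectangular point concrete, here is a hedged sketch of the two declarations in C#; the exact gap depends on the JIT and runtime version, but the rectangular form usually benchmarks behind the jagged one:

static void CompareArrayKinds()
{
    const int n = 1000;

    // Jagged array: an array of independent row arrays.
    double[][] jagged = new double[n][];
    for (int i = 0; i < n; i++)
        jagged[i] = new double[n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < jagged[i].Length; j++)   // Length-bounded loops help the JIT drop bounds checks
            jagged[i][j] = i * j;

    // Rectangular array: one block of memory, but element access goes through
    // slower multi-dimensional indexing and keeps its safety checks.
    double[,] rect = new double[n, n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            rect[i, j] = i * j;
}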
The bonuses of C/C++ are most pronounced if you stick directly with pointers and avoid Boost, std::vector and other high-level containers, as well as inline every small function possible. Use old-school arrays whenever possible. Yes, you will need more lines of code to accomplish the same thing you did in Java or C# as you avoid high-level containers. If you need a dynamically sized array, you will just need to remember to pair your new T[] with a corresponding delete[] statement (or use std::unique_ptr)—the price for the extra speed is that you must code more carefully. But in exchange, you get to rid yourself of the overhead of managed memory / garbage collector, which can easily be 20% or more of the execution time of heavily object-oriented programs in both Java and .NET, as well as those massive managed memory array indexing costs. C++ apps can also benefit from some nifty compiler switches in certain specific cases.
I am an expert programmer in C, C++, Java, and C#. I recently had the rare occasion to implement the exact same algorithmic program in the latter 3 languages. The program had a lot of math and multi-dimensional array operations. I heavily optimized this in all 3 languages. The results were typical of what I normally see in less rigorous comparisons: Java was about 1.3x faster than C# (most JVMs are more optimized than the CLR), and the C++ raw pointer version came in about 2.1x faster than C#. Note that the C# program only used safe code—it is my opinion that you might as well code it in C++ before using the unsafe keyword.
Lest anyone think I have something against C#, I will close by saying that C# is probably my favorite language. It is the most logical, intuitive and rapid development language I've encountered so far. I do all my prototyping in C#. The C# language has many small, subtle advantages over Java (yes, I know Microsoft had the chance to fix many of Java's shortcomings by entering the game late and arguably copying Java). Toast to Java's Calendar class anyone? If Microsoft ever spends real effort to optimize the CLR and the .NET JITter, C# could seriously take over. I'm honestly surprised they haven't already—they did so many things right in the C# language, why not follow it up with heavy-hitting compiler optimizations? Maybe if we all beg.
As usual, it depends on the application. There are cases where C# is probably negligibly slower, and other cases where C++ is 5 or 10 times faster, especially in cases where operations can be easily SIMD'd.
I know it isn't what you were asking, but C# is often quicker to write than C++, which is a big bonus in a commercial setting.
> From what I've heard ...
Your difficulty seems to be in deciding whether what you have heard is credible, and that difficulty will just be repeated when you try to assess the replies on this site.
How are you going to decide if the things people say here are more or less credible than what you originally heard?
One way would be to ask for evidence.
When someone claims "there are some areas in which C# proves to be faster than C++" ask them why they say that, ask them to show you measurements, ask them to show you programs. Sometimes they will simply have made a mistake. Sometimes you'll find out that they are just expressing an opinion rather than sharing something that they can show to be true.
Often information and opinion will be mixed up in what people claim, and you'll have to try and sort out which is which. For example, from the replies in this forum:
"Take the benchmarks at http://shootout.alioth.debian.org/
with a great deal of scepticism, as
these largely test arithmetic code,
which is most likely not similar to
your code at all."
Ask yourself if you really
understand what "these largely test
arithmetic code" means, and then
ask yourself if the author has
actually shown you that his claim is
true.
"That's a rather useless test, since it really depends on how well
the individual programs have been
optimized; I've managed to speed up
some of them by 4-6 times or more,
making it clear that the comparison
between unoptimized programs is
rather silly."
Ask yourself whether the author has
actually shown you that he's managed
to "speed up some of them by 4-6
times or more" - it's an easy claim to make!
For 'embarrassingly parallel' problems, when using Intel TBB and OpenMP in C++ I have observed roughly a 10x performance increase compared to similar (pure math) problems done with C# and TPL. SIMD is one area where C# cannot compete, but I also got the impression that TPL has a sizeable overhead.
That said, I only use C++ for performance-critical tasks where I know I will be able to multithread and get results quickly. For everything else, C# (and occasionally F#) is just fine.
In theory, for long running server-type application, a JIT-compiled language can become much faster than a natively compiled counterpart. Since the JIT compiled language is generally first compiled to a fairly low-level intermediate language, you can do a lot of the high-level optimizations right at compile time anyway. The big advantage comes in that the JIT can continue to recompile sections of code on the fly as it gets more and more data on how the application is being used. It can arrange the most common code-paths to allow branch prediction to succeed as often as possible. It can re-arrange separate code blocks that are often called together to keep them both in the cache. It can spend more effort optimizing inner loops.
I doubt that this is done by .NET or any of the JREs, but it was being researched back when I was in university, so it's not unreasonable to think that these sorts of things may find their way into the real world at some point soon.
Applications that require intensive memory access, e.g. image manipulation, are usually better off written in an unmanaged environment (C++) than a managed one (C#). Optimized inner loops with pointer arithmetic are much easier to control in C++. In C# you might need to resort to unsafe code to even get near the same performance.
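As a rough illustration of that last point (my own sketch, not code from the answer above), an unsafe inner loop over a grayscale byte buffer could look like this; it assumes the project is compiled with the /unsafe switch:

static unsafe void InvertPixels(byte[] pixels)
{
    fixed (byte* p = pixels)              // pin the managed array so the GC can't move it
    {
        byte* cur = p;
        byte* end = p + pixels.Length;
        while (cur < end)
        {
            *cur = (byte)(255 - *cur);    // no per-element bounds check inside the loop
            cur++;
        }
    }
}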
I've tested C++'s vector and its C# equivalent, List, as well as simple 2d arrays.
I'm using Visual C#/C++ 2010 Express editions. Both projects are simple console applications, I've tested them in standard (no custom settings) release and debug mode.
C# lists run faster on my pc, array initialization is also faster in C#, math operations are slower.
I'm using an Intel Core2Duo P8600 @ 2.4GHz, and C# - .NET 4.0.
I know that vector implementation is different than C# list, but I just wanted to test collections that I would use to store my objects (and being able to use index accessor).
Of course you need to clear memory (let's say for every use of new), but I wanted to keep the code simple.
C++ vector test:
#include <ctime>
#include <iostream>
#include <vector>
using namespace std;

static void TestVector()
{
    clock_t start, finish;
    start = clock();
    vector<vector<double>> myList = vector<vector<double>>();
    int i = 0;
    for (i = 0; i < 500; i++)
    {
        myList.push_back(vector<double>());
        for (int j = 0; j < 50000; j++)
            myList[i].push_back(j + i);   // note: the C# version below stores j * i; either way the cost is dominated by the push_back/Add calls
    }
    finish = clock();
    cout << (finish - start) << endl;
    cout << (double(finish - start) / CLOCKS_PER_SEC);
}
C# list test:
private static void TestVector()
{
    DateTime t1 = System.DateTime.Now;
    List<List<double>> myList = new List<List<double>>();
    int i = 0;
    for (i = 0; i < 500; i++)
    {
        myList.Add(new List<double>());
        for (int j = 0; j < 50000; j++)
            myList[i].Add(j * i);
    }
    DateTime t2 = System.DateTime.Now;
    Console.WriteLine(t2 - t1);
}
C++ - array:
static void TestArray()
{
    cout << "Normal array test:" << endl;
    const int rows = 5000;
    const int columns = 9000;
    clock_t start, finish;

    start = clock();
    double** arr = new double*[rows];
    for (int i = 0; i < rows; i++)
        arr[i] = new double[columns];
    finish = clock();
    cout << (finish - start) << endl;

    start = clock();
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < columns; j++)
            arr[i][j] = i * j;
    finish = clock();
    cout << (finish - start) << endl;
}
C# - array:
private static void TestArray()
{
    const int rows = 5000;
    const int columns = 9000;

    DateTime t1 = System.DateTime.Now;
    double[][] arr = new double[rows][];
    for (int i = 0; i < rows; i++)
        arr[i] = new double[columns];
    DateTime t2 = System.DateTime.Now;
    Console.WriteLine(t2 - t1);

    t1 = System.DateTime.Now;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < columns; j++)
            arr[i][j] = i * j;
    t2 = System.DateTime.Now;
    Console.WriteLine(t2 - t1);
}
Time: (Release/Debug)
C++
600 / 606 ms array init,
200 / 270 ms array fill,
1sec /13sec vector init & fill.
(Yes, 13 seconds, I always have problems with lists/vectors in debug mode.)
C#:
20 / 20 ms array init,
403 / 440 ms array fill,
710 / 742 ms list init & fill.
It's an extremely vague question without real definitive answers.
For example, I'd rather play 3D games that are created in C++ than in C#, because the performance is certainly a lot better. (And I know XNA, etc., but it comes nowhere near the real thing.)
On the other hand, as previously mentioned; you should develop in a language that lets you do what you want quickly, and then if necessary optimize.
.NET languages can be as fast as C++ code, or even faster, but C++ code will have a more constant throughput as the .NET runtime has to pause for GC, even if it's very clever about its pauses.
So if you have some code that has to consistently run fast without any pause, .NET will introduce latency at some point, even if you are very careful with the runtime GC.
I suppose there are applications written in C# that run fast, just as there are plenty of apps written in C++ that run fast (well, C++ is just older... and take UNIX, too...).
The question indeed is: what is it that users and developers are actually complaining about?
Well, IMHO, in the case of C# we have a very comfortable UI, a very nice hierarchy of libraries, and the whole interface system of the CLI. In the case of C++ we have templates, ATL, COM, MFC, and a whole shebang of already-written and running code like OpenGL, DirectX and so on... Developers complain about unpredictable GC pauses in the case of C# (meaning the program runs fast, and then one second later - bang! - it's stuck).
Writing code in C# is very simple and fast (not forgetting that this also increases the chance of errors). In the case of C++, developers complain about memory leaks (meaning crashes), calls between DLLs, as well as "DLL hell" - problems with supporting and replacing libraries with newer ones...
I think the more skill you have in a programming language, the more quality (and speed) will characterize your software.
I would put it this way: programmers who write faster code, are the ones who are the more informed of what makes current machines go fast, and incidentally they are also the ones who use an appropriate tool that allows for precise low-level and deterministic optimisation techniques. For these reasons, these people are the ones who use C/C++ rather than C#. I would go as far as stating this as a fact.
Well, it depends. If the byte-code is translated into machine code (and not just JIT-ed - I mean if you actually execute the program), and if your program uses many allocations/deallocations, it could be faster, because the GC algorithm needs only one pass (theoretically) through the whole memory, whereas normal malloc/realloc/free C/C++ calls cause overhead on every call (call overhead, data-structure overhead, cache misses ;) ).
So it is theoretically possible (also for other GC languages).
I don't really see the extreme disadvantage of not being able to use metaprogramming with C# for most applications, because most programmers don't use it anyway.
Another big advantage is that SQL-like "extensions" such as LINQ provide opportunities for the compiler to optimize calls to databases (in other words, the compiler could compile the whole LINQ expression into one "blob" binary where the called functions are inlined or otherwise optimized for your use, but I'm speculating here).
If I'm not mistaken, C# generics are instantiated at runtime by the JIT (specialized for each value type, shared across reference types). That must be slower than the compile-time templates of C++.
And when you take into account all the other compile-time optimizations mentioned by so many others, as well as the lack of safety checks that does, indeed, mean more speed...
I'd say C++ is the obvious choice in terms of raw speed and minimum memory consumption. But this also translates into more time developing the code and ensuring you aren't leaking memory or causing any null pointer exceptions.
Verdict:
C#: Faster development, slower run
C++: Slow development, faster run.
There are some major differences between C# and C++ on the performance front:
C# is GC-/heap-based. The allocation and the GC itself are overhead, as is the poorer locality of memory access.
C++ optimizers have become very good over the years. JIT compilers cannot achieve the same level, since they have only limited compilation time and don't see the global scope.
Besides that, programmer competence also plays a role. I have seen bad C++ code where classes were passed by value as arguments all over the place. You can actually make performance worse in C++ if you don't know what you are doing.
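As a small illustration of the first point (a sketch of the general idea, not a benchmark), choosing a struct over a class in C# for tiny value-like data keeps the elements inline in one array and avoids per-object heap allocations for the GC to track:

struct PointStruct { public double X; public double Y; }   // value type: stored inline in the array
class PointClass { public double X; public double Y; }      // reference type: one GC object per instance

static void Allocate()
{
    // Single contiguous allocation, good locality, nothing extra for the GC.
    PointStruct[] values = new PointStruct[100000];

    // An array of references plus 100,000 separate heap objects scattered around.
    PointClass[] refs = new PointClass[100000];
    for (int i = 0; i < refs.Length; i++)
        refs[i] = new PointClass();
}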
I found this April 2020 read: https://www.quora.com/Why-is-C-so-slow-compared-to-Python by a real-world programmer with 15+ years of Software Development experience.
It states that C# is slower usually because it is compiled to Common Intermediate Language (CIL) instead of machine code like C++. The CIL is then put through Common Language Runtime (CLR) which outputs machine code. However, if you keep executing C# it will take the output of the machine code and cache it so the machine code is saved for the next execution. All in all, C# can be faster if you execute multiple times since it is in machine code after multiple executions.
There are also comments that a good C++ programmer can perform optimizations that are time-consuming but that will, in the end, pay off.
> After all, the answers have to be somewhere, haven't they? :)
Umm, no.
As several replies noted, the question is under-specified in ways that invite questions in response, not answers. To take just one way:
the question conflates language with language implementation - this C program is both 2,194 times slower and 1.17 times faster than this C# program - we would have to ask you: Which language implementations?
And then which programs? Which machine? Which OS? Which data set?
It really depends on what you're trying to accomplish in your code. I've heard that it's just stuff of urban legend that there is any performance difference between VB.NET, C# and managed C++. However, I've found, at least in string comparisons, that managed C++ beats the pants off of C#, which in turn beats the pants off of VB.NET.
I've by no means done any exhaustive comparisons in algorithmic complexity between the languages. I'm also just using the default settings in each of the languages. In VB.NET I'm using settings to require declaration of variables, etc. Here is the code I'm using for managed C++: (As you can see, this code is quite simple). I'm running the same in the other languages in Visual Studio 2013 with .NET 4.6.2.
#include "stdafx.h"
using namespace System;
using namespace System::Diagnostics;
bool EqualMe(String^ first, String^ second)
{
    return first->Equals(second);
}

int main(array<String^>^ args)
{
    Stopwatch^ sw = gcnew Stopwatch();
    sw->Start();
    for (int i = 0; i < 100000; i++)
    {
        EqualMe(L"one", L"two");
    }
    sw->Stop();
    Console::WriteLine(sw->ElapsedTicks);
    return 0;
}
One area that I was instrumenting code in C++ vs C# was in creating a database connection to SQL Server and returning a resultset. I compared C++ (Thin layer over ODBC) vs C# (ADO.NET SqlClient) and found that C++ was about 50% faster than the C# code. ADO.NET is supposed to be a low-level interface for dealing with the database. Where you see perhaps a bigger difference is in memory consumption rather than raw speed.
Another thing that makes C++ code faster is that you can tune the compiler options at a granular level, optimizing things in a way you can't in C#.
Inspired by this, I did a quick test using 60 percent of the common instructions needed in most programs.
Here’s the C# code:
for (int i = 0; i < 1000; i++)
{
    StreamReader str = new StreamReader("file.csv");
    StreamWriter stw = new StreamWriter("examp.csv");
    string strL = "";
    while ((strL = str.ReadLine()) != null)
    {
        ArrayList al = new ArrayList();
        string[] strline = strL.Split(',');
        al.AddRange(strline);
        foreach (string str1 in strline)
        {
            stw.Write(str1 + ",");
        }
        stw.Write("\n");
    }
    str.Close();
    stw.Close();
}
String array and arraylist are used purposely to include those instructions.
Here's the C++ code:
for (int i = 0; i < 1000; i++)
{
    std::fstream file("file.csv", ios::in);
    if (!file.is_open())
    {
        std::cout << "File not found!\n";
        return 1;
    }
    ofstream myfile;
    myfile.open("example.txt");
    std::string csvLine;
    while (std::getline(file, csvLine))
    {
        std::istringstream csvStream(csvLine);
        std::vector<std::string> csvColumn;
        std::string csvElement;
        while (std::getline(csvStream, csvElement, ','))
        {
            csvColumn.push_back(csvElement);
        }
        for (std::vector<std::string>::iterator j = csvColumn.begin(); j != csvColumn.end(); ++j)
        {
            myfile << *j << ", ";
        }
        csvColumn.clear();
        csvElement.clear();
        csvLine.clear();
        myfile << "\n";
    }
    myfile.close();
    file.close();
}
The input file size I used was 40 KB.
And here's the result -
C++ code ran in 9 seconds.
C# code: 4 seconds!!!
Oh, but this was on Linux... With C# running on Mono... And C++ with g++.
OK, this is what I got on Windows – Visual Studio 2003:
C# code ran in 9 seconds.
C++ code – horrible 370 seconds!!!

Compile C#, so that it runs with the speed of C++

Alright, so I wanted to ask if it's actually possible to make a translator from C# to C++, so that code written in C# would be able to run as fast as code written in C++.
Is it actually possible to do? I'm not asking how hard is it going to be.
What makes you think that translating your C# code to C++ would magically make it faster?
Languages don't have a speed. Assuming that C# code is slower (I'll get back to that), it is because of what that code does (including the implicit requirements placed by C#, such as bounds checking on arrays), and not because of the language it is written in.
If you converted your C# code to C++, it would still need to do bounds checking on arrays, because the original source code expected this to happen, so it would have to do just as much work.
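That said, the check is not always paid on every single access: the .NET JIT can usually drop the bounds check when the loop bound is provably the array's own length, which is part of why idiomatic C# loops hold up reasonably well. A hedged sketch:

static long Sum(int[] data)
{
    long total = 0;
    // The loop bound is data.Length, so the JIT can typically elide the per-element bounds check.
    for (int i = 0; i < data.Length; i++)
        total += data[i];
    return total;
}

static long SumFirst(int[] data, int count)
{
    long total = 0;
    // With an unrelated bound the check generally stays, and an out-of-range
    // count throws IndexOutOfRangeException instead of reading past the array.
    for (int i = 0; i < count; i++)
        total += data[i];
    return total;
}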
Moreover, C# often isn't slower than C++. There are plenty of benchmarks floating around on the internet, generally showing that for the most part, C# is as fast as (or faster than) C++. Only when you spend a lot of time optimizing your code does C++ become faster.
If you want faster code, you need to write code that requires less work to execute, not try to change the source language. That's just cargo-cult programming at its worst. You once saw some efficient code, and that was written in C++, so now you try to make things C++, in the hope of attracting any efficiency that might be passing by.
It just doesn't work that way.
Although you could translate C# code to C++, there is the issue that C# depends on the .NET Framework libraries, which are not native, so simply translating the C# source would not be enough.
Update
Also C# code depends on the runtime to do things such as memory management i.e. Garbage Collection. If you translated the C# code to C++, where would the memory management code be? Parsing and translating is not going to fix issues like that.
The Mono project has invested quite a lot of energy in turning LLVM into a native machine-code compiler for the C# runtime, although there are some problems with specific language constructs like shared generics. Check it out and take it for a spin.
You can use NGen to compile IL to native code
Performance related tweaks:
Platform independent
use a profiler to spot the bottlenecks;
prevent unnecessary garbage (spot it using generation #0 collect count and the Large Object heap)
prevent unnecessary copying (use struct wisely)
prevent unwarranted generics (code-sharing has unexpected performance side effects)
prefer old-fashioned loops over enumerator blocks when performance is an issue
when using LINQ, watch closely where you maintain/break deferred evaluation - both can be enormous boosts to performance (see the sketch after this list)
use Reflection.Emit / expression trees to precompile dynamic logic that is a performance bottleneck
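A small sketch of the deferred-evaluation point from the list above (the query and numbers are made up; assumes using System.Linq):

static void FilterExample(List<int> items)
{
    // Deferred: 'query' is only a description of the work; nothing runs yet.
    IEnumerable<int> query = items.Where(x => x % 2 == 0);

    // Every enumeration re-runs the Where() filter - fine once, wasteful in a loop.
    int first = query.Count();
    int second = query.Count();          // the filter executes again here

    // Breaking deferral with ToList() evaluates once and caches the result,
    // which pays off when the result is enumerated many times.
    List<int> materialized = query.ToList();
}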
Mono
use Mono --gc=sgen --optimize=inline,... (the SGEN garbage collector can make orders of magnitude difference). See also man mono for a lot of tuning/optimization options
use MONO_GENERIC_SHARING=none to disable sharing of generics (making particular tasks a lot quicker especially when supporting both valuetypes and reftypes) (not recommended for regular production use)
use the -optimize+ compile flag (optimizing the CLR code independently from what the JITter may do with that)
Less mainstream:
use the LLVM backend:
This allows Mono to benefit from all of the compiler optimizations done in LLVM. For example the SciMark score goes from 482 to 610.
use mkbundle to create a statically linked NATIVE binary image (already fully JITted, i.e. AOT (ahead-of-time compiled))
MS .NET
Most of the above have direct Microsoft counterparts (NGen, /optimize, etc.).
Of course MS doesn't have a switchable/tunable garbage collector, and I don't think a fully compiled native binary can be achieved the way it can with Mono.
As always the answer to making code run faster is:
Find the bottleneck and optimize that
Most of the time the bottleneck is either:
time spent in a critical loop
Review your algorithm and data structures; do not change the language. The latter will give you a 10% speedup, while the former can give you a 1000x speedup.
If you're stuck on the best algorithm, you can always ask a specific, short and detailed question on SO.
time spent waiting for a slow resource
Reduce the amount of stuff you're requesting from the source
instead of:
SELECT * FROM bigtable
do
SELECT TOP 10 * FROM bigtable ORDER BY xxx
The latter will return instantly, and you cannot show a million records in a meaningful way anyhow.
Or you can have the server at the other end reduce the data so that it doesn't take a hundred years to cross the network.
Alternatively, you can execute the slow data-fetch routine in a separate thread, so the rest of your program can do meaningful stuff instead of waiting.
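A minimal sketch of that last suggestion, assuming .NET 4.5+ and using System.Threading.Tasks; FetchRows, Row and DoOtherMeaningfulWork are placeholders for your real data access and foreground work:

static void LoadData()
{
    // Kick the slow fetch onto a thread-pool thread; the main thread keeps working.
    Task<List<Row>> fetch = Task.Run(() => FetchRows());

    DoOtherMeaningfulWork();

    // Block (or await) only at the point where the data is actually needed.
    List<Row> rows = fetch.Result;
}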
Time spent because you are overflowing memory with gigabytes of data
Use a different algorithm that works on a smaller dataset at a time.
Try to optimize cache usage.
The answer to efficient coding is to measure where your execution time goes.
Use a profiler.
see: http://csharp-source.net/open-source/profilers
And optimize those parts that eat more than 50% of your CPU time.
Do this for a number of iterations, and soon your 10 hour running time will be down to a manageable 3 minutes, instead of the 9.5 hours that you will get from switching to this or that better language.

How close can I get C# to the performance of C++ for small intensive tasks?

I was thinking about the speed difference of C++ to C# being mostly about C# compiling to byte-code that is taken in by the JIT compiler (is that correct?) and all the checks C# does.
I notice that it is possible to turn a lot of these checks off, both in the compile options and possibly by using the unsafe keyword, since unsafe code is not verifiable by the common language runtime.
Therefore if you were to write a simple console application in both languages, that flipped an imaginary coin an infinite number of times and displayed the results to the screen every 10,000 or so iterations, how much speed difference would there be? I chose this because it's a very simple program.
I'd like to test this but I don't know C++ or have the tools to compile it. This is my C# version though:
static void Main(string[] args)
{
    unsafe
    {
        Random rnd = new Random();
        int heads = 0, tails = 0;
        while (true)
        {
            if (rnd.NextDouble() > 0.5)
                heads++;
            else
                tails++;
            if ((heads + tails) % 1000000 == 0)
                Console.WriteLine("Heads: {0} Tails: {1}", heads, tails);
        }
    }
}
Is the difference enough to warrant deliberately compiling sections of code "unsafe" or into DLLs that do not have some of the compile options like overflow checking enabled? Or does it go the other way, where it would be beneficial to compile sections in C++? I'm sure interop speed comes into play too then.
To avoid subjectivity, I reiterate the specific parts of this question as:
Does C# have a performance boost from using unsafe code?
Do the compile options such as disabling overflow checking boost performance, and do they affect unsafe code?
Would the program above be faster in C++, or negligibly different?
Is it worth compiling long intensive number-crunching tasks in a language such as C++ or using /unsafe for a bonus? Less subjectively, could I complete an intensive operation faster by doing this?
The example given is flawed because it does not show real-life usage of either programming language. Using simple datatypes to measure the speed of a language will not reveal anything interesting. Instead, I suggest you create a template class in C++ and compare it with what is possible with C# generics. In the end, objects will produce more meaningful results, and you will see that C++ is faster than C#. Not to mention that you are comparing a lower-level programming language with C#.
Does C# have a performance boost from using unsafe code?
Yes, it will give a boost, but it is not suggested that you write all your code with unsafe. Here is why: Code written using an unsafe context cannot be verified to be safe, so it will be executed only when the code is fully trusted. In other words, unsafe code cannot be executed in an untrusted environment. For example, you cannot run unsafe code directly from the Internet. http://msdn.microsoft.com/en-us/library/aa288474(VS.71).aspx
Would the program above be faster in C++ or negligibly different?
Yes, the program would be slightly faster in C++. C++ is a lower-level programming language, and even faster if you start using the algorithm library (random_shuffle comes to mind).
Is it worth compiling long intensive number-crunching tasks in a language such as C++ or using /unsafe for a bonus? Less subjectively, could I complete an intensive operation faster by doing this?
It depends on the project...
Up to more than 100% of the speed - it depends a lot on the task, simply said.
More than 100% - yes, because the just-in-time compiler knows your processor, and I doubt you actually optimize for your specific hardware platform ;)
No SSE is a problem if you do matrix operations.
For some things with tons of arrays (image manipulation), the array bounds checks kill you, but pointers (i.e. unsafe code) work, as they bypass them.
Regarding things like overflow checking - be careful. As in: in C++ you have the same possibility. If you need overflow checking, the performance issue is not there ;)
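For reference, C# leaves ordinary arithmetic unchecked by default and lets you opt in per expression or per project (/checked); a small sketch:

static void OverflowDemo()
{
    int big = int.MaxValue;

    // Default project settings: arithmetic is unchecked, so this silently wraps around.
    int wrapped = big + 1;                  // becomes int.MinValue, no exception

    checked
    {
        // int boom = big + 1;              // inside a checked block this would throw OverflowException
    }

    // If the whole project is built with /checked, 'unchecked' turns the test back off locally.
    int wrappedAgain = unchecked(big + 1);
}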
I personally would not bother with C++ in most cases. Partially yes, especially when you can benefit from SSE.
So, in the end, a lot depends on the NATURE of your calculations.

Would there be any point in designing a CPU that could handle IL directly?

If I understand this correctly:
Current CPU companies like AMD and Intel have their own instruction sets (the assembly language), which they present as the 2G language on top of the machine code (the 1G language).
Would it be possible or desirable (for performance or otherwise) to have a CPU that handled IL at its core instead of the current instruction sets?
A similar technology does exist for Java - ARM do a range of CPUs that can do this, they call it their "Jazelle" technology.
However, the operations represented by .net IL opcodes are only well-defined in combination with the type information held on the stack, not on their own. This is a major difference from Java bytecode, and would make it much more difficult to create sensible hardware to execute IL.
Moreover, IL is intended for compilation to a final target. Most back ends that spit out IL do very little optimisation, aiming instead to preserve semantic content for verification and optimisation in the final compilation step. Even if the hardware problems could be overcome, the result will almost certainly still be slower than a decent optimising JIT.
So, to sum up: while it is not impossible, it would be disproportionately hard compared to other architectures, and would achieve little.
You seem a bit confused about how CPUs work. Assembly is not a separate language from machine code; it is simply a different (textual) representation of it.
Assembly code is simply a sequential listing of instructions to be executed. And machine code is exactly the same thing. Every instruction supported by the CPU has a certain bit-pattern that cause it to be executed, and it also has a textual name you can use in assembly code.
If I write add $10, $9, $8 and run it through an assembler, I get the machine code for the add instruction, taking the values in registers 9 and 8, adding them and storing the result in register 10.
There is a 1 to 1 mapping between assembler and machine code.
There also are no "API calls". The CPU simply reads from address X, and matches the subsequent bits against all the instructions it understands. Once it finds an instruction that matches this bit pattern, it executes the instruction, and moves on to read the next one.
What you're asking is in a sense impossible or a contradiction. IL stands for Intermediate Language, that is, a kind of pseudocode that is emitted by the compiler, but has not yet been translated into machine code. But if the CPU could execute that directly, then it would no longer be intermediate, it would be machine code.
So the question becomes "is your IL code a better, more efficient representation of a program, than the machine code the CPU supports now?"
And the answer is most likely no. MSIL (I assume that's what you mean by IL, which is a much more general term) is designed to be portable, simple and consistent. Every .NET language compiles to MSIL, and every MSIL program must be able to be translated into machine code for any CPU anywhere. That means MSIL must be general and abstract and not make assumptions about the CPU. For this reason, as far as I know, it is a purely stack-based architecture. Instead of keeping data in registers, each instruction processes the data on the top of the stack. That's a nice clean and generic system, but it's not very efficient, and doesn't translate well to the rigid structure of a CPU. (In your wonderful little high-level world, you can pretend that the stack can grow freely. For the CPU to get fast access to it, it must be stored in some small, fast on-chip memory with finite size. So what happens if your program pushes too much data onto the stack?)
Yes, you could make a CPU to execute MSIL directly, but what would you gain?
You'd no longer need to JIT code before execution, so the first time you start a program, it would launch a bit faster. Apart from that, though? Once your MSIL program has been JIT'ed, it has been translated to machine code and runs as efficiently as if it had been written in machine code originally. MSIL bytecode no longer exists, just a series of instructions understood by the CPU.
In fact, you'd be back where you were before .NET. Non-managed languages are compiled straight to machine code, just like this would be in your suggestion. The only difference is that non-managed code targets machine code that is designed by CPU designers to be suitable for execution on a CPU, while in your case, it'd target machine code that's designed by software designers to be easy to translate to and from.
This is not a new idea - the same thing was predicted for Java, and Lisp machines were even actually implemented.
But experience with those shows that it's not really useful - by designing special-purpose CPUs, you can't benefit from the advances of "traditional" CPUs, and you very likely can't beat Intel at their own game. A quote from the Wikipedia article illustrates this nicely:
cheaper desktop PCs soon were able to run Lisp programs even faster than Lisp machines, without the use of special purpose hardware.
Translating from one kind of machine code to another on the fly is a well-understood problem and so common (modern CISC CPUs even do something like that internally because they're really RISC) that I believe we can assume it is being done efficiently enough that avoiding it does not yield significant benefits - not when it means you have to decouple yourself from the state of the art in traditional CPUs.
I would say no.
The actual machine language instructions that need to run on a computer are lower level than IL. IL, for example, doesn't really describe how methods calls should be made, how registers should be managed, how the stack should be accessed, or any other of the details that are needed at the machine code level.
Getting the machine to recognize IL directly would, therefore, simply move all the JIT compilation logic from software into hardware.
That would make the whole process very rigid and unchangeable.
By having the machine language based on the capabilities of the machine, and an intermediate language based on capturing programmer intent, you get a much better system. The folks defining the machine can concentrate on defining an efficient computer architecture, and the folks defining the IL system can focus on things like expressiveness and safety.
If both the tool vendors and the hardware vendors had to use the exact same representation for everything, then innovation in either the hardware space or the tool space would be hampered. So, I say they should be separate from one another.
I wouldn't have thought so, for two reasons:
If you had hardware processing IL that hardware would not be able to run a newer version of IL. With JIT you just need a new JIT then existing hardware can run the newer IL.
IL is simple and is designed to be hardware-agnostic. The IL would have to become much more complex to enable it to describe operations in the most efficient manner in order to get anywhere close to the performance of existing machine code, but then the IL would be much harder to run on non-IL-specific hardware.

C# Performance For Proxy Server (vs C++)

I want to create a simple HTTP proxy server that does some very basic processing on the HTTP headers (i.e. if header x == y, do z). The server may need to support hundreds of users. I can write the server in C# (pretty easy) or C++ (much harder). However, would a C# version have performance as good as a C++ version? If not, would the difference in performance be big enough that it would not make sense to write it in C#?
You can use unsafe C# code and pointers in critical bottleneck points to make it run faster. Those behave much like C++ code and I believe it executes as fast.
But most of the time, C# code is JIT-ted to be very fast already, so I don't believe there will be much difference, in line with what everyone else has said.
But one thing you might want to consider is: Managed code (C#) string operations are rather slow compared to using pointers effectively in C++. There are more optimization tricks with C++ pointers than with CLR strings.
I think I have done some benchmarks before, but can't remember where I've put them.
Why do you expect a much higher performance from the C++ application?
There is no inherent slowdown added by a C# application when you are doing it right (not too many dropped references, no frequent object creation/dropping per call, etc.).
The only time a C++ application really outperforms an equivalent C# application is when you can do (very) low level operations. E.g. casting raw memory pointers, inline assembler, etc.
The C++ compiler may be better at creating fast code, but mostly this is wasted in most applications. If you do really have a part of your application that must be blindingly fast, try writing a C call for that hot spot.
Only if most of the system behaves too slowly you should consider writing it in C/C++. But there are many pitfalls that may kill your performance in your C++ code.
(TL;DR: A C++ expert may create 'faster' code than a C# expert, but a mediocre C++ programmer may create slower code than a mediocre C# one.)
I would expect the C# version to be nearly as fast as the C++ one but with smaller memory footprint.
In some cases managed code is actually a LOT faster and uses less memory compared to non optimized C++. C++ code can be faster if written by expert, but it rarely justifies the effort.
As a side note, I can recall a performance "competition" in the blogosphere between Michael Kaplan (C#) and Raymond Chen (C++) to write a program that does exactly the same thing. Raymond Chen, who is considered one of the best programmers in the world (Joel), succeeded in writing faster C++ only after a long struggle, rewriting most of the code.
The proxy server you describe would deal mostly with string data, and I think it's reasonable to implement it in C#. In your example,
if header x == y, do z
the slowest part might actually be doing whatever 'z' is and you'll have to do that work regardless of the language.
In my experience, the design and implementation have much more to do with performance than the choice of language/framework does (however, the usual caveats apply: e.g., don't write a device driver in C# or Java).
I wouldn't think twice about writing the type of program you describe in a managed language (be it Java, C#, etc). These days, the performance gains you get from using a lower-level language (in terms of closeness to hardware) are often easily offset by the runtime abilities of a managed environment. Of course this is coming from a C#/Python developer, so I'm not exactly unbiased...
If you need a fast and reliable proxy server, it might make sense to try some of those that already exist. But if you have custom features that are required, then you may have to build your own. You may want to collect some more information on the expected load: hundreds of users might be a few requests a minute or a hundred requests a second.
Assuming you need to serve under or around 200 qps on a single machine, C# should easily meet your needs -- even languages known for being slow (e.g. Ruby) can easily pump out a few hundred requests a second.
Aside from performance, there are other reasons to choose C#, e.g. it's much easier to write buffer overflows in C++ than C#.
Is your http server going to run on a dedicated machine? If yes, I would say go with C# if it is easier for you. If you need to run other applications on the same machine, you'll need to take into account the memory footprint of your application and the fact that GC will run at "random" times.
