openMP for don't improve overal perfomance - c#

I am using a c++ dll in order to do some computational intensive processing- my c++ uses threads - and use this dll in my c# application.
I used MS visual studio profiler to identify the parts of code which consumes most time.
and used openMP for improving the performance through distributing the work load among a Intel i7 8 core processor running #2.9GHz.
the following code for example consumes to much time of the cpu
for (short j = 0; j < 4096; j++)
histsum[j] = hist_tmp[row][col][0][j] + hist_tmp[row][col][1][j] + hist_tmp[row][col][2][j] + hist_tmp[row][col][3][j];
so I modified it to
#pragma omp parallel for
for (short j = 0; j < 4096; j++)
histsum[j] = hist_tmp[row][col][0][j] + hist_tmp[row][col][1][j] + hist_tmp[row][col][2][j] + hist_tmp[row][col][3][j];
i noticed that the 8 cores are 100% loaded , but the overall performance is not improved.
What could be the problem? and how could I overcome it?

Neither C++, nor multi threading, automatically improve performance. Both need to be applied where appropriate to get any benefit.
Using multi threading have some overhead to delegate work and synchronize the the results. If each task is small the overhead will be larger than any possible gain. Adding four numbers together is absolutely to small for this to be useful. You need to apply multi-threading to much larger chunks of data for it to be useful.
I would also not expect any huge gains by using c++ for code like this. The C++ compiler is in general better at optimizing code, but for simple code like this even the c# compiler should do a decent job. You might see some small improvement by moving the indexing, hist_tmp[row][col], outside the loop. This indexing might be optimized away already, but it might be worth a try.
This kind of code should however benefit from SIMD. Some c++ compilers have options for auto-vectorization. To get the best performance from c# there is intrinstics in .net core.
A very important point is to measure performance using the appropriate tools for the platform. In c# this would be a stopwatch, but tools like benchmark.net are recommended since they handle things like compilation overhead. A good profiler is also very useful.
I would also suggest looking for any algorithmic improvements, caching opportunities or other unnecessary or repeated work before going to multi threading and c++. The former can often have a far larger effect on performance.

Related

Primenumbers C# and C++ [duplicate]

Or is it now the other way around?
From what I've heard there are some areas in which C# proves to be faster than C++, but I've never had the guts to test it by myself.
Thought any of you could explain these differences in detail or point me to the right place for information on this.
There is no strict reason why a bytecode based language like C# or Java that has a JIT cannot be as fast as C++ code. However C++ code used to be significantly faster for a long time, and also today still is in many cases. This is mainly due to the more advanced JIT optimizations being complicated to implement, and the really cool ones are only arriving just now.
So C++ is faster, in many cases. But this is only part of the answer. The cases where C++ is actually faster, are highly optimized programs, where expert programmers thoroughly optimized the hell out of the code. This is not only very time consuming (and thus expensive), but also commonly leads to errors due to over-optimizations.
On the other hand, code in interpreted languages gets faster in later versions of the runtime (.NET CLR or Java VM), without you doing anything. And there are a lot of useful optimizations JIT compilers can do that are simply impossible in languages with pointers. Also, some argue that garbage collection should generally be as fast or faster as manual memory management, and in many cases it is. You can generally implement and achieve all of this in C++ or C, but it's going to be much more complicated and error prone.
As Donald Knuth said, "premature optimization is the root of all evil". If you really know for sure that your application will mostly consist of very performance critical arithmetic, and that it will be the bottleneck, and it's certainly going to be faster in C++, and you're sure that C++ won't conflict with your other requirements, go for C++. In any other case, concentrate on first implementing your application correctly in whatever language suits you best, then find performance bottlenecks if it runs too slow, and then think about how to optimize the code. In the worst case, you might need to call out to C code through a foreign function interface, so you'll still have the ability to write critical parts in lower level language.
Keep in mind that it's relatively easy to optimize a correct program, but much harder to correct an optimized program.
Giving actual percentages of speed advantages is impossible, it largely depends on your code. In many cases, the programming language implementation isn't even the bottleneck. Take the benchmarks at http://benchmarksgame.alioth.debian.org/ with a great deal of scepticism, as these largely test arithmetic code, which is most likely not similar to your code at all.
I'm going to start by disagreeing with part of the accepted (and well-upvoted) answer to this question by stating:
There are actually plenty of reasons why JITted code will run slower than a properly optimized C++ (or other language without runtime overhead)
program including:
compute cycles spent on JITting code at runtime are by definition unavailable for use in program execution.
any hot paths in the JITter will be competing with your code for instruction and data cache in the CPU. We know that cache dominates when it comes to performance and native languages like C++ do not have this type of contention, by design.
a run-time optimizer's time budget is necessarily much more constrained than that of a compile-time optimizer's (as another commenter pointed out)
Bottom line: Ultimately, you will almost certainly be able to create a faster implementation in C++ than you could in C#.
Now, with that said, how much faster really isn't quantifiable, as there are too many variables: the task, problem domain, hardware, quality of implementations, and many other factors. You'll have run tests on your scenario to determine the the difference in performance, and then decide whether it is worth the the additional effort and complexity.
This is a very long and complex topic, but I feel it's worth mentioning for the sake of completeness that C#'s runtime optimizer is excellent, and is able to perform certain dynamic optimizations at runtime that are simply not available to C++ with its compile-time (static) optimizer. Even with this, the advantage is still typically deeply in the native application's court, but the dynamic optimizer is the reason for the "almost certainly" qualifier given above.
--
In terms of relative performance, I was also disturbed by the figures and discussions I saw in some other answers, so I thought I'd chime in and at the same time, provide some support for the statements I've made above.
A huge part of the problem with those benchmarks is you can't write C++ code as if you were writing C# and expect to get representative results (eg. performing thousands of memory allocations in C++ is going to give you terrible numbers.)
Instead, I wrote slightly more idiomatic C++ code and compared against the C# code #Wiory provided. The two major changes I made to the C++ code were:
used vector::reserve()
flattened the 2d array to 1d to achieve better cache locality (contiguous block)
C# (.NET 4.6.1)
private static void TestArray()
{
const int rows = 5000;
const int columns = 9000;
DateTime t1 = System.DateTime.Now;
double[][] arr = new double[rows][];
for (int i = 0; i < rows; i++)
arr[i] = new double[columns];
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
t1 = System.DateTime.Now;
for (int i = 0; i < rows; i++)
for (int j = 0; j < columns; j++)
arr[i][j] = i;
t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}
Run time (Release): Init: 124ms, Fill: 165ms
C++14 (Clang v3.8/C2)
#include <iostream>
#include <vector>
auto TestSuite::ColMajorArray()
{
constexpr size_t ROWS = 5000;
constexpr size_t COLS = 9000;
auto initStart = std::chrono::steady_clock::now();
auto arr = std::vector<double>();
arr.reserve(ROWS * COLS);
auto initFinish = std::chrono::steady_clock::now();
auto initTime = std::chrono::duration_cast<std::chrono::microseconds>(initFinish - initStart);
auto fillStart = std::chrono::steady_clock::now();
for(auto i = 0, r = 0; r < ROWS; ++r)
{
for (auto c = 0; c < COLS; ++c)
{
arr[i++] = static_cast<double>(r * c);
}
}
auto fillFinish = std::chrono::steady_clock::now();
auto fillTime = std::chrono::duration_cast<std::chrono::milliseconds>(fillFinish - fillStart);
return std::make_pair(initTime, fillTime);
}
Run time (Release): Init: 398µs (yes, that's microseconds), Fill: 152ms
Total Run times: C#: 289ms, C++ 152ms (roughly 90% faster)
Observations
Changing the C# implementation to the same 1d array implementation
yielded Init: 40ms, Fill: 171ms, Total: 211ms (C++ was still almost
40% faster).
It is much harder to design and write "fast" code in C++ than it is to write "regular" code in either language.
It's (perhaps) astonishingly easy to get poor performance in C++; we saw that with unreserved vectors performance. And there are lots of pitfalls like this.
C#'s performance is rather amazing when you consider all that is going on at runtime. And that performance is comparatively easy to
access.
More anecdotal data comparing the performance of C++ and C#: https://benchmarksgame.alioth.debian.org/u64q/compare.php?lang=gpp&lang2=csharpcore
The bottom line is that C++ gives you much more control over performance. Do you want to use a pointer? A reference? Stack memory? Heap? Dynamic polymorphism or eliminate the runtime overhead of a vtable with static polymorphism (via templates/CRTP)? In C++ you have to... er, get to make all these choices (and more) yourself, ideally so that your solution best addresses the problem you're tackling.
Ask yourself if you actually want or need that control, because even for the trivial example above, you can see that although there is a significant improvement in performance, it requires a deeper investment to access.
It's five oranges faster. Or rather: there can be no (correct) blanket answer. C++ is a statically compiled language (but then, there's profile guided optimization, too), C# runs aided by a JIT compiler. There are so many differences that questions like “how much faster” cannot be answered, not even by giving orders of magnitude.
In my experience (and I have worked a lot with both languages), the main problem with C# compared to C++ is high memory consumption, and I have not found a good way to control it. It was the memory consumption that would eventually slow down .NET software.
Another factor is that JIT compiler cannot afford too much time to do advanced optimizations, because it runs at runtime, and the end user would notice it if it takes too much time. On the other hand, a C++ compiler has all the time it needs to do optimizations at compile time. This factor is much less significant than memory consumption, IMHO.
One particular scenario where C++ still has the upper hand (and will, for years to come) occurs when polymorphic decisions can be predetermined at compile time.
Generally, encapsulation and deferred decision-making is a good thing because it makes the code more dynamic, easier to adapt to changing requirements and easier to use as a framework. This is why object oriented programming in C# is very productive and it can be generalized under the term “generalization”. Unfortunately, this particular kind of generalization comes at a cost at run-time.
Usually, this cost is non-substantial but there are applications where the overhead of virtual method calls and object creation can make a difference (especially since virtual methods prevent other optimizations such as method call inlining). This is where C++ has a huge advantage because you can use templates to achieve a different kind of generalization which has no impact on runtime but isn't necessarily any less polymorphic than OOP. In fact, all of the mechanisms that constitute OOP can be modelled using only template techniques and compile-time resolution.
In such cases (and admittedly, they're often restricted to special problem domains), C++ wins against C# and comparable languages.
C++ (or C for that matter) gives you fine-grained control over your data structures. If you want to bit-twiddle you have that option. Large managed Java or .NET apps (OWB, Visual Studio 2005) that use the internal data structures of the Java/.NET libraries carry the baggage with them. I've seen OWB designer sessions using over 400 MB of RAM and BIDS for cube or ETL design getting into the 100's of MB as well.
On a predictable workload (such as most benchmarks that repeat a process many times) a JIT can get you code that is optimised well enough that there is no practical difference.
IMO on large applications the difference is not so much the JIT as the data structures that the code itself is using. Where an application is memory-heavy you will get less efficient cache usage. Cache misses on modern CPUs are quite expensive. Where C or C++ really win is where you can optimise your usage of data structures to play nicely with the CPU cache.
For graphics the standard C# Graphics class is way slower than GDI accessed via C/C++.
I know this has nothing to do with the language per se, more with the total .NET platform, but Graphics is what is offered to the developer as a GDI replacement, and its performance is so bad I wouldn't even dare to do graphics with it.
We have a simple benchmark we use to see how fast a graphics library is, and that is simply drawing random lines in a window. C++/GDI is still snappy with 10000 lines while C#/Graphics has difficulty doing 1000 in real-time.
The garbage collection is the main reason Java# CANNOT be used for real-time systems.
When will the GC happen?
How long will it take?
This is non-deterministic.
We have had to determine if C# was comparable to C++ in performance and I wrote some test programs for that (using Visual Studio 2005 for both languages). It turned out that without garbage collection and only considering the language (not the framework) C# has basically the same performance as C++. Memory allocation is way faster in C# than in C++ and C# has a slight edge in determinism when data sizes are increased beyond cache line boundaries. However, all of this had eventually to be paid for and there is a huge cost in the form of non-deterministic performance hits for C# due to garbage collection.
C/C++ can perform vastly better in programs where there are either large arrays or heavy looping/iteration over arrays (of any size). This is the reason that graphics are generally much faster in C/C++, because heavy array operations underlie almost all graphics operations. .NET is notoriously slow in array indexing operations due to all the safety checks, and this is especially true for multi-dimensional arrays (and, yes, rectangular C# arrays are even slower than jagged C# arrays).
The bonuses of C/C++ are most pronounced if you stick directly with pointers and avoid Boost, std::vector and other high-level containers, as well as inline every small function possible. Use old-school arrays whenever possible. Yes, you will need more lines of code to accomplish the same thing you did in Java or C# as you avoid high-level containers. If you need a dynamically sized array, you will just need to remember to pair your new T[] with a corresponding delete[] statement (or use std::unique_ptr)—the price for the extra speed is that you must code more carefully. But in exchange, you get to rid yourself of the overhead of managed memory / garbage collector, which can easily be 20% or more of the execution time of heavily object-oriented programs in both Java and .NET, as well as those massive managed memory array indexing costs. C++ apps can also benefit from some nifty compiler switches in certain specific cases.
I am an expert programmer in C, C++, Java, and C#. I recently had the rare occasion to implement the exact same algorithmic program in the latter 3 languages. The program had a lot of math and multi-dimensional array operations. I heavily optimized this in all 3 languages. The results were typical of what I normally see in less rigorous comparisons: Java was about 1.3x faster than C# (most JVMs are more optimized than the CLR), and the C++ raw pointer version came in about 2.1x faster than C#. Note that the C# program only used safe code—it is my opinion that you might as well code it in C++ before using the unsafe keyword.
Lest anyone think I have something against C#, I will close by saying that C# is probably my favorite language. It is the most logical, intuitive and rapid development language I've encountered so far. I do all my prototyping in C#. The C# language has many small, subtle advantages over Java (yes, I know Microsoft had the chance to fix many of Java's shortcomings by entering the game late and arguably copying Java). Toast to Java's Calendar class anyone? If Microsoft ever spends real effort to optimize the CLR and the .NET JITter, C# could seriously take over. I'm honestly surprised they haven't already—they did so many things right in the C# language, why not follow it up with heavy-hitting compiler optimizations? Maybe if we all beg.
As usual, it depends on the application. There are cases where C# is probably negligibly slower, and other cases where C++ is 5 or 10 times faster, especially in cases where operations can be easily SIMD'd.
I know it isn't what you were asking, but C# is often quicker to write than C++, which is a big bonus in a commercial setting.
> From what I've heard ...
Your difficulty seems to be in deciding whether what you have heard is credible, and that difficulty will just be repeated when you try to assess the replies on this site.
How are you going to decide if the things people say here are more or less credible than what you originally heard?
One way would be to ask for evidence.
When someone claims "there are some areas in which C# proves to be faster than C++" ask them why they say that, ask them to show you measurements, ask them to show you programs. Sometimes they will simply have made a mistake. Sometimes you'll find out that they are just expressing an opinion rather than sharing something that they can show to be true.
Often information and opinion will be mixed up in what people claim, and you'll have to try and sort out which is which. For example, from the replies in this forum:
"Take the benchmarks at http://shootout.alioth.debian.org/
with a great deal of scepticism, as
these largely test arithmetic code,
which is most likely not similar to
your code at all."
Ask yourself if you really
understand what "these largely test
arithmetic code" means, and then
ask yourself if the author has
actually shown you that his claim is
true.
"That's a rather useless test, since it really depends on how well
the individual programs have been
optimized; I've managed to speed up
some of them by 4-6 times or more,
making it clear that the comparison
between unoptimized programs is
rather silly."
Ask yourself whether the author has
actually shown you that he's managed
to "speed up some of them by 4-6
times or more" - it's an easy claim to make!
For 'embarassingly parallel' problems, when using Intel TBB and OpenMP on C++ I have observed a roughly 10x performance increase compared to similar (pure math) problems done with C# and TPL. SIMD is one area where C# cannot compete, but I also got the impression that TPL has a sizeable overhead.
That said, I only use C++ for performance-critical tasks where I know I will be able to multithread and get results quickly. For everything else, C# (and occasionally F#) is just fine.
In theory, for long running server-type application, a JIT-compiled language can become much faster than a natively compiled counterpart. Since the JIT compiled language is generally first compiled to a fairly low-level intermediate language, you can do a lot of the high-level optimizations right at compile time anyway. The big advantage comes in that the JIT can continue to recompile sections of code on the fly as it gets more and more data on how the application is being used. It can arrange the most common code-paths to allow branch prediction to succeed as often as possible. It can re-arrange separate code blocks that are often called together to keep them both in the cache. It can spend more effort optimizing inner loops.
I doubt that this is done by .NET or any of the JREs, but it was being researched back when I was in university, so it's not unreasonable to think that these sort of things may find their way into the real world at some point soon.
Applications that require intensive memory access eg. image manipulation are usually better off written in unmanaged environment (C++) than managed (C#). Optimized inner loops with pointer arithmetics are much easier to have control of in C++. In C# you might need to resort to unsafe code to even get near the same performance.
I've tested vector in C++ and C# equivalent - List and simple 2d arrays.
I'm using Visual C#/C++ 2010 Express editions. Both projects are simple console applications, I've tested them in standard (no custom settings) release and debug mode.
C# lists run faster on my pc, array initialization is also faster in C#, math operations are slower.
I'm using Intel Core2Duo P8600#2.4GHz, C# - .NET 4.0.
I know that vector implementation is different than C# list, but I just wanted to test collections that I would use to store my objects (and being able to use index accessor).
Of course you need to clear memory (let's say for every use of new), but I wanted to keep the code simple.
C++ vector test:
static void TestVector()
{
clock_t start,finish;
start=clock();
vector<vector<double>> myList=vector<vector<double>>();
int i=0;
for( i=0; i<500; i++)
{
myList.push_back(vector<double>());
for(int j=0;j<50000;j++)
myList[i].push_back(j+i);
}
finish=clock();
cout<<(finish-start)<<endl;
cout<<(double(finish - start)/CLOCKS_PER_SEC);
}
C# list test:
private static void TestVector()
{
DateTime t1 = System.DateTime.Now;
List<List<double>> myList = new List<List<double>>();
int i = 0;
for (i = 0; i < 500; i++)
{
myList.Add(new List<double>());
for (int j = 0; j < 50000; j++)
myList[i].Add(j *i);
}
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}
C++ - array:
static void TestArray()
{
cout << "Normal array test:" << endl;
const int rows = 5000;
const int columns = 9000;
clock_t start, finish;
start = clock();
double** arr = new double*[rows];
for (int i = 0; i < rows; i++)
arr[i] = new double[columns];
finish = clock();
cout << (finish - start) << endl;
start = clock();
for (int i = 0; i < rows; i++)
for (int j = 0; j < columns; j++)
arr[i][j] = i * j;
finish = clock();
cout << (finish - start) << endl;
}
C# - array:
private static void TestArray()
{
const int rows = 5000;
const int columns = 9000;
DateTime t1 = System.DateTime.Now;
double[][] arr = new double[rows][];
for (int i = 0; i < rows; i++)
arr[i] = new double[columns];
DateTime t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
t1 = System.DateTime.Now;
for (int i = 0; i < rows; i++)
for (int j = 0; j < columns; j++)
arr[i][j] = i * j;
t2 = System.DateTime.Now;
Console.WriteLine(t2 - t1);
}
Time: (Release/Debug)
C++
600 / 606 ms array init,
200 / 270 ms array fill,
1sec /13sec vector init & fill.
(Yes, 13 seconds, I always have problems with lists/vectors in debug mode.)
C#:
20 / 20 ms array init,
403 / 440 ms array fill,
710 / 742 ms list init & fill.
It's an extremely vague question without real definitive answers.
For example; I'd rather play 3D-games that are created in C++ than in C#, because the performance is certainly a lot better. (And I know XNA, etc., but it comes no way near the real thing).
On the other hand, as previously mentioned; you should develop in a language that lets you do what you want quickly, and then if necessary optimize.
.NET languages can be as fast as C++ code, or even faster, but C++ code will have a more constant throughput as the .NET runtime has to pause for GC, even if it's very clever about its pauses.
So if you have some code that has to consistently run fast without any pause, .NET will introduce latency at some point, even if you are very careful with the runtime GC.
I suppose there are applications written in C# running fast, as well as there are more C++ written apps running fast (well C++ just older... and take UNIX too...)
- the question indeed is - what is that thing, users and developers are complaining about ...
Well, IMHO, in case of C# we have very comfort UI, very nice hierarchy of libraries, and whole interface system of CLI. In case of C++ we have templates, ATL, COM, MFC and whole shebang of alreadyc written and running code like OpenGL, DirectX and so on... Developers complains of indeterminably risen GC calls in case of C# (means program runs fast, and in one second - bang! it's stuck).
To write code in C# very simple and fast (not to forget that also increase chance of errors.
In case of C++, developers complains of memory leaks, - means crushes, calls between DLLs, as well as of "DLL hell" - problem with support and replacement libraries by newer ones...
I think more skill you'll have in the programming language, the more quality (and speed) will characterize your software.
I would put it this way: programmers who write faster code, are the ones who are the more informed of what makes current machines go fast, and incidentally they are also the ones who use an appropriate tool that allows for precise low-level and deterministic optimisation techniques. For these reasons, these people are the ones who use C/C++ rather than C#. I would go as far as stating this as a fact.
Well, it depends. If the byte-code is translated into machine-code (and not just JIT) (I mean if you execute the program) and if your program uses many allocations/deallocations it could be faster because the GC algorithm just need one pass (theoretically) through the whole memory once, but normal malloc/realloc/free C/C++ calls causes an overhead on every call (call-overhead, data-structure overhead, cache misses ;) ).
So it is theoretically possible (also for other GC languages).
I don't really see the extreme disadvantage of not to be able to use metaprogramming with C# for the most applications, because the most programmers don't use it anyway.
Another big advantage is that the SQL, like the LINQ "extension", provides opportunities for the compiler to optimize calls to databases (in other words, the compiler could compile the whole LINQ to one "blob" binary where the called functions are inlined or for your use optimized, but I'm speculating here).
If I'm not mistaken, C# templates are determined at runtime. This must be slower than compile time templates of C++.
And when you take in all the other compile-time optimizations mentioned by so many others, as well as the lack of safety that does, indeed, mean more speed...
I'd say C++ is the obvious choice in terms of raw speed and minimum memory consumption. But this also translates into more time developing the code and ensuring you aren't leaking memory or causing any null pointer exceptions.
Verdict:
C#: Faster development, slower run
C++: Slow development, faster run.
There are some major differences between C# and C++ on the performance aspect:
C# is GC / heap based. The allocation and GC itself is overhead as the non locality of the memory access
C++ optimizer's have become very good over the years. JIT compilers cannot achieve the same level since they have only limited compilation time and don't see the global scope
Besides that programmer competence plays also a role. I have seen bad C++ code where classes where passed by value as argument all over the place. You can actually make the performance worse in C++ if you don't know what you are doing.
I found this April 2020 read: https://www.quora.com/Why-is-C-so-slow-compared-to-Python by a real-world programmer with 15+ years of Software Development experience.
It states that C# is slower usually because it is compiled to Common Intermediate Language (CIL) instead of machine code like C++. The CIL is then put through Common Language Runtime (CLR) which outputs machine code. However, if you keep executing C# it will take the output of the machine code and cache it so the machine code is saved for the next execution. All in all, C# can be faster if you execute multiple times since it is in machine code after multiple executions.
There is also comments that a good C++ programmer can do optimizations that can be time consuming that will in end be optimized.
> After all, the answers have to be somewhere, haven't they? :)
Umm, no.
As several replies noted, the question is under-specified in ways that invite questions in response, not answers. To take just one way:
the question conflates language with language implementation - this C program is both 2,194 times slower and 1.17 times faster than this C# program - we would have to ask you: Which language implementations?
And then which programs? Which machine? Which OS? Which data set?
It really depends on what you're trying to accomplish in your code. I've heard that it's just stuff of urban legend that there is any performance difference between VB.NET, C# and managed C++. However, I've found, at least in string comparisons, that managed C++ beats the pants off of C#, which in turn beats the pants off of VB.NET.
I've by no means done any exhaustive comparisons in algorithmic complexity between the languages. I'm also just using the default settings in each of the languages. In VB.NET I'm using settings to require declaration of variables, etc. Here is the code I'm using for managed C++: (As you can see, this code is quite simple). I'm running the same in the other languages in Visual Studio 2013 with .NET 4.6.2.
#include "stdafx.h"
using namespace System;
using namespace System::Diagnostics;
bool EqualMe(String^ first, String^ second)
{
return first->Equals(second);
}
int main(array<String ^> ^args)
{
Stopwatch^ sw = gcnew Stopwatch();
sw->Start();
for (int i = 0; i < 100000; i++)
{
EqualMe(L"one", L"two");
}
sw->Stop();
Console::WriteLine(sw->ElapsedTicks);
return 0;
}
One area that I was instrumenting code in C++ vs C# was in creating a database connection to SQL Server and returning a resultset. I compared C++ (Thin layer over ODBC) vs C# (ADO.NET SqlClient) and found that C++ was about 50% faster than the C# code. ADO.NET is supposed to be a low-level interface for dealing with the database. Where you see perhaps a bigger difference is in memory consumption rather than raw speed.
Another thing that makes C++ code faster is that you can tune the compiler options at a granular level, optimizing things in a way you can't in C#.
Inspired by this, I did a quick test with 60 percent of common instruction needed in most of the programs.
Here’s the C# code:
for (int i=0; i<1000; i++)
{
StreamReader str = new StreamReader("file.csv");
StreamWriter stw = new StreamWriter("examp.csv");
string strL = "";
while((strL = str.ReadLine()) != null)
{
ArrayList al = new ArrayList();
string[] strline = strL.Split(',');
al.AddRange(strline);
foreach(string str1 in strline)
{
stw.Write(str1 + ",");
}
stw.Write("\n");
}
str.Close();
stw.Close();
}
String array and arraylist are used purposely to include those instructions.
Here's the c++ code:
for (int i = 0; i<1000; i++)
{
std::fstream file("file.csv", ios::in);
if (!file.is_open())
{
std::cout << "File not found!\n";
return 1;
}
ofstream myfile;
myfile.open ("example.txt");
std::string csvLine;
while (std::getline(file, csvLine))
{
std::istringstream csvStream(csvLine);
std::vector csvColumn;
std::string csvElement;
while( std::getline(csvStream, csvElement, ‘,’) )
{
csvColumn.push_back(csvElement);
}
for (std::vector::iterator j = csvColumn.begin(); j != csvColumn.end(); ++j)
{
myfile << *j << ", ";
}
csvColumn.clear();
csvElement.clear();
csvLine.clear();
myfile << "\n";
}
myfile.close();
file.close();
}
The input file size I used was 40 KB.
And here's the result -
C++ code ran in 9 seconds.
C# code: 4 seconds!!!
Oh, but this was on Linux... With C# running on Mono... And C++ with g++.
OK, this is what I got on Windows – Visual Studio 2003:
C# code ran in 9 seconds.
C++ code – horrible 370 seconds!!!

Compile C#, so that it runs with the speed of C++

Alright, so I wanted to ask if it's actually possible to make a parser from c# to c++.
So that code written in C# would be able to run as fast as code written in C++.
Is it actually possible to do? I'm not asking how hard is it going to be.
What makes you think that translating your C# code to C++ would magically make it faster?
Languages don't have a speed. Assuming that C# code is slower (I'll get back to that), it is because of what that code does (including the implicit requirements placed by C#, such as bounds checking on arrays), and not because of the language it is written in.
If you converted your C# code to C++, it would still need to do bounds checking on arrays, because the original source code expected this to happen, so it would have to do just as much work.
Moreover, C# often isn't slower than C++. There are plenty of benchmarks floating around on the internet, generally showing that for the most part, C# is as fast as (or faster than) C++. Only when you spend a lot of time optimizing your code, does C++ become faster.
If you want faster code, you need to write code that requires less work to execute, not try to change the source language. That's just cargo-cult programming at its worst. You once saw some efficient code, and that was written in C++, so now you try to make things C++, in the hope of attracting any efficiency that might be passing by.
It just doesn't work that way.
Although you could translate C# code to C++, there would be the issue that C# depends on the .Net framework libraries which are not native, so you could not simply translate C# code to C++.
Update
Also C# code depends on the runtime to do things such as memory management i.e. Garbage Collection. If you translated the C# code to C++, where would the memory management code be? Parsing and translating is not going to fix issues like that.
The Mono project has invested quite a lot of energy in turning LLVM into a native machine code compiler for the C# runtime, although there are some problems with specific language constructs like shared generics etc.. Check it out and take it for a spin.
You can use NGen to compile IL to native code
Performance related tweaks:
Platform independent
use a profiler to spot the bottlenecks;
prevent unnecessary garbage (spot it using generation #0 collect count and the Large Object heap)
prevent unnecessary copying (use struct wisely)
prevent unwarranted generics (code-sharing has unexpected performance side effects)
prefer oldfashioned loops over enumerator blocks when performance is an issue
When using LINQ watch closely where you maintain/break deferred evaluation. Both can be enormous boosts to performance
use reflection.emit/Expression Trees to precompile certain dynamic logic that is performance bottleneck
Mono
use Mono --gc=sgen --optimize=inline,... (the SGEN garbage collector can make orders of magnitude difference). See also man mono for a lot of tuning/optimization options
use MONO_GENERIC_SHARING=none to disable sharing of generics (making particular tasks a lot quicker especially when supporting both valuetypes and reftypes) (not recommended for regular production use)
use the -optimize+ compile flag (optimizing the CLR code independently from what the JITter may do with that)
Less mainstream:
use the LLVM backend (click the quote:)
This allows Mono to benefit from all of the compiler optimizations done in LLVM. For example the SciMark score goes from 482 to 610.
use mkbundle to create a statically linked NATIVE binary image (already fully JITted, i.e. AOT (ahead-of-time compiled))
MS .NET
Most of the above have direct Microsoft pendants (NGen, `/Optimize' etc.)
Of course MS don't have a switchable/tunable garbage collector, and I don't think a fully compiled native binary can be achieved like with mono.
As always the answer to making code run faster is:
Find the bottleneck and optimize that
Most of the time the bottleneck is either:
time spend in a critical loop
Review your algorithm and datastructure, do not change the language, the latter will give a 10% speedup, the first will give you a 1000x speedup.
If you're stuck on the best algorithm, you can always ask a specific, short and detailed question on SO.
time waiting for resources for a slow source
Reduce the amount of stuff you're requesting from the source
instead of:
SELECT * FROM bigtable
do
SELECT TOP 10 * FROM bigtable ORDER BY xxx
The latter will return instantly and you cannot show a million records in a meaningful way anyhow.
Or you can have the server at the order end reduce the data so that it doesn't take a 100 years to cross the network.
Alternativly you can execute the slow datafetch routine in a separate thread, so the rest of your program can do meaningful stuff instead of waiting.
Time spend because you are overflowing memory with Gigabytes of data
Use a different algorithm that works on a smaller dataset at a time.
Try to optimize cache usage.
The answer to efficient coding is measure where your coding time goes
Use a profiler.
see: http://csharp-source.net/open-source/profilers
And optimize those parts that eat more than 50% of your CPU time.
Do this for a number of iterations, and soon your 10 hour running time will be down to a manageable 3 minutes, instead of the 9.5 hours that you will get from switching to this or that better language.

C# Performance - should I write the computation heavy methods in c++?

I am building a prototype for a quantitative library that does some signal analysis using image processing techniques. I built the initial prototype entirely in C#, but the performance is not as good as expected. Most of the computation is done through heavy matrix calculations, and these are taking up most of the time.
I am wondering if it is worth it to write a C++/CLI interface to unmanaged C++ code. Has anyone ever gone through this? Other suggestions for optimizing C# performance is welcome.
There was a time where it would definitely be better to write in C/C++, but the C# optimizer and JIT is so good now, that for pure math, there's probably no difference.
The difference comes when you have to deal with memory and possibly arrays. Even so, I'd still work with C# (or F#) and then optimize hotspots. The JIT is really good at optimizing away small, short-lived objects.
With arrays, you have to worry about C# doing bounds-checks on each access. Read this:
Link
Test it yourself -- I've been finding C# to be comparable -- sometimes faster.
It's hard to give a definitive answer here, but if performance is an issue I'd find a time-tested library with the performance you need and wrap it.
Something simple like multiplication or division is not much different between c++ and c# - the c++ compiler has an optimizer, and the CLR runtime has the on-demand JITer that does optimizations. So in theory, the c++ would outperform c# only on the first call.
However, theory and practice are not the same. With more complicated algorithms you also run into the differences between memory managers, and the maturity of the optimization techniques. If you want anecdotal evidence, you can find some math-heavy comparisons here.
Personally, I find doing the heavy computations in a native library and using c++/CLI to call it gives a good boost when the computations are the biggest bottleneck. As always, make sure that's the case before doing any optimization.
Matrix math is best done in native code in my opinion. Even the C++ libraries typically allow binding to a lower-level implementation like LAPACK.
There is a C# LAPACK port here (also C# BLAS on the same site) which you could try but I'd be surprised if this is faster than native code.
I've done a lot of image processing work in C# and, yes, I usually do use native code for heavy duty code where performance matters but I used just PInvokes and not the C++/CLI interface. A lot of time this is not needed, though.
There's quite a few good .NET profilers. The Red Gate one is my personal favorite. It might help you to visualize where the bottlenecks are.
The only reasonable language benchmark out there: http://shootout.alioth.debian.org/
See for yourself.
Performance for mathematical computation is pretty poor in C#. I was gobbsmacked to find how slow mathematical calculations are in C#. Just write a loop in C# and C++ having a few Multiplication, Sin, Cos, ... and the difference is immense.
I do not know Managed C++ but imlpementing it all in unmanaged C++ abd I would umagine exposing granular interfaces through P/Invoke should have little perfromance hit.
That is what I have done for heavy real-time image processing.
I built the initial prototype entirely in C#, but the performance is not as good as expected.
then you have two options:
Build another prototype in C++, and see how that compares, or optimize your C# code. No matter what language you write in, your code won't be fast until you've profiled and optimized and profiled and optimized it. That is especially true in C++. If you write the fastest possible implementation in C# and compare it to the fastest possible implementation in C++, then the C++ version will most likely be faster. But it will come at a cost in terms of development time. It's not trivial to write efficient C++ code. If you are new to the language then you will most likely write very inefficient code, especially if you are coming from C# or Java, where things are done differently and have different costs.
If you just write a working implementation, without worrying too much about performance, then I'm guessing that the C# version will probably be faster.
But it really depends on what kind of performance you're after (and not least, how expensive the operations you need to perform are. There's an overhead associated with the transition from managed to native code, so it is not worth it for short operations that are executed often.
Number-crunching code in C++ can be as fast as code written in Fortran (give or take a few percent), but to achieve that, you need to use a lot of advanced techniques (expression templates and lots of metaprogramming) or some fairly complex libraries which implement it for you.
Is that worth it? Or can C# be made fast enough for your needs?
You should write computational heavy programs in C++, you cannot reach anywhere near to C++ performance by optimizing C#. The overhead of calling wrappers is negligible assuming the computation takes considerable time. I have done coding in both C++ and C# and have never seen any occasion where .NET framework code come comparable to C++. There are a few instances where C# where runs better, but it was better because of lack of appropriate libraries or bad coding in C++. If you can write code equally well in C# and C++, I would write performance code in C++ and everything else is C#.
If x is the world best C++ programmer and y is the best C# programmer then most of the times x can write faster code than y. However, y can finish the coding faster than x most of the times.

Suitability of C# for clustered calculation-heavy apps?

I'm preparing to write a photonic simulation package that will run on a 128-node Linux and Windows cluster, with a Windows-based client for designing jobs (CAD-like) and submitting them to to the cluster.
Most of this is well-trod ground, but I'm curious how C# stacks up to C++ in terms of real number-crunching ability. I'm very comfortable with both languages, but I find the superior object model and framework support of C# with .NET or Mono incredibly enticing. However, I can't, with this application, sacrifice too much in processing power for the sake of developer preference.
Does anyone have any experience in this area? Are there any hard benchmarks available? I'd assume that the the final machine code would be optimized using the same techniques whether it comes from a C# or C++ source, especially since that typically takes place at the pcode/IL level.
The optimisation techniques employed by C# and native C++ are vastly different. C# compilers emit IL, which is only marginally optimised and then JIT'ed to binary code when it is about to execute for the first time. Most of the optimisation work happens inside the JIT compiler.
This has pros and cons. JIT has time budgets, which limits how much effort it can expend on optimisation. But it also has intimate knowledge of the hardware it is actually running on, so it can (in theory) make transparent use of newer CPU opcodes and detailed knowledge of performance data such as a pipeline hazards database.
In practice, I don't know how significant the latter is. I do know that at least Mono will parallelise some loops automatically if it finds itself running on a CPU with SSE (SSE2, perhaps?), which may be a big deal for your scenario.
I did a quick search and found this:
http://www.drdobbs.com/184401976;jsessionid=232QX0GU3C3KXQE1GHOSKH4ATMY32JVN
Edit: Bear in mind (on reading the article) that this was done 5 years ago so performance is likely to be better all round!
I have found these performance comparisons:
http://reverseblade.blogspot.com/2009/02/c-versus-c-versus-java-performance.html
http://systematicgaming.wordpress.com/2009/01/03/performance-c-vs-c/
http://journal.stuffwithstuff.com/2009/01/03/debunking-c-vs-c-performance/
http://www.csharphelp.com/2007/01/managed-c-versus-unmanaged-c/
And here is even a case study:
http://www.itu.dk/~sestoft/papers/numericperformance.pdf
Hope it helps.

How close can I get C# to the performance of C++ for small intensive tasks?

I was thinking about the speed difference of C++ to C# being mostly about C# compiling to byte-code that is taken in by the JIT compiler (is that correct?) and all the checks C# does.
I notice that it is possible to turn a lot of these functions off, both in the compile options, and possibly through using the unsafe keyword as unsafe code is not verifiable by the common language runtime.
Therefore if you were to write a simple console application in both languages, that flipped an imaginary coin an infinite number of times and displayed the results to the screen every 10,000 or so iterations, how much speed difference would there be? I chose this because it's a very simple program.
I'd like to test this but I don't know C++ or have the tools to compile it. This is my C# version though:
static void Main(string[] args)
{
unsafe
{
Random rnd = new Random();
int heads = 0, tails = 0;
while (true)
{
if (rnd.NextDouble() > 0.5)
heads++;
else
tails++;
if ((heads + tails) % 1000000 == 0)
Console.WriteLine("Heads: {0} Tails: {1}", heads, tails);
}
}
}
Is the difference enough to warrant deliberately compiling sections of code "unsafe" or into DLLs that do not have some of the compile options like overflow checking enabled? Or does it go the other way, where it would be beneficial to compile sections in C++? I'm sure interop speed comes into play too then.
To avoid subjectivity, I reiterate the specific parts of this question as:
Does C# have a performance boost from using unsafe code?
Do the compile options such as disabling overflow checking boost performance, and do they affect unsafe code?
Would the program above be faster in C++ or negligably different?
Is it worth compiling long intensive number-crunching tasks in a language such as C++ or using /unsafe for a bonus? Less subjectively, could I complete an intensive operation faster by doing this?
The example given is flawed because it does not show real life usage of both programming languages. Using simple datatypes to measure the speed of a language will not bring anything interesting. Instead, I suggest you create a template class in C++ and compare it with what is possible in C# for class generics. In the end, objects will bring some important results and you will see that C++ is faster than C#. Not to mention that you are comparing a lower level programming language with C#.
Does C# have a performance boost from
using unsafe code?
Yes, it will have a boost but it is not suggested that you write only code with unsafe. Here is why: Code written using an unsafe context cannot be verified to be safe, so it will be executed only when the code is fully trusted. In other words, unsafe code cannot be executed in an untrusted environment. For example, you cannot run unsafe code directly from the Internet. http://msdn.microsoft.com/en-us/library/aa288474(VS.71).aspx
Would the program above be faster in
C++ or negligably different?
Yes the program would be slightly faster in C++. C++ is a lower programming language and even faster if you start using the algorithm library (random_shuffle comes to mind).
Is it worth compiling long intensive
number-crunching tasks in a language
such as C++ or using /unsafe for a
bonus? Less subjectively, could I
complete an intensive operation faster
by doing this?
It depends on the project...
Up to more than 100% of speed - depends a lot on the task, simply said.
More than 100% - yes, because the just in time compiler knows your processor, and I doubt you actually optimize for your hardware platform ;)
No SSE is a problem if you do matrix operations.
FOr some things with tons of arrays (image manipulation) The array tests kill you, but pointers work (i.e. unsafe code) as they bypass this.
Regarding things like overflow checking - be carefull. As in: in C++ you have the same possibly. If you need overflow checking, the performance issue is not there ;)
I personally would not bother with C++ in most cases. Partially yes, especially when you can benefit from SSE:
So, at the end a lot depends on the NATURE if your calculations.

Categories