C# StreamWriter is faster than C++ ofstream (via P/Invoke)?

I'm just a beginner at C++, so I might be doing something wrong, but anyway: I created a C++ DLL and I call it from my WPF project.
C++ code:
#include <fstream>
using namespace std;
extern "C" __declspec(dllexport) double writeTxt()
{
    ofstream mf("c:\\cpp.txt");
    for (int i = 0; i < 999; i++)
    {
        mf << "xLine: \n";
    }
    mf.close();
    return 1;
}
Calling code from C#:
[DllImport(@"C:\Users\neo\Documents\visual studio 2010\Projects\TestDll\Debug\TestDll.dll",
    CallingConvention = CallingConvention.Cdecl)]
public static extern double writeTxt();
Now I'm trying to compare the execution time with this c# function:
double writeTxtCs()
{
    StreamWriter sw = new StreamWriter(@"c:\cs.txt");
    for (int i = 0; i < 999; i++)
    {
        sw.WriteLine("Line: " + i);
    }
    sw.Close();
    return 0;
}
but the C# function is about twice as fast as the C++ function.
Tested like this:
private void Window_Loaded(object sender, RoutedEventArgs e)
{
    long[] arr = new long[100];
    Stopwatch sw = new Stopwatch();
    for (int i = 0; i < 99; i++)
    {
        sw.Start();
        //double xxx = writeTxt();
        double xxx = writeTxtCs();
        arr[i] = sw.ElapsedMilliseconds;
        sw.Reset();
    }
    MessageBox.Show(arr.Average().ToString());
    Close();
}
When running the C# function I normally get ~0.65 ms, and when running the C++ function I get ~1.1 ms.
My question is: am I doing something wrong, or is C# really faster than C++ in this scenario?

All other answers have valid points. In addition to those:
You are testing against a "Debug" build of your C++ DLL, and that might be degrading C++ performance more than it affects C#'s performance. Try enabling optimizations on both and see how it works out for you.
Nevertheless I/O doesn't have much to do with the "language". It's more about runtime and the OS.

You're not testing C++ vs. C#. You're testing [C++ plus libraries] vs. [C# plus libraries].
In order to find out why ofstream is slower than StreamWriter you'd need to profile the code or look into the internals.
Anyway, single milliseconds are a very small amount of time for a computer. I'd repeat the test with 1000 times the load to make timing jitter irrelevant.
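A rough sketch of what a more robust version of the C# side of the test could look like, assuming you time many runs as a whole and read Stopwatch.Elapsed.TotalMilliseconds rather than the whole-millisecond ElapsedMilliseconds. WriteBenchmark, WriteLines and AverageMilliseconds are made-up names for illustration, not something from the question:

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class WriteBenchmark
{
    // Writes `lines` short lines to `path`, mirroring writeTxtCs from the question.
    public static void WriteLines(string path, int lines)
    {
        using (var sw = new StreamWriter(path))
        {
            for (int i = 0; i < lines; i++)
            {
                sw.WriteLine("Line: " + i);
            }
        }
    }

    // Returns the average time per call in milliseconds as a double,
    // so sub-millisecond results are not truncated to 0 or 1.
    public static double AverageMilliseconds(Action action, int runs)
    {
        action(); // warm-up run, excluded from the timing
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++)
        {
            action();
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds / runs;
    }
}
```

Something like WriteBenchmark.AverageMilliseconds(() => WriteBenchmark.WriteLines(path, 999000), 20) gives roughly 1000 times the load per call. The same harness could time the P/Invoke version by passing () => writeTxt(), once the C++ side is changed to write the same amount.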

It could be a buffering issue, i.e. the C# and C++ file writing guts might be buffering data in a different way, which would result in performance differences.
I recommend you use an operation which is purely CPU bound for benchmarking, rather than something IO bound (such as writing to a hard drive). For instance, see how fast each function can count from 0 to MAX_INT.

That's not really a reliable benchmark. You're doing file IO there, which is highly dependent on the implementation of ofstream and StreamWriter. Also, is it a problem?


C# are field reads guaranteed to be reliable (fresh) when using multithreading?

Background
My colleague thinks reads in multithreaded C# are reliable and will always give you the current, fresh value of a field, but I've always used locks because I was sure I'd experienced problems otherwise.
I spent some time googling and reading articles, but I mustn't be able to provide google with correct search input, because I didn't find exactly what I was after.
So I wrote the below program without locks in an attempt to prove why that's bad.
Question
Assuming the below is a valid test, the results show that the reads aren't reliable/fresh.
Can someone explain what this is caused by? (reordering, staleness or something else)?
And link me to official Microsoft documentation/section explaining why this happens and what is the recommended solution?
If the below isn't a valid test, what would be?
Program
There are two threads: one calls SetA and the other calls SetB. If the reads are unreliable without locks, then Foo's field "c" will intermittently be false.
using System;
using System.Threading.Tasks;
namespace SetASetBTestAB
{
    class Program
    {
        class Foo
        {
            public bool a;
            public bool b;
            public bool c;
            public void SetA()
            {
                a = true;
                TestAB();
            }
            public void SetB()
            {
                b = true;
                TestAB();
            }
            public void TestAB()
            {
                if (a && b)
                {
                    c = true;
                }
            }
        }
        static void Main(string[] args)
        {
            int timesCWasFalse = 0;
            for (int i = 0; i < 100000; i++)
            {
                var f = new Foo();
                var t1 = Task.Run(() => f.SetA());
                var t2 = Task.Run(() => f.SetB());
                Task.WaitAll(t1, t2);
                if (!f.c)
                {
                    timesCWasFalse++;
                }
            }
            Console.WriteLine($"timesCWasFalse: {timesCWasFalse}");
            Console.WriteLine("Finished. Press Enter to exit");
            Console.ReadLine();
        }
    }
}
Output
Release mode. Intel Core i7 6700HQ:
Run 1: timesCWasFalse: 8
Run 2: timesCWasFalse: 10
Of course it is not fresh. The average CPU nowadays has three levels of cache between each core's registers and the RAM, and it can take some time for a write made on one core to become visible to the others.
And then there is the JIT compiler. Part of its job is dead code detection, and one of the first things it will do is cut out "useless" variables. For example, this code tries to force an OOM exception by running into the 2 GiB limit on 32-bit systems:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace OOM_32_forced
{
    class Program
    {
        static void Main(string[] args)
        {
            //each short is 2 bytes, Int32.MaxValue is 2^31 - 1.
            //So this will require just under 2^32 bytes, or 4 GiB, well over the 2 GiB limit
            short[] Array = new short[Int32.MaxValue];
            /*need to actually access that array
            Otherwise JIT compiler and optimisations will just skip
            the array definition and creation */
            foreach (short value in Array)
                Console.WriteLine(value);
        }
    }
}
The thing is that if you cut out the output code, there is a decent chance that the JIT will remove the variable Array, including the instruction that instantiates it. The JIT has a decent chance of reducing this program to doing nothing at all at runtime.
volatile first of all prevents the JIT from applying such optimisations to that field. It might also have some effect on how the CPU processes accesses to it.
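For the test in the question, a minimal sketch of the lock-based version the asker mentions could look like the following. LockedFoo and LockedFooTest are hypothetical names. The assumption here is that both the writes and the check run under the same lock, so whichever thread enters second is guaranteed to see the other thread's write. As far as I know, marking the fields volatile alone would not be enough for this particular pattern, because a write followed by a read of a different field can still be reordered; a lock (or a full fence such as Thread.MemoryBarrier between the write and the read) avoids that.

```csharp
using System.Threading.Tasks;

public class LockedFoo
{
    private readonly object _gate = new object();
    private bool _a, _b, _c;

    // Reads c under the same lock that protects the writes.
    public bool C { get { lock (_gate) { return _c; } } }

    public void SetA() { lock (_gate) { _a = true; TestAB(); } }
    public void SetB() { lock (_gate) { _b = true; TestAB(); } }

    // Only called while _gate is held.
    private void TestAB()
    {
        if (_a && _b)
        {
            _c = true;
        }
    }
}

public static class LockedFooTest
{
    // Same loop as in the question, but against the locked class.
    public static int CountTimesCWasFalse(int iterations)
    {
        int timesCWasFalse = 0;
        for (int i = 0; i < iterations; i++)
        {
            var f = new LockedFoo();
            var t1 = Task.Run(() => f.SetA());
            var t2 = Task.Run(() => f.SetB());
            Task.WaitAll(t1, t2);
            if (!f.C)
            {
                timesCWasFalse++;
            }
        }
        return timesCWasFalse;
    }
}
```

With the lock in place, LockedFooTest.CountTimesCWasFalse(100000) should report 0, at the cost of the two threads briefly serializing on the lock.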

How to use Intel's RDRAND using inline assembly with .Net

I'm using an Intel Ivy Bridge CPU and want to use the RDRAND opcode (https://software.intel.com/en-us/articles/intel-digital-random-number-generator-drng-software-implementation-guide) in C#.
How can I call this CPU instruction via C#? I've seen an example of executing assembly code from c# here: x86/x64 CPUID in C#
But I'm not sure how to use it for RDRAND. The code doesn't need to check whether the CPU executing the code supports the instruction or not.
I've seen this C++ example of executing assembly byte code coming from drng_samples of Intel:
int rdrand32_step (uint32_t *rand)
{
    unsigned char ok;
    /* rdrand eax (0x0f 0xc7 0xf0 encodes RDRAND with EAX as destination) */
    asm volatile(".byte 0x0f,0xc7,0xf0; setc %1"
        : "=a" (*rand), "=qm" (ok)
        :
        : "edx"
    );
    return ok;
}
How can the example of executing assembly code in C# be combined with the C++ code coming from the Intel drng sample code?
There are answers out on SO that will generate (unmanaged) assembly code at runtime for managed code to call back into. That's all very interesting, but I propose that you simply use C++/CLI for this purpose, because it was designed to simplify interop scenarios. Create a new Visual C++ CLR class library and give it a single rdrandwrapper.cpp:
#include <immintrin.h>
using namespace System;
namespace RdRandWrapper {
#pragma managed(push, off)
    bool getRdRand(unsigned int* pv) {
        const int max_rdrand_tries = 10;
        for (int i = 0; i < max_rdrand_tries; ++i) {
            if (_rdrand32_step(pv)) return true;
        }
        return false;
    }
#pragma managed(pop)
    public ref class RandomGeneratorError : Exception
    {
    public:
        RandomGeneratorError() : Exception() {}
        RandomGeneratorError(String^ message) : Exception(message) {}
    };
    public ref class RdRandom
    {
    public:
        int Next() {
            unsigned int v;
            if (!getRdRand(&v)) {
                throw gcnew RandomGeneratorError("Failed to get hardware RNG number.");
            }
            return v & 0x7fffffff;
        }
    };
}
This is a very bare-bones implementation that just tries to mimic Random.Next in getting a single non-negative random integer. Per the question, it does not attempt to verify that RDRAND is actually available on the CPU, but it does handle the case where the instruction is present but fails to work. (This "cannot happen" on current hardware unless it's broken, as detailed here.)
The resulting assembly is a mixed assembly that can be consumed by managed C# code. Make sure to compile your assembly as either x86 or x64, same as your unmanaged code (by default, projects are set to compile as "Any CPU", which will not work correctly since the unmanaged code has only one particular bitness).
using System;
using RdRandWrapper;
class Program {
    static void Main(string[] args) {
        var r = new RdRandom();
        for (int i = 0; i != 10; ++i) {
            Console.WriteLine(r.Next());
        }
    }
}
I make no claims as to performance, but it's probably not great. If you wanted to get many random values this way, you would probably want a Next(int[] values) overload to get many random values in one call, to reduce the overhead of interop.

Free memory of a local array of strings in a method (C#)

I have recently encountered a problem with the memory my program uses. The cause is an array of strings I use in a method. More specifically, the program reads an integer array from an external file. Here is my code:
class Program
{
    static void Main(string[] args)
    {
        int[] a = loadData();
        for (int i = 0; i < a.Length; i++)
        {
            Console.WriteLine(a[i]);
        }
        Console.ReadKey();
    }
    private static int[] loadData()
    {
        string[] lines = System.IO.File.ReadAllLines(@"F:\data.txt");
        int[] a = new int[lines.Length];
        for (int i = 0; i < lines.Length; i++)
        {
            string[] temp = lines[i].Split(new char[]{','}, StringSplitOptions.RemoveEmptyEntries);
            a[i] = Convert.ToInt32(temp[0]);
        }
        return a;
    }
}
The file data.txt is about 7.4 MB and has 574285 lines. But when I run the program, the memory shown in Task Manager is 41.6 MB. It seems that the memory of the string array I read in loadData() (string[] lines) is not freed. How can I free it, since it is never used later?
You can call GC.Collect() after setting lines to null, but I suggest you look at all answers here, here and here. Calling GC.Collect() is something that you rarely want to do. The purpose of using a language such as C# is that it manages the memory for you. If you want granular control over the memory read in, then you could create a C++ dll and call into that from your C# code.
Instead of reading the entire file into a string array, you could read it line by line and perform the operations that you need to on that line. That would probably be more efficient as well.
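A minimal sketch of the line-by-line approach, assuming .NET 4 or later so that File.ReadLines is available (it enumerates the file lazily, whereas ReadAllLines loads every line up front). DataLoader is a made-up class name, and skipping empty lines is my own assumption; the original code would throw on them.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DataLoader
{
    // Reads the first comma-separated value of each line as an int,
    // holding only one line of text in memory at a time.
    public static int[] LoadData(string path)
    {
        var values = new List<int>();
        foreach (string line in File.ReadLines(path))
        {
            string[] parts = line.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
            if (parts.Length == 0)
            {
                continue; // assumption: ignore empty lines
            }
            values.Add(Convert.ToInt32(parts[0]));
        }
        return values.ToArray();
    }
}
```

Usage would be int[] a = DataLoader.LoadData(path); with the path from the question. The List<int> still grows and gets copied once at the end, so if the line count is known in advance a preallocated int[] would avoid that.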
What problem does the 40MB of used memory cause? How often do you read the data? Would it be worth caching it for future use (assuming the 7MB is tolerable).

Method with just one line will hit performance?

In my code a ValidateRequest method is defined:
private bool ValidateRequest()
{
    return _doc != null;
}
This method is called everywhere I want to check whether _doc is null. It is used 5 times in one .cs file.
From a performance point of view, is it advisable to define a method with just one line? I think that before calling this method everything from the caller will be pushed onto the stack, and afterwards it will be popped from the stack.
Any thoughts?
=== Edit ====
I am using .NET version 3.5
Don't bother with it. The JIT compiler will probably inline the method, as the corresponding IL is quite short.
If that method helps with code maintainability because it communicates intention, go on with it.
It's highly unlikely that moving a single line into a method will have a significant impact on your application. It's actually quite possible that this will have no impact as the JIT could choose to inline such a function call. I would definitely opt for keeping the check in a separate method unless a profiler specifically showed it to be a problem.
Focus on writing code that is clear and well abstracted. Let the profiler guide you to the real performance problems.
As always: when you have doubts, benchmark!
And when you benchmark, do it in release mode, otherwise you're not benchmarking with compiler optimizations.
After that, if it does indeed impact performance, you can inline it with NGen.
This SO post talks about it.
OK, so this is just from LINQPad, and I guess not a definitive answer, but the following code produced a minuscule per-iteration discrepancy (00:00:00.7360736 vs 00:00:00.0740074; note that the second loop runs ten times fewer iterations):
void Main()
{
    var starttime = DateTime.Now;
    for (var i = 0; i < 1000000000; i++)
    {
        if (ValidateRequest()) continue;
    }
    var endtime = DateTime.Now;
    Console.WriteLine(endtime.Subtract(starttime));
    starttime = DateTime.Now;
    // 10x fewer iterations than the first loop, so the per-iteration
    // cost is about the same (~0.74 ns in both cases)
    for (var i = 0; i < 100000000; i++)
    {
        if (_doc != null) continue;
    }
    endtime = DateTime.Now;
    Console.WriteLine(endtime.Subtract(starttime));
}
private object _doc = null;
private bool ValidateRequest()
{
    return _doc != null;
}

Simple method call is really slow?

Edit: I've resolved my problem. The cause was an error in testing procedure and will be detailed once I'm allowed to answer my own question.
I know this type of question should generally be avoided, but I've come across a really strange situation that I can't make sense of. I've been trying to implement a PRNG, and I've been testing its performance against System.Random. I found that my code was ~50 times slower, but it wasn't the algorithm that was the problem, but just calling the method. Even if I just returned a constant, it would still be many times slower.
So I wrote a simple test program that compares calling a method that wraps random.NextDouble(), a method that returns -1, and calling random.NextDouble() directly. I ran my test in Ideone, and it gave the expected results; all the times were similar, and returning a constant was fastest. The times were all around 0.1 seconds.
However, the same code compiled in Visual Studio 2011 Beta or 2010 C# Express would result in 4 seconds, 4 seconds, and 0.1 seconds, for each case respectively. I'm definitely running in release mode, the optimize code checkbox is ticked, and launching from outside Visual Studio gives the same results. So why are such simple method calls so much slower in Visual Studio than Ideone? Here's the code I used to benchmark:
using System;
using System.Diagnostics;
public class Test{
    static Random random = new Random();
    public static Double Random() {
        return random.NextDouble();
    }
    public static Double Random2() {
        return -1;
    }
    public static void Main() {
        {
            Stopwatch s = new Stopwatch();
            Double a = 0;
            s.Start();
            for (Int32 i = 0; i < 5000000; i++)
                a += Random();
            s.Stop();
            Console.WriteLine(s.ElapsedMilliseconds);
        }
        {
            Stopwatch s = new Stopwatch();
            Double a = 0;
            s.Start();
            for (Int32 i = 0; i < 5000000; i++)
                a += Random2();
            s.Stop();
            Console.WriteLine(s.ElapsedMilliseconds);
        }
        {
            Stopwatch s = new Stopwatch();
            Double a = 0;
            s.Start();
            for (Int32 i = 0; i < 5000000; i++)
                a += random.NextDouble();
            s.Stop();
            Console.WriteLine(s.ElapsedMilliseconds);
        }
    }
}
You shouldn't measure the first call to Random() and Random2(). The first time a function is called, it has to be compiled by the JIT compiler (the jitter). Instead, call Random() and Random2() once, then start measuring. random.NextDouble() was already precompiled when .NET was installed, so it doesn't suffer from the same problem.
I don't believe this will explain all the difference, but it should level the playing field.
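A small sketch of the warm-up idea, with hypothetical names (WarmBenchmark, Wrapped, Constant, TimeMilliseconds). One caveat: passing the method as a Func<double> adds a delegate call and prevents inlining, so this levels the field between the cases but does not measure the raw call cost.

```csharp
using System;
using System.Diagnostics;

public static class WarmBenchmark
{
    static readonly Random random = new Random();

    public static double Wrapped() { return random.NextDouble(); }
    public static double Constant() { return -1; }

    // Calls f once before timing so JIT compilation is not measured,
    // then times `iterations` calls. The sum is returned through `sum`
    // so the loop has an observable result.
    public static long TimeMilliseconds(Func<double> f, int iterations, out double sum)
    {
        f(); // warm-up call, excluded from the timing
        sum = 0;
        var s = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sum += f();
        }
        s.Stop();
        return s.ElapsedMilliseconds;
    }
}
```

For example, WarmBenchmark.TimeMilliseconds(WarmBenchmark.Wrapped, 5000000, out sum) and the same call with WarmBenchmark.Constant can then be compared on equal terms.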
