How to make CUDA dll that can be used in C# application?

A brief tutorial would be more helpful than a few words.
My CUDA application works as I want. The problem now is how to expose the CUDA code to C#, since I would like to write the front end and everything else in C#.
From this link:
http://www.codeproject.com/Articles/9826/How-to-create-a-DLL-library-in-C-and-then-use-it-w
I know how to make a library in C that can be imported into a C# application as a Win32 DLL.
But my question is: how do I build a CUDA application as a DLL (or some other kind of library) that can be shipped with, and called from, a C# application?
It would be good if there were a tutorial for CUDA somewhere, like the one linked above for using a C library from a C# app.
I am using Windows 7 64-bit, Visual Studio 2010 Ultimate, CUDA Toolkit 5.0 and Nsight 2.2.012313.

ManagedCUDA is perfect for this type of thing. First you need to follow the instructions in the documentation to set up your Visual Studio Project.
Here is an example of a solution:
test.cu (compiles to test.ptx)
#if !defined(__CUDACC__)
#define __CUDACC__
#include <host_config.h>
#include <device_launch_parameters.h>
#include <device_functions.h>
#include <math_functions.h>
#endif
extern "C"
{
    __global__ void test(float * data)
    {
        float a = data[0];
        float b = data[1];
        float c = data[2];
        data[0] = max(a, max(b, c));
    }
}
and here is the C# code:
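using ManagedCuda;
using ManagedCuda.BasicTypes;   // CUmodule
using ManagedCuda.VectorTypes;  // dim3

private static void Test()
{
    using (CudaContext ctx = new CudaContext())
    {
        // Allocate three floats on the device (left uninitialized here).
        CudaDeviceVariable<float> d = new CudaDeviceVariable<float>(3);
        // Load the PTX produced from test.cu and look up the kernel by name.
        CUmodule module = ctx.LoadModulePTX("test.ptx");
        CudaKernel kernel = new CudaKernel("test", module, ctx)
        {
            GridDimensions = new dim3(1, 1),
            BlockDimensions = new dim3(1, 1)
        };
        // Launch with the device pointer as the kernel's only argument.
        kernel.Run(d.DevicePointer);
    }
}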
This is just a proof of concept: the device memory is not even initialized and the result is never read back, but it is enough to illustrate the approach.
There are several ways to distribute your application. In this case I opted to compile the .cu file to PTX and load it from the file system in the C# project.
You could also embed the PTX directly in your C# application as a resource.
Alternatively, you could compile to a cubin and load or embed that instead of the PTX.
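Once the buffer is actually filled and the result is copied back to the host (both steps are left out of the proof of concept), it helps to have a CPU reference to compare the kernel output against. The following is a minimal sketch in plain C++ that mirrors what the test kernel computes. The name test_reference is hypothetical, and std::fmax is used on the assumption that the device-side max on floats behaves like fmaxf.

```cpp
#include <cmath>

// Hypothetical host-side reference for the `test` kernel:
// writes the largest of the first three elements into data[0].
// Assumes `data` points to at least three floats.
extern "C" void test_reference(float* data)
{
    const float a = data[0];
    const float b = data[1];
    const float c = data[2];
    data[0] = std::fmax(a, std::fmax(b, c));
}
```

Running the same three input values through both the kernel and this function, then comparing data[0], is a quick way to confirm that the PTX was loaded and launched correctly.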

Related

How can I create and use a dynamic library ".so" from CUDA C++ and use it inside C# code under Linux environment (CentOS)?

I am trying to create a dynamic library (.so) from a CUDA C++ kernel and use it inside C# code under Linux (CentOS). I searched for a way to do this, but unfortunately did not find a complete, clear solution. Some solutions cover only part of it, such as creating a C++ shared library on Linux, or creating a chain of libraries in CUDA using nvcc, but none described creating a dynamic library from CUDA C++. Using a .so created from C++ seemed possible, as in this solution.
Is there a way to create this dynamic library and use it successfully inside C# code?

After searching through several different solutions and collecting and testing the available possibilities, I finally arrived at this simple method.
A plain C++ library can be built with gcc in one step, as in this answer.
gcc -shared -o dll.so -fPIC dllmain.cpp
Make sure to add extern "C" before each exported function in the .cpp file, like this:
#include <stdio.h>
extern "C" void func()
{
    // code
}
For CUDA C++, nvcc can be used in much the same way, combining this answer and this answer. Make sure to use .so instead of .dll and to pick the proper device architecture; I used 60 here because I am using a "Tesla P100-PCIE-16GB".
nvcc -arch=sm_60 --compiler-options '-fPIC' -o dll.so --shared kernel.cu
The .cu file will be similar to this.
#include "cuda_runtime.h"
#include "device_launch_parameters.h"
#include <stdio.h>
extern "C" void myfunc(int a, int b, ...);
__global__ void kernel(int a, int b, ...);
__global__ void kernel(int a, int b, ...)
{
    int i = threadIdx.x;
    // kernel code
}
void myfunc(int a, int b, ...)
{
    // code
}
Now the dynamic library .so is created and can be used inside C# code like this.
using System;
using System.Runtime.InteropServices;
class Program
{
    // The return type must match the native declaration (void myfunc).
    [DllImport("dll.so")]
    static extern void myfunc(int a, int b, ...);
    private void Method()
    {
        int a, b;
        // code that assigns a and b
        myfunc(a, b, ...);
    }
}
The C# code then is compiled using Mono.
mcs Program.cs
mono Program.exe
You will probably also need to add the library's directory to the loader search path, like this:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/library/
This worked for a simple piece of CUDA C++ code. It will likely work for others too, but problems may arise depending on their complexity.
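Because the .so is consumed through P/Invoke, the exported function is easiest to call when it sticks to a plain C interface: built-in types and pointers as parameters, no C++ exceptions escaping, and a return code the C# side can check. The sketch below illustrates that shape in plain C++. The function scale_array and its status codes are hypothetical, and the loop only stands in for the device allocation, copy, kernel launch and copy-back that a real .cu file would contain.

```cpp
// Hypothetical status codes returned to the managed caller.
enum : int { STATUS_OK = 0, STATUS_BAD_ARGUMENT = 1, STATUS_FAILURE = 2 };

// Exported with C linkage so the symbol name is not mangled and
// [DllImport] can find it by name.
extern "C" int scale_array(float* data, int count, float factor)
{
    if (data == nullptr || count < 0) {
        return STATUS_BAD_ARGUMENT;
    }
    try {
        // Placeholder for the GPU work: allocate device memory, copy
        // `data` to the device, launch the kernel, copy the result back.
        for (int i = 0; i < count; ++i) {
            data[i] *= factor;
        }
    } catch (...) {
        // Do not let a C++ exception cross into managed code.
        return STATUS_FAILURE;
    }
    return STATUS_OK;
}
```

On the C# side, a matching declaration would presumably look like static extern int scale_array(float[] data, int count, float factor); since float arrays are passed to native code as a pointer to the first element.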

Use C# Dll in Java Android Project

I would like to use a C# DLL in an Android project that is developed in Java (not Xamarin or Mono).
My first strategy consists of these steps:
Export C++-callable functions from the C# code using DllExport (in effect, a C++ wrapper for the C# code).
Create a .so library from a C++ project that uses the DLL generated in the previous step.
So I used DllExport (https://github.com/3F/DllExport).
C# Side:
namespace NumbersLibCS
{
    public class Math
    {
        public Math()
        {
        }
        [DllExport]
        public static int GetRandomEvenNumber()
        {
            var rnd = new Random();
            int n = rnd.Next(1, 1000) * 2;
            return n;
        }
    }
}
But to use the generated DLL from C++, I think I have to rely on windows.h, which is not available on Android:
#include <windows.h>
typedef int(__cdecl *GetRandomEvenNumber)();
int main(){
    HMODULE lib = LoadLibrary("mycsharp.dll");
    auto getRndNumber = (GetRandomEvenNumber)GetProcAddress(lib, "GetRandomEvenNumber");
    int c = getRndNumber();
}
Now here are my questions:
How can I use the DLL generated by DllExport without windows.h?
Can I use .NET Core 2.0 to achieve this without using DllExport?

Boost.Interprocess v1.66 - get_bootstamp segfault with C#

I have a problem with the Boost.Interprocess (v1.66) library, which I use in my C/C++ library; that library is in turn called from C# through marshalling (calling native C code from C#).
I found the problem while using a Boost.Interprocess named_semaphore (in open_or_create mode) for synchronization between processes.
If I use my C/C++ library from other native C/C++ code, everything works fine (on the newest Windows 10, on Linux (kernel 4+) and even on Mac OS X (>= 10.11)).
The problem occurs on Windows. For C#, I have a C wrapper around the C++ code. If I use marshalling from a simple EXE that I built myself, everything works. But if I use the same C# code (with the same C library) as a DLL plugin inside a third-party application, I get a segfault from get_bootstamp in named_semaphore.
In other words, I write plugins (C# DLLs) for third-party C# software, and in the plugin I call my C library through marshalling. Marshalling works fine in a test C# project (which just calls C functions from the C library), but the same code segfaults inside the third-party software.
C Library workflow:
Init all necessary C structures
Start desired TCP server (native C/C++ app) using Boost.Process
Wait for server (through named_semaphore) <-- segfault
Connect to the server...
The C# code follows the same workflow.
Found the problem
The problem occurs in boost::interprocess::ipcdetail::get_bootstamp (which is called in named_semaphore), here:
struct windows_bootstamp
{
    windows_bootstamp()
    {
        //Throw if bootstamp not available
        if(!winapi::get_last_bootup_time(stamp)){
            error_info err = system_error_code();
            throw interprocess_exception(err);
        }
    }
    //Use std::string. Even if this will be constructed in shared memory, all
    //modules/dlls are from this process so internal raw pointers to heap are always valid
    std::string stamp;
};
inline void get_bootstamp(std::string &s, bool add = false)
{
    const windows_bootstamp &bootstamp = windows_intermodule_singleton<windows_bootstamp>::get();
    if(add){
        s += bootstamp.stamp;
    }
    else{
        s = bootstamp.stamp;
    }
}
If I debug up to the line
const windows_bootstamp &bootstamp = windows_intermodule_singleton<windows_bootstamp>::get()
bootstamp.stamp is not readable. The size is set to 31, the capacity is set to some weird value (like 19452345) and the data is not readable. If I step over to
s += bootstamp.stamp;
the segfault occurs.
Found the reason
I debugged once more with a breakpoint at the entry of the windows_bootstamp constructor and it was never hit, so I guess the stamp is never initialized.
Confirmation
If I change get_bootstamp to
inline void get_bootstamp(std::string &s, bool add = false)
{
    const windows_bootstamp &bootstamp = windows_intermodule_singleton<windows_bootstamp>::get();
    std::string stamp;
    winapi::get_last_bootup_time(stamp);
    if(add){
        s += stamp;
    }
    else{
        s = stamp;
    }
}
and then recompile my library and the EXE, everything works without any problem.
My question is: what am I doing wrong? I read the Boost.Interprocess documentation thoroughly, but it contains no advice or warnings about this problem (there is a section on "COM Initialization", but it does not seem helpful here).
Or is this simply a bug in Boost.Interprocess that I should report to the Boost bug tracker?
Note: if I start the server manually (before running the C# code), it works without segfaults.

Calling native Win32 code from .NET (C++/CLI and C#)

I'm developing an app in C#, and I want to use DirectX (mostly Direct2D) for its graphical component. So I'm trying to use C++/CLI as an intermediary layer between the native C++ code and the managed C# code. So far I have 3 projects in my solution: a C# project (which I won't really discuss since it's not giving me any problems yet), a C++ static library that includes Windows.h, and a dynamic C++/CLI library that's intended to marshal information between the other two projects. Here is my code so far:
In the native C++ project, I have a class named RenderWindowImpl that so far only contains 2 methods:
// RenderWindowImpl.h
#pragma once
#include <Windows.h>
class RenderWindowImpl final
{
public:
    RenderWindowImpl() = default;
    ~RenderWindowImpl() = default;
    int test();
private:
    static void InitializeWin32Class();
};
// RenderWindowImpl.cpp
#include "RenderWindowImpl.h"
int RenderWindowImpl::test()
{
    return 5;
}
void RenderWindowImpl::InitializeWin32Class()
{
    WNDCLASSEXW wc = { 0 };
    wc.cbSize = sizeof(WNDCLASSEXW);
    wc.style = CS_HREDRAW | CS_VREDRAW;
    wc.lpfnWndProc = nullptr;
    wc.hInstance = GetModuleHandleW(0);
    wc.hCursor = LoadCursorW(nullptr, IDC_ARROW);
    //wc.lpszClassName = L"wz.1RenderWindowImpl";
    //// TODO: error check
    //RegisterClassExW(&wc);
}
And in my C++/CLI project, I have a class named RenderWindow that acts as a wrapper around RenderWindowImpl:
// wzRenderWindow.h
#pragma once
//#pragma managed(push, off)
#include "RenderWindowImpl.h"
//#pragma managed(pop)
using namespace System;
namespace wzRenderWindow {
    public ref class RenderWindow sealed
    {
    public:
        RenderWindow();
        ~RenderWindow();
        int test();
    private:
        RenderWindowImpl* impl;
    };
}
// wzRenderWindow.cpp
#include "stdafx.h"
#include "wzRenderWindow.h"
wzRenderWindow::RenderWindow::RenderWindow()
{
    // Initialize unmanaged resource
    impl = new RenderWindowImpl();
    try
    {
        // Any factory logic can go here
    }
    catch (...)
    {
        // Catch any exception and avoid memory leak
        delete impl;
        throw;
    }
}
wzRenderWindow::RenderWindow::~RenderWindow()
{
    // Delete unmanaged resource
    delete impl;
}
int wzRenderWindow::RenderWindow::test()
{
    return impl->test();
}
When I compile my project, I get the following warnings and errors:
Error LNK1120 1 unresolved externals wzRenderWindow d:\documents\visual studio 2015\Projects\WizEngCS\Debug\wzRenderWindow.dll 1
Warning LNK4075 ignoring '/EDITANDCONTINUE' due to '/OPT:LBR' specification wzRenderWindow d:\documents\visual studio 2015\Projects\WizEngCS\wzRenderWindow\wzRenderWindowImpl.lib(RenderWindowImpl.obj) 1
Error LNK2019 unresolved external symbol __imp__LoadCursorW@8 referenced in function "private: static void __cdecl RenderWindowImpl::InitializeWin32Class(void)" (?InitializeWin32Class@RenderWindowImpl@@CAXXZ) wzRenderWindow d:\documents\visual studio 2015\Projects\WizEngCS\wzRenderWindow\wzRenderWindowImpl.lib(RenderWindowImpl.obj) 1
It seems to be the call to LoadCursorW that C++/CLI doesn't like, as the code compiles fine if I comment out that line. With the Win32 function calls removed, I was able to successfully call RenderWindow::test() from a C# application, outputting the expected result of 5.
I'm at a bit of a loss, because my understanding of C++/CLI is that it's very good at wrapping native C++ classes for consumption by managed .NET applications. I would really like to understand why my code is not compiling.
As a related follow-up question, am I barking up the wrong tree here? What's the conventional way to access DirectX methods (or similar COM-based C/C++ libraries) from .NET? I'd like to avoid using 3rd-party wrapper libraries like SharpDX.

I fixed the problem by putting #pragma comment(lib, "User32.lib") at the top of my RenderWindowImpl.cpp. Thanks to @andlabs for the fix. I'm not sure why this fixed the problem (I've never needed to explicitly link to user32.lib in any of my previous projects).

Unmanaged C++ wrapper for C# usage

SOLVED: Thanks to Casey Price for their answer. I then ran into two other errors, BadImageFormatException and FileNotFoundException. The former was solved by matching the platform target (x64 or x86) across all projects, and the latter by setting the output directory of the C# project to the directory containing the DLL file.
I'm working on a game 'engine' which currently has a working graphics subsystem that draws/textures movable models. I'm trying to write a C++/CLI wrapper so I can use the engine in a C# program (as a designer tool).
My wrapper project is a C++/CLI class library and contains the following 2 files (as well as resource.h/cpp and Stdafx.h/cpp):
// pEngineWrapper.h
#pragma once
#define null NULL
#include "..\pEngine\pEntry.h"
using namespace System;
namespace pEngineWrapper
{
    public ref class EngineWrapper
    {
    public:
        EngineWrapper();
        ~EngineWrapper();
        bool Initialise();
    private:
        pEntry* engine;
    };
}
and the .cpp file
// This is the main DLL file.
#include "stdafx.h"
#include "pEngineWrapper.h"
pEngineWrapper::EngineWrapper::EngineWrapper()
{
    engine = null;
}
pEngineWrapper::EngineWrapper::~EngineWrapper()
{
    delete engine;
    engine = null;
}
bool pEngineWrapper::EngineWrapper::Initialise()
{
    bool result;
    engine = new pEntry;
    result = engine->Initialise();
    if( result == false )
    {
        return false;
    }
    return true;
}
When I build this project, however, I get 14 errors (LNK2028, LNK2019 and LNK2001) that point to some classes within the engine. I have included the errors in the file below.
https://www.dropbox.com/s/ewhaas8d1te7bh3/error.txt?dl=0
I also get a lot of warnings regarding XMFLOAT/XMMATRIX which you may notice.
In all of the engine classes I use the dllexport attribute
class __declspec(dllexport) pEntry
Seeing all of these errors, I feel like I'm missing the point and doing it all wrong, but I haven't found any documentation describing an approach considerably different from what I'm doing here.

You have to link against the .lib files of the game engine you are using, in addition to shipping its DLLs (or else load the DLLs manually).
To add the .lib files, right-click your project -> Properties -> Linker -> Input and add each .lib file to Additional Dependencies.
See also: DLL References in Visual C++
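A related convention is worth noting: the engine headers in the question mark classes __declspec(dllexport) unconditionally, so the wrapper project also sees dllexport when it includes them. Many projects instead switch between dllexport and dllimport with a macro. The sketch below shows that pattern under stated assumptions: PENGINE_API, PENGINE_EXPORTS and pEntryExample are hypothetical names, and this does not replace adding the .lib file to the linker input as described above.

```cpp
// Hypothetical pEngine header fragment.
// PENGINE_EXPORTS would normally be defined only in the engine project's
// build settings; it is defined here so this single-file sketch compiles.
#define PENGINE_EXPORTS

#if defined(_WIN32)
  #if defined(PENGINE_EXPORTS)
    #define PENGINE_API __declspec(dllexport)  // building the engine DLL
  #else
    #define PENGINE_API __declspec(dllimport)  // consuming the engine DLL
  #endif
#else
  #define PENGINE_API  // no decoration needed on other platforms
#endif

// Hypothetical stand-in for an engine class such as pEntry.
class PENGINE_API pEntryExample
{
public:
    bool Initialise();
};

// The definition lives in the engine DLL, not in the wrapper.
bool pEntryExample::Initialise()
{
    return true;
}
```

With this layout the wrapper project simply includes the header without defining PENGINE_EXPORTS, so the class is seen as imported, and the linker resolves it through the engine's .lib file.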
