I have a C (not C++) library that consistently uses the first parameter of functions as a context object (let's call the type t_context), and I'd like to use SWIG to generate C# wrappers that keep this style of call (i.e. instead of the functions being more or less isolated, wrap them as methods in some class and access the t_context via a reference from the this object within the methods).
Example (C signature):
void my_lib_function(t_context *ctx, int some_param);
Desired C# API:
class Context
{
// SWIG generated struct reference
private SWIG_t_context_ptr ctx;
public void my_lib_function(int some_param)
{
// call SWIG generated my_lib_function with ctx
}
}
I'd also be happy if someone points out to me a SWIG generated wrapper for an existing C (again: not C++) library that uses this API style; I could not find anything.
Alternatively, are there wrapper generators for the C to C# use case other than SWIG that offer more control over the API (perhaps by exposing the templates used for code generation)?
In order to work through this problem I've created the following mini header-file to demonstrate all the pieces we (probably) care about to do this for real. My goals in doing this are:
C# users shouldn't even realise there's anything non-OO happening here.
The maintainer of your SWIG module shouldn't have to echo everything and write lots of proxy functions by hand if possible.
To kick things off I wrote the following header file, test.h:
#ifndef TEST_H
#define TEST_H
struct context;
typedef struct context context_t;
void init_context(context_t **new);
void fini_context(context_t *new);
void context_func1(context_t *ctx, int arg1);
void context_func2(context_t *ctx, const char *arg1, double arg2);
#endif
And a corresponding test.c with some stub implementations:
#include <stdlib.h>
#include "test.h"
struct context { int unused; }; /* placeholder member: ISO C does not allow an empty struct; context_t is already typedef'd in test.h */
void init_context(context_t **new) {
*new = malloc(sizeof **new);
}
void fini_context(context_t *new) {
free(new);
}
void context_func1(context_t *ctx, int arg1) {
(void)ctx;
(void)arg1;
}
void context_func2(context_t *ctx, const char *arg1, double arg2) {
(void)ctx;
(void)arg1;
(void)arg2;
}
There are a few different problems we need to solve to make this into a neat, usable OO C# interface. I'll work through them one at a time and present my preferred solution at the end. (This problem can be solved in a simpler way for Python, but the solution here will be applicable to Python, Java, C# and probably others)
Problem 1: Constructor and destructor.
Typically in an OO-style C API you'd have some kind of constructor and destructor functions that encapsulate whatever setup and teardown your (likely opaque) type needs. To present them to the target language in a sensible way we can use %extend to write what looks rather like a C++ constructor/destructor, but still comes out as C after the SWIG processing.
%module test
%{
#include "test.h"
%}
%rename(Context) context; // Make it more C# like
%nodefaultctor context; // Suppress behaviour that doesn't work for opaque types
%nodefaultdtor context;
struct context {}; // context is opaque, so we need to add this to make SWIG play
%extend context {
context() {
context_t *tmp;
init_context(&tmp);
// we return context_t * from our "constructor", which becomes $self
return tmp;
}
~context() {
// $self is the current object
fini_context($self);
}
}
Problem 2: member functions
The way I've set this up allows us to use a cute trick. When we say:
%extend context {
void func();
}
SWIG then generates a stub that looks like:
SWIGEXPORT void SWIGSTDCALL CSharp_Context_func(void * jarg1) {
struct context *arg1 = (struct context *) 0 ;
arg1 = (struct context *)jarg1;
context_func(arg1);
}
The two things to take away from that are:
The function that implements the extended context::func call is called context_func
There's an implicit 'this' equivalent argument going into this function as argument 1 always
The above pretty much matches what we set out to wrap on the C side to begin with. So to wrap it we can simply do:
%module test
%{
#include "test.h"
%}
%rename(Context) context;
%nodefaultctor context;
%nodefaultdtor context;
struct context {};
%extend context {
context() {
context_t *tmp;
init_context(&tmp);
return tmp;
}
~context() {
fini_context($self);
}
void func1(int arg1);
void func2(const char *arg1, double arg2);
}
This doesn't quite meet point #2 of my goals as well as I'd hoped: you have to write out the function declarations manually (unless you use a trick with %include and keep them in individual header files). With Python you could pull all the pieces together at import time and keep it much simpler, but I can't see a neat way to enumerate all the functions that match a pattern into the right place at the point where SWIG generates the .cs files.
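To make the shape of the result concrete, here is a hand-written C++ sketch of what the generated proxy amounts to: a class that owns the context pointer and forwards each method to the matching context_ function with that pointer as the first argument. This is not SWIG output. The C functions below are stand-ins for test.h, and the last_arg field and demo_context function are hypothetical additions so the sketch has something observable.

```cpp
#include <cassert>
#include <cstdlib>

// Stand-in for the opaque C type. The last_arg field is hypothetical and exists
// only so the sketch has observable behaviour.
struct context { int last_arg; };
typedef struct context context_t;

// Stand-ins for the C API in test.h. The parameter is renamed because `new`
// is a keyword in C++.
void init_context(context_t **out) {
    *out = static_cast<context_t *>(std::malloc(sizeof **out));
    assert(*out != nullptr);
    (*out)->last_arg = 0;
}
void fini_context(context_t *ctx) { std::free(ctx); }
void context_func1(context_t *ctx, int arg1) { ctx->last_arg = arg1; }

// The OO view: the context pointer becomes the hidden `this` state.
class Context {
public:
    Context() { init_context(&ctx_); }      // plays the role of the %extend constructor
    ~Context() { fini_context(ctx_); }      // plays the role of the %extend destructor
    Context(const Context &) = delete;      // one owner per context
    Context &operator=(const Context &) = delete;

    void func1(int arg1) { context_func1(ctx_, arg1); }  // forwards with ctx_ as argument 1
    int last_arg() const { return ctx_->last_arg; }

private:
    context_t *ctx_;
};

// Hypothetical helper that exercises the wrapper.
int demo_context() {
    Context ctx;
    ctx.func1(42);
    return ctx.last_arg();
}
```

The C# class that SWIG emits plays the same role as Context here, except that the forwarding goes through the P/Invoke layer rather than a direct call.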
This was sufficient for me to test (using Mono) with the following code:
using System;
public class Run
{
static public void Main()
{
Context ctx = new Context();
ctx.func2("", 0.0);
}
}
There are other variants of OO-style C design, for example ones using function pointers, which can also be wrapped; I've addressed a similar question for Java in the past.
Related
I'm trying to build a simple C# application to test passing data between languages. I have a simple main in C# calling a function that I wish to pass to C++:
using System;
namespace CS_Console
{
class Program
{
static void Main(string[] args)
{
CPP_Sum(5, 2); //pseudo code
}
}
}
and then the function in a C++ project:
int CPP_Sum(int x, int y)
{
return x+y;
}
The problem is, I don't even know where to start on how to pass these between each other.
This is being done via two projects, CS_Console and CPP_Console, in the same solution in Visual Studio.
It has already been mentioned by #steveo314. The easiest way to call a C++-based function from C# is using PInvoke. I assume you can create a DLL without any documentation but this looks like it will give you all the information you need to use your DLL from C#.
Basically:
Put your function as a C++ DLL; then
Use a DllImport attribute to create a function prototype in your C# code.
Make sure you export your C++ function definitions.
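As a rough sketch of the C++ half of those steps: the function needs C linkage (so its name is not mangled) and has to be marked for export. The CPP_API macro name is my own invention, and the DLL name in the comment is only a guess based on the project name in the question.

```cpp
// Export macro (hypothetical name). On Windows the function must be marked
// dllexport; on other platforms default visibility is enough.
#if defined(_WIN32)
#define CPP_API extern "C" __declspec(dllexport)
#else
#define CPP_API extern "C" __attribute__((visibility("default")))
#endif

// extern "C" keeps the exported symbol name exactly "CPP_Sum".
CPP_API int CPP_Sum(int x, int y)
{
    return x + y;
}

// Matching C# prototype, assuming the DLL is called CPP_Console.dll:
//   [DllImport("CPP_Console.dll", CallingConvention = CallingConvention.Cdecl)]
//   static extern int CPP_Sum(int x, int y);
```

With that in place, the C# call in the question would read CPP_Sum(5, 2) and return 7, provided the DLL is somewhere the runtime can find it (for example next to the executable).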
Daniel's answer above is also correct, but if you really want to dive into the guts of this, you may want to learn C++/CLI, as that's more or less C++ running on top of the .NET framework, with more precise interop, and depending on where/what your DLL is, communication both ways (P/Invoke is just calling C++, not calling back into .NET... usually).
C++/CLI allows this kind of thing:
public class Program
{
public static void Main(string[] args)
{
NativeWrapper wrapper = new NativeWrapper(); // C++/CLI ref class
wrapper.doStuff("hey there!");
}
}
In a C++/CLI DLL, in the .h file:
// This is C++/CLI
class NativeClass; // Forward declare
public ref class NativeWrapper // 'public' so that the C# assembly can see the type
{
public:
NativeWrapper(); // Constructor - MUST be in .cpp file, as size of NativeClass not known here
~NativeWrapper(); // Destructor - MUST be in .cpp file, same reason as above
void doStuff(System::String^ inputString);
private:
NativeClass* mp_impl;
};
In a C++/CLI .cpp file:
class NativeClass // True C++ class, could be defined elsewhere!
{
public:
void nativeDoStuff(const std::string& inputString)
{
std::cout << "Previous input was: '" << cachedResult << "'";
cachedResult = inputString;
}
private:
std::string cachedResult{};
};
NativeWrapper::NativeWrapper()
{
mp_impl = new NativeClass();
}
NativeWrapper::~NativeWrapper()
{
delete mp_impl;
mp_impl = nullptr;
}
// Utility function!
std::string _convertClrString(System::String^ instr)
{
marshal_context context;
std::string mystring{ context.marshal_as<const char*>(instr) };
return mystring;
}
void NativeWrapper::doStuff(System::String^ inputString)
{
auto native_string = _convertClrString(inputString);
mp_impl->nativeDoStuff(native_string);
}
It's a toy example (necessary includes removed), but hopefully that gets the idea across of what's possible with C++/CLI. The link I put near the top is the main start of the resources at Microsoft for doing this. Keep in mind what files are compiling as .NET, which aren't (you can do it per-file), and what that all means.
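The same pointer-to-implementation layout can be tried out in plain standard C++, without the CLR, which may help if the ref class syntax is getting in the way. This is only an analogy to the C++/CLI code above: Wrapper and demo_pimpl are hypothetical names, and I've made nativeDoStuff return the previous input instead of printing it so the behaviour can be checked.

```cpp
#include <string>

class NativeClass;  // forward declaration: the wrapper only stores a pointer

class Wrapper {
public:
    Wrapper();                       // defined after NativeClass is complete
    ~Wrapper();                      // same reason
    Wrapper(const Wrapper &) = delete;
    Wrapper &operator=(const Wrapper &) = delete;
    std::string doStuff(const std::string &input);

private:
    NativeClass *mp_impl;
};

// The real implementation; this could live in a separate translation unit.
class NativeClass {
public:
    // Returns the previously cached input and stores the new one.
    std::string nativeDoStuff(const std::string &input) {
        std::string previous = cachedResult;
        cachedResult = input;
        return previous;
    }

private:
    std::string cachedResult;
};

Wrapper::Wrapper() : mp_impl(new NativeClass()) {}
Wrapper::~Wrapper() { delete mp_impl; }

std::string Wrapper::doStuff(const std::string &input) {
    return mp_impl->nativeDoStuff(input);
}

// Hypothetical helper: the second call reports what the first call passed in.
std::string demo_pimpl() {
    Wrapper wrapper;
    wrapper.doStuff("first");
    return wrapper.doStuff("second");
}
```

In the C++/CLI version the only real differences are that the wrapper is a ref class and that the string has to be marshalled on the way in.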
I have a set of C functions that I need to use on an ARM target, in C++ and in C#. I can successfully wrap up the C into a C++ DLL and then into a C# DLL and use all the C functions I've bound successfully. However, I have a debug function that I want to be able to print to the C# GUI and the delegate it uses is being garbage collected rather than left in place for the duration.
Managed Debugging Assistant 'CallbackOnCollectedDelegate' has detected a
problem in 'C:\utm\pc\utm_win32_app\bin\Debug\utm_win32_app.vshost.exe'.
Additional Information: A callback was made on a garbage collected delegate of
type
'utm_dll_wrapper_cs!MessageCodec.MessageCodec_dll+guiPrintToConsoleCallback::
Invoke'. This may cause application crashes, corruption and data loss. When
passing delegates to unmanaged code, they must be kept alive by the managed
application until it is guaranteed that they will never be called.
Here's the snippet of C code that uses and sets up the callback mp_guiPrintToConsole:
#ifdef WIN32
static void (* mp_guiPrintToConsole) (const char*) = NULL;
#endif
void logMsg (const char * pFormat, ...)
{
char buffer[MAX_DEBUG_MESSAGE_LEN];
va_list args;
va_start (args, pFormat);
vsnprintf (buffer, sizeof (buffer), pFormat, args);
va_end (args);
#ifdef WIN32
if (mp_guiPrintToConsole)
{
(*mp_guiPrintToConsole) (buffer);
}
#else
// Must be on ARM
printf ("%s", buffer); // never pass a formatted message as the format string
#endif
}
void initDll (void (*guiPrintToConsole) (const char *))
{
#ifdef WIN32
mp_guiPrintToConsole = guiPrintToConsole;
// This is the signal to the GUI that we're done with initialisation
logMsg ("ready.\r\n");
#endif
}
Here's the C++ code, built into a DLL along with the C code, that can be called from C# and passes in the function pointer printToConsole:
void msInitDll (void (*printToConsole) (const char *))
{
initDll (printToConsole);
}
Here's the snippet code from the C# DLL that calls msInitDll(), passing in guiPrintToConsole(), and defines the delegate onConsoleTrace, which I guess is the thing that is disappearing:
[UnmanagedFunctionPointer (CallingConvention.Cdecl)]
public delegate void _msInitDll([MarshalAs (UnmanagedType.FunctionPtr)] guiPrintToConsoleCallback callbackPointer);
public _msInitDll msInitDll;
public delegate void ConsoleTrace(string data);
public event ConsoleTrace onConsoleTrace;
public void guiPrintToConsole(StringBuilder data)
{
if (onConsoleTrace != null)
{
onConsoleTrace (data.ToString ());
}
}
public void bindDll(string dllLocation)
{
IntPtr ptrDll = LoadLibrary (dllLocation);
if (ptrDll == IntPtr.Zero) throw new Exception (String.Format ("Cannot find {0}", dllLocation));
//...
// All the other DLL function bindings are here
//...
msInitDll = (_msInitDll)bindItem(ptrDll, "msInitDll", typeof(_msInitDll));
msInitDll(guiPrintToConsole);
}
I've looked at the various answers here and the most promising seemed to be to create a static variable in the C# code:
static GCHandle gch;
...and then use that to reference onConsoleTrace in the C# bindDll() function:
gch = GCHandle.Alloc(onConsoleTrace);
However, that doesn't do me any good. I've tried a few other attempts at declaring things static but nothing seems to get me where I want to be. Can anyone suggest another approach to fixing the problem? I have a bug that I need to fix and the lack of any debug is proving quite annoying.
Rob
The following line uses some syntactic sugar:
msInitDll(guiPrintToConsole);
The full syntax is:
msInitDll(new guiPrintToConsoleCallback(guiPrintToConsole));
Hopefully now you see why the delegate can get garbage-collected.
One simple workaround:
var callback = new guiPrintToConsoleCallback(guiPrintToConsole);
msInitDll(callback);
// ... some other code
GC.KeepAlive(callback);
Now the delegate is guaranteed to be alive up to the GC.KeepAlive call.
But you most probably need the delegate to stay alive for longer. As the error message says, simply keep a reference to it. If you need it for the full lifetime of the C# app, turn the callback local into a static field in your class. Static fields are treated as GC roots, so their values are always reachable.
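It may help to look at this from the native side. All the DLL ever holds is a raw function pointer, and nothing about that pointer is visible to the garbage collector, so the managed side has to keep the delegate alive itself. Below is a cut-down C++ sketch of the native half, assuming simplified names (init_dll, log_msg) rather than the ones in the question; the capture function stands in for the thunk that the marshaller creates for the delegate.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

typedef void (*print_callback_t)(const char *);

// The only thing the native side stores: a raw pointer. If the delegate behind
// it is collected, this pointer dangles and the next call through it is invalid.
static print_callback_t g_print_callback = nullptr;

extern "C" void init_dll(print_callback_t callback) {
    g_print_callback = callback;
}

extern "C" void log_msg(const char *format, ...) {
    char buffer[256];
    va_list args;
    va_start(args, format);
    std::vsnprintf(buffer, sizeof buffer, format, args);
    va_end(args);
    if (g_print_callback) {
        g_print_callback(buffer);  // may run long after init_dll returned
    }
}

// Stand-in for the managed callback, used only to exercise the sketch.
static std::string g_captured;
static void capture(const char *text) { g_captured += text; }

std::string demo_callback() {
    g_captured.clear();
    init_dll(&capture);
    log_msg("value=%d", 7);
    return g_captured;
}
```

Because log_msg can fire at any later time, the delegate has to outlive every possible call, which is why a static field is the usual answer.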
And the answer was, in the C# DLL code, add the static variable:
public static guiPrintToConsoleCallback debugCallback;
...and then, in C# bindDll(), change:
msInitDll(guiPrintToConsole);
...to
debugCallback = new guiPrintToConsoleCallback(guiPrintToConsole);
msInitDll(debugCallback);
Simple when you know how.
I would like to support downcasting in a SWIG-generated C# project.
I have a series of C++ std::shared_ptr-wrapped class templates that inherit from a common base. Any C++ method that returns a base class (IBasePtr) in C++ code results in a generated method that returns a concrete IBase object, which has no relation to the object I am actually trying to get. The blog post here deals with this exact problem by inserting custom code to perform a downcast based on object type metadata.
C++ (simplified for the purpose of illustration):
IBase.h:
namespace MyLib
{
enum DataTypes
{
Float32,
Float64,
Integer32
};
class IBase; // forward declaration so that the typedef below compiles
typedef std::tr1::shared_ptr<IBase> IBasePtr;
class IBase
{
public:
virtual ~IBase() {}
virtual DataTypes DataType() const = 0;
};
}
CDerived.h:
#include "IBase.h"
namespace MyLib
{
template <class T>
class CDerived : public IBase
{
public:
CDerived(const DataTypes dataType)
:
m_dataType(dataType)
{}
DataTypes DataType() const
{
return m_dataType;
}
private:
DataTypes m_dataType;
};
}
CCaller.h:
#include "IBase.h"
namespace MyLib
{
class CCaller
{
public:
IBasePtr GetFloatObject()
{
//My code doesn't really do this - type identification is handled more elegantly, it's just to illustrate.
base = IBasePtr(new CDerived<float>(Float32));
return base;
}
IBasePtr GetDoubleObject()
{
//My code doesn't really do this - type identification is handled more elegantly, it's just to illustrate.
base = IBasePtr(new CDerived<double>(Float64));
return base;
}
private:
IBasePtr base;
};
}
SWIG interface:
%module SwigWrapper
%include "typemaps.i"
%include <cpointer.i>
#define SWIG_SHARED_PTR_SUBNAMESPACE tr1
%include <std_shared_ptr.i>
%shared_ptr(MyLib::IBase)
%shared_ptr(MyLib::CDerived< float >)
%shared_ptr(MyLib::CDerived< double >)
%shared_ptr(MyLib::CDerived< int >)
%typemap(ctype, out="void *") MyLib::IBasePtr &OUTPUT "MyLib::IBasePtr *"
%typemap(imtype, out="IntPtr") MyLib::IBasePtr &OUTPUT "out IBase"
%typemap(cstype, out="$csclassname") MyLib::IBasePtr &OUTPUT "out IBase"
%typemap(csin) MyLib::IBasePtr &OUTPUT "out $csinput"
%typemap(in) MyLib::IBasePtr &OUTPUT
%{ $1 = ($1_ltype)$input; %}
%apply MyLib::IBasePtr &OUTPUT { MyLib::IBasePtr & base };
%{
#include "IBase.h"
#include "CDerived.h"
#include "CCaller.h"
using namespace std;
using namespace MyLib;
%}
namespace MyLib
{
typedef std::tr1::shared_ptr<IBase> IBasePtr;
%template (CDerivedFloat) CDerived<float>;
%template (CDerivedDouble) CDerived<double>;
%template (CDerivedInt) CDerived<int>;
}
%typemap(csout, excode=SWIGEXCODE)
IBase
IBasePtr
MyLib::IBase,
MyLib::IBasePtr
{
IntPtr cPtr = $imcall;
$csclassname ret = ($csclassname) $modulePINVOKE.InstantiateConcreteClass(cPtr, $owner);$excode
return ret;
}
%pragma(csharp) imclasscode=%{
public static IBase InstantiateConcreteClass(IntPtr cPtr, bool owner)
{
IBase ret = null;
if (cPtr == IntPtr.Zero)
{
return ret;
}
int dataType = SwigWrapperPINVOKE.IBase_DataType(new HandleRef(null, cPtr));
DataTypes dt = (DataTypes)dataType;
switch (dt)
{
case DataTypes.Float32:
ret = new CDerivedFloat(cPtr, owner);
break;
case DataTypes.Float64:
ret = new CDerivedDouble(cPtr, owner);
break;
case DataTypes.Integer32:
ret = new CDerivedInt(cPtr, owner);
break;
default:
System.Diagnostics.Debug.Assert(false,
String.Format("Encountered type '{0}' that is not a supported MyLib concrete class", dataType.ToString()));
break;
}
return ret;
}
%}
The part I am struggling with is the use of SWIG's %typemap command. %typemap is intended to instruct SWIG to map input and target types, in my case via the code to perform an explicit conversion. The method InstantiateConcreteClass is generated but there are no references to it.
Is there a vital step I am missing? I wondered whether there was some additional complication due to the use of shared_ptr in native code, but I don't think this is the case.
The problem with your example seems to be that you've written typemaps for inputs, but that doesn't seem to make sense in and of itself, because the important part is getting the type right when creating things, not using them as input. As far as output arguments go the second half of this answer addresses that, but there are errors with using your typemaps for arguments too.
I've simplified your example fractionally and made it complete and working. The main thing I had to add that was missing was a 'factory' function that creates derived instances, but returns them as the base type. (If you just create them with new directly then this isn't needed).
I merged your header files and implemented an inline factory as test.h:
#include <memory>
enum DataTypes {
Float32,
Float64,
Integer32
};
class IBase;
typedef std::shared_ptr<IBase> IBasePtr;
class IBase {
public:
virtual ~IBase() {}
virtual DataTypes DataType() const = 0;
};
template <typename T> struct DataTypesLookup;
template <> struct DataTypesLookup<float> { enum { value = Float32 }; };
template <> struct DataTypesLookup<double> { enum { value = Float64 }; };
template <> struct DataTypesLookup<int> { enum { value = Integer32 }; };
template <class T>
class CDerived : public IBase {
public:
CDerived() : m_dataType(static_cast<DataTypes>(DataTypesLookup<T>::value)) {}
DataTypes DataType() const {
return m_dataType;
}
private:
const DataTypes m_dataType;
};
inline IBasePtr factory(const DataTypes type) {
switch(type) {
case Integer32:
return std::make_shared<CDerived<int>>();
case Float32:
return std::make_shared<CDerived<float>>();
case Float64:
return std::make_shared<CDerived<double>>();
}
return IBasePtr();
}
The main changes here are the addition of some template metaprogramming, which allows CDerived to look up the correct DataTypes value from just the T template parameter, and making m_dataType const. I did this because it doesn't make sense to let CDerived instances lie about their type: it's set exactly once and isn't something that should be exposed any further.
Given this I can then write some C# that shows how I intend to use it after wrapping:
using System;
public class HelloWorld {
static public void Main() {
var result = test.factory(DataTypes.Float32);
Type type = result.GetType();
Console.WriteLine(type.FullName);
result = test.factory(DataTypes.Integer32);
type = result.GetType();
Console.WriteLine(type.FullName);
}
}
Essentially if my typemaps are working we will have used the DataType member to transparently make test.factory return a C# proxy that matches the derived C++ type rather than a proxy that knows nothing more than the base type.
Note also that here because we have the factory we could also have modified the wrapping of that to use the input arguments to determine the output type, but this is less generic than using the DataType on the output. (For the factory approach we'd have to write per-function rather than per-type code for the correct wrapping).
We can write a SWIG interface for this example, which is substantially similar to yours and the blog post referenced but with a few changes:
%module test
%{
#include "test.h"
%}
%include <std_shared_ptr.i>
%shared_ptr(IBase)
%shared_ptr(CDerived<float>)
%shared_ptr(CDerived<double>)
%shared_ptr(CDerived<int>)
%newobject factory; // 1
%typemap(csout, excode=SWIGEXCODE) IBasePtr { // 2
IntPtr cPtr = $imcall;
var ret = $imclassname.make(cPtr, $owner);$excode // 3
return ret;
}
%include "test.h" // 4
%template (CDerivedFloat) CDerived<float>;
%template (CDerivedDouble) CDerived<double>;
%template (CDerivedInt) CDerived<int>;
%pragma(csharp) imclasscode=%{
public static IBase make(IntPtr cPtr, bool owner) {
IBase ret = null;
if (IntPtr.Zero == cPtr) return ret;
ret = new IBase(cPtr, false); // 5
switch(ret.DataType()) {
case DataTypes.Float32:
ret = new CDerivedFloat(cPtr, owner);
break;
case DataTypes.Float64:
ret = new CDerivedDouble(cPtr, owner);
break;
case DataTypes.Integer32:
ret = new CDerivedInt(cPtr, owner);
break;
default:
if (owner) ret = new IBase(cPtr, owner); // 6
break;
};
return ret;
}
%}
There are 6 notable changes highlighted via comments in that typemap:
We've told SWIG that the objects returned by factory are new, i.e. there is a transfer of ownership from C++ to C#. (This causes the owner boolean to get set correctly)
My typemap is a csout typemap and that's the only one needed.
Compared to the tutorial you linked I used $imclassname, which always expands correctly to $modulePINVOKE or its equivalent.
I used %include with my header file directly to avoid repeating myself lots unnecessarily.
Rather than touch the inner workings of the wrapper I created a temporary instance of IBase directly, which allowed me to access the enum value in a much cleaner way. The temporary instance has ownership set to false, which means we never incorrectly delete the underlying C++ instance when disposing of it.
I chose to let the default path through the switch statement return an IBase instance with no knowledge of the derived type if it couldn't be figured out for some reason.
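For comparison, the dispatch that make performs in C# has a direct C++ counterpart: read the type tag through the base interface, then cast the shared_ptr to the matching derived type. The sketch below is self-contained and only loosely mirrors test.h; ElementSize, element_size and demo_downcast are hypothetical additions so there is a derived-only member to call.

```cpp
#include <cstddef>
#include <memory>

enum DataTypes { Float32, Float64, Integer32 };

class IBase {
public:
    virtual ~IBase() {}
    virtual DataTypes DataType() const = 0;
};

template <class T>
class CDerived : public IBase {
public:
    explicit CDerived(DataTypes type) : m_dataType(type) {}
    DataTypes DataType() const override { return m_dataType; }
    // Hypothetical member that only exists on the derived type.
    std::size_t ElementSize() const { return sizeof(T); }

private:
    const DataTypes m_dataType;
};

// Same idea as the C# make(): use the tag to pick the concrete type.
std::size_t element_size(const std::shared_ptr<IBase> &base) {
    if (!base) return 0;
    switch (base->DataType()) {
    case Float32:
        return std::static_pointer_cast<CDerived<float>>(base)->ElementSize();
    case Float64:
        return std::static_pointer_cast<CDerived<double>>(base)->ElementSize();
    case Integer32:
        return std::static_pointer_cast<CDerived<int>>(base)->ElementSize();
    }
    return 0;  // unknown tag: nothing more specific than IBase is known
}

// Hypothetical helper: builds a double-typed object and reads it via the base.
std::size_t demo_downcast() {
    std::shared_ptr<IBase> base = std::make_shared<CDerived<double>>(Float64);
    return element_size(base);
}
```

The static_pointer_cast is only safe because the tag is trusted, which is the same assumption the C# make function relies on when it picks a proxy class.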
Based on what you showed in your question it actually looks like what you're mostly struggling with is output reference arguments. Without the shared_ptr angle this wouldn't work at all. The simplest solution for wrapping this is to use %inline or %extend within SWIG to write an alternative version of the function to be used that doesn't pass output via reference arguments.
We can however make this work naturally on the C# side too, with some more typemaps. You're on the right track with the OUTPUT and %apply style typemaps you've shown, however I don't think you've got them quite right. I've extended my example to cover this as well.
First, although I don't much like using functions like this I added factory2 to test.h:
inline bool factory2(const DataTypes type, IBasePtr& result) {
try {
result = factory(type);
return true;
}
catch (...) {
return false;
}
}
The key thing to note here is that by the time we call factory2 we must have a valid reference to an IBasePtr (std::shared_ptr<IBase>), even if that shared_ptr is null. Since you're using out instead of ref in your C# we'll need to arrange to make a temporary C++ std::shared_ptr before the call actually happens. Once the call happens we want to pass this back to the make static function we wrote for the simpler case earlier.
We're going to have to look fairly closely at how SWIG handles smart pointers to make this all work.
So secondly my SWIG interface ended up adding:
%typemap(cstype) IBasePtr &OUTPUT "out $typemap(cstype,$1_ltype)"
%typemap(imtype) IBasePtr &OUTPUT "out IntPtr" // 1
// 2:
%typemap(csin,pre=" IntPtr temp$csinput = IntPtr.Zero;",
post=" $csinput=$imclassname.make(temp$csinput,true);")
IBasePtr &OUTPUT "out temp$csinput"
// 3:
%typemap(in) IBasePtr &OUTPUT {
$1 = new $*1_ltype;
*static_cast<intptr_t*>($input) = reinterpret_cast<intptr_t>($1);
}
%apply IBasePtr &OUTPUT { IBasePtr& result }
This goes before the %include from the simple case.
The main things this does are:
Change the intermediate function to accept an IntPtr by reference for output. This will eventually hold the value that we want to pass in to make that is itself a pointer to a std::shared_ptr.
For the csin typemap we're going to arrange to create a temporary IntPtr and use that for the intermediate call. After the intermediate call has happened we then need to pass the output into make and assign the resulting IBase instance to our output parameter.
When we call the real C++ function we need to construct a shared_ptr for it to bind to the reference when we make the call. We also store the address of the shared_ptr into our output parameter so that C# code can pick it up and work with it later.
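Stripped of the SWIG machinery, the three steps above boil down to something like the following C++ sketch. It is an illustration under my own simplified names (Base, make_base, wrapped_make_base), not the code SWIG generates: the wrapper heap-allocates a shared_ptr, publishes its address through the caller's integer slot, and binds the reference parameter to it.

```cpp
#include <cstdint>
#include <memory>

// Simplified stand-ins for IBase / IBasePtr.
struct Base {
    explicit Base(int v) : value(v) {}
    int value;
};
typedef std::shared_ptr<Base> BasePtr;

// Analogous to factory2: the result comes back through a reference argument.
bool make_base(int value, BasePtr &result) {
    result = std::make_shared<Base>(value);
    return true;
}

// What the `in` typemap arranges, conceptually.
extern "C" bool wrapped_make_base(int value, void *out_handle) {
    BasePtr *holder = new BasePtr();  // the temporary the reference binds to
    // Store the holder's address where the caller (C# `out IntPtr`) can read it.
    *static_cast<std::intptr_t *>(out_handle) = reinterpret_cast<std::intptr_t>(holder);
    return make_base(value, *holder);
}

// Plays the role of the C# side: reads the handle back and takes ownership.
int demo_out_param() {
    std::intptr_t handle = 0;  // like `IntPtr temp = IntPtr.Zero`
    const bool ok = wrapped_make_base(5, &handle);
    BasePtr *holder = reinterpret_cast<BasePtr *>(handle);
    const int value = (ok && *holder) ? (*holder)->value : -1;
    delete holder;  // in C# the owning proxy releases this when disposed
    return value;
}
```

Passing true as the owner to make in the csin post code corresponds to the delete here: whoever receives the handle is responsible for freeing the heap-allocated shared_ptr.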
This is now sufficient to solve our problem. I added the following code to my original test case:
IBase result2;
test.factory2(DataTypes.Float64, out result2);
Console.WriteLine(result2.GetType().FullName);
A word of caution: this is the biggest chunk of C# code I've ever written. I tested it all on Linux with Mono using:
swig -c++ -Wall -csharp test.i && mcs -g hello.cs *.cs && g++ -std=c++11 -Wall -Wextra -shared -o libtest.so test_wrap.cxx
warning CS8029: Compatibility: Use -debug option instead of -g or --debug
warning CS2002: Source file `hello.cs' specified multiple times
Which when run gave:
CDerivedFloat
CDerivedInt
CDerivedDouble
and I think the generated marshalling is correct but you should verify it yourself.
Got this working with the help of Flexo's example above. Use of %newobject is essential here: in my original question the lifetime of the derived objects was not being managed correctly.
I needed to make one minor change - the namespace name needed to be added to the typemap:
%typemap(csout, excode=SWIGEXCODE) MyLib::IBasePtr { // Need the fully-qualified name incl. namespace
IntPtr cPtr = $imcall;
var ret = $imclassname.make(cPtr, $owner);$excode // 3
return ret;
}
It was not necessary to make the changes suggested after point 6.
I'm writing a DLL wrapper for my C++ library, to be called from C#. This wrapper should also have callback functions called from the library and implemented in C#. These functions have for instance std::vector<unsigned char> as output parameters. I don't know how to make this. How do I pass a buffer of unknown size from C# to C++ via a callback function?
Let's take this example
CallbackFunction FunctionImplementedInCSharp;
void FunctionCalledFromLib(const std::vector<unsigned char>& input, std::vector<unsigned char>& output)
{
// Here FunctionImplementedInCSharp (C# delegate) should somehow be called
}
void RegisterFunction(CallbackFunction f)
{
FunctionImplementedInCSharp = f;
}
How should CallbackFunction be defined and what is the code inside FunctionCalledFromLib?
One of the things that stumps me is: how do I delete a buffer created by C# inside C++ code?
As of Visual Studio 2013 at least, there is a safe way to pass callbacks from C# to C++ and have C++ store them and invoke them asynchronously later from unmanaged code. What you can do is create a managed C++/CX class (e.g., named "CallbackManager") to hold the callback delegate references in a map, keyed off an enum value for each. Then your unmanaged code can retrieve a managed delegate reference from the managed C++/CX CallbackManager class via the delegate's associated enum value. That way you don't have to store raw function pointers and so you don't have to worry about the delegate being moved or garbage-collected: it stays in the managed heap throughout its lifecycle.
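Before the full C++/CX listing, here is the core idea reduced to standard C++ so it can be tried without WinRT: a registry that stores callbacks in a map keyed by an enum and guards the map with a mutex. It is a sketch only; CallbackRegistry, CallbackType and demo_registry are hypothetical names, and std::function stands in for the Delegate^ handles.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

enum class CallbackType { LogMessage, GetValueForSetting };

class CallbackRegistry {
public:
    using Callback = std::function<void(const std::string &)>;

    void register_callback(CallbackType type, Callback callback) {
        std::lock_guard<std::mutex> lock(mutex_);
        callbacks_[type] = std::move(callback);
    }

    // Returns true if a callback was registered for this type.
    bool unregister_callback(CallbackType type) {
        std::lock_guard<std::mutex> lock(mutex_);
        return callbacks_.erase(type) > 0;
    }

    void unregister_all() {
        std::lock_guard<std::mutex> lock(mutex_);
        callbacks_.clear();
    }

    // Returns an empty std::function if nothing is registered for this type.
    Callback get(CallbackType type) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = callbacks_.find(type);
        return it == callbacks_.end() ? Callback() : it->second;
    }

private:
    mutable std::mutex mutex_;
    std::map<CallbackType, Callback> callbacks_;
};

// Hypothetical helper: registers, invokes and removes one callback.
int demo_registry() {
    CallbackRegistry registry;
    int calls = 0;
    registry.register_callback(CallbackType::LogMessage,
                               [&calls](const std::string &) { ++calls; });
    if (auto callback = registry.get(CallbackType::LogMessage)) {
        callback("hello");
    }
    const bool removed = registry.unregister_callback(CallbackType::LogMessage);
    const bool missing = !registry.get(CallbackType::LogMessage);
    return (removed && missing) ? calls : -1;
}
```

Using std::lock_guard means the mutex is released on every return path, and get hands back a copy so the caller never invokes a callback while holding the lock.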
On the C++ side in CXCallbacksManager.h:
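#include <unordered_map>
#include <mutex>
#include <utility>
#include <vector>
using namespace Platform;
using namespace std; // unordered_map, pair, mutex and vector are used unqualified below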
namespace CPPCallbacks
{
// define callback IDs; this is what unmanaged C++ code will pass to the managed CallbacksManager class to retrieve a delegate instance
public enum class CXCallbackType
{
cbtLogMessage,
cbtGetValueForSetting
// TODO: add additional enum values as you add more callbacks
};
// defines the delegate signatures for our callbacks; these are visible to the C# side as well
public delegate void LogMessageDelegate(int level, String^ message);
public delegate bool GetValueForSettingDelegate(String^ settingName, String^* settingValueOut);
// TODO: define additional callbacks here as you need them
// Singleton WinRT class to manage C# callbacks; since this class is marked 'public' it is consumable from C# as well
public ref class CXCallbacksManager sealed
{
private:
CXCallbacksManager() { } // this is private to prevent incorrect instantiation
public:
// public methods and properties are all consumable by C# as well
virtual ~CXCallbacksManager() { }
static property CXCallbacksManager^ Instance
{
CXCallbacksManager^ get();
}
bool UnregisterCallback(CXCallbackType cbType);
void UnregisterAllCallbacks();
Delegate^ GetCallback(CXCallbackType cbType);
// define callback registration methods
void RegisterLogMessageCallback(LogMessageDelegate^ cb) { RegisterCallback(CXCallbackType::cbtLogMessage, cb); }
void RegisterGetValueForSettingCallback(GetValueForSettingDelegate^ cb) { RegisterCallback(CXCallbackType::cbtGetValueForSetting, cb); }
// TODO: define additional callback registration methods as you add more callbacks
private:
void RegisterCallback(CXCallbackType cbType, Delegate^ rCallbackFunc);
typedef unordered_map<CXCallbackType, Delegate^> CALLBACK_MAP;
typedef pair<CXCallbackType, Delegate^> CBType_Delegate_Pair;
// Note: IntelliSense errors shown for static data is a Visual Studio IntellSense bug; the code below builds fine
// See http://social.msdn.microsoft.com/Forums/windowsapps/en-US/b5d43215-459a-41d6-a85e-99e3c30a162e/about-static-member-of-ref-class?forum=winappswithnativecode
static mutex s_singletonMutex;
static CXCallbacksManager^ s_rInstance;
mutex m_callbackMapMutex;
CALLBACK_MAP m_callbacksMap; // key=CallbackType, value = C# delegate (function) pointer
};
}
In CXCallbacksManager.cpp we implement the managed C++/CX class accessed by both C# and our unmanaged C++ code:
#include <assert.h>
#include "CXCallbacksManager.h"
using namespace Platform;
namespace CPPCallbacks
{
// define static class data
CXCallbacksManager^ CXCallbacksManager::s_rInstance;
mutex CXCallbacksManager::s_singletonMutex;
// Returns our singleton instance; this method is thread-safe
CXCallbacksManager^ CXCallbacksManager::Instance::get()
{
s_singletonMutex.lock();
if (s_rInstance == nullptr)
s_rInstance = ref new CXCallbacksManager(); // this lives until the application terminates
s_singletonMutex.unlock();
return s_rInstance;
}
// Register a C# callback; this method is thread-safe
void CXCallbacksManager::RegisterCallback(const CXCallbackType cbType, Delegate^ rCallbackFunc)
{
_ASSERTE(rCallbackFunc);
m_callbackMapMutex.lock();
m_callbacksMap.insert(CBType_Delegate_Pair(cbType, rCallbackFunc));
m_callbackMapMutex.unlock();
}
// Unregister a C# callback; this method is thread-safe
// Returns: true on success, false if no callback was registered for callbackType
bool CXCallbacksManager::UnregisterCallback(const CXCallbackType cbType)
{
m_callbackMapMutex.lock();
const bool bRemoved = (m_callbacksMap.erase(cbType) > 0);
m_callbackMapMutex.unlock();
return bRemoved;
}
// Unregister all callbacks; this method is thread-safe
void CXCallbacksManager::UnregisterAllCallbacks()
{
// Lock the map and clear it directly. Calling UnregisterCallback() from here would try to
// lock m_callbackMapMutex a second time; std::mutex is not recursive, so that would deadlock.
m_callbackMapMutex.lock();
m_callbacksMap.clear();
m_callbackMapMutex.unlock();
}
// Retrieve a registered C# callback; returns NULL if no callback registered for type
Delegate^ CXCallbacksManager::GetCallback(const CXCallbackType cbType)
{
Delegate^ rCallbackFunc = nullptr;
m_callbackMapMutex.lock();
CALLBACK_MAP::const_iterator it = m_callbacksMap.find(cbType);
if (it != m_callbacksMap.end())
rCallbackFunc = it->second;
else
_ASSERTE(false); // should never happen! This means the caller either forgot to register a callback for this cbType or already unregistered the callback for this cbType.
m_callbackMapMutex.unlock();
return rCallbackFunc;
}
}
The delegate instances remain stored in the managed heap by our CXCallbacksManager class, so now it's easy and safe to store callbacks on the C++ side for unmanaged code to invoke later asynchronously. Here is the C# side registering two callbacks:
using CPPCallbacks;
namespace SomeAppName
{
internal static class Callbacks
{
// invoked during app startup to register callbacks for unmanaged C++ code to invoke asynchronously
internal static void RegisterCallbacks()
{
CPPCallbacks.CXCallbacksManager.Instance.RegisterLogMessageCallback(new LogMessageDelegate(LogMessageDelegateImpl));
CPPCallbacks.CXCallbacksManager.Instance.RegisterGetValueForSettingCallback(new GetValueForSettingDelegate(GetValueForSettingDelegateImpl));
// TODO: register additional callbacks as you add them
}
//-----------------------------------------------------------------
// Callback delegate implementation methods are below; these are invoked by C++
// Although these example implementations are in a static class, you could also pass delegate instances created
// from inside a non-static class, which would maintain their state just like any other instance method (i.e., they have a 'this' object).
//-----------------------------------------------------------------
private static void LogMessageDelegateImpl(int level, string message)
{
// This next line is shown for example purposes, but at this point you can do whatever you want because
// you are running in a normal C# delegate context.
Logger.WriteLine(level, message);
}
private static bool GetValueForSettingDelegateImpl(String settingName, out String settingValueOut)
{
// This next line is shown for example purposes, but at this point you can do whatever you want because
// you are running in a normal C# delegate context.
return Utils.RetrieveEncryptedSetting(settingName, out settingValueOut);
}
};
}
Lastly, here is how to invoke your registered C# callbacks from unmanaged C++ code:
#include <crtdbg.h> // for _ASSERTE
#include <atlstr.h> // for CStringW
#include "CXCallbacksManager.h"
using namespace CPPCallbacks;
// this is an unmanaged C++ function in the same project as our CXCallbacksManager class
void LogMessage(LogLevel level, const wchar_t *pMsg)
{
_ASSERTE(pMsg);
auto rCallback = static_cast<LogMessageDelegate^>(CXCallbacksManager::Instance->GetCallback(CXCallbackType::cbtLogMessage));
_ASSERTE(rCallback);
rCallback(level, ref new String(pMsg)); // invokes C# method
}
// this is an unmanaged C++ function in the same project as our CXCallbacksManager class
// Sets settingValue to the value retrieved from C# for pSettingName
// Returns: true if the value existed and was set, false otherwise
bool GetValueForSetting(const wchar_t *pSettingName, CStringW &settingValue)
{
bool bRetCode = false;
auto rCallback = static_cast<GetValueForSettingDelegate^>(CXCallbacksManager::Instance->GetCallback(CXCallbackType::cbtGetValueForSetting));
_ASSERTE(rCallback);
if (rCallback) // sanity check; should never be null
{
String^ settingValueOut;
bRetCode = rCallback(ref new String(pSettingName), &settingValueOut);
// store the retrieved setting value to our unmanaged C++ CStringW output parameter
settingValue = settingValueOut->Data();
}
return bRetCode;
}
This all works because although you cannot store a managed delegate reference as a member variable inside an unmanaged class, you can still retrieve and invoke a managed delegate from unmanaged code, which is what the above two native C++ methods do.
There are some things you should be aware of. The first is that if you are calling a .NET delegate from unmanaged code, then unless you follow some pretty narrow constraints, you will be in for pain.
Ideally, you can create a delegate in C#, pass it into unmanaged code, marshal it into a function pointer, hold onto it for as long as you like, then call it with no ill effects. The .NET documentation says so.
I can tell you that this is simply not true. Eventually, part of your delegate or its thunk will get garbage collected and when you call the function pointer from unmanaged code you will get sent into oblivion. I don't care what Microsoft says, I've followed their prescription to the letter and watched function pointers get turned into garbage, especially in server-side code behinds.
Given that, the most effective way to use function pointers is thus:
C# code calls unmanaged code, passing in delegate.
Unmanaged code marshals the delegate to a function pointer.
Unmanaged code does some work, possibly calling the function pointer.
Unmanaged code drops all references to the function pointer.
Unmanaged code returns to managed code.
Given that, suppose we have the following in C#:
// 'delegate' is a reserved word in C#, so the parameter is named 'callback'
public void PerformTrick(MyManagedDelegate callback)
{
APIGlue.CallIntoUnmanagedCode(callback);
}
and then in managed C++ (not C++/CLI):
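static void CallIntoUnmanagedCode(MyManagedDelegate *delegate)
{
MyManagedDelegate __pin *pinnedDelegate = delegate;
// GetFunctionPointerForDelegate returns an IntPtr, so convert it to the native pointer type
SOME_CALLBACK_PTR p = (SOME_CALLBACK_PTR)Marshal::GetFunctionPointerForDelegate(pinnedDelegate).ToPointer();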
CallDeepIntoUnmanagedCode(p); // this will call p
}
I haven't done this recently in C++/CLI - the syntax is different - I think it ends up looking like this:
// This is declared in a class
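static void CallIntoUnmanagedCode(MyManagedDelegate ^callback)
{
pin_ptr<MyManagedDelegate ^> pinnedDelegate = &callback;
// GetFunctionPointerForDelegate returns an IntPtr, so convert it to the native pointer type
SOME_CALLBACK_PTR p = reinterpret_cast<SOME_CALLBACK_PTR>(Marshal::GetFunctionPointerForDelegate(*pinnedDelegate).ToPointer());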
CallDeepIntoUnmanagedCode(p); // This will call p
}
When you exit this routine, the pinning gets released.
When you really, really need to have function pointers hanging around for a while before calling, I have done the following in C++/CLI:
Made a hashtable that is a map from int -> delegate.
Made register/unregister routines that add new delegates into the hashtable, bumping up a counter for the hash int.
Made a single static unmanaged callback routine that is registered into unmanaged code with an int from the register call. When this routine is called, it calls back into managed code saying "find the delegate associated with <int> and call it on these arguments".
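The three steps above can be sketched in plain standard C++ (not C++/CLI), with std::function standing in for the managed delegate. This is only an illustration of the shape of the idea: CallbackRegistry, Registry() and Trampoline are hypothetical names, and the int(int) callback signature is an assumption made for the example.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <utility>

// Hypothetical registry: maps an integer cookie to a stored callable.
// std::function stands in for the managed delegate here.
class CallbackRegistry {
public:
    using Callback = std::function<int(int)>;

    // Stores the callable and returns the cookie to hand to native code.
    int Register(Callback cb) {
        std::lock_guard<std::mutex> lock(mutex_);
        const int cookie = nextCookie_++;
        callbacks_[cookie] = std::move(cb);
        return cookie;
    }

    // Returns true if the cookie was registered and has now been removed.
    bool Unregister(int cookie) {
        std::lock_guard<std::mutex> lock(mutex_);
        return callbacks_.erase(cookie) > 0;
    }

    // Looks up the cookie and invokes the callable; returns false if unknown.
    bool Invoke(int cookie, int arg, int& result) {
        Callback cb;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            auto it = callbacks_.find(cookie);
            if (it == callbacks_.end()) return false;
            cb = it->second;  // copy so the call itself runs outside the lock
        }
        result = cb(arg);
        return true;
    }

private:
    std::mutex mutex_;
    std::map<int, Callback> callbacks_;
    int nextCookie_ = 1;
};

// Single process-wide registry (hypothetical accessor).
inline CallbackRegistry& Registry() {
    static CallbackRegistry instance;
    return instance;
}

// The one static routine whose address is given to native code. Native code
// only ever holds this address plus an int cookie, never a delegate thunk.
// Returns -1 when the cookie is unknown (an arbitrary choice for this sketch).
extern "C" int Trampoline(int cookie, int arg) {
    int result = 0;
    return Registry().Invoke(cookie, arg, result) ? result : -1;
}
```

Native code would store the cookie returned by Register and later call Trampoline(cookie, arg); once Unregister has been called, the same cookie simply fails the lookup instead of jumping through a dead pointer.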
What happens is that the delegates don't have thunks that do transitions anymore since they're implied. They're free to hang around in limbo being moved by the GC as needed. When they get called, the delegate will get pinned by the CLR and released as needed. I have also seen this method fail, particularly in the case of code that statically registers callbacks at the beginning of time and expects them to stay around to the end of time. I've seen this fail in ASP.NET code behind as well as server side code for Silverlight working through WCF. It's rather unnerving, but the way to fix it is to refactor your API to allow late(r) binding to function calls.
To give you an example of when this will happen - suppose you have a library that includes a function like this:
typedef void * (*f_AllocPtr) (size_t nBytes);
typedef void *t_AllocCookie;
extern void RegisterAllocFunction(f_AllocPtr allocPtr, t_AllocCookie cookie);
and the expectation is that when you call an API that allocates memory, it will be vectored off into the supplied f_AllocPtr. Believe it or not, you can write this in C#. It's sweet:
public IntPtr ManagedAllocMemory(long nBytes)
{
byte[] data = new byte[nBytes];
GCHandle dataHandle = GCHandle.Alloc(data, GCHandleType.Pinned);
unsafe {
fixed (byte *b = &data[0]) {
IntPtr dataPtr = new IntPtr(b);
RegisterPointerHandleAndArray(dataPtr, dataHandle, data);
return dataPtr;
}
}
}
RegisterPointerHandleAndArray stuffs the triplet away for safe keeping. That way when the corresponding free gets called, you can do this:
public void ManagedFreeMemory(IntPtr dataPointer)
{
GCHandle dataHandle;
byte[] data;
if (TryUnregister(dataPointer, out dataHandle, out data)) {
dataHandle.Free();
// do anything with data? I dunno...
}
}
And of course this is stupid because allocated memory is now pinned in the GC heap and will fragment it to hell - but the point is that it's doable.
But again, I have personally seen this fail unless the actual pointers are short lived. This typically means wrapping your API, so that when you call into a routine that accomplishes a specific task, it registers callbacks, does the task, and then pulls the callbacks out.
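One way to picture that kind of wrapping is a scope guard that registers the callback, lets the task run, and pulls the callback out again on every exit path. The sketch below is plain C++ with a made-up native API: SetWorkCallback, DoWork, ScopedCallback and RunTask are all hypothetical names invented for the example, not part of any real library.

```cpp
// Hypothetical native API with a single global callback slot.
using WorkCallback = void (*)(int progress, void* userData);

static WorkCallback g_callback = nullptr;
static void* g_userData = nullptr;

void SetWorkCallback(WorkCallback cb, void* userData) {
    g_callback = cb;
    g_userData = userData;
}

// Stand-in for the native task: reports progress once per step.
void DoWork(int steps) {
    for (int i = 1; i <= steps; ++i) {
        if (g_callback) g_callback(i, g_userData);
    }
}

// Registers the callback on construction and removes it on destruction,
// so the function pointer never outlives the call that needed it.
class ScopedCallback {
public:
    ScopedCallback(WorkCallback cb, void* userData) { SetWorkCallback(cb, userData); }
    ~ScopedCallback() { SetWorkCallback(nullptr, nullptr); }
    ScopedCallback(const ScopedCallback&) = delete;
    ScopedCallback& operator=(const ScopedCallback&) = delete;
};

// Wrapped routine: register, do the task, unregister. Returns the number
// of times the callback fired.
int RunTask(int steps) {
    int calls = 0;
    ScopedCallback guard(
        [](int, void* userData) { ++*static_cast<int*>(userData); },
        &calls);
    DoWork(steps);
    return calls;
}
```

After RunTask returns, the native side holds no pointer at all, which is the short-lived usage described above.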
As it turns out, the answer to the original question is rather simple, once you know it, and the whole callback issue was no issue. The input buffer parameter is replaced with parameter pair unsigned char *input, int input_length, and the output buffer parameter is replaced with parameter pair unsigned char **output, int *output_length. The C# delegate should be something like this
public delegate int CallbackDelegate(byte[] input, int input_length,
out byte[] output, out int output_length);
And wrapper in C++ should be something like this
void FunctionCalledFromLib(const std::vector<unsigned char>& input, std::vector<unsigned char>& output)
{
unsigned char *output_aux;
int output_length;
FunctionImplementedInCSharp(
&input[0], input.size(), &output_aux, &output_length);
output.assign(output_aux, output_aux + output_length);
CoTaskMemFree(output_aux); // IS THIS NECESSARY?
}
The last line is the last part of the mini-puzzle. Do I have to call CoTaskMemFree, or will the marshaller do it for me automagically?
As for the beautiful essay by plinth, I hope to bypass the whole problem by using a static function.
There is no point in using C++/CLI.
And here is a real world example from my project.
public ImageSurface(byte[] pngData)
: base(ConstructImageSurfaceFromPngData(pngData), true)
{
offset = 0;
}
private static int offset;
private static IntPtr ConstructImageSurfaceFromPngData(byte[] pngData)
{
NativeMethods.cairo_read_func_t func = delegate(IntPtr closure, IntPtr out_data, int length)
{
Marshal.Copy(pngData, offset, out_data, length);
offset += length;
return Status.Success;
};
return NativeMethods.cairo_image_surface_create_from_png_stream(func, IntPtr.Zero);
}
That is used to transfer PNG data from C# to the native cairo API.
You can see how the C function pointer cairo_read_func_t is implemented in C# and then used as a callback for cairo_image_surface_create_from_png_stream.
Here is a similar example.
The following code compiles without errors. Basically, the C#2005 Console application calls a VC++2005 class library, which in turn calls native VC++6 code. I get the following error when I run the C#2005 application:
"Unhandled Exception: System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt."
What is the cause of this error? And how to go about correcting it?
Edit1: It crashes at the line StdStringWrapper ssw = w.GetNext();
Edit2: I followed the advice of Naveen and used an integer index instead of iterators, and there are no more errors now. A big thanks to all who commented as well!
Code Written in C#2005 as Console Application:
class Program
{
static void Main(string[] args)
{
Class1 test= new Class1();
test.PerformAction();
test.PerformAction();
test.PerformAction();
test.PerformAction();
}
}
Code Written in VC++2005 as Class Library:
public ref class Class1
{
public:
void PerformAction();
};
void Class1::PerformAction()
{
DoSomethingClass d;
StdStringContainer w;
d.PerformAction(w);
for(int i=0; i<w.GetSize(); i++)
{
StdStringWrapper ssw = w.GetNext();
std::cout << ssw.CStr() << std::endl;
}
}
Code Written in VC++6 as Dynamic Link Library:
#ifdef NATIVECODE_EXPORTS
#define NATIVECODE_API __declspec(dllexport)
#else
#define NATIVECODE_API __declspec(dllimport)
#endif
class NATIVECODE_API StdStringWrapper
{
private:
std::string _s;
public:
StdStringWrapper();
StdStringWrapper(const char *s);
void Append(const char *s);
const char* CStr() const;
};
StdStringWrapper::StdStringWrapper()
{
}
StdStringWrapper::StdStringWrapper(const char *s)
{
_s.append(s);
}
void StdStringWrapper::Append(const char *s)
{
_s.append(s);
}
const char* StdStringWrapper::CStr() const
{
return _s.c_str();
}
//
class NATIVECODE_API StdStringContainer
{
private:
std::vector<StdStringWrapper> _items;
std::vector<StdStringWrapper>::iterator _it;
public:
void Add(const StdStringWrapper& item);
int GetSize() const;
StdStringWrapper& GetNext();
};
void StdStringContainer::Add(const StdStringWrapper &item)
{
_items.insert(_items.end(),item);
}
int StdStringContainer::GetSize() const
{
return _items.size();
}
StdStringWrapper& StdStringContainer::GetNext()
{
std::vector<StdStringWrapper>::iterator it = _it;
_it++;
return *it;
}
//
class NATIVECODE_API DoSomethingClass
{
public:
void PerformAction(StdStringContainer &s);
};
void DoSomethingClass::PerformAction(StdStringContainer &s)
{
StdStringWrapper w1;
w1.Append("This is string one");
s.Add(w1);
StdStringWrapper w2;
w2.Append("This is string two");
s.Add(w2);
}
The member _it in StdStringContainer is never initialized to point into the _items vector. This means it's an invalid iterator. When you assign _it to it in GetNext(), you've given it the invalid, uninitialized value that existed in _it. You then increment the uninitialized _it via _it++, which is what's triggering your fault.
As Stroustrup says in 19.2, an uninitialized iterator is an invalid iterator. This means that your uninitialized _it is invalid and that operations performed with it are undefined, and likely to cause dramatic failure.
Your problem is deeper, however. Iterators have a fundamentally different lifetime from the containers that they enumerate. There aren't really any "good" ways to do what you're trying to do with a single iterator member like this unless the container is immutable and initialized in the constructor.
If you can't expose the std:: namespace names, have you considered aliasing them via typedefs? What about your organization or project makes it impossible to expose the template classes?
The main problem from my point of view is that you are storing an iterator into a vector in your StdStringContainer class. Remember that whenever the vector resizes, all existing iterators are invalidated. So any insert into the vector may cause it to reallocate and leave your stored iterator invalid. If you then try to dereference it in GetNext(), it will access an invalid memory location. To check whether this is really the case, try reserving a relatively large initial capacity so that no resizing happens. You can do this with the reserve() method, which guarantees that the capacity() of the vector is greater than or equal to the reserved value.
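For comparison, here is a minimal sketch of the index-based alternative mentioned in Edit2 of the question. StringContainer is a simplified, hypothetical stand-in for StdStringContainer (it stores std::string directly and adds HasNext/Rewind helpers that the original class does not have); the point is only that a stored index stays valid across reallocations where a stored iterator does not.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for StdStringContainer that keeps an index cursor
// instead of a stored iterator.
class StringContainer {
public:
    void Add(const std::string& item) { items_.push_back(item); }

    std::size_t GetSize() const { return items_.size(); }

    bool HasNext() const { return next_ < items_.size(); }

    // Returns the item at the cursor and advances it.
    // Throws std::out_of_range if the cursor is already past the end.
    const std::string& GetNext() {
        const std::string& item = items_.at(next_);
        ++next_;
        return item;
    }

    // Moves the cursor back to the first item.
    void Rewind() { next_ = 0; }

private:
    std::vector<std::string> items_;
    std::size_t next_ = 0;  // always initialized, unlike the iterator member
};
```

The cursor survives any number of Add calls. The reference returned by GetNext is still invalidated by a later Add that reallocates, so callers should copy the string if they need to keep it.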
Sounds like you have a memory leak. I would suggest looking anywhere where there is pointer arithmetic, writing to memory, or array usage. Check for the bounds conditions in the array accessing.
Another issue: The leak may not even be in your code. If this is the case you'll have to exclude the library from your project.
My guess is, that you have the crash because std::string and std::vector in the interface between two C++ modules were compiled with different compilers and runtime libraries.
The memory layout of vector and string may have changed between VC6 and 2005.
When the 2005 DLL allocates objects of type StdStringContainer and StdStringWrapper, it does so based on the declarations of string and vector in the 2005 headers.
When member functions are called on these objects (which have been compiled with the VC6 compiler and libraries), they assume a different memory layout and fail with access violations.