How safe is ref when used with unsafe code? - c#

Using Microsoft Visual C# 2010, I recently noticed that you can pass objects by ref to unmanaged code. So I set myself the task of writing some unmanaged code that converts a C++ char* to a C# string using a callback to managed code. I made two attempts.
Attempt 1: Call an unmanaged function that stores a ref parameter. Then, once that function has returned to managed code, call another unmanaged function that invokes a callback that converts the char* to a managed string.
C++
typedef void (_stdcall* CallbackFunc)(void* ManagedString, char* UnmanagedString);
CallbackFunc UnmanagedToManaged = 0;
void* ManagedString = 0;
extern "C" __declspec(dllexport) void __stdcall StoreCallback(CallbackFunc X) {
UnmanagedToManaged = X;
}
extern "C" __declspec(dllexport) void __stdcall StoreManagedStringRef(void* X) {
ManagedString = X;
}
extern "C" __declspec(dllexport) void __stdcall CallCallback() {
UnmanagedToManaged(ManagedString, "This is an unmanaged string produced by unmanaged code");
}
C#
[DllImport("Name.dll", CallingConvention = CallingConvention.StdCall)]
public static extern void StoreCallback(CallbackFunc X);
[DllImport("Name.dll", CallingConvention = CallingConvention.StdCall)]
public static extern void StoreManagedStringRef(ref string X);
[DllImport("Name.dll", CallingConvention = CallingConvention.StdCall)]
public static extern void CallCallback();
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void CallbackFunc(ref string Managed, IntPtr Native);
static void Main(string[] args) {
string a = "This string should be replaced";
StoreCallback(UnmanagedToManaged);
StoreManagedStringRef(ref a);
CallCallback();
}
static void UnmanagedToManaged(ref string Managed, IntPtr Unmanaged) {
Managed = Marshal.PtrToStringAnsi(Unmanaged);
}
Attempt 2: Pass string ref to unmanaged function that passes the string ref to the managed callback.
C++
typedef void (_stdcall* CallbackFunc)(void* ManagedString, char* UnmanagedString);
CallbackFunc UnmanagedToManaged = 0;
extern "C" __declspec(dllexport) void __stdcall StoreCallback(CallbackFunc X) {
UnmanagedToManaged = X;
}
extern "C" __declspec(dllexport) void __stdcall DoEverything(void* X) {
UnmanagedToManaged(X, "This is an unmanaged string produced by unmanaged code");
}
C#
[DllImport("Name.dll", CallingConvention = CallingConvention.StdCall)]
public static extern void StoreCallback(CallbackFunc X);
[DllImport("Name.dll", CallingConvention = CallingConvention.StdCall)]
public static extern void DoEverything(ref string X);
[UnmanagedFunctionPointer(CallingConvention.StdCall)]
public delegate void CallbackFunc(ref string Managed, IntPtr Unmanaged);
static void Main(string[] args) {
string a = "This string should be replaced";
StoreCallback(UnmanagedToManaged);
DoEverything(ref a);
}
static void UnmanagedToManaged(ref string Managed, IntPtr Unmanaged) {
Managed = Marshal.PtrToStringAnsi(Unmanaged);
}
Attempt 1 doesn't work but attempt 2 does. In attempt 1 it seems that as soon as the unmanaged code returns after storing the ref, the ref becomes invalid. Why is this happening?
Given the outcome of attempt 1, I doubt that attempt 2 will work reliably. So how safe is a ref parameter once it reaches unmanaged code? In other words, what won't work in unmanaged code when using ref?
Things I'd like to know are:
What exactly happens when objects are passed using ref to unmanaged code?
Does it guarantee that the objects will stay at their current position in memory while the ref is being used in unmanaged code?
What are the limitations of ref (what can't I do with a ref) in unmanaged code?

A complete discussion of how p/invoke works is beyond the proper scope of a Stack Overflow Q&A. But briefly:
In neither of your examples are you really passing the address of your managed variable to the unmanaged code. The p/invoke layer includes marshaling logic that translates your managed data to something usable by the unmanaged code, and then translates back when the unmanaged code returns.
In both examples, the p/invoke layer has to create an intermediate object for the purpose of marshaling. In the first example, this object is gone by the time you call the unmanaged code again. In the second example it still exists, because all of the work happens within a single p/invoke call.
I believe that your second example should be safe to use. That is, the p/invoke layer is smart enough to handle ref correctly in that case. The first example is unreliable because p/invoke is being misused, not because of any fundamental limitation of ref parameters.
A couple of additional points:
I wouldn't use the word "unsafe" here. Yes, calling out to unmanaged code is in some ways unsafe, but in C# "unsafe" has a very specific meaning, related to the use of the unsafe keyword. I don't see anything in your code example that actually uses unsafe.
In both examples, you have a bug related to your use of the delegate passed to unmanaged code. In particular, while the p/invoke layer can translate your managed delegate reference to a function pointer that unmanaged code can use, it doesn't know anything about the lifetime of the delegate object. It will keep the object alive long enough for the p/invoked method call to complete, but if you need it to live longer than that (as would be the case here), you need to do that yourself. For example, use GC.KeepAlive() on a variable in which you've stored the reference. (You likely can reproduce a crash by inserting a call to GC.Collect() between the call to StoreCallback() and the later call to unmanaged code where the function pointer would be used).
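As a rough illustration of the point about marshaling temporaries: native code that needs data after a call returns can copy it during the call instead of keeping the pointer it was handed. The sketch below is C++ only and covers only the managed-to-native direction; StoreStringCopy and GetStoredString are hypothetical names, not part of the question's DLL.

```cpp
#include <string>

// Hypothetical native-side storage. It owns its own copy of the text, so it
// does not depend on the caller's buffer (which, under p/invoke, may be a
// temporary that disappears when the call returns).
static std::string g_stored;

// Copy the text while the incoming pointer is still valid.
extern "C" void StoreStringCopy(const char* text) {
    g_stored = (text != nullptr) ? text : "";
}

// Later calls read the copy, not the original pointer.
extern "C" const char* GetStoredString() {
    return g_stored.c_str();
}
```

The pointer returned by GetStoredString stays valid only until the next StoreStringCopy call, so a caller would still copy it promptly (for example with Marshal.PtrToStringAnsi).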

Related

Preserve C++ Pointer in C#

I am writing a little program in C# which uses a C++ DLL.
In C++, there are many classes which need to be instantiated and kept for later use.
This looks like the following function:
C++:
__declspec(dllexport) FrameCapture* GetFrameCapturer(HWND windowHandle) {
ProcessWindow* window = ProcessWindowCollection::GetInstance()->FindWindow(windowHandle);
FrameCapture* capture = new FrameCapture(window);
return capture;
}
As you can see I just create a FrameCapture class and return a Pointer to it.
This Pointer is stored in C# as an IntPtr.
C#:
[DllImport("<dllnamehere>")]
public static extern IntPtr GetFrameCapturer(IntPtr windowHandle);
This works really well so far.
But if I use that Pointer to get an Instance of FrameCapture
C++:
__declspec(dllexport) BITMAPFILEHEADER* GetBitmapFileHeader(FrameCapture* frameCapturer) {
return frameCapturer->GetBitmapFileHeader();
}
the class will be completely empty.
How do I get the Instance of the Class I initialized in step one?
EDIT:
I did some testing and replaced the pointers with integers, which are easier to inspect.
I cast 'capture' to an Int32 and returned this instead.
In my test case it returned byte(208,113,244,194).
These values are, as expected, the same in C++ and C#.
But now it gets odd.
If I pass this Int32 into 'GetBitmapFileHeader', the value suddenly becomes byte(184,231,223,55).
That's not even close! I thought of little-endian versus big-endian or something like that, but this is a whole new memory block?
The same behavior occurs with the IntPtr.
As requested I post also the Import of 'GetBitmapFileHeader'
[DllImport("<dllnamehere>")]
public static extern tagBITMAPFILEHEADER GetBitmapFileHeader(IntPtr capturerHandle);
Okay, I got it.
See this import from C#.
C#:
[DllImport("<dllnamehere>")]
public static extern tagBITMAPFILEHEADER GetBitmapFileHeader(IntPtr capturerHandle);
It's wrong!
The function now returns an IntPtr, which works completely fine.
This is the new setup:
C++:
__declspec(dllexport) void* __stdcall GetBitmapFileHeader(void* frameCapturer) {
FrameCapture* cap = (FrameCapture*)frameCapturer;
return cap->GetBitmapFileHeader();
}
C#:
[DllImport("libWinCap.dll", CallingConvention = CallingConvention.StdCall)]
public static extern IntPtr GetBitmapFileHeader(IntPtr frameCapturer);
[...]
//calling
IntPtr ptr = GetBitmapFileHeader(m_capturerHandle);
m_bitmap.m_fileHeader = (tagBITMAPFILEHEADER)Marshal.PtrToStructure(ptr, typeof(tagBITMAPFILEHEADER));
I am only moving pointers around now and use PtrToStructure to read the memory.
Also, thanks for every comment.
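The pattern the asker ended up with (only pointers cross the boundary, and managed code reads the memory afterwards) can be sketched on the native side roughly as follows. Capture, CreateCapture, GetCaptureInfo and DestroyCapture are hypothetical stand-ins for the FrameCapture API, which is not shown in full in the question.

```cpp
#include <cstdint>

// Hypothetical stand-in for the FrameCapture class in the question.
struct Capture {
    std::int32_t width;
    std::int32_t height;
};

// Create: hand the object out as an opaque pointer (an IntPtr on the C# side).
extern "C" void* CreateCapture(std::int32_t width, std::int32_t height) {
    return new Capture{width, height};
}

// Use: cast the opaque pointer back and return a pointer to data the object
// owns. Managed code would read it with Marshal.PtrToStructure.
extern "C" const Capture* GetCaptureInfo(void* handle) {
    return static_cast<const Capture*>(handle);
}

// Destroy: the side that allocated the object also frees it.
extern "C" void DestroyCapture(void* handle) {
    delete static_cast<Capture*>(handle);
}
```

Returning a pointer and declaring the import as IntPtr keeps both sides in agreement about what is returned, which is the mismatch the asker fixed.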

C# marshal native sized unsigned integer size_t, retrieve value via ref/out parameter

I have C# code that calls a C function exported from a native DLL (via DllImport).
I want the C code to modify a value of x parameter passed from C# and to use modified value in managed code. C function has to be a void returning function. C# code:
uint x=0;
Func(x);
C code:
void Func(size_t x)
{
x=8;
}
I tried:
[DllImport("1.dll")]
public static extern void Func(size_t x);
But after the C# calls Func var x is still 0. I also tried the following C code. But it doesn't work either. What is my error?
void Func(size_t* x)
{
x=8;
}
Your example has several problems that must be solved before it works as expected. As I understand it, your goal is to retrieve a value of the C type size_t that a void-returning C function sets through a parameter.
The first, simple problem is retrieving a value through a parameter. You can solve it either by using pointers on both the C# and C sides, or by combining a C# parameter modifier (ref or out), which makes the marshaler pass the C# argument as a pointer, with a pointer parameter in C. The function signatures are as follows:
// Implementation with pointers
[DllImport("MyC.dll", CallingConvention = CallingConvention.Cdecl, PreserveSig = true, EntryPoint = "Func")]
public static extern unsafe void CSharpFuncPtr(UIntPtr* x);
// Implementation with parameter modifiers - ref can be replaced by out
[DllImport("MyC.dll", CallingConvention = CallingConvention.Cdecl, PreserveSig = true, EntryPoint = "Func")]
public static extern void CSharpFuncMod(ref UIntPtr x);
// C function implementation
void Func(size_t* x) { *x = 7; }
The second, more important problem that has to be solved during marshalling is the use of the native-sized unsigned integer size_t as the C type. It is a 32-bit unsigned integer on x86 architectures and a 64-bit unsigned integer on x64 architectures (I intentionally skip all other processor architectures). The .NET type system lacks native-sized integral types. The workaround is to use the managed unsigned pointer type UIntPtr to emulate size_t in the managed type system.
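To make the two points so far concrete on the native side, here is a small C++ sketch. SetByValue and SetByPointer are hypothetical names, and the static_assert encodes an assumption that holds on mainstream x86 and x64 targets (size_t is as wide as a pointer), which is why UIntPtr is a reasonable match there.

```cpp
#include <cstddef>

// Assumption: size_t is pointer-sized on this target, matching UIntPtr.
static_assert(sizeof(std::size_t) == sizeof(void*),
              "size_t is expected to be pointer-sized on this target");

// By value: the callee modifies only its own copy; the caller sees nothing.
void SetByValue(std::size_t x) {
    x = 8;
    (void)x;  // silence "set but not used" warnings
}

// By pointer: the callee writes through the pointer, so the caller's
// variable changes. Note the dereference; assigning to x itself would only
// change the local pointer.
void SetByPointer(std::size_t* x) {
    if (x != nullptr) {
        *x = 8;
    }
}
```

The second attempt in the question assigns to the pointer (x=8) rather than through it (*x=8), which is why it had no visible effect either.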
The .NET team is aware of this limitation, and there are discussions and plans to implement so-called native integers; see: Support natural size data types in the CLR. With such a type, the declaration could look like this:
[DllImport("MyC.dll", CallingConvention = CallingConvention.Cdecl, PreserveSig = true, EntryPoint = "Func")]
public static extern void CSharpFuncNativeInt(ref nuint x);
In the end, a problem that looks easy on the surface is not that simple, and a simple, elegant solution even requires changes to the .NET runtime and the BCL.
If you want to use pointers to and from unmanaged code, you need to use the ref qualifier.
Like:
[DllImport("1.dll", CallingConvention = CallingConvention.Cdecl)]
public static extern void Func(ref int x);
static void Main(string[] args)
{
int value = 0;
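// Note: this boxes a copy of value and pins the box, not the local itself;
// the call below works the same without it.
GCHandle handle = GCHandle.Alloc(value, GCHandleType.Pinned);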
Func(ref value);
handle.Free();
Console.WriteLine(value);
}
external code:
extern "C" __declspec(dllexport) void Func(int * nStatus)
{
*nStatus = 10;
}
This is because in C# you normally can't use pointers, because of the memory management (the GC). You have to "force it" using ref parameters or unsafe code.
P.S.
It works without the GCHandle.Alloc call. For an int passed by ref, as here, the marshaler keeps the address valid for the duration of the call, so you don't need it. Pinning becomes useful when unmanaged code holds a pointer into a managed object (a class rather than a struct) for longer than a single call, because otherwise the GC may move the object while the unmanaged code is still using it.

System.AccessViolationException while marshaling from native to managed by implementing ICustomMarshaler

This is something of a continuation of the question I recently posted. I get a nasty System.AccessViolationException while trying to marshal from native to managed (using ICustomMarshaler), which I do not understand. Here is sample code that reproduces the error (**). The C++ side:
typedef struct Nested1{
int32_t n1_a; // (4*)
char* n1_b;
char* n1_c;
} Nested1;
typedef struct Nested3{
uint8_t n3_a;
int64_t n3_b;
wchar_t* n3_c;
} Nested3;
typedef struct Nested2{
int32_t n2_a;
Nested3 nest3;
uint32_t n2_b;
uint32_t n2_c;
} Nested2;
typedef struct TestStruct{
Nested1 nest1; // (2*)
Nested2 nest2;
} TestStruct;
void ReadTest(TestStruct& ts)
{
ts.nest2.n2_c = 10; // (3*)
}
On the C# side a fake TestStruct just to show the error and the ICustomMarshaler implementation:
class TestStruct{};
[DllImport("MyIOlib.dll", CallingConvention = CallingConvention.Cdecl)]
extern static void ReadTest([Out, MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef = typeof(CustomMarshaler))]TestStruct ts);
class CustomMarshaler : ICustomMarshaler
{
public static ICustomMarshaler GetInstance(string Cookie) { return new CustomMarshaler(); }
public object MarshalNativeToManaged(IntPtr pNativeData)
{
return new TestStruct();
}
public void CleanUpNativeData(IntPtr pNativeData)
{
} // (1*)
public int GetNativeDataSize() { return 40; }
public IntPtr MarshalManagedToNative(object ManagedObj)
{
TestStruct ts = (TestStruct)ManagedObj;
IntPtr intPtr = Marshal.AllocHGlobal(GetNativeDataSize());
return intPtr;
}
}
private void Form1_Load(object sender, EventArgs e)
{
TestStruct ts = new TestStruct();
ReadTest(ts);
}
Now, I have the following:
with exactly this code I get a System.AccessViolationException just after line (1*);
if I comment out line (2*) or line (3*) I get no exception and everything works fine;
if I comment out one of several other struct fields, e.g. line (4*), I get a "Managed Debugging Assistant 'FatalExecutionEngineError' has detected a problem in [...] This error may be a bug in the CLR or in the unsafe or non verifiable portions of user code. Common sources of this bug include user marshaling errors for COM-interop or PInvoke, which may corrupt the stack"
(**) I heavily edited my original post because I think I've found an easier way to show the problem, and leaving my previous text would have confused the reader. I hope that is not a problem; if previous readers want, I can re-post my original text.
You need the In attribute as well as Out. Without the In attribute, MarshalManagedToNative is never called. And no unmanaged memory is allocated. Hence the access violation.
extern static void ReadTest(
[In, Out, MarshalAs(UnmanagedType.CustomMarshaler,
MarshalTypeRef = typeof(CustomMarshaler))]
TestStruct ts
);
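A related thing worth checking, as a side note rather than as part of the answer above: the question hard-codes GetNativeDataSize() as 40, and whether that matches the native layout depends on pointer size and padding. One hedged way to avoid guessing is to let the native side report the size. GetTestStructSize is a hypothetical export; the structs repeat the question's definitions.

```cpp
#include <cstddef>
#include <cstdint>

// Same layout as in the question.
struct Nested1 { std::int32_t n1_a; char* n1_b; char* n1_c; };
struct Nested3 { std::uint8_t n3_a; std::int64_t n3_b; wchar_t* n3_c; };
struct Nested2 {
    std::int32_t n2_a;
    Nested3 nest3;
    std::uint32_t n2_b;
    std::uint32_t n2_c;
};
struct TestStruct { Nested1 nest1; Nested2 nest2; };

// Hypothetical export: reports the size the compiler actually uses, including
// padding, so managed code does not have to hard-code it.
extern "C" std::size_t GetTestStructSize() {
    return sizeof(TestStruct);
}
```

On the C# side this would be imported as returning UIntPtr, and the result used for the AllocHGlobal call instead of a constant.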
Strictly speaking, the unmanaged code should use a pointer to the struct rather than a reference parameter. You can pass null from the managed code but that is then invalid as a C++ reference parameter.
void ReadTest(TestStruct* ts)
{
ts->nest2.n2_c = 10;
}
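With a pointer parameter the native function can also defend against a null argument, which a reference parameter cannot express. A minimal sketch, using cut-down hypothetical structs (Inner, Outer) and a hypothetical name ReadTestChecked rather than the question's full types:

```cpp
#include <cstdint>

// Cut-down stand-ins for the nested structs in the question.
struct Inner { std::uint32_t c; };
struct Outer { Inner inner; };

// Returns 1 on success and 0 if the caller passed a null pointer.
// An int is used instead of bool to keep the return type simple to marshal.
extern "C" int ReadTestChecked(Outer* ts) {
    if (ts == nullptr) {
        return 0;
    }
    ts->inner.c = 10;
    return 1;
}
```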
David Haffernan's reply (thanks a lot, by the way!) is correct, so read that for the quick answer. Here I only add a few considerations that might be helpful for readers (I'm not an expert on the matter, so please take them with a pinch of salt). When MarshalManagedToNative is called to pass the struct to the native code (only the [In] attribute), the code is something like:
public IntPtr MarshalManagedToNative(object managedObj)
{
IntPtr intPtr = MarshalUtils.AllocHGlobal(GetNativeDataSize());
// ...
// use Marshal.WriteXXX to write struct fields and, for arrays,
// Marshal.WriteIntPtr to write a pointer to unmanaged memory
// allocated with Marshal.AllocHGlobal(size)
return intPtr;
}
Now, when the struct has to be read from the native code, as David Haffernan said, we still need to allocate memory (it cannot be done on the native side, as Hans Passant also suggested), hence the need to add the [In] attribute in this case too (alongside the [Out] one).
However, I need only the memory for the first-level struct, not the memory for storing the arrays (I have several memory-consuming arrays). That memory is allocated on the native side (e.g. with malloc()) and should be freed later (with a call to a method that uses free() to reclaim the native memory). So I detect that MarshalManagedToNative is being used for that purpose through a public static variable in CustomMarshaler, which I set once I know I need to read the struct from the native code (I understand it is not an elegant solution, but it saves me time):
public IntPtr MarshalManagedToNative(object managedObj)
{
IntPtr intPtr = MarshalUtils.AllocHGlobal(GetNativeDataSize());
if(readingFromNative) // this is my public static variable
return intPtr;
// ...
// use Marshal.WriteXXX to write struct fields and, for arrays, Marshal.WriteIntPtr
// to write a pointer to unmanaged memory allocated with Marshal.AllocHGlobal(size)
return intPtr;
}

Return contents of a std::wstring from C++ into C#

I have an unmanaged C++ DLL that I have wrapped with a simple C interface so I can call PInvoke on it from C#. Here is an example method in the C wrapper:
const wchar_t* getMyString()
{
// Assume that someWideString is a std::wstring that will remain
// in memory for the life of the incoming calls.
return someWideString.c_str();
}
Here is my C# DLLImport setup.
[DllImport( "my.dll", CharSet = CharSet.Unicode, CallingConvention = CallingConvention.Cdecl )]
private static extern string GetMyString();
However, the string is not marshalled correctly: often the first character is corrupted, and sometimes the result is way off, showing a bunch of Chinese characters instead. I have logged output from the implementation on the C side to confirm that the std::wstring is correctly formed.
I have also tried changing the DLLImport to return an IntPtr and convert with a wrapped method using Marshal.PtrToStringUni and it has the same result.
[DllImport( "my.dll", CallingConvention = CallingConvention.Cdecl )]
private static extern IntPtr GetMyString();
public string GetMyStringMarshal()
{
return Marshal.PtrToStringUni( GetMyString() );
}
Any ideas?
Update with Answer
So as mentioned below, this is not really an issue with my bindings but the lifetime of my wchar_t*. My written assumption was wrong, someWideString was in fact being copied during my calls to the rest of the application. Therefore it existed only on the stack and was being let go before my C# code could finish marshalling it.
The correct solution is to either pass a pointer in to my method as described by shf301, or make sure my wchar_t* reference does not get moved / reallocated / destroyed before my C# interface has time to copy it.
Returning the std::wstring down to my C layer as a "const std::wstring&" means my call to c_str() will return a pointer that won't be immediately deallocated outside the scope of my C method.
The calling C# code then needs to use Marshal.PtrToStringUni() to copy data from the reference into a managed string.
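The lifetime issue described in this update can be sketched in C++ as follows. Settings, Name, NameCopy and GetNamePointer are hypothetical names chosen for illustration; the point is only the difference between returning a reference to a stored string and returning a copy.

```cpp
#include <string>
#include <utility>

class Settings {
public:
    explicit Settings(std::wstring name) : name_(std::move(name)) {}

    // Returns a reference to the stored string: no temporary is created.
    const std::wstring& Name() const { return name_; }

    // Returns a copy: the result is a temporary owned by the caller.
    std::wstring NameCopy() const { return name_; }

private:
    std::wstring name_;
};

// Safe as long as `s` outlives the use of the pointer and is not modified.
const wchar_t* GetNamePointer(const Settings& s) {
    return s.Name().c_str();
}

// By contrast, `return s.NameCopy().c_str();` would return a pointer into a
// temporary that is destroyed at the end of that statement.
```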
You are going to have to rewrite your getMyString function for the reasons mentioned in Hans Passant's answer.
You need to have the C# code pass a buffer in to your C++ code. That way your code (OK, the CLR marshaller) controls the lifetime of the buffer and you don't get into any undefined behavior.
Below is an implementation:
C++
void getMyString(wchar_t *str, int len)
{
wcscpy_s(str, len, someWideString.c_str());
}
C#
[DllImport( "my.dll", CallingConvention = CallingConvention.Cdecl, CharSet = CharSet.Unicode )]
private static extern void GetMyString(StringBuilder str, int len);
public string GetMyStringMarshal()
{
// StringBuilder (System.Text) is the mutable buffer type the marshaller fills in
StringBuilder buffer = new StringBuilder(255);
GetMyString(buffer, buffer.Capacity);
return buffer.ToString();
}
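The native half of this caller-supplied-buffer pattern can also be written without wcscpy_s, which is not available on every toolchain. The following is a sketch under assumptions: kSource stands in for someWideString, and CopyMyString is a hypothetical name. It truncates rather than failing when the buffer is too small and returns the length needed, so the caller can detect truncation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical stand-in for the question's someWideString.
static const std::wstring kSource = L"hello";

// Copies at most len - 1 characters into str and always NUL-terminates when
// len > 0. Returns the number of characters needed, excluding the terminator.
extern "C" int CopyMyString(wchar_t* str, int len) {
    const std::size_t needed = kSource.size();
    if (str != nullptr && len > 0) {
        const std::size_t n =
            std::min(needed, static_cast<std::size_t>(len - 1));
        std::wmemcpy(str, kSource.data(), n);
        str[n] = L'\0';
    }
    return static_cast<int>(needed);
}
```

A caller that receives a return value greater than or equal to len knows the text was truncated and can retry with a larger buffer.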
You need to specify MarshalAs attribute for the return value:
[DllImport( "my.dll", CharSet = CharSet.Unicode, CallingConvention = CallingConvention.Cdecl)]
[return : MarshalAs(UnmanagedType.LPWStr)]
private static extern string GetMyString();
Make sure the function is indeed cdecl and that the wstring object is not destroyed when the function returns.

How to specify whether to take ownership of the marshalled string or not?

Suppose I have x.dll in C++ which looks like this:
MYDLLEXPORT
const char* f1()
{
return "Hello";
}
MYDLLEXPORT
const char* f2()
{
char* p = new char[20];
strcpy(p, "Hello");
return p;
}
Now, suppose I want to use this in C#
[DllImport("x.dll")]
public static extern string f1();
[DllImport("x.dll")]
public static extern string f2();
Is there any way to tell the CLR to take ownership of the string returned from f2, but not the one from f1? The problem is that having the string returned from f1 eventually freed, deleted, or whatever by the GC is just as bad as having the string returned from f2 never freed. I hope the question is clear. Thanks in advance.
If you have any influence at all over the dll implementation, then I strongly suggest you simply don't do it like you showed in your example. Otherwise, please refine the question to mention that constraint.
If you have to return a heap-allocated string from the dll, then you should also provide a cleanup function (always good practice when exporting dynamically allocated memory from a dll). You P/Invoke the allocating function with a return type of IntPtr, marshal that with one of the Marshal.PtrToString... methods (see http://msdn.microsoft.com/en-us/library/atxe881w.aspx), and finish by calling the cleanup function for the native side of things.
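The allocate-plus-cleanup pairing described above might look like this on the native side. CreateGreeting and FreeGreeting are hypothetical names; the essential point is that the memory is released by the same module, with the deallocator that matches the allocation (delete[] for new[]).

```cpp
#include <cstring>

// Allocates a string on the native heap and returns it to the caller.
// The caller must hand it back to FreeGreeting exactly once.
extern "C" char* CreateGreeting() {
    const char text[] = "Hello";
    char* p = new char[sizeof(text)];    // sizeof includes the terminator
    std::memcpy(p, text, sizeof(text));
    return p;
}

// Releases memory obtained from CreateGreeting. Passing null is harmless.
extern "C" void FreeGreeting(char* p) {
    delete[] p;
}
```

On the managed side both functions would be imported with IntPtr, with Marshal.PtrToStringAnsi called between them to copy the text into a managed string.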
Another way is to use BSTR (example from Marshaling BSTRs in COM/Interop or P/Invoke):
Native:
__declspec(dllexport)
void bstrtest(BSTR *x)
{
*x = SysAllocString(L"Something");
}
Managed:
[DllImport("mydll.dll")]
extern static void bstrtest(ref IntPtr dummy);
static void Main(string[] args)
{
var bstr = IntPtr.Zero;
bstrtest(ref bstr);
var text = Marshal.PtrToStringBSTR(bstr);
Console.WriteLine(text);
Marshal.FreeBSTR(bstr);
}
I just found a similar question on SO: PInvoke for C function that returns char *
