Problem with Interop C#/C: AccessViolationException - c#

and thanks in advice for any help.
i have this trivial function in C:
__declspec(dllexport) Point* createPoint (int x, int y) {
Point *p;
p = (Point*) malloc(sizeof(Point));
p->x = x;
p->y=y;
return p;
}
Point is a very simple struct with two int fields, x and y.
I would like calling this function from C#.
I use this code:
[DllImport("simpleC.dll", EntryPoint = "createPoint", CallingConvention = CallingConvention.Cdecl, SetLastError = true, CharSet = CharSet.Auto)]
[return: MarshalAs(UnmanagedType.LPStruct)]
public static extern Point createPoint(int x, int y);
Point p = Wrapper.createPoint(1, 2);
But at runtime I have an AccessViolationException. Watching exception in detail, I found that exception is thrown from Marshal.CoTaskMemFree(IntPtr) method.
It seems that this method is unable to free memory allocated by C malloc.
What am i doing wrong?
Really thanks.

CoTaskMemFree cannot be used to free memory allocated by malloc (because they use different allocators). According to MSDN, "The runtime always uses the CoTaskMemFree method to free memory. If the memory you are working with was not allocated with the CoTaskMemAlloc method, you must use an IntPtr and free the memory manually using the appropriate method."
Additionally, Adam Nathan notes that "UnmanagedType.LPStruct is only supported for one specific case: treating a System.Guid value type as an unmanaged GUID with an extra level of indirection. ... You should probably just stay away from UnmanagedType.LPStruct."
There are two possible solutions:
Declare the return type of the method as IntPtr and use Marshal.ReadInt32 to read the fields of the struct, or use Marshal.PtrToStructure to copy the data to a managed struct, or use unsafe code to cast the IntPtr value to a Point *. The C library will need to expose a destroyPoint(Point *) method that frees the memory.
Change the C method signature to void getPoint(int x, int y, Point *). This lets C# allocate the struct, and the C method simply fills in the data values. (Most of the Win32 APIs are defined this way).
One final note: Unless your method uses the SetLastError Win32 API, you don't need to specify SetLastError = true on your P/Invoke attribute.

Since you don't have the code that frees "p", it is hard to say. However it is likely that the way malloc() and free() work together is completely different to the way C# manages memory. Since C# has garbage collection (I believe) it is likely that it uses a completely different memory management system.
In any case, the correct solution is that if you use your library to create an object, you should also use it to destroy it. Implement a "destroyPoint" function that frees the memory in your C library, import it to the C# code, and call it from there to destroy the objects created by your C library.
As a general design/coding rule, every "create" function should have a matching "free/destroy/delete" function. Apart from nothing else, it makes it easy to ensure that all created items get properly destroyed.

How is the Point type defined on the C# side?
It has to be unsafe, or you need to return a void pointer (IntPtr). The GC is not able to count references from outside (here the allocated memory), thus your code can not expect to manage externally allocated memory via the GC. One alternative is to keep a static reference to avoid a Garbage collection, if you need to keep the object persistently during the runtime of your application.

Related

C# marshal unmanaged pointer return type

I have an unmanaged library which has a function like this:
type* foo();
foo basically allocates an instance of the unmanaged type on the managed heap through Marshal.AllocHGlobal.
I have a managed version of type. It's not blittable but I have MarshalAs attributes set on members so I can use Marshal.PtrToStructure to get a managed version of it. But having to wrap calls to foo with extra bookkeeping to call Marshal.PtrToStructure is a bit annoying.
I'd like to be able to do something like this on the C# side:
[DllImport("mylib", CallingConvention = CallingConvention.Cdecl)]
[return: MarshalAs(UnmanagedType.LPStruct)]
type* foo();
and have C#'s marshaller handle the conversion behind the scenes, like it does for function arguments. I thought I should be able to do this because type is allocated on the managed heap. But maybe I can't? Is there any way to have C#'s inbuilt marshaller handle the unmanaged-to-managed transition on the return type for me without having to manually call Marshal.PtrToStructure?
A custom marshaler works fine if, on the .NET side, typeis declared as a class, not as a struct.
This is clearly stated in UnmanagedType enumeration:
Specifies the custom marshaler class when used with the
MarshalAsAttribute.MarshalType or MarshalAsAttribute.MarshalTypeRef
field. The MarshalAsAttribute.MarshalCookie field can be used to pass
additional information to the custom marshaler. You can use this
member on any reference type.
Here is some sample code that should work fine
[[DllImport("mylib", CallingConvention = CallingConvention.Cdecl)]
[return : MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef= typeof(typeMarshaler))]
private static extern type Foo();
private class typeMarshaler : ICustomMarshaler
{
public static readonly typeMarshaler Instance = new typeMarshaler();
public static ICustomMarshaler GetInstance(string cookie) => Instance;
public int GetNativeDataSize() => -1;
public object MarshalNativeToManaged(IntPtr nativeData) => Marshal.PtrToStructure<type>(nativeData);
// in this sample I suppose the native side uses GlobalAlloc (or LocalAlloc)
// but you can use any allocation library provided you use the same on both sides
public void CleanUpNativeData(IntPtr nativeData) => Marshal.FreeHGlobal(nativeData);
public IntPtr MarshalManagedToNative(object managedObj) => throw new NotImplementedException();
public void CleanUpManagedData(object managedObj) => throw new NotImplementedException();
}
[StructLayout(LayoutKind.Sequential)]
class type
{
/* declare fields */
};
Of course, changing unmanaged struct declarations into classes can have deep implications (that may not always raise compile-time errors), especially if you have a lot of existing code.
Another solution is to use Roslyn to parse your code, extract all Foo-like methods and generate one additional .NET method for each. I would do this.
type* foo()
This is very awkward function signature, hard to use correctly in a C or C++ program and that never gets better when you pinvoke. Memory management is the biggest problem, you want to work with the programmer that wrote this code to make it better.
Your preferred signature should resemble int foo(type* arg, size_t size). In other words, the caller supplies the memory and the native function fills it in. The size argument is required to avoid memory corruption, necessary when the version of type changes and gets larger. Often included as a field of type. The int return value is useful to return an error code so you can fail gracefully. Beyond making it safe, it is also much more efficient since no memory allocation is required at all. You can simply pass a local variable.
... allocates an instance of the unmanaged type on the managed heap through Marshal.AllocHGlobal
No, this is where memory management assumptions get very dangerous. Never the managed heap, native code has no decent way to call into the CLR. And you cannot assume that it used the equivalent of Marshal.AllocHGlobal(). The native code typically uses malloc() to allocate the storage, which heap is used to allocate from is an implementation detail of the CRT it links. Only that CRT's free() function is guaranteed to release it reliably. You cannot call free() yourself. Skip to the bottom to see why AllocHGlobal() appeared to be correct.
There are function signatures that forces the pinvoke marshaller to release the memory, it does so by calling Marshal.FreeCoTaskMem(). Note that this is not equivalent to Marshal.AllocHGlobal(), it uses a different heap. It assumes that the native code was written to support interop well and used CoTaskMemAlloc(), it uses the heap that is dedicated to COM interop.
It's not blittable but I have MarshalAs attributes set...
That is the gritty detail that explains why you have to make it awkward. The pinvoke marshaller does not want to solve this problem since it has to marshal a copy and there is too much risk automatically releasing the storage for the object and its members. Using [MarshalAs] is unnecessary and does not make the code better, simply change the return type to IntPtr. Ready to pass to Marshal.PtrToStructure() and whatever memory release function you need.
I have to talk about the reason that Marshal.AllocHGlobal() appeared to be correct. It did not used to be, but has changed in recent Windows and VS versions. There was a big design change in Win8 and VS2012. The OS no longer creates separate heaps that Marshal.AllocHGlobal and Marshal.AllocCoTaskMem allocate from. It is now a single heap, the default process heap (GetProcessHeap() returns it). And there was a corresponding change in the CRT included with VS2012, it now also uses GetProcessHeap() instead of creating its own heap with HeapCreate().
Very big change and not publicized widely. Microsoft has not released any motivation for this that I know of, I assume that the basic reason was WinRT (aka UWP), lots of memory management nastiness to get C++, C# and Javascript code to work together seamlessly. This is quite convenient to everybody that has to write interop code, you can now assume that Marshal.FreeHGlobal() gets the job done. Or Marshal.FreeCoTaskMem() like the pinvoke marshaller uses. Or free() like the native code would use, no difference anymore.
But also a significant risk, you can no longer assume that the code is bug-free when it works well on your dev machine and must re-test on Win7. You get an AccessViolationException if you guessed wrong about the release function. It is worse if you also have to support XP or Win2003, no crash at all but you'll silently leak memory. Very hard to deal with that when it happens since you can't get ahead without changing the native code. Best to get it right early.

P/Invoke AccessViolationException

I have an unmanaged function call that is throwing this exception when I try to pass it the path to a file name.
I've read that this is likely caused by the DLL itself but I don't think that is the case since the DLL is used in another application, so the problem is likely in my method calling the function.
The specification:
libvlc_media_new_path (libvlc_instance_t *p_instance, const char *path)
Description:
p_instance the instance
path local filesystem path
And my method:
[DllImport("libvlc", EntryPoint = "libvlc_media_new_path")]
public static extern IntPtr NewMedia(IntPtr instance,
[MarshalAs(UnmanagedType.LPStr)] string path);
I think I'm missing the convention call but what would that likely be? Or would it be something else causing this exception?
EDIT: Based on some comments I did some poking around and found... well, nothing. The struct for the instance is opaque, which means I have no idea in Laymans terms. My guess is that it means you don't need to reconstruct it in the application that is using it?
In a blind guess based on this, I replaced the return value that I had been using with the function responsible for setting the *p_instance value to a long instead of an IntPtr since when it was an IntPtr it was returning 0, and with a long I was seeing a value. Again, what an IntPtr is I don't really know. I was pretty happy to see something not 0 in the instance variable but when I ran it past that, it errored out again.
EDIT: I've expanded the question to here.
Based on the exception you're seeing and the declaration you've provided for the native function,
libvlc_media_new_path (libvlc_instance_t *p_instance, const char *path)
your p/invoke declaration is incorrect. You've mismatched the calling conventions. The default for the .NET p/invoke system is stdcall (to match the Windows API functions), but the default for C and C++ code is cdecl. You have to tell .NET explicitly that your function uses the cdecl calling convention.
So change it to look like this:
[DllImport("libvlc", EntryPoint = "libvlc_media_new_path", CallingConvention = CallingConvention.Cdecl)]
public static extern IntPtr NewMedia(IntPtr instance,
[MarshalAs(UnmanagedType.LPStr)] string path);
Of course, I'm guessing that you're right about the return value being a pointer. The native function declaration you've shown is missing the return type.
As for your question about the instance parameter, and whether you are correctly using the IntPtr type: The parameter is a pointer to a libvlc_instance_t, so you have two basic ways of making that work using p/invoke. First is to declare the parameter as an IntPtr, which gets it marshalled like a raw pointer value. This is not particularly useful for cases where the pointer needs to be anything other than opaque (i.e. retrieved from one native function, stored, and then passed to another native function). Second is to declare a managed structure that mirrors the native structure, and then write the p/invoke declaration to use this structure so that the marshaller will handle things automatically. This is most useful if you actually need to interact with the values stored in the structure pointed to by the pointer.
In this case, after a Google search, it looks like you're using one of the VLC APIs. Specifically this one. That also tells us what an libvlc_instance_t is: it is an opaque structure that represents a libvlc instance. So declaring a managed structure is not an option here, because the structure is treated as opaque even by the native code. All you really need is the pointer, passed back and forth; a perfect case for the first method I talked about above. So the declaration shown above is your winner.
The only battle now is obtaining a valid pointer to a libvlc instance that you can pass to the function whenever you call it. Chances are good that will come from a prior call to a function like libvlc_new, which is documented as creating and intializing a new libvlc instance. Its return value is exactly what you need here. So unless you've already created a libvlc instance (in which case, use that pointer), you will also need to call this function and store its result.
If the documentation is correct about the required values for the libvlc_new function's parameters, you can declare it very simply:
[DllImport("libvlc", EntryPoint = "libvlc_new", CallingConvention = CallingConvention.Cdecl)]
public static extern IntPtr NewCore(int argc, IntPtr argv);
And call it thus:
IntPtr pLibVlc = NewCore(0, IntPtr.Zero);
// now pLibVlc should be non-zero
And of course, I know nothing about the VLC APIs, but my general knowledge of API design tells me that you will probably need to call the libvlc_release function with that same pointer to an instance once you're finished with it.
Try without the [MarshalAs(UnmanagedType.LPStr)], usually works for me.

Is it safe to keep C++ pointers in C#?

I'm currently working on some C#/C++ code which makes use of invoke. In the C++ side there is a std::vector full of pointers each identified by index from the C# code, for example a function declaration would look like this:
void SetName(char* name, int idx)
But now I'm thinking, since I'm working with pointers couldn't I sent to C# the pointer address itself then in code I could do something like this:
void SetName(char*name, int ptr)
{
((TypeName*)ptr)->name = name;
}
Obviously that's a quick version of what I'm getting at (and probably won't compile).
Would the pointer address be guaranteed to stay constant in C++ such that I can safely store its address in C# or would this be too unstable or dangerous for some reason?
In C#, you don't need to use a pointer here, you can just use a plain C# string.
[DllImport(...)]
extern void SetName(string name, int id);
This works because the default behavior of strings in p/invoke is to use MarshalAs(UnmanagedType.LPStr), which converts to a C-style char*. You can mark each argument in the C# declaration explicitly if it requires some other way of being marshalled, eg, [MarshalAs(UnmanagedType.LPWStr)], for an arg that uses a 2-byte per character string.
The only reason to use pointers is if you need to retain access to the data pointed to after you've called the function. Even then, you can use out parameters most of the time.
You can p/invoke basically anything without requiring pointers at all (and thus without requiring unsafe code, which requires privileged execution in some environments).
Yes, no problem. Native memory allocations never move so storing the pointer in an IntPtr on the C# side is fine. You need some kind of pinvoked function that returns this pointer, then
[DllImport("something.dll", CharSet = CharSet.Ansi)]
void SetName(IntPtr vector, string name, int index);
Which intentionally lies about this C++ function:
void SetName(std::vector<std::string>* vect, const char* name, int index) {
std::string copy = name;
(*vect)[index] = copy;
}
Note the usage of new in the C++ code, you have to copy the string. The passed name argument points to a buffer allocated by the pinvoke marshaller and is only valid for the duration of the function body. Your original code cannot work. If you intend to return pointers to vector<> elements then be very careful. A vector re-allocates its internal array when you add elements. Such a returned pointer will then become invalid and you'll corrupt the heap when you use it later. The exact same thing happens with a C# List<> but without the risk of dangling pointers.
I think it's stable till you command C++ code and perfectly aware what he does, and other developers that work on the same code know about that danger too.
So by my opinion, it's not very secure way of architecture, and I would avoid it as much as I can.
Regards.
The C# GC moves things, but the C++ heap does not move anything- a pointer to an allocated object is guaranteed to remain valid until you delete it. The best architecture for this situation is just to send the pointer to C# as an IntPtr and then take it back in C++.
It's certainly a vastly, incredibly better idea than the incredibly BAD, HORRIFIC integer cast you've got going there.

Will struct modifications in C# affect unmanaged memory?

My gut reaction is no, because managed and unmanaged memory are distinct, but I'm not sure if the .NET Framework is doing something with Marshaling behind the scenes.
What I believe happens is:
When getting a struct from my unmanaged DLL, it is the same as making that call gets an IntPtr and then uses it and the Marshal class to copy the struct into managed memory (and changes made to the struct in managed memory do not bubble up).
I can't seem to find this documented anywhere on MSDN. Any links would be appreciated.
Here is what my code looks like:
[DllImport("mydll.dll", BestFitMapping=false, CharSet=CharSet.Ansi)]
private static extern int GetStruct(ref MyStruct s);
[StructLayout(LayoutKind.Sequential, Pack=0)]
struct MyStruct
{
public int Field1;
public IntPtr Field2;
}
public void DoSomething()
{
MyStruct s = new MyStruct();
GetStruct(ref s);
s.Field1 = 100; //does unmanaged memory now have 100 in Field1 as well?
s.Field2 = IntPtr.Zero; //does unmanaged memory now have a NULL pointer in field Field2 as well?
}
No, the P/Invoke marshaller copied the unmanaged structure member values into the managed version of the structure. In general, the managed version of a structure is not in any way compatible with the unmanaged version of it. The memory layout is not discoverable, something the CLR uses to reorder fields to make the structure smaller. Marshaling is essential, you have to create a copy.
Modifying the structure is not possible with the given function signature since you let fill in the memory that's passed to it. The function itself already copies the structure. You can however party on the Field2 value since it is a raw pointer. If that points to a structure then marshal it yourself with Marshal.PtrToStructure(). Modify the managed copy of it and copy it back to unmanaged memory with Marshal.StructureToPtr(). Or access it directly with Marshal.ReadXxx() and WriteXxx().
CSharp Language Specification.doc pg 26
Struct constructors are invoked with the new operator, but that does not imply that memory is being allocated. Instead of dynamically allocating an object and returning a reference to it, a struct constructor simply returns the struct value itself (typically in a temporary location on the stack), and this value is then copied as necessary.
Since, there is nothing special about a 'struct' backing store, so one would not expect there to be annonymous marshalling operations going on behind the member assignments.

Do I need to delete structures marshaled via Marshal.PtrToStructure in unmanaged code?

I have this C++ code:
extern "C" __declspec(dllexport) VOID AllocateFoo(MY_DATA_STRUCTURE** foo)
{
*foo = new MY_DATA_STRUCTURE;
//do stuff to foo
}
Then in C# I call the function thus:
[DllImport("MyDll.dll")]
static extern void AllocateFoo(out IntPtr pMyDataStruct);
...
MyDataStructure GetMyDataStructure()
{
IntPtr pData;
ManagedAllocateFooDelegate(out pData);
MyDataStructure foo = (MyDataStructure)Marshal.PtrToStructure(pData, typeof(MyDataStructure));
return foo;
}
Where MyDataStructure is a struct (not class) which corresponds to MY_DATA_STRUCTURE and members are marshalled appropriately.
So questions: do I need to store pData and then release it again in unmanaged code when MyDataStructure is GC'd?
MSDN says for Marshal.PtrToStructure(IntPtr, Type):
"Marshals data from an unmanaged block of memory to a newly allocated managed object of the specified type."
In that sentence does "Marshall" mean "copy"? In which case I'd need to preserve (IntPtr pData) and then pass it to unmanaged code (in the MyDataStructure destructor) so I can do a C++ "delete"?
I've searched but I can't locate a sufficiently explicit answer for this.
As Erik said, the Marshal does mean copy, but I don't think he answered the main point of your question.
Do you need to hold onto the pData native pointer until the MyDataStructure is GCed? No.
Once marshaled, your MyDataStructure instance, foo, contains a copy of the structure pointed to by pData. You need not hold onto pData any longer. To avoid a memory leak, you must pass that pData into another unmanaged function that will delete it, and that can be done right after the marshaling, regardless of how long you hold on to the MyDataStructure instance.
Yes, in this case, Marshall means copy; thus, you need to deallocate your memory in unmanaged code. All the call to PtrToStructure does is read a number of bytes indicated by the size of the destination structure 'MyDataStructure' from the memory location pointed to by pData.
The details of course depend on exactly what 'MyDataStructure' looks like (do you use any FieldOffset or StructLayout attributes in MyDataStructure) - but the end result is that the return from PtrToStructure is a copy of the data.
As GBegen points out in his answer, I didn't answer the main point of your question. Yes, you will need to delete the unmanaged copy of your structure in unmanaged code, but no, you don't need to hold onto pData - you can delete the unmanaged copy as soon as the call to PtrToStructure completes.
PS: I've edited my post to contain this information so as to consolidate the answers into one post - if anyone upvotes this answer, please upvote GBegen's answer as well for his contribution.

Categories