My gut reaction is no, because managed and unmanaged memory are distinct, but I'm not sure if the .NET Framework is doing something with Marshaling behind the scenes.
What I believe happens is:
When getting a struct from my unmanaged DLL, the call behaves as if it obtained an IntPtr and then used the Marshal class to copy the struct into managed memory, so changes made to the struct in managed memory do not propagate back to unmanaged memory.
I can't seem to find this documented anywhere on MSDN. Any links would be appreciated.
Here is what my code looks like:
[DllImport("mydll.dll", BestFitMapping=false, CharSet=CharSet.Ansi)]
private static extern int GetStruct(ref MyStruct s);
[StructLayout(LayoutKind.Sequential, Pack=0)]
struct MyStruct
{
public int Field1;
public IntPtr Field2;
}
public void DoSomething()
{
MyStruct s = new MyStruct();
GetStruct(ref s);
s.Field1 = 100; //does unmanaged memory now have 100 in Field1 as well?
s.Field2 = IntPtr.Zero; //does unmanaged memory now have a NULL pointer in field Field2 as well?
}
No, the P/Invoke marshaller copied the unmanaged structure member values into the managed version of the structure. In general, the managed version of a structure is not in any way compatible with the unmanaged version of it. The memory layout is not discoverable, something the CLR uses to reorder fields to make the structure smaller. Marshaling is essential, you have to create a copy.
Modifying the unmanaged structure is not possible with the given function signature, since you let the function fill in the memory that's passed to it. The function itself already copies the structure. You can however party on the Field2 value since it is a raw pointer. If that points to a structure then marshal it yourself with Marshal.PtrToStructure(). Modify the managed copy of it and copy it back to unmanaged memory with Marshal.StructureToPtr(). Or access it directly with Marshal.ReadXxx() and WriteXxx().
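To make the copy concrete, here is a minimal C sketch of what the native side of GetStruct might look like. The real DLL is not shown in the question, so g_state, g_payload and demo_copy_semantics are made-up names and the values are arbitrary. It only illustrates the point above: the function copies into the caller's memory, so later edits to that copy stay local, while memory reached through the Field2 pointer is shared.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the MyStruct layout from the question. */
typedef struct MyStruct {
    int   Field1;
    void *Field2;
} MyStruct;

/* Hypothetical library state; the real DLL's internals are unknown. */
static int g_payload = 7;
static MyStruct g_state = { 42, &g_payload };

/* Fills caller-supplied memory with a copy of the library's struct. */
int GetStruct(MyStruct *s)
{
    if (s == NULL) return -1;
    *s = g_state;               /* member-wise copy, not a shared reference */
    return 0;
}

/* Caller-side view: edits to the copy never reach g_state, but the
   memory behind Field2 is shared and can be changed through it. */
int demo_copy_semantics(void)
{
    MyStruct local;
    if (GetStruct(&local) != 0) return -1;
    assert(local.Field2 != NULL);
    local.Field1 = 100;         /* changes only the caller's copy */
    *(int *)local.Field2 = 99;  /* writes through the shared pointer */
    local.Field2 = NULL;        /* again only the caller's copy */
    return g_state.Field1;      /* the library's value is untouched */
}
```

The P/Invoke marshaller adds one more copy on top of this (native buffer to managed struct), which is why assignments on the C# side cannot reach the library's own data either.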
CSharp Language Specification.doc pg 26
Struct constructors are invoked with the new operator, but that does not imply that memory is being allocated. Instead of dynamically allocating an object and returning a reference to it, a struct constructor simply returns the struct value itself (typically in a temporary location on the stack), and this value is then copied as necessary.
Since there is nothing special about a struct's backing store, one would not expect anonymous marshalling operations to be going on behind the member assignments.
I have an unmanaged library which has a function like this:
type* foo();
foo basically allocates an instance of the unmanaged type on the managed heap through Marshal.AllocHGlobal.
I have a managed version of type. It's not blittable but I have MarshalAs attributes set on members so I can use Marshal.PtrToStructure to get a managed version of it. But having to wrap calls to foo with extra bookkeeping to call Marshal.PtrToStructure is a bit annoying.
I'd like to be able to do something like this on the C# side:
[DllImport("mylib", CallingConvention = CallingConvention.Cdecl)]
[return: MarshalAs(UnmanagedType.LPStruct)]
type* foo();
and have C#'s marshaller handle the conversion behind the scenes, like it does for function arguments. I thought I should be able to do this because type is allocated on the managed heap. But maybe I can't? Is there any way to have C#'s inbuilt marshaller handle the unmanaged-to-managed transition on the return type for me without having to manually call Marshal.PtrToStructure?
A custom marshaler works fine if, on the .NET side, type is declared as a class, not as a struct.
This is clearly stated in the documentation of the UnmanagedType enumeration, under the CustomMarshaler member:
Specifies the custom marshaler class when used with the
MarshalAsAttribute.MarshalType or MarshalAsAttribute.MarshalTypeRef
field. The MarshalAsAttribute.MarshalCookie field can be used to pass
additional information to the custom marshaler. You can use this
member on any reference type.
Here is some sample code that should work fine
[DllImport("mylib", CallingConvention = CallingConvention.Cdecl)]
[return : MarshalAs(UnmanagedType.CustomMarshaler, MarshalTypeRef= typeof(typeMarshaler))]
private static extern type Foo();
private class typeMarshaler : ICustomMarshaler
{
public static readonly typeMarshaler Instance = new typeMarshaler();
public static ICustomMarshaler GetInstance(string cookie) => Instance;
public int GetNativeDataSize() => -1;
public object MarshalNativeToManaged(IntPtr nativeData) => Marshal.PtrToStructure<type>(nativeData);
// in this sample I suppose the native side uses GlobalAlloc (or LocalAlloc)
// but you can use any allocation library provided you use the same on both sides
public void CleanUpNativeData(IntPtr nativeData) => Marshal.FreeHGlobal(nativeData);
public IntPtr MarshalManagedToNative(object managedObj) => throw new NotImplementedException();
public void CleanUpManagedData(object managedObj) => throw new NotImplementedException();
}
[StructLayout(LayoutKind.Sequential)]
class type
{
/* declare fields */
};
Of course, changing unmanaged struct declarations into classes can have deep implications (that may not always raise compile-time errors), especially if you have a lot of existing code.
Another solution is to use Roslyn to parse your code, extract all Foo-like methods and generate one additional .NET method for each. I would do this.
type* foo()
This is a very awkward function signature, hard to use correctly in a C or C++ program, and that never gets better when you pinvoke. Memory management is the biggest problem; you want to work with the programmer who wrote this code to make it better.
Your preferred signature should resemble int foo(type* arg, size_t size). In other words, the caller supplies the memory and the native function fills it in. The size argument is required to avoid memory corruption, necessary when the version of type changes and gets larger. Often included as a field of type. The int return value is useful to return an error code so you can fail gracefully. Beyond making it safe, it is also much more efficient since no memory allocation is required at all. You can simply pass a local variable.
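A minimal sketch of that caller-allocates shape, in C. The fields of type, the FOO_* error codes and demo_foo are assumptions made up for illustration, since the question never shows the real definition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical payload; the real `type` is not shown in the question. */
typedef struct type {
    int    id;
    double value;
} type;

/* Hypothetical error codes returned through the int result. */
enum { FOO_OK = 0, FOO_BAD_ARG = 1, FOO_TOO_SMALL = 2 };

/* The caller owns the memory; the library only fills it in. */
int foo(type *arg, size_t size)
{
    if (arg == NULL)         return FOO_BAD_ARG;
    if (size < sizeof(type)) return FOO_TOO_SMALL;  /* caller built against a smaller struct */
    memset(arg, 0, sizeof(type));
    arg->id    = 1;
    arg->value = 2.5;
    return FOO_OK;
}

/* Usage: a plain local variable, no heap allocation anywhere. */
int demo_foo(void)
{
    type local;
    int rc = foo(&local, sizeof local);
    assert(rc == FOO_OK);
    return rc;
}
```

With this shape nothing ever has to be released, so the question of which heap and which free function applies does not come up at all.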
... allocates an instance of the unmanaged type on the managed heap through Marshal.AllocHGlobal
No, this is where memory management assumptions get very dangerous. Never the managed heap, native code has no decent way to call into the CLR. And you cannot assume that it used the equivalent of Marshal.AllocHGlobal(). The native code typically uses malloc() to allocate the storage, which heap is used to allocate from is an implementation detail of the CRT it links. Only that CRT's free() function is guaranteed to release it reliably. You cannot call free() yourself. Skip to the bottom to see why AllocHGlobal() appeared to be correct.
There are function signatures that force the pinvoke marshaller to release the memory; it does so by calling Marshal.FreeCoTaskMem(). Note that this is not equivalent to Marshal.FreeHGlobal(), it uses a different heap. It assumes that the native code was written to support interop well and used CoTaskMemAlloc(), which allocates from the heap that is dedicated to COM interop.
It's not blittable but I have MarshalAs attributes set...
That is the gritty detail that explains why you have to make it awkward. The pinvoke marshaller does not want to solve this problem since it has to marshal a copy and there is too much risk automatically releasing the storage for the object and its members. Using [MarshalAs] is unnecessary and does not make the code better, simply change the return type to IntPtr. Ready to pass to Marshal.PtrToStructure() and whatever memory release function you need.
I have to talk about the reason that Marshal.AllocHGlobal() appeared to be correct. It did not use to be, but that has changed in recent Windows and VS versions. There was a big design change in Win8 and VS2012. The OS no longer creates separate heaps that Marshal.AllocHGlobal and Marshal.AllocCoTaskMem allocate from. It is now a single heap, the default process heap (GetProcessHeap() returns it). And there was a corresponding change in the CRT included with VS2012, it now also uses GetProcessHeap() instead of creating its own heap with HeapCreate().
Very big change and not publicized widely. Microsoft has not released any motivation for this that I know of, I assume that the basic reason was WinRT (aka UWP), lots of memory management nastiness to get C++, C# and Javascript code to work together seamlessly. This is quite convenient to everybody that has to write interop code, you can now assume that Marshal.FreeHGlobal() gets the job done. Or Marshal.FreeCoTaskMem() like the pinvoke marshaller uses. Or free() like the native code would use, no difference anymore.
But also a significant risk, you can no longer assume that the code is bug-free when it works well on your dev machine and must re-test on Win7. You get an AccessViolationException if you guessed wrong about the release function. It is worse if you also have to support XP or Win2003, no crash at all but you'll silently leak memory. Very hard to deal with that when it happens since you can't get ahead without changing the native code. Best to get it right early.
I have been looking all over Google to find some answers to my questions but do not quite understand what I have found. I have some objects which are created and stored in a C# List after using System.IO to read some text files. After that, I want to send references (using const pointers) to each of these objects to the internal classes in a C++ DLL so that it can use them in the computation of some algorithms.
Here is a simple example (not actual code) of what I am doing:
The C# class:
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public class SimpleClass
{
[MarshalAs(UnmanagedType.LPStr)]
public string Name;
public float Length;
}
with corresponding C struct:
struct SimpleClass
{
const char* Name;
float Length;
};
stored in
List<SimpleClass> ItemList;
after parsing some text files.
Then calling the following dll function:
C#:
[DllImport("SimpleDLL")]
public static extern void AddSimpleReference(SimpleClass inSimple);
C:
void AddSimpleReference(const SimpleClass* inSimple)
{
g_Vector.push_back(inSimple); // g_Vector is a std::vector<const SimpleClass* > type
}
What I have tried is:
for(int i=0; i<ItemList.Count;++i)
{
SimpleClass theSimpleItem = ItemList[i];
AddSimpleReference(theSimpleItem);
}
Initially, I thought it would be easy to get an actual reference/address just by using the assignment operator, since classes in C# are passed by reference, but it turns out that the C++ side is always pushing the same address value (the address of the temporary reference) into the container instead of the actual address. How do I get the actual object addresses and send them to the C++ DLL so that it can have read-only access to the C# objects?
UPDATE: Sorry to those who posted answers with unsafe code. I forgot to mention that the C# code is actually used as a script in the Unity game engine, which does not allow the use of unsafe code.
First you need to change your interop signature to take a pointer (and thus make it unsafe).
[DllImport("SimpleDLL")]
public unsafe static extern void AddSimpleReference(SimpleClass* inSimple);
Then, because the GC is free to move objects around in memory as it pleases, you will need to pin the object in memory for the entire time you will need its address on the unmanaged side. For that you need the fixed statement:
SimpleClass theSimpleItem = ItemList[i];
unsafe
{
fixed(SimpleClass* ptr = &theSimpleItem)
{
AddSimpleReference(ptr);
}
}
This would work if AddSimpleReference used the pointer and then discarded it. But you're storing the pointer in a std::vector for later. That won't work, because the pointer will probably become invalid due to the GC moving the original item somewhere else once execution leaves the fixed block.
To solve this, you need to pin the items until you are done with them. To do this you may need to resort to the GCHandle type.
// Change the interop signature, use IntPtr instead (no "unsafe" modifier)
[DllImport("SimpleDLL")]
public static extern void AddSimpleReference(IntPtr inSimple);
// ----
for(int i=0; i<ItemList.Count;++i)
{
SimpleClass theSimpleItem = ItemList[i];
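// take a pinned handle before passing the item down
// (GCHandleType.Pinned throws ArgumentException unless the type is blittable)
GCHandle handle = GCHandle.Alloc(theSimpleItem, GCHandleType.Pinned);
// AddrOfPinnedObject() is the object's address; GCHandle.ToIntPtr() only returns the handle value
AddSimpleReference(handle.AddrOfPinnedObject());
// probably a good idea to save this handle somewhere for later release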
}
// ----
// when you're done, don't forget to ensure the handle is freed
// probably in a Dispose method, or a finally block somewhere appropriate
handle.Free();
When doing something like this, keep in mind that pinning objects in memory for a long time is a bad idea, because it prevents the garbage collector from doing its job efficiently.
Even though I think this is not a good idea, have a look at unsafe code and memory pinning. Here is a good start on MSDN.
fixed and unsafe keywords are likely what you should be looking for.
You cannot. The C# GC will move objects for fun. Your addresses will go out of scope.
Thanks in advance for any help.
I have this trivial function in C:
__declspec(dllexport) Point* createPoint (int x, int y) {
Point *p;
p = (Point*) malloc(sizeof(Point));
p->x = x;
p->y=y;
return p;
}
Point is a very simple struct with two int fields, x and y.
I would like to call this function from C#.
I use this code:
[DllImport("simpleC.dll", EntryPoint = "createPoint", CallingConvention = CallingConvention.Cdecl, SetLastError = true, CharSet = CharSet.Auto)]
[return: MarshalAs(UnmanagedType.LPStruct)]
public static extern Point createPoint(int x, int y);
Point p = Wrapper.createPoint(1, 2);
But at runtime I get an AccessViolationException. Looking at the exception in detail, I found that it is thrown from the Marshal.CoTaskMemFree(IntPtr) method.
It seems that this method is unable to free memory allocated by C malloc.
What am I doing wrong?
Thanks a lot.
CoTaskMemFree cannot be used to free memory allocated by malloc (because they use different allocators). According to MSDN, "The runtime always uses the CoTaskMemFree method to free memory. If the memory you are working with was not allocated with the CoTaskMemAlloc method, you must use an IntPtr and free the memory manually using the appropriate method."
Additionally, Adam Nathan notes that "UnmanagedType.LPStruct is only supported for one specific case: treating a System.Guid value type as an unmanaged GUID with an extra level of indirection. ... You should probably just stay away from UnmanagedType.LPStruct."
There are two possible solutions:
1. Declare the return type of the method as IntPtr and use Marshal.ReadInt32 to read the fields of the struct, or use Marshal.PtrToStructure to copy the data to a managed struct, or use unsafe code to cast the IntPtr value to a Point *. The C library will need to expose a destroyPoint(Point *) method that frees the memory.
2. Change the C method signature to void getPoint(int x, int y, Point *). This lets C# allocate the struct, and the C method simply fills in the data values. (Most of the Win32 APIs are defined this way).
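Both options can be sketched on the C side roughly as follows. createPoint is the function from the question; destroyPoint and getPoint follow the names suggested above, and demo_points is a made-up usage helper. The __declspec(dllexport) decoration is left out here so the sketch stays portable; a real DLL would need it (or a .def file).

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Point { int x; int y; } Point;

/* Option 1: the library allocates, so the library must also free. */
Point *createPoint(int x, int y)
{
    Point *p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->x = x;
    p->y = y;
    return p;
}

void destroyPoint(Point *p)
{
    free(p);   /* same CRT that called malloc; free(NULL) is a no-op */
}

/* Option 2: the caller allocates, the library only fills in the values. */
void getPoint(int x, int y, Point *out)
{
    if (out == NULL) return;
    out->x = x;
    out->y = y;
}

/* Usage of both options from the caller's point of view. */
int demo_points(void)
{
    Point local;
    getPoint(3, 4, &local);

    Point *heap = createPoint(1, 2);
    if (heap == NULL) return -1;
    assert(heap->x == 1 && heap->y == 2);
    int sum = heap->x + heap->y + local.x + local.y;
    destroyPoint(heap);
    return sum;
}
```

On the C# side, option 1 would then be imported with an IntPtr return type plus a matching destroyPoint import, and option 2 with an out or ref Point parameter.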
One final note: Unless your method uses the SetLastError Win32 API, you don't need to specify SetLastError = true on your P/Invoke attribute.
Since you don't have the code that frees "p", it is hard to say. However it is likely that the way malloc() and free() work together is completely different to the way C# manages memory. Since C# has garbage collection (I believe) it is likely that it uses a completely different memory management system.
In any case, the correct solution is that if you use your library to create an object, you should also use it to destroy it. Implement a "destroyPoint" function that frees the memory in your C library, import it to the C# code, and call it from there to destroy the objects created by your C library.
As a general design/coding rule, every "create" function should have a matching "free/destroy/delete" function. If nothing else, it makes it easy to ensure that all created items get properly destroyed.
How is the Point type defined on the C# side?
It has to be unsafe, or you need to return a void pointer (IntPtr). The GC is not able to count references from outside (here the allocated memory), thus your code can not expect to manage externally allocated memory via the GC. One alternative is to keep a static reference to avoid a Garbage collection, if you need to keep the object persistently during the runtime of your application.
Can someone explain what exactly is happening at a low level / memory management perspective on the 2 C# lines in "Main" in the following?
C++ Code (unmanaged):
#define DLLEXPORT extern "C" __declspec(dllexport)
DLLEXPORT MyClass* MyClass_MyClass()
{
return new MyClass();
}
DLLEXPORT void MyClass_setName(MyClass* myClass, const char* name)
{
myClass->setName(name);
}
MyClass::MyClass()
{
_name.clear();
}
void MyClass::setName(const char* name)
{
_name.setCString(name, NAME_MAX_BYTES);
}
C# Code:
[DllImport(@"lib.dll")]
private static extern IntPtr MyClass_MyClass();
[DllImport(@"lib.dll")]
public static extern void MyClass_setName(
IntPtr myClass,
[System.Runtime.InteropServices.InAttribute()]
[System.Runtime.InteropServices.MarshalAsAttribute(System.Runtime.InteropServices.UnmanagedType.LPStr)]
string name);
public static void Main(string[] args)
{
var myClass = MyClass_MyClass();
MyClass_setName(myClass , "Test Name");
}
Specifically, I'm wondering how .NET knows how much space to allocate for "myClass". It's got to be doing some kind of "Marshal.AllocHGlobal(SIZE)" in the background, right? What happens if more space is needed (when I set a name)? Also, is there any risk of garbage collection coming around, moving memory and messing up my "IntPtr myClass"?
.NET knows nothing about MyClass type, it only stores a pointer to it. Size of the pointer is always known and fixed - 4 bytes for 32bit processes and 8 bytes for 64bit processes. All memory allocation and management in this particular case happens in unmanaged C++ code here:
return new MyClass();
and here:
myClass->setName(name);
It's up to this C++ DLL to decide how to allocate/free/manage memory, C# code will just call imported functions of this DLL.
No garbage collection will be performed on your unmanaged object, and you'll need to provide an additional (unmanaged) method to release it to avoid a memory leak.
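A rough C analogue of that arrangement, as a sketch only: the question's library is C++ and its MyClass internals are not shown, so the struct layout, the NAME_MAX_BYTES value, and the MyClass_create, MyClass_getName, MyClass_destroy and demo_lifetime names are all assumptions. The caller holds nothing but a pointer-sized handle; every allocation and the final release happen inside the native library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_MAX_BYTES 64   /* assumed value; the question does not give it */

/* Callers only ever hold a pointer to this; its size is the library's business. */
typedef struct MyClass {
    char name[NAME_MAX_BYTES];
} MyClass;

MyClass *MyClass_create(void)
{
    return calloc(1, sizeof(MyClass));   /* native heap, invisible to the GC */
}

void MyClass_setName(MyClass *self, const char *name)
{
    if (self == NULL || name == NULL) return;
    strncpy(self->name, name, NAME_MAX_BYTES - 1);
    self->name[NAME_MAX_BYTES - 1] = '\0';   /* always terminated */
}

const char *MyClass_getName(const MyClass *self)
{
    return self ? self->name : "";
}

/* The additional export: the only safe way to release the object. */
void MyClass_destroy(MyClass *self)
{
    free(self);
}

/* Usage: create, use and destroy, all through the handle. */
int demo_lifetime(void)
{
    MyClass *obj = MyClass_create();
    if (obj == NULL) return -1;
    MyClass_setName(obj, "Test Name");
    assert(strcmp(MyClass_getName(obj), "Test Name") == 0);
    MyClass_destroy(obj);
    return 0;
}
```

From C#, the destroy function would be one more DllImport taking the IntPtr, typically called from a Dispose method or a SafeHandle's ReleaseHandle.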
If the c++ code is not managed, .net isn't allocating anything beyond the IntPtr. It's being allocated by the c++ code.
This means that the only garbage collection will be done on that IntPtr. As that's small, it may take a long time before the garbage collector decides to clean it up.
What this means is that even if your C++ code is cleaning up well after itself, it may take a long time before it actually gets to do the cleanup. The C++ code may be using a ton of memory, but it's invisible to .net so it won't prioritize it for clean up over "larger" .net objects.
I have this C++ code:
extern "C" __declspec(dllexport) VOID AllocateFoo(MY_DATA_STRUCTURE** foo)
{
*foo = new MY_DATA_STRUCTURE;
//do stuff to foo
}
Then in C# I call the function thus:
[DllImport("MyDll.dll")]
static extern void AllocateFoo(out IntPtr pMyDataStruct);
...
MyDataStructure GetMyDataStructure()
{
IntPtr pData;
ManagedAllocateFooDelegate(out pData);
MyDataStructure foo = (MyDataStructure)Marshal.PtrToStructure(pData, typeof(MyDataStructure));
return foo;
}
Where MyDataStructure is a struct (not class) which corresponds to MY_DATA_STRUCTURE and members are marshalled appropriately.
So questions: do I need to store pData and then release it again in unmanaged code when MyDataStructure is GC'd?
MSDN says for Marshal.PtrToStructure(IntPtr, Type):
"Marshals data from an unmanaged block of memory to a newly allocated managed object of the specified type."
In that sentence does "marshal" mean "copy"? In which case I'd need to preserve (IntPtr pData) and then pass it to unmanaged code (in the MyDataStructure destructor) so I can do a C++ "delete"?
I've searched but I can't locate a sufficiently explicit answer for this.
As Erik said, "marshal" does mean copy, but I don't think he answered the main point of your question.
Do you need to hold onto the pData native pointer until the MyDataStructure is GCed? No.
Once marshaled, your MyDataStructure instance, foo, contains a copy of the structure pointed to by pData. You need not hold onto pData any longer. To avoid a memory leak, you must pass that pData into another unmanaged function that will delete it, and that can be done right after the marshaling, regardless of how long you hold on to the MyDataStructure instance.
Yes, in this case, "marshal" means copy; thus, you need to deallocate your memory in unmanaged code. All the call to PtrToStructure does is read a number of bytes indicated by the size of the destination structure 'MyDataStructure' from the memory location pointed to by pData.
The details of course depend on exactly what 'MyDataStructure' looks like (do you use any FieldOffset or StructLayout attributes in MyDataStructure) - but the end result is that the return from PtrToStructure is a copy of the data.
As GBegen points out in his answer, I didn't answer the main point of your question. Yes, you will need to delete the unmanaged copy of your structure in unmanaged code, but no, you don't need to hold onto pData - you can delete the unmanaged copy as soon as the call to PtrToStructure completes.
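The same allocate, copy, free-immediately sequence can be sketched in plain C. This is only an analogy under assumptions: the fields of MY_DATA_STRUCTURE are invented, FreeFoo and get_copy are hypothetical names, malloc/free stand in for the question's C++ new/delete, and memcpy plays the role that Marshal.PtrToStructure plays on the C# side.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fields; the real MY_DATA_STRUCTURE is not shown. */
typedef struct MY_DATA_STRUCTURE {
    int   count;
    float scale;
} MY_DATA_STRUCTURE;

/* Library allocates and hands the pointer back through an out parameter. */
void AllocateFoo(MY_DATA_STRUCTURE **foo)
{
    *foo = malloc(sizeof **foo);
    if (*foo != NULL) {
        (*foo)->count = 3;
        (*foo)->scale = 1.5f;
    }
}

/* Hypothetical counterpart that the C# side would also import and call. */
void FreeFoo(MY_DATA_STRUCTURE *foo)
{
    free(foo);
}

/* Same sequence as GetMyDataStructure: allocate, copy, free, keep the copy. */
int get_copy(MY_DATA_STRUCTURE *out)
{
    assert(out != NULL);
    MY_DATA_STRUCTURE *pData = NULL;
    AllocateFoo(&pData);
    if (pData == NULL) return -1;
    memcpy(out, pData, sizeof *out);   /* plays the role of PtrToStructure */
    FreeFoo(pData);                    /* safe: `out` no longer depends on pData */
    return 0;
}
```

The copy stays valid after the native block is gone, which is why pData does not need to be kept around for the lifetime of the managed struct.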
PS: I've edited my post to contain this information so as to consolidate the answers into one post - if anyone upvotes this answer, please upvote GBegen's answer as well for his contribution.