C# P/Invoke IntPtr in structs and memory

If I have understood correctly, when using structs with non-blittable values, the struct data is copied from unmanaged memory into managed memory (so the same struct basically exists twice).
Also, if I'm not wrong, IntPtr variables are stored in managed memory, but the data they point to is in unmanaged memory.
Now let's say I have a delegate method that is called by a C++ function and receives a struct by ref, with a handler something like:
private void _handler(ref MyStruct p){}
Now the API says I should not keep a reference to the struct, as the memory might be recycled and used for further calls. But since the unmanaged struct is copied into managed memory, and is copied again when I assign it to a local variable (because it is a struct!), I shouldn't have any issues if the unmanaged memory gets freed or rewritten.
However, if the struct contains an IntPtr variable, I would save a copy of the pointer but not of the data it points to, so if I try to access the IntPtr data from my saved struct, I might get a memory fault. Is this correct?
One last question: will changes made to the struct in managed memory also affect the struct in unmanaged memory? Since it is passed by ref, which implicitly means in/out, will changes made in managed memory be written to the unmanaged memory after the event handler call ends?

Your understanding is largely correct. Setting aside the details of blittability, the root question is one of ownership of the memory being passed to your callback. Typically, the contract between the caller of the callback and the callback itself is that the memory is owned by the caller and will be valid only for the duration of the callback itself. (Other contracts are possible as well. For instance, the caller could transfer ownership of the memory to the callback, in which case the callback is responsible for freeing the memory when it is done.)
In the documentation you reference, the API is telling you not to keep a reference to the memory, which means that you are very likely in the standard case where the caller owns the structure. What this means is that both the contents of the structure and everything pointed to by that structure should be valid during the call to your callback but cannot be relied upon after your callback completes.
Copying a structure to a local wouldn't be problematic because the native memory should not be freed until after your callback completes, so keeping as many copies as you need during the execution of the callback should be fine. What would be problematic is saving a copy of the structure to a location, such as a static field, which will be alive after your callback returns.
Since the structure is implicitly in/out, then modifications you make to it will be reflected on the native side of the call as well. This is true regardless of if the structure contained only blittable data. If the structure contained non-blittable data then the CLR would marshal your changes to the structure back out to the native copy when your code completes.
Note that just because you can modify the native structure does not mean that the API contract expects you to do so. Frequently API signatures contain pointers to structures in C++ in order to avoid copying the entire structure when making a call, not to enable modification of the structure.
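To make the point about pointed-to data concrete, here is a minimal sketch of taking a deep copy while the callback is still running. The layout of MyStruct (the Length and Data fields) and the CallbackSnapshot class are assumptions for illustration; the real struct from the native API is not shown in the question.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical layout: the real MyStruct from the native API is not shown in the question.
[StructLayout(LayoutKind.Sequential)]
public struct MyStruct
{
    public int Length;   // assumed: number of bytes behind Data
    public IntPtr Data;  // pointer into memory owned by the native caller
}

public static class CallbackSnapshot
{
    // Managed copy that stays valid after the callback returns.
    public static byte[] LastPayload = Array.Empty<byte>();

    public static void Handler(ref MyStruct p)
    {
        // Copying the struct alone would duplicate only the pointer value,
        // so copy the bytes it points to while the native memory is still valid.
        var copy = new byte[p.Length];
        if (p.Data != IntPtr.Zero && p.Length > 0)
        {
            Marshal.Copy(p.Data, copy, 0, p.Length);
        }
        LastPayload = copy;
    }
}
```

Saving LastPayload is safe because it lives entirely in managed memory. Saving p.Data itself would keep only an address that the caller may recycle.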

Related

Save reference to managed object in unmanaged memory

I want to put a reference to a C# object into unmanaged memory (C), I guess as a pointer (int), and when the C code calls back into C# later on, I want to get the reference back from the unmanaged memory, so I can resolve it, and access the object. The reason is that the C code controls which object should be used, there's no real alternative. I have limited control over the C code and C++/CLI is not an option.
Question: Is that possible and safe, if so, how?
Well, it is possible. Primary concern is that your scheme is very incompatible with the garbage collector, it moves objects in memory when it compacts the heap. That's something you can stop, you can pin the object so the GC cannot move it. You use GCHandle.Alloc() to allocate a GCHandleType.Pinned handle and pass the return value of GCHandle.AddrOfPinnedObject() to your C code, presumably with a pinvoke call.
You have to fret about how long that object needs to stay pinned. A couple of seconds, tops, is okay, but it gets pretty detrimental to the GC if you keep it pinned for a long time. It is a rock in the road that the GC constantly has to drive around. And the heap segment can never be recycled, that single object can cost you a handful of megabytes.
In which case you should consider allocating unmanaged memory and copying the object into it. Use Marshal.AllocHGlobal() to allocate, Marshal.StructureToPtr() to copy the object into it. Possibly multiple times if you modify the object and the changes need to be visible to the C code as well.
Either way, the object must be blittable or you get a runtime error. An expensive word that just means that the object must have simple field types, the kind that a C program has a shot at reading correctly. Don't use bool. Be careful with the declaration in the C program, pretty easy to corrupt the heap when you get it wrong.
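The two options described above might look roughly like this. It is only a sketch: the Payload struct and the NativeSharing helper are made-up names, and the actual pinvoke call that would receive the address is left out.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical blittable payload; the field names are illustrative only.
[StructLayout(LayoutKind.Sequential)]
public struct Payload
{
    public int Id;
    public double Value;
}

public static class NativeSharing
{
    // Option 1: pin a managed object briefly and read its fixed address.
    public static int ReadIdThroughPinnedAddress(Payload[] items)
    {
        GCHandle handle = GCHandle.Alloc(items, GCHandleType.Pinned);
        try
        {
            IntPtr address = handle.AddrOfPinnedObject(); // what the C code would receive
            return Marshal.ReadInt32(address);            // first field of items[0]
        }
        finally
        {
            handle.Free(); // unpin as soon as the native side is done
        }
    }

    // Option 2: copy into unmanaged memory, which the GC never moves.
    public static IntPtr CopyToUnmanaged(Payload value)
    {
        IntPtr buffer = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(Payload)));
        Marshal.StructureToPtr(value, buffer, false);
        return buffer; // caller must release it with Marshal.FreeHGlobal
    }
}
```

With the second option the C code sees a snapshot, so StructureToPtr has to be called again whenever the managed value changes.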
When you control the 'handing out' and the 'use after receiving back' phases you can simply use a List or array and pass around the index.
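A sketch of that index approach, assuming the C code only needs to store an int and hand it back later. ObjectRegistry is a hypothetical helper, not an existing API.

```csharp
using System;
using System.Collections.Generic;

public static class ObjectRegistry
{
    private static readonly List<object> Items = new List<object>();
    private static readonly object Gate = new object();

    // Returns an index that can be handed to the C code as a plain int.
    public static int Register(object instance)
    {
        lock (Gate)
        {
            Items.Add(instance);
            return Items.Count - 1;
        }
    }

    // Called from the C#-side callback with the index the C code passed back.
    public static object Resolve(int index)
    {
        lock (Gate)
        {
            return Items[index];
        }
    }
}
```

The list holds strong references, so registered objects stay alive and nothing needs pinning. This sketch never removes entries, which a long-running program would have to handle.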
It's possible to consume C# objects via COM and proxies created by the CLR called COM-Callable Wrappers.
You just need to assign a GUID assembly attribute to identify the COM type library, e.g.:
[assembly: Guid ("39ec755f-022e-497a-9ac8-70ba92cfdb7c")]
And then use the Type Library Exporter tool (tlbexp.exe) to generate the COM type library (.tlb) file, which can be consumed in the COM world:
tlbexp.exe YourLibrary.dll
If you mean safe in the C#'s sense of the word, then certainly unsafe, as you'll be using the objects in the unmanaged world, and lifetimes are controlled from the COM side via reference counting as opposed to CLR's GC.

What's the difference between HandleRef and GCHandle?

What's the difference between HandleRef and GCHandle?
http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.handleref.aspx
http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.gchandle.aspx
Thanks
The point of both these structures is to prevent the garbage collector from releasing a resource and invalidating the handle before the P/Invoke call has finished. The documentation you linked indicates that these are special types recognized by the interop marshaller.
What I gather from the documentation is that HandleRef is essentially a special case of the more general GCHandle structure.
The HandleRef structure is specifically intended for wrapping handles to unmanaged resources that are used with P/Invoke code. For example, window handles (HWND) or device contexts (HDC). It has a Handle property that returns a value of type IntPtr, which is an integer value the size of a pointer on the underlying system's architecture. You can use this to quickly & easily obtain the handle it wraps.
Whereas the GCHandle structure allows one to specify the type of handle it wraps using one of the members of the GCHandleType enumeration, the HandleRef structure was specifically designed to wrap handles to unmanaged resources. You'd probably use the GCHandle structure when you're dealing directly with unmanaged memory, rather than the special handles that the Win32 API treats as black boxes.
It is not necessary to use either. One can simply call GC.KeepAlive to keep the garbage collector from prematurely releasing the resource.
And even that is probably not necessary. I've been writing P/Invoke code for years, and I've found that when it's correctly written, there's no need for either of these structures. If a class object gets garbage collected while the API call is in the middle of executing, then that's a bug in your application. I actually want to be notified of the failure via an exception, not hide it.
One difference is given in the link you mentioned:
The HandleRef value type, like GCHandle, is a special type recognized
by the interop marshaler. A normal, nonpinned GCHandle also prevents
untimely garbage collection, yet HandleRef provides better
performance. Although using HandleRef to keep an object alive for the
duration of a platform invoke call is preferred, you can also use the
GC.KeepAlive method for the same purpose.
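For illustration, a small sketch of both techniques named in that quote. NativeResource and HandleRefDemo are hypothetical names and the handle value is made up; no real P/Invoke call is made here.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper around a native handle; the handle value is made up.
public sealed class NativeResource
{
    public IntPtr Handle { get; }
    public NativeResource(IntPtr handle) { Handle = handle; }
}

public static class HandleRefDemo
{
    public static HandleRef Wrap(NativeResource resource)
    {
        // Pairs the owning object with its raw handle. Passing the HandleRef
        // to a P/Invoke signature keeps `resource` alive for the duration of the call.
        return new HandleRef(resource, resource.Handle);
    }

    public static IntPtr UseRawHandle(NativeResource resource)
    {
        IntPtr raw = resource.Handle;
        // ... a P/Invoke call taking `raw` would go here ...
        GC.KeepAlive(resource); // alternative: keep the owner reachable up to this point
        return raw;
    }
}
```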

Holding onto an array marshalled into C# from C++

I am passing an array of integers from C++ to C#, using a parameter like this in my C# method:
[MarshalAs(UnmanagedType.LPArray, SizeParamIndex = 0)]
UInt32[] myStuff,
When this data arrives in the CLR, I think that "LPArray" indicates that I am working with the pointer from the C++-world directly? So if I want to hold onto this array after the method call is over, should I make a copy of it?
You have to be careful that this is allocated using the same memory allocation mechanisms in both the managed and unmanaged worlds. And even if that's the case, it's just safer to make a copy and work with that.
Note: In your example, the pointer isn't passed by reference, and so the callee can only add to your previous array, not give you a new array. Is that really what you intended?
Calling from unmanaged code into managed code (C++ to C#), the array gets copied into a fully managed array. This array will be freed when the garbage collector notices that nobody has any more references to it, not when the function exits.
But the other direction is more dangerous: If you were going the other direction (C# to C++), it would either: pin the C# array so it won't move, or make a temporary copy of the array into unmanaged memory (which gets freed when the function returns). In either case, it would not be safe to hold onto the array on the C++ side after the function call completes, you'd want to copy the contents to somewhere else.
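If you do want to hold onto the data on the C# side, making your own copy is cheap. A minimal sketch, where ArrayReceiver and OnData are hypothetical names and the real method would carry the MarshalAs attribute from the question on its array parameter:

```csharp
using System;

public static class ArrayReceiver
{
    private static uint[] _saved = new uint[0];

    // Hypothetical callback body; `count` stands in for the size parameter
    // that SizeParamIndex = 0 refers to in the question.
    public static void OnData(int count, uint[] myStuff)
    {
        // Clone so later use does not depend on who owns the incoming buffer.
        _saved = (uint[])myStuff.Clone();
    }

    public static uint[] Saved
    {
        get { return _saved; }
    }
}
```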

is it necessary to gchandle.alloc() each callback in a class?

I have a .NET class which has multiple delegates for callbacks from native code. Is it necessary to allocate a handle for every delegate? I mean, does GCHandle.Alloc() protect just the delegate, or the entire class that owns the delegate, from being collected?
A delegate has two relevant properties, Method and Target. The Target will be non-null if the delegate was created for an instance method. And that keeps the object alive, as long as the garbage collector can see the delegate instance.
Native code is where callbacks run into problems. When you pass a delegate instance to a pinvoked native function, the P/Invoke marshaller will use Marshal.GetFunctionPointerForDelegate() to create a little stub that produces the required Target reference when the native code makes the callback. The garbage collector, however, cannot see this stub and therefore won't find a reference to the delegate object. So it collects it, and the next callback from the native code produces a crash.
To avoid this, you must store the delegate object yourself so that it stays referenced for as long as the native code can make the callback. Storing it in a static variable is an obvious solution.
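A sketch of the static-field approach, under the assumption that the native API takes a simple cdecl callback with one int argument. The NativeCallback signature and the CallbackHolder class are hypothetical; the native registration call itself is omitted.

```csharp
using System;
using System.Runtime.InteropServices;

public static class CallbackHolder
{
    // Hypothetical callback signature; the real one comes from the native API.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate void NativeCallback(int code);

    // The static field keeps the delegate (and any Target object) reachable
    // for as long as native code may call through the function pointer.
    private static NativeCallback _callback;

    public static int LastCode;

    public static IntPtr CreateFunctionPointer()
    {
        _callback = OnNativeEvent;
        // This pointer is what would be handed to the native registration function.
        return Marshal.GetFunctionPointerForDelegate(_callback);
    }

    private static void OnNativeEvent(int code)
    {
        LastCode = code;
    }
}
```

No GCHandle is involved: the field only has to keep the delegate referenced, it does not have to stop it from moving.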
I am wrong.
You don't need to pin the delegate. (In fact, you cannot pin a delegate; an exception will be thrown: System.ArgumentException: Object contains non-primitive or non-blittable data.)
reference:
http://social.msdn.microsoft.com/Forums/vstudio/en-US/bd662199-d150-4fbf-a5ee-7a06af0493bb/interop-pinning-and-delegates?forum=
Details about that in Chris Brumme's blog http://blogs.msdn.com/cbrumme/archive/2003/05/06/51385.aspx
Chris Brumme wrote:
Along the same lines, managed Delegates can be marshaled to unmanaged code, where they are exposed as unmanaged function pointers. Calls on those pointers will perform an unmanaged to managed transition; a change in calling convention; entry into the correct AppDomain; and any necessary argument marshaling. Clearly the unmanaged function pointer must refer to a fixed address. It would be a disaster if the GC were relocating that! This leads many applications to create a pinning handle for the delegate. This is completely unnecessary. The unmanaged function pointer actually refers to a native code stub that we dynamically generate to perform the transition & marshaling. This stub exists in fixed memory outside of the GC heap.
However, the application is responsible for somehow extending the lifetime of the delegate until no more calls will occur from unmanaged code. The lifetime of the native code stub is directly related to the lifetime of the delegate. Once the delegate is collected, subsequent calls via the unmanaged function pointer will crash or otherwise corrupt the process. In our recent release, we added a Customer Debug Probe which allows you to cleanly detect this – all too common – bug in your code. If you haven’t started using Customer Debug Probes during development, please take a look!
I think something can go wrong (though not often) when you just store the delegate object.
As we all know, managed memory is rearranged by the garbage collector. (That means the physical memory address of a managed object can change.)
Imagine there is a long-lived delegate to be called by native code, and we store the delegate in a static member or class member. At some point (we don't know when, we just know it will happen) the GC rearranges memory, and the delegate may move from 0x000000A to 0x0000010. But the native code knows nothing about this; it only knows to keep calling 0x000000A forever.
So we should not only store the delegate object but also use GCHandle.Alloc to tell the GC not to move the delegate object in memory. Then the native code will work correctly at callback time.
Because the GC does not rearrange managed memory frequently, for a short-lived delegate your code will usually appear to work even if you do not call GCHandle.Alloc, but sometimes it will fail.
Now you know the reason.
reference:
http://dotnet.dzone.com/news/net-memory-control-use-gchandl

Memory Management Of Unmanaged Component By CLR

I am a little confused; maybe this question is a very silly one.
Where is the memory for an unmanaged component allocated?
If I instantiate an unmanaged component in my .NET code, where is this component loaded and where is its memory allocated?
How does the CLR marshal calls between the managed and unmanaged heaps?
EDIT
Thanks for your reply, but what I am asking is this: suppose I do a DllImport of User32.dll, which is clearly an unmanaged DLL, and I call some function in User32.dll. How does the CLR marshal my call to this unmanaged DLL?
It starts out pretty easy. The pinvoke marshaller first calls LoadLibrary and passes the DLL name you specified, the DllImportAttribute.Value property. In your case, user32.dll is already loaded because it gets loaded by the .NET bootstrapper, its reference count just gets incremented. But normally the Windows loader gets the DLL mapped into the address space of the process so the exported functions can be called.
Next is GetProcAddress to get the address of the function to call, the DllImportAttribute.EntryPoint property. The marshaller makes a couple of tries unless you used ExactSpelling. A function name like "foo" is tested several possible ways, foo and fooW or fooA. Nasty implementation detail of Win32 related to the difference between Unicode and Ansi characters. The CharSet property matters here.
Now I need to wave hands a bit because it gets tricky. The marshaller constructs a stack frame, setting up the arguments that need to be passed to the exported function. This requires low level code, carefully excluded from prying eyes. Take it at face value that it performs the kind of translations that the Marshal class supports to convert between managed and unmanaged types. The DllImportAttribute.CallingConvention property matters here because that determines what argument value needs to be placed where so that the called function can read it properly.
Next it sets up an SEH exception handler so that hardware exceptions raised by the called code can be caught and translated into a managed exception. The most common one is AccessViolationException, but there are others.
Next, it pushes a special cookie on the stack to indicate that unmanaged code is about to start using the stack. This prevents the garbage collector from blundering into unmanaged stack frames and interpreting the pointers it finds there as managed object references. You can see this cookie in the debugger's call stack, [Managed to Native Transition].
Next, just an indirect call to the function address as found with GetProcAddress(). That gets the unmanaged code running.
After the call, cleanup might need to be done to release memory that was allocated to pass the unmanaged arguments. The return value might need to be translated back to a managed value. And that's it, assuming nothing nasty happened, execution continues on the next managed code statement.
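The DllImportAttribute properties mentioned in these steps are all set on the declaration. As a sketch, here is a declaration for the user32.dll MessageBox function; it is declaration-only, the NativeMethods class name is just a convention, and actually calling it only works on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeMethods
{
    // The first argument is the DLL name passed to LoadLibrary.
    // EntryPoint is the name looked up with GetProcAddress. With
    // ExactSpelling = false and CharSet.Unicode, the "MessageBoxW"
    // spelling of the export is probed as well.
    [DllImport("user32.dll",
        EntryPoint = "MessageBox",
        CharSet = CharSet.Unicode,
        ExactSpelling = false,
        CallingConvention = CallingConvention.Winapi,
        SetLastError = true)]
    public static extern int MessageBox(IntPtr hWnd, string text, string caption, uint type);
}
```

Nothing is loaded when this class is compiled. The LoadLibrary and GetProcAddress steps described above happen lazily, the first time MessageBox is called.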
Unmanaged memory allocations come from the process heap. You are responsible for allocating/deallocating the memory, since it will not get garbage collected because the GC does not know about these objects.
Just as an academic piece of info expanding on what has been posted here:
There are about 8 different heaps that the CLR uses:
Loader Heap: contains CLR structures and the type system
High Frequency Heap: statics, MethodTables, FieldDescs, interface map
Low Frequency Heap: EEClass, ClassLoader and lookup tables
Stub Heap: stubs for CAS, COM wrappers, P/Invoke
Large Object Heap: memory allocations that require more than 85k bytes
GC Heap: user allocated heap memory private to the app
JIT Code Heap: memory allocated by mscoree (Execution Engine) and the JIT compiler for managed code
Process/Base Heap: interop/unmanaged allocations, native memory, etc
HTH
Part of your question is answered by Michael. I answer the other part.
If the CLR is loaded into an unmanaged process, it is called CLR hosting. This usually involves calling an entry point in the mscoree DLL, after which the default AppDomain is loaded. In such a case, the CLR asks the process for a block of memory, and once that is granted, it becomes the CLR's memory space with its own stack and heap.
