Storing a "managed" context parameter in an unmanaged DLL

I don't know if this is a bad idea or not. I'm using an unmanaged DLL (written by me) in C#.
There are some callback functions that can be set up in the DLL, but these can only be mapped to static class members on the C# side.
Since I want to make a callback operate on a particular class instance, I'm wondering if it would be safe to store a class instance pointer inside the DLL's state information.
From the DLL's perspective this will simply be a 32-bit context integer, but from the C# side this will be an actual class "pointer" or "reference", with the callback signature defined something like so:
public delegate void StatusChangeHandler(ContextClass context, int someCallbackValue);
It does compile and it does appear to work, I just don't know if this is guaranteed. Is this an acceptable practice?

One problem I see here is that .NET has a garbage collector, which can move your objects around in memory, so a saved pointer may become invalid. To prevent this for simple types such as arrays, you can pin the object like this (inside an unsafe context):
byte[] b = new byte[1000];
// pin b and get a pointer to its first element.
fixed (byte* ptr = b)
{
// use the fixed pointer to b here. b will not be moved until code leaves the fixed region.
}
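For a class instance, however, the fixed statement cannot be used: C# does not allow taking the address of a managed object reference. Instead, allocate a GCHandle and give the native side the handle value, which stays the same even if the object moves:
var ctx = new Context();
GCHandle handle = GCHandle.Alloc(ctx); // keeps ctx alive until the handle is freed
try
{
StatusChange(GCHandle.ToIntPtr(handle));
// do other stuff, and don't free the handle until you can clear the value in the native library.
}
finally
{
handle.Free();
}
In the callback, GCHandle.FromIntPtr(value).Target gives the object back.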
But really, I think a much simpler and more reliable way is to create a static dictionary for your context objects, and give your native DLL only a key for that dictionary, which could be a number, string or GUID, i.e. anything that is a value, not a pointer.
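A minimal sketch of that dictionary approach, assuming the native DLL hands the integer key back unchanged when it invokes the callback. ContextClass is borrowed from the question; ContextRegistry, Register, Unregister, OnStatusChange and LastStatus are hypothetical names introduced only for illustration, and the native call is simulated by calling the static callback directly.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Stand-in for the question's ContextClass; LastStatus is a hypothetical member.
public sealed class ContextClass
{
    public int LastStatus;
}

// Hypothetical registry: maps integer keys to managed context objects.
public static class ContextRegistry
{
    private static readonly ConcurrentDictionary<int, ContextClass> Contexts =
        new ConcurrentDictionary<int, ContextClass>();
    private static int _nextKey;

    // Returns a plain integer key that is safe to store in native code.
    public static int Register(ContextClass context)
    {
        int key = Interlocked.Increment(ref _nextKey);
        Contexts[key] = context;
        return key;
    }

    // Removes the mapping so the context can be collected.
    public static bool Unregister(int key)
    {
        return Contexts.TryRemove(key, out _);
    }

    // Static callback target: looks the instance up by the key passed back.
    public static void OnStatusChange(int contextKey, int someCallbackValue)
    {
        if (Contexts.TryGetValue(contextKey, out ContextClass context))
        {
            context.LastStatus = someCallbackValue;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var ctx = new ContextClass();
        int key = ContextRegistry.Register(ctx);
        // In real code the DLL would invoke the callback with the stored key.
        ContextRegistry.OnStatusChange(key, 42);
        Console.WriteLine(ctx.LastStatus);
        ContextRegistry.Unregister(key);
    }
}
```

Because the key is only a value, the garbage collector can move the context object freely, and a stale key simply fails the lookup instead of dereferencing invalid memory.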

Related

How does C#/.NET implement pinning of ref/in/out parameters?

In "unsafe" C# code, it is possible to get a pointer to a ref, in, or out parameter by using the fixed statement:
class A
{
unsafe void Test(ref int i)
{
fixed(int* ptr = &i)
{
// Do something with ptr.
}
}
}
The fixed statement "pins" the memory for i in place for the duration of the block so that the GC won't move the memory for i someplace else, which would invalidate ptr.
So my question, which I ask out of curiosity and a desire to better understand the performance implications of pinning ref/in/out parameters, is: How does C# and/or the .NET runtime know what object, if any, actually needs to be pinned? Because if i is a reference to a member field of an object, then doesn't it need to pin that whole object? And if i is a reference to a local variable in the calling function, then isn't there nothing that needs to be pinned at all? Does it somehow walk up the call stack until it finds the actual variable or field referred to by i? (Which sounds potentially expensive.)

Why is the Pinnable<T> class in C# 7.2 defined the way it is?

I'm aware that Pinnable<T> is an internal class used by the methods in the new Unsafe class, and it's not meant to be used anywhere else other than in that class. This question is not about something practical, but it's just to understand why it's been designed like this and to learn a bit more about the language and its various "tricks" like this one.
As a recap, the Pinnable<T> class is defined here, and it looks like this:
[StructLayout(LayoutKind.Sequential)]
internal sealed class Pinnable<T>
{
public T Data;
}
And it's mainly used in the Span<T>.DangerousCreate method, here:
public static Span<T> DangerousCreate(object obj, ref T objectData, int length)
{
Pinnable<T> pinnable = Unsafe.As<Pinnable<T>>(obj);
IntPtr byteOffset = Unsafe.ByteOffset<T>(ref pinnable.Data, ref objectData);
return new Span<T>(pinnable, byteOffset, length);
}
Pinnable<T> exists to keep track of the original object in case the Span<T> instance was created from one (instead of from a native pointer).
Given that the reference type doesn't matter when pinning a reference (fixing both a ref T and Unsafe.As<T, byte>(ref T) works the same), is there a specific reason why the Pinnable<T> class was made generic? The original design in DotNetCross here in fact had a Pinnable class with just a single byte field, and it worked just the same. Is there any reason why using a generic class in this case would be an advantage, other than avoiding a cast of the reference type when writing/reading/returning it?
Is there any other way, other than this unsafe-cast done with Unsafe.As, to get a reference to an object (I mean a reference to the object contents, otherwise it'd be the same as any variable of a class type)? I mean, any way to get a reference (which should basically have the same address of the actual object variable in the first place, right?) to an object without having to pass through some custom defined secondary class.

First of all, the Struct in [StructLayout(LayoutKind.Sequential)] doesn't mean that it is only valid for structs, it means the layout of the actual structure of the fields in memory, be it in a class or in a value type. This controls the actual runtime layout of the data, not just how the type would marshal to unmanaged code. The Sequential is important because without it, the runtime is pretty much free to store the memory however it sees fit, which means that Data may have some padding before it.
From what I understand about the implementation, the reason for Pinnable is to allow creating an instance of Span to a memory that may be moved by the GC, without having to pin the object first. If you don't use actual pointers and just references, nothing at all will need to be pinned.
I have noticed that it was introduced in a commit with a description saying it made Span more "portable" (a bold word for something that does a lot of unsafe things). I can't think of any other reason than something related to alignment for why it is generic. I suppose representing a T in terms of an offset from another T is better than as an offset from a byte. It may happen that the type of the first field may play a role in its actual address, even if the type was marked with LayoutKind.Sequential.
A reference to an object is different from an interior reference to an object (a reference to its data). It is implementation defined, but in .NET Framework, an instance of any class (or a boxed value type) starts with a header consisting of a sync block (for lock) and a pointer to the method table, a.k.a. the type of the object. On 32-bit, the header is 8 bytes, but the actual pointer points to the pointer to the method table (for performance reasons, getting the type happens more often than locking an object).
One way, though not a portable one, of getting a pointer to the start of the data is therefore to cast the object reference to a pointer and add 4 bytes to it (on 32-bit). That is where the first field should start.
Another way I can think of is utilising GCHandle.AddrOfPinnedObject. It is commonly used for accessing array or string data, but it works for other objects:
[StructLayout(LayoutKind.Sequential)]
class Obj
{
public int A;
}
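var obj = new Obj();
// pin the object so its address stays valid
var gc = GCHandle.Alloc(obj, GCHandleType.Pinned);
IntPtr interior = gc.AddrOfPinnedObject();
// write 16 into the first field through the raw address
Marshal.WriteInt32(interior, 0, 16);
Console.WriteLine(obj.A);
// release the pin once the address is no longer needed
gc.Free();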
I think this actually is quite portable, but it still needs to pin the object (there is InternalAddrOfPinnedObject defined in GCHandle, but even if that doesn't check whether the handle is actually pinned, the returned value may not be valid if it was used on a non-pinned object).
Still, the technique Span uses seems like the most portable way of doing that, since a lot of the underlying work is done in pure CIL (like reference arithmetic).

How to pass address of objects created in a C# List to C++ dll?

I have been looking all over Google to find some answers to my questions but do not quite understand what I have found. I have some objects which are created and stored in a C# List after using System.IO to read some text files. After that, I want to send references (using const pointers) to each of these objects to the internal classes in a C++ DLL so that it can use them in some algorithms.
Here is a simple example (not actual code) of what I am doing:
The C# class:
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
public class SimpleClass
{
[MarshalAs(UnmanagedType.LPStr)]
public string Name;
public float Length;
}
with corresponding C struct:
struct SimpleClass
{
const char* Name;
float Length;
};
stored in
List<SimpleClass> ItemList;
after parsing some text files.
Then calling the following dll function:
C#:
[DllImport("SimpleDLL")]
public static extern void AddSimpleReference(SimpleClass inSimple);
C:
void AddSimpleReference(const SimpleClass* inSimple)
{
g_Vector.push_back(inSimple); // g_Vector is a std::vector<const SimpleClass* > type
}
What I have tried is:
for(int i=0; i<ItemList.Count;++i)
{
SimpleClass theSimpleItem = ItemList[i];
AddSimpleReference(theSimpleItem);
}
Initially, I thought it would be easy to get an actual reference/address just by using the assignment operator, since classes in C# are passed by reference, but it turns out that the C++ side always pushes the same address value (the address of the temporary reference) into the container instead of the actual address. How do I get the actual object addresses and send them to the C++ DLL so that it can have read-only access to the C# objects?
UPDATE: Sorry to those who posted answers with unsafe code. I forgot to mention that the C# code is actually used as a script in the Unity game engine, which does not allow the use of unsafe code.

First you need to change your interop signature to take a pointer (and thus make it unsafe).
[DllImport("SimpleDLL")]
public unsafe static extern void AddSimpleReference(SimpleClass* inSimple);
Then, because the GC is free to move objects around in memory as it pleases, you will need to pin the object in memory for the entire time you will need its address on the unmanaged side. For that you need the fixed statement:
SimpleClass theSimpleItem = ItemList[i];
unsafe
{
fixed(SimpleClass* ptr = &theSimpleItem)
{
AddSimpleReference(ptr);
}
}
This would work if AddSimpleReference used the pointer and then discarded it. But you're storing the pointer in a std::vector for later. That won't work, because the pointer will probably become invalid due to the GC moving the original item somewhere else once execution leaves the fixed block.
To solve this, you need to pin the items until you are done with them. To do this you may need to resort to the GCHandle type.
// Change the interop signature, use IntPtr instead (no "unsafe" modifier)
[DllImport("SimpleDLL")]
public static extern void AddSimpleReference(IntPtr inSimple);
// ----
// keep the handles so they can be released later
var handles = new List<GCHandle>();
for (int i = 0; i < ItemList.Count; ++i)
{
SimpleClass theSimpleItem = ItemList[i];
// take a pinned handle before passing the item down.
// note: pinning only works for types with blittable fields, so the
// string member would have to be replaced (for example by an IntPtr).
GCHandle handle = GCHandle.Alloc(theSimpleItem, GCHandleType.Pinned);
// AddrOfPinnedObject gives the address of the object's data
AddSimpleReference(handle.AddrOfPinnedObject());
handles.Add(handle);
}
// ----
// when you're done, don't forget to ensure the handles are freed
// probably in a Dispose method, or a finally block somewhere appropriate
foreach (GCHandle handle in handles)
{
handle.Free();
}
When doing something like this, keep in mind that pinning objects in memory for a long time is a bad idea, because it prevents the garbage collector from doing its job efficiently.
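To keep the pinned lifetime explicit, the handle can be wrapped in a disposable owner. The following is a rough sketch, assuming the data can be laid out as a blittable float[] rather than as class instances; PinnedBuffer and its members are hypothetical names, and the write through the address merely stands in for what native code would do with the stored pointer.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper: owns a pinned array for as long as native code
// may hold its address, and releases the pin on Dispose.
public sealed class PinnedBuffer : IDisposable
{
    private GCHandle _handle;

    public PinnedBuffer(float[] data)
    {
        // float[] is blittable, so a pinned handle is permitted.
        _handle = GCHandle.Alloc(data, GCHandleType.Pinned);
    }

    // Address of the first element; valid until Dispose is called.
    public IntPtr Address
    {
        get { return _handle.AddrOfPinnedObject(); }
    }

    public void Dispose()
    {
        if (_handle.IsAllocated)
        {
            _handle.Free();
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var lengths = new float[] { 1.5f, 2.5f };
        using (var pinned = new PinnedBuffer(lengths))
        {
            // Stand-in for native code writing through the stored pointer.
            Marshal.Copy(new float[] { 3.0f }, 0, pinned.Address, 1);
        }
        Console.WriteLine(lengths[0]);
    }
}
```

The using block bounds how long the array stays pinned, which limits the cost to the garbage collector mentioned above.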

Even though I think this is not a good idea, have a look at unsafe code and memory pinning. Here is a good start on MSDN.
fixed and unsafe keywords are likely what you should be looking for.

You cannot. The C# GC will move objects for fun. Your addresses will go out of scope.

c++/c# Marshal a struct to get a fixed pointer

I have an older app which accepts dll plugins. Each plugin has a function that creates a new instance of a "widget". This function identifies the new "widget" instance by returning a unique int. In the c++ plugin template I have it's done by declaring a new widget struct and returning its pointer cast as an int as a unique identifier (not my idea, but I have to work with it):
struct widget {
...
};
int widget_instance (HWND hwnd) {
widget* myWidget = new widget;
...
return (int) myWidget;
}
I'm working on a c# plugin for this app and I need to replicate this mechanism in c#. How do I declare a struct inside a function, so that each time the function is called it returns a fixed, non-changing pointer to the newly declared struct? My code so far:
public struct Widget {
...
}
public static int widget_instance(uint hwnd) {
Widget w = new Widget();
...
IntPtr myWidget = Marshal.AllocCoTaskMem(Marshal.SizeOf(w));
Marshal.StructureToPtr(w, myWidget, false);
return (int) myWidget;
}
Does this seem right? Do I need to "pin" the struct for this to work as c++ does?
EDIT: To elaborate: the unique ID is not just a simple "cookie" int, it's in fact used later on in the c++ template to access the struct instance by casting the ID as a pointer:
widget* w = (widget*) WidgetID;
w->firstElement = 123;
I assume the c# equivalent would be:
widget w = (widget)Marshal.PtrToStructure((IntPtr)widgetID,typeof(widget));
w.firstElement = 123;

You seem to imply that the returned integer identifier is just a cookie, and isn't treated by the app as a pointer. That would mean that the app never performs operations on the widget, but always asks the plugin to perform the operations on the app's behalf. If that's the case then there's no reason why you can't use your own scheme for assigning cookie identifiers. So...
You could add a "cookie" member to your Widget class, and a static "next cookie" member for assigning cookies. This would be a simple and semantically sound implementation. Obviously it would require thread-safety measures. It would also avoid the C++ technique of casting the "this" pointer to an int, which you say you don't like although I think it's perfectly fine.
The C++ pointer-to-int cast technique might serve a second purpose in addition to providing a unique identifier. When the app gives the plugin a cookie, the plugin can find the widget directly and at zero runtime cost by casting the cookie back to a pointer. Although that's hardly safe. But I don't think you can do that in C# anyway.
Your proposed C# code means that for every widget the app asks for, you wantonly create and initialize a second widget. If you really want to mimic the C++ technique so that your cookies are guaranteed to be unique process-wide, as opposed to just unique within the single plugin, then you could just allocate a one-byte block but don't initialize it. Or even a zero-sized block, but I'm not quite certain that that would yield unique addresses.

Yes, it is correct, since AllocCoTaskMem allocates the memory from the unmanaged COM allocator.
The memory is released or relocated only with the FreeCoTaskMem or ReAllocCoTaskMem functions, respectively.
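As a sketch of how the unmanaged copy might be read and updated later, assuming a blittable Widget with a single hypothetical int field: Marshal.PtrToStructure returns a copy, so a change has to be written back with Marshal.StructureToPtr. WidgetStore and its methods are illustrative names only. The identifier is kept as an IntPtr here because a pointer does not fit in an int in a 64-bit process.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Widget
{
    public int FirstElement; // hypothetical field for illustration
}

// Hypothetical helper around the unmanaged widget block.
public static class WidgetStore
{
    // Allocates unmanaged memory and copies a default Widget into it.
    public static IntPtr Create()
    {
        IntPtr id = Marshal.AllocCoTaskMem(Marshal.SizeOf(typeof(Widget)));
        Marshal.StructureToPtr(new Widget(), id, false);
        return id;
    }

    // Reads a managed copy of the widget stored at id.
    public static Widget Read(IntPtr id)
    {
        return (Widget)Marshal.PtrToStructure(id, typeof(Widget));
    }

    // PtrToStructure yields a copy, so the change must be written back.
    public static void SetFirstElement(IntPtr id, int value)
    {
        Widget w = Read(id);
        w.FirstElement = value;
        Marshal.StructureToPtr(w, id, false);
    }

    // Releases the unmanaged block; id must not be used afterwards.
    public static void Destroy(IntPtr id)
    {
        Marshal.FreeCoTaskMem(id);
    }
}

public static class Program
{
    public static void Main()
    {
        IntPtr id = WidgetStore.Create();
        WidgetStore.SetFirstElement(id, 123);
        Console.WriteLine(WidgetStore.Read(id).FirstElement);
        WidgetStore.Destroy(id);
    }
}
```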

Are ref and out in C# the same as pointers in C++?

I just made a Swap routine in C# like this:
static void Swap(ref int x, ref int y)
{
int temp = x;
x = y;
y = temp;
}
It does the same thing that this C++ code does:
void swap(int *d1, int *d2)
{
int temp=*d1;
*d1=*d2;
*d2=temp;
}
So are the ref and out keywords like pointers for C# without using unsafe code?

They're more limited. You can say ++ on a pointer, but not on a ref or out.
EDIT Some confusion in the comments, so to be absolutely clear: the point here is to compare with the capabilities of pointers. You can't perform the same operation as ptr++ on a ref/out, i.e. make it address an adjacent location in memory. It's true (but irrelevant here) that you can perform the equivalent of (*ptr)++, but that would be to compare it with the capabilities of values, not pointers.
It's a safe bet that they are internally just pointers, because the stack doesn't get moved and C# is carefully organised so that ref and out always refer to an active region of the stack.
EDIT To be absolutely clear again (if it wasn't already clear from the example below), the point here is not that ref/out can only point to the stack. It's that when it points to the stack, it is guaranteed by the language rules not to become a dangling pointer. This guarantee is necessary (and relevant/interesting here) because the stack just discards information in accordance with method call exits, with no checks to ensure that any referrers still exist.
Conversely when ref/out refers to objects in the GC heap it's no surprise that those objects are able to be kept alive as long as necessary: the GC heap is designed precisely for the purpose of retaining objects for any length of time required by their referrers, and provides pinning (see example below) to support situations where the object must not be moved by GC compacting.
If you ever play with interop in unsafe code, you will find that ref is very closely related to pointers. For example, if a COM interface is declared like this:
HRESULT Write(BYTE *pBuffer, UINT size);
The interop assembly will turn it into this:
void Write(ref byte pBuffer, uint size);
And you can do this to call it (I believe the COM interop stuff takes care of pinning the array):
byte[] b = new byte[1000];
obj.Write(ref b[0], (uint)b.Length);
In other words, ref to the first byte gets you access to all of it; it's apparently a pointer to the first byte.

Reference parameters in C# can be used to replace one use of pointers, yes. But not all.
Another common use for pointers is as a means for iterating over an array. Out/ref parameters can not do that, so no, they are not "the same as pointers".
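A small illustration of that limitation, with hypothetical names (RefDemo, DoubleInPlace, DoubleAll): a ref parameter aliases exactly one storage location, so visiting the next element takes the array and an index rather than advancing the ref.

```csharp
using System;

public static class RefDemo
{
    // The ref parameter aliases one int; writes go to the caller's storage.
    public static void DoubleInPlace(ref int value)
    {
        value *= 2;
    }

    public static void DoubleAll(int[] items)
    {
        // The ref cannot be advanced like a C++ pointer, so each element
        // is addressed through the array and an index.
        for (int i = 0; i < items.Length; i++)
        {
            DoubleInPlace(ref items[i]);
        }
    }

    public static void Main()
    {
        var numbers = new[] { 1, 2, 3 };
        DoubleAll(numbers);
        Console.WriteLine(string.Join(", ", numbers));
    }
}
```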

ref and out are only used with function arguments to signify that the argument is to be passed by reference instead of value. In this sense, yes, they are somewhat like pointers in C++ (more like references actually). Read more about it in this article.

The nice thing about using out is that you're guaranteed that the item will be assigned a value -- you will get a compile error if not.
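For instance, in this hypothetical TryHalve method the compiler insists that half is assigned on every path that returns:

```csharp
using System;

public static class OutDemo
{
    // Returns true and sets half when input is even; otherwise false.
    public static bool TryHalve(int input, out int half)
    {
        if (input % 2 != 0)
        {
            // Removing this assignment is a compile error (CS0177):
            // an out parameter must be assigned before the method returns.
            half = 0;
            return false;
        }

        half = input / 2;
        return true;
    }

    public static void Main()
    {
        int result;
        if (TryHalve(10, out result))
        {
            Console.WriteLine(result);
        }
    }
}
```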

Actually, I'd compare them to C++ references rather than pointers. Pointers, in C++ and C, are a more general concept, and references will do what you want.
All of these are undoubtedly pointers under the covers, of course.

While comparisons are in the eye of the beholder...I say no. 'ref' changes the calling convention but not the type of the parameters. In your C++ example, d1 and d2 are of type int*. In C# they are still Int32's, they just happen to be passed by reference instead of by value.
By the way, your C++ code doesn't really swap its inputs in the traditional sense. Generalizing it like so:
template<typename T>
void swap(T *d1, T *d2)
{
T temp = *d1;
*d1 = *d2;
*d2 = temp;
}
...won't work unless all types T have copy constructors, and even then will be much less efficient than swapping pointers.

The short answer is Yes (similar functionality, but not exactly the same mechanism).
As a side note, if you use FxCop to analyse your code, using out and ref will result in a "Microsoft.Design" error of "CA1045:DoNotPassTypesByReference."
