C#: generically convert unmanaged array to managed list

I am dealing with a set of native functions that return data through dynamically-allocated arrays. The functions take a reference pointer as input, then point it to the resulting array.
For example:
typedef struct result
{
//..Some Members..//
} result;
extern int WINAPI getInfo(result**);
After the call, 'result' points to a null-terminated array of result*.
I want to create a managed list from this unmanaged array. I can do the following:
struct Result
{
//..The Same Members..//
}
public static unsafe List<Result> getManagedResultList(Result** unmanagedArray)
{
List<Result> resultList = new List<Result>();
while (*unmanagedArray != null)
{
resultList.Add(**unmanagedArray);
++unmanagedArray;
}
return resultList;
}
This works, but it will be tedious and ugly to reimplement for every type of struct that I'll have to deal with (~35). I'd like a solution that is generic over the type of struct in the array. To that end, I tried:
public static unsafe List<T> unmanagedArrToList<T>(T** unmanagedArray)
{
List<T> result = new List<T>();
while (*unmanagedArray != null)
{
result.Add((**unmanagedArray));
++unmanagedArray;
}
return result;
}
But that won't compile because you cannot "take the address of, get the size of, or declare a pointer to a managed type('T')".
I also tried to do this without using unsafe code, but I ran into the problem that Marshal.Copy() needs to know the size of the unmanaged array. I could only determine this using unsafe code, so there seemed to be no benefit to using Marshal.Copy() in this case.
What am I missing? Could someone suggest a generic approach to this problem?

You can make a reasonable assumption that size and representation of all pointers is the same (not sure if C# spec guarantees this, but in practice you'll find it to be the case). So you can treat your T** as IntPtr*. Also, I don't see how Marshal.Copy would help you here, since it only has overloads for built-in types. So:
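public static unsafe List<T> unmanagedArrToList<T>(IntPtr* p)
{
List<T> result = new List<T>();
// IntPtr is a struct, so compare against IntPtr.Zero rather than null
for (; *p != IntPtr.Zero; ++p)
{
// Marshal lives in System.Runtime.InteropServices
T item = (T)Marshal.PtrToStructure(*p, typeof(T));
result.Add(item);
}
return result;
}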
Of course you'll need an explicit cast to IntPtr* whenever you call this, but at least there's no code duplication otherwise.
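For what it's worth, the same walk can probably be done without any unsafe code by reading each pointer with Marshal.ReadIntPtr. The following is only a sketch under assumptions: the Result layout, the NativeArrays class, PointerArrayToList and Demo are hypothetical names made up for illustration, and Demo fakes the "native" array in unmanaged memory instead of calling the real getInfo.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Result            // hypothetical stand-in for one of the native structs
{
    public int Id;
    public double Value;
}

public static class NativeArrays
{
    // Copies every element of a null-terminated array of struct pointers into a list.
    public static List<T> PointerArrayToList<T>(IntPtr array) where T : struct
    {
        var list = new List<T>();
        for (int offset = 0; ; offset += IntPtr.Size)
        {
            IntPtr element = Marshal.ReadIntPtr(array, offset);
            if (element == IntPtr.Zero)
                break;                                   // terminator reached
            list.Add((T)Marshal.PtrToStructure(element, typeof(T)));
        }
        return list;
    }

    // Builds a fake "native" array in unmanaged memory, converts it, then frees it.
    public static List<Result> Demo()
    {
        var source = new[]
        {
            new Result { Id = 1, Value = 1.5 },
            new Result { Id = 2, Value = 2.5 }
        };
        var blocks = new List<IntPtr>();
        IntPtr table = Marshal.AllocHGlobal((source.Length + 1) * IntPtr.Size);
        try
        {
            for (int i = 0; i < source.Length; i++)
            {
                IntPtr block = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(Result)));
                blocks.Add(block);
                Marshal.StructureToPtr(source[i], block, false);
                Marshal.WriteIntPtr(table, i * IntPtr.Size, block);
            }
            // null terminator, as described in the question
            Marshal.WriteIntPtr(table, source.Length * IntPtr.Size, IntPtr.Zero);
            return PointerArrayToList<Result>(table);
        }
        finally
        {
            foreach (IntPtr block in blocks) Marshal.FreeHGlobal(block);
            Marshal.FreeHGlobal(table);
        }
    }
}
```

NativeArrays.Demo() should return two elements with Id 1 and 2. With this shape the P/Invoke signature could hand back a plain IntPtr, so the caller needs neither the IntPtr* cast nor an unsafe block. Who frees the native array is still up to the native API.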

You said:
"Marshal.Copy() needs to know the size of the unmanaged array. I could only determine this using unsafe code"
It seems that you're missing Marshal.SizeOf().
From what you've mentioned in the post, that may be enough to solve your problem. (Also, the parameter of your function may need to be Object** instead of T**.)

Related

How to extract a struct array element into a variable without copying in C#?

I have a little question about arrays of structs in C#: let's say I have a struct Foo:
struct Foo
{
public string S;
public int X;
...
...
}
and I have an array of Foo:
Foo[] arr = ...
In one method, I use arr[i] quite often, so I'd like to keep it in a local variable (the expression for i is also a little long):
var f = arr[i];
Now, my problem is that I know structs are value type, which means assignments like this cause a copy. The struct is a little big (7 strings and a bool), so I'd prefer to avoid copying in this case.
If I am not mistaken, the only way to access the struct's fields without copying the struct is to use the array directly: arr[i].S or arr[i].X, but this quickly becomes annoying to read. I'd really like to keep the array element in a local variable, but I don't want to waste performance by copying it into the variable.
Is there a way to make something like a reference variable (similar to C++) to avoid copying? If not, then I'm curious whether it's something the compiler optimizes.
How should I deal with this element? Can I put it in a local variable without copying or do I have to access it through the array to avoid copying?
Thanks in advance.
You can do this in C# 7 and later using ref local variables:
using System;
public struct LargeStruct
{
public string Text;
public int Number;
}
class Test
{
static void Main()
{
LargeStruct[] array = new LargeStruct[5];
// elementRef isn't a copy of the array value -
// it's really the variable in the array
ref LargeStruct elementRef = ref array[2];
elementRef.Text = "Hello";
Console.WriteLine(array[2].Text); // Prints "Hello"
}
}
Of course, I'd normally recommend avoiding:
Large structs
Mutable structs
Public fields (although if it's mutable, doing that via public fields is probably best)
... but I acknowledge there are always exceptions.
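To make the difference between the copy and the ref local concrete, here is a small sketch using the Foo struct from the question. RefLocalDemo and Run are hypothetical names used only for illustration.

```csharp
using System;

public struct Foo
{
    public string S;
    public int X;
}

public static class RefLocalDemo
{
    // Returns arr[1].X after writing through a copy, then after writing through a ref local.
    public static (int afterCopyWrite, int afterRefWrite) Run()
    {
        var arr = new Foo[3];

        Foo copy = arr[1];            // value-type assignment: copies the whole struct
        copy.X = 10;                  // modifies only the local copy
        int afterCopyWrite = arr[1].X;

        ref Foo alias = ref arr[1];   // no copy: alias refers to the array slot itself
        alias.X = 20;                 // writes straight into the array
        int afterRefWrite = arr[1].X;

        return (afterCopyWrite, afterRefWrite);
    }
}
```

Run() returns (0, 20): the write through the copy never reaches the array, while the write through the ref local does.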

__makeref as a way to get a reference value in C#?

I've seen this code, which was used to show the reference value:
static void Main(string[] args)
{
string s1 = "ab";
string s2 = "a"+"b";
string s3 = new StringBuilder("a").Append("b").ToString();
Console.WriteLine(GetMemoryAddress(s1));
Console.WriteLine(GetMemoryAddress(s2));
Console.WriteLine(GetMemoryAddress(s3));
}
static IntPtr GetMemoryAddress(object s1)
{
unsafe
{
TypedReference tr = __makeref(s1);
IntPtr ptr = **(IntPtr**) (&tr);
return ptr;
}
}
Result (as expected):
(I know that string interning kicks in here, but that's not the question).
Question:
Although it seems to do the job:
Is using __makeref the right way of getting the reference value in C#?
Or are there any situations in which this would fail?
Although it seems to do the job, is using __makeref the right way of getting the reference value in C#?
There is no "right way" of doing this in C# - it isn't something you're meant to try and do, but: in terms of what it is doing - this is essentially relying on the internal layout of TypedReference and a type coercion; it'll work (as long as TypedReference doesn't change internally - for example the order of the Type and Value fields changes), but... it is nasty.
There is a more direct approach; in IL, you can convert from a managed pointer to an unmanaged pointer silently. Which means you can do something nasty like:
unsafe delegate void* RefToPointer(object obj);
static RefToPointer GetRef { get; } = MakeGetRef();
static RefToPointer MakeGetRef()
{
var dm = new DynamicMethod("evil", typeof(void*), new[] { typeof(object) });
var il = dm.GetILGenerator();
il.Emit(OpCodes.Ldarg_0);
il.Emit(OpCodes.Ret);
return (RefToPointer)dm.CreateDelegate(typeof(RefToPointer));
}
and now you can just do:
var ptr = new IntPtr(GetRef(o));
Console.WriteLine(ptr);
This is horrible, and you should never do it - and of course the GC can move things while you're not looking (or even while you are looking), but... it works.
Whether ref-emit is "better" than undocumented and unsupported language features like __makeref and type coercion is a matter of some debate. Hopefully purely academic debate!

Implementing `.CopyTo(Array,int)` for the array itself as a target array(to be copied into)

In a class definition, I implemented IList<T> to make it look like an array.
// Foo has C++ arrays inside for a
// fast communication with some hardware
public abstract class Foo<T> : IList<T(or uint for a derived class)>
{
public virtual void CopyTo(uint[] array, int arrayIndex)
{
int dL = Length;
if (dL == array.Length)
{
/* needs pinning the target before this?*/
Marshal.Copy(handleForFooUnmanagedArray,
(int[])(object) array,
arrayIndex,
dL - arrayIndex);
return;
}
throw new NotImplementedException();
}
}
so it can do this now:
uint [] bar = new uint[L];
foo.CopyTo(bar,0);
but now I want to make it work like an array with this:
uint [] bar = new uint[L];
bar.CopyTo(foo,0);
So I looked at which interfaces an array implements at run time, hoping to find something like a private .CopyFrom that I thought .CopyTo would call implicitly:
IList
ICloneable
ICollection
IEnumerable
IStructuralComparable
IStructuralEquatable
None of these has a .CopyFrom.
Maybe there is some IntPtr property as a handle for Marshal copying in .CopyTo but I couldn't see it in intellisense.
Question:
How can I find out which method .CopyTo uses to get the necessary information about the target array, and what would that information be? Is it another method like .CopyFrom, a handle pointing to the start of the target array, or some intermediate code stored somewhere? Is the target array pinned in the process?
Side question:
Do I need to implement some extra methods in IList<T> on top of the important (unknown) ones?
I have already implemented ToArray, Count and [], but I haven't done anything for the others yet. Foo also has Length (through a custom interface), but that doesn't belong to Array, so a uint[] may not use it in its CopyTo.
I'm not experienced with IL, so I may not understand it if that's the solution, but I can come back to it later.
Also, I tried to derive from Array, which is refused because it is a special class.
Thank you very much for your time.
CopyTo is implemented in unmanaged code by the runtime itself, and the signature of the underlying method looks like this:
[MethodImpl(MethodImplOptions.InternalCall)]
internal static extern void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length, bool reliable);
As you can see, it still expects an Array and not some pointer, so it's hard to do what you want.
But if you can have a managed array inside your Foo then it's easy to achieve the goal - just use implicit conversion to Array like this:
class MyFakeArray {
uint[] _realArray = new uint[10];
public MyFakeArray() {
}
public static implicit operator uint[](MyFakeArray a) {
return a._realArray;
}
}
Then CopyTo will work as expected:
var a = new uint[10];
var fa = new MyFakeArray();
a.CopyTo(fa, 0);

Compare byte[] to T

I want to make a list of pointers to locations that contains a certain value in the process memory of another process. The value can be a short, int, long, string, bool or something else.
My idea is to use generics for this. I have one problem: how can I tell the compiler which type it needs to convert the byte array to?
This is what I made:
public List<IntPtr> ScanProccessFor<T>(T ItemToScanFor)
{
List<IntPtr> Output = new List<IntPtr>();
IntPtr StartOffset = SelectedProcess.MainModule.BaseAddress;
int ScanSize = SelectedProcess.MainModule.ModuleMemorySize;
for (int i = 0; i < ScanSize; i++)
if (ReadMemory(SelectedProcess, StartOffset + i, (UInt16)Marshal.SizeOf(ItemToScanFor)) == ItemToScanFor)
Output.Insert(Output.Count,StartOffset + i);
return Output;
}
How can I tell the compiler that it needs to convert the byte[] to type T?
Your question is a little confusing, but I'll try to answer what I can.
Instead of taking a generic type, I would probably write a method that takes an instance of an interface like IConvertableToByteArray or something.
public interface IConvertableToByteArray
{
byte[] ToByteArray();
}
Then, if you needed to make a specific type compatible with that method, you could write an encapsulating class:
public class IntConvertableToByteArray : IConvertableToByteArray
{
public int Value { get; set; }
public byte[] ToByteArray()
{
// insert conversion logic here; for an int, BitConverter is one option
return BitConverter.GetBytes(Value);
}
}
You could use Marshal.StructureToPtr to get an unmanaged representation of the structure (which has to be a 'simple' structure). You might need to special case strings though.
You should also think about the alignment constraints on what you are searching for -- advancing through memory 1 byte at a time will be very slow and wasteful if the item must be 4 or 8 byte aligned.
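As a rough sketch of the last two suggestions combined: convert the value to its unmanaged byte representation once with Marshal.StructureToPtr, then compare raw bytes at aligned offsets. Pair, ByteScan, ToBytes, FindAll and Demo are hypothetical names, and the byte[] passed to FindAll is assumed to be a snapshot already read from the target process. Note that Marshal.SizeOf reports the marshaled size, so types like bool and char may not have the size you expect, and strings need separate handling.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Pair              // hypothetical 'simple' (blittable) struct
{
    public int A;
    public int B;
}

public static class ByteScan
{
    // Returns the unmanaged byte representation of a blittable struct.
    public static byte[] ToBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf(typeof(T));
        byte[] bytes = new byte[size];
        IntPtr buffer = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.StructureToPtr(value, buffer, false);
            Marshal.Copy(buffer, bytes, 0, size);
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
        return bytes;
    }

    // Returns every offset (a multiple of 'alignment', which must be >= 1)
    // at which 'needle' occurs in 'haystack'.
    public static List<int> FindAll(byte[] haystack, byte[] needle, int alignment)
    {
        var hits = new List<int>();
        for (int i = 0; i + needle.Length <= haystack.Length; i += alignment)
        {
            int j = 0;
            while (j < needle.Length && haystack[i + j] == needle[j]) j++;
            if (j == needle.Length) hits.Add(i);
        }
        return hits;
    }

    // Plants one Pair at offset 8 of a zeroed buffer and scans for it.
    public static List<int> Demo()
    {
        byte[] needle = ToBytes(new Pair { A = 7, B = 9 });
        byte[] memory = new byte[32];
        Array.Copy(needle, 0, memory, 8, needle.Length);
        return FindAll(memory, needle, 4);
    }
}
```

Demo() should return a single hit at offset 8. Scanning with alignment 4 visits a quarter of the offsets that a byte-by-byte scan would.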

IntPtr arithmetics

I tried to allocate an array of structs in this way:
struct T {
int a; int b;
}
data = Marshal.AllocHGlobal(count*Marshal.SizeOf(typeof(T)));
...
I'd like to access the allocated data by "binding" a struct to each element of the array allocated with AllocHGlobal, something like this:
T v;
v = (T)Marshal.PtrToStructure(data+1, typeof(T));
but I can't find any convenient way. Why does IntPtr lack arithmetic? How can I work around this in a "safe" way?
Could someone confirm that PtrToStructure copies the data into the struct variable? In other words, does modifying the struct modify the data in the unmanaged array, or not?
Ultimately, I want to operate on the data pointed to by an IntPtr through a struct, without copying the data each time and without unsafe code.
Thanks, all!
You have four options that I can think of, two using only "safe" code, and two using unsafe code. The unsafe options are likely to be significantly faster.
Safe:
Allocate your array in managed memory, and declare your P/Invoke function to take the array. i.e., instead of:
[DllImport(...)]
static extern bool Foo(int count, IntPtr arrayPtr);
make it
[DllImport(...)]
static extern bool Foo(int count, NativeType[] array);
(I've used NativeType for your struct name instead of T, since T is often used in a generic context.)
The problem with this approach is that, as I understand it, the NativeType[] array will be marshaled twice for every call to Foo. It will be copied from managed memory to unmanaged memory before the call, and copied from unmanaged memory to managed memory afterward. It can be improved, though, if Foo will only read from or write to the array. In this case, decorate the array parameter with an [In] (read only) or [Out] (write only) attribute. This allows the runtime to skip one of the copying steps.
As you're doing now, allocate the array in unmanaged memory, and use a bunch of calls to Marshal.PtrToStructure and Marshal.StructureToPtr. This will likely perform even worse than the first option, as you still need to copy elements of the array back and forth, and you're doing it in steps, so you have more overhead. On the other hand, if you have many elements in the array, but you only access a small number of them in between calls to Foo, then this may perform better. You might want a couple of little helper functions, like so:
static T ReadFromArray<T>(IntPtr arrayPtr, int index){
// below, if you **know** you'll be on a 32-bit platform,
// you can change ToInt64() to ToInt32().
return (T)Marshal.PtrToStructure((IntPtr)(arrayPtr.ToInt64() +
index * Marshal.SizeOf(typeof(T))), typeof(T));
}
// you might change `T value` below to `ref T value` to avoid one more copy
static void WriteToArray<T>(IntPtr arrayPtr, int index, T value){
// below, if you **know** you'll be on a 32-bit platform,
// you can change ToInt64() to ToInt32().
Marshal.StructureToPtr(value, (IntPtr)(arrayPtr.ToInt64() +
index * Marshal.SizeOf(typeof(T))), false);
}
Unsafe:
Allocate your array in unmanaged memory, and use pointers to access the elements. This means that all the code that uses the array must be within an unsafe block.
IntPtr arrayPtr = Marshal.AllocHGlobal(count * Marshal.SizeOf(typeof(NativeType)));
unsafe{
NativeType* ptr = (NativeType*)arrayPtr.ToPointer();
ptr[0].Member1 = foo;
ptr[1].Member2 = bar;
/* and so on */
}
Foo(count, arrayPtr);
Allocate your array in managed memory, and pin it when you need to call the native routine:
NativeType[] array = new NativeType[count];
array[0].Member1 = foo;
array[1].Member2 = bar;
/* and so on */
unsafe{
fixed(NativeType* ptr = array)
Foo(count, (IntPtr)ptr);
// or just Foo(count, ptr), if Foo is declared as such:
// static extern unsafe bool Foo(int count, NativeType* arrayPtr);
}
This last option is probably the cleanest if you can use unsafe code and are concerned about performance, because your only unsafe code is where you call the native routine. If performance isn't an issue (perhaps if the size of the array is relatively small), or if you can't use unsafe code (perhaps you don't have full trust), then the first option is likely cleanest, although, as I mentioned, if the number of elements you'll access in between calls to the native routine is a small percentage of the number of elements within the array, then the second option may be faster.
Note:
The unsafe operations assume that your struct is blittable. If not, then the safe routines are your only option.
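On the question of whether PtrToStructure copies: it does, and a small experiment along the lines of the second safe option shows it. This is only a sketch. The NativeType layout here is a guess (the original struct isn't shown), and UnmanagedArrayDemo, ElementAddress and Demo are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct NativeType        // hypothetical layout; the original struct is not shown
{
    public int Member1;
    public int Member2;
}

public static class UnmanagedArrayDemo
{
    static IntPtr ElementAddress<T>(IntPtr arrayPtr, int index)
    {
        // long arithmetic keeps this correct in both 32-bit and 64-bit processes
        return new IntPtr(arrayPtr.ToInt64() + (long)index * Marshal.SizeOf(typeof(T)));
    }

    public static T ReadFromArray<T>(IntPtr arrayPtr, int index)
    {
        return (T)Marshal.PtrToStructure(ElementAddress<T>(arrayPtr, index), typeof(T));
    }

    public static void WriteToArray<T>(IntPtr arrayPtr, int index, T value)
    {
        Marshal.StructureToPtr(value, ElementAddress<T>(arrayPtr, index), false);
    }

    // Returns Member1 of element 1 before and after an explicit write-back.
    public static int[] Demo()
    {
        const int count = 4;
        IntPtr data = Marshal.AllocHGlobal(count * Marshal.SizeOf(typeof(NativeType)));
        try
        {
            WriteToArray(data, 1, new NativeType { Member1 = 1, Member2 = 2 });

            NativeType local = ReadFromArray<NativeType>(data, 1);
            local.Member1 = 42;                                   // changes only the managed copy
            int before = ReadFromArray<NativeType>(data, 1).Member1;

            WriteToArray(data, 1, local);                         // explicit write-back
            int after = ReadFromArray<NativeType>(data, 1).Member1;
            return new[] { before, after };
        }
        finally
        {
            Marshal.FreeHGlobal(data);
        }
    }
}
```

Demo() returns { 1, 42 }: the unmanaged element keeps its old value until the modified struct is written back with StructureToPtr.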
"Why IntPtr lack of arithmetics?"
IntPtr stores just a memory address. It doesn't have any kind of information about the contents of that memory location. In this manner, it's similar to void*. To enable pointer arithmetic you have to know the size of the object pointed to.
Fundamentally, IntPtr is primarily designed to be used in managed contexts as an opaque handle (i.e. one that you don't directly dereference in managed code and you just keep around to pass to unmanaged code.) unsafe context provides pointers you can manipulate directly.
Indeed, the IntPtr type does not have its own arithmetic operators. Proper (unsafe) pointer arithmetic is supported in C#, but IntPtr and the Marshal class exist for 'safer' usage of pointers.
I think you want something like the following:
int index = 1; // 2nd element of array
// ToInt64() works in both 32-bit and 64-bit processes
var v = (T)Marshal.PtrToStructure(new IntPtr(data.ToInt64() +
index * Marshal.SizeOf(typeof(T))), typeof(T));
Also, note that there is no implicit conversion between int and IntPtr, so no luck there.
Generally, if you're going to be doing anything remotely complex with pointers, it's probably best to opt for unsafe code.
You can use the integral memory address of the pointer structure using IntPtr.ToInt32() but beware of platform "bitness" (32/64).
For typical pointer arithmetics, use pointers (look up fixed and unsafe in the documentation):
// must be inside an unsafe context
T[] data = new T[count];
fixed (T* ptr = data)
{
for (int i = 0; i < count; i++)
{
// now you can use *(ptr + i) or ptr[i]
}
}
EDIT:
As I understand it, IntPtr lets you handle pointers to data without explicitly manipulating pointer addresses. This allows you to interop with COM and native code without having to declare unsafe contexts. The only requirement that the runtime imposes is the unmanaged code permission. For those purposes, it seems like most marshalling methods only accept whole IntPtr data, and not pure integer or long types, as it provides a thin layer that protects against manipulating the content of the structure. You could manipulate the internals of an IntPtr directly, but that either requires unsafe pointers (again unsafe contexts) or reflection. Finally, IntPtr automatically adapts to the platform's pointer size.
You could use Marshal.UnsafeAddrOfPinnedArrayElement to get address of specific elements in an array using an IntPtr from a pinned array.
Here is a sample class for a wrapper around pinned arrays so that I can use them with IntPtr and Marshaling code:
/// <summary>
/// Pins an array of Blittable structs so that we can access the data as bytes. Manages a GCHandle around the array.
/// https://learn.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.marshal.unsafeaddrofpinnedarrayelement?view=netframework-4.7.2
/// </summary>
public sealed class PinnedArray<T> : IDisposable
{
public GCHandle Handle { get; }
public T[] Array { get; }
public int ByteCount { get; private set; }
public IntPtr Ptr { get; private set; }
public IntPtr ElementPointer(int n)
{
return Marshal.UnsafeAddrOfPinnedArrayElement(Array, n);
}
public PinnedArray(T[] xs)
{
Array = xs;
// This will fail if the underlying type is not Blittable (e.g. not contiguous in memory)
Handle = GCHandle.Alloc(xs, GCHandleType.Pinned);
if (xs.Length != 0)
{
Ptr = ElementPointer(0);
ByteCount = (int) Ptr.Distance(ElementPointer(Array.Length));
}
else
{
Ptr = IntPtr.Zero;
ByteCount = 0;
}
}
void DisposeImplementation()
{
if (Ptr != IntPtr.Zero)
{
Handle.Free();
Ptr = IntPtr.Zero;
ByteCount = 0;
}
}
~PinnedArray()
{
DisposeImplementation();
}
public void Dispose()
{
DisposeImplementation();
GC.SuppressFinalize(this);
}
}
IMHO, working with PInvoke and IntPtr is as dangerous as marking your assembly as unsafe and using pointers in an unsafe context (if not more so).
If you don't mind unsafe blocks you can write extension functions that operate on the IntPtr cast to byte* like the following:
public static unsafe long Distance(this IntPtr a, IntPtr b)
{
return Math.Abs(((byte*)b) - ((byte*)a));
}
However, like always you have to be aware of possible alignment issues when casting to different pointer types.
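If unsafe blocks are not an option, the same distance can be computed from the integer values of the two IntPtrs. Below is a minimal sketch that pins an int[] and reads an element back through its address. PinnedArrayDemo and Demo are hypothetical names, and Distance here is a plain static method rather than the extension method above.

```csharp
using System;
using System.Runtime.InteropServices;

public static class PinnedArrayDemo
{
    // Byte distance between two addresses, computed without unsafe code.
    public static long Distance(IntPtr a, IntPtr b)
    {
        return Math.Abs(b.ToInt64() - a.ToInt64());
    }

    // Pins an int[] and returns { byte distance from element 0 to element 2,
    // value read back through the address of element 2 }.
    public static long[] Demo()
    {
        int[] values = { 10, 20, 30, 40 };
        GCHandle handle = GCHandle.Alloc(values, GCHandleType.Pinned);
        try
        {
            IntPtr first = Marshal.UnsafeAddrOfPinnedArrayElement(values, 0);
            IntPtr third = Marshal.UnsafeAddrOfPinnedArrayElement(values, 2);
            return new long[] { Distance(first, third), Marshal.ReadInt32(third) };
        }
        finally
        {
            handle.Free();   // always release the pin, or the GC cannot move the array
        }
    }
}
```

Demo() returns { 8, 30 }: two 4-byte ints separate element 0 from element 2, and the value read through the pointer is the third element.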
