Building off of my marshalling helloworld question, I'm running into issues marshalling an array allocated in C to C#. I've spent hours researching where I might be going wrong, but everything I've tried ends up with errors such as AccessViolationException.
The function that handles creating an array in C is below.
__declspec(dllexport) int __cdecl import_csv(char *path, struct human ***persons, int *numPersons)
{
int res;
FILE *csv;
char line[1024];
struct human **humans;
csv = fopen(path, "r");
if (csv == NULL) {
return errno;
}
*numPersons = 0; // init to sane value
/*
* All I'm trying to do for now is get more than one working.
* Starting with 2 seems reasonable. My test CSV file only has 2 lines.
*/
humans = calloc(2, sizeof(struct human *));
if (humans == NULL)
return ENOMEM;
while (fgets(line, 1024, csv)) {
char *tmp = strdup(line);
struct human *person;
humans[*numPersons] = calloc(1, sizeof(*person));
person = humans[*numPersons]; // easier to work with
if (person == NULL) {
return ENOMEM;
}
person->contact = calloc(1, sizeof(*(person->contact)));
if (person->contact == NULL) {
return ENOMEM;
}
res = parse_human(line, person);
if (res != 0) {
return res;
}
(*numPersons)++;
}
(*persons) = humans;
fclose(csv);
return 0;
}
The C# code:
IntPtr humansPtr = IntPtr.Zero;
int numHumans = 0;
HelloLibrary.import_csv(args[0], ref humansPtr, ref numHumans);
HelloLibrary.human[] humans = new HelloLibrary.human[numHumans];
IntPtr[] ptrs = new IntPtr[numHumans];
IntPtr aIndex = (IntPtr)Marshal.PtrToStructure(humansPtr, typeof(IntPtr));
// Populate the array of IntPtr
for (int i = 0; i < numHumans; i++)
{
ptrs[i] = new IntPtr(aIndex.ToInt64() +
(Marshal.SizeOf(typeof(IntPtr)) * i));
}
// Marshal the array of human structs
for (int i = 0; i < numHumans; i++)
{
humans[i] = (HelloLibrary.human)Marshal.PtrToStructure(
ptrs[i],
typeof(HelloLibrary.human));
}
// Use the marshalled data
foreach (HelloLibrary.human human in humans)
{
Console.WriteLine("first:'{0}'", human.first);
Console.WriteLine("last:'{0}'", human.last);
HelloLibrary.contact_info contact = (HelloLibrary.contact_info)Marshal.
PtrToStructure(human.contact, typeof(HelloLibrary.contact_info));
Console.WriteLine("cell:'{0}'", contact.cell);
Console.WriteLine("home:'{0}'", contact.home);
}
The first human struct gets marshalled fine. I get the access violation exceptions after the first one. I feel like I'm missing something with marshalling structs with struct pointers inside them. I hope I have some simple mistake I'm overlooking. Do you see anything wrong with this code?
See this GitHub gist for full source.
// Populate the array of IntPtr
This is where you went wrong. You are getting back a pointer to an array of pointers. You got the first one correct, actually reading the pointer value from the array. But then your for() loop got it wrong, just adding 4 (or 8) to the first pointer value. Instead of reading them from the array. Fix:
IntPtr[] ptrs = new IntPtr[numHumans];
// Populate the array of IntPtr
for (int i = 0; i < numHumans; i++)
{
ptrs[i] = (IntPtr)Marshal.PtrToStructure(humansPtr, typeof(IntPtr));
humansPtr = new IntPtr(humansPtr.ToInt64() + IntPtr.Size);
}
Or much more cleanly since marshaling arrays of simple types is already supported:
IntPtr[] ptrs = new IntPtr[numHumans];
Marshal.Copy(humansPtr, ptrs, 0, numHumans);
I found the bug by using the Debug + Windows + Memory + Memory 1. Put humansPtr in the Address field, switched to 4-byte integer view and observed that the C code was doing it correctly. Then quickly found out that ptrs[] did not contain the values I saw in the Memory window.
Not sure why you are writing code like this, other than as a mental exercise. It is not the correct way to go about it, you are for example completely ignoring the need to release the memory again. Which is very nontrivial. Parsing CSV files in C# is quite simple and just as fast as doing it in C, it is I/O bound, not execute-bound. You'll easily avoid these almost impossible to debug bugs and get lots of help from the .NET Framework.
Related
I'm trying to hand a string[] as byte** from C# to C++. I have to say, as a disclaimer, that I never really worked with C++ or pointers before. So I may get some things wrong. I encode and pin every string with a code like this:
var strings = new string[] { "Some string", "Another string" };
var handles = new GCHandle[string.Length];
var pointers = new byte*[string.Length];
for (int index = 0; index < strings.Length; index++)
{
var bytes = Encoding.UTF8.GetBytes(strings[index]);
var handle = GCHandle.Alloc(bytes, GCHandleType.Pinned);
handles[index] = handle;
pointers[index] = (byte*)handle.AddrOfPinnedObject().ToPointer();
}
//Release the handles later...
The problem I'm currently having is, that I can't pin the array that contains all the pointers. I tried to use
GCHandle.Alloc(pointers, GCHandleType.Pinned);
which throws an exception and
fixed (byte** pointer = &pointers[0])
won't work, because the array is unfixed outside the fixed block. So I thought I could maybe do something like this
byte** pointer = (byte**)this.pointers[0]
to obtain a pointer, which results in an exception, because I'm trying to read/write protected memory.
I'm currently unsure if it's even possible to accomplish, or if my approach is just wrong. Does anybody has a idea or was in a similar situation?
Thanks for the help!
I'm trying to use Alea to speed up a program I'm working on but I need some help.
What I need to do is a lot of bitcount and bitwise operations with values stored in two arrays.
For each element of my first array I have to do a bitwise & operation with each element of my second array, then count the bits set to 1 of the & result.
If the result is greater than/equal to a certain value I need to exit the inner for and go to the next element of my first array.
The first array is usually a big one, with millions of elements, the second one is usually less than 200.000 elements.
Trying to do all these operations in parallel, here is my code:
[GpuManaged]
private long[] Check(long[] arr1, long[] arr2, int limit)
{
Gpu.FreeAllImplicitMemory(true);
var gpu = Gpu.Default;
long[] result = new long[arr1.Length];
gpu.For(0, arr1.Length, i =>
{
bool found = false;
long b = arr1[i];
for (int i2 = 0; i2 < arr2.Length; i2++)
{
if (LibDevice.__nv_popcll(b & arr2[i2]) >= limit)
{
found = true;
break;
}
}
if (!found)
{
result[i] = b;
}
});
return result;
}
This works as expected but is just a little faster than my version running in parallel on a quad core CPU.
I'm certainly missing something here, it's my very first attempt to write GPU code.
By the way, my NVIDIA is a GeForce GT 740M.
EDIT
The following code is 2x faster than the previous one, at least on my PC. Many thanks to Michael Randall for pointing me in the right direction.
private static int[] CheckWithKernel(Gpu gpu, int[] arr1, int[] arr2, int limit)
{
var lp = new LaunchParam(16, 256);
var result = new int[arr1.Length];
try
{
using (var dArr1 = gpu.AllocateDevice(arr1))
using (var dArr2 = gpu.AllocateDevice(arr2))
using (var dResult = gpu.AllocateDevice<int>(arr1.Length))
{
gpu.Launch(Kernel, lp, arr1.Length, arr2.Length, dArr1.Ptr, dArr2.Ptr, dResult.Ptr, limit);
Gpu.Copy(dResult, result);
return result;
}
}
finally
{
Gpu.Free(arr1);
Gpu.Free(arr2);
Gpu.Free(result);
}
}
private static void Kernel(int a1, int a2, deviceptr<int> arr1, deviceptr<int> arr2, deviceptr<int> arr3, int limit)
{
var iinit = blockIdx.x * blockDim.x + threadIdx.x;
var istep = gridDim.x * blockDim.x;
for (var i = iinit; i < a1; i += istep)
{
bool found = false;
int b = arr1[i];
for (var j = 0; j < a2; j++)
{
if (LibDevice.__nv_popcll(b & arr2[j]) >= limit)
{
found = true;
break;
}
}
if (!found)
{
arr3[i] = b;
}
}
}
Update
It seems pinning wont work with GCHandle.Alloc()
However the point of this answer is you will get a much greater performance gain out of direct memory access.
http://www.aleagpu.com/release/3_0_3/doc/advanced_features_csharp.html
Directly Working with Device Memory
Device memory provides even more flexibility as it also allows all
kind of pointer arithmetics. Device memory is allocated with
Memory<T> Gpu.AllocateDevice<T>(int length)
Memory<T> Gpu.AllocateDevice<T>(T[] array)
The first overload creates a device memory object for the specified
type T and length on the selected GPU. The second one allocates
storage on the GPU and copies the .NET array into it. Both return a
Memory<T> object, which implements IDisposable and can therefore
support the using syntax which ensures proper disposal once the
Memory<T> object goes out of scope. A Memory<T> object has properties
to determine the length, the GPU or the device on which it lives. The
Memory<T>.Ptr property returns a deviceptr<T>, which can be used in
GPU code to access the actual data or to perform pointer arithmetics.
The following example illustrates a simple use case of device
pointers. The kernel only operates on part of the data, defined by an
offset.
using (var dArg1 = gpu.AllocateDevice(arg1))
using (var dArg2 = gpu.AllocateDevice(arg2))
using (var dOutput = gpu.AllocateDevice<int>(Length/2))
{
// pointer arithmetics to access subset of data
gpu.Launch(Kernel, lp, dOutput.Length, dOutput.Ptr, dArg1.Ptr + Length/2, dArg2.Ptr + Length / 2);
var result = dOutput.ToArray();
var expected = arg1.Skip(Length/2).Zip(arg2.Skip(Length/2), (x, y) => x + y);
Assert.That(result, Is.EqualTo(expected));
}
Original Answer
Disregarding the logic going on, or how relevant this is to GPU code. However you could compliment your Parallel routine and possibly speed things up by by Pinning your Arrays in memory with GCHandle.Alloc() and the GCHandleType.Pinned flag and using Direct Pointer access (if you can run unsafe code)
Notes
You will cop a hit from pinning the memory, however for large arrays you can realize a lot of performance from direct access*
You will have to mark your assembly unsafe in Build Properties*
This is obviously untested and just an example*
You could used fixed, however the Parallel Lambda makes it fiddlier
Example
private unsafe long[] Check(long[] arr1, long[] arr2, int limit)
{
Gpu.FreeAllImplicitMemory(true);
var gpu = Gpu.Default;
var result = new long[arr1.Length];
// Create some pinned memory
var resultHandle = GCHandle.Alloc(result, GCHandleType.Pinned);
var arr2Handle = GCHandle.Alloc(result, GCHandleType.Pinned);
var arr1Handle = GCHandle.Alloc(result, GCHandleType.Pinned);
// Get the addresses
var resultPtr = (int*)resultHandle.AddrOfPinnedObject().ToPointer();
var arr2Ptr = (int*)arr2Handle.AddrOfPinnedObject().ToPointer();
var arr1Ptr = (int*)arr2Handle.AddrOfPinnedObject().ToPointer();
// I hate nasty lambda statements. I always find local methods easier to read.
void Workload(int i)
{
var found = false;
var b = *(arr1Ptr + i);
for (var j = 0; j < arr2.Length; j++)
{
if (LibDevice.__nv_popcll(b & *(arr2Ptr + j)) >= limit)
{
found = true;
break;
}
}
if (!found)
{
*(resultPtr + i) = b;
}
}
try
{
gpu.For(0, arr1.Length, i => Workload(i));
}
finally
{
// Make sure we free resources
arr1Handle.Free();
arr2Handle.Free();
resultHandle.Free();
}
return result;
}
Additional Resources
GCHandle.Alloc Method (Object)
A new GCHandle that protects the object from garbage collection. This
GCHandle must be released with Free when it is no longer needed.
GCHandleType Enumeration
Pinned : This handle type is similar to Normal, but allows the address of the pinned object to be taken. This prevents the garbage
collector from moving the object and hence undermines the efficiency
of the garbage collector. Use the Free method to free the allocated
handle as soon as possible.
Unsafe Code and Pointers (C# Programming Guide)
In the common language runtime (CLR), unsafe code is referred to as
unverifiable code. Unsafe code in C# is not necessarily dangerous; it
is just code whose safety cannot be verified by the CLR. The CLR will
therefore only execute unsafe code if it is in a fully trusted
assembly. If you use unsafe code, it is your responsibility to ensure
that your code does not introduce security risks or pointer errors.
A note, there has since been an update, this:
http://www.aleagpu.com/release/3_0_3/doc/advanced_features_csharp.html
is now this:
http://www.aleagpu.com/release/3_0_4/doc/advanced_features_csharp.html
some of the samples and info have changed or moved in release 3.0.4.
I am currently testing some PInvoke stuff and wrote a short C function to try some different things out. I successfully managed to pass Ints and return an addition, but I am having some trouble when it comes to strings.
Here is the C function:
__declspec(dllexport) int test(char *str, int slen){
for(int i = 0; i < slen; i++){
str[i] = 'a';
}
return slen;
}
And here is the C# function declaration and usage:
[DllImport("solver.dll", CharSet = CharSet.Ansi ,CallingConvention = CallingConvention.Cdecl)]
public static extern int test(StringBuilder sol, int len);
StringBuilder sol = new StringBuilder(15);
int ret = test(sol, sol.Capacity);
string str = sol.ToString();
I've been researching this for most of the day and I've seen several posts about simply passing a StringBuilder, filling it on the C end and it should be accessible when the function finishes. However I am currently getting an AccessViolation error in the C code as if the Memory hadn't been allocated, but I definitely allocate the Memory with new StringBuilder(15)
The C function definitely works if I allocate a piece of memory in the C code and pass it.
Is there something I am missing?
Sounds like you are missing to NUL-terminate the string buffer.
You may want to update your code like this:
for (int i = 0; i < (slen-1); i++) {
str[i] = 'a';
}
str[slen-1] = '\0'; // Add a NUL-terminator
Note that in this case I'm assuming the buffer length passed by the caller is the total length, including room for the NUL-terminator.
(Other conventions are possible as well, for example assuming the length passed by the caller excludes the NUL-terminator; the important thing is clarity of the interface documentation.)
I'd also like to add that a usual calling convention for those exported functions is __stdcall instead of __cdecl (although that's not correlated to your problem).
I have spent the last day searching documentation, reviewing forum posts, and googling to try to do something that I am guessing could be easily done with the right information.
I have a very large existing C++ application that has an already defined COM server with many methods exposed. I am trying to use those COM methods in a C# application (I am experienced in C++, but a C# newbie).
So in my VS2010 C# application, I add the COM server as a reference. COM Methods are visible in the object browser, and passing single-valued strings, floats, and ints seems to work fine.
But I am stumped trying to read SAFEARRAY values passed out of the C++ COM server into the C# application. Eventually I need to pass string arrays from C++ server to the C# app, but in just testing passing an array of floats, I have code that builds but fails with the following exception when I try to cast the System.Object containing the array of floats to (float[]) ,
"exception {System.InvalidCastException: Unable to cast object of type 'System.Object[]' to type 'System.Single[]'."
With intellisence I can see that the Object contains the correct 8760 long array of floats, but I am unable to access that data in C#.
Here is the code on the C# side (d2RuleSet is an interface defined in the DOE2Com COM server). The exception above is thrown on the last line below.
DOE2ComLib.DOE2Com d2RuleSet;
d2RuleSet = new DOE2ComLib.DOE2Com();
System.Int32 i_Series =0;
System.Object pv_WeatherData;
float[] faWeatherData;
iOut = d2RuleSet.GetWeatherData(i_Series, out pv_WeatherData);
Type typeTest;
typeTest = pv_WeatherData.GetType();
int iArrayRank = typeTest.GetArrayRank();
Type typeElement = typeTest.GetElementType();
faWeatherData = (float[])pv_WeatherData;
Below is the section in the idl file defining the C++ COM method
HRESULT GetWeatherData( [in] int iSeries, [out] VARIANT* pvWeatherData, [out,retval] int * piErrorCode);
Below is the C++ code where the VARIANT data is loaded.
void CDOE2BaseClass::GetWeatherData( int iSeries, VARIANT* pvWeatherData, int* piErrorCode)
{
*piErrorCode = 0;
if (iSeries < 0 || iSeries >= D2CWS_NumSeries)
*piErrorCode = 1;
else if (m_faWeatherData[iSeries] == NULL)
*piErrorCode = 3;
else
{
SAFEARRAYBOUND rgsaBound;
rgsaBound.lLbound = 0;
rgsaBound.cElements = 8760;
// First lets create the SafeArrays (populated with VARIANTS to ensure compatibility with VB and Java)
SAFEARRAY* pSAData = SafeArrayCreate( VT_VARIANT, 1, &rgsaBound );
if( pSAData == NULL ) {
#ifndef _DOE2LIB
_com_issue_error( E_OUTOFMEMORY);
#else
//RW_TO_DO - Throw custom Lib-version exception
OurThrowDOE2LibException(-1,__FILE__,__LINE__,0,"OUT OF MEMORY");
#endif //_DOE2LIB
}
for (long hr=0; hr<8760; hr++)
{
COleVariant vHrResult( m_faWeatherData[iSeries][hr] );
SafeArrayPutElement( pSAData, &hr, vHrResult );
}
// Now that we have populated the SAFEARRAY, assign it to the VARIANT pointer that we are returning to the client.
V_VT( pvWeatherData ) = VT_ARRAY | VT_VARIANT;
V_ARRAY( pvWeatherData ) = pSAData;
}
}
Thanks in advance for help with this problem, I feel like I spent too much time on what should be a simple problem. Also please post up any links or books that that cover interop between native C++ and C# well (I think I have already ping-ponged through most of the Visual Studio/MSDN documentation, but maybe I missed something there too).
-----------------End Of Original Question------------------------------------------------------
I am editing to post the code from phoog's successful solution below, so others can read and use it.
int iOut = 0;
System.Int32 i_Series =0;
System.Object pv_WeatherData = null;
iOut = d2RuleSet.GetWeatherData(i_Series, out pv_WeatherData);
Type typeTest;
typeTest = pv_WeatherData.GetType();
int iArrayRank = typeTest.GetArrayRank();
Type typeElement = typeTest.GetElementType();
//float[] faWeatherData = (float[])pv_WeatherData;
float[] faWeatherData = ConvertTheArray((object[])pv_WeatherData);
....
float[] ConvertTheArray(object[] inputArray)
{
float[] result = new float[inputArray.Length];
for (var index = 0; index < result.Length; index++)
result[index] = (float)inputArray[index];
return result;
}
The marshalled SAFEARRAY is a System.Object[]; you can't reference-convert that to a System.Single[]. You have to cast the individual elements. This would work, assuming that all the elements of the argument array are in fact boxed floats:
float[] ConvertTheArray(object[] inputArray)
{
float[] result = new float[inputArray.Length];
for (var index = 0; index < result.Length; index++)
result[index] = (float)inputArray[index];
return result;
}
You could do that with a lot less typing using Linq, but as you're a C# newbie I thought a more basic solution might be better.
EDIT
Since you indicated in your comment that your object[] is referenced as an object, here's the usage example:
object obj = GetMarshalledArray();
float[] floats = ConvertTheArray((object[])obj);
EDIT 2
A shorter solution using Linq:
float[] ConvertTheArray(object[] inputArray)
{
return inputArray.Cast<float>().ToArray();
}
I'm following the pinvoke code provided here but am slightly scared by the marshalling of the variable-length array as size=1 and then stepping through it by calculating an offset instead of indexing into an array. Isn't there a better way? And if not, how should I do this to make it safe for 32-bit and 64-bit?
[StructLayout(LayoutKind.Sequential)]
public struct SID_AND_ATTRIBUTES
{
public IntPtr Sid;
public uint Attributes;
}
[StructLayout(LayoutKind.Sequential)]
public struct TOKEN_GROUPS
{
public int GroupCount;
[MarshalAs(UnmanagedType.ByValArray, SizeConst = 1)]
public SID_AND_ATTRIBUTES[] Groups;
};
public void SomeMethod()
{
IntPtr tokenInformation;
// ...
string retVal = string.Empty;
TOKEN_GROUPS groups = (TOKEN_GROUPS)Marshal.PtrToStructure(tokenInformation, typeof(TOKEN_GROUPS));
int sidAndAttrSize = Marshal.SizeOf(new SID_AND_ATTRIBUTES());
for (int i = 0; i < groups.GroupCount; i++)
{
// *** Scary line here:
SID_AND_ATTRIBUTES sidAndAttributes = (SID_AND_ATTRIBUTES)Marshal.PtrToStructure(
new IntPtr(tokenInformation.ToInt64() + i * sidAndAttrSize + IntPtr.Size),
typeof(SID_AND_ATTRIBUTES));
// ...
}
I see here another approach of declaring the length of the array as much bigger than it's likely to be, but that seemed to have its own problems.
As a side question: When I step through the above code in the debugger I'm not able to evaluate tokenInformation.ToInt64() or ToInt32(). I get an ArgumentOutOfRangeException. But the line of code executes just fine!? What's going on here?
Instead of guessing what the offset, is its generally better to use Marshal.OffsetOf(typeof(TOKEN_GROUPS), "Groups") to get the correct offset to the start of the array.
I think it looks okay -- as okay as any poking about in unmanaged land is, anyway.
However, I wonder why the start is tokenInformation.ToInt64() + IntPtr.Size and not tokenInformation.ToInt64() + 4 (as the GroupCount field type is an int and not IntPtr). Is this for packing/alignment of the structure or just something fishy? I do not know here.
Using tokenInformation.ToInt64() is important because on a 64-bit machine will explode (OverflowException) if the IntPtr value is larger than what an int can store. However, the CLR will handle a long just fine on both architectures and it doesn't change the actual value extracted from the IntPtr (and thus put back into the new IntPtr(...)).
Imagine this (untested) function as a convenience wrapper:
// unpacks an array of structures from unmanaged memory
// arr.Length is the number of items to unpack. don't overrun.
void PtrToStructureArray<T>(T[] arr, IntPtr start, int stride) {
long ptr = start.ToInt64();
for (int i = 0; i < arr.Length; i++, ptr += stride) {
arr[i] = (T)Marshal.PtrToStructure(new IntPtr(ptr), typeof(T));
}
}
var attributes = new SID_AND_ATTRIBUTES[groups.GroupCount];
PtrToStructureArray(attributes, new IntPtr(tokenInformation.ToInt64() + IntPtr.Size), sidAndAttrSize);
Happy coding.