Multi-threaded 'fixed' - c#

I have a huge array that is being analyzed differently by two threads:
Data is large, so no copies are allowed
Threads must process concurrently
Must disable bounds checking for maximum performance
Therefore, each thread looks something like this:
unsafe void Thread(UInt16[] data)
{
fixed(UInt16* pData = data)
{
UInt16* pDataEnd = pData + data.Length;
for(UInt16* pCur=pData; pCur != pDataEnd; pCur++)
{
// do stuff
}
}
}
Since there is no mutex (intentionally), I'm wondering whether it's safe to use two fixed statements on the same data on parallel threads. Presumably the second fixed should return the same pointer as the first, because the memory is already pinned, and when the first completes it won't really unpin the memory, because a second fixed is still active. Has anyone tried this scenario?

According to "CLR via C#" it is safe to do so.
The compiler sets a 'pinned' flag on pData variable (on the pointer, not on the array instance).
So multiple/recursive use should be OK.

Maybe instead of using fixed, you could use GCHandle.Alloc to pin the array:
// not inside your thread, but where you initialize your shared array
GCHandle handle = GCHandle.Alloc(anArray, GCHandleType.Pinned);
IntPtr intPtr = handle.AddrOfPinnedObject();
// call handle.Free() once every thread has finished with the array
// your thread
void Worker(IntPtr pArray)
{
unsafe
{
UInt16* ptr = (UInt16*) pArray.ToPointer();
....
}
}

If all you need to do is
for(int i = 0; i < data.Length; i++)
{
// do stuff with data[i]
}
the bounds check is eliminated by the JIT compiler. So no need for unsafe code.
Note that this does not hold if your access pattern is more complex than that.


GPU global memory calculation

In the worst case, does this sample allocate testCnt * xArray.Length storage in the GPU global memory? How to make sure just one copy of the array is transferred to the device? The GpuManaged attribute seems to serve this purpose but it doesn't solve our unexpected memory consumption.
void Worker(int ix, byte[] array)
{
// process array - only read access
}
void Run()
{
var xArray = new byte[100];
var testCnt = 10;
Gpu.Default.For(0, testCnt, ix => Worker(ix, xArray));
}
EDIT
The main question in a more precise form:
Does each worker thread get a fresh copy of xArray or is there only one copy of xArray for all threads?
Your sample code should allocate 100 bytes of memory on the GPU and 100 bytes of memory on the CPU.
(.Net adds a bit of overhead, but we can ignore that)
Since you're using implicit memory, some resources need to be allocated to track that memory (basically where it lives: CPU or GPU).
Now... You're probably seeing a bigger memory consumption on the CPU side I assume.
The reason for that is possibly due to kernel compilation happening on the fly.
AleaGPU has to compile your IL code into LLVM; that LLVM is fed into the CUDA compiler, which in turn converts it into PTX.
This happens when you run a kernel for the first time.
All of the resources and unmanaged dlls are loaded into memory.
That's possibly what you're seeing.
testCnt has no effect on the amount of memory being allocated.
EDIT*
One suggestion is to use memory in an explicit way.
It's faster and more efficient:
private static void Run()
{
var input = Gpu.Default.AllocateDevice<byte>(100);
var deviceptr = input.Ptr;
Gpu.Default.For(0, input.Length, i => Worker(i, deviceptr));
Console.WriteLine(string.Join(", ", Gpu.CopyToHost(input)));
}
private static void Worker(int ix, deviceptr<byte> array)
{
array[ix] = 10;
}
Try using explicit memory:
static void Worker(int ix, byte[] array)
{
// you must write something back; note that I changed your Worker
// function to static!
array[ix] += 1;
}
void Run()
{
var gpu = Gpu.Default;
var hostArray = new byte[100];
// set your host array
var deviceArray = gpu.Allocate<byte>(100);
// deviceArray is of type byte[], but deviceArray.Length == 0;
// Gpu.ArrayGetLength(deviceArray) returns the device-side length, 100
Gpu.Copy(hostArray, deviceArray);
var testCnt = 10;
gpu.For(0, testCnt, ix => Worker(ix, deviceArray));
// you must copy memory back
Gpu.Copy(deviceArray, hostArray);
// check your result in hostArray
Gpu.Free(deviceArray);
}

C# "A heap has been corrupted" using sam-ba.dll

I'm writing a C# program that makes a call to the AT91Boot_Scan function in sam-ba.dll. In the documentation for this DLL, the signature for this function is void AT91Boot_Scan(char *pDevList). The purpose of this function is to scan and return a list of connected devices.
Problem: Every time I call this function from C#, the code in the DLL throws an "a heap has been corrupted" exception.
Aside: From what I understand from reading the documentation, the char *pDevList parameter is a pointer to an array of buffers that the function can use to store the device names. However, when calling the method from C#, IntelliSense reports that the signature for this function is actually void AT91Boot_Scan(ref byte pDevList)
I was sort of confused by why this is. A single byte isn't long enough to be a pointer. We would need 4 bytes for 32-bit and 8 bytes for 64-bit... If it's the ref keyword that is making this parameter a pointer, then what byte should I be passing in? The first byte in my array of buffers or the first byte of the first buffer?
Code: The C# method I've written that calls this function is as follows.
/// <summary>
/// Returns a string array containing the names of connected devices
/// </summary>
/// <returns></returns>
private string[] LoadDeviceList()
{
const int MAX_NUM_DEVICES = 10;
const int BYTES_PER_DEVICE_NAME = 100;
SAMBADLL samba = new SAMBADLL();
string[] deviceNames = new string[MAX_NUM_DEVICES];
try
{
unsafe
{
// Allocate an array (of size MAX_NUM_DEVICES) of pointers
byte** deviceList = stackalloc byte*[MAX_NUM_DEVICES];
for (int n = 0; n < MAX_NUM_DEVICES; n++)
{
// Allocate a buffer of size 100 for the device name
byte* deviceNameBuffer = stackalloc byte[BYTES_PER_DEVICE_NAME];
// Assign the buffer to a pointer in the deviceList
deviceList[n] = deviceNameBuffer;
}
// Create a pointer to the deviceList
byte* pointerToStartOfList = *deviceList;
// Call the function. A heap has been corrupted error is thrown here.
samba.AT91Boot_Scan(ref* pointerToStartOfList);
// Read back out the names by converting the bytes to strings
for (int n = 0; n < MAX_NUM_DEVICES; n++)
{
byte[] nameAsBytes = new byte[BYTES_PER_DEVICE_NAME];
Marshal.Copy((IntPtr)deviceList[n], nameAsBytes, 0, BYTES_PER_DEVICE_NAME);
string nameAsString = System.Text.Encoding.UTF8.GetString(nameAsBytes);
deviceNames[n] = nameAsString;
}
}
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
return deviceNames;
}
My attempt at a solution: I noticed that the line byte* pointerToStartOfList = *deviceList; wasn't correctly assigning the pointer for deviceList to pointerToStartOfList. The address was always off by 0x64.
I thought if I hard-coded in a 0x64 offset then the two addresses would match and all would be fine. pointerToStartOfList += 0x64;
However, despite forcing the addresses to match, I was still getting an "a heap has been corrupted" error.
My thoughts: I think that in my code I'm either not creating the array of buffers correctly, or not passing the pointer to that array correctly.
In the end I could not get sam-ba.dll to work. I had tried writing a C++ wrapper around the DLL, but even then it was still throwing the "a heap has been corrupted" error. My final solution was to just embed the SAM-BA executable, sam-ba.exe, along with all of its dependencies inside my C# program.
Then, whenever I needed to use it I would run sam-ba.exe in command line mode and pass it the relevant arguments. Section 5.1 in the SAM-BA documentation provides instructions on how to run sam-ba.exe in command line mode.
i.e.
SAM-BA.exe \usb\ARM0 AT91SAM9G25-EK myCommand.tcl

Jagged array pinning in c#

I have kind of an issue. I am trying to pin a jagged array (which I am using due to the sheer size of the data I am handling):
public void ExampleCode(double[][] variables) {
int nbObservants = variables.Length;
var allHandles = new List<GCHandle>();
double*[] observationsPointersTable = new double*[nbObservants];
double** observationsPointer;
GCHandle handle;
for (int i = 0; i < nbObservants; i++) {
handle = GCHandle.Alloc(variables[i], GCHandleType.Pinned);
allHandles.Add(handle);
observationsPointersTable[i] = (double*) handle.AddrOfPinnedObject(); // no prob here
}
fixed(double** obsPtr = observationsPointersTable) { // works just fine
Console.WriteLine("haha {0}", obsPtr[0][0]);
}
handle = GCHandle.Alloc(observationsPointersTable, GCHandleType.Pinned); // won't work
allHandles.Add(handle);
observationsPointer = (double**) handle.AddrOfPinnedObject();
// ...
foreach (var aHandle in allHandles) {
aHandle.Free();
}
allHandles.Clear();
}
I need to use these double** in multiple parts of my code, and don't really want to explicitly pin them every time I need to use them. It seems to me that, as I can fix them through the usual fixed statement, I should be able to allocate a pinned handle to them.
Is there any way to actually pin a double*[] ?
Apparently there is no way to do this. I therefore settled for using fixed statements instead of allocating pinned handles.

Free the memory of a local array of strings in a C# method

I have recently encountered a problem with the memory my program uses. The cause is an array of strings that I use in a method. More specifically, the program reads an integer array from an external file. Here is my code:
class Program
{
static void Main(string[] args)
{
int[] a = loadData();
for (int i = 0; i < a.Length; i++)
{
Console.WriteLine(a[i]);
}
Console.ReadKey();
}
private static int[] loadData()
{
string[] lines = System.IO.File.ReadAllLines(@"F:\data.txt");
int[] a = new int[lines.Length];
for (int i = 0; i < lines.Length; i++)
{
string[] temp = lines[i].Split(new char[]{','},StringSplitOptions.RemoveEmptyEntries);
a[i] = Convert.ToInt32(temp[0]);
}
return a;
}
}
The file data.txt is about 7.4 MB and has 574285 lines, but when I run the program, the memory shown in Task Manager is 41.6 MB. It seems that the memory of the string array I read in loadData() (string[] lines) is not being freed. How can I free it, given that it is never used later?
You can call GC.Collect() after setting lines to null, but I suggest you first read up on when forcing a collection is appropriate. Calling GC.Collect() is something that you rarely want to do. The purpose of using a language such as C# is that it manages the memory for you. If you want granular control over the memory read in, then you could create a C++ DLL and call into that from your C# code.
Instead of reading the entire file into a string array, you could read it line by line and perform the operations that you need to on that line. That would probably be more efficient as well.
What problem does the 40 MB of used memory cause? How often do you read the data? Would it be worth caching it for future use (assuming the 7 MB is tolerable)?

What is the purpose of anonymous { } blocks in C style languages?

What is the purpose of anonymous { } blocks in C style languages (C, C++, C#)?
Example -
void function()
{
{
int i = 0;
i = i + 1;
}
{
int k = 0;
k = k + 1;
}
}
Edit - Thanks for all of the excellent answers!
It limits the scope of variables to the block inside the { }.
Brackets designate an area of scope - anything declared within the brackets is invisible outside of them.
Furthermore, in C++ an object allocated on the stack (e.g. without the use of 'new') will be destructed when it goes out of scope.
In some cases it can also be a way to highlight a particular piece of a function that the author feels is worthy of attention for people looking at the source. Whether this is a good use or not is debatable, but I have seen it done.
They are often useful for RAII purposes, which means that a given resource will be released when the object goes out of scope. For example:
void function()
{
{
std::ofstream out( "file.txt" );
out << "some data\n";
}
// You can be sure that "out" is closed here
}
By creating a new scope they can be used to define local variables in a switch statement.
e.g.
switch (i)
{
case 0 :
int j = 0; // error: the jump to case 1 bypasses j's initialization
break;
case 1 :
break;
}
vs.
switch (i)
{
case 0 :
{
int j = 0; // ok!
}
break;
case 1 :
break;
}
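The switch case above can be sketched as a small C++ function. This is only an illustration: describe and its labels are hypothetical names, not from the answer. Each case wraps its initialized local in braces, so jumping to a later label never crosses an initialization, and the same variable name can be reused from case to case.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: maps a code to a label, declaring an
// initialized local inside each case.
std::string describe(int code) {
    std::string result;
    switch (code) {
    case 0: {
        // The braces give 'label' its own scope, so a jump to
        // 'case 1' or 'default' never crosses its initialization.
        std::string label = "zero";
        result = label;
        break;
    }
    case 1: {
        // The name can be reused: the previous 'label' ended with its block.
        std::string label = "one";
        result = label;
        break;
    }
    default:
        result = "other";
        break;
    }
    assert(!result.empty());  // every branch assigns a label
    return result;
}
```

Removing the braces around either case body should make a C++ compiler reject the function, because the jump to the following label would skip the initialization of label.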
{ ... } opens up a new scope
In C++, you can use them like this:
void function() {
// ...
{
// lock some mutex.
mutex_locker lock(m_mutex);
// ...
}
// ...
}
Once control goes out of the block, the mutex locker is destroyed. And in its destructor, it would automatically unlock the mutex that it's connected to. That's very often done, and is called RAII (resource acquisition is initialization) and also SBRM (scope bound resource management). Another common application is to allocate memory, and then in the destructor free that memory again.
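A minimal sketch of the same pattern using the standard library's std::lock_guard in place of the answer's mutex_locker (a swap-in, since mutex_locker is not defined in the answer). The names counter_mutex, counter and increment_and_check_unlocked are assumptions made for illustration.

```cpp
#include <cassert>
#include <mutex>

std::mutex counter_mutex;  // hypothetical shared mutex
int counter = 0;           // hypothetical shared state

// Increments the counter inside an anonymous block, then reports
// whether the mutex is free again once the block has ended.
bool increment_and_check_unlocked() {
    {
        std::lock_guard<std::mutex> lock(counter_mutex);  // locks here
        ++counter;
    }  // lock's destructor runs here and unlocks the mutex

    assert(counter > 0);  // sanity check: the increment above happened

    // try_lock normally succeeds only when nobody holds the mutex
    // (the standard permits rare spurious failures).
    bool was_free = counter_mutex.try_lock();
    if (was_free) {
        counter_mutex.unlock();
    }
    return was_free;
}
```

The inner braces keep the critical section as short as the code that needs it; everything after the closing brace runs without the lock held.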
Another purpose is to do several similar things:
void function() {
// set up timer A
{
int config = get_config(TIMER_A);
// ...
}
// set up timer B
{
int config = get_config(TIMER_B);
// ...
}
}
It will keep things separate so one can easily find out the different building blocks. You may use variables having the same name, like the code does above, because they are not visible outside their scope, thus they do not conflict with each other.
Another common use is with OpenGL's glPushMatrix() and glPopMatrix() functions to create logical blocks relating to the matrix stack:
glPushMatrix();
{
glTranslate(...);
glPushMatrix();
{
glRotate(...);
// draw some stuff
}
glPopMatrix();
// maybe draw some more stuff
}
glPopMatrix();
class ExpensiveObject {
public:
ExpensiveObject() {
// acquire a resource
}
~ExpensiveObject() {
// release the resource
}
};
int main() {
// some initial processing
{
ExpensiveObject obj;
// do some expensive stuff with the obj
} // don't worry, the variable's scope ended, so the destructor was called, and the resources were released
// some final processing
}
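To make the destruction point visible, here is a small sketch with a hypothetical Tracer class standing in for ExpensiveObject. It appends to a log when it is constructed and destroyed, so the order of events around the inner block can be inspected. All names are illustrative assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for ExpensiveObject that logs its lifetime.
class Tracer {
public:
    explicit Tracer(std::vector<std::string>& log) : log_(log) {
        log_.push_back("acquire");  // constructor: acquire the resource
    }
    ~Tracer() {
        log_.push_back("release");  // destructor: release the resource
    }
    Tracer(const Tracer&) = delete;
    Tracer& operator=(const Tracer&) = delete;

private:
    std::vector<std::string>& log_;
};

// Records the order of events around an anonymous block.
std::vector<std::string> run_with_inner_block() {
    std::vector<std::string> log;
    log.push_back("start");
    {
        Tracer tracer(log);
        log.push_back("work");
    }  // tracer is destroyed here, before "end" is logged
    log.push_back("end");
    assert(log.size() == 5);
    return log;
}
```

The returned log reads start, acquire, work, release, end: the release happens at the closing brace of the inner block rather than at the end of the function.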
Scoping of course. (Has that horse been beaten to death yet?)
But if you look at the language definition, you see patterns like:
if ( expression ) statement
if ( expression ) statement else statement
switch ( expression ) statement
while ( expression ) statement
do statement while ( expression ) ;
It simplifies the language syntax that compound-statement is just one of several possible statements.
compound-statement: { statement-list(opt) }
statement-list:
statement
statement-list statement
statement:
labeled-statement
expression-statement
compound-statement
selection-statement
iteration-statement
jump-statement
declaration-statement
try-block
You are doing two things.
You are forcing a scope restriction on the variables in that block.
You are enabling sibling code blocks to use the same variable names.
They're very often used for scoping variables, so that variables are local to an arbitrary block defined by the braces. In your example, the variables i and k aren't accessible outside of their enclosing braces, so they can't be modified in any sneaky ways, and those variable names can be re-used elsewhere in your code. Another benefit to using braces to create local scope like this is that in languages with garbage collection, the garbage collector knows that it's safe to clean up out-of-scope variables. That's not available in C/C++, but I believe that it should be in C#.
One simple way to think about it is that the braces define an atomic piece of code, kind of like a namespace, function or method, but without having to actually create a namespace, function or method.
As far as I understand, they are simply for scoping. They allow you to reuse variable names in the parent/sibling scopes, which can be useful from time to time.
EDIT: This question has in fact been answered on another Stack Overflow question. Hope that helps.
As the previous posters mentioned, it limits the use of a variable to the scope in which it is declared.
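In garbage-collected languages such as C# and Java, it also signals that the variables declared within the scope are no longer needed, so the garbage collector can reclaim the memory they reference (although setting the variables to null would have the same effect). Note that a JIT compiler may already treat a local as dead after its last use, regardless of the enclosing block.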
{
int[] myArray = new int[1000];
... // Do some work
}
// The garbage collector can now reclaim the memory used by myArray
It's about scope, which refers to the visibility of variables and methods in one part of a program from another part of that program. Consider this example:
int a=25;
int b=30;
{ //at this point, a=25, b=30
a*=2; //a=50, b=30
b /= 2; //a=50,b=15
int a = b*b; //a=225,b=15 <--- this new a is
// declared in the inner scope
}
//a = 50, b = 15
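The example above can be wrapped in a C++ function that returns the values, as a sketch only: Snapshot and shadow_demo are hypothetical names. It is valid in C and C++, where an inner declaration may shadow an outer one; as far as I know, C# rejects the inner declaration of a with a compile error, so the same snippet would not build there.

```cpp
#include <cassert>

// Hypothetical result type: the outer values after the block, plus
// the value the inner 'a' held while it was alive.
struct Snapshot {
    int outer_a;
    int outer_b;
    int inner_a;
};

Snapshot shadow_demo() {
    int a = 25;
    int b = 30;
    int inner_a = 0;
    {
        a *= 2;         // still the outer a: 50
        b /= 2;         // outer b: 15
        int a = b * b;  // a new a that shadows the outer one: 225
        inner_a = a;    // 'a' now names the inner variable
    }
    // The inner a is gone; 'a' names the outer variable again.
    assert(a == 50);
    return Snapshot{a, b, inner_a};
}
```

Calling shadow_demo() yields outer_a = 50, outer_b = 15 and inner_a = 225, which matches the comments in the example.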
If you are limited to ANSI C, then they could be used to declare variables closer to where you use them:
int main() {
/* Blah blah blah. */
{
int i;
for (i = 0; i < 10; ++i) {
}
}
}
Not necessary with a modern C compiler though.
A useful use case, IMHO, is defining critical sections in C++.
e.g.:
int MyClass::foo()
{
// stuff uncritical for multithreading
...
{
someKindOfScopeLock lock(&mutexForThisCriticalResource);
// stuff critical for multithreading!
}
// stuff uncritical for multithreading
...
}
With an anonymous scope there is no need to call lock/unlock on a mutex or a semaphore explicitly.
I use it for blocks of code that need temporary variables.
One thing to mention is that scope is a compiler-controlled phenomenon. Even though the variables go out of scope (and the compiler will call any destructors; POD types are optimised immediately into the code), they are left on the stack, and any new variables defined in the parent scope do not overwrite them on gcc or clang (even when compiling with -Ofast). However, it is undefined behaviour to access them via their address, because the variables have conceptually gone out of scope at the compiler level, and the compiler will stop you accessing them by their identifiers.
#include <stdio.h>
int main(void) {
int* c;
{
int b = 5;
c=&b;
}
printf("%d", *c); //undefined behaviour but prints 5 for reasons stated above
printf("%d", b); //compiler error, out of scope
return 0;
}
Also, for, if and else all precede anonymous blocks aka. compound statements, which either execute one block or the other block based on a condition.
