Pass a pointer to an array section as a parameter in C#

Pass a pointer to an array section as a parameter in C# - c#

I'm just learning neural networks and I would like to have the neuron's constructor receive a pointer to a section in an array that would be the chromosome. Something like this:
public int* ChromosomeSection;
public Neuron(int* chromosomeSection)
{
ChromosomeSection = chromosomeSection;
}
So then I would create my neurons with something like this:
int[] Chromosome = new int[neuronsCount * neuronDataSize];
for (int n = 0; n < Chromosome.Length; n += neuronDataSize)
{
AddNeuron(new Neuron(Chromosome + n));
}
Is it possible to do this in C#? I know C# supports some unsafe code. But I don't know how to tell the compiler that the line public Neuron(int* chromosomeSection) is unsafe.
Also, will I be able to do every operation that I would do in C++ or C? Is there any gotcha I should be aware of before starting to do it this way? Never worked with unsafe code in C# before.

Eric Lippert has nice two-part series: References and Pointers, Part One and "managed pointers" implementation (References and Pointers, Part Two).
Here's a handy type I whipped up when I was translating some complex pointer-manipulation code from C to C#. It lets you make a safe "managed pointer" to the interior of an array. You get all the operations you can do on an unmanaged pointer: you can dereference it as an offset into an array, do addition and subtraction, compare two pointers for equality or inequality, and represent a null pointer. But unlike the corresponding unsafe code, this code doesn't mess up the garbage collector and will assert if you do something foolish, like try to compare two pointers that are interior to different arrays. (*) Enjoy!
Hope it helps.

Sounds like you could use ArraySegment<int> for what you are trying to do.
ArraySegment is a wrapper around
an array that delimits a range of
elements in that array. Multiple
ArraySegment instances can refer to
the same original array and can
overlap.
The Array property returns the entire
original array, not a copy of the
array; therefore, changes made to the
array returned by the Array property
are made to the original array.

Yes this is perfectly possible in C#, although a pointer on alone is not sufficient information to use it in this way, you'd also need an Int32 length parameter, so you know how many times it's safe to increment that pointer without an overrun - this should be familiar if you're from a C++ background.

Related

Best way to return array of certain length

I'm new to C# from C programming background, sorry for probably a basic question. I'm trying to find the best way to return an array of values from a method.
In C, we can do either of the following:
void myFuction(double[] inputA, int lenA, double* outputA) : the function gets the input array "inputA" of length "lenA" elements, and returns the "outputA" pointer which is allocated memory before the call to "myFunction" and has got populated in the function "myFunction".
double* myFunction (double[] inputA, int lenA) : alternatively, be allocated memory inside "myFucntion" based on a desired length, gets populated, and returned as "return outputA".
What's the best way to do this in C#?

What's the best way to do this in C#?
Pointers are almost never used directly in C#, so the idiomatic signature would use arrays:
double[] myFunction (double[] inputA, int lenA)
but note that in C# you can also get the length of an array very easily (int len = inputA.Length), so lenA may not be needed unless you want to support processing a subset of the array.
As you progress, you'll find that other data structures like List and interfaces like IEnumerable may be preferable depending on your use case (i.e. you may decide that the function should just return an iterable collection without specifying the actual type of that collection).
As a side note (and from someone who went from C++ to C# myself), the best thing you can do is NOT think of C# as an "extension" or "alternative" to C++. While the syntax and types are very similar, C# as a managed language does many fundamental things very differently, so you may be limiting yourself if you just try to "port" C++ code to C#. Try to think of it as a brand new language and framework with some syntax overlap.

c# allows for both of these methods as well, however arrays are denoted with [] even in returns instead of the * pointer.
double[] myFunction(double[] inputA, int lenA)
{
double[] output = new double[lengthOfArray];
...
return output;
}
Alternatively, c# allows for an "out" parameter that can be used in a similar manner.
void myFunction(double[] inputA, int lenA, out double[] output)
{
...
output = ...
//OR
output[0] = 1; //replace 0 with index #
}
Both are acceptable practices, with the first example being the more common way to do it in c# for readable and organization.

Fixing an array of array in C# (unsafe code)

I'm trying to come up with a solution as to how I can pass an array of arrays from C# into a native function. I already have a delegate to the function (Marshal.GetDelegateForFunctionPointer), but now I'm trying to pass a multidimensional array (or rather; an array of arrays) into it.
This code example works when the input has 2 sub-arrays, but I need to be able to handle any number of sub-arrays. What's the easiest way you can think of to do that? I'd prefer not to copy the data between arrays as this will be happening in a real-time loop (I'm communicating with an audio effect)
public void process(float[][] input)
{
unsafe
{
// If I know how many sub-arrays I have I can just fix them like this... but I need to handle n-many arrays
fixed (float* inp0 = input[0], inp1 = input[1] )
{
// Create the pointer array and put the pointers to input[0] and input[1] into it
float*[] inputArray = new float*[2];
inputArray[0] = inp0;
inputArray[1] = inp1;
fixed(float** inputPtr = inputArray)
{
// C function signature is someFuction(float** input, int numberOfChannels, int length)
functionDelegate(inputPtr, 2, input[0].length);
}
}
}
}

You can pin an object in place without using fixed by instead obtaining a pinned GCHandle to the object in question. Of course, it should go without saying that by doing so you take responsibility for ensuring that the pointer does not survive past the point where the object is unpinned. We call it "unsafe" code for a reason; you get to be responsible for safe memory management, not the runtime.
http://msdn.microsoft.com/en-us/library/system.runtime.interopservices.gchandle.aspx

It makes no sense trying to lock the array of references to the managed arrays.
The references values in there probably don't point to the adress of the first element, and even if they did, that would be an implementation detail. It could change from release to release.
Copying an array of pointers to a lot of data should not be that slow, especcially not when compared with the multimedia processing you are calling into.
If it is significant, allocate your data outside of the managed heap, then there is no pinning or copying. But more bookkeeping.

The easiest way I know is to use one dimension array. It reduce complexity, memory fragmentation and also will have better performance. I actually do so in my project. You can use manual indexing like array[i][j] = oneDimArray[i *n + j] and pass n as param to a function. And you will do only one fixing just like you done in your example:
public void process(float[] oneDimInput, int numberOfColumns)
{
unsafe
{
fixed (float* inputPtr = &oneDimInput[0])
{
// C function signature is someFuction(
// float* input,
// int number of columns in oneDimInput
// int numberOfChannels,
// int length)
functionDelegate(inputPtr, numberOfColumns, 2, oneDimInput[0].length);
}
}
}
Also I need to note, that two dimension arrays rarely used in high performance computation libraries as Intel MKL, Intel IPP and many others. Even BLAS and Lapack interfaces contain only one dimension arrays and emulate two dimension using aproach I've mentioned (for performance reasons).

How do you explain C++ pointers to a C#/Java developer? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 3 years ago.
Improve this question
I am a C#/Java developer trying to learn C++. As I try to learn the concept of pointers, I am struck with the thought that I must have dealt with this concept before. How can pointers be explained using only concepts that are familiar to a .NET or Java developer? Have I really never dealt with this, is it just hidden to me, or do I use it all the time without calling it that?

Java objects in C++
A Java object is the equivalent of a C++ shared pointer.
A C++ pointer is like a Java object without the garbage collection built in.
C++ objects.
C++ has three ways of allocating objects:
Static Storage Duration objects.
These are created at startup (before main) and die after main exits.
There are some technical caveats to that but that is the basics.
Automatic Storage Duration objects.
These are created when declared and destroyed when they go out of scope.
I believe these are like C# structs
Dynamic Storage Duration objects
These are created via new and the closest to a C#/Java object (AKA pointers)
Technically pointers need to be destroyed manually via delete. But this is considered bad practice and under normal situations they are put inside Automatic Storage Duration Objects (usually called smart pointers) that control their lifespan. When the smart pointer goes out of scope it is destroyed and its destructor can call delete on the pointer. Smart pointers can be though of as fine grain garbage collectors.
The closest to Java is the shared_ptr, this is a smart pointer that keeps a count of the number of users of the pointer and deletes it when nobody is using it.

You are "using pointers" all the time in C#, it's just hidden from you.
The best way I reckon to approach the problem is to think about the way a computer works. Forget all of the fancy stuff of .NET: you have the memory, which just holds byte values, and the processor, which just does things to these byte values.
The value of a given variable is stored in memory, so is associated with a memory address. Rather than having to use the memory address all the time, the compiler lets you read from it and write to it using a name.
Furthermore, you can choose to interpret a value as a memory address at which you wish to find another value. This is a pointer.
For example, lets say our memory contains the following values:
Address [0] [1] [2] [3] [4] [5] [6] [7]
Data 5 3 1 8 2 7 9 4
Let's define a variable, x, which the compiler has chosen to put at address 2. It can be seen that the value of x is 1.
Let's now define a pointer, p which the compiler has chosen to put at address 7. The value of p is 4. The value pointed to by p is the value at address 4, which is the value 2. Getting at the value is called dereferencing.
An important concept to note is that there is no such thing as a type as far as memory is concerned: there are just byte values. You can choose to interpret these byte values however you like. For example, dereferencing a char pointer will just get 1 byte representing an ASCII code, but dereferencing an int pointer may get 4 bytes making up a 32 bit value.
Looking at another example, you can create a string in C with the following code:
char *str = "hello, world!";
What that does is says the following:
Put aside some bytes in our stack frame for a variable, which we'll call str.
This variable will hold a memory address, which we wish to interpret as a character.
Copy the address of the first character of the string into the variable.
(The string "hello, world!" will be stored in the executable file and hence will be loaded into memory when the program loads)
If you were to look at the value of str you'd get an integer value which represents an address of the first character of the string. However, if we dereference the pointer (that is, look at what it's pointing to) we'll get the letter 'h'.
If you increment the pointer, str++;, it will now point to the next character. Note that pointer arithmetic is scaled. That means that when you do arithmetic on a pointer, the effect is multiplied by the size of the type it thinks it's pointing at. So assuming int is 4 bytes wide on your system, the following code will actually add 4 to the pointer:
int *ptr = get_me_an_int_ptr();
ptr++;
If you end up going past the end of the string, there's no telling what you'll be pointing at; but your program will still dutifully attempt to interpret it as a character, even if the value was actually supposed to represent an integer for example. You may well be trying to access memory which is not allocated to your program however, and your program will be killed by the operating system.
A final useful tip: arrays and pointer arithmetic are the same thing, it's just syntactic sugar. If you have a variable, char *array, then
array[5]
is completely equivalent to
*(array + 5)

A pointer is the address of an object.
Well, technically a pointer value is the address of an object. A pointer object is an object (variable, call it what you prefer) capable of storing a pointer value, just as an int object is an object capable of storing an integer value.
["Object" in C++ includes instances of class types, and also of built-in types (and arrays, etc). An int variable is an object in C++, if you don't like that then tough luck, because you have to live with it ;-)]
Pointers also have static type, telling the programmer and the compiler what type of object it's the address of.
What's an address? It's one of those 0x-things with numbers and letters it it that you might sometimes have seen in a debugger. For most architectures we can consider memory (RAM, to over-simplify) as a big sequence of bytes. An object is stored in a region of memory. The address of an object is the index of the first byte occupied by that object. So if you have the address, the hardware can get at whatever's stored in the object.
The consequences of using pointers are in some ways the same as the consequences of using references in Java and C# - you're referring to an object indirectly. So you can copy a pointer value around between function calls without having to copy the whole object. You can change an object via one pointer, and other bits of code with pointers to the same object will see the changes. Sharing immutable objects can save memory compared with lots of different objects all having their own copy of the same data that they all need.
C++ also has something it calls "references", which share these properties to do with indirection but are not the same as references in Java. Nor are they the same as pointers in C++ (that's another question).
"I am struck with the thought that I must have dealt with this concept before"
Not necessarily. Languages may be functionally equivalent, in the sense that they all compute the same functions as a Turing machine can compute, but that doesn't mean that every worthwhile concept in programming is explicitly present in every language.
If you wanted to simulate the C memory model in Java or C#, though, I suppose you'd create a very large array of bytes. Pointers would be indexes in the array. Loading an int from a pointer would involve taking 4 bytes starting at that index, and multiplying them by successive powers of 256 to get the total (as happens when you deserialize an int from a bytestream in Java). If that sounds like a ridiculous thing to do, then it's because you haven't dealt with the concept before, but nevertheless it's what your hardware has been doing all along in response to your Java and C# code[*]. If you didn't notice it, then it's because those languages did a good job of creating other abstractions for you to use instead.
Literally the closest the Java language comes to the "address of an object" is that the default hashCode in java.lang.Object is, according to the docs, "typically implemented by converting the internal address of the object into an integer". But in Java, you can't use an object's hashcode to access the object. You certainly can't add or subtract a small number to a hashcode in order to access memory within or in the vicinity of the original object. You can't make mistakes in which you think that your pointer refers to the object you intend it to, but actually it refers to some completely unrelated memory location whose value you're about to scribble all over. In C++ you can do all those things.
[*] well, not multiplying and adding 4 bytes to get an int, not even shifting and ORing, but "loading" an int from 4 bytes of memory.

References in C# act the same way as pointers in C++, without all the messy syntax.
Consider the following C# code:
public class A
{
public int x;
}
public void AnotherFunc(A a)
{
a.x = 2;
}
public void SomeFunc()
{
A a = new A();
a.x = 1;
AnotherFunc(a);
// a.x is now 2
}
Since classes are references types, we know that we are passing an existing instance of A to AnotherFunc (unlike value types, which are copied).
In C++, we use pointers to make this explicit:
class A
{
public:
int x;
};
void AnotherFunc(A* a) // notice we are pointing to an existing instance of A
{
a->x = 2;
}
void SomeFunc()
{
A a;
a.x = 1;
AnotherFunc(&a);
// a.x is now 2
}

"How can pointers be explained using only concepts that are familiar to a .NET or Java developer? "
I'd suggest that there are really two distinct things that need to be learnt.
The first is how to use pointers, and heap allocated memory, to solve specific problems. With an appropriate style, using shared_ptr<> for example, this can be done in a manner analogous to that of Java. A shared_ptr<> has a lot in common with a Java object handle.
Secondly, however, I would suggest that pointers in general are a fundamentally lower level concept that Java, and to a lesser extent C#, deliberately hides. To program in C++ without moving to that level will guarantee a host of problems. You need to think in terms of the underlying memory layout and think of pointers as literally pointers to specific pieces of storage.
To attempt to understand this lower level in terms of higher concepts would be an odd path to take.

Get two sheets of large format graph paper, some scissors and a friend to help you.
Each square on the sheets of paper represents one byte.
One sheet is the stack.
The other sheet is the heap. Give the heap to your friend - he is the memory manager.
You are going to pretend to be a C program and you'll need some memory. When running your program, cut out chunks from the stack and the heap to represent memory allocation.
Ready?
void main() {
int a; /* Take four bytes from the stack. */
int *b = malloc(sizeof(int)); /* Take four bytes from the heap. */
a = 1; /* Write on your first little bit of graph paper, WRITE IT! */
*b = 2; /* Get writing (on the other bit of paper) */
b = malloc(sizeof(int)); /* Take another four bytes from the heap.
Throw the first 'b' away. Do NOT give it
back to your friend */
free(b); /* Give the four bytes back to your friend */
*b = 3; /* Your friend must now kill you and bury the body */
} /* Give back the four bytes that were 'a' */
Try with some more complex programs.

Explain the difference between the stack and the heap and where objects go.
Value types such as structs (both C++ and C#) go on the stack. Reference types (class instances) get put on the heap. A pointer (or reference) points to the memory location on the heap for that specific instance.
Reference type is the key word. Using a pointer in C++ is like using ref keyword in C#.
Managed apps make working with this stuff easy so .NET devs are spared the hassle and confusion. Glad I don't do C anymore.

The key for me was to understand the way memory works. Variables are stored in memory. The places in which you can put variables in memory are numbered. A pointer is a variable that holds this number.

Any C# programmer that understands the semantic differences between classes and structs should be able to understand pointers. I.e., explaining in terms of value vs. reference semantics (in .NET terms) should get the point across; I wouldn't complicate things by trying to explain in terms of ref (or out).

In C#, all references to classes are roughly the equivalent to pointers in the C++ world. For value types (structs, ints, etc..) this is not the case.
C#:
void func1(string parameter)
void func2(int parameter)
C++:
void func1(string* parameter)
void func2(int parameter)
Passing a parameter using the ref keyword in C# is equivalent to passing a parameter by reference in C++.
C#:
void func1(ref string parameter)
void func2(ref int parameter)
C++:
void func1((string*)& parameter)
void func2(int& parameter)
If the parameter is a class, it would be like passing a pointer by reference.

How to get the pointer to the middle of an array in c#

First, basic info on our environment: We're using c# .net 4.0, on Win7-x64, targeting 32-bit.
We have a preallocated -large- array. In a function, we would like to return a pointer to an arbitrary point in this array, so that the calling function can know where to write. Ex:
class SomeClass {
void function_that_uses_the_array() {
Byte [] whereToWrite = getEmptyPtrLocation(1200);
Array.Copy(sourceArray, whereToWrite, ...);
}
}
class DataProvider {
int MAX_SIZE = 1024*1024*64;
Byte [] dataArray = new Byte[MAX_SIZE];
int emptyPtr=0;
Byte[] getEmptyPtrLocation(int requestedBytes) {
int ref = emptyPtr;
emptyPtr += requestedBytes;
return dataArray[ref];
}
}
Essentially, we want to preallocate a big chunk of memory, and reserve arbitrary length portions of this memory block and let some other class/function to use that portion of memory.
In the above example, getEmptyPtrLocation function is incorrect; it is declared as returning Byte[], but attempting to return a single byte value.
Thanks

As others have said, you can't do this in C# - and generally you shouldn't do anything like it. Work with the system - take advantage of the garbage collector etc.
Having said that, if you want to do something similar and you can trust your clients not to overrun their allocated slot, you could make your method return ArraySegment<byte> instead of byte[]. That would represent "part of an array". Obviously most of the methods in .NET don't use ArraySegment<T> - but you could potentially write extension methods using it as the target for some of the more common operations that you want to use.

This is C# not C++ - you are truly working against the garbage collector here. Memory allocation is generally very fast in C#, also it doesn't suffer from memory fragmentation one of the other main reasons to pre-allocate in C++.
Having said that and you still want to do it I would just use the array index and length, so you can use Array.Copy to write at that position.
Edit:
As others have pointed out the ArraySegment<T> fits your needs better: it limits the consumer to just his slice of the array and it doesn't require to allocate a separate array before updating the content in the original array (as represented by the ArraySegment).

Quite old post this is but I have an easy solution. I see that this question has lot's of answers like "you shouldn't". If there is a method to do something, why not use it?
But that aside, you can use the following
int[] array = new int[1337];
position = 69;
void* pointer = (void*)Marshal.UnsafeAddrOfPinnedArrayElement(array, position);

Without going into unmanaged code, you can't return a pointer to the middle of the array in C#, all you can do is return the array index where you want to enter the data.

If you must do this, your options include returning an ArraySegment (as Jon Skeet pointed out), or using Streams. You can return a MemoryStream that is a subset of an array using this constructor. You will need to handle all of your memory reads and writes as stream operations however, not as offset operations.

Are ref and out in C# the same a pointers in C++?

I just made a Swap routine in C# like this:
static void Swap(ref int x, ref int y)
{
int temp = x;
x = y;
y = temp;
}
It does the same thing that this C++ code does:
void swap(int *d1, int *d2)
{
int temp=*d1;
*d1=*d2;
*d2=temp;
}
So are the ref and out keywords like pointers for C# without using unsafe code?

They're more limited. You can say ++ on a pointer, but not on a ref or out.
EDIT Some confusion in the comments, so to be absolutely clear: the point here is to compare with the capabilities of pointers. You can't perform the same operation as ptr++ on a ref/out, i.e. make it address an adjacent location in memory. It's true (but irrelevant here) that you can perform the equivalent of (*ptr)++, but that would be to compare it with the capabilities of values, not pointers.
It's a safe bet that they are internally just pointers, because the stack doesn't get moved and C# is carefully organised so that ref and out always refer to an active region of the stack.
EDIT To be absolutely clear again (if it wasn't already clear from the example below), the point here is not that ref/out can only point to the stack. It's that when it points to the stack, it is guaranteed by the language rules not to become a dangling pointer. This guarantee is necessary (and relevant/interesting here) because the stack just discards information in accordance with method call exits, with no checks to ensure that any referrers still exist.
Conversely when ref/out refers to objects in the GC heap it's no surprise that those objects are able to be kept alive as long as necessary: the GC heap is designed precisely for the purpose of retaining objects for any length of time required by their referrers, and provides pinning (see example below) to support situations where the object must not be moved by GC compacting.
If you ever play with interop in unsafe code, you will find that ref is very closely related to pointers. For example, if a COM interface is declared like this:
HRESULT Write(BYTE *pBuffer, UINT size);
The interop assembly will turn it into this:
void Write(ref byte pBuffer, uint size);
And you can do this to call it (I believe the COM interop stuff takes care of pinning the array):
byte[] b = new byte[1000];
obj.Write(ref b[0], b.Length);
In other words, ref to the first byte gets you access to all of it; it's apparently a pointer to the first byte.

Reference parameters in C# can be used to replace one use of pointers, yes. But not all.
Another common use for pointers is as a means for iterating over an array. Out/ref parameters can not do that, so no, they are not "the same as pointers".

ref and out are only used with function arguments to signify that the argument is to be passed by reference instead of value. In this sense, yes, they are somewhat like pointers in C++ (more like references actually). Read more about it in this article.

The nice thing about using out is that you're guaranteed that the item will be assigned a value -- you will get a compile error if not.

Actually, I'd compare them to C++ references rather than pointers. Pointers, in C++ and C, are a more general concept, and references will do what you want.
All of these are undoubtedly pointers under the covers, of course.

While comparisons are in the eye of the beholder...I say no. 'ref' changes the calling convention but not the type of the parameters. In your C++ example, d1 and d2 are of type int*. In C# they are still Int32's, they just happen to be passed by reference instead of by value.
By the way, your C++ code doesn't really swap its inputs in the traditional sense. Generalizing it like so:
template<typename T>
void swap(T *d1, T *d2)
{
T temp = *d1;
*d1 = *d2;
*d2 = temp;
}
...won't work unless all types T have copy constructors, and even then will be much more inefficient than swapping pointers.

The short answer is Yes (similar functionality, but not exactly the same mechanism).
As a side note, if you use FxCop to analyse your code, using out and ref will result in a "Microsoft.Design" error of "CA1045:DoNotPassTypesByReference."

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.