I've been trying to decide the best way to store information in VBOs for OpenGL. I've been reading about how matrices are stored as single-dimensional arrays because of the performance penalties of using multi-dimensional arrays.
I was wondering: does this extend to storing information in VBOs? Does it make more sense to store vertex information within VBOs in single-dimensional or multidimensional arrays?
Before I get a bunch of answers saying it depends on what I'm storing, I'm talking specifically about things that would traditionally be considered for multidimensional arrays (maps, grids, etc....).
What kind of performance hit, if any, would I be looking at by using multidimensional arrays?
The question is invalid, as it makes no sense. Buffer objects are blocks of memory stored and managed by OpenGL. It's not a C# array or a C# multi-dimensional array. It's a block of memory that you transfer data into.
The vertex data you put into a buffer object must conform with what glVertexAttribPointer allows. How you achieve that in C# is up to you and whatever API you use.
"matrices are stored as single-dimensional arrays because of the performance penalties of using multi-dimensional arrays"
I think your confusion stems from differences in terminology regarding multidimensional arrays and how they can be represented.
In C an array is accessed through pointer dereferencing:
int arr[10];
arr[5] <--> *(arr+5)
In fact the indexing operator o[i] is fully equivalent to *(o+i), and you can actually also write i[o], because it yields the same expression. So the index operator does pointer arithmetic and then dereferences. What does this mean for C multidimensional arrays:
int marr[10][10];
int *row = marr[5];  // marr[5] <--> *(marr + 5)
row[5] = *(row + 5) <--> marr[5][5];
And you can extend this to as many dimensions as you like. So a C multidimensional array built from pointers to rows (e.g. an int **) goes through multiple pointer dereferences, which also means the data need not be contiguous in memory, and in fact each row may have a different size. That's why such arrays are inefficient. C allows this to support dynamic allocation of the row pointers and of the rows themselves. (Note that a true statically declared int marr[10][10] is laid out contiguously; the indirection cost applies to the pointer-to-pointer style.)
However, the other way to store multidimensional arrays is a flat storage model. I.e. you just concatenate all the elements into one single string of values, and by introducing dimensions you know where to cut this string to rearrange it.
int fmarr[10*20*30];
int at1_2_3 = fmarr[1*20*30 + 2*30 + 3];
As you can see, all sizes are static and the data is contiguous. There are no pointer dereferences, which makes working with the data much simpler and, more importantly, allows for vectorization of the operations. This makes things much more performant.
I have a few questions about how .Net stores multi-dimensional arrays in memory. I am interested in true multi-dimensional arrays not jagged arrays.
How does the Common Language Runtime (CLR) store multi-dimensional arrays? Is it in row-major, column-major, Iliffe vector or some other format?
Is this format required by the Common Language Infrastucture (CLI) or any other specification?
Is it likely to vary over time or between platforms?
This is in the specification ECMA-335 Partition I (my bold)
8.9.1 Array types
Array elements shall be laid out within the array object in row-major order (i.e., the elements associated with the rightmost array dimension shall be laid out contiguously from lowest to highest index). The actual storage allocated for each array element can include platform-specific padding. (The size of this storage, in bytes, is returned by the sizeof instruction when it is applied to the type of that array's elements.)
Section 14.2 also has more explanation.
These two sections specifically refer to arrays as opposed to vectors, the latter of which is the more familiar zero-based one-dimensional array used in most places.
Arrays, on the other hand, can be multi-dimensional and based on any bound; they can also be one-dimensional.
So essentially, it's just one big array under the hood, and uses simple math to calculate offsets.
"Is it likely to vary over time or between platforms?" I'll get my crystal ball out... The spec can always change, and implementations may decide to derogate from it.
Should primitive array content be accessed by int for best performance?
Here's an example
int[] arr = new int[]{1, 2, 3, 4, 5};
The array is only 5 elements long, so the index doesn't have to be an int; it could be a short or a byte. Using a byte instead of an int would save 3 otherwise-wasted bytes, provided, of course, I know the array won't grow beyond 255 elements.
byte index = 1;
int value = arr[index];
But does this work as good as it sounds?
I'm worried about how this is executed at a lower level: does the index get cast to int, or are there other operations that would actually slow the whole process down instead of optimizing it?
In C and C++, arr[index] is formally equivalent to *(arr + index). Your concerns about casting reduce to the simpler question of what the machine does when it needs to add an integer offset to a pointer.
I think it's safe to say that on most modern machines, when you add a "byte" to a pointer, it's going to use the same instruction as it would if you added a 32-bit integer to the pointer. And indeed it's still going to represent that byte using the machine word size, padded with some unused space. So this isn't going to make using the array faster.
Your optimization might make a difference if you need to store millions of these indices in a table; then using byte instead of int would use a quarter of the memory and take less time to move around. If the array you are indexing is huge and the index needs to be larger than the machine word size, that's a different consideration. But I think it's safe to say that in most normal situations this optimization doesn't really make sense, and size_t is probably the most appropriate generic type for array indices, all things being equal (since it corresponds exactly to the machine word size on the majority of architectures).
does index gets casted to int or other operations which would actually slow down the whole process instead of this optimizing it
No, but
that would save useless 3 byte memory allocation
You don't gain anything by saving 3 bytes.
Only if you are storing a huge array of such indices might the space saved make it a worthwhile investment.
Otherwise stick with a plain int, it's the processor's native word size and thus the fastest.
If I declare a List of char arrays, are they allocated in contiguous memory, or does .NET create a linked list instead?
If it's not contiguous, is there a way I can declare a contiguous list of char arrays? The size of the char arrays is known ahead of time and is fixed (they are all the same size).
Yes, but not in the way that you want. List<T> guarantees that its elements are stored contiguously.
Arrays are a reference type, so the references are stored contiguously, as List<T> guarantees. However, the arrays themselves are allocated separately, and where they are stored has nothing to do with the list; it is only concerned with its elements, the references.
If you require that then you should simply use one large array and maintain boundary data.
EDIT: Per your comment:
The inner arrays are always 9 chars.
So, in this case, cache locality may be an issue because the sub-arrays are so small: you'll be jumping around a lot in memory getting from one array to the next. I'll take you at your word about the performance sensitivity of this code.
Just use a multi-dimensional array if you can. This of course assumes you know the size, or that you can impose a maximum size on it.
Is it possible to trade some memory for reduced complexity/time and just set a max size for N? Using a multi-dimensional array is the only way you can guarantee contiguous allocation.
EDIT 2:
Trying to keep the answer in sync with the comments: you say that the max size of the first dimension is 9! (9 factorial, i.e. 362,880) and, as before, the size of the second dimension is 9.
Allocate it all up front. You're trading some memory for time. 9! * 9 * 2 / 1024 / 1024 == ~6.22MB.
As you say, the List may grow to that size anyway, so worst case you waste a few MB of memory. I don't think it's going to be an issue unless you plan on running this code in a toaster oven. Just allocate the buffer as one array up front and you're good.
List<T> functions as a dynamic array, not a linked list, but this is beside the point. No memory will be allocated for the char[]s until they themselves are instantiated; the List is merely responsible for holding references to char[]s, of which it will contain none when first created.
If it's not contiguous, is there a way I can declare a contiguous list of char arrays? The size of the char arrays is known ahead of time and is fixed (they are all the same size).
No, but you could instantiate a 2-dimensional array of chars, if you also know how many char arrays there will be:
char[,] array = new char[x, y];
I'm trying to optimize some code where I have a large number of arrays containing structs of different sizes, all based on the same interface. In certain cases the structs are larger and hold more data; other times they are small structs; and other times I would prefer to keep null as a value to save memory.
My first question is: is it a good idea to do something like this? I've previously had an array of my full data struct, but when testing a mixed approach it looked like I would be able to save lots of memory. Are there any other downsides?
I've been trying out different things, and it seems to work quite well when making an array of a common interface, but I'm not sure I'm checking the size of the array correctly.
I've simplified the example quite a bit. Here I'm adding different structs to an array, but I'm unable to determine the size using the traditional Marshal.SizeOf method. Would it be correct to simply iterate through the collection and sum the size of each value in the collection?
IComparable[] myCollection = new IComparable[1000];
myCollection[0] = null;
myCollection[1] = (int)1;
myCollection[2] = "helloo world";
myCollection[3] = long.MaxValue;
System.Runtime.InteropServices.Marshal.SizeOf(myCollection);
The last line will throw this exception:
Type 'System.IComparable[]' cannot be marshaled as an unmanaged structure; no meaningful size or offset can be computed.
Excuse the long post:
Is this an optimal and usable solution?
How can I determine the size of my array?
I may be wrong, but it looks to me like your IComparable[] array is a managed array. If so, then you can use this code to get the length:
int arrayLength = myCollection.Length;
If you are doing platform interop between C# and C++, then the answer to your question headline "Can I find the length of an unmanaged array" is no, it's not possible. Function signatures with arrays in C/C++ tend to follow this pattern:
void doSomeWorkOnArrayUnmanaged(int * myUnmanagedArray, int length)
{
// Do work ...
}
In .NET the array itself is a type which carries some basic information, such as its size, its runtime type, etc. Therefore we can use this:
void DoSomeWorkOnManagedArray(int [] myManagedArray)
{
int length = myManagedArray.Length;
// Do work ...
}
Whenever using platform invoke to interop between C# and C++ you will need to pass the length of the array to the receiving function, as well as pin the array (but that's a different topic).
Does this answer your question? If not, please clarify.
Optimality always depends on your requirements. If you really need to store many elements of different classes/structs, your solution is completely viable.
However, I guess your expectations of the data structure might be misleading: array elements are by definition all of the same size. This is even true in your case: your array doesn't store the elements themselves but references (pointers) to them. The elements are allocated somewhere on the VM heap. So your data structure actually goes like this: it is an array of 1000 pointers, each pointer pointing to some data. The size of each particular element may of course vary.
This leads to the next question: the size of your array. What are you intending to do with the size? Do you need to know how many bytes to allocate when you serialize your data to some persistent storage? That depends on the serialization format. Or do you need just a rough estimate of how much memory your structure is consuming? In the latter case you need to consider both the array itself and the size of each particular element. The array in your example consumes approximately 1000 times the size of a reference (4 bytes on a 32-bit machine, 8 bytes on a 64-bit machine). To compute the sizes of the elements, you can indeed iterate over the array and sum up the size of each one. Be aware that this is only an estimate: the virtual machine adds memory-management overhead which is hard to determine exactly.
I have a code that looks like (more or less) :
public void INeedHolidaysNow(double[,] wageLoss){
//... Code to take a break from coding and fly to Hawaii
}
double[][] WageLossPerVacationDay = new double[10][];
INeedHolidaysNow(WageLossPerVacationDay); // >> fails with the error in the title
I found the solution in this post, which consists in looping rather than attempting a brute-force cast.
So my question is: WHY? What happens behind the scenes in the memory allocation that prevents what may seem, at least at first glance, to be a feasible cast? I mean, structurally both expressions seem to be identical. What am I missing here?
EDIT:
I have to use double[][] as it is provided by an external library.
One is a jagged array, the other is one big block.
double[][] is an array that contains arrays of double. Its shape is not necessarily rectangular. In C terms it's similar to a double**.
double[,] is just one big block of memory, and it is always rectangular. In C terms it's just a double* where you use myArray[x + y*width] to access an element.
One is called multidimensional array (double[*,*]) and other is called jagged array (double[][]).
Here is a good discussion over differences between them.
What is differences between Multidimensional array and Array of Arrays in C#?
Quite simply, [,] arrays (2D arrays) and [][] arrays (jagged arrays) aren't the same.
A [,] array can be visually represented by a rectangle (hence it's also known as a rectangular array), while a [][] array is a 1D array that contains other 1D arrays.
It wouldn't make much sense to be able to cast one to the other.
You should define your double array like this:
double[,] WageLossPerVacationDay = new double[10, 5];
Then it should work!
hth