2D array vs 1D array - c#

I have read the question "Performance of 2-dimensional array vs 1-dimensional array".
But its conclusion says they could be the same (depending on your own mapping function; C does this automatically)?...
I have a matrix which has 1,000 columns and 440,000,000 rows, where each element is a double in C#...
If I am doing some computations in memory, which one would perform better? (Note that I have the memory needed to hold such a monstrous quantity of information)...

If what you're asking is which is better, a 2D array of size 1000x44000 or a 1D array of size 44000000, well what's the difference as far as memory goes? You still have the same number of elements! In the case of performance and understandability, the 2D is probably better. Imagine having to manually find each column or row in a 1D array, when you know exactly where they are in a 2D array.

It depends on how many operations you are performing. In the below example, I'm setting the values of the array 2500 times. Size of the array is (1000 * 1000 * 3). The 1D array took 40 seconds and the 3D array took 1:39 mins.
var startTime = DateTime.Now;
Test1D(new byte[1000 * 1000 * 3]);
Console.WriteLine("Total Time taken 1d = " + (DateTime.Now - startTime));
startTime = DateTime.Now;
Test3D(new byte[1000,1000,3], 1000, 1000);
Console.WriteLine("Total Time taken 3D = " + (DateTime.Now - startTime));
public static void Test1D(byte[] array)
{
    for (int c = 0; c < 2500; c++)
    {
        for (int i = 0; i < array.Length; i++)
        {
            array[i] = 10;
        }
    }
}
public static void Test3D(byte[,,] array, int w, int h)
{
    for (int c = 0; c < 2500; c++)
    {
        for (int i = 0; i < h; i++)
        {
            for (int j = 0; j < w; j++)
            {
                array[i, j, 0] = 10;
                array[i, j, 1] = 10;
                array[i, j, 2] = 10;
            }
        }
    }
}

The difference between double[1000,44000] and double[44000000] will not be significant.
You're probably better off with the [,] version (letting the compiler(s) figure out the addressing). But the pattern of your calculations is likely to have more impact (locality and cache use).
Also consider the array-of-array variant, double[1000][]. It is a known 'feature' of the Jitter that it cannot eliminate range-checking in the [,] arrays.
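To make the options above concrete, here is a minimal sketch of the index mapping that a flat double[] needs and that double[,] does for you. The class name FlatMatrix and its members are made up for illustration (they are not from any library), and the sketch says nothing about whether an array of the asker's size can actually be allocated.

```csharp
using System;

// Hypothetical helper (not a library type): a rows x cols matrix stored
// row-major in one flat double[], the same element order double[,] uses.
public sealed class FlatMatrix
{
    private readonly double[] data;
    public int Rows { get; }
    public int Cols { get; }

    public FlatMatrix(int rows, int cols)
    {
        Rows = rows;
        Cols = cols;
        data = new double[(long)rows * cols];
    }

    // Row-major mapping: elements of one row sit next to each other.
    public static long Index(int row, int col, int cols) => (long)row * cols + col;

    public double this[int row, int col]
    {
        get => data[Index(row, col, Cols)];
        set => data[Index(row, col, Cols)] = value;
    }

    // Summing a row walks contiguous memory, the cache-friendly direction.
    public double SumRow(int row)
    {
        double sum = 0;
        long start = Index(row, 0, Cols);
        for (long k = start; k < start + Cols; k++)
            sum += data[k];
        return sum;
    }
}
```

With this layout, m[1, 2] on a 3 x 4 matrix touches element 1 * 4 + 2 = 6 of the flat array. Whether the hand-written mapping beats [,] is something to measure rather than assume.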

Related

Why writing by column is slow in two dimensional array in C#

I have a two-dimensional array. When I fill it by column, it writes much more slowly than by row (up to about 3x slower):
class Program
{
static void Main(string[] args)
{
TwoDimArrayPerfomrance.GetByColumns();
TwoDimArrayPerfomrance.GetByRows();
}
}
class TwoDimArrayPerfomrance
{
public static void GetByRows()
{
int maxLength = 20000;
int[,] a = new int[maxLength, maxLength];
DateTime dt = DateTime.Now;
Console.WriteLine("The current time is: " + dt.ToString());
//fill value
for (int i = 0; i < maxLength; i++)
{
for (int j = 0; j < maxLength; j++)
{
a[i, j] = i + j;
}
}
DateTime end = DateTime.Now;
Console.WriteLine("Total: " + end.Subtract(dt).TotalSeconds);
}
public static void GetByColumns()
{
int maxLength = 20000;
int[,] a = new int[maxLength, maxLength];
DateTime dt = DateTime.Now;
Console.WriteLine("The current time is: " + dt.ToString());
for (int i = 0; i < maxLength; i++)
{
for (int j = 0; j < maxLength; j++)
{
a[j, i] = j + i;
}
}
DateTime end = DateTime.Now;
Console.WriteLine("Total: " + end.Subtract(dt).TotalSeconds);
}
}
Column-wise takes around 4.2 seconds,
while row-wise takes 1.53.
It is the "cache proximity" problem mentioned in the first comment. All data must pass through memory caches before the CPU can access it. Those caches store blocks of memory, so if you access address N and then address N+1, the cached block does not change. But if you access address N and then address N+M (where M is big enough), a new block must be loaded into the cache. When a new block is added, some existing block must be evicted. If you then need that evicted block again, it has to be reloaded, and that is where the code loses time.
I concur fully with what @Dialecticus wrote... I'll just add that there are bad ways to write a microbenchmark, and there are worse ways. There are many things to remember when microbenchmarking: run in Release mode without the debugger attached; remember that there is a GC, and that it is better if it runs when you want it to run, not at some arbitrary point while you are benchmarking; remember that code is sometimes compiled only when it is first executed, so at least one full warm-up round is a good idea... and so on. There is even a full library for benchmarking (https://benchmarkdotnet.org/articles/overview.html) that is used by the Microsoft .NET Core teams to check that there are no speed regressions in the code they write.
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
class Program
{
static void Main(string[] args)
{
if (Debugger.IsAttached)
{
Console.WriteLine("Warning, debugger attached!");
}
#if DEBUG
Console.WriteLine("Warning, Debug version!");
#endif
Console.WriteLine($"Running at {(Environment.Is64BitProcess ? 64 : 32)}bits");
Console.WriteLine(RuntimeInformation.FrameworkDescription);
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.High;
Console.WriteLine();
const int MaxLength = 10000;
for (int i = 0; i < 10; i++)
{
Console.WriteLine($"Round {i + 1}:");
TwoDimArrayPerfomrance.GetByRows(MaxLength);
GC.Collect();
GC.WaitForPendingFinalizers();
TwoDimArrayPerfomrance.GetByColumns(MaxLength);
GC.Collect();
GC.WaitForPendingFinalizers();
Console.WriteLine();
}
}
}
class TwoDimArrayPerfomrance
{
public static void GetByRows(int maxLength)
{
int[,] a = new int[maxLength, maxLength];
Stopwatch sw = Stopwatch.StartNew();
//fill value
for (int i = 0; i < maxLength; i++)
{
for (int j = 0; j < maxLength; j++)
{
a[i, j] = i + j;
}
}
sw.Stop();
Console.WriteLine($"By Rows, size {maxLength} * {maxLength}, {sw.ElapsedMilliseconds / 1000.0:0.00} seconds");
// So that the assignment isn't optimized out, we do some fake operation on the array
for (int i = 0; i < maxLength; i++)
{
for (int j = 0; j < maxLength; j++)
{
if (a[i, j] == int.MaxValue)
{
throw new Exception();
}
}
}
}
public static void GetByColumns(int maxLength)
{
int[,] a = new int[maxLength, maxLength];
Stopwatch sw = Stopwatch.StartNew();
//fill value
for (int i = 0; i < maxLength; i++)
{
for (int j = 0; j < maxLength; j++)
{
a[j, i] = i + j;
}
}
sw.Stop();
Console.WriteLine($"By Columns, size {maxLength} * {maxLength}, {sw.ElapsedMilliseconds / 1000.0:0.00} seconds");
// So that the assignment isn't optimized out, we do some fake operation on the array
for (int i = 0; i < maxLength; i++)
{
for (int j = 0; j < maxLength; j++)
{
if (a[i, j] == int.MaxValue)
{
throw new Exception();
}
}
}
}
}
Ah... and multi-dimensional arrays of the type FooType[,] went the way of the dodo with .NET 3.5, when LINQ came out and it didn't support them. You should use jagged arrays FooType[][].
If you try to map your two dimensional array to a one dimensional one, it might be a bit easier to see what is going on.
The mapping gives
var a = new int[maxLength * maxLength];
Now the lookup calculation is up to you.
for (int i = 0; i < maxLength; i++)
{
    for (int j = 0; j < maxLength; j++)
    {
        //var rowBased = j + i * maxLength;
        var colBased = i + j * maxLength;
        //a[rowBased] = i + j;
        a[colBased] = i + j;
    }
}
So observe the following:
With column-based lookup there are 20,000 * 20,000 multiplications, because j changes on every iteration of the inner loop.
With row-based lookup, i * maxLength is compiler-optimised and only happens 20,000 times.
Now that a is a one-dimensional array, it's also easier to see how the memory is being accessed. With row-based indexing the memory is accessed sequentially, whereas column-based access jumps maxLength elements on every step, so the overhead varies with the size of the array, as you have seen.
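To put a rough number on "sequential versus jumping", here is a toy model, not a measurement. It assumes a 64-byte cache line and an array that starts on a line boundary (both are assumptions, as are the class and method names), and it counts how often two consecutive writes land in different cache lines. Real CPUs hold many lines at once and prefetch, so treat the counts as intuition only.

```csharp
using System;

public static class CacheLineModel
{
    // Counts how many consecutive accesses fall into a different cache line
    // when a size x size int matrix in a flat row-major array is filled
    // by rows (index = i * size + j) or by columns (index = i + j * size).
    public static long CountLineSwitches(int size, bool byColumns, int lineBytes = 64)
    {
        int intsPerLine = lineBytes / sizeof(int);
        long switches = 0;
        long previousLine = -1;
        for (int i = 0; i < size; i++)
        {
            for (int j = 0; j < size; j++)
            {
                long index = byColumns ? i + (long)j * size : (long)i * size + j;
                long line = index / intsPerLine;
                if (previousLine != -1 && line != previousLine)
                    switches++;
                previousLine = line;
            }
        }
        return switches;
    }
}
```

For a 16 x 16 matrix (one row per assumed cache line) the row-wise fill changes line 15 times, while the column-wise fill changes line on every one of its 255 steps.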
Looking a bit on what BenchmarkDotNet produces
BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
.NET Core SDK=5.0.101
| Method       | MaxLength |             Mean |         Error |        StdDev |
|------------- |---------- |-----------------:|--------------:|--------------:|
| GetByRows    | 100       |         23.60 us |      0.081 us |      0.076 us |
| GetByColumns | 100       |         23.74 us |      0.357 us |      0.334 us |
| GetByRows    | 1000      |      2,333.20 us |     13.150 us |     12.301 us |
| GetByColumns | 1000      |      2,784.43 us |     10.027 us |      8.889 us |
| GetByRows    | 10000     |    238,599.37 us |  1,592.838 us |  1,412.009 us |
| GetByColumns | 10000     |    516,771.56 us |  4,272.849 us |  3,787.770 us |
| GetByRows    | 50000     |  5,903,087.26 us | 13,822.525 us | 12,253.308 us |
| GetByColumns | 50000     | 19,623,369.45 us | 92,325.407 us | 86,361.243 us |
You will see that while MaxLength is reasonably small (100x100 and 1000x1000), the differences are almost negligible. I expect this is because the CPU can keep the whole two-dimensional array in its fast cache, so the differences come only from the number of multiplications.
When the matrix becomes larger, the CPU can no longer keep all of the allocated memory in its cache, so we start to see cache misses and fetches from main memory instead, which is always going to be a lot slower.
That overhead just increases as the size of the matrix grows.

Cannon's algorithm of matrix multiplication

I am trying to implement Cannon's algorithm for matrix multiplication.
I read the description on Wikipedia, which provides the following pseudocode:
row i of matrix a is circularly shifted by i elements to the left.
col j of matrix b is circularly shifted by j elements up.
Repeat n times:
p[i][j] multiplies its two entries and adds to running total.
circular shift each row of a 1 element left
circular shift each col of b 1 element up
and I implemented it in C# like this:
public static void ShiftLeft(int[][] matrix, int i, int count)
{
int ind = 0;
while (ind < count)
{
int temp = matrix[i][0];
int indl = matrix[i].Length - 1;
for (int j = 0; j < indl; j++)
matrix[i][j] = matrix[i][j + 1];
matrix[i][indl] = temp;
ind++;
}
}
public static void ShiftUp(int[][] matrix, int j, int count)
{
int ind = 0;
while (ind < count)
{
int temp = matrix[0][j];
int indl = matrix.Length - 1;
for (int i = 0; i < indl; i++)
matrix[i][j] = matrix[i + 1][j];
matrix[indl][j] = temp;
ind++;
}
}
public static int[][] Cannon(int[][] A, int[][] B)
{
int[][] C = new int[A.Length][];
for (int i = 0; i < C.Length; i++)
C[i] = new int[A.Length];
for (int i = 0; i < A.Length; i++)
ShiftLeft(A, i, i);
for (int i = 0; i < B.Length; i++)
ShiftUp(B, i, i);
for (int k = 0; k < A.Length; k++)
{
for (int i = 0; i < A.Length; i++)
{
for (int j = 0; j < B.Length; j++)
{
var m = (i + j + k) % A.Length;
C[i][j] += A[i][m] * B[m][j];
ShiftLeft(A, i, 1);
ShiftUp(B, j, 1);
}
}
}
return C;
}
This code returns the correct result, but it does so very slowly, much more slowly even than the naive matrix multiplication algorithm.
For a 200x200 matrix I got this result:
00:00:00.0490432 //naive algorithm
00:00:07.1397479 //Cannon's algorithm
What am I doing wrong?
Edit
Thanks SergeySlepov, it was a bad attempt to do it in parallel. When I went back to a sequential implementation I got the following results:
Count Naive Cannon's
200 00:00:00.0492098 00:00:08.0465076
250 00:00:00.0908136 00:00:22.3891375
300 00:00:00.1477764 00:00:58.0640621
350 00:00:00.2639114 00:01:51.5545524
400 00:00:00.4323984 00:04:50.7260942
Okay, it's not a parallel implementation, but how can I do it correctly?
Cannon's algorithm was built for a 'Distributed Memory Machine' (a grid of processors, each with its own memory). This is very different to the hardware you're running it on (a few processors with shared memory) and that is why you're not seeing any increase in performance.
The 'circular shifts' in the pseudocode that you quoted actually mimic data transfers between processors. After the initial matrix 'skewing', each processor in the grid keeps track of three numbers (a, b and c) and executes pseudocode similar to this:
c += a * b;
pass 'a' to the processor to your left (wrapping around)
pass 'b' to the processor 'above' you (wrapping around)
wait for the next iteration of k
We could mimic this behaviour on a PC using NxN threads, but the overhead of context switching (or spawning Tasks) would kill all the joy. To make the most of a PC's 4 (or so) CPUs we could make the loop over i parallel. The loop over k needs to be sequential (unlike your solution), otherwise you might face race conditions, as each iteration of k modifies the matrices A, B and C. In a 'distributed memory machine' race conditions are not a problem, as processors do not share any memory.
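As a rough illustration of the point above, the sketch below simulates Cannon's data movement with index arithmetic instead of physically shifting rows and columns: after the initial skew and k further shifts, position (i, j) holds A[i][(i + j + k) % n] and B[(i + j + k) % n][j]. The loop over k stays sequential and the loop over i is run with Parallel.For. The class and method names are invented for this example, it assumes square n x n jagged matrices, and it is not the asker's code nor a benchmarked implementation.

```csharp
using System;
using System.Threading.Tasks;

public static class CannonSketch
{
    // Multiplies two square n x n matrices without modifying them.
    public static int[][] Multiply(int[][] a, int[][] b)
    {
        int n = a.Length;
        var c = new int[n][];
        for (int i = 0; i < n; i++)
            c[i] = new int[n];

        for (int k = 0; k < n; k++)      // sequential rounds, as in the pseudocode
        {
            // Each parallel iteration owns one row of c, so no two
            // iterations ever write to the same element.
            Parallel.For(0, n, i =>
            {
                for (int j = 0; j < n; j++)
                {
                    int m = (i + j + k) % n;   // which a/b pair "arrived" at (i, j)
                    c[i][j] += a[i][m] * b[m][j];
                }
            });
        }
        return c;
    }
}
```

Because a and b are only read, the only shared writes go to c, and those are partitioned by row. On a shared-memory PC this is essentially the naive algorithm with a rotated summation order, which fits the answer's point that Cannon's scheme pays off on distributed memory rather than on a desktop.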

Performing efficient local average over a 2D array

I have a 2D array of a custom Vector class, around 250 x 250 in size. The Vector class just stores the x and y float components of the vector. My project requires a smoothing function on the array, so that a new array is created by taking the local average of the i indices around each vector in the array. My problem is that my current solution is not fast enough, and I was wondering if there is a better way of computing this.
Pseudocode for my current solution can be seen below. I am implementing this in C#; any help would be much appreciated. My actual solution uses 1D arrays for the speed-up, but I didn't include that here.
function smoothVectorArray(Vector[,] myVectorArray, int averagingDistance) {
newVectorArray = new Vector[250,250];
for (x = 0; x < 250; x++)
{
for (y = 0; y < 250; y++)
{
vectorCount = 0;
vectorXTotal = 0;
vectorYTotal = 0;
for (i = -averagingDistance; i < averagingDistance + 1; i++)
{
for (j = -averagingDistance; j < averagingDistance + 1; j++)
{
tempX = x + i;
tempY = y + j;
if (inArrayBounds(tempX, tempY)) {
vectorCount++;
vectorXTotal += myVectorArray[tempX, tempY].x;
vectorYTotal += myVectorArray[tempX, tempY].y;
}
}
}
newVectorArray[x, y] = new Vector(vectorXTotal / vectorCount, vectorYTotal / vectorCount);
}
}
return newVectorArray;
}
What your inner loops do is calculate the sum of a rectangular area:
for (i = -averagingDistance; i < averagingDistance + 1; i++)
for (j = -averagingDistance; j < averagingDistance + 1; j++)
You can pre-calculate those efficiently in O(n^2). Let's introduce array S[N][N] (where N = 250 in your case).
To make it simpler I will assume there is only one coordinate. You can easily adapt it to pair (x, y) by building 2 arrays.
S[i, j] - will be sum of sub-rectangle (0, 0)-(i, j)
we can build this array efficiently:
S[0, 0] = myVectorArray[0, 0]; //rectangle (0, 0)-(0,0) has only one cell (0, 0)
for (int i = 1; i < N; ++i){
S[0, i] = S[0, i-1] + myVectorArray[0, i]; //rectangle (0, 0)-(0, i) is calculated based on previous rectangle (0,0)-(0,i-1) and new cell (0, i)
S[i, 0] = S[i - 1, 0] + myVectorArray[i, 0]; //same for (0, 0)-(i, 0)
}
for (int i = 1; i < N; ++i){
var currentRowSum = myVectorArray[i, 0];
for (int j = 1; j < N; ++j){
currentRowSum += myVectorArray[i, j]; //keep track of sum in current row
S[i, j] = S[i - 1, j] + currentRowSum; //rectangle (0,0)-(i,j) = rectangle (0,0)-(i-1,j), which is already calculated, plus the sum of the current row up to column j
}
}
Once we have this partial sums array calculated, we can get any sub-rectangle sum in O(1). Let's say we want the sum in rectangle (a, b)-(c, d).
To get it, we start with the big rectangle (0, 0)-(c, d), from which we subtract (0, 0)-(a-1, d) and (0, 0)-(c, b-1), and then add back rectangle (0, 0)-(a-1, b-1), since it was subtracted twice.
This way you can get rid of your inner loops.
https://en.wikipedia.org/wiki/Summed_area_table
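A minimal sketch of the whole idea for a single coordinate, assuming a float[,] input (run it once for the x components and once for the y components). It uses a table padded with one extra row and column of zeros, which is a small variation on the S array described above that avoids special cases at the edges. The names BoxAverage, BuildSummedAreaTable and Smooth are made up for this example.

```csharp
using System;

public static class BoxAverage
{
    // s[i + 1, j + 1] holds the sum of values[0..i, 0..j];
    // row 0 and column 0 of s stay zero.
    public static double[,] BuildSummedAreaTable(float[,] values)
    {
        int rows = values.GetLength(0), cols = values.GetLength(1);
        var s = new double[rows + 1, cols + 1];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                s[i + 1, j + 1] = values[i, j] + s[i, j + 1] + s[i + 1, j] - s[i, j];
        return s;
    }

    // Average over the window of the given radius around each cell,
    // clipped to the array bounds (like the inArrayBounds check).
    public static float[,] Smooth(float[,] values, int radius)
    {
        int rows = values.GetLength(0), cols = values.GetLength(1);
        double[,] s = BuildSummedAreaTable(values);
        var result = new float[rows, cols];
        for (int x = 0; x < rows; x++)
        {
            int a = Math.Max(0, x - radius), c = Math.Min(rows - 1, x + radius);
            for (int y = 0; y < cols; y++)
            {
                int b = Math.Max(0, y - radius), d = Math.Min(cols - 1, y + radius);
                // Inclusion-exclusion on the padded table for rectangle (a, b)-(c, d).
                double sum = s[c + 1, d + 1] - s[a, d + 1] - s[c + 1, b] + s[a, b];
                int count = (c - a + 1) * (d - b + 1);
                result[x, y] = (float)(sum / count);
            }
        }
        return result;
    }
}
```

The cost per output cell no longer depends on the averaging distance, so the total work is proportional to the number of cells rather than cells times window area.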
You will definitely want to take advantage of CPU cache for the solution, it sounds like you have that in mind with your 1D array solution. Try to arrange the algorithm to work on chunks of contiguous memory at a time, rather than hopping around the array. To this point you should either use a Vector struct, rather than a class, or use two arrays of floats, one for the x values and one for the y values. By using a class, your array is storing pointers to various spots in the heap. So even if you iterate over the array in order, you are still missing the cache all the time as you hop to the location of the Vector object. Every cache miss is ~200 cpu cycles wasted. This would be the main thing to work out first.
After that, some micro-optimizations you can consider are
using an inlining hint on the inArrayBounds method: [MethodImpl(MethodImplOptions.AggressiveInlining)]
using unsafe mode and iterating with pointer arithmetic to avoid arrays bounds checking overhead
These last two ideas may or may not have any significant impact, you should test.
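For the struct and inlining suggestions, a small sketch of what that could look like. Vector2f, VectorGrid and their members are hypothetical names, not the asker's types, and as the answer says, whether this helps is something to measure.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical value-type replacement for the asker's Vector class:
// an array of these stores the x/y floats inline, with no per-element
// heap object to chase.
public readonly struct Vector2f
{
    public readonly float X;
    public readonly float Y;

    public Vector2f(float x, float y)
    {
        X = x;
        Y = y;
    }
}

public static class VectorGrid
{
    // Inlining hint only; the JIT is free to ignore it.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static bool InArrayBounds(int x, int y, int width, int height)
        => x >= 0 && x < width && y >= 0 && y < height;

    // Flat row-major storage: element (x, y) lives at index y * width + x.
    public static Vector2f[] Create(int width, int height)
        => new Vector2f[width * height];
}
```

The alternative mentioned above, two separate float arrays for x and y, trades the struct for even simpler memory access when only one component is processed at a time.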

Linked 2D Matrix in C#

I need to implement this scenario in C#:
The matrix will be very large, maybe 10000x10000 or larger. I will use this for distance matrix in hierarchical clustering algorithm. In every iteration of the algorithm the matrix should be updated (joining 2 rows into 1 and 2 columns into 1). If I use simple double[,] or double[][] matrix this operations will be very "expensive".
Please, can anyone suggest C# implementation of this scenario?
Do you have an algorithm at the moment? And what do you mean by expensive: memory or time? If memory: there is not much you can do in C#, but you could consider executing the calculation inside a database using temporary objects. If time: you can use parallelism to join columns and rows.
But besides that, I think a simple double[,] array is the fastest and most memory-sparing option you can get in C#, because accessing the array values is an O(1) operation and arrays have the least memory and management overhead (compared to lists and dictionaries).
As mentioned above, a basic double[,] is going to be the most effective way of handling this in C#.
Remember that C# sits on top of managed memory, and as such you have less fine-grained control over low-level (in terms of memory) operations than in something like basic C. Creating your own objects in C# to add functionality will only use more memory in this scenario, and will likely slow the algorithm down as well.
If you have yet to pick an algorithm, CURE seems to be a good bet. The choice of algorithm may affect your data structure choice, but that's not likely.
You will find that the algorithm determines the theoretical limits of 'cost' at any rate. For example, you will read that for CURE you are bound by O(n^2 log n) running time and O(n) memory use.
I hope this helps. If you can provide more detail, we might be able to assist further!
N.
It's not possible to 'merge' two rows or two columns, you'd have to copy the whole matrix into a new, smaller one, which is indeed unacceptably expensive.
You should probably just add the values in one row to the previous one and then ignore the values, acting as if they were removed.
The array of arrays, double[][], is actually faster than double[,], but takes more memory.
The whole array merging thing might not be needed if you change the algorithm a bit, but this might help you:
public static void MergeMatrix()
{
int size = 100;
// Initialize the matrix
double[,] matrix = new double[size, size];
for (int i = 0; i < size; i++)
for (int j = 0; j < size; j++)
matrix[i, j] = ((double)i) + (j / 100.0);
int rowMergeCount = 0, colMergeCount = 0;
// Merge last row.
for (int i = 0; i < size; i++)
matrix[size - rowMergeCount - 2, i] += matrix[size - rowMergeCount - 1, i];
rowMergeCount++;
// Merge last column.
for (int i = 0; i < size; i++)
matrix[i, size - colMergeCount - 2] += matrix[i, size - colMergeCount - 1];
colMergeCount++;
// Read the newly merged values.
int newWidth = size - rowMergeCount, newHeight = size - colMergeCount;
double[,] smaller = new double[newWidth, newHeight];
for (int i = 0; i < newWidth; i++)
for (int j = 0; j < newHeight; j++)
smaller[i, j] = matrix[i, j];
List<int> rowsMerged = new List<int>(), colsMerged = new List<int>();
// Merging row at random position.
rowsMerged.Add(15);
int target = rowsMerged[rowMergeCount - 1];
int source = rowsMerged[rowMergeCount - 1] + 1;
// Still using the original matrix since its values are still useful.
for (int i = 0; i < size; i++)
matrix[target, i] += matrix[source, i];
rowMergeCount++;
// Merging col at random position.
colsMerged.Add(37);
target = colsMerged[colMergeCount - 1];
source = colsMerged[colMergeCount - 1] + 1;
for (int i = 0; i < size; i++)
matrix[i, target] += matrix[i, source];
colMergeCount++;
newWidth = size - rowMergeCount;
newHeight = size - colMergeCount;
smaller = new double[newWidth, newHeight];
for (int i = 0, j = 0; i < newWidth && j < size; i++, j++)
{
for (int k = 0, m = 0; k < newHeight && m < size; k++, m++)
{
smaller[i, k] = matrix[j, m];
Console.Write(matrix[j, m].ToString("00.00") + " ");
// So merging columns is more expensive because we have to check for it more often while reading.
if (colsMerged.Contains(m)) m++;
}
if (rowsMerged.Contains(j)) j++;
Console.WriteLine();
}
Console.Read();
}
In this code I use two 1D helper lists to calculate the index into a big array containing the data. Deleting rows/columns is really cheap since I only need to remove that index from the helper-lists. But of course the memory in the big array remains, i.e. depending on your usage you have a memory-leak.
public class Matrix
{
double[] data;
List<int> cols;
List<int> rows;
private int GetIndex(int x,int y)
{
return rows[y]+cols[x];
}
public double this[int x,int y]
{
get{return data[GetIndex(x,y)];}
set{data[GetIndex(x,y)]=value;}
}
public void DeleteColumn(int x)
{
cols.RemoveAt(x);
}
public void DeleteRow(int y)
{
rows.RemoveAt(y);
}
public Matrix(int width,int height)
{
cols=new List<int>(Enumerable.Range(0,width));
rows=new List<int>(Enumerable.Range(0,height).Select(i=>i*width));
data=new double[width*height];
}
}
Hm, to me this looks like a simple binary tree. The left node represents the next value in a row and the right node represents the column.
So it should be easy to iterate rows and columns and combine them.
Thank you for the answers.
At the moment I'm using this solution:
public class NodeMatrix
{
public NodeMatrix Right { get; set;}
public NodeMatrix Left { get; set; }
public NodeMatrix Up { get; set; }
public NodeMatrix Down { get; set; }
public int I { get; set; }
public int J { get; set; }
public double Data { get; set; }
public NodeMatrix(int I, int J, double Data)
{
this.I = I;
this.J = J;
this.Data = Data;
}
}
List<NodeMatrix> list = new List<NodeMatrix>(10000);
Then I'm building the connections between the nodes. After that the matrix is ready.
This will use more memory, but I think operations like adding rows and columns or joining rows and columns will be far faster.

What's better in regards to performance? type[,] or type[][]?

Is it more performant to have a bidimensional array (type[,]) or an array of arrays (type[][]) in C#?
Particularly for initial allocation and item access
Of course, if all else fails... test it! Following gives (in "Release", at the console):
Size 1000, Repeat 1000
int[,] set: 3460
int[,] get: 4036 (chk=1304808064)
int[][] set: 2441
int[][] get: 1283 (chk=1304808064)
So a jagged array is quicker, at least in this test. Interesting! However, it is a relatively small factor, so I would still stick with whichever describes my requirement better. Except for some specific (high CPU/processing) scenarios, readability / maintainability should trump a small performance gain. Up to you, though.
Note that this test assumes you access the array much more often than you create it, so I have not included timings for creation, where I would expect rectangular to be slightly quicker unless memory is highly fragmented.
using System;
using System.Diagnostics;
static class Program
{
static void Main()
{
Console.WriteLine("First is just for JIT...");
Test(10,10);
Console.WriteLine("Real numbers...");
Test(1000,1000);
Console.ReadLine();
}
static void Test(int size, int repeat)
{
Console.WriteLine("Size {0}, Repeat {1}", size, repeat);
int[,] rect = new int[size, size];
int[][] jagged = new int[size][];
for (int i = 0; i < size; i++)
{ // don't count this in the metrics...
jagged[i] = new int[size];
}
Stopwatch watch = Stopwatch.StartNew();
for (int cycle = 0; cycle < repeat; cycle++)
{
for (int i = 0; i < size; i++)
{
for (int j = 0; j < size; j++)
{
rect[i, j] = i * j;
}
}
}
watch.Stop();
Console.WriteLine("\tint[,] set: " + watch.ElapsedMilliseconds);
int sum = 0;
watch = Stopwatch.StartNew();
for (int cycle = 0; cycle < repeat; cycle++)
{
for (int i = 0; i < size; i++)
{
for (int j = 0; j < size; j++)
{
sum += rect[i, j];
}
}
}
watch.Stop();
Console.WriteLine("\tint[,] get: {0} (chk={1})", watch.ElapsedMilliseconds, sum);
watch = Stopwatch.StartNew();
for (int cycle = 0; cycle < repeat; cycle++)
{
for (int i = 0; i < size; i++)
{
for (int j = 0; j < size; j++)
{
jagged[i][j] = i * j;
}
}
}
watch.Stop();
Console.WriteLine("\tint[][] set: " + watch.ElapsedMilliseconds);
sum = 0;
watch = Stopwatch.StartNew();
for (int cycle = 0; cycle < repeat; cycle++)
{
for (int i = 0; i < size; i++)
{
for (int j = 0; j < size; j++)
{
sum += jagged[i][j];
}
}
}
watch.Stop();
Console.WriteLine("\tint[][] get: {0} (chk={1})", watch.ElapsedMilliseconds, sum);
}
}
I believe that [,] can allocate one contiguous chunk of memory, while [][] is N+1 chunk allocations where N is the size of the first dimension. So I would guess that [,] is faster on initial allocation.
Access is probably about the same, except that [][] would involve one extra dereference. Unless you're in an exceptionally tight loop it's probably a wash. Now, if you're doing something like image processing where you are referencing between rows rather than traversing row by row, locality of reference will play a big factor and [,] will probably edge out [][] depending on your cache size.
As Marc Gravell mentioned, usage is key to evaluating the performance...
It really depends. The MSDN Magazine article, Harness the Features of C# to Power Your Scientific Computing Projects, says this:
Although rectangular arrays are generally superior to jagged arrays in terms of structure and performance, there might be some cases where jagged arrays provide an optimal solution. If your application does not require arrays to be sorted, rearranged, partitioned, sparse, or large, then you might find jagged arrays to perform quite well.
type[,] will work faster, and not only because of fewer offset calculations. It is mainly because of less constraint checking, less memory allocation and better locality in memory. type[][] is not a single object: it is 1 + N objects that must be allocated and can be far away from each other.
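The "one object versus 1 + N objects" difference discussed in these answers can be seen directly in how the two shapes are created and read. This is only a sketch with made-up names (ArrayShapes and its methods); it shows the structure, not which one is faster.

```csharp
using System;

public static class ArrayShapes
{
    // One allocation: a single object holding rows * cols elements.
    public static int[,] CreateRectangular(int rows, int cols)
        => new int[rows, cols];

    // 1 + rows allocations: the outer array plus one inner array per row.
    public static int[][] CreateJagged(int rows, int cols)
    {
        var jagged = new int[rows][];
        for (int i = 0; i < rows; i++)
            jagged[i] = new int[cols];
        return jagged;
    }

    // The extra dereference happens once per row here: the row reference
    // is fetched in the outer loop, and the inner loop indexes a plain
    // one-dimensional array.
    public static long SumJagged(int[][] jagged)
    {
        long sum = 0;
        foreach (int[] row in jagged)
            for (int j = 0; j < row.Length; j++)
                sum += row[j];
        return sum;
    }
}
```

Each row of the jagged version is an independent array object, so the rows may end up anywhere on the heap, while the rectangular array is one contiguous block.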
