I think the title is quite clear, so I'll just write some personal opinions here.
Consider a matrix of numbers; the equivalent representations in C# are double[,] and double[][], respectively. With a multi-dimensional array (2D in this particular case), you clearly don't have to check for null double[] references or verify that all rows have the same length, which lets you focus on the core problem. It also describes the matrix more accurately in my view, since in most cases a matrix should be treated as a single entity rather than a list of arrays.
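As a minimal sketch of the difference (the sizes and the shape check are purely illustrative):

using System.Linq;

// Rectangular: one object with a fixed shape; no per-row validation needed.
double[,] rect = new double[3, 4];

// Jagged: an array of row arrays; rows may be null or have different lengths,
// so code that receives a double[][] often has to verify the shape first.
double[][] jagged = new double[3][];
for (var i = 0; i < jagged.Length; i++)
{
    jagged[i] = new double[4];
}

// The kind of extra check a jagged "matrix" may need:
bool isRectangular = jagged.All(r => r != null && r.Length == jagged[0].Length);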
But using a multi-dimensional array can mean more lines of code. To apply a math operation to it, say a transposition, you have to write nested loops like:
var row = mat.GetLength(0);
var col = mat.GetLength(1);
var newmat = new double[col, row];
for (var i = 0; i < row; i++)
{
    for (var j = 0; j < col; j++)
    {
        newmat[j, i] = mat[i, j];
    }
}
With a jagged array, you can simply write:
var newmat = Enumerable.Range(0, mat[0].Length)
    .Select(i => mat.Select(r => r[i]).ToArray())
    .ToArray();
I'm not sure which one is better. I usually write my own subroutine only when .NET provides no solution, so I prefer the latter. But multi-dimensional arrays do have advantages that I really like. Could anyone explain how to choose between them?
The problem isn't the number of lines of code, but the efficiency of the code itself.
If you have a sparse matrix (one that is almost all zeros), you would want a jagged structure, because iterating through a full two-dimensional matrix searching for the non-zero elements wastes time.
However, if you had a matrix and you wanted to find its determinant, it would be simpler to use the method of cofactors on it. If you're not familiar with the method, it involves breaking the matrix up into smaller matrices, eventually down to the 2x2 case where you can simply compute a*d - b*c. This is much more awkward with jagged matrices.
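For reference, a rough sketch of that cofactor expansion on a double[,] might look like this (Determinant and Minor are illustrative names; this is the naive recursive expansion, fine for small matrices but O(n!) in general):

// Laplace expansion along the first row.
static double Determinant(double[,] m)
{
    int n = m.GetLength(0);
    if (n == 1) return m[0, 0];
    if (n == 2) return m[0, 0] * m[1, 1] - m[0, 1] * m[1, 0];

    double det = 0;
    for (int col = 0; col < n; col++)
    {
        det += (col % 2 == 0 ? 1 : -1) * m[0, col] * Determinant(Minor(m, 0, col));
    }
    return det;
}

// Builds the (n-1)x(n-1) matrix obtained by deleting one row and one column.
static double[,] Minor(double[,] m, int rowToRemove, int colToRemove)
{
    int n = m.GetLength(0);
    var result = new double[n - 1, n - 1];
    for (int i = 0, r = 0; i < n; i++)
    {
        if (i == rowToRemove) continue;
        for (int j = 0, c = 0; j < n; j++)
        {
            if (j == colToRemove) continue;
            result[r, c++] = m[i, j];
        }
        r++;
    }
    return result;
}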
I have a very large two dimensional array and I need to compute vector operations on this array. NTerms and NDocs are both very large integers.
var myMat = new double[NTerms, NDocs];
I need to extract column vectors from this matrix. Currently, I'm using for loops:
int col = 100;
for (int i = 0; i < NTerms; i++)
{
    myVec[i] = myMat[i, col];
}
This operation is very slow. In Matlab I can extract the vector without the need for iteration, like so:
myVec = myMat(:, col);
Is there any way to do this in C#?
There are no constructs in C# that let you work with arrays the way Matlab does. With the code you already have, you can speed up the vector extraction using the Task Parallel Library introduced in .NET Framework 4.0:
Parallel.For(0, NTerms, i => myVec[i] = myMat[i, col]);
If your CPU has more than one core, you will get some improvement in performance; otherwise there will be no effect.
For more examples of how the Task Parallel Library can be used with matrices and arrays, you can refer to the MSDN article Matrix Decomposition.
But I doubt that C# is a good choice when it comes to some serious math calculations.
Some possible problems:
Could it be the way that elements are accessed for multi-dimensional arrays in C#? See this earlier article.
Another problem may be that you are accessing non-contiguous memory - so not much help from cache, and maybe you're even having to fetch from virtual memory (disk) if the array is very large.
What happens to your speed when you access a whole row at a time, instead of a column? If that's significantly faster, you can be 90% sure it's a contiguous-memory issue...
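If you want to test that hypothesis, a rough timing sketch along these lines (the size is purely illustrative) should make the difference visible:

using System;
using System.Diagnostics;

const int n = 4000;
var mat = new double[n, n];
double sum = 0;

var sw = Stopwatch.StartNew();
for (int i = 0; i < n; i++)        // row order: walks memory sequentially
    for (int j = 0; j < n; j++)
        sum += mat[i, j];
Console.WriteLine($"Row order:    {sw.ElapsedMilliseconds} ms");

sw.Restart();
for (int j = 0; j < n; j++)        // column order: strided access, cache-unfriendly
    for (int i = 0; i < n; i++)
        sum += mat[i, j];
Console.WriteLine($"Column order: {sw.ElapsedMilliseconds} ms");

Console.WriteLine(sum);            // keep sum observable so the loops aren't optimized away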
I'm trying to find multiple ways to solve Project Euler's problem #13. I've already solved it two different ways, but this time I want my solution to read all of the numbers from a text file, convert them, and add the column of digits farthest to the right. I also want to solve it so that if new numbers were added to the list, it could hold any number of rows or columns; its size would not be predefined (so maybe not an array? I'm not sure a jagged array applies here, since its size has to be set up front).
So far I've got:
static void Main(string[] args)
{
    List<int> sum = new List<int>();
    string bigIntFile = @"C:\Users\Justin\Desktop\BigNumbers.txt";
    string result;
    StreamReader streamReader = new StreamReader(bigIntFile);
    while ((result = streamReader.ReadLine()) != null)
    {
        for (int i = 0; i < result.Length; i++)
        {
            int converted = Convert.ToInt32(result.Substring(i, 1));
            sum.Add(converted);
        }
    }
}
which reads the file and converts each character of the string into a single int. I'm trying to work out how to store those ints in a collection that behaves like a 2D array but is versatile enough to hold any number of rows and columns. Any ideas on how to store these digits other than in a basic flat list? Is there a way to set up a list so that it acts like a 2D array whose size is not predefined? Thanks in advance!
UPDATE: I also don't want to use BigInteger. It would be a little too easy to read each line, convert the string to a BigInteger, store it in a list, and then sum them all from there.
There is no resizable 2D collection built into the .NET framework. I'd just go with the "jagged array" style of data structure, but with lists:
List<List<int>>
You can also vary this pattern by using an array for each row:
List<int[]>
If you want to read the file a little more simply, here is how:
List<int[]> numbers =
    File.ReadLines(path)
        .Select(line => line.Select(c => c - '0').ToArray())
        .ToList();
Far less code. You can reuse a lot of built-in functionality for basic data transformations, which means less code to write and maintain, makes it more extensible, and leaves it less prone to bugs.
If you want to select a column from this structure, do it like this:
int colIndex = ...;
int[] column = numbers.Select(row => row[colIndex]).ToArray();
You can encapsulate this line into a helper method to remove noise from your main addition algorithm.
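For example, a helper along these lines (GetColumn is just an illustrative name; assumes using System.Linq and System.Collections.Generic):

static int[] GetColumn(List<int[]> rows, int colIndex)
{
    return rows.Select(row => row[colIndex]).ToArray();
}

// e.g. the rightmost column of digits:
// int[] rightmost = GetColumn(numbers, numbers[0].Length - 1);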
Note that all of these patterns are far less efficient than a 2D array, but in your case they are good enough.
In this case you can simply use a 2D array, since you actually do know its dimensions in advance: 100 x 50.
If for some reason you want to solve a more general problem, you may indeed use a list of lists, List<List<int>>.
Having said that, I wonder: are you actually trying to sum up all the numbers? If so, I would suggest another approach: consider which part of the 50-digit numbers actually influences the first digits of their sum. Hint: you don't need the entire number.
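If you do want to act on that hint, a minimal sketch could look like the following (it gives part of the trick away; 15 leading digits is comfortably enough here, since the discarded tails of 100 numbers are far too small to disturb the first 10 digits of the total):

using System;
using System.IO;
using System.Linq;

string path = @"C:\Users\Justin\Desktop\BigNumbers.txt";

// Keep only a leading slice of each 50-digit number and sum those slices.
long total = File.ReadLines(path).Sum(line => long.Parse(line.Substring(0, 15)));
Console.WriteLine(total.ToString().Substring(0, 10));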
I have a 2D string array in C# and I need to shift it to the left along one dimension. How can I do that efficiently? I don't want to use nested for loops, and I want an algorithm that is O(n), not O(n²). Here is what I currently have:
for (int i = 50; i < 300; i++)
{
    for (int j = 0; j < 300; j++)
    {
        numbers[i - 50, j] = numbers[i, j];
    }
}
If you want to shift large amounts of data around quickly, use Array.Copy rather than a loop that copies individual elements.
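For the rectangular string[,] in the question, for example, Array.Copy treats the array as one long row-major block, so the whole shift can be done in a single call that is safe for overlapping ranges. A sketch, assuming numbers has 300 rows as the posted loop suggests:

// Shift rows 50..299 up to rows 0..249 in one call.
int cols = numbers.GetLength(1);   // 300 in the posted code
Array.Copy(numbers, 50 * cols, numbers, 0, 250 * cols);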
If you swap to a byte array and use Array.Copy or Buffer.BlockCopy you will probably improve the performance a bit more (but if you have to convert to/from character arrays you may lose everything you've gained).
(Edit, now that you've posted example code: if you use references to the array rows, i.e. a jagged array, then you may be able to shift the references rather than having to move the data itself. And you can still shift the references using Array.Copy.)
But if you change your approach so you don't need to shift the data, you'll gain considerably better performance - not doing the work at all if you can avoid it is always faster! Chances are you can wrap the data in an accessor layer that keeps track of how much the data has been shifted and modifies your indexes to return the data you are after. (This will slightly slow down access to the data, but saves you shifting the data, so may result in a net win - depending on how much you access relative to how much you shift)
The most efficient way would be to not shift it at all, but instead change how you access the array. For example, keep an offset that tells you where in the dimension the first column is.
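A minimal sketch of that idea, assuming the string[,] from the question (the class and member names are purely illustrative):

// Turns "shift all rows up by N" into an O(1) offset change instead of a data move.
class ShiftedGrid
{
    private readonly string[,] _data;
    private int _rowOffset;

    public ShiftedGrid(string[,] data) => _data = data;

    // Logical shift: the row that used to be at index i + count now appears at index i.
    public void ShiftRows(int count) => _rowOffset += count;

    public string this[int row, int col]
    {
        get => _data[row + _rowOffset, col];
        set => _data[row + _rowOffset, col] = value;
    }
}

// Usage: var grid = new ShiftedGrid(numbers); grid.ShiftRows(50);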
I have two multi-dimensional arrays declared like this:
bool?[,] biggie = new bool?[500, 500];
bool?[,] small = new bool?[100, 100];
I want to copy part of biggie into small. Let's say I want indexes 100 to 199 horizontally and 100 to 199 vertically.
I have written a simple for statement that goes like this:
for (int x = 0; x < 100; x++)
{
    for (int y = 0; y < 100; y++)
    {
        small[x, y] = biggie[x + 100, y + 100];
    }
}
I do this A LOT in my code, and this has proven to be a major performance jammer.
Array.Copy only really copies single-dimensional arrays; with multi-dimensional arrays it just treats the whole matrix as if it were a single array, putting each row at the end of the other, which won't let me cut a square out of the middle of my array.
Is there a more efficient way to do this?
P.S.: I have considered refactoring my code so that it doesn't do this at all and just works on the bigger array. Copying matrices just doesn't seem to be painless; the point is that I have stumbled on this before, looked for an answer, and found none.
In my experience, there are two ways to do this efficiently:
Use unsafe code and work directly with pointers.
Convert the 2D array to a 1D array and do the necessary arithmetic when you need to access it as a 2D array.
The first approach is ugly and it uses potentially invalid assumptions since 2D arrays are not guaranteed to be laid out contiguously in memory. The upshot to the first approach is that you don't have to change your code that is already using 2D arrays. The second approach is as efficient as the first approach, doesn't make invalid assumptions, but does require updating your code.
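A sketch of the second approach applied to the copy above (the 500/100 sizes come from the question; Buffer.BlockCopy is not an option here because bool? is not a primitive type, but one Array.Copy per row already avoids the per-element indexer cost):

const int bigWidth = 500;
const int smallWidth = 100;

// Flattened storage: element (x, y) lives at index x * width + y.
bool?[] biggie = new bool?[500 * bigWidth];
bool?[] small = new bool?[100 * smallWidth];

// Copy the 100x100 block starting at (100, 100), one row per Array.Copy call.
for (int x = 0; x < 100; x++)
{
    Array.Copy(biggie, (x + 100) * bigWidth + 100, small, x * smallWidth, smallWidth);
}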
I'm looking for resources that can help me determine which approach to use in creating a 2d data structure with C#.
Do you mean multidimensional array? It's simple:
<type>[,] <name> = new <type>[<first dimension>, <second dimension>];
Here is MSDN reference:
Multidimensional Arrays (C#)
@Traumapony: I'd actually say that the real performance gain comes from one giant flat array, but that may just be my C++ image-processing roots showing.
It depends on what you need the 2D structure to do. If it's storing something where each set of items in the second dimension is the same size, then you want to use something like a large 1D array, because the seek times are faster and the data management is easier. Like:
for (y = 0; y < ysize; y++){
    for (x = 0; x < xsize; x++){
        theArray[y*xsize + x] = //some stuff!
    }
}
And then you can do operations which ignore neighboring pixels with a single passthrough:
totalsize = xsize*ysize;
for (x = 0; x < totalsize; x++){
    theArray[x] = //some stuff!
}
Except that in C# you probably want to call into a C++ library to do this kind of processing; C++ tends to be faster for this, especially if you use the Intel compiler.
If you have the second dimension having multiple different sizes, then nothing I said applies, and you should look at some of the other solutions. You really need to know what your functional requirements are in order to be able to answer the question.
Depending on the type of data, you could look at using a straight two-dimensional array:
int[][] intGrid;
If you need to get tricky, you could always go the generics approach:
Dictionary<KeyValuePair<int,int>,string>;
That allows you to put complex types in the value part of the dictionary, although it makes indexing into the elements more difficult.
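A quick illustration of what that indexing looks like (purely a sketch):

using System;
using System.Collections.Generic;

var grid = new Dictionary<KeyValuePair<int, int>, string>();

// Writing to "cell" (3, 7):
grid[new KeyValuePair<int, int>(3, 7)] = "hello";

// Reading it back means rebuilding the key:
if (grid.TryGetValue(new KeyValuePair<int, int>(3, 7), out var value))
{
    Console.WriteLine(value);   // "hello"
}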
If you're looking to store spatial 2d point data, System.Drawing has a lot of support for points in 2d space.
For performance, it's best not to use multi-dimensional arrays ([,]); instead, use jagged arrays. e.g.:
<type>[][] <name> = new <type>[<first dimension>];
for (int i = 0; i < <first dimension>; i++)
{
    <name>[i] = new <type>[<second dimension>];
}
To access:
<type> item = <name>[<first index>][<second index>];
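Filled in with concrete types, that pattern might look like this (the sizes are arbitrary):

double[][] grid = new double[3][];
for (int i = 0; i < grid.Length; i++)
{
    grid[i] = new double[4];
}

double item = grid[1][2];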
Data Structures in C#
Seriously, I'm not trying to be critical of the question, but I got tons of useful results right at the top of my search when I Googled for:
data structures c#
If you have specific questions about specific data structures, we might have more specific answers...