I use a custom Matrix class in my application, and I frequently add multiple matrices:
Matrix result = a + b + c + d; // a, b, c and d are also Matrices
However, this creates an intermediate matrix for each addition operation. Since this is simple addition, it is possible to avoid the intermediate objects and create the result by adding the elements of all 4 matrices at once. How can I accomplish this?
NOTE: I know I can define multiple functions like Add3Matrices(a, b, c), Add4Matrices(a, b, c, d), etc. but I want to keep the elegancy of result = a + b + c + d.
You could limit yourself to a single small intermediate by using lazy evaluation. Something like
public class LazyMatrix
{
public static implicit operator Matrix(LazyMatrix l)
{
Matrix m = new Matrix();
foreach (Matrix x in l.Pending)
{
for (int i = 0; i < 2; ++i)
for (int j = 0; j < 2; ++j)
m.Contents[i, j] += x.Contents[i, j];
}
return m;
}
public List<Matrix> Pending = new List<Matrix>();
}
public class Matrix
{
public int[,] Contents = { { 0, 0 }, { 0, 0 } };
public static LazyMatrix operator+(Matrix a, Matrix b)
{
LazyMatrix l = new LazyMatrix();
l.Pending.Add(a);
l.Pending.Add(b);
return l;
}
public static LazyMatrix operator+(Matrix a, LazyMatrix b)
{
b.Pending.Add(a);
return b;
}
}
class Program
{
static void Main(string[] args)
{
Matrix a = new Matrix();
Matrix b = new Matrix();
Matrix c = new Matrix();
Matrix d = new Matrix();
a.Contents[0, 0] = 1;
b.Contents[1, 0] = 4;
c.Contents[0, 1] = 9;
d.Contents[1, 1] = 16;
Matrix m = a + b + c + d;
for (int i = 0; i < 2; ++i)
{
for (int j = 0; j < 2; ++j)
{
System.Console.Write(m.Contents[i, j]);
System.Console.Write(" ");
}
System.Console.WriteLine();
}
System.Console.ReadLine();
}
}
Something that would at least avoid the pain of
Matrix Add3Matrices(a,b,c) //and so on
would be
Matrix AddMatrices(Matrix[] matrices)
In C++ it is possible to use Template Metaprograms and also here, using templates to do exactly this. However, the template programing is non-trivial. I don't know if a similar technique is available in C#, quite possibly not.
This technique, in c++ does exactly what you want. The disadvantage is that if something is not quite right then the compiler error messages tend to run to several pages and are almost impossible to decipher.
Without such techniques I suspect you are limited to functions such as Add3Matrices.
But for C# this link might be exactly what you need: Efficient Matrix Programming in C# although it seems to work slightly differently to C++ template expressions.
You can't avoid creating intermediate objects.
However, you can use expression templates as described here to minimise them and do fancy lazy evaluation of the templates.
At the simplest level, the expression template could be an object that stores references to several matrices and calls an appropriate function like Add3Matrices() upon assignment. At the most advanced level, the expression templates will do things like calculate the minimum amount of information in a lazy fashion upon request.
This is not the cleanest solution, but if you know the evaluation order, you could do something like this:
result = MatrixAdditionCollector() << a + b + c + d
(or the same thing with different names). The MatrixCollector then implements + as +=, that is, starts with a 0-matrix of undefined size, takes a size once the first + is evaluated and adds everything together (or, copies the first matrix). This reduces the amount of intermediate objects to 1 (or even 0, if you implement assignment in a good way, because the MatrixCollector might be/contain the result immediately.)
I am not entirely sure if this is ugly as hell or one of the nicer hacks one might do. A certain advantage is that it is kind of obvious what's happening.
Might I suggest a MatrixAdder that behaves much like a StringBuilder. You add matrixes to the MatrixAdder and then call a ToMatrix() method that would do the additions for you in a lazy implementation. This would get you the result you want, could be expandable to any sort of LazyEvaluation, but also wouldn't introduce any clever implementations that could confuse other maintainers of the code.
I thought that you could just make the desired add-in-place behavior explicit:
Matrix result = a;
result += b;
result += c;
result += d;
But as pointed out by Doug in the Comments on this post, this code is treated by the compiler as if I had written:
Matrix result = a;
result = result + b;
result = result + c;
result = result + d;
so temporaries are still created.
I'd just delete this answer, but it seems others might have the same misconception, so consider this a counter example.
Bjarne Stroustrup has a short paper called Abstraction, libraries, and efficiency in C++ where he mentions techniques used to achieve what you're looking for. Specifically, he mentions the library Blitz++, a library for scientific calculations that also has efficient operations for matrices, along with some other interesting libraries. Also, I recommend reading a conversation with Bjarne Stroustrup on artima.com on that subject.
It is not possible, using operators.
My first solution would be something along this lines (to add in the Matrix class if possible) :
static Matrix AddMatrices(Matrix[] lMatrices) // or List<Matrix> lMatrices
{
// Check consistency of matrices
Matrix m = new Matrix(n, p);
for (int i = 0; i < n; i++)
for (int j = 0; j < n; j++)
foreach (Maxtrix mat in lMatrices)
m[i, j] += mat[i, j];
return m;
}
I'd had it in the Matrix class because you can rely on the private methods and properties that could be usefull for your function in case the implementation of the matrix change (linked list of non empty nodes instead of a big double array, for example).
Of course, you would loose the elegance of result = a + b + c + d. But you would have something along the lines of result = Matrix.AddMatrices(new Matrix[] { a, b, c, d });.
There are several ways to implement lazy evaluation to achieve that. But its important to remember that not always your compiler will be able to get the best code of all of them.
I already made implementations that worked great in GCC and even superceeded the performance of the traditional several For unreadable code because they lead the compiler to observe that there were no aliases between the data segments (somethign hard to grasp with arrays coming of nowhere). But some of those were a complete fail at MSVC and vice versa on other implementations. Unfortunately those are too long to post here (don't think several thousands lines of code fit here).
A very complex library with great embedded knowledge int he area is Blitz++ library for scientific computation.
Related
There is a function that multiplies two matrices as usual
public IMatrix Multiply(IMatrix m1, IMatrix m2)
{
var resultMatrix = new Matrix(m1.RowCount, m2.ColCount);
for (long i = 0; i < m1.RowCount; i++)
{
for (byte j = 0; j < m2.ColCount; j++)
{
long sum = 0;
for (byte k = 0; k < m1.ColCount; k++)
{
sum += m1.GetElement(i, k) * m2.GetElement(k, j);
}
resultMatrix.SetElement(i, j, sum);
}
}
return resultMatrix;
}
This function should be rewritten using Parallel.ForEach Threading, I tried this way
public IMatrix Multiply(IMatrix m1, IMatrix m2)
{
// todo: feel free to add your code here
var resultMatrix = new Matrix(m1.RowCount, m2.ColCount);
Parallel.ForEach(m1.RowCount, row =>
{
for (byte j = 0; j < m2.ColCount; j++)
{
long sum = 0;
for (byte k = 0; k < m1.ColCount; k++)
{
sum += m1.GetElement(row, k) * m2.GetElement(k, j);
}
resultMatrix.SetElement(row, j, sum);
}
});
return resultMatrix;
}
But there is an error with the type argument in the loop. How can I fix it?
Just use Parallel.For instead of Parallel.Foreach, that should let you keep the exact same body as the non-parallel version:
Parallel.For(0, m1.RowCount, i =>{
...
}
Note that only fairly large matrices will benefit from parallelization, so if you are working with 4x4 matrices for graphics, this is not the approach to take.
One problem with multiplying matrices is that you need to access one value for each row for one of the matrices in your innermost loop. This access pattern may be difficult to cache by your processor, causing lots of cache misses. So a fairly easy optimization is to copy an entire column to a temporary array and do all computations that need this column before reading the next. This lets all memory access be nice and linear and easy to cache. this will do more work overall, but better cache utilization easily makes it a win. There are even more cache efficient methods, but the complexity also tend to increase.
Another optimization would be to use SIMD, but this might require platform specific code for best performance, and will likely involve more work. But you might be able to find libraries that are already optimized.
But perhaps most importantly, Profile your code. It is quite easy to have simple things consume lot of time. You are for example using an interface, so if you may have a virtual method call for each memory access that cannot be inlined, potentially causing a severe performance penalty compared to a direct array access.
ForEach receives a collection, IEnumerable as the first argument and m1.RowCount is a number.
Probably Parallel.For() is what you wanted.
I'm trying to find a fit function that has the form:
f(x) = P / (1 + e^((x + m) / s)
Where P is a known constant. I'm fitting this function to a list of measured doubles (between 20-100 elements) and all these values has a corresponding x-value. I'm relatively new to C# and not very in to the maths either so I find it kind of hard to read the documentation available.
I have tried using AlgLib, but don't know where to start or what function to use.
Edit: So to precise what I#m looking for: I'd like to find a C# method where i can pass the functions form, aswell as some coordinates (x- and y-values) and have the method returning the two unknown variables (s and m above).
I use AlgLib daily for exactly this purpose. If you go to the link http://www.alglib.net/docs.php and scroll all the way down, you'll find the documentation with code examples in a number of languages (including C#) that I think will help you immensely: http://www.alglib.net/translator/man/manual.csharp.html
For your problem, you should consider all the constraints you need, but a simple example of obtaining a nonlinear least-squares fit given your input function and data would look something like this:
public SomeReturnObject Optimize(SortedDictionary<double, double> dataToFitTo, double p, double initialGuessM, double initialGuessS)
{
var x = new double[dataToFitTo.Count,1];
for(int i=0; i < dataToFitTo.Count; i++)
{
x[i, 0] = dataToFitTo.Keys.ElementAt(i);
}
var y = dataToFitTo.Values.ToArray();
var c = new[] {initialGuessM, initialGuessS};
int info;
alglib.lsfitstate state;
alglib.lsfitreport rep;
alglib.lsfitcreatef(x, y, c, 0.0001, out state);
alglib.lsfitsetcond(state, epsf, 0, 0);
alglib.lsfitfit(state, MyFunc, null, p);
alglib.lsfitresults(state, out info, out c, out rep);
/* When you get here, the c[] array should have the optimized values
for m and s, so you'll want to handle accordingly depending on your
needs. I'm not sure if you want out parameters for m and s or an
object that has m and s as properties. */
}
private void MyFunc(double[] c, double[] x, ref double func, object obj)
{
var xPt = x[0];
var m = c[0];
var s = c[1];
var P = (double)obj;
func = P / (1 + Math.Exp((xPt + m) / s));
}
Mind you, this is just a quick and dirty example. There is a lot of built-in functionality in Alglib so you'll need to adjust the problem code here to suit your needs with boundary constraints, weighting, step size, variable scaling....etc. It should be clear how to do all that from the examples and documentation in the second link.
Also note that Alglib is very particular about the method signature of MyFunc, so I would avoid moving around those inputs or adding any more.
Alternatively, you can write your own Levenberg-Marquardt algorithm if Alglib doesn't satisfy all your needs.
I have two arrays:
Vector3[] positions;
Matrix4x4[] transforms;
And a point in space:
Vector3 point;
For each position I get the distance from the point:
float distance = GetDistance(point, transforms[i] * positions[i]);
I'm comfortable enough with using delegates to sort a single array, but how can I sort the two arrays at the same time?
I need the operation to be as fast as possible, so i'd like to avoid packing into a temporary array and then unpacking the result.
I'm using .NET 2.0 so no Linq.
Instead of managing two parallel array you should use a better model that binds the data together.
Tuple<Vector3, Matrix4x4>[] posTransforms;
//add like this
posTransforms.Add(new Tuple<Vector3, Matrix4x4>(vec, matrix));
// order by Y cooridinate of the vectors for example
posTransforms.OrderBy(x => x.Item1.Y)
The simplest solution is to do something like what #evanmcdonnal suggests and just package the corresponding elements together (because they are obviously related and should be part of the same data structure). Then sort them using either a built in sort function or some such thing.
If you are really opposed to doing that for whatever reason, you will need to write your own sort method that will move the elements of both arrays at the same time.
Uncompiled and untested example (but should give you a decent idea of how to proceed):
bool isSorted;
do
{
isSorted = true;
for (int i = 0; i < positions.Length - 1; i++)
{
float distance = GetDistance(point, transforms[i] * positions[i]);
float distanceNext = GetDistance(point, transforms[i + 1] * positions[i + 1]);
if (distanceNext < distance)
{
var swapTransform = transforms[i];
transforms[i] = transforms[i + 1];
transforms[i + 1] = swapTransforms;
var swapPosition = positions[i];
positions[i] = positions[i + 1];
positions[i + 1] = swapPosition ;
isSorted = false;
}
}
} while(!isSorted);
Note that I used a bubble sort here (which is in no way efficient, just really easy to write). I suggest finding a much more efficient algorithm to use if you decide to go this route.
What i am trying here is multiplication of two 500x500 matrices of doubles.
And i am getting null reference exception !
Could u take a look into it
void Topt(double[][] A, double[][] B, double[][] C) {
var source = Enumerable.Range(0, N);
var pquery = from num in source.AsParallel()
select num;
pquery.ForAll((e) => Popt(A, B, C, e));
}
void Popt(double[][] A, double[][] B, double[][] C, int i) {
double[] iRowA = A[i];
double[] iRowC = C[i];
for (int k = 0; k < N; k++) {
double[] kRowB = B[k];
double ikA = iRowA[k];
for (int j = 0; j < N; j++) {
iRowC[j] += ikA * kRowB[j];
}
}
}
Thanks in advance
Since your nullpointer problem was already solved, why not a small performance tip ;) One thing you could try would be a cache oblivious algorithm. For 2k x 2k matrices I get 24.7sec for the recursive cache oblivious variant and 50.26s with the trivial iterative method on a e8400 #3ghz single threaded in C (both could be further optimized obviously, some better arguments for the compiler and making sure SSE is used etc.). Though 500x500 is quite small so you'd have to try if that gives you noticeable improvements.
The recursive one is easier to make multithreaded usually though.
Oh but the most important thing since you're writing in c#: Read this - it's for java, but the same should apply for any modern JIT..
I have a matrix and i want to create a new matrix which will be the old matrix, but without the first row and first column. is there a way to do this without using loops?
i want to create a new matrix
From this it sounds to me like you want a new T[,] object.
which will be the old matrix, but without the first row and first column
I interpret this to mean you want the new T[,] object to contain the same values as the original, excepting the first row/column.
is there a way to do this without using loops?
If I've interpreted your question correctly, then no, not really. You will need to copy elements from one array to another; this requires enumeration. But that doesn't mean you can't abstract the implementation of this method into a reusable method (in fact, this is what you should do).
public static T[,] SubMatrix(this T[,] matrix, int xstart, int ystart)
{
int width = matrix.GetLength(0);
int height = matrix.GetLength(1);
if (xstart < 0 || xstart >= width)
{
throw new ArgumentOutOfRangeException("xstart");
}
else if (ystart < 0 || ystart >= height)
{
throw new ArgumentOutOfRangeException("ystart");
}
T[,] submatrix = new T[width - xstart, height - ystart];
for (int i = xstart; i < width; ++i)
{
for (int j = ystart; j < height; ++j)
{
submatrix[i - xstart, j - ystart] = matrix[i, j];
}
}
return submatrix;
}
The above code isn't pretty, but once it's in place you'll be able to use it quite neatly:
T[,] withoutFirstRowAndColumn = originalMatrix.SubMatrix(1, 1);
Now, if I misinterpreted your question, and you are not dead-set on creating a new T[,] object, you can improve the efficiency of this approach by not allocating a new T[,] at all; you could take Abel's idea (along with its caveats) and use unsafe code to essentially simulate a T[,] with indices pointing to the elements of the original matrix. Come to think of it, you could even achieve this without resorting to unsafe code; you'd simply need to define an interface for the functionality you'd want to expose (a this[int, int] property comes to mind) and then implement that functionality (your return type wouldn't be a T[,] in this case, but what I'm getting at is that it could be something like it).
Simply put: no. But if you do not use jagged arrays but instead use multi-dim arrays, and if you take some time to study the memory layout of arrays in .NET, you could do it with unsafe pointers and erasing a part of the memory and moving the starting pointer of the multi-dim array. But it'd be still dependent on how you design your arrays and your matrixes whether this works or not.
However, I'd highly advice against it. There's a big chance you screw up the type and confuse the garbage collector if you do so.
Alternatively, if you like to do this exercise, use C++/CLI for this task. In C++, you have more control and it's easier to manipulate memory and move pointers directly. You also have more control over the destructor and finalizers, which may come in handy here. But, that said, then you still need marshaling. If you'd do all this for performance, I'd advice to go back to the simple loops, it'll perform faster in most cases.
Maybe you should have a look at using a maths library with good support for Matrix operations? Here's a thread which mentions a few:
Matrix Library for .NET
Using some methods from the Buffer class, you can do a row-wise copy if the matrix element type is a primitive type. This should be faster than an element-wise copy. Here is a generic extension method which demonstrates the use of Buffer:
static PrimitiveType[,] SubMatrix<PrimitiveType>(
this PrimitiveType[,] matrix, int fromRow, int fromCol) where PrimitiveType: struct
{
var (srcRowCount, srcColCount) = ( matrix.GetLength(0), matrix.GetLength(1) );
if (fromRow < 0 || fromRow > srcRowCount)
{
throw new IndexOutOfRangeException(nameof(fromRow));
}
if (fromCol < 0 || fromCol > srcColCount)
{
throw new IndexOutOfRangeException(nameof(fromCol));
}
var (dstRowCount, dstColCount) = ( srcRowCount - fromRow, srcColCount - fromCol );
var subMatrix = new PrimitiveType[dstRowCount, dstColCount];
var elementSize = Buffer.ByteLength(matrix) / matrix.Length;
for (var row = 0; row < dstRowCount; ++row)
{
var srcOffset = (srcColCount * (row + fromRow) + fromCol) * elementSize;
var dstOffset = dstColCount * row * elementSize;
Buffer.BlockCopy(matrix, srcOffset, subMatrix, dstOffset, dstColCount * elementSize);
}
return subMatrix;
}