LINQ solution to get greatest numbers of arrays - C#

Having an object that has a double[6] array property as its geometric bounds:
bounds[0] = xmin
bounds[1] = xmax
bounds[2] = ymin
bounds[3] = ymax
bounds[4] = zmin
bounds[5] = zmax
While iterating several objects and reading this property,
I want to store the greatest value of xmax, ymax and zmax across all iterated objects.
What would be the best way to accomplish this task? I have this idea, but I would like to use LINQ:
double xmax = 0.0;
double ymax = 0.0;
double zmax = 0.0;
foreach (var o in myObject)
{
    var max = o.bounds;
    if (xmax < max[1])
    {
        xmax = max[1];
    }
    if (ymax < max[3])
    {
        ymax = max[3];
    }
    if (zmax < max[5])
    {
        zmax = max[5];
    }
}

You can do it using the Enumerable.Max method:
double xmax = myObject.Select(x => x.bounds[1]).Max();
double ymax = myObject.Select(x => x.bounds[3]).Max();
double zmax = myObject.Select(x => x.bounds[5]).Max();
Note that this solution enumerates the collection three times unnecessarily. Your foreach loop enumerates the collection only once; if I were you I would keep using the simple loop and use LINQ only when it's helpful. But of course the decision is up to you. If your collection is not noticeably huge, you can prefer the more readable approach.

I would go with this approach:
double xmax = myObject.Max(mo => mo.bounds[1]);
double ymax = myObject.Max(mo => mo.bounds[3]);
double zmax = myObject.Max(mo => mo.bounds[5]);
This produces very fast results.
If you want to iterate the enumerable only once using linq, then do it this way:
var max = myObject
.Select(mo => mo.bounds)
.Aggregate(
new { x = double.MinValue, y = double.MinValue, z = double.MinValue },
(a, b) => new
{
x = Math.Max(a.x, b[1]),
y = Math.Max(a.y, b[3]),
z = Math.Max(a.z, b[5]),
});
While this is only one iteration, my tests showed it was slower than the first method: the first method took 625 ms and the second 705 ms.

You can use LINQ and iterate only one time, with the following code, but at the core you are doing about the same thing. Depending on what else your code needs to do, this may be better than your current approach.
Personally, I would lean towards Selman's example, for readability sake.
double xmax = 0.0;
double ymax = 0.0;
double zmax = 0.0;
myLinqObject.ForEach(x =>
{
    xmax = (x.bounds[1] > xmax ? x.bounds[1] : xmax);
    ymax = (x.bounds[3] > ymax ? x.bounds[3] : ymax);
    zmax = (x.bounds[5] > zmax ? x.bounds[5] : zmax);
});
Important note: attempting this in code before C# 5.0 may produce wrong values because of a breaking change in how closures capture the foreach loop variable. More info: http://davefancher.com/2012/11/03/c-5-0-breaking-changes/
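For reference, the breaking change in question is that since C# 5.0 each foreach iteration gets a fresh loop variable, so closures created inside the loop no longer share one. A minimal illustration of just the capture behavior (not the code above):

```csharp
using System;
using System.Collections.Generic;

var captured = new List<int>();
var actions = new List<Action>();
foreach (var i in new[] { 1, 2, 3 })
    actions.Add(() => captured.Add(i)); // each closure captures the loop variable

actions.ForEach(a => a());

// C# 5.0 and later: each iteration has its own variable, so this
// prints "1 2 3". Before C# 5.0 all three closures shared one loop
// variable and would all have recorded the final value, 3.
Console.WriteLine(string.Join(" ", captured));
```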

You can use Enumerable.Aggregate() with Math.Max() to produce an array of max values:
myObject.Aggregate(new double[] { 0, 0, 0 }, (max, o) => new double[] {
    Math.Max(max[0], o.Bounds[1]),
    Math.Max(max[1], o.Bounds[3]),
    Math.Max(max[2], o.Bounds[5])
});
http://rextester.com/KSBAG65974
Edit: I would think this would be the correct approach, as it takes advantage of Linq's ability to iterate the collection just once to get all 3 values. But, in practice, calling Enumerable.Max() 3 times is actually faster: http://rextester.com/QNPL66232
+1 to Enigmativity's answer for identifying that the cause of the slowness with Aggregate() is from garbage collection.

You can use this:
double max = myEnumerable.Max();
double min = myEnumerable.Min();
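As written, Max() and Min() apply to a flat sequence of doubles. To use this shape on the question's objects you would first flatten the bounds arrays, e.g. with SelectMany. A sketch, using anonymous objects as a stand-in for the asker's type:

```csharp
using System;
using System.Linq;

// Anonymous objects standing in for the asker's type and its bounds array.
var myObject = new[]
{
    new { bounds = new double[] { 0, 2, 0, 5, 0, 1 } },
    new { bounds = new double[] { 0, 7, 0, 3, 0, 4 } },
};

// Flatten every bounds array into one sequence and take its extremes.
// Note this gives the overall extreme across all six slots, not per-axis.
double overallMax = myObject.SelectMany(o => o.bounds).Max(); // 7
double overallMin = myObject.SelectMany(o => o.bounds).Min(); // 0

Console.WriteLine($"{overallMin} {overallMax}");
```

This only answers the question if you want a single global extreme; for per-axis maxima you still need one of the per-index approaches above.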

Related

#MathdotNet How to find the parameters of Herschel-Bulkley model through Nonlinear Regression in Math.NET?

First, I would like to thank everyone involved in this magnificent project, Math.NET saved my life!
I have a few questions about linear and nonlinear regression. I am a civil engineer, and while working on my Master's degree I needed to develop a C# application that calculates the rheological parameters of concrete based on data acquired from a test.
One of the models that describes the rheological behavior of concrete is the Herschel-Bulkley model, and it has this formula:
y = T + K*x^n
x (the shear rate) and y (the shear stress) are the values obtained from the test, while T, K and n are the parameters I need to determine.
I know that the value of T is between 0 and Ymin (Ymin is the smallest data point from the test), so here is what I did:
Since it is a nonlinear equation, I had to make it linear, like this:
ln(y - T) = ln(K) + n*ln(x)
Then I create an array of possible values of T, from 0 to Ymin, and try each value in the equation;
through linear regression I find the values of K and n;
then I calculate the SSD and store the results in an array;
after I finish all the possible values of T, I see which one had the smallest SSD and use it to find the optimal K and n.
This method works, but I feel it is not as smart or elegant as it should be; there must be a better way to do it, and I was hoping to find it here. It is also very slow.
Here is the code that I used:
public static double HerschelBulkley(double shearRate, double tau0, double k, double n)
{
    var t = tau0 + k * Math.Pow(shearRate, n);
    return t;
}

public static (double Tau0, double K, double N, double DeltaMin, double RSquared) HerschelBulkleyModel(double[] shear, double[] shearRate, double step = 1000.0)
{
    // Number of candidate tau0 values from 0.0 to shear.Min()
    var sm = (int)Math.Floor(shear.Min() * step);
    // Populate the array of tau0 candidates with the values from 0 to sm
    var tau0Array = Enumerable.Range(0, sm).Select(t => t / step).ToArray();
    var kArray = new double[sm];
    var nArray = new double[sm];
    var deltaArray = new double[sm];
    var rSquaredArray = new double[sm];
    var shearRateLn = shearRate.Select(s => Math.Log(s)).ToArray();
    for (var i = 0; i < sm; i++)
    {
        var shearLn = shear.Select(s => Math.Log(s - tau0Array[i])).ToArray();
        var param = Fit.Line(shearRateLn, shearLn);
        kArray[i] = Math.Exp(param.Item1);
        nArray[i] = param.Item2;
        var shearHerschel = shearRate.Select(sr => HerschelBulkley(sr, tau0Array[i], kArray[i], nArray[i])).ToArray();
        deltaArray[i] = Distance.SSD(shearHerschel, shear);
        rSquaredArray[i] = GoodnessOfFit.RSquared(shearHerschel, shear);
    }
    var deltaMin = deltaArray.Min();
    var index = Array.IndexOf(deltaArray, deltaMin);
    var tau0 = tau0Array[index];
    var k = kArray[index];
    var n = nArray[index];
    var rSquared = rSquaredArray[index];
    return (tau0, k, n, deltaMin, rSquared);
}
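One way to make the grid search both faster and cleaner is to treat the SSD as a one-dimensional function of T and minimize it directly, e.g. with a golden-section search. This assumes SSD(T) is unimodal on [0, Ymin), which holds for well-behaved data. A self-contained sketch with the line fit written out by hand so it can run without the Math.NET calls (in real code you would keep Fit.Line); the synthetic data and tolerances are illustrative assumptions:

```csharp
using System;
using System.Linq;

// Hand-rolled ordinary least-squares fit of y = a + b*x.
static (double A, double B) FitLine(double[] x, double[] y)
{
    double xAvg = x.Average(), yAvg = y.Average();
    double num = 0.0, den = 0.0;
    for (int i = 0; i < x.Length; i++)
    {
        num += (x[i] - xAvg) * (y[i] - yAvg);
        den += (x[i] - xAvg) * (x[i] - xAvg);
    }
    double slope = num / den;
    return (yAvg - slope * xAvg, slope);
}

// SSD of the Herschel-Bulkley fit for one candidate tau0, using the
// same log-linearization as the question: ln(y - T) = ln(K) + n*ln(x).
static double Ssd(double tau0, double[] shearRate, double[] shear)
{
    var xLn = shearRate.Select(Math.Log).ToArray();
    var yLn = shear.Select(s => Math.Log(s - tau0)).ToArray();
    var (a, b) = FitLine(xLn, yLn);
    double k = Math.Exp(a), n = b;
    return shearRate.Zip(shear, (sr, s) =>
    {
        double d = tau0 + k * Math.Pow(sr, n) - s;
        return d * d;
    }).Sum();
}

// Synthetic data from known parameters tau0 = 5, K = 2, n = 0.5.
var shearRate = Enumerable.Range(1, 20).Select(i => (double)i).ToArray();
var shear = shearRate.Select(x => 5.0 + 2.0 * Math.Sqrt(x)).ToArray();

// Golden-section search over tau0 in [0, shear.Min()): each step keeps
// the bracket containing the minimum and shrinks it by the golden ratio.
double lo = 0.0, hi = shear.Min() - 1e-9;
double gr = (Math.Sqrt(5.0) - 1.0) / 2.0;
while (hi - lo > 1e-10)
{
    double c = hi - gr * (hi - lo);
    double d = lo + gr * (hi - lo);
    if (Ssd(c, shearRate, shear) < Ssd(d, shearRate, shear))
        hi = d;
    else
        lo = c;
}
double tau0Best = (lo + hi) / 2.0;
Console.WriteLine($"tau0 = {tau0Best:F4}");
```

The bracket shrinks by a constant factor per step, so this needs on the order of 50 SSD evaluations instead of thousands of grid points.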

Combine Aggregate and Select Linq

I have a List of points and I want to calculate the remaining distance to the end using LINQ (given an index):
double remainingToEnd = Points.Skip(CurrentIndex).Aggregate((x, y) => x.DistanceTo(y));
This doesn't compile:
Cannot convert lambda expression to intended delegate type because
some of the return types in the block are not implicitly convertible
to the delegate return type
I normally solve this by projecting with the Select extension, but that would prevent me from calculating the distance afterwards.
This is easily achieved by using a loop but I want to know if it is possible with some simple Linq. I would like to avoid anonymous types too.
Point is defined like:
public class Point
{
public float X { get; set; }
public float Y { get; set; }
public float Z { get; set; }
public float DistanceTo(Point p2)
{
float x = this.X - p2.X;
float y = this.Y - p2.Y;
float z = this.Z - p2.Z;
return (float)Math.Sqrt((x * x) + (y * y) + (z * z));
}
}
Assume you want to calculate the total distance between points in the collection (starting from some index). You need the previous point at each step, and you can get it by zipping the points collection with itself:
double remainingToEnd = Points.Skip(CurrentIndex)
.Zip(Points.Skip(CurrentIndex + 1), (x,y) => x.DistanceTo(y))
.Sum();
Zip will produce pairs of consecutive starting and ending points. The result selector function computes the distance for each pair, and then you just calculate the sum of the distances.
You can solve this task with aggregation as well, but you need to keep the last point at each step. So you need an accumulator which holds both the current distance and the last point:
var remainingToEnd = Points.Skip(CurrentIndex).Aggregate(
new { total = 0.0, x = Points.Skip(CurrentIndex).FirstOrDefault() },
(a, y) => new { total = a.total + a.x.DistanceTo(y), x = y },
a => a.total);
And keep in mind that Skip just iterates your sequence item by item without doing anything else. If you have a lot of points, skipping twice can hurt performance. So if you have a list of points and performance matters, a simple for loop will do the job:
double remainingToEnd = 0.0;
for(int i = CurrentIndex; i < Points.Count - 1; i++)
remainingToEnd += Points[i].DistanceTo(Points[i+1]);
Try this:
double remainingToEnd = Points.Skip(CurrentIndex + 1).Sum(point => point.DistanceTo(Points[Points.FindIndex(p => p == point) - 1]));

Compute cosine and sine with Linq

I made some code to compute the sine and cosine, but the code is not so good, and I want to know if it is possible to compute the values with LINQ.
That is my code to compute sine:
var primes = PrimeNumbers(3, 15);
bool SumSub = false;
decimal seno = (decimal)(nGrau * nSeno);
foreach (var a in primes)
{
if (SumSub == false)
{
seno -= (decimal)Math.Pow(nGrau, (double)a) / Factorial(a);
SumSub = true;
}
else
{
seno += (decimal)Math.Pow(nGrau, (double)a) / Factorial(a);
SumSub = false;
}
}
Console.WriteLine(seno);
Is it possible to compute the sine of degrees using LINQ?
Something like this, perhaps:
var sineResult = listDouble.Select((item, index) =>
new {i = (index%2)*2 - 1, o = item})
.Aggregate(seno, (result, b) =>
result - b.i * ((decimal)Math.Pow(nGrau, (double)b.o) / Factorial(b.o)));
The code
i = (index%2)*2 - 1
gives you alternating 1 and -1.
The Aggregate statement sums the values, multiplying each value by either -1 or 1.
You could use Aggregate:
decimal seno = PrimeNumbers(3, 15)
.Aggregate(
new { sub = false, sum = (decimal)(nGrau * nSeno) },
(x, p) => new {
sub = !x.sub,
sum = x.sum + (x.sub ? 1 : -1) * (decimal)Math.Pow(nGrau, (double)p) / Factorial(p)
},
x => x.sum);
I didn't test that, but I think it should work.
Btw. I don't think it's more readable or better than your solution. If I were you I would go with the foreach loop, but improve it a little bit:
foreach (var a in primes)
{
seno += (SumSub ? 1 : -1) * (decimal)Math.Pow(nGrau, (double)a) / Factorial(a);
SumSub = !SumSub;
}
Here's a snippet that adds up the first 10 terms of the Taylor series approximation of cosine:
var theta = 1.0m; // angle in radians
Enumerable.Range(1, 10).Aggregate(
new { term = 1.0m, accum = 0.0m },
(state, n) => new {
term = -state.term * theta * theta / (2 * n - 1) / (2 * n),
accum = state.accum + state.term},
state => state.accum)
See how it doesn't use an if, Math.Pow, or Factorial? The alternating signs are created simply by multiplying the last term by -1. Computing the ever-larger exponents and factorials for each term is not only expensive and a source of precision loss, it is also unnecessary.
To get x^2, x^4, x^6,... all you have to do is multiply each successive term by x^2. To get 1/1!, 1/3!, 1/5!,... all you have to do is divide each successive term by the next two numbers in the series. Start with 1; to get 1/3!, divide by 2 and then 3; to get 1/5! divide by 4 and then 5, and so on.
Note that I used the m prefix to denote decimal values because I'm assuming that you're trying to do your calculations in decimal for some reason (otherwise you would use Math.Cos).
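The same trick answers the original sine question: start the running term at θ instead of 1 and divide by (2n) and (2n + 1) to step through the odd-power series, again with no Pow or Factorial:

```csharp
using System;
using System.Linq;

var theta = 1.0m; // angle in radians
var sine = Enumerable.Range(1, 10).Aggregate(
    new { term = theta, accum = 0.0m },
    (state, n) => new
    {
        // next term: multiply by -theta^2 and divide by the next two
        // integers of the growing factorial, (2n) and (2n + 1)
        term = -state.term * theta * theta / (2 * n) / (2 * n + 1),
        accum = state.accum + state.term
    },
    state => state.accum);

Console.WriteLine(sine); // ≈ 0.8414709848 for theta = 1
```

With 10 terms of this alternating series the error for |θ| ≤ 1 is far below decimal's precision.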

C# Can LinearRegression code from Math.NET Numerics be made faster?

I need to do multiple linear regression efficiently. I am trying to use the Math.NET Numerics package but it seems slow - perhaps it is the way I have coded it? For this example I have only simple (1 x value) regression.
I have this snippet:
public class barData
{
public double[] Xs;
public double Mid;
public double Value;
}
public List<barData> B;
var xdata = B.Select(x=>x.Xs[0]).ToArray();
var ydata = B.Select(x => x.Mid).ToArray();
var X = DenseMatrix.CreateFromColumns(new[] { new DenseVector(xdata.Length, 1), new DenseVector(xdata) });
var y = new DenseVector(ydata);
var p = X.QR().Solve(y);
var b = p[0];
var a = p[1];
B[0].Value = (a * (B[0].Xs[0])) + b;
This runs about 20x SLOWER than this pure C#:
double xAvg = 0;
double yAvg = 0;
for (int x = Length - 1; x >= 0; x--)
{
    xAvg += B[x].Xs[0];
    yAvg += B[x].Mid;
}
xAvg = xAvg / B.Count;
yAvg = yAvg / B.Count;
double v1 = 0;
double v2 = 0;
for (int x = Length - 1; x >= 0; x--)
{
    v1 += (B[x].Xs[0] - xAvg) * (B[x].Mid - yAvg);
    v2 += (B[x].Xs[0] - xAvg) * (B[x].Xs[0] - xAvg);
}
double a = v1 / v2;
double b = yAvg - a * xAvg;
B[0].Value = (a * B[Length - 1].Xs[0]) + b;
ALSO if Math.NET is the issue, then if anyone knows simple way to alter my pure code for multiple Xs I would be grateful of some help
Using a QR decomposition is a very generic approach that can deliver least-squares regression solutions for any function with linear parameters, no matter how complicated. It is therefore not surprising that it cannot compete on computation time with a very specific direct implementation, especially not in the simple case of y: x -> a + b*x. Unfortunately, Math.NET Numerics does not yet provide direct regression routines that you could use instead.
However, there are still a couple things you can try for better speed:
Use a thin instead of a full QR decomposition, i.e. pass QRMethod.Thin to the QR method
Use our native MKL provider (much faster QR, but no longer purely managed code)
Tweak threading, e.g. try to disable multi-threading completely (Control.ConfigureSingleThread()) or tweak its parameters
If the data set is very large there are also more efficient ways to build the matrix, but that's likely not very relevant compared to the QR itself (do a performance analysis!).
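For the last part of the question, extending the pure-C# code to multiple Xs: one library-free option is to solve the normal equations (XᵀX)β = Xᵀy with a small Gaussian elimination. This is a sketch, not the Math.NET API, and for ill-conditioned data the QR approach above is numerically safer; the synthetic data at the bottom is just for illustration:

```csharp
using System;

// Solve (X^T X) beta = X^T y by Gaussian elimination with partial
// pivoting. Each row of xs holds the predictor values for one
// observation; the intercept column is added internally.
static double[] FitMultiple(double[][] xs, double[] y)
{
    int n = xs.Length, k = xs[0].Length + 1; // +1 for the intercept
    var a = new double[k, k + 1];            // augmented normal equations
    for (int i = 0; i < n; i++)
    {
        var xi = new double[k];
        xi[0] = 1.0;
        Array.Copy(xs[i], 0, xi, 1, k - 1);
        for (int r = 0; r < k; r++)
        {
            for (int c = 0; c < k; c++) a[r, c] += xi[r] * xi[c];
            a[r, k] += xi[r] * y[i];
        }
    }
    for (int col = 0; col < k; col++)
    {
        int piv = col; // partial pivoting for numerical stability
        for (int r = col + 1; r < k; r++)
            if (Math.Abs(a[r, col]) > Math.Abs(a[piv, col])) piv = r;
        for (int c = 0; c <= k; c++)
            (a[col, c], a[piv, c]) = (a[piv, c], a[col, c]);
        for (int r = col + 1; r < k; r++)
        {
            double f = a[r, col] / a[col, col];
            for (int c = col; c <= k; c++) a[r, c] -= f * a[col, c];
        }
    }
    var beta = new double[k]; // back substitution
    for (int r = k - 1; r >= 0; r--)
    {
        double s = a[r, k];
        for (int c = r + 1; c < k; c++) s -= a[r, c] * beta[c];
        beta[r] = s / a[r, r];
    }
    return beta; // beta[0] = intercept, beta[1..] = slopes
}

// Exact synthetic data: y = 1 + 2*x1 + 3*x2.
var data = new[]
{
    new[] { 1.0, 2.0 }, new[] { 2.0, 1.0 }, new[] { 3.0, 4.0 },
    new[] { 4.0, 3.0 }, new[] { 5.0, 5.0 },
};
var ys = new double[data.Length];
for (int i = 0; i < data.Length; i++)
    ys[i] = 1.0 + 2.0 * data[i][0] + 3.0 * data[i][1];

var beta = FitMultiple(data, ys);
Console.WriteLine($"{beta[0]:F3} {beta[1]:F3} {beta[2]:F3}"); // 1.000 2.000 3.000
```

With one predictor this reduces to exactly the xAvg/yAvg formulas in the question's hand-written version.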

How can I convert this divide and conquer code to compare one point to a list of points?

I found this code on the website http://rosettacode.org/wiki/Closest-pair_problem and I adapted the C# version of the divide and conquer method of finding the closest pair of points, but what I am trying to do is adapt it to find only the closest point to one specific point. I have googled quite a bit and searched this website for examples, but found none quite like this. I am not entirely sure what to change so that it checks the list against one point rather than finding the two closest points within the list. I'd like my program to run as fast as possible, because it could be searching a list of several thousand Points to find the one closest to my current coordinate Point.
public class Segment
{
public Segment(PointF p1, PointF p2)
{
P1 = p1;
P2 = p2;
}
public readonly PointF P1;
public readonly PointF P2;
public float Length()
{
return (float)Math.Sqrt(LengthSquared());
}
public float LengthSquared()
{
return (P1.X - P2.X) * (P1.X - P2.X)
+ (P1.Y - P2.Y) * (P1.Y - P2.Y);
}
}
public static Segment Closest_BruteForce(List<PointF> points)
{
int n = points.Count;
var result = Enumerable.Range(0, n - 1)
.SelectMany(i => Enumerable.Range(i + 1, n - (i + 1))
.Select(j => new Segment(points[i], points[j])))
.OrderBy(seg => seg.LengthSquared())
.First();
return result;
}
public static Segment MyClosestDivide(List<PointF> points)
{
return MyClosestRec(points.OrderBy(p => p.X).ToList());
}
private static Segment MyClosestRec(List<PointF> pointsByX)
{
int count = pointsByX.Count;
if (count <= 4)
return Closest_BruteForce(pointsByX);
// left and right lists sorted by X, as order retained from full list
var leftByX = pointsByX.Take(count / 2).ToList();
var leftResult = MyClosestRec(leftByX);
var rightByX = pointsByX.Skip(count / 2).ToList();
var rightResult = MyClosestRec(rightByX);
var result = rightResult.Length() < leftResult.Length() ? rightResult : leftResult;
// There may be a shorter distance that crosses the divider
// Thus, extract all the points within result.Length either side
var midX = leftByX.Last().X;
var bandWidth = result.Length();
var inBandByX = pointsByX.Where(p => Math.Abs(midX - p.X) <= bandWidth);
// Sort by Y, so we can efficiently check for closer pairs
var inBandByY = inBandByX.OrderBy(p => p.Y).ToArray();
int iLast = inBandByY.Length - 1;
for (int i = 0; i < iLast; i++)
{
var pLower = inBandByY[i];
for (int j = i + 1; j <= iLast; j++)
{
var pUpper = inBandByY[j];
// Comparing each point to successively increasing Y values
// Thus, can terminate as soon as deltaY is greater than best result
if ((pUpper.Y - pLower.Y) >= result.Length())
break;
Segment segment = new Segment(pLower, pUpper);
if (segment.Length() < result.Length())
result = segment;// new Segment(pLower, pUpper);
}
}
return result;
}
I used this code in my program to see the actual difference in speed and divide and conquer easily wins.
var randomizer = new Random(10);
var points = Enumerable.Range(0, 10000).Select(i => new PointF((float)randomizer.NextDouble(), (float)randomizer.NextDouble())).ToList();
Stopwatch sw = Stopwatch.StartNew();
var r1 = Closest_BruteForce(points);
sw.Stop();
//Debugger.Log(1, "", string.Format("Time used (Brute force) (float): {0} ms", sw.Elapsed.TotalMilliseconds));
richTextBox.AppendText(string.Format("Time used (Brute force) (float): {0} ms", sw.Elapsed.TotalMilliseconds));
Stopwatch sw2 = Stopwatch.StartNew();
var result2 = MyClosestDivide(points);
sw2.Stop();
//Debugger.Log(1, "", string.Format("Time used (Divide & Conquer): {0} ms", sw2.Elapsed.TotalMilliseconds));
richTextBox.AppendText(string.Format("Time used (Divide & Conquer): {0} ms", sw2.Elapsed.TotalMilliseconds));
//Assert.Equal(r1.Length(), result2.Length());
You can store the points in a better data structure that takes advantage of their position. Something like a quadtree.
The divide and conquer algorithm that you are trying to use doesn't really apply to this problem.
Don't use this algorithm at all; just go through the list one at a time, comparing the distance to your reference point, and at the end return the point that was the closest. This will be O(n).
You can probably add some extra speed ups but this should be good enough.
I can write some example code if you want.
You're mixing up two different problems. The only reason divide and conquer for the closest pair problem is faster than brute force is that it avoids comparing every point to every other point, so that it gets O(n log n) instead of O(n * n). But finding the closest point to just one point is just O(n). How can you find the closest point in a list of n points, while examining less than n points? What you're trying to do doesn't even make sense.
I can't say why your divide and conquer runs in less time than your brute force; maybe the linq implementation runs slower. But I think you'll find two things: 1) Even if, in absolute terms, your implementation of divide and conquer for 1 point runs in less time than your implementation of brute force for 1 point, they still have the same O(n). 2) If you just try a simple foreach loop and record the lowest distance squared, you'll get even better absolute time than your divide and conquer - and, it will still be O(n).
public static float LengthSquared(PointF P1, PointF P2)
{
return (P1.X - P2.X) * (P1.X - P2.X)
+ (P1.Y - P2.Y) * (P1.Y - P2.Y);
}
If, as your question states, you want to compare 1 (known) point to a list of points to find the closest then use this code.
public static Segment Closest_BruteForce(PointF P1, List<PointF> points)
{
    // PointF is a struct and cannot be null; default(PointF) plus
    // minDist tracks the best match found so far
    PointF closest = default;
    float minDist = float.MaxValue;
    foreach (PointF P2 in points)
    {
        if (P1 != P2)
        {
            float temp = LengthSquared(P1, P2);
            if (temp < minDist)
            {
                minDist = temp;
                closest = P2;
            }
        }
    }
    return new Segment(P1, closest);
}
However, if as your example shows, you want to find the closest 2 points from a list of points try the below.
public static Segment Closest_BruteForce(List<PointF> points)
{
    // initialize so the locals are definitely assigned at the return
    PointF closest1 = default;
    PointF closest2 = default;
    float minDist = float.MaxValue;
    for (int x = 0; x < points.Count; x++)
    {
        PointF P1 = points[x];
        for (int y = x + 1; y < points.Count; y++)
        {
            PointF P2 = points[y];
            float temp = LengthSquared(P1, P2);
            if (temp < minDist)
            {
                minDist = temp;
                closest1 = P1;
                closest2 = P2;
            }
        }
    }
    return new Segment(closest1, closest2);
}
note the code above was written in the browser and may have some syntax errors.
EDIT Odd... is this an acceptable answer or not? Down-votes without explanation, oh well.
