T-Test and P-Value - Math.NET Numerics class - C#

Is there a method to get the t-test value and p-value from the StudentT class? I am trying to calculate those values with this library: https://numerics.mathdotnet.com/api/MathNet.Numerics.Distributions/StudentT.htm
The Excel equivalent is the T-Test function, but it is not among the Excel functions Math.NET Numerics provides. The link below shows all the Excel methods:
https://numerics.mathdotnet.com/api/MathNet.Numerics/ExcelFunctions.htm
Any pointers on how to get this done with the above library would be of great help.
Thank you.

I don't know how to do this in Math.Net, but I do know how to do it in Meta.Numerics:
Sample sample1 = new Sample(1.0, 2.0, 3.0, 4.0);
Sample sample2 = new Sample(3.0, 4.0, 5.0, 6.0);
TestResult tTest = Sample.StudentTTest(sample1, sample2);
Console.WriteLine("t = {0}", tTest.Statistic);
Console.WriteLine("P = {0} ({1})", tTest.Probability, tTest.Type);
I understand if you aren't interested in changing math libraries, but since your question had gone a while with no answer, I thought I'd chime in.

Here is how to perform a left-tailed t test using the MathNet.Numerics set of libraries.
Assume you have some data in an array and you have already calculated all the required preliminary statistics, as shown below.
int n = 5; // size of sample
double hypothesizedMean = 50.0;
double sampleMean = 50.03;
double sampleSD = 1.5; // sample standard deviation (n-1)
double stdErr = sampleSD / Math.Sqrt(n); // standard error of the mean
double t = (sampleMean - hypothesizedMean) / stdErr; // convert to a standard mean of 0 and SD of 1
StudentT st = new StudentT(0, 1, n - 1); // create a standard StudentT object with n-1 DOF
double pVal = st.CumulativeDistribution(t); // left-tail p-value
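If you need a two-tailed p-value instead (Excel's T-Test with tails = 2 reports two-tailed probabilities), a minimal sketch reusing the same t statistic and StudentT object from above:
double leftTail = st.CumulativeDistribution(t);
double twoTailedPVal = 2.0 * Math.Min(leftTail, 1.0 - leftTail); // two-tailed p-value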


What parameters should I be using for the LogNormal and Normal Distribution in MATH.NET

I've tried several combinations of Mathdotnet's LogNormal and Normal classes: https://numerics.mathdotnet.com/api/MathNet.Numerics.Distributions/LogNormal.
I seem to get a lot closer to the result I'm looking for when using the mean and standard deviation as parameters. However, I notice that when I use larger numbers, like numberOfMinutes, my results do not deviate past the mean the way they do with smaller numbers like numberOfDays. I know I'm not thinking about this right and could use some help.
Also, I'd like to use the geometric mean instead of the mean, but I don't know what parameter to use for the variance, given that I couldn't even pinpoint how to use it for the mean.
Finally, I hope the answer to this also answers the same issue I'm having with the Normal distribution.
List<double> numberOfDays = new List<double> { 10, 12, 18, 30 };
double mean = numberOfDays.Mean(); // 17.5
double geometricMean = numberOfDays.GeometricMean(); // 15.954
double variance = numberOfDays.Variance(); // 81
double standardDeviation = numberOfDays.StandardDeviation(); // 9
// Do I need a Geometric Standard Deviation or Variance
double numberOfDaysSampleMV = LogNormal.WithMeanVariance(mean, variance).Sample(); // One example sample yielded 40.23
double numberOfDaysSampleMSD = LogNormal.WithMeanVariance(mean, standardDeviation).Sample(); // One example sample yielded 17.33
I believe you are confused about the parameters required. Using conventional notation, you have a set X which you believe is LogNormal:
X = { 10, 12, 18, 30 }
mean: m = 17.5
standard deviation: sd = 9
from this you derive set Y which is Normal:
Y = {2.30,2.48,2.89,3.4}
mean: mu = 2.77
standard deviation: sigma = 0.487
Note that mu and sigma are computed from Y, not X. To create a sample of the LogNormal data, you use mu and sigma, not m and sd.
double[] sample = new double[100];
LogNormal.Samples(sample, mu, sigma);
This is consistent with the Wikipedia article on the LogNormal distribution. The Numerics documentation is not clear.
Here is my test program which might be useful:
List<double> X = new List<double> { 10, 12, 18, 30 }; // assume to be LogNormal
double m = X.Mean(); // mean of log normal values = 17.5
double sd = X.StandardDeviation(); // standard deviation of log normal values = 9
List<double> Y = new List<double> { };
for (int i = 0; i < 4; i++)
{
Y.Add(Math.Log(X[i]));
}
// Y = {2.30,2.48,2.89,3.4}
double mu = Y.Mean(); // mean of normal values = 2.77
double sigma = Y.StandardDeviation(); // standard deviation of normal values = 0.487
double[] sample = new double[100];
LogNormal.Samples(sample, mu, sigma); // get sample
double sample_m = sample.Mean(); // 17.93, approximates m
double sample_sd = sample.StandardDeviation(); // 8.98, approximates sd
sample = new double[100];
Normal.Samples(sample, mu, sigma); // get sample
double sample_mu = sample.Mean(); //2.77, approximates mu
double sample_sigma = sample.StandardDeviation(); //0.517 approximates sigma
Using your test program above, my samples came out like this.
Using LogNormal(mu, sigma)
I'm ultimately concerned about the values greater than 30 and less than 10.
However, by trial and error [accidentally], when I use the following method to get the samples, using the original m and sd variables from your test program, I get the results I'm looking for. I do not want to go forward with something I did accidentally.
sample = new double[100];
for (int i = 0; i < 100; i++)
{
sample[i] = LogNormal.WithMeanVariance(m, sd).Sample();
}
Using LogNormal.WithMeanVariance(m, sd)
My values are consistently between the Min and Max and concentrated around the Mean.
My example shows pretty clearly how to get a LogNormal sample that has the mean and standard deviation of the original data.
The min/max of 10/30 is unrealistic if you are going to create your samples based on the mean and standard deviation of the sample. Suppose you took a random sample of the weights of 4 people out of a population of 1000. Would you expect your sample to include both the lightest and heaviest people in the population?
LogNormal.WithMeanVariance(m, sd) is wrong: the units don't match. That method expects a variance, which has squared units, while sd has units of days.
I suggest you a) use LogNormal(mu,sigma) and discard any values that are outside your min/max range or b) use LogNormal(mu,c*sigma) for some value of c less than one to reduce the variance enough that all the values are in your min/max range. The choice depends on the nature of your project.
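A minimal sketch of option (a), rejection sampling with the mu and sigma from the test program above; the 10/30 bounds and the sample size of 100 are just taken from the question:
var dist = new LogNormal(mu, sigma); // requires using MathNet.Numerics.Distributions;
var kept = new List<double>();       // requires using System.Collections.Generic;
while (kept.Count < 100)
{
    double v = dist.Sample();
    if (v >= 10.0 && v <= 30.0) kept.Add(v); // discard values outside the min/max range
}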
The Wikipedia entry on the LogNormal distribution has formulas for computing mu and sigma from m and sd which might be better than calculating from the Y data.
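For reference, a minimal sketch of those formulas (moment matching: sigma^2 = ln(1 + sd^2/m^2) and mu = ln(m) - sigma^2/2), reusing the m and sd from the test program:
double sigmaSq = Math.Log(1.0 + (sd * sd) / (m * m)); // sigma^2 = ln(1 + sd^2/m^2)
double muFromMoments = Math.Log(m) - 0.5 * sigmaSq;   // roughly 2.74
double sigmaFromMoments = Math.Sqrt(sigmaSq);         // roughly 0.48
// Close to the mu = 2.77 and sigma = 0.487 computed from Y above, which were small-sample estimates.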

Could someone describe a 2d interpolation method that is better than bilinear interpolation?

I have a grid of data points that I currently use bilinear interpolation on to find the missing points in the grid. I was pointed in the direction of kriging, a.k.a. the best linear unbiased estimator, but I was unable to find good source code or an algebraic explanation. Does anyone know of any other interpolation methods I could use?
--Update
@Sam Greenhalgh
I have considered bicubic interpolation, but the results I got using the code example I found seemed off.
Here is the code example for bicubic interpolation.
Note: I am coding in C#, but I welcome examples from other languages as well.
// p is an array of 4 points along one axis
double cubicInterpolate(double[] p, double x)
{
    return p[1] + 0.5 * x * (p[2] - p[0] + x * (2.0 * p[0] - 5.0 * p[1] + 4.0 * p[2] - p[3] + x * (3.0 * (p[1] - p[2]) + p[3] - p[0])));
}
// p is a 4x4 grid of points
public double bicubicInterpolate(double[][] p, double x, double y)
{
    double[] arr = new double[4];
    arr[0] = cubicInterpolate(p[0], y);
    arr[1] = cubicInterpolate(p[1], y);
    arr[2] = cubicInterpolate(p[2], y);
    arr[3] = cubicInterpolate(p[3], y);
    return cubicInterpolate(arr, x);
}
double[][] p = {
new double[4]{2.728562594,2.30599759,1.907579158,1.739559264},
new double[4]{3.254756633,2.760758022,2.210417411,1.979012766},
new double[4]{4.075740069,3.366434527,2.816093916,2.481060234},
new double[4]{5.430966401,4.896723504,4.219613391,4.004306461}
};
Console.WriteLine(CI.bicubicInterpolate(p, 2, 2));
One widely-used interpolation method is kriging (or Gaussian process regression).
However, the use of kriging is not advised when your data points are on a regular grid. The Euclidean distances between data points are used to adjust the parameters of the model, but on a grid there are far fewer distinct distance values than in, say, a randomly scattered set of points.
Nevertheless, even if your data points are regularly placed, it could be interesting to give it a try. If you are interested, you can use the following software packages:
DiceKriging package in R language (there exist others like kriging, gstat...)
DACE toolbox in Matlab
STK in Matlab/Octave
And many others (in python for example)...
NOTE: It may be worth noting (I do not know exactly in what context you want to apply kriging) that the kriging interpolation property can very easily be relaxed in order to take into account, for example, possible measurement errors.
If your data points are on a regular grid, I would recommend using a piecewise linear spline in two dimensions. You could fill the data for the rows (x-values) first, then fill the data for the columns (y-values).
Math.NET Numerics has the piecewise linear spline function that you would need:
MathNet.Numerics.Interpolation.LinearSpline.InterpolateSorted
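As a minimal sketch of the 1-D building block (the x/y values here are illustrative, loosely based on the first row of the grid in the question); for the full grid you would interpolate along each row first and then along the column direction:
using MathNet.Numerics.Interpolation;

double[] xs = { 0.0, 1.0, 2.0, 3.0 };     // sorted sample positions
double[] ys = { 2.73, 2.31, 1.91, 1.74 }; // values at those positions
LinearSpline rowSpline = LinearSpline.InterpolateSorted(xs, ys);
double estimate = rowSpline.Interpolate(1.5); // piecewise linear estimate between x = 1 and x = 2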

Exponential based Curve-Fit using Math.Net

I'm very new to the Math.Net Library and I'm having problems trying to do curve-fitting based on an exponential function. More specifically I intend to use this function:
f(x) = a*exp(b*x) + c*exp(d*x)
Using MATLAB I get pretty good results (fitted-curve plot not shown here).
MATLAB calculates the following parameters:
f(x) = a*exp(b*x) + c*exp(d*x)
Coefficients (with 95% confidence bounds):
  a =      29.6   (29.49, 29.71)
  b =  0.000408   (0.0003838, 0.0004322)
  c =    -6.634   (-6.747, -6.521)
  d =  -0.03818   (-0.03968, -0.03667)
Is it possible to achieve these results using Math.Net?
Looking at Math.NET, it seems that it does various types of regression, whereas your function requires some type of iterative method, for instance Gauss-Newton's method, where you would use linear regression in each iteration to solve an (overdetermined) system of linear equations. This would still require some "manual" work to write the method.
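To make that concrete, here is a minimal Gauss-Newton sketch for f(x) = a*exp(b*x) + c*exp(d*x), using Math.NET's linear algebra types for the per-iteration least-squares step. The starting guess, fixed iteration count, and lack of damping are simplifications; with a poor starting point this can diverge (Levenberg-Marquardt adds the damping that addresses that).
using System;
using MathNet.Numerics.LinearAlgebra;

static double[] FitTwoExponentials(double[] x, double[] y, double[] start, int iterations = 50)
{
    var theta = Vector<double>.Build.DenseOfArray(start); // parameters (a, b, c, d)
    for (int iter = 0; iter < iterations; iter++)
    {
        var r = Vector<double>.Build.Dense(x.Length);    // residuals y - f(x)
        var J = Matrix<double>.Build.Dense(x.Length, 4); // Jacobian of f w.r.t. (a, b, c, d)
        for (int i = 0; i < x.Length; i++)
        {
            double ebx = Math.Exp(theta[1] * x[i]);
            double edx = Math.Exp(theta[3] * x[i]);
            r[i] = y[i] - (theta[0] * ebx + theta[2] * edx);
            J[i, 0] = ebx;                   // df/da
            J[i, 1] = theta[0] * x[i] * ebx; // df/db
            J[i, 2] = edx;                   // df/dc
            J[i, 3] = theta[2] * x[i] * edx; // df/dd
        }
        // Linearized least-squares step: solve J * delta = r via QR, then update the parameters.
        var delta = J.QR().Solve(r);
        theta += delta;
    }
    return theta.ToArray();
}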
No, it appears there is no exponential fit support at this time. However, there's a discussion on the Math.NET forums where a maintainer proposes a workaround:
https://discuss.mathdotnet.com/t/exponential-fit/131
Contents duplicated in case the link gets broken:
You can, by transforming it, similar to "Linearizing non-linear models by transformation". Something along the lines of the following should work:
double[] Exponential(double[] x, double[] y,
    DirectRegressionMethod method = DirectRegressionMethod.QR)
{
    // Fit ln(y) = ln(a) + r*x as a linear model, then transform the intercept back.
    double[] y_hat = Generate.Map(y, Math.Log);
    double[] p_hat = Fit.LinearCombination(x, y_hat, method, t => 1.0, t => t);
    return new[] { Math.Exp(p_hat[0]), p_hat[1] }; // (a, r) for y = a*exp(r*x)
}
Example usage:
double[] x = new[] { 1.0, 2.0, 3.0 };
double[] y = new[] { 2.0, 4.1, 7.9 };
double[] p = Exponential(x,y); // a=1.017, r=0.687
double[] yh = Generate.Map(x, k => p[0]*Math.Exp(p[1]*k)); // 2.02, 4.02, 7.98
The answer is: not yet, I believe. Basically, there is a contribution of the whole csmpfit package, but it has yet to be integrated into Math.NET. You could use it as a separate library for now and move to Math.NET after full integration. Link: http://csmpfit.codeplex.com

Hanning and Hamming window functions in C#

I'm trying to implement Hanning and Hamming window functions in C#. I can't find any .NET samples anywhere, and I'm not sure whether my attempts at converting from C++ samples do the job well.
My problem is mainly that, looking at the formulas, I imagine they need to have the original sample value somewhere on the right-hand side of the equation; I just don't get that from looking at them. (My math obviously isn't that good yet.)
What I have so far:
public Complex[] Hamming(Complex[] iwv)
{
Complex[] owv = new Complex[iwv.Length];
double omega = 2.0 * Math.PI / (iwv.Length);
// owv[i].Re = real number (raw wave data)
// owv[i].Im = imaginary number (0 since it hasn't gone through FFT yet)
for (int i = 1; i < owv.Length; i++)
// Translated from c++ sample I found somewhere
owv[i].Re = (0.54 - 0.46 * Math.Cos(omega * (i))) * iwv[i].Re;
return owv;
}
public Complex[] Hanning(Complex[] iwv)
{
Complex[] owv = new Complex[iwv.Length];
double omega = 2.0 * Math.PI / (iwv.Length);
for (int i = 1; i < owv.Length; i++)
owv[i].Re = (0.5 + (1 - Math.Cos((2d * Math.PI ) / (i -1)))); // Uhm... wrong
return owv;
}
Here's an example of a Hamming window in use in an open source C# application I wrote a while back. It's being used in a pitch detector for an autotune effect.
You can use the Math.NET library.
double[] hammingWindow = MathNet.Numerics.Window.Hamming(dataIn.Length);
for (int i = 0; i < dataIn.Length; i++)
{
    dataOut[i] = hammingWindow[i] * dataIn[i];
}
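For the Hanning half of the question, recent Math.NET Numerics versions also expose a Hann window; assuming that method is available in your version, the same pattern applies:
double[] hannWindow = MathNet.Numerics.Window.Hann(dataIn.Length);
for (int i = 0; i < dataIn.Length; i++)
{
    dataOut[i] = hannWindow[i] * dataIn[i];
}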
See my answer to a similar question:
https://stackoverflow.com/a/42939606/246758
The operation of "windowing" means multiplying a signal by a window function. This code you found appears to generate the window function and scale the original signal. The equations are for just the window function itself, not the scaling.

Trying to calculate Pi to N number of decimals with C#

Note: I've already read this topic, but I don't understand it and it doesn't provide a solution I could use. I'm terrible with number problems.
What's a simple way to generate Pi to however many decimals a user wants? This isn't for homework; I'm just trying to complete some of the projects listed here:
Link
A classic algorithm for calculating digits of pi is the Gauss-Legendre algorithm. While it is not as fast as some of the more modern algorithms it does have the advantage of being understandable.
Let
a_0 = 1
b_0 = 1/Sqrt(2)
t_0 = 1/4
p_0 = 1
Then
a_(n+1) = (a_n + b_n) / 2
b_(n+1) = Sqrt(a_n * b_n)
t_(n+1) = t_n - p_n * (a_n - a_(n+1))^2
p_(n+1) = 2 * p_n
Then
pi =. (a_n + b_n)^2 / (4 * t_n)
Here "=." means "approximately equal to". This algorithm exhibits quadratic convergence (the number of correct decimal places roughly doubles with each iteration).
I'll leave it to you to translate this to C# including discovering an arbitrary-precision arithmetic library.
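For orientation only, here is a minimal double-precision sketch of the iteration above. Plain double only yields about 15 correct digits, so for "N decimals" you would swap in an arbitrary-precision type; that substitution is the real work the answer leaves to you.
using System;

static double GaussLegendrePi(int iterations = 4)
{
    double a = 1.0;
    double b = 1.0 / Math.Sqrt(2.0);
    double t = 0.25;
    double p = 1.0;
    for (int i = 0; i < iterations; i++)
    {
        double aNext = (a + b) / 2.0;
        b = Math.Sqrt(a * b);
        t -= p * (a - aNext) * (a - aNext);
        p *= 2.0;
        a = aNext;
    }
    return (a + b) * (a + b) / (4.0 * t);
}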
The topic you're talking about calculates the value of PI using a Taylor series. Using the function double F(int i) written in that topic will give you the value of PI after i terms.
This way of calculating PI is kind of slow; I suggest you look at fast PI algorithms.
You can also find an implementation here that calculates PI to the n-th digit.
Good luck!
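For illustration, the series in question is the arctan(1) Taylor (Gregory-Leibniz) series, pi = 4*(1 - 1/3 + 1/5 - 1/7 + ...); a sketch (not the exact F from the linked topic) that shows why convergence is slow:
static double LeibnizPi(int terms)
{
    double sum = 0.0;
    for (int k = 0; k < terms; k++)
    {
        sum += (k % 2 == 0 ? 1.0 : -1.0) / (2 * k + 1); // alternating 1/(2k+1) terms
    }
    return 4.0 * sum; // needs millions of terms for just 6-7 correct digits
}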
If you take a close look at this really good guide:
Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4
you'll find on page 70 this cute implementation (with minor changes from my side):
// Requires: using System.Collections.Concurrent; and using System.Threading.Tasks;
static decimal ParallelPartitionerPi(int steps)
{
    decimal sum = 0.0m;
    decimal step = 1.0m / steps;
    object obj = new object();
    Parallel.ForEach(Partitioner.Create(0, steps),
        () => 0.0m,
        (range, state, partial) =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                decimal x = (i + 0.5m) * step;
                partial += 4.0m / (1.0m + x * x); // midpoint rule for the integral of 4/(1+x^2) on [0,1]
            }
            return partial;
        },
        partial => { lock (obj) sum += partial; });
    return step * sum;
}
