Why does this LinqPad program produce different results on the second run?

I am developing a simple PID controller in LinqPad:
The PID Class
class PidController
{
public float Proportional { get; set; }
public float Integral { get; set; }
public float Derivative { get; set; }
public float SetPoint { get; set; }
public float Kp { get; set; }
public float Ki { get; set; }
public float Kd { get; set; }
float _lastError;
DateTime _lastTime = DateTime.Now;
public PidController(float kp, float ki, float kd)
{
Kp = kp; Ki = ki; Kd = kd;
}
public float GetControlValue(float actual)
{
var currentTime = DateTime.Now;
var deltaTime = (float)(currentTime - _lastTime).TotalSeconds;
var error = SetPoint - actual;
Proportional = error;
Integral = Integral + error * deltaTime;
Derivative = (error - _lastError) / deltaTime;
_lastError = error;
return Kp * Proportional + Ki * Integral + Kd * Derivative;
}
}
For testing and tuning, the controller will control this simple process:
The Controlled Process
class SimpleProcess
{
private DateTime _lastTime = DateTime.Now;
private float _output;
public float Output { get { UpdateOutput(); return _output; }}
public float Input { get; set; }
private void UpdateOutput()
{
var deltaTime = (float)(DateTime.Now - _lastTime).TotalSeconds;
_output += Input * deltaTime;
}
}
...using this main loop:
The Main Program Loop
void Main()
{
var pid = new PidController(1f, 0f, 0f) { SetPoint = 100f };
var proc = new SimpleProcess();
// pid.Dump();
// proc.Dump();
var values = new List<ProcessValue>();
for (int i = 0; i < 50; i++)
{
var actual = proc.Output;
var controlValue = pid.GetControlValue(actual);
proc.Input = controlValue;
var value = new ProcessValue
{
index = i,
timestamp = DateTime.Now.ToString("ss.fff"),
p = pid.Proportional,
i = pid.Integral,
d = pid.Derivative,
input = controlValue,
output = actual
};
values.Add(value);
Thread.Sleep(100);
}
values.Dump();
}
public class ProcessValue
{
public int index;
public string timestamp;
public float p, i, d, input, output;
}
Everything works as expected on the first run:
index timestamp p i d input output
0 53.309 100 0.46 21490.59 100 0
1 53.411 89.69 10.06 -96.27 89.69 10.30
etc...
However, I started getting unexpected results on the second and subsequent runs after I commented out the line proc.Dump():
index timestamp p i d input output
0 10.199 100 0 ∞ NaN 0
1 10.299 NaN NaN NaN NaN NaN
2 10.399 NaN NaN NaN NaN NaN
etc...
Why is the second run (and subsequent runs) returning different results in my case?
Any of the following actions will cause the next run to succeed:
modify the code (even just adding/removing a single whitespace)
press [CTRL]+[SHIFT]+[F5]
The following makes the code run correctly every time:
uncomment the line proc.Dump()
This answer mentions that static variables will be cached between runs, but I have no static variables. I suspect the problem is related to the Application Domain Caching feature in LinqPad, but I'm trying to understand why I'm affected by this.
Update
StriplingWarrior's answer is correct: my first derivative calculation resulted in Infinity when the system was performing well (i.e. after LinqPad had cached the first run), causing all subsequent calculations to fail. Modifying my program in any way invalidated this cache and made deltaTime large enough to avoid the error on the next run.
Since a derivative term makes no sense on the first interval, I decided to handle this by simply ignoring it:
var p = Kp * Proportional;
var i = Ki * Integral;
var d = float.IsInfinity(Derivative) ? 0 : Kd * Derivative;
return p + i + d;

You can test what Andrew theorizes in your comments above, by changing the first part of your main method thusly:
var sw = new Stopwatch();
sw.Start();
var pid = new PidController(1f, 0f, 0f) { SetPoint = 100f };
var proc = new SimpleProcess();
// pid.Dump();
// proc.Dump();
var values = new List<ProcessValue>();
for (int i = 0; i < 50; i++)
{
var actual = proc.Output;
var controlValue = pid.GetControlValue(actual);
if(sw.IsRunning){
sw.Stop();
sw.ElapsedTicks.Dump();
}
Running on my machine, I can see that the first run takes 10,000+ ticks, whereas the second run takes only 20 ticks. I'm guessing this leaves the DateTime.Now differences in your calculations very small (possibly zero), which would explain the results you're seeing.
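A minimal sketch of how a zero deltaTime would produce the ∞ and NaN values in the question's second table. It assumes deltaTime came out as exactly 0 on the first call and that Kd is 0, as in the question; the class and method names are illustrative only and not part of the original program.

```csharp
using System;

static class FirstIntervalDemo
{
    // Mirrors the question's derivative term: (error - lastError) / deltaTime.
    // Floating-point division by zero does not throw in C#; it yields Infinity.
    public static float Derivative(float error, float lastError, float deltaTime)
        => (error - lastError) / deltaTime;

    // Mirrors the question's return expression Kp * P + Ki * I + Kd * D.
    public static float Control(float kp, float ki, float kd, float p, float i, float d)
        => kp * p + ki * i + kd * d;
}
```

FirstIntervalDemo.Derivative(100f, 0f, 0f) is positive infinity, and passing that value to Control with kd = 0 gives NaN, because 0 * ∞ is NaN in IEEE 754 arithmetic. Once NaN reaches proc.Input, every later output is NaN as well.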

Related

Linear regression in a list with linq

I have a list of 'steps' that form a ramp series. Each step has a start value, an end value and a duration.
[Figure: example plot of the ramp series]
It is guaranteed that the start value of each step equals the end value of the previous one. It's a monotonic function.
Now I need to get the value at a given time. I already have a working implementation using good old foreach, but I wonder if there is some clever way to do it with LINQ. Perhaps someone has an idea for replacing the GetValueAt function?
class Program
{
class Step
{
public double From { get; set; }
public double To { get; set; }
public int Duration { get; set; }
}
static void Main(string[] args)
{
var steps = new List<Step>
{
new Step { From = 0, To = 10, Duration = 20},
new Step { From = 10, To = 12, Duration = 10},
};
const double doubleTolerance = 0.001;
// test turning points
Debug.Assert(Math.Abs(GetValueAt(steps, 0) - 0) < doubleTolerance);
Debug.Assert(Math.Abs(GetValueAt(steps, 20) - 10) < doubleTolerance);
Debug.Assert(Math.Abs(GetValueAt(steps, 30) - 12) < doubleTolerance);
// test linear interpolation
Debug.Assert(Math.Abs(GetValueAt(steps, 10) - 5) < doubleTolerance);
Debug.Assert(Math.Abs(GetValueAt(steps, 25) - 11) < doubleTolerance);
}
static double GetValueAt(IList<Step> steps, int seconds)
{
// guard statements if seconds is within steps omitted here
var runningTime = steps.First().Duration;
var runningSeconds = seconds;
foreach (var step in steps)
{
if (seconds <= runningTime)
{
var x1 = 0; // stepStartTime
var x2 = step.Duration; // stepEndTime
var y1 = step.From; // stepStartValue
var y2 = step.To; // stepEndValue
var x = runningSeconds;
// linear interpolation
return y1 + (y2 - y1) / (x2 - x1) * (x - x1);
}
runningTime += step.Duration;
runningSeconds -= step.Duration;
}
return double.NaN;
}
}

You could try Aggregate:
static double GetValueAt(IList<Step> steps, int seconds)
{
var (value, remaining) = steps.Aggregate(
(Value: 0d, RemainingSeconds: seconds),
(secs, step) =>
{
if (secs.RemainingSeconds > step.Duration)
{
return (step.To, secs.RemainingSeconds - step.Duration);
}
else
{
return (secs.Value + ((step.To - step.From) / step.Duration) * secs.RemainingSeconds, 0);
}
});
return remaining > 0 ? double.NaN : value;
}

Let's ignore LINQ for a moment.
For a small number of steps, your foreach approach is quite effective. If you can arrange for callers to request times in ascending order rather than at random, you can also optimize the lookup: think of an iterator that only moves forward when the requested point is not on the current step.
If the number of steps grows and you need to access values in random order, you might want to introduce a balanced tree structure for finding the right step.
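One way to sketch the random-access idea above: here a sorted array of cumulative end times with binary search stands in for the balanced tree, which is enough when the step list does not change after construction. RampLookup is a hypothetical helper, the Step class mirrors the one in the question, and the sketch assumes every Duration is positive.

```csharp
using System;
using System.Collections.Generic;

// Same shape as the question's Step class.
class Step
{
    public double From { get; set; }
    public double To { get; set; }
    public int Duration { get; set; }
}

// Hypothetical helper: precomputes cumulative end times once,
// then answers each lookup with a binary search.
class RampLookup
{
    private readonly IList<Step> _steps;
    private readonly int[] _endTimes; // cumulative end time of each step

    public RampLookup(IList<Step> steps)
    {
        _steps = steps;
        _endTimes = new int[steps.Count];
        var running = 0;
        for (var i = 0; i < steps.Count; i++)
        {
            running += steps[i].Duration; // assumed > 0
            _endTimes[i] = running;
        }
    }

    public double GetValueAt(int seconds)
    {
        if (_steps.Count == 0 || seconds < 0 || seconds > _endTimes[_endTimes.Length - 1])
            return double.NaN;

        // BinarySearch returns the index on an exact hit, otherwise the
        // bitwise complement of the index of the first larger element.
        var index = Array.BinarySearch(_endTimes, seconds);
        if (index < 0) index = ~index;

        var step = _steps[index];
        var startTime = _endTimes[index] - step.Duration;
        // Linear interpolation within the selected step.
        return step.From + (step.To - step.From) / step.Duration * (seconds - startTime);
    }
}
```

With the two steps from the question, new RampLookup(steps).GetValueAt(25) selects the second step and interpolates to 11. Each lookup costs O(log n) instead of a scan over all steps.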

Linear Regression with NuML

I'm trying to do a really basic linear regression (Z = 2 * X + 1) prediction using NuML. Given the data is so linear I can't understand why the predicted value is so far off unless I am doing something wrong. I have the target class
public class Sample
{
public float V { get; set; }
public float X { get; set; }
public float Y { get; set; }
public float Z { get; set; }
public Func<float, float, float, float> OutputStrategy { get; set; }
public Sample(Func<float, float, float, float> outputStrategy)
{
OutputStrategy = outputStrategy;
}
public void Seed(int i)
{
V = (float) i;
X = (float) 2 * i;
Y = (float) 3 * i;
Z = OutputStrategy(V, X, Y);
}
}
and I have the NuML code to set up the source values and predict an answer for an arbitrary new data point:
NB: The output strategy is a simple 2 * A + 1. I've tried it with multivariate analysis and the prediction is further away
public static void Main(string[] args)
{
// Generate sample data
int sampleSize = 1000;
Sample[] samples = new Sample[sampleSize];
Func<float, float, float, float> outputStrategy = (A, B, C) => 2 * A + 1;
for (int i = 0; i < sampleSize; i++)
{
samples[i] = new Sample(outputStrategy);
samples[i].Seed(i);
}
// calculate model
var generator = new LinearRegressionGenerator();
var descriptor = Descriptor.New("Samples")
.With("V").As(typeof(float))
.With("X").As(typeof(float))
.With("Y").As(typeof(float))
.Learn("Z").As(typeof(float));
generator.Descriptor = descriptor;
var model = Learner.Learn(samples, 0.6, 50, generator);
// Use prediction
var targetSample = new Sample(outputStrategy);
targetSample.Seed(sampleSize + 1);
var predictedSample = model.Model.Predict(targetSample);
var predictedValue = predictedSample.Z;
var actualValue = outputStrategy(targetSample.V, targetSample.X, targetSample.Y);
Console.Write("Predicted Value = {0}, Actual Value = {1}, Difference = {2} {3:0.00}%", predictedValue, actualValue, actualValue - predictedValue, (decimal) (actualValue - predictedValue) / (decimal) predictedValue * 100M);
Console.ReadKey();
}
This gives a difference of about 0.5% which considering the line is completely straight was surprising. I have tried using different % of the dataset for training and number of iterations of the model but it makes no difference to the output.
If I use even a slightly more complicated model, I get much worse predictive capabilities. If I use logistic regression, the predicted output of Z is always 1?!

How to efficiently calculate a moving Standard Deviation

Below you can see my C# method to calculate Bollinger Bands for each point (moving average, up band, down band).
As you can see this method uses 2 for loops to calculate the moving standard deviation using the moving average. It used to contain an additional loop to calculate the moving average over the last n periods. This one I could remove by adding the new point value to total_average at the beginning of the loop and removing the i - n point value at the end of the loop.
My question now is basically: Can I remove the remaining inner loop in a similar way I managed with the moving average?
public static void AddBollingerBands(SortedList<DateTime, Dictionary<string, double>> data, int period, int factor)
{
double total_average = 0;
for (int i = 0; i < data.Count(); i++)
{
total_average += data.Values[i]["close"];
if (i >= period - 1)
{
double total_bollinger = 0;
double average = total_average / period;
for (int x = i; x > (i - period); x--)
{
total_bollinger += Math.Pow(data.Values[x]["close"] - average, 2);
}
double stdev = Math.Sqrt(total_bollinger / period);
data.Values[i]["bollinger_average"] = average;
data.Values[i]["bollinger_top"] = average + factor * stdev;
data.Values[i]["bollinger_bottom"] = average - factor * stdev;
total_average -= data.Values[i - period + 1]["close"];
}
}
}

The problem with approaches that calculate the sum of squares is that both it and the square of the sum can get quite large, and the calculation of their difference may introduce a very large error, so let's think of something better. For why this is needed, see the Wikipedia article on Algorithms for computing variance and John Cook on Theoretical explanation for numerical results.
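The cancellation problem described above can be illustrated with a small sketch. VarianceDemo is a hypothetical class, and the suggested inputs (four values near 10^9 whose true sample variance is 30) are chosen for illustration rather than taken from the question.

```csharp
using System;
using System.Linq;

static class VarianceDemo
{
    // Sample variance via the sum of squares: (Σx² - (Σx)²/n) / (n - 1).
    // Both terms are huge for large inputs, so their difference loses precision.
    public static double Naive(double[] x)
    {
        double sum = 0, sumSq = 0;
        foreach (var v in x)
        {
            sum += v;
            sumSq += v * v;
        }
        return (sumSq - sum * sum / x.Length) / (x.Length - 1);
    }

    // Sample variance via deviations from the mean (two passes over the data).
    public static double TwoPass(double[] x)
    {
        var mean = x.Average();
        return x.Sum(v => (v - mean) * (v - mean)) / (x.Length - 1);
    }
}
```

For the inputs 1e9 + 4, 1e9 + 7, 1e9 + 13 and 1e9 + 16, TwoPass returns 30. On a runtime that evaluates in 64-bit double arithmetic, Naive is expected to land far from 30, because each of the two terms it subtracts is near 4e18 and is only representable to within a few hundred units.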
First, instead of calculating the stddev let's focus on the variance. Once we have the variance, stddev is just the square root of the variance.
Suppose the data are in an array called x; rolling an n-sized window by one can be thought of as removing the value of x[0] and adding the value of x[n]. Let's denote the averages of x[0]..x[n-1] and x[1]..x[n] by µ and µ’ respectively. The difference between the variances of x[0]..x[n-1] and x[1]..x[n] is, after canceling out some terms and applying (a²-b²) = (a+b)(a-b):
Var[x[1],..,x[n]] - Var[x[0],..,x[n-1]]
= (\sum_1^n x[i]² - n µ’²)/(n-1) - (\sum_0^{n-1} x[i]² - n µ²)/(n-1)
= (x[n]² - x[0]² - n(µ’² - µ²))/(n-1)
= (x[n]-µ’ + x[0]-µ)(x[n]-x[0])/(n-1)
Therefore the variance is perturbed by something that doesn't require you to maintain the sum of squares, which is better for numerical accuracy.
You can calculate the mean and variance once in the beginning with a proper algorithm (Welford's method). After that, every time you have to replace a value in the window x[0] by another x[n] you update the average and variance like this:
new_Avg = Avg + (x[n]-x[0])/n
new_Var = Var + (x[n]-new_Avg + x[0]-Avg)(x[n] - x[0])/(n-1)
new_StdDev = sqrt(new_Var)
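The update rules above could be sketched in C# as follows. RollingVariance is a hypothetical helper, not code from the question; it seeds the first window with Welford's method and then applies the new_Avg / new_Var formulas on every slide.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the update formulas above.
class RollingVariance
{
    private readonly int _n;
    private readonly Queue<double> _window = new Queue<double>();

    public double Mean { get; private set; }
    public double Variance { get; private set; } // sample variance (divides by n - 1)
    public double StdDev => Math.Sqrt(Variance);

    public RollingVariance(IEnumerable<double> initialWindow)
    {
        // Welford's method for the first full window.
        double m2 = 0;
        foreach (var v in initialWindow)
        {
            _window.Enqueue(v);
            var delta = v - Mean;
            Mean += delta / _window.Count;
            m2 += delta * (v - Mean);
        }
        _n = _window.Count;
        if (_n < 2) throw new ArgumentException("The window needs at least two values.");
        Variance = m2 / (_n - 1);
    }

    // Slides the window by one: drops the oldest value and adds newValue.
    public void Slide(double newValue)
    {
        var oldValue = _window.Dequeue();
        _window.Enqueue(newValue);
        var newMean = Mean + (newValue - oldValue) / _n;
        Variance += (newValue - newMean + oldValue - Mean) * (newValue - oldValue) / (_n - 1);
        if (Variance < 0) Variance = 0; // rounding can push a near-zero variance below zero
        Mean = newMean;
    }
}
```

Starting from the window 1, 2, 3, 4 (mean 2.5, variance 5/3) and sliding in 10 gives mean 4.75 and variance 38.75/3, the same as recomputing over 2, 3, 4, 10. This sketch uses the n - 1 divisor from the derivation, whereas the question's Bollinger code divides by the period, so the result would need rescaling by (n - 1)/n to match it.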

The answer is yes, you can. In the mid-80's I developed just such an algorithm (probably not original) in FORTRAN for a process monitoring and control application. Unfortunately, that was over 25 years ago and I do not remember the exact formulas, but the technique was an extension of the one for moving averages, with second order calculations instead of just linear ones.
After looking at your code some more, I think that I can suss out how I did it back then. Notice how your inner loop is making a Sum of Squares:
for (int x = i; x > (i - period); x--)
{
total_bollinger += Math.Pow(data.Values[x]["close"] - average, 2);
}
in much the same way that your average must have originally had a Sum of Values? The only two differences are the order (it's power 2 instead of 1) and that you are subtracting the average from each value before you square it. Now that might look inseparable, but in fact they can be separated:
SUM(i=1..n){(v[i] - k)^2}
is
SUM(i=1..n){v[i]^2 -2*v[i]*k + k^2}
which becomes
SUM(i=1..n){v[i]^2 -2*v[i]*k} + k^2*n
which is
SUM(i=1..n){v[i]^2} + SUM(i=1..n){-2*v[i]*k} + k^2*n
which is also
SUM(i=1..n){v[i]^2} + SUM(i=1..n){-2*v[i]}*k + k^2*n
Now the first term is just a Sum of Squares, you handle that in the same way that you do the sum of Values for the average. The last term (k^2*n) is just the average squared times the period. Since you divide the result by the period anyway, you can just add the new average squared without the extra loop.
Finally, in the second term (SUM(-2*v[i]) * k), since SUM(v[i]) = total = k*n you can then change it into this:
-2 * k * k * n
or just -2*k^2*n, which is -2 times the average squared, once the period (n) is divided out again. So the final combined formula is:
SUM(i=1..n){v[i]^2} - n*k^2
or
SUM(i=1..n){values[i]^2} - period*(average^2)
(be sure to check the validity of this, since I am deriving it off the top of my head)
And incorporating into your code should look something like this:
public static void AddBollingerBands(SortedList<DateTime, Dictionary<string, double>> data, int period, int factor)
{
double total_average = 0;
double total_squares = 0;
for (int i = 0; i < data.Count(); i++)
{
total_average += data.Values[i]["close"];
total_squares += Math.Pow(data.Values[i]["close"], 2);
if (i >= period - 1)
{
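double average = total_average / period;
// Rounding can push the difference slightly below zero, so clamp it before taking the root.
double variance = Math.Max(0, (total_squares - Math.Pow(total_average, 2) / period) / period);
double stdev = Math.Sqrt(variance);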
data.Values[i]["bollinger_average"] = average;
data.Values[i]["bollinger_top"] = average + factor * stdev;
data.Values[i]["bollinger_bottom"] = average - factor * stdev;
total_average -= data.Values[i - period + 1]["close"];
total_squares -= Math.Pow(data.Values[i - period + 1]["close"], 2);
}
}
}

I've used commons-math (and contributed to that library!) for something very similar to this. It's open-source, porting to C# should be easy as store-bought pie (have you tried making a pie from scratch!?). Check it out: http://commons.apache.org/math/api-3.1.1/index.html. They have a StandardDeviation class. Go to town!

Most important information has already been given above --- but maybe this is still of general interest.
A tiny Java library to calculate moving average and standard deviation is available here:
https://github.com/tools4j/meanvar
The implementation is based on a variant of Welford's method mentioned above. Methods to remove and replace values have been derived that can be used for moving value windows.
Disclaimer: I am the author of the said library.

I just did this with data from the Binance Futures API.
Hope this helps:
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;
using System.Threading.Tasks;
using System.Windows.Forms;
using static System.Windows.Forms.VisualStyles.VisualStyleElement;
namespace Trading_Bot_1
{
public class BOLL
{
private BollingerBandData graphdata = new BollingerBandData();
private List<TickerData> data = new List<TickerData>();
public BOLL(string url)
{
string js = getJsonFromUrl(url);
//dynamic data = JObject.Parse(js);
object[][] arrays = JsonConvert.DeserializeObject<object[][]>(js);
data = new List<TickerData>();
for (int i = 1; i < 400; i++)
{
data.Add(new TickerData
{
Date = DateTime.Now,
Open = Convert.ToDouble(arrays[arrays.Length - i][1]),
High = Convert.ToDouble(arrays[arrays.Length - i][2]),
Low = Convert.ToDouble(arrays[arrays.Length - i][3]),
Close = Convert.ToDouble(arrays[arrays.Length - i][4]),
Volume = Math.Round(Convert.ToDouble(arrays[arrays.Length - i][5]), 0), // field 5 of a Binance kline is the volume; field 4 is the close price
AdjClose = Convert.ToDouble(arrays[arrays.Length - i][6])
});
}
graphdata.LowerBand.Add(1);
graphdata.LowerBand.Add(2);
graphdata.LowerBand.Add(3);
graphdata.LowerBand.Add(1);
graphdata.UpperBand.Add(1);
graphdata.UpperBand.Add(2);
graphdata.UpperBand.Add(3);
graphdata.UpperBand.Add(4);
graphdata.MovingAverageWindow.Add(10);
graphdata.MovingAverageWindow.Add(20);
graphdata.MovingAverageWindow.Add(40);
graphdata.MovingAverageWindow.Add(50);
graphdata.Length.Add(10);
graphdata.Length.Add(30);
graphdata.Length.Add(50);
graphdata.Length.Add(100);
// DataContext = graphdata;
}
public static string getJsonFromUrl(string url1)
{
var uri = String.Format(url1);
WebClient client = new WebClient();
client.UseDefaultCredentials = true;
var data = client.DownloadString(uri);
return data;
}
List<double> UpperBands = new List<double>();
List<double> LowerBands = new List<double>();
public List<List<double>> GetBOLLDATA(int decPlaces)
{
int datalength = graphdata.SelectedMovingAverage + graphdata.SelectedLength;
string bands = "";
for (int i = graphdata.SelectedLength - 1; i >= 0; i--)
{
List<double> price = new List<double>();
for (int j = 0; j < graphdata.SelectedMovingAverage; j++)
{
price.Add(data[i + j].Close);
}
double sma = CalculateAverage(price.ToArray());
double sigma = CalculateSTDV(price.ToArray());
double lower = sma - (graphdata.SelectedLowerBand * sigma);
double upper = sma + (graphdata.SelectedUpperBand * sigma);
UpperBands.Add(Math.Round( upper,decPlaces));
LowerBands.Add(Math.Round(lower, decPlaces));
bands += (Math.Round(upper, decPlaces) + " / " + Math.Round(lower, decPlaces)) + Environment.NewLine;
// graphdata.ChartData.Add(new ChartData() { SMA = sma, LowerBandData = lower, UpperBandData = upper });
}
//MessageBox.Show(bands);
return new List<List<double>> { UpperBands, LowerBands };
}
public double[] GetBOLLDATA(int decPlaces, string a)
{
List<double> price = new List<double>();
for (int j = 0; j < graphdata.SelectedMovingAverage; j++)
{
price.Add(data[j].Close);
}
double sma = CalculateAverage(price.ToArray());
double sigma = CalculateSTDV(price.ToArray());
double lower = sma - (graphdata.SelectedLowerBand * sigma);
double upper = sma + (graphdata.SelectedUpperBand * sigma);
return new double[] { Math.Round(upper, decPlaces), Math.Round(lower, decPlaces) };
}
private double CalculateAverage(double[] data)
{
int count = data.Length;
double sum = 0;
for (int i = 0; i < count; i++)
{
sum += data[i];
}
return sum / count;
}
private double CalculateVariance(double[] data)
{
int count = data.Length;
double sum = 0;
double avg = CalculateAverage(data);
for (int i = 0; i < count; i++)
{
sum += (data[i] - avg) * (data[i] - avg);
}
return sum / (count - 1);
}
private double CalculateSTDV(double[] data)
{
double var = CalculateVariance(data);
return Math.Sqrt(var);
}
}
public class ChartData
{
public double UpperBandData
{ get; set; }
public double LowerBandData
{ get; set; }
public double SMA
{ get; set; }
}
public class BollingerBandData : INotifyPropertyChanged
{
private ObservableCollection<int> _lowerBand;
private ObservableCollection<int> _upperBand;
private ObservableCollection<int> _movingAvg;
private ObservableCollection<int> _length;
private ObservableCollection<ChartData> _chartData;
private int _selectedLowerBand;
private int _selectedUpperBand;
private int _selectedMovingAvg;
private int _selectedLength;
public BollingerBandData()
{
_lowerBand = new ObservableCollection<int>();
_upperBand = new ObservableCollection<int>();
_movingAvg = new ObservableCollection<int>();
_length = new ObservableCollection<int>();
_chartData = new ObservableCollection<ChartData>();
SelectedLowerBand = 2;
SelectedUpperBand = 2;
SelectedMovingAverage = 20;
SelectedLength = 5;
}
public ObservableCollection<ChartData> ChartData
{
get
{
return _chartData;
}
set
{
_chartData = value;
RaisePropertyChanged("ChartData");
}
}
public ObservableCollection<int> LowerBand
{
get
{
return _lowerBand;
}
set
{
_lowerBand = value;
RaisePropertyChanged("LowerBand");
}
}
public ObservableCollection<int> UpperBand
{
get
{
return _upperBand;
}
set
{
_upperBand = value;
RaisePropertyChanged("UpperBand");
}
}
public ObservableCollection<int> MovingAverageWindow
{
get
{
return _movingAvg;
}
set
{
_movingAvg = value;
RaisePropertyChanged("MovingAverageWindow");
}
}
public ObservableCollection<int> Length
{
get
{
return _length;
}
set
{
_length = value;
RaisePropertyChanged("Length");
}
}
public int SelectedLowerBand
{
get
{
return _selectedLowerBand;
}
set
{
_selectedLowerBand = value;
RaisePropertyChanged("SelectedLowerBand");
}
}
public int SelectedUpperBand
{
get
{
return _selectedUpperBand;
}
set
{
_selectedUpperBand = value;
RaisePropertyChanged("SelectedUpperBand");
}
}
public int SelectedMovingAverage
{
get
{
return _selectedMovingAvg;
}
set
{
_selectedMovingAvg = value;
RaisePropertyChanged("SelectedMovingAverage");
}
}
public int SelectedLength
{
get
{
return _selectedLength;
}
set
{
_selectedLength = value;
RaisePropertyChanged("SelectedLength");
}
}
public event PropertyChangedEventHandler PropertyChanged;
private void RaisePropertyChanged(string propertyName)
{
PropertyChangedEventHandler handler = this.PropertyChanged;
if (handler != null)
{
handler(this, new PropertyChangedEventArgs(propertyName));
}
}
}
public class TickerData
{
public DateTime Date
{ get; set; }
public double Open
{ get; set; }
public double High
{ get; set; }
public double Low
{ get; set; }
public double Close
{ get; set; }
public double Volume
{ get; set; }
public double AdjClose
{ get; set; }
}
}

c# for loop arithmetic

This program I am writing for a class uses "random" numbers to generate arrival (arrival[i]) and service (service[i]) times for jobs. My current problem is with the arrival time. To get the arrival time, I call a function named Exponential and add the returned value to the previous arrival time (arrival[i-1]) in the array. For some reason I don't understand, the program is not using the previous value of the array for the addition, but rather a seemingly random value (1500, 1600, etc.). But I know the real values set in the array are all below 5. This should be simple array arithmetic in a for loop, but I cannot figure out what is going wrong.
namespace ConsoleApplication4
{
class Program
{
static long state;
void putseed(int value)
{
state = value;
}
static void Main(string[] args)
{
Program pro = new Program();
double totals = 0;
double totald = 0;
pro.putseed(12345);
double[] arrival = new double[1000];
double[] service = new double[1000];
double[] wait = new double[1000];
double[] delay = new double[1000];
double[] departure = new double[1000];
for (int i = 1; i < 1000; i++)
{
arrival[i] = arrival[i - 1] + pro.Exponential(2.0);
if (arrival[i] < departure[i - 1])
departure[i] = departure[i] - arrival[i];
else
departure[i] = 0;
service[i] = pro.Uniform((long)1.0,(long)2.0);
totals += service[i];
totald += departure[i];
}
double averages = totals / 1000;
double averaged = totald / 1000;
Console.WriteLine("{0}\n",averages);
Console.WriteLine("{0}\n", averaged);
Console.WriteLine("press any key");
Console.ReadLine();
}
public double Random()
{
const long A = 48271;
const long M = 2147483647;
const long Q = M / A;
const long R = M % A;
long t = A * (state % Q) - R * (state / Q);
if (t > 0)
state = t;
else
state = t + M;
return ((double)state / M);
}
public double Exponential(double u)
{
return (-u * Math.Log(1.0 - Random()));
}
public double Uniform(long a, long b)
{
Program pro = new Program();
double c = ((double)a + ((double)b - (double)a) * pro.Random());
return c;
}
}
}

The values returned by your Exponential method can be very big. Very very big. In fact, they tend towards infinity if your Random values come close to 1...
I'm not surprised that the values in your arrival array tend to be big. I would in fact expect them to be.
Also: try to name your methods according to what they do. Your Exponential method has nothing to do with a mathematical exponential.
And try not to implement a random number generator yourself. Use the Random class included in the .Net Framework. If you want to always have the same sequence of pseudo-random numbers (as you seem to want), you can seed it with a constant.

Your output sounds perfectly correct to me, given your current logic. Maybe your logic is flawed?
I changed the first three lines of the for loop to:
var ex = Exponential(2.0);
arrival[i] = arrival[i - 1] + ex;
Console.WriteLine("i = " + arrival[i] + ", i-1 = " + arrival[i-1] + ", Exponential = " + ex);
And this is the start and end of the output:
i = 0.650048368820785, i-1 = 0, Exponential = 0.650048368820785
i = 3.04412645597466, i-1 = 0.650048368820785, Exponential = 2.39407808715387
i = 4.11006720700818, i-1 = 3.04412645597466, Exponential = 1.06594075103352
i = 5.05503853283036, i-1 = 4.11006720700818, Exponential = 0.944971325822186
i = 6.77397334440211, i-1 = 5.05503853283036, Exponential = 1.71893481157175
i = 8.03325406790781, i-1 = 6.77397334440211, Exponential = 1.2592807235057
i = 9.99797822010981, i-1 = 8.03325406790781, Exponential = 1.964724152202
i = 10.540051694898, i-1 = 9.99797822010981, Exponential = 0.542073474788196
i = 10.6332298644808, i-1 = 10.540051694898, Exponential = 0.0931781695828122
....
i = 1970.86834655692, i-1 = 1968.91989881306, Exponential = 1.94844774386271
i = 1971.49302600885, i-1 = 1970.86834655692, Exponential = 0.62467945192265
i = 1972.16711634654, i-1 = 1971.49302600885, Exponential = 0.674090337697884
i = 1974.5740025773, i-1 = 1972.16711634654, Exponential = 2.40688623075635
i = 1978.14531015105, i-1 = 1974.5740025773, Exponential = 3.5713075737529
i = 1979.15315663014, i-1 = 1978.14531015105, Exponential = 1.00784647908321
The math here looks perfectly right to me.
Side comment: You can declare all your extra methods (Exponential, Uniform, etc) as static, so you don't have to create a new Program just to use them.
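To see why the arrival values end up near 2000, here is a small sketch of the same accumulation using System.Random with a fixed seed in place of the hand-written generator. ArrivalDemo and its method names are hypothetical; the sampling formula is the one from the question's Exponential method.

```csharp
using System;

static class ArrivalDemo
{
    // Inverse-transform sample with the given mean, as in the question's
    // Exponential method. NextDouble() is in [0, 1), so the argument of
    // Math.Log stays in (0, 1] and the result is finite and non-negative.
    public static double Exponential(Random rng, double mean)
        => -mean * Math.Log(1.0 - rng.NextDouble());

    // Cumulative arrival times: each entry adds one gap to the previous entry.
    public static double[] Arrivals(int count, double mean, int seed)
    {
        var rng = new Random(seed);      // fixed seed gives a repeatable sequence
        var arrival = new double[count]; // arrival[0] is 0.0 by default
        for (var i = 1; i < count; i++)
            arrival[i] = arrival[i - 1] + Exponential(rng, mean);
        return arrival;
    }
}
```

Each gap averages about 2, so after 999 additions the last entry of ArrivalDemo.Arrivals(1000, 2.0, 12345) should be in the neighbourhood of 999 * 2 = 1998. The individual gaps stay small; it is the running total that grows.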

You did not set the value of arrival[0]. It is not initialized before the for loop, so the other values in the array are calculated wrongly.

Extra suggestion,
public double Uniform(long a, long b)
{
double c = ((double)a + ((double)b - (double)a) * Random());
return c;
}
change your Uniform function like that.

How do I calculate a trendline for a graph?

Google is not being my friend - it's been a long time since my stats class in college...I need to calculate the start and end points for a trendline on a graph - is there an easy way to do this? (working in C# but whatever language works for you)

Thanks to all for your help - I was off this issue for a couple of days and just came back to it - was able to cobble this together - not the most elegant code, but it works for my purposes - thought I'd share if anyone else encounters this issue:
public class Statistics
{
public Trendline CalculateLinearRegression(int[] values)
{
var yAxisValues = new List<int>();
var xAxisValues = new List<int>();
for (int i = 0; i < values.Length; i++)
{
yAxisValues.Add(values[i]);
xAxisValues.Add(i + 1);
}
return new Trendline(yAxisValues, xAxisValues);
}
}
public class Trendline
{
private readonly IList<int> xAxisValues;
private readonly IList<int> yAxisValues;
private int count;
private int xAxisValuesSum;
private int xxSum;
private int xySum;
private int yAxisValuesSum;
public Trendline(IList<int> yAxisValues, IList<int> xAxisValues)
{
this.yAxisValues = yAxisValues;
this.xAxisValues = xAxisValues;
this.Initialize();
}
public int Slope { get; private set; }
public int Intercept { get; private set; }
public int Start { get; private set; }
public int End { get; private set; }
private void Initialize()
{
this.count = this.yAxisValues.Count;
this.yAxisValuesSum = this.yAxisValues.Sum();
this.xAxisValuesSum = this.xAxisValues.Sum();
this.xxSum = 0;
this.xySum = 0;
for (int i = 0; i < this.count; i++)
{
this.xySum += (this.xAxisValues[i]*this.yAxisValues[i]);
this.xxSum += (this.xAxisValues[i]*this.xAxisValues[i]);
}
this.Slope = this.CalculateSlope();
this.Intercept = this.CalculateIntercept();
this.Start = this.CalculateStart();
this.End = this.CalculateEnd();
}
private int CalculateSlope()
{
try
{
return ((this.count*this.xySum) - (this.xAxisValuesSum*this.yAxisValuesSum))/((this.count*this.xxSum) - (this.xAxisValuesSum*this.xAxisValuesSum));
}
catch (DivideByZeroException)
{
return 0;
}
}
private int CalculateIntercept()
{
return (this.yAxisValuesSum - (this.Slope*this.xAxisValuesSum))/this.count;
}
private int CalculateStart()
{
return (this.Slope*this.xAxisValues.First()) + this.Intercept;
}
private int CalculateEnd()
{
return (this.Slope*this.xAxisValues.Last()) + this.Intercept;
}
}

OK, here's my best pseudo math:
The equation for your line is:
Y = a + bX
Where:
b = (sum(x*y) - sum(x)sum(y)/n) / (sum(x^2) - sum(x)^2/n)
a = sum(y)/n - b(sum(x)/n)
Where sum(xy) is the sum of all x*y etc. Not particularly clear I concede, but it's the best I can do without a sigma symbol :)
... and now with added Sigma
b = (Σ(xy) - (ΣxΣy)/n) / (Σ(x^2) - (Σx)^2/n)
a = (Σy)/n - b((Σx)/n)
Where Σ(xy) is the sum of all x*y etc. and n is the number of points

Given that the trendline is straight, find the slope by choosing any two points and calculating:
(A) slope = (y1-y2)/(x1-x2)
Then you need to find the offset for the line. The line is specified by the equation:
(B) y = offset + slope*x
So you need to solve for offset. Pick any point on the line, and solve for offset:
(C) offset = y - (slope*x)
Now you can plug slope and offset into the line equation (B) and have the equation that defines your line. If your line has noise you'll have to decide on an averaging algorithm, or use curve fitting of some sort.
If your line isn't straight then you'll need to look into Curve fitting, or Least Squares Fitting - non trivial, but do-able. You'll see the various types of curve fitting at the bottom of the least squares fitting webpage (exponential, polynomial, etc) if you know what kind of fit you'd like.
Also, if this is a one-off, use Excel.
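Steps (A) to (C) above can be sketched as follows, assuming the points lie exactly on a straight, non-vertical line. TwoPointLine is a hypothetical helper introduced only for illustration.

```csharp
using System;

static class TwoPointLine
{
    // (A) slope = (y1 - y2) / (x1 - x2); undefined for a vertical line.
    public static double Slope(double x1, double y1, double x2, double y2)
    {
        if (x1 == x2) throw new ArgumentException("x1 and x2 must differ.");
        return (y1 - y2) / (x1 - x2);
    }

    // (C) offset = y - slope * x, using any point on the line.
    public static double Offset(double slope, double x, double y) => y - slope * x;

    // (B) y = offset + slope * x
    public static double ValueAt(double slope, double offset, double x) => offset + slope * x;
}
```

For the points (1, 3) and (3, 7), Slope gives 2 and Offset gives 1, so ValueAt(2, 1, 5) is 11. For noisy data this two-point method depends heavily on which points are picked, which is why the least-squares formulas in the other answers are the usual choice.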

Here is a very quick (and semi-dirty) implementation of Bedwyr Humphreys's answer. The interface should be compatible with @matt's answer as well, but uses decimal instead of int and uses more IEnumerable concepts to hopefully make it easier to use and read.
Slope is b, Intercept is a
public class Trendline
{
public Trendline(IList<decimal> yAxisValues, IList<decimal> xAxisValues)
: this(yAxisValues.Select((t, i) => new Tuple<decimal, decimal>(xAxisValues[i], t)))
{ }
public Trendline(IEnumerable<Tuple<Decimal, Decimal>> data)
{
var cachedData = data.ToList();
var n = cachedData.Count;
var sumX = cachedData.Sum(x => x.Item1);
var sumX2 = cachedData.Sum(x => x.Item1 * x.Item1);
var sumY = cachedData.Sum(x => x.Item2);
var sumXY = cachedData.Sum(x => x.Item1 * x.Item2);
//b = (sum(x*y) - sum(x)sum(y)/n)
// / (sum(x^2) - sum(x)^2/n)
Slope = (sumXY - ((sumX * sumY) / n))
/ (sumX2 - (sumX * sumX / n));
//a = sum(y)/n - b(sum(x)/n)
Intercept = (sumY / n) - (Slope * (sumX / n));
Start = GetYValue(cachedData.Min(a => a.Item1));
End = GetYValue(cachedData.Max(a => a.Item1));
}
public decimal Slope { get; private set; }
public decimal Intercept { get; private set; }
public decimal Start { get; private set; }
public decimal End { get; private set; }
public decimal GetYValue(decimal xValue)
{
return Intercept + Slope * xValue;
}
}

Regarding a previous answer
if (B) y = offset + slope*x
then (C) offset = y/(slope*x) is wrong
(C) should be:
offset = y-(slope*x)
See:
http://zedgraph.org/wiki/index.php?title=Trend

If you have access to Excel, look in the "Statistical Functions" section of the Function Reference within Help. For straight-line best-fit, you need SLOPE and INTERCEPT and the equations are right there.
Oh, hang on, they're also defined online here: http://office.microsoft.com/en-us/excel/HP052092641033.aspx for SLOPE, and there's a link to INTERCEPT. Of course, that assumes MS don't move the page, in which case try Googling for something like "SLOPE INTERCEPT EQUATION Excel site:microsoft.com"; the link given turned up third just now.
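For reference, Excel documents SLOPE and INTERCEPT in a mean-centred form: b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄. A minimal C# sketch of those two definitions might look like this. The class name ExcelStyleFit is hypothetical, and the argument order (known y's first, then known x's) is chosen only to mirror the Excel functions.

```csharp
using System;
using System.Linq;

// Hypothetical helper mirroring the mean-centred definitions of SLOPE and INTERCEPT.
public static class ExcelStyleFit
{
    // b = sum((x - meanX) * (y - meanY)) / sum((x - meanX)^2)
    public static double Slope(double[] knownYs, double[] knownXs)
    {
        if (knownYs.Length != knownXs.Length || knownXs.Length == 0)
            throw new ArgumentException("Both arrays must be non-empty and the same length.");

        double meanX = knownXs.Average();
        double meanY = knownYs.Average();
        double numerator = 0, denominator = 0;
        for (int i = 0; i < knownXs.Length; i++)
        {
            double dx = knownXs[i] - meanX;
            numerator += dx * (knownYs[i] - meanY);
            denominator += dx * dx;
        }
        if (denominator == 0)
            throw new ArgumentException("All x values are identical, so the slope is undefined.");
        return numerator / denominator;
    }

    // a = meanY - b * meanX
    public static double Intercept(double[] knownYs, double[] knownXs)
    {
        return knownYs.Average() - Slope(knownYs, knownXs) * knownXs.Average();
    }

    public static void Main()
    {
        double[] xs = { 1, 2, 3 };
        double[] ys = { 3, 5, 6.5 };
        Console.WriteLine(Slope(ys, xs));
        Console.WriteLine(Intercept(ys, xs));
    }
}
```

This form is algebraically equivalent to the sum-based formulas used elsewhere in this thread, but subtracting the means first tends to lose less precision when the x values are large.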

I converted Matt's code to Java so I could use it in Android with the MPAndroidChart library. Also used double values instead of integer values:
ArrayList<Entry> yValues2 = new ArrayList<>();
ArrayList<Double > xAxisValues = new ArrayList<Double>();
ArrayList<Double> yAxisValues = new ArrayList<Double>();
for (int i = 0; i < readings.size(); i++)
{
r = readings.get(i);
yAxisValues.add(r.value);
xAxisValues.add((double)i + 1);
}
TrendLine tl = new TrendLine(yAxisValues, xAxisValues);
//Create the y values for the trend line
double currY = tl.Start;
for (int i = 0; i < readings.size(); ++ i) {
yValues2.add(new Entry(i, (float) currY));
currY = currY + tl.Slope;
}
...
public class TrendLine
{
private ArrayList<Double> xAxisValues = new ArrayList<Double>();
private ArrayList<Double> yAxisValues = new ArrayList<Double>();
private int count;
private double xAxisValuesSum;
private double xxSum;
private double xySum;
private double yAxisValuesSum;
public TrendLine(ArrayList<Double> yAxisValues, ArrayList<Double> xAxisValues)
{
this.yAxisValues = yAxisValues;
this.xAxisValues = xAxisValues;
this.Initialize();
}
public double Slope;
public double Intercept;
public double Start;
public double End;
private double getArraySum(ArrayList<Double> arr) {
double sum = 0;
for (int i = 0; i < arr.size(); ++i) {
sum = sum + arr.get(i);
}
return sum;
}
private void Initialize()
{
this.count = this.yAxisValues.size();
this.yAxisValuesSum = getArraySum(this.yAxisValues);
this.xAxisValuesSum = getArraySum(this.xAxisValues);
this.xxSum = 0;
this.xySum = 0;
for (int i = 0; i < this.count; i++)
{
this.xySum += (this.xAxisValues.get(i)*this.yAxisValues.get(i));
this.xxSum += (this.xAxisValues.get(i)*this.xAxisValues.get(i));
}
this.Slope = this.CalculateSlope();
this.Intercept = this.CalculateIntercept();
this.Start = this.CalculateStart();
this.End = this.CalculateEnd();
}
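private double CalculateSlope()
{
// Dividing doubles never throws in Java: a zero denominator would silently give NaN or Infinity.
// Check it explicitly and fall back to a slope of 0, as the original try/catch intended.
double denominator = (this.count*this.xxSum) - (this.xAxisValuesSum*this.xAxisValuesSum);
if (denominator == 0)
{
return 0;
}
return ((this.count*this.xySum) - (this.xAxisValuesSum*this.yAxisValuesSum))/denominator;
}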
private double CalculateIntercept()
{
return (this.yAxisValuesSum - (this.Slope*this.xAxisValuesSum))/this.count;
}
private double CalculateStart()
{
return (this.Slope*this.xAxisValues.get(0)) + this.Intercept;
}
private double CalculateEnd()
{
return (this.Slope*this.xAxisValues.get(this.xAxisValues.size()-1)) + this.Intercept;
}
}

This is the way I calculated the slope:
Source: http://classroom.synonym.com/calculate-trendline-2709.html
using System;
using System.Collections.Generic;

class Program
{
    public double CalculateTrendlineSlope(List<Point> graph)
    {
        int n = graph.Count;
        double a = 0;   // becomes n * sum(x*y)
        double b = 0;   // becomes sum(x) * sum(y)
        double bx = 0;  // sum(x)
        double by = 0;  // sum(y)
        double c = 0;   // becomes n * sum(x^2)
        double d = 0;   // becomes sum(x)^2
        double slope = 0;
        foreach (Point point in graph)
        {
            a += point.x * point.y;
            bx += point.x;  // accumulate; plain assignment would keep only the last point
            by += point.y;
            c += Math.Pow(point.x, 2);
            d += point.x;
        }
        a *= n;
        b = bx * by;
        c *= n;
        d = Math.Pow(d, 2);
        slope = (a - b) / (c - d);
        return slope;
    }
}
class Point
{
    public double x;
    public double y;
}

Here's what I ended up using.
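using System.Collections.Generic;
using Newtonsoft.Json; // assumption: JsonProperty comes from Json.NET

public class DataPoint<T1, T2>
{
    public DataPoint(T1 x, T2 y)
    {
        X = x;
        Y = y;
    }
    [JsonProperty("x")]
    public T1 X { get; }
    [JsonProperty("y")]
    public T2 Y { get; }
}
public class Trendline
{
    public Trendline(IEnumerable<DataPoint<long, decimal>> dataPoints)
    {
        int count = 0;
        // Accumulate in decimal: sums of squared Unix timestamps overflow long
        // after a few points, and long / int division would truncate.
        decimal sumX = 0;
        decimal sumX2 = 0;
        decimal sumY = 0;
        decimal sumXY = 0;
        long minX = long.MaxValue;
        long maxX = long.MinValue;
        foreach (var dataPoint in dataPoints)
        {
            decimal x = dataPoint.X;
            count++;
            sumX += x;
            sumX2 += x * x;
            sumY += dataPoint.Y;
            sumXY += x * dataPoint.Y;
            if (dataPoint.X < minX) minX = dataPoint.X;
            if (dataPoint.X > maxX) maxX = dataPoint.X;
        }
        Slope = (sumXY - ((sumX * sumY) / count)) / (sumX2 - ((sumX * sumX) / count));
        Intercept = (sumY / count) - (Slope * (sumX / count));
        // Trendline values at the smallest and largest x
        Start = GetYValue(minX);
        End = GetYValue(maxX);
    }
    public decimal Slope { get; private set; }
    public decimal Intercept { get; private set; }
    public decimal Start { get; private set; }
    public decimal End { get; private set; }
    public decimal GetYValue(decimal xValue)
    {
        return Slope * xValue + Intercept;
    }
}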
My data set is using a Unix timestamp for the x-axis and a decimal for the y. Change those datatypes to fit your need. I do all the sum calculations in one iteration for the best possible performance.

Thank you so much for the solution; I was scratching my head.
Here's how I applied the solution in Excel.
I successfully used the two formulas given by MUHD:
a = (sum(x*y) - sum(x)sum(y)/n) / (sum(x^2) - sum(x)^2/n)
b = sum(y)/n - a(sum(x)/n)
(Careful: my a and b are the b and a in MUHD's solution.)
- Made 5 columns, for example:
NB: my values are in rows 3 to 17, so I have n=15;
my x values are 1,2,3,4...15.
1. Column B: Known x's
2. Column C: Known y's
3. Column D: The computed trend line
4. Column E: B values * C values (E3=B3*C3, E4=B4*C4, ..., E17=B17*C17)
5. Column F: x squared values
I then sum columns B, C, E and F. The sums go in row 18 for me, so I have B18 as the sum of Xs, C18 as the sum of Ys, E18 as the sum of X*Y, and F18 as the sum of squares.
To compute a, enter the following formula in any cell (F35 for me):
F35=(E18-(B18*C18)/15)/(F18-(B18*B18)/15)
To compute b (in F36 for me):
F36=C18/15-F35*(B18/15)
Column D values, computing the trend line according to y = ax + b:
D3=$F$35*B3+$F$36, D4=$F$35*B4+$F$36 and so on (until D17 for me).
Select the data columns (C2:D17) to make the graph.
HTH.

If anyone needs the JS code for calculating the trendline of many points on a graph, here's what worked for us in the end:
/**@typedef {{
* x: Number;
* y: Number;
* }} Point
* @param {Point[]} data
* @returns {Function} */
function _getTrendlineEq(data) {
const xySum = data.reduce((acc, item) => {
const xy = item.x * item.y
acc += xy
return acc
}, 0)
const xSum = data.reduce((acc, item) => {
acc += item.x
return acc
}, 0)
const ySum = data.reduce((acc, item) => {
acc += item.y
return acc
}, 0)
const aTop = (data.length * xySum) - (xSum * ySum)
const xSquaredSum = data.reduce((acc, item) => {
const xSquared = item.x * item.x
acc += xSquared
return acc
}, 0)
const aBottom = (data.length * xSquaredSum) - (xSum * xSum)
const a = aTop / aBottom
const bTop = ySum - (a * xSum)
const b = bTop / data.length
return function trendline(x) {
return a * x + b
}
}
It takes an array of (x,y) points and returns a function that gives the trendline's y value for a given x (a is the slope, b is the intercept).
Have fun :)

Here's a working example in golang. I searched around and found this page and converted this over to what I needed. Hope someone else can find it useful.
// https://classroom.synonym.com/calculate-trendline-2709.html
package main
import (
"fmt"
"math"
)
func main() {
graph := [][]float64{
{1, 3},
{2, 5},
{3, 6.5},
}
n := len(graph)
// get the slope
var a float64
var b float64
var bx float64
var by float64
var c float64
var d float64
var slope float64
for _, point := range graph {
a += point[0] * point[1]
bx += point[0]
by += point[1]
c += math.Pow(point[0], 2)
d += point[0]
}
a *= float64(n) // 97.5
b = bx * by // 87
c *= float64(n) // 42
d = math.Pow(d, 2) // 36
slope = (a - b) / (c - d) // 1.75
// calculating the y-intercept (b) of the Trendline
var e float64
var f float64
e = by // 14.5
f = slope * bx // 10.5
intercept := (e - f) / float64(n) // (14.5 - 10.5) / 3 = 1.3
// output
fmt.Println(slope)
fmt.Println(intercept)
}
