c# for loop arithmetic - c#

This program I am writing for a class uses "random" numbers to generate arrival(arrival[i]) and service times(service[i]) for jobs. My current problem is with the arrival time. To get the arrival time, I call a function named exponential and add the returning value to the previous arrival time (arrival[i-1]) in the array. For some reason I don't understand, the program is not using the previous value of the array for the addition, but rather a seemingly random value (1500,1600 ect). But I know the real values set in the array are all below 5. This should be simple array arithmetic in a for loop but I cannot figure out what is going wrong.
namespace ConsoleApplication4
{
class Program
{
static long state;
void putseed(int value)
{
state = value;
}
static void Main(string[] args)
{
Program pro = new Program();
double totals = 0;
double totald = 0;
pro.putseed(12345);
double[] arrival = new double[1000];
double[] service = new double[1000];
double[] wait = new double[1000];
double[] delay = new double[1000];
double[] departure = new double[1000];
for (int i = 1; i < 1000; i++)
{
arrival[i] = arrival[i - 1] + pro.Exponential(2.0);
if (arrival[i] < departure[i - 1])
departure[i] = departure[i] - arrival[i];
else
departure[i] = 0;
service[i] = pro.Uniform((long)1.0,(long)2.0);
totals += service[i];
totald += departure[i];
}
double averages = totals / 1000;
double averaged = totald / 1000;
Console.WriteLine("{0}\n",averages);
Console.WriteLine("{0}\n", averaged);
Console.WriteLine("press any key");
Console.ReadLine();
}
public double Random()
{
const long A = 48271;
const long M = 2147483647;
const long Q = M / A;
const long R = M % A;
long t = A * (state % Q) - R * (state / Q);
if (t > 0)
state = t;
else
state = t + M;
return ((double)state / M);
}
public double Exponential(double u)
{
return (-u * Math.Log(1.0 - Random()));
}
public double Uniform(long a, long b)
{
Program pro = new Program();
double c = ((double)a + ((double)b - (double)a) * pro.Random());
return c;
}
}
}

The values returned by your Exponential method can be very big. Very very big. In fact, they tend towards infinity if your Random values come close to 1...
I'm not surpirised your values in the arrival array tend to be big. I would in fact expect them to.
Also: try to name your methods accordingly to what they do. Your Exponential method has nothing to do with a mathematical exponential.
And try not to implement a random number generator yourself. Use the Random class included in the .Net Framework. If you want to always have the same sequence of pseudo-random numbers (as you seem to want), you can seed it with a constant.

Your output sounds perfectly correct to me, given your current logic. Maybe your logic is flawed?
I changed the first three lines of the for loop to:
var ex = Exponential(2.0);
arrival[i] = arrival[i - 1] + ex;
Console.WriteLine("i = " + arrival[i] + ", i-1 = " + arrival[i-1] + ", Exponential = " + ex);
And this is the start and end of the output:
i = 0.650048368820785, i-1 = 0, Exponential = 0.650048368820785
i = 3.04412645597466, i-1 = 0.650048368820785, Exponential = 2.39407808715387
i = 4.11006720700818, i-1 = 3.04412645597466, Exponential = 1.06594075103352
i = 5.05503853283036, i-1 = 4.11006720700818, Exponential = 0.944971325822186
i = 6.77397334440211, i-1 = 5.05503853283036, Exponential = 1.71893481157175
i = 8.03325406790781, i-1 = 6.77397334440211, Exponential = 1.2592807235057
i = 9.99797822010981, i-1 = 8.03325406790781, Exponential = 1.964724152202
i = 10.540051694898, i-1 = 9.99797822010981, Exponential = 0.542073474788196
i = 10.6332298644808, i-1 = 10.540051694898, Exponential = 0.0931781695828122
....
i = 1970.86834655692, i-1 = 1968.91989881306, Exponential = 1.94844774386271
i = 1971.49302600885, i-1 = 1970.86834655692, Exponential = 0.62467945192265
i = 1972.16711634654, i-1 = 1971.49302600885, Exponential = 0.674090337697884
i = 1974.5740025773, i-1 = 1972.16711634654, Exponential = 2.40688623075635
i = 1978.14531015105, i-1 = 1974.5740025773, Exponential = 3.5713075737529
i = 1979.15315663014, i-1 = 1978.14531015105, Exponential = 1.00784647908321
The math here looks perfectly right to me.
Side comment: You can declare all your extra methods (Exponential, Uniform, etc) as static, so you don't have to create a new Program just to use them.

You did not set the value of arrival[0], it is not initialized before for loop so the other values in array calculated wrongly.

Extra suggestion,
public double Uniform(long a, long b)
{
double c = ((double)a + ((double)b - (double)a) * Random());
return c;
}
change your Uniform function like that.

Related

Why are the random numbers not properly scattered?

My task is to implement a random number generator using LCG algorithm.
The task is to generate 1000 axes (x,y) between [-1, 1] and print them on a pane.
If the point is inside the circle of radius 1.0, it will be printed as Red.
Otherwise, Blue.
I used the parameters suggested by Numerical Recipes suggested in this YouTube video. I am following the coding style used in this link.
I am using ZedGraph to show my plots.
Why are the random numbers not properly scattered on the pane?
And, where are the blue points?
Random number generator class:
class MyRandom
{
long m = 4294967296;// modulus
long a = 1664525; // multiplier
long c = 1013904223; // increment
public long nextRandomInt(long seed)
{
return (((a * seed + c) % m));
}
private double nextRandomDouble(long seed)
{
return (2 * (nextRandomInt(seed) / m)) - 1;
}
public double nextRandomDouble(double seed)
{
double new_seed = seed + 1.0;
new_seed = new_seed / 2.0;
new_seed = new_seed * m;
long long_seed = Convert.ToInt64(new_seed);
double new_s = nextRandomInt(long_seed);
new_s = new_s / m;
new_s = new_s * 2;
new_s = new_s - 1;
return new_s;
}
}
Output
Additional Source Code:
Driver Program:
class Program
{
static void Main(string[] args)
{
int N = 1000;
double radius = 1.0;
List<double> rx = new List<double>(); rx.Add(0.0);
List<double> ry = new List<double>(); ry.Add(1.0);
MyRandom r = new MyRandom();
for (int i = 0; i < N; i++)
{
double x = r.nextRandomDouble(rx[rx.Count - 1]);
double y = r.nextRandomDouble(ry[ry.Count - 1]);
rx.Add(x);
ry.Add(y);
}
PlotForm form = new PlotForm();
ZedGraphControl zgControl = form.ZedGrapgControl;
//// get a reference to the GraphPane
GraphPane gPane = zgControl.GraphPane;
gPane.Title.Text = "Random Numbers";
gPane.XAxis.Type = AxisType.Linear;
PointPairList insideCircleList = new PointPairList();
PointPairList outsideCircleList = new PointPairList();
for (int i = 0; i < N; i++)
{
double x = rx[i];
double y = ry[i];
if ((x * x + y * y) < radius)
{
insideCircleList.Add(x, y);
}
else
{
outsideCircleList.Add(x, y);
}
}
LineItem redCurve = gPane.AddCurve("Inside", insideCircleList, Color.Red, SymbolType.Circle);
redCurve.Line.IsVisible = false;
redCurve.Symbol.Fill.Type = FillType.Solid;
LineItem blueCurve = gPane.AddCurve("Outside", outsideCircleList, Color.Blue, SymbolType.Circle);
blueCurve.Line.IsVisible = false;
zgControl.AxisChange();
form.ShowDialog();
Console.ReadLine();
}
}
WinForms Code:
public partial class PlotForm : Form
{
public ZedGraph.ZedGraphControl ZedGrapgControl { get; set; }
public PlotForm()
{
InitializeComponent();
ZedGrapgControl = this.zgc;
}
}
seed should not be a parameter of the random function, but a field within a random class. You set it once, and it changes with every random call.
But to answer your question, you don't save new_seed anywhere. It has to be saved so that it can be used in the next random call. So, your seed is just incremented by one in every new call, and this makes the graph a straight line.
Try using regular C# Random class (https://learn.microsoft.com/en-us/dotnet/api/system.random?view=netframework-4.8), it doesn't seem worth it to "roll your own" in this case.
I think this will be good enough for your purpose.
The current implementation of the Random class is based on a modified
version of Donald E. Knuth's subtractive random number generator
algorithm.
https://learn.microsoft.com/en-us/dotnet/api/system.random?view=netframework-4.8#remarks
If that doesn't meet your requirements you can look into:
https://learn.microsoft.com/en-us/dotnet/api/system.security.cryptography.rngcryptoserviceprovider?view=netframework-4.8

Fast comparison of two doubles ignoring everything that comes after 6 decimal digits

I try to optimize the performance of some calculation process.
Decent amount of time is wasted on calculations like the following:
var isBigger = Math.Abs((long) (a * 1e6) / 1e6D) > ((long) ((b + c) * 1e6)) / 1e6D;
where "a","b" and "c" are doubles, "b" and "c" are positive, "a" might be negative.
isBigger should be true only if absolute value of "a" is bigger than "b+c" disregarding anything after the 6th decimal digit.
So I look at this expression, I understand what it does, but it seems hugely inefficient to me, since it multiplies and divides compared numbers by million just to get rig of anything after 6 decimal places.
Below is the program I used to try and create a better solution. So far I failed.
Can someone help me?
class Program
{
static void Main(string[] args)
{
var arrLength = 1000000;
var arr1 = GetArrayOf_A(arrLength);
var arr2 = GetArrayOf_B(arrLength);
var arr3 = GetArrayOf_C(arrLength);
var result1 = new bool[arrLength];
var result2 = new bool[arrLength];
var sw = new Stopwatch();
sw.Start();
for (var i = 0; i < arrLength; i++)
{
result1[i] = Math.Abs((long) (arr1[i] * 1e6) / 1e6D)
>
(long) ((arr2[i] + arr3[i]) * 1e6) / 1e6D;
}
sw.Stop();
var t1 = sw.Elapsed.TotalMilliseconds;
sw.Restart();
for (var i = 0; i < arrLength; i++)
{
//result2[i] = Math.Round(Math.Abs(arr1[i]) - (arr2[i] + arr3[i]),6) > 0; // Incorrect, example by index = 0
//result2[i] = Math.Abs(arr1[i]) - (arr2[i] + arr3[i]) > 0.000001; // Incorrect, example by index = 1
//result2[i] = Math.Abs(arr1[i]) - (arr2[i] + arr3[i]) > 0.0000001; // Incorrect, example by index = 2
result2[i] = Math.Abs(arr1[i]) - (arr2[i] + arr3[i]) > 0.00000001; // Incorrect, example by index = 3
}
sw.Stop();
var t2 = sw.Elapsed.TotalMilliseconds;
var areEquivalent = true;
for (var i = 0; i < arrLength; i++)
{
if (result1[i] == result2[i]) continue;
areEquivalent = false;
break;
}
Console.WriteLine($"Functions are equivalent : {areEquivalent}");
if (areEquivalent)
{
Console.WriteLine($"Current function total time: {t1}ms");
Console.WriteLine($"Equivalent function total time: {t2}ms");
}
Console.WriteLine("Press ANY key to quit . . .");
Console.ReadKey();
}
private static readonly Random _rand = new Random(DateTime.Now.Millisecond);
private const int NumberOfRepresentativeExamples = 4;
private static double[] GetArrayOf_A(int arrLength)
{
if(arrLength<=NumberOfRepresentativeExamples)
throw new ArgumentException($"{nameof(arrLength)} should be bigger than {NumberOfRepresentativeExamples}");
var arr = new double[arrLength];
// Representative numbers
arr[0] = 2.4486382579120365;
arr[1] = -1.1716818990000011;
arr[2] = 5.996414627393257;
arr[3] = 6.0740085822069;
// the rest is to build time statistics
FillTheRestOfArray(arr);
return arr;
}
private static double[] GetArrayOf_B(int arrLength)
{
if(arrLength<=NumberOfRepresentativeExamples)
throw new ArgumentException($"{nameof(arrLength)} should be bigger than {NumberOfRepresentativeExamples}");
var arr = new double[arrLength];
// Representative numbers
arr[0] = 2.057823225;
arr[1] = 0;
arr[2] = 2.057823225;
arr[3] = 2.060649901;
// the rest is to build time statistics
FillTheRestOfArray(arr);
return arr;
}
private static double[] GetArrayOf_C(int arrLength)
{
if(arrLength<=NumberOfRepresentativeExamples)
throw new ArgumentException($"{nameof(arrLength)} should be bigger than {NumberOfRepresentativeExamples}");
var arr = new double[arrLength];
// Representative numbers
arr[0] = 0.3908145999796302;
arr[1] = 1.1716809269999997;
arr[2] = 3.9385910820740282;
arr[3] = 4.0133582670728858;
// the rest is to build time statistics
FillTheRestOfArray(arr);
return arr;
}
private static void FillTheRestOfArray(double[] arr)
{
for (var i = NumberOfRepresentativeExamples; i < arr.Length; i++)
{
arr[i] = _rand.Next(0, 10) + _rand.NextDouble();
}
}
}
You don't need the division since if (x/100) < (y/100) that means that x<y.
for(var i = 0; i < arrLength; i++)
{
result2[i] = Math.Abs((long)(arr1[i] * 1e6))
> (long)((arr2[i] + arr3[i]) * 1e6);
}
with the results for me:
Arrays have 1000000 elements.
Functions are equivalent : True
Current function total time: 40.10ms 24.94 kflop
Equivalent function total time: 22.42ms 44.60 kflop
A speedup of 78.83 %
PS. Make sure you compare RELEASE versions of the binary which includes math optimizations.
PS2. The display code is
Console.WriteLine($"Arrays have {arrLength} elements.");
Console.WriteLine($"Functions are equivalent : {areEquivalent}");
Console.WriteLine($" Current function total time: {t1:F2}ms {arrLength/t1/1e3:F2} kflop");
Console.WriteLine($"Equivalent function total time: {t2:F2}ms {arrLength/t2/1e3:F2} kflop");
Console.WriteLine($"An speedup of {t1/t2-1:P2}");
Overall your question goes into the Area of Realtime Programming. Not nessesarily realtime constraint, but it goes into teh same optimisation territory. The kind where every last nanosecond has be shaved off.
.NET is not the ideal scenario for this kind of operation. Usually that thing is done in dedicated lanagauges. The next best thing is doing it in Assembler, C or native C++. .NET has additional features like the Garbage Collector and Just In Time compiler that make even getting reliable benchmark results tricky. Much less reliale runtime performance.
For the datatypes, Float should be about the fastest operation there is. For historical reasons float opeations have been optimized.
One of your comment mentions physics and you do have an array. And I see stuff like array[i] = array2[i] + array3[i]. So maybe this should be a matrix operation you run on the GPU instead? This kind of "huge paralellized array opeartions" is exactly what the GPU is good at. Exactly what drawing on the screen is at it's core.
Unless you tell us what you are actually doing here as sa operation, that is about the best answer I can give.
Is this what you're looking for?
Math.Abs(a) - (b + c) > 0.000001
or if you want to know if the difference is bigger (difference either way):
Math.Abs(Math.Abs(a) - (b + c)) > 0.000001
(I'm assuming you're not limiting to this precision because of speed but because of inherent floating point limited precision.)
In addition to asking this question on this site, I also asked a good friend of mine, and so far he provided the best answer. Here it is:
result2[i] = Math.Abs(arr1[i]) - (arr2[i] + arr3[i]) > 0.000001 ||
Math.Abs((long)(arr1[i] * 1e6)) > (long)((arr2[i] + arr3[i])*1e6);
I am happy to have such friends :)

#MathdotNet How to find the parameters of Herschel-Bulkley model through Nonlinear Regression in Math.NET?

First, I would like to thank everyone involved in this magnificent project, Math.NET saved my life!
I have few questions about the linear and nonlinear regression, I am a civil engineer and when I was working on my Master's degree, I needed to develop a C# application that calculates the Rheological parameters of concrete based on data acquired from a test.
One of the models that describes the rheological behavior of concrete is the "Herschel-Bulkley model" and it has this formula :
y = T + K*x^n
x (the shear-rate), y (shear-stress) are the values obtained from the test, while T,K and N are the parameters I need to determine.
I know that the value of "T" is between 0 and Ymin (Ymin is the smallest data point from the test), so here is what I did:
Since it is nonlinear equation, I had to make it linear, like this :
ln(y-T) = ln(K) + n*ln(x)
creat an array of possible values of T, from 0 to Ymin, and try each value in the equation,
then through linear regression I find the values of K and N,
then calculate the SSD, and store the results in an array,
after I finish all the possible values of T, I see which one had the smallest SSD, and use it to find the optimal K and N .
This method works, but I feel it is not as smart or elegant as it should be, there must be a better way to do it, and I was hoping to find it here, it is also very slow.
here is the code that I used:
public static double HerschelBulkley(double shearRate, double tau0, double k, double n)
{
var t = tau0 + k * Math.Pow(shearRate, n);
return t;
}
public static (double Tau0, double K, double N, double DeltaMin, double RSquared) HerschelBulkleyModel(double[] shear, double[] shearRate, double step = 1000.0)
{
// Calculate the number values from 0.0 to Shear.Min;
var sm = (int) Math.Floor(shear.Min() * step);
// Populate the Array of Tau0 with the values from 0 to sm
var tau0Array = Enumerable.Range(0, sm).Select(t => t / step).ToArray();
var kArray = new double[sm];
var nArray = new double[sm];
var deltaArray = new double[sm];
var rSquaredArray = new double[sm];
var shearRateLn = shearRate.Select(s => Math.Log(s)).ToArray();
for (var i = 0; i < sm; i++)
{
var shearLn = shear.Select(s => Math.Log(s - tau0Array[i])).ToArray();
var param = Fit.Line(shearRateLn, shearLn);
kArray[i] = Math.Exp(param.Item1);
nArray[i] = param.Item2;
var shearHerschel = shearRate.Select(sr => HerschelBulkley(sr, tau0Array[i], kArray[i], nArray[i])).ToArray();
deltaArray[i] = Distance.SSD(shearHerschel, shear);
rSquaredArray[i] = GoodnessOfFit.RSquared(shearHerschel, shear);
}
var deltaMin = deltaArray.Min();
var index = Array.IndexOf(deltaArray, deltaMin);
var tau0 = tau0Array[index];
var k = kArray[index];
var n = nArray[index];
var rSquared = rSquaredArray[index];
return (tau0, k, n, deltaMin, rSquared);
}

C# OpenCL GPU implementation for double array math

How can I make the for loop of this function to use the GPU with OpenCL?
public static double[] Calculate(double[] num, int period)
{
var final = new double[num.Length];
double sum = num[0];
double coeff = 2.0 / (1.0 + period);
for (int i = 0; i < num.Length; i++)
{
sum += coeff * (num[i] - sum);
final[i] = sum;
}
return final;
}
Your problem as written does not fit well with something that would work on a GPU. You cannot parallelize (in a way that improves performance) the operation on a single array because the value of the nth element depends on elements 1 to n. However, you can utilize the GPU to process multiple arrays, where each GPU core operates on a separate array.
The full code for the solution is at the end of the answer, but the results of the test, to calculate on 10,000 arrays each of which has 10,000 elements, generates the following (on a GTX1080M and an i7 7700k with 32GB RAM):
Task Generating Data: 1096.4583ms
Task CPU Single Thread: 596.2624ms
Task CPU Parallel: 179.1717ms
GPU CPU->GPU: 89ms
GPU Execute: 86ms
GPU GPU->CPU: 29ms
Task Running GPU: 921.4781ms
Finished
In this test, we measure the speed at which we can generate results into a managed C# array using the CPU with one thread, the CPU with all threads, and finally the GPU using all cores. We validate that the results from each test are identical, using the function AreTheSame.
The fastest time is processing the arrays on the CPU using all threads (Task CPU Parallel: 179ms).
The GPU is actually the slowest (Task Running GPU: 922ms), but this is because of the time taken to reformat the C# arrays in a way that they can be transferred onto the GPU.
If this bottleneck were removed (which is quite possible, depending on your use case), the GPU could potentially be the fastest. If the data were already formatted in a manner that can be immediately be transferred onto the GPU, the total processing time for the GPU would be 204ms (CPU->GPU: 89ms + Execute: 86ms + GPU->CPU: 29 ms = 204ms). This is still slower than the parallel CPU option, but on a different sort of data set, it might be faster.
To get the data back from the GPU (the most important part of actually using the GPU), we use the function ComputeCommandQueue.Read. This transfers the altered array on the GPU back to the CPU.
To run the following code, reference the Cloo Nuget Package (I used 0.9.1). And make sure to compile on x64 (you will need the memory). You may need to update your graphics card driver too if it fails to find an OpenCL device.
class Program
{
static string CalculateKernel
{
get
{
return #"
kernel void Calc(global int* offsets, global int* lengths, global double* doubles, double periodFactor)
{
int id = get_global_id(0);
int start = offsets[id];
int length = lengths[id];
int end = start + length;
double sum = doubles[start];
for(int i = start; i < end; i++)
{
sum = sum + periodFactor * ( doubles[i] - sum );
doubles[i] = sum;
}
}";
}
}
public static double[] Calculate(double[] num, int period)
{
var final = new double[num.Length];
double sum = num[0];
double coeff = 2.0 / (1.0 + period);
for (int i = 0; i < num.Length; i++)
{
sum += coeff * (num[i] - sum);
final[i] = sum;
}
return final;
}
static void Main(string[] args)
{
int maxElements = 10000;
int numArrays = 10000;
int computeCores = 2048;
double[][] sets = new double[numArrays][];
using (Timer("Generating Data"))
{
Random elementRand = new Random(1);
for (int i = 0; i < numArrays; i++)
{
sets[i] = GetRandomDoubles(elementRand.Next((int)(maxElements * 0.9), maxElements), randomSeed: i);
}
}
int period = 14;
double[][] singleResults;
using (Timer("CPU Single Thread"))
{
singleResults = CalculateCPU(sets, period);
}
double[][] parallelResults;
using (Timer("CPU Parallel"))
{
parallelResults = CalculateCPUParallel(sets, period);
}
if (!AreTheSame(singleResults, parallelResults)) throw new Exception();
double[][] gpuResults;
using (Timer("Running GPU"))
{
gpuResults = CalculateGPU(computeCores, sets, period);
}
if (!AreTheSame(singleResults, gpuResults)) throw new Exception();
Console.WriteLine("Finished");
Console.ReadKey();
}
public static bool AreTheSame(double[][] a1, double[][] a2)
{
if (a1.Length != a2.Length) return false;
for (int i = 0; i < a1.Length; i++)
{
var ar1 = a1[i];
var ar2 = a2[i];
if (ar1.Length != ar2.Length) return false;
for (int j = 0; j < ar1.Length; j++)
if (Math.Abs(ar1[j] - ar2[j]) > 0.0000001) return false;
}
return true;
}
public static double[][] CalculateGPU(int partitionSize, double[][] sets, int period)
{
ComputeContextPropertyList cpl = new ComputeContextPropertyList(ComputePlatform.Platforms[0]);
ComputeContext context = new ComputeContext(ComputeDeviceTypes.Gpu, cpl, null, IntPtr.Zero);
ComputeProgram program = new ComputeProgram(context, new string[] { CalculateKernel });
program.Build(null, null, null, IntPtr.Zero);
ComputeCommandQueue commands = new ComputeCommandQueue(context, context.Devices[0], ComputeCommandQueueFlags.None);
ComputeEventList events = new ComputeEventList();
ComputeKernel kernel = program.CreateKernel("Calc");
double[][] results = new double[sets.Length][];
double periodFactor = 2d / (1d + period);
Stopwatch sendStopWatch = new Stopwatch();
Stopwatch executeStopWatch = new Stopwatch();
Stopwatch recieveStopWatch = new Stopwatch();
int offset = 0;
while (true)
{
int first = offset;
int last = Math.Min(offset + partitionSize, sets.Length);
int length = last - first;
var merged = Merge(sets, first, length);
sendStopWatch.Start();
ComputeBuffer<int> offsetBuffer = new ComputeBuffer<int>(
context,
ComputeMemoryFlags.ReadWrite | ComputeMemoryFlags.UseHostPointer,
merged.Offsets);
ComputeBuffer<int> lengthsBuffer = new ComputeBuffer<int>(
context,
ComputeMemoryFlags.ReadWrite | ComputeMemoryFlags.UseHostPointer,
merged.Lengths);
ComputeBuffer<double> doublesBuffer = new ComputeBuffer<double>(
context,
ComputeMemoryFlags.ReadWrite | ComputeMemoryFlags.UseHostPointer,
merged.Doubles);
kernel.SetMemoryArgument(0, offsetBuffer);
kernel.SetMemoryArgument(1, lengthsBuffer);
kernel.SetMemoryArgument(2, doublesBuffer);
kernel.SetValueArgument(3, periodFactor);
sendStopWatch.Stop();
executeStopWatch.Start();
commands.Execute(kernel, null, new long[] { merged.Lengths.Length }, null, events);
executeStopWatch.Stop();
using (var pin = Pinned(merged.Doubles))
{
recieveStopWatch.Start();
commands.Read(doublesBuffer, false, 0, merged.Doubles.Length, pin.Address, events);
commands.Finish();
recieveStopWatch.Stop();
}
for (int i = 0; i < merged.Lengths.Length; i++)
{
int len = merged.Lengths[i];
int off = merged.Offsets[i];
var res = new double[len];
Array.Copy(merged.Doubles,off,res,0,len);
results[first + i] = res;
}
offset += partitionSize;
if (offset >= sets.Length) break;
}
Console.WriteLine("GPU CPU->GPU: " + recieveStopWatch.ElapsedMilliseconds + "ms");
Console.WriteLine("GPU Execute: " + executeStopWatch.ElapsedMilliseconds + "ms");
Console.WriteLine("GPU GPU->CPU: " + sendStopWatch.ElapsedMilliseconds + "ms");
return results;
}
public static PinnedHandle Pinned(object obj) => new PinnedHandle(obj);
public class PinnedHandle : IDisposable
{
public IntPtr Address => handle.AddrOfPinnedObject();
private GCHandle handle;
public PinnedHandle(object val)
{
handle = GCHandle.Alloc(val, GCHandleType.Pinned);
}
public void Dispose()
{
handle.Free();
}
}
public class MergedResults
{
public double[] Doubles { get; set; }
public int[] Lengths { get; set; }
public int[] Offsets { get; set; }
}
public static MergedResults Merge(double[][] sets, int offset, int length)
{
List<int> lengths = new List<int>(length);
List<int> offsets = new List<int>(length);
for (int i = 0; i < length; i++)
{
var arr = sets[i + offset];
lengths.Add(arr.Length);
}
var totalLength = lengths.Sum();
double[] doubles = new double[totalLength];
int dataOffset = 0;
for (int i = 0; i < length; i++)
{
var arr = sets[i + offset];
Array.Copy(arr, 0, doubles, dataOffset, arr.Length);
offsets.Add(dataOffset);
dataOffset += arr.Length;
}
return new MergedResults()
{
Doubles = doubles,
Lengths = lengths.ToArray(),
Offsets = offsets.ToArray(),
};
}
public static IDisposable Timer(string name)
{
return new SWTimer(name);
}
public class SWTimer : IDisposable
{
private Stopwatch _sw;
private string _name;
public SWTimer(string name)
{
_name = name;
_sw = Stopwatch.StartNew();
}
public void Dispose()
{
_sw.Stop();
Console.WriteLine("Task " + _name + ": " + _sw.Elapsed.TotalMilliseconds + "ms");
}
}
public static double[][] CalculateCPU(double[][] arrays, int period)
{
double[][] results = new double[arrays.Length][];
for (var index = 0; index < arrays.Length; index++)
{
var arr = arrays[index];
results[index] = Calculate(arr, period);
}
return results;
}
public static double[][] CalculateCPUParallel(double[][] arrays, int period)
{
double[][] results = new double[arrays.Length][];
Parallel.For(0, arrays.Length, i =>
{
var arr = arrays[i];
results[i] = Calculate(arr, period);
});
return results;
}
static double[] GetRandomDoubles(int num, int randomSeed)
{
Random r = new Random(randomSeed);
var res = new double[num];
for (int i = 0; i < num; i++)
res[i] = r.NextDouble() * 0.9 + 0.05;
return res;
}
}
as commenter Cory stated refer to this link for setup.
How to use your GPU in .NET
Here is how you would use this project:
Add the Nuget Package Cloo
Add reference to OpenCLlib.dll
Download OpenCLLib.zip
Add using OpenCL
static void Main(string[] args)
{
int[] Primes = { 1,2,3,4,5,6,7 };
EasyCL cl = new EasyCL();
cl.Accelerator = AcceleratorDevice.GPU;
cl.LoadKernel(IsPrime);
cl.Invoke("GetIfPrime", 0, Primes.Length, Primes, 1.0);
}
static string IsPrime
{
get
{
return #"
kernel void GetIfPrime(global int* num, int period)
{
int index = get_global_id(0);
int sum = (2.0 / (1.0 + period)) * (num[index] - num[0]);
printf("" %d \n"",sum);
}";
}
}
for (int i = 0; i < num.Length; i++)
{
sum += coeff * (num[i] - sum);
final[i] = sum;
}
means first element is multiplied by coeff 1 time and subtracted from 2nd element. First element also multiplied by square of coeff and this time added to 3rd element. Then first element multiplied by cube of coeff and subtracted from 4th element.
This is going like this:
-e0*c*c*c + e1*c*c - e2*c = f3
e0*c*c*c*c - e1*c*c*c + e2*c*c - e3*c = f4
-e0*c*c*c*c*c + e1*c*c*c*c - e2*c*c*c + e3*c*c - e4*c =f5
For all elements, scan through for all smaller id elements and compute this:
if difference of id values(lets call it k) of elements is odd, take subtraction, if not then take addition. Before addition or subtraction, multiply that value by k-th power of coeff. Lastly, multiply the current num value by coefficient and add it to current cell. Current cell value is final(i).
This is O(N*N) and looks like an all-pairs compute kernel. An example using an open-source C# OpenCL project:
ClNumberCruncher cruncher = new ClNumberCruncher(ClPlatforms.all().gpus(), #"
__kernel void foo(__global double * num, __global double * final, __global int *parameters)
{
int threadId = get_global_id(0);
int period = parameters[0];
double coeff = 2.0 / (1.0 + period);
double sumOfElements = 0.0;
for(int i=0;i<threadId;i++)
{
// negativity of coeff is to select addition or subtraction for different powers of coeff
double powKofCoeff = pow(-coeff,threadId-i);
sumOfElements += powKofCoeff * num[i];
}
final[threadId] = sumOfElements + num[threadId] * coeff;
}
");
cruncher.performanceFeed = true; // getting benchmark feedback on console
double[] numArray = new double[10000];
double[] finalArray = new double[10000];
int[] parameters = new int[10];
int period = 15;
parameters[0] = period;
ClArray<double> numGpuArray = numArray;
numGpuArray.readOnly = true; // gpus read this from host
ClArray<double> finalGpuArray = finalArray; // finalArray will have results
finalGpuArray.writeOnly = true; // gpus write this to host
ClArray<int> parametersGpu = parameters;
parametersGpu.readOnly = true;
// calculate kernels with exact same ordering of parameters
// num(double),final(double),parameters(int)
// finalGpuArray points to __global double * final
numGpuArray.nextParam(finalGpuArray, parametersGpu).compute(cruncher, 1, "foo", 10000, 100);
// first compute always lags because of compiling the kernel so here are repeated computes to get actual performance
numGpuArray.nextParam(finalGpuArray, parametersGpu).compute(cruncher, 1, "foo", 10000, 100);
numGpuArray.nextParam(finalGpuArray, parametersGpu).compute(cruncher, 1, "foo", 10000, 100);
Results are on finalArray array for 10000 elements, using 100 workitems per workitem-group.
GPGPU part takes 82ms on a rx550 gpu which has very low ratio of 64bit-to-32bit compute performance(because consumer gaming cards are not good at double precision for new series). An Nvidia Tesla or an Amd Vega would easily compute this kernel without crippled performance. Fx8150(8 cores) completes in 683ms. If you need to specifically select only an integrated-GPU and its CPU, you can use
ClPlatforms.all().gpus().devicesWithHostMemorySharing() + ClPlatforms.all().cpus() when creating ClNumberCruncher instance.
binaries of api:
https://www.codeproject.com/Articles/1181213/Easy-OpenCL-Multiple-Device-Load-Balancing-and-Pip
or source code to compile on your pc:
https://github.com/tugrul512bit/Cekirdekler
if you have multiple gpus, it uses them without any extra code. Including a cpu to the computations would pull gpu effectiveness down in this sample for first iteration (repeatations complete in 76ms with cpu+gpu) so its better to use 2-3 GPU instead of CPU+GPU.
I didn't check numerical stability(you should use Kahan-Summation when adding millions or more values into same variable but I didn't use it for readability and don't have an idea about if 64-bit values need this too like 32-bit ones) or any value correctness, you should do it. Also foo kernel is not optimized. It makes %50 of core times idle so it should be better scheduled like this:
thread-0: compute element 0 and element N-1
thread-1: compute element 1 and element N-2
thread-m: compute element N/2-1 and element N/2
so all workitems get similar amount of work. On top of this, using 100 for workgroup size is not optimal. It should be something like 128,256,512 or 1024(for Nvidia) but this means array size should also be an integer multiple of this too. Then it would need extra control logic in the kernel to not go out of array borders. For even more performance, for loop could have multiple partial sums to do a "loop unrolling".

Implementing a neural network in C#

I'm following the tutorial at this link: http://www.c-sharpcorner.com/UploadFile/rmcochran/AI_OOP_NeuralNet06192006090112AM/AI_OOP_NeuralNet.aspx
I'm new to neural networking and I'm trying to edit the example in the above tutorial to match my problem. I'm using multiple regression to find coefficients for 3 different sets of data and I then calculate the rsquared value for each set of data. I'm trying to create a neural network that will change the coefficient value to get the rsquared value as close to 100 as possible.
This is how I establish the coefficient and find the rsquared value for that coefficient. All 3 coefficients use these same methods:
Calculations calc = new Calculations();
Vector<double> lowRiskCoefficient = MultipleRegression.QR( Matrix<double>.Build.DenseOfColumnArrays(lowRiskShortRatingList.ToArray(), lowRiskMediumRatingList.ToArray(), lowRiskLongRatingList.ToArray()), Vector<double>.Build.Dense(lowRiskWeekReturnList.ToArray()));
decimal lowRiskShortCoefficient = Convert.ToDecimal(lowRiskCoefficient[0]);
decimal lowRiskMediumCoefficient = Convert.ToDecimal(lowRiskCoefficient[1]);
decimal lowRiskLongCoefficient = Convert.ToDecimal(lowRiskCoefficient[2]);
List<decimal> lowRiskWeekReturnDecimalList = new List<decimal>(lowRiskWeekReturnList.Count);
lowRiskWeekReturnList.ForEach(i => lowRiskWeekReturnDecimalList.Add(Convert.ToDecimal(i)));
List<decimal> lowRiskPredictedReturnList = new List<decimal>(lowRiskWeekReturnList.Count);
List<decimal> lowRiskResidualValueList = new List<decimal>(lowRiskWeekReturnList.Count);
for (int i = 0; i < lowRiskWeekReturnList.Count; i++)
{
decimal lowRiskPredictedValue = (Convert.ToDecimal(lowRiskShortRatingList.ElementAtOrDefault(i)) * lowRiskShortCoefficient) + (Convert.ToDecimal(lowRiskMediumRatingList.ElementAtOrDefault(i)) * lowRiskMediumCoefficient) +
(Convert.ToDecimal(lowRiskLongRatingList.ElementAtOrDefault(i)) * lowRiskLongCoefficient);
lowRiskPredictedReturnList.Add(lowRiskPredictedValue);
lowRiskResidualValueList.Add(calc.calculateResidual(lowRiskWeekReturnDecimalList.ElementAtOrDefault(i), lowRiskPredictedValue));
}
decimal lowRiskTotalSumofSquares = calc.calculateTotalSumofSquares(lowRiskWeekReturnDecimalList, lowRiskWeekReturnDecimalList.Average());
decimal lowRiskTotalSumofRegression = calc.calculateTotalSumofRegression(lowRiskPredictedReturnList, lowRiskWeekReturnDecimalList.Average());
decimal lowRiskTotalSumofErrors = calc.calculateTotalSumofErrors(lowRiskResidualValueList);
decimal lowRiskRSquared = lowRiskTotalSumofRegression / lowRiskTotalSumofSquares;
This is the example that performs the training and I'm currently stuck on how to change this example to match what I'm trying to do.
private void button1_Click(object sender, EventArgs e)
{
net = new NeuralNet();
double high, mid, low;
high = .9;
low = .1;
mid = .5;
// initialize with
// 2 perception neurons
// 2 hidden layer neurons
// 1 output neuron
net.Initialize(1, 2, 2, 1);
double[][] input = new double[4][];
input[0] = new double[] {high, high};
input[1] = new double[] {low, high};
input[2] = new double[] {high, low};
input[3] = new double[] {low, low};
double[][] output = new double[4][];
output[0] = new double[] { low };
output[1] = new double[] { high };
output[2] = new double[] { high };
output[3] = new double[] { low };
double ll, lh, hl, hh;
int count;
count = 0;
do
{
count++;
for (int i = 0; i < 100; i++)
net.Train(input, output);
net.ApplyLearning();
net.PerceptionLayer[0].Output = low;
net.PerceptionLayer[1].Output = low;
net.Pulse();
ll = net.OutputLayer[0].Output;
net.PerceptionLayer[0].Output = high;
net.PerceptionLayer[1].Output = low;
net.Pulse();
hl = net.OutputLayer[0].Output;
net.PerceptionLayer[0].Output = low;
net.PerceptionLayer[1].Output = high;
net.Pulse();
lh = net.OutputLayer[0].Output;
net.PerceptionLayer[0].Output = high;
net.PerceptionLayer[1].Output = high;
net.Pulse();
hh = net.OutputLayer[0].Output;
}
while (hh > mid || lh < mid || hl < mid || ll > mid);
MessageBox.Show((count*100).ToString() + " iterations required for training");
}
How do I use this information to create a neural network to find the coefficient that will in turn have a rsquared value as close to 100 as possible?
Instead of building one, you can use Neuroph framework built in the .NET by using the Neuroph.NET from here https://github.com/starhash/Neuroph.NET/releases/tag/v1.0-beta
It is a light conversion of the original Neuroph they did for the JAVA platform.
Hope this helps you.

Categories