Character Recognizing using Aforge.net Nural network - c#

I am trying to recognize 0 to 9 digits using Aforge.net . I tried everything but I am still unable to get result please look at my program and why I am unable to recognize digits. Problem may be in number of hidden layers, learning rate or input data , I have tried it by changing number of hidden layers and learning rate. Please suggest ideas.
// opening file
OpenFileDialog open = new OpenFileDialog();
ActivationNetwork enactivation = new ActivationNetwork(new BipolarSigmoidFunction(1), 3886,10, 10);
double[][] input = new double[10][];
double[][] output = new double[10][];
//generating input data using Feature class -- which code is given below
Feature feature = new Feature();
//iterating for all 10 digits.
for (int i = 0; i < 10; i++)
{
open.ShowDialog();
Bitmap bitmap = new Bitmap(open.FileName);
double[] features = feature.features(bitmap);
input[i] = features;
features = feature.features(bitmap);
output[i] = feature.features(bitmap);
}
enactivation.Randomize();
BackPropagationLearning learn = new BackPropagationLearning(enactivation);
//learning
learn.LearningRate = 0.005f;
learn.Momentum = 0.005f;
double errora;
int iteration = 0;
while (true)
{
errora = learn.RunEpoch(input, output);
if (errora < 0.0006)
break;
else if (iteration > 23000)
break;
iteration++;
// Console.WriteLine("error {0} {1} ", errora, iteration);
}
double[] sample;
open.ShowDialog();
Bitmap temp = new Bitmap(open.FileName);
// providing input for computation using feature class
sample = feature.features(temp);
foreach (double daa in enactivation.Compute(sample))
{
Console.WriteLine(daa);
}
Class Feature for providing input for training nural network
class Feature
{
public double[] features(Bitmap bitmap)
{
//feature
double[] feature = new double[bitmap.Width * bitmap.Height];
int featurec = 0;
for (int vert = 0; vert < bitmap.Height; vert++)
{
for (int horizantal = 0; horizantal < bitmap.Width; horizantal++)
{
feature[featurec] = bitmap.GetPixel(horizantal, vert).ToArgb();
if (feature[featurec] < 1)
{
feature[featurec] = -0.5;
}
else
{
feature[featurec] = 0.5;
}
featurec++;
}
}
return feature;
}
}

I haven't used aforge, but re. using backprop neural nets for this problem:
You need something like a 10x10 input grid with each cell in the grid getting 1/100 of the image
You need at least one, possibly 2, hidden layers
The net will train faster with a bias input - meaning a source of a fixed value - for each cell (this lets the cells train faster: Role of Bias in Neural Networks)
I'd never start in bp mode but always run something a statistical annealing first. Bp is for descending inside a local minimum once one is found
Also:
Have you successfully used aforge for other problems?
What happens when you try to train the net?

Related

Peak generations for WaveSurfer.js using CSCore

I am trying to generate peaks using CSCore for WaveSurfer.js. I am essentially just trying to get peak value so it could be fed to the WaveSurfer.js element as prerendered peaks. Using CSCore as an alternative to AudioWaveForm.
Here is the code I am using:
var audioFile = CodecFactory.Instance.GetCodec("input.mp3");
var source = audioFile.ToSampleSource();
var peakMeter = new PeakMeter(source) { Interval = 40 };
var peakData = new float[source.Length / source.WaveFormat.BytesPerSample];
int read;
int i = 0;
while ((read = peakMeter.Read(peakData, i, peakData.Length - i)) > 0)
{
i += read;
}
// Convert the peak values from dB to linear scale
for (int j = 0; j < peakData.Length; j++)
{
decimal num = (decimal)peakData[j] * 100000;
var e = $"{num},";
File.AppendAllText("out.txt", e.ToString());
//peakData[j] = (float)Math.Pow(10, peakData[j] / 20);
}
I am trying to get a CSV or and array of values. Is this correct because I am getting wildly different results. I am new to CSCore and C# as a whole so any help would be helpful.
Thanks
When using the PeakMeter you have to use its PeakCalculated event which will provide the peaks. It gets fired while reading all samples as you are already doing using the Read method. So keep calling read till it returns zero and collect the peaks using the mentioned event.

Tensorflowsharp results getvalue() is very slow

I am using TensorflowSharp to run evaluations using a neural network on an Android phone. I am building the project with Unity.
I am using the tensorflowsharp unity plugin listed under the requirements here: https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Using-TensorFlow-Sharp-in-Unity.md.
Everything is working, however extracting the result is very slow.
The network I am running is an autoencoder and the output is an image with dimensions of 128x128x16 (yes there is a lot of output channels).
The evaluation is done in ~ 0.2 seconds which is acceptable. However when i need to extract the result data using results[0].GetValue() it is VERY slow.
This is my code where i run the neural network
var runner = session.GetRunner();
runner.AddInput(graph[INPUT_NAME][0], tensor).Fetch(graph[OUTPUT_NAME][0]);
var results = runner.Run();
float[,,,] heatmaps = results[0].GetValue() as float[,,,]; // <- this is SLOW
The problem:
The last line where i convert the result to floats is taking ~1.2 seconds.
Can it realy be true that reading the result data into a float array is taking more than 5 times as long as the actual evaluation of the network?
Is there another way to extract the result values?
So I have found a solution to this. I still do not know why the GetValue() call is so slow, but I found another way to retrieve the data.
I chose to manually read the raw tensor data available at results[0].Data
I created a small function to handle this as a drop in for GetValue, (Here just with the dimensions i am expecting hardcoded)
private float[,,,] TensorToFLoats(TFTensor tensor)
{
IntPtr resData = tensor.Data;
UIntPtr dataSize = tensor.TensorByteSize;
byte[] s_ImageBuffer = new byte[(int)dataSize];
System.Runtime.InteropServices.Marshal.Copy(resData, s_ImageBuffer, 0, (int)dataSize);
int floatsLength = s_ImageBuffer.Length / 4;
float[] floats = new float[floatsLength];
for (int n = 0; n < s_ImageBuffer.Length; n += 4)
{
floats[n / 4] = BitConverter.ToSingle(s_ImageBuffer, n);
}
float[,,,] result = new float[1, 128, 128, 16];
int i = 0;
for (int y = 0; y < 128; y++)
{
for (int x = 0; x < 128; x++)
{
for (int p = 0; p < 16; p++)
{
result[0, y, x, p] = floats[i++];
}
}
}
return result;
}
Given this i can replace the code in my question with the following
var runner = session.GetRunner();
runner.AddInput(graph[INPUT_NAME][0], tensor).Fetch(graph[OUTPUT_NAME][0]);
var results = runner.Run();
float[,,,] heatmaps = TensorToFLoats(results[0]);
This is insanely much faster. Where GetValue took ~1 second the TensorToFloats function i created got the same data in ~0.02 seconds

Getting a neural network to output anything inbetween -1.0 and 1.0

I am trying to make a back propagation neural network.
Based upon the the tutorials i found here : MSDN article by James McCaffrey. He gives many examples but all his networks are based upon the same problem to solve. So his networks look like 4:7:3 >> 4input - 7hidden - 3output.
His output is always binary 0 or 1, one output gets a 1, to classify an Irish flower, into one of the three categories.
I would like to solve another problem with a neural network and that would require me 2 neural networks where one needs an output inbetween 0..255 and another inbetween 0 and 2times Pi. (a full turn, circle). Well essentially i think i need an output that range from 0.0 to 1.0 or from -1 to 1 and anything in between, so that i can multiply it to becomme 0..255 or 0..2Pi
I think his network does behave, like it does because of his computeOutputs
Which I show below here :
private double[] ComputeOutputs(double[] xValues)
{
if (xValues.Length != numInput)
throw new Exception("Bad xValues array length");
double[] hSums = new double[numHidden]; // hidden nodes sums scratch array
double[] oSums = new double[numOutput]; // output nodes sums
for (int i = 0; i < xValues.Length; ++i) // copy x-values to inputs
this.inputs[i] = xValues[i];
for (int j = 0; j < numHidden; ++j) // compute i-h sum of weights * inputs
for (int i = 0; i < numInput; ++i)
hSums[j] += this.inputs[i] * this.ihWeights[i][j]; // note +=
for (int i = 0; i < numHidden; ++i) // add biases to input-to-hidden sums
hSums[i] += this.hBiases[i];
for (int i = 0; i < numHidden; ++i) // apply activation
this.hOutputs[i] = HyperTanFunction(hSums[i]); // hard-coded
for (int j = 0; j < numOutput; ++j) // compute h-o sum of weights * hOutputs
for (int i = 0; i < numHidden; ++i)
oSums[j] += hOutputs[i] * hoWeights[i][j];
for (int i = 0; i < numOutput; ++i) // add biases to input-to-hidden sums
oSums[i] += oBiases[i];
double[] softOut = Softmax(oSums); // softmax activation does all outputs at once for efficiency
Array.Copy(softOut, outputs, softOut.Length);
double[] retResult = new double[numOutput]; // could define a GetOutputs method instead
Array.Copy(this.outputs, retResult, retResult.Length);
return retResult;
The network uses the folowing hyperTan function
private static double HyperTanFunction(double x)
{
if (x < -20.0) return -1.0; // approximation is correct to 30 decimals
else if (x > 20.0) return 1.0;
else return Math.Tanh(x);
}
In above a function makes for the output layer use of Softmax() and it is i think critical to problem here. In that I think it makes his output all binary, and it looks like this :
private static double[] Softmax(double[] oSums)
{
// determine max output sum
// does all output nodes at once so scale doesn't have to be re-computed each time
double max = oSums[0];
for (int i = 0; i < oSums.Length; ++i)
if (oSums[i] > max) max = oSums[i];
// determine scaling factor -- sum of exp(each val - max)
double scale = 0.0;
for (int i = 0; i < oSums.Length; ++i)
scale += Math.Exp(oSums[i] - max);
double[] result = new double[oSums.Length];
for (int i = 0; i < oSums.Length; ++i)
result[i] = Math.Exp(oSums[i] - max) / scale;
return result; // now scaled so that xi sum to 1.0
}
How to rewrite softmax ?
So the network will be able to give non binary answers ?
Notice the full code of the network is here. if you would like to try it out.
Also as to test the network the following accuracy function is used, maybe the binary behaviour emerges from it
public double Accuracy(double[][] testData)
{
// percentage correct using winner-takes all
int numCorrect = 0;
int numWrong = 0;
double[] xValues = new double[numInput]; // inputs
double[] tValues = new double[numOutput]; // targets
double[] yValues; // computed Y
for (int i = 0; i < testData.Length; ++i)
{
Array.Copy(testData[i], xValues, numInput); // parse test data into x-values and t-values
Array.Copy(testData[i], numInput, tValues, 0, numOutput);
yValues = this.ComputeOutputs(xValues);
int maxIndex = MaxIndex(yValues); // which cell in yValues has largest value?
int tMaxIndex = MaxIndex(tValues);
if (maxIndex == tMaxIndex)
++numCorrect;
else
++numWrong;
}
return (numCorrect * 1.0) / (double)testData.Length;
}
Just in case that someone gets into the same situation.
If you need some example code of a neural network regression
(a NNR) That's how they are called.
Here is link to sample code in C#, and here is a good article about it. Notice the guy writes more articles there, you wont find everything but there's a lot there. Despite I was following this man for a while I missed this specific article as I didn't know how they where called, when I asked the question here on stack overflow.
I'm a bit rusty at Neural Netowrks but I think, if you want to have a range of values from your output then you need to make sure your activation functions on your output layer are linear (or something that has a similar effect).
Try adding this method:
private static double[] Linear(double[] oSums)
{
double sum = oSums.Sum(d => Math.Abs(d));
double[] result = new double[oSums.Length];
for (int i = 0; i < oSums.Length; ++i)
result[i] = Math.Abs(oSums[i]) / sum;
// scaled so that xi sum to 1.0
return result;
}
And then in the ComputeOutputs method you need to use this new activation function for the output (rather than Softmax):
...
//double[] softOut = Softmax(oSums); // all outputs at once for efficiency
double[] softOut = Linear(oSums); // all outputs at once for efficiency
Array.Copy(softOut, outputs, softOut.Length);
...
This should now output linear values.

Passing a List as a parameter to a Matlab function from C# project

I'm trying to run The Fast Compressive Tracking Algorithm from my C# project. What I want is that, (in C#)to pass the images after converting them into gray scale images to the Matlab function
Runtracker
instead of having the images in the Matlab code as I'm going to apply some operations to specify the index of the array that containing the images to start tracking from.
But when I pass the list of the grayscale images, I got an error says
" The best overload method match for 'trackerNative.FCT.Runtracker(int)' has some invalid arguments."
Can you help me solve this? to pass List of images in C# to a matlab function.
C# Code
//to convert to graScale
public double[,] to_gray(Bitmap img)
{
int w = img.Width;
int h = img.Height;
double[,] grayImage = new double[w, h];
for (int i = 0; i < w; i++)
{
for (int x = 0; x < h; x++)
{
Color oc = img.GetPixel(i, x);
grayImage[i, x] = (double)(oc.R + oc.G + oc.B) / 3;
}
}
return grayImage;
}
//to get the files from specified folder
public static String[] GetFilesFrom(String searchFolder, String[] filters, bool isRecursive)
{
List<String> filesFound = new List<String>();
var searchOption = isRecursive ? SearchOption.AllDirectories : SearchOption.TopDirectoryOnly;
foreach (var filter in filters)
{
filesFound.AddRange(Directory.GetFiles(searchFolder, String.Format("*.{0}", filter), searchOption));
}
return filesFound.ToArray();
}
//Button to run the matlab function 'Runtracker'
private void button1_Click(object sender, EventArgs e)
{
FCT obj = new FCT(); //FCT is a matlab class
OpenFileDialog openFileDialog1 = new OpenFileDialog();
if (openFileDialog1.ShowDialog() == DialogResult.OK)
{
string file_path = openFileDialog1.FileName;
var filters = new String[] { "png" };
var files = GetFilesFrom(file_path, filters, false);
List< double[,] > imgArr = new List< double[,] >();
for (int i = 0; i < files.Length; i++)
{
Bitmap image = new Bitmap(files[i]);
double[,] grayScale_image = to_gray(image);
imgArr[i] = grayScale_image;
}
object m = obj.Runtracker(imgArr); //the error occured
//Bitmap output = m as Bitmap;
}
}
Matlab Code
function Runtracker(input_imgArr)
clc;clear all;close all;
rand('state',0);
%%
[initstate] = initCt(1);% position of the detected face in the 1st frame
num = length(input_imgArr);% number of frames
%%
x = initstate(1);% x axis at the Top left corner
y = initstate(2);
w = initstate(3);% width of the rectangle
h = initstate(4);% height of the rectangle
%---------------------------
img = imread(input_imgArr(1));
img = double(img);
%%
trparams.init_negnumtrain = 50;%number of trained negative samples
trparams.init_postrainrad = 4;%radical scope of positive samples
trparams.initstate = initstate;% object position [x y width height]
trparams.srchwinsz = 25;% size of search window
%-------------------------
%% Classifier parameters
clfparams.width = trparams.initstate(3);
clfparams.height= trparams.initstate(4);
% feature parameters
% number of rectangle from 2 to 4.
ftrparams.minNumRect =2;
ftrparams.maxNumRect =4;
M = 100;% number of all weaker classifiers, i.e,feature pool
%-------------------------
posx.mu = zeros(M,1);% mean of positive features
negx.mu = zeros(M,1);
posx.sig= ones(M,1);% variance of positive features
negx.sig= ones(M,1);
lRate = 0.85;% Learning rate parameter
%% Compute feature template
[ftr.px,ftr.py,ftr.pw,ftr.ph,ftr.pwt] = HaarFtr(clfparams,ftrparams,M);
%% Compute sample templates
posx.sampleImage = sampleImgDet(img,initstate,trparams.init_postrainrad,1);
negx.sampleImage = sampleImg(img,initstate,1.5*trparams.srchwinsz,4+trparams.init_postrainrad,trpar ams.init_negnumtrain);
%% Feature extraction
iH = integral(img);%Compute integral image
posx.feature = getFtrVal(iH,posx.sampleImage,ftr);
negx.feature = getFtrVal(iH,negx.sampleImage,ftr);
[posx.mu,posx.sig,negx.mu,negx.sig] =
classiferUpdate(posx,negx,posx.mu,posx.sig,negx.mu,negx.sig,lRate);%
update distribution parameters
%% Begin tracking
for i = 2:num
img = imread(input_imgArr(i));
imgSr = img;% imgSr is used for showing tracking results.
img = double(img);
iH = integral(img);%Compute integral image
%% Coarse detection
step = 4; % coarse search step
detectx.sampleImage =
sampleImgDet(img,initstate,trparams.srchwinsz,step);
detectx.feature = getFtrVal(iH,detectx.sampleImage,ftr);
r = ratioClassifier(posx,negx,detectx.feature);% compute the classifier for all samples
clf = sum(r);% linearly combine the ratio classifiers in r to the final classifier
[c,index] = max(clf);
x = detectx.sampleImage.sx(index);
y = detectx.sampleImage.sy(index);
w = detectx.sampleImage.sw(index);
h = detectx.sampleImage.sh(index);
initstate = [x y w h];
%% Fine detection
step = 1;
detectx.sampleImage = sampleImgDet(img,initstate,10,step);
detectx.feature = getFtrVal(iH,detectx.sampleImage,ftr);
r = ratioClassifier(posx,negx,detectx.feature);% compute the classifier for all samples
clf = sum(r);% linearly combine the ratio classifiers in r to the final classifier
[c,index] = max(clf);
x = detectx.sampleImage.sx(index);
y = detectx.sampleImage.sy(index);
w = detectx.sampleImage.sw(index);
h = detectx.sampleImage.sh(index);
initstate = [x y w h];
%% Show the tracking results
imshow(uint8(imgSr));
rectangle('Position',initstate,'LineWidth',4,'EdgeColor','r');
hold on;
text(5, 18, strcat('#',num2str(i)), 'Color','y', 'FontWeight','bold', 'FontSize',20);
set(gca,'position',[0 0 1 1]);
pause(0.00001);
hold off;
%% Extract samples
posx.sampleImage = sampleImgDet(img,initstate,trparams.init_postrainrad,1);
negx.sampleImage =
sampleImg(img,initstate,1.5*trparams.srchwinsz,4+trparams.init_postrainrad,trparams.init_negnumtrain);
%% Update all the features
posx.feature = getFtrVal(iH,posx.sampleImage,ftr);
negx.feature = getFtrVal(iH,negx.sampleImage,ftr);
[posx.mu,posx.sig,negx.mu,negx.sig] = classiferUpdate(posx,negx,posx.mu,posx.sig,negx.mu,negx.sig,lRate);% update distribution parameters
end
end
This is because the Runtracker expects an argument of type int as per the error and you are passing an argument of type int[] i.e an integer array.
The passed parameter and the expected parameter should match in order to execute the method successfully.
Hope this helps.

How to calculate the reverberation time of a wave signal in C#

I am trying to develop a console application in C# which uses a WAV-file for input. The application should do a couple of things all in order, as shown below. First of all, the complete code:
class Program
{
static List<double> points = new List<double>();
static double maxValue = 0;
static double minValue = 1;
static int num = 0;
static int num2 = 0;
static List<double> values = new List<double>();
private static object akima;
static void Main(string[] args)
{
string[] fileLines = File.ReadAllLines(args[0]);
int count = 0;
foreach (string fileLine in fileLines)
{
if (!fileLine.Contains(";"))
{
string processLine = fileLine.Trim();
processLine = Regex.Replace(processLine, #"\s+", " ");
if (Environment.OSVersion.Platform == PlatformID.Win32NT)
{
processLine = processLine.Replace(".", ",");
}
string[] dataParts = processLine.Split(Char.Parse(" "));
points.Add(double.Parse(dataParts[0]));
double value = Math.Pow(double.Parse(dataParts[1]), 2);
if (value > maxValue)
{
maxValue = value;
num = count;
}
values.Add(value);
}
count++;
}
for (int i = num; i < values.Count; i++)
{
if (values[i] < minValue)
{
minValue = values[i];
num2 = i;
}
}
Console.WriteLine(num + " " + num2);
int between = num2 - num;
points = points.GetRange(num, between);
values = values.GetRange(num, between);
List<double> defVal = new List<double>();
List<double> defValPoints = new List<double>();
alglib.spline1dinterpolant c;
alglib.spline1dbuildakima(points.ToArray(), values.ToArray(), out c);
double baseInt = alglib.spline1dintegrate(c, points[points.Count - 1]);
List<double> defETC = new List<double>();
for (int i = 0; i < points.Count; i += 10)
{
double toVal = points[i];
defETC.Add(10 * Math.Log10(values[i]));
defVal.Add(10 * Math.Log10((baseInt - alglib.spline1dintegrate(c, toVal)) / baseInt));
defValPoints.Add(points[i]);
}
WriteDoubleArrayToFile(defValPoints.ToArray(), defVal.ToArray(), "test.dat");
WriteDoubleArrayToFile(defValPoints.ToArray(), defETC.ToArray(), "etc.dat");
int end = 0;
for (int i = 0; i < points.Count; i++)
{
if (defVal[i] < -10)
{
end = i;
break;
}
}
//Console.WriteLine(num + " " + end);
int beginEDT = num;
int endEDT = num + end;
double timeBetween = (defValPoints[endEDT] - defValPoints[beginEDT]) * 6;
Console.WriteLine(timeBetween);
for (int i = 0; i < points.Count; i++)
{
}
Console.ReadLine();
}
static void WriteDoubleArrayToFile(double[] points, double[] values, string filename)
{
string[] defStr = new string[values.Length];
for (int i = 0; i < values.Length; i++)
{
defStr[i] = String.Format("{0,10}{1,25}", points[i], values[i]);
}
File.WriteAllLines(filename, defStr);
}
}
Extract the decimal/float/double value from the WAV-file
Create an array from extracted data
Create an Energy Time Curve that displays the decay of the noise/sound in a decibel-like way
Create an Decay Curve from the ETC created in step 3
Calculate things as Early Decay Time (EDT), T15/T20 and RT60 from this Decay Curve.
Display these Reverb Times in stdout.
At the moment I am sort of like half way through the process. I´ll explain what I did:
I used Sox to convert the audio file into a .dat file with numbers
I create an array using C# by simply splitting each line in the file above and putting the times in a TimesArray and the values at those points in a ValuesArray.
I am displaying a graph via GNUPlot, using the data processed with this function: 10 * Math.Log10(values[i]); (where i is an iterative integer in a for-loop iterating over all the items in the ValuesArray)
This is where I'm starting to get stuck. I mean, in this step I am using an Akima Spline function from Alglib to be able to integrate a line. I am doing that with a Schroeder integration (reversed), via this mathematical calculation: 10 * Math.Log10((baseInt - alglib.spline1dintegrate(c, toVal)) / baseInt); (where baseInt is a value calculated as a base integral for the complete curve, so I have a calculated bottom part of the reversed Schroeder integration. The c is a spline1dinterpolant made available when using the function alglib.spline1dbuildakima, which takes the timeArray as x values, valueArray as the y values, and c as an outwards spline1dinterpolant. toval is an x-value from the points array. The specific value is selected using a for-loop.) From these newly saved values I want to create an interpolated line and calculate the RT60 from that line, but I do not know how to do that.
Tried, did not really work out.
Same as above, I have no real values to show.
I'm quite stuck now, as I'm not sure if this is the right way to do it. If anyone can tell me how I can calculate the reverberation times in a fast and responsive way in C#, I'd be pleased to hear. The way of doing it might be completely different from what I have now, that's OK, just let me know!
Maybe you need to approach this differently:
start with the underlying maths. find out the mathematical formulas for these functions.
Use a simple mathematical function and calculate by hand (in excel or matlab) what the values should be (of all these things ETC, DC, EDC, T15, T20, RT60)
(A function such as a sine wave of just the minimum number of points you need )
then write a separate procedure to evaluate each of these in C# and verify your results for consistency with excel/matlab.
in C#, maybe store your data in a class that you pass around to each of the methods for its calculation.
your main function should end up something like this:
main(){
data = new Data();
//1, 2:
extract_data(data, filename);
//3:
energy_time_curve(data)
//...4, 5
//6:
display_results(data);
}

Categories