Why does this Hidden Markov Model make this prediction?

I am trying to understand how an HMM works, but I think I am missing some crucial piece of information that I cannot identify. I want it to predict the next "feature"/"symbol" based on a given sequence.
int[][] sequences =
{
    new[] { 10001, 15, 1, 0, 0, 10002 },
    new[] { 10002, 0, 1, 0, 15, 10001 },
    new[] { 101, 15, 0, 0, 0, 101 },
    new[] { 101, 0, 0, 0, 15, 101 },
    new[] { 114, 15, 0, 1, 0, 114 },
    new[] { 114, 0, 0, 1, 15, 114 },
    new[] { 10001, 15, 1, 0, 0, 10002 },
    new[] { 10002, 0, 1, 0, 15, 10001 },
};
var teacher = new BaumWelchLearning()
{
    Topology = new Forward(6),
    Tolerance = 0.0001,
};
HiddenMarkovModel hmm = teacher.Learn(sequences);
// Gives 15 instead of 114
int[] prediction = hmm.Predict(observations: new[] { 114, 15, 0, 1, 0 }, next: 1);
The next character for "114, 15, 0, 1, 0" should be 114, yet the prediction is 15. Am I doing something wrong with the topology? Do I need to define something differently?
Thanks in advance!

You can't use a statistical/probabilistic model to predict one single realization. The theory makes sense when you apply it over many occurrences. In your case, call:
int[] prediction = hmm.Predict(observations: new[] { 114, 15, 0, 1, 0 }, next: 1);
many times and see what the probabilities of the next observations really are.
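To look at those probabilities directly rather than at a single predicted symbol, the next-symbol distribution can be computed from the model parameters with the forward algorithm. This is a minimal sketch in plain C#, not Accord.NET's implementation: the `HmmNextSymbol` class and its parameter names are made up for illustration, and it assumes the observations are already encoded as indices 0..K-1 into the emission matrix.

```csharp
using System;

static class HmmNextSymbol
{
    // Returns P(next symbol = k | observations) for a discrete HMM with
    // initial probabilities pi, transition matrix a and emission matrix b.
    public static double[] NextSymbolDistribution(
        double[] pi, double[,] a, double[,] b, int[] observations)
    {
        int states = pi.Length;
        int symbols = b.GetLength(1);

        // Forward pass: alpha[i] is proportional to P(o_1..o_t, state_t = i).
        var alpha = new double[states];
        for (int i = 0; i < states; i++)
            alpha[i] = pi[i] * b[i, observations[0]];
        Normalize(alpha);

        for (int t = 1; t < observations.Length; t++)
        {
            var next = new double[states];
            for (int j = 0; j < states; j++)
            {
                double sum = 0;
                for (int i = 0; i < states; i++)
                    sum += alpha[i] * a[i, j];
                next[j] = sum * b[j, observations[t]];
            }
            Normalize(next);
            alpha = next;
        }

        // One-step prediction: take one more transition, then emit.
        var result = new double[symbols];
        for (int j = 0; j < states; j++)
        {
            double stateProb = 0;
            for (int i = 0; i < states; i++)
                stateProb += alpha[i] * a[i, j];
            for (int k = 0; k < symbols; k++)
                result[k] += stateProb * b[j, k];
        }
        return result;
    }

    // Rescales v to sum to 1; leaves it untouched if the sum is zero
    // (meaning the observations are impossible under the model).
    static void Normalize(double[] v)
    {
        double sum = 0;
        foreach (double x in v) sum += x;
        if (sum == 0) return;
        for (int i = 0; i < v.Length; i++) v[i] /= sum;
    }
}
```

For a toy two-state model that strictly alternates between emitting symbol 0 and symbol 1, the observations 0, 1, 0 put all of the probability mass on symbol 1. With a trained model, the returned array shows how close the competing symbols are, which a single `Predict` call hides.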

Related

Accord and Multi-label Support Vector Machines

I'm working through the example in the docs for a multi-class support vector machine - http://accord-framework.net/docs/html/T_Accord_MachineLearning_VectorMachines_MultilabelSupportVectorMachine.htm
However, I'm not getting a 0 error rate, and when I try to compute values, I don't get the outputs I should. Is there something wrong with the example?
static void Main(string[] args)
{
    // Sample input data
    double[][] inputs =
    {
        new double[] { 0 },
        new double[] { 1 },
        new double[] { 2 },
        new double[] { 3 },
    };
    // Outputs for each of the inputs
    int[][] outputs =
    {
        new[] { 1, -1, -1, -1 },
        new[] { -1, 1, -1, -1 },
        new[] { -1, -1, 1, -1 },
        new[] { -1, -1, -1, 1 },
    };
    // Create a new Linear kernel
    IKernel kernel = new Linear();
    // Create a new Multi-class Support Vector Machine with one input,
    // using the linear kernel and for four disjoint classes.
    var machine = new MultilabelSupportVectorMachine(1, kernel, 4);
    // Create the Multi-label learning algorithm for the machine
    var teacher = new MultilabelSupportVectorLearning(machine, inputs, outputs);
    // Configure the learning algorithm to use SMO to train the
    // underlying SVMs in each of the binary class subproblems.
    teacher.Algorithm = (svm, classInputs, classOutputs, i, j) =>
        new SequentialMinimalOptimization(svm, classInputs, classOutputs);
    // Run the learning algorithm
    double error = teacher.Run();
    error = teacher.Run(); // 0.1875 error rate
    var answer = machine.Compute(new double[] { 2 }); // gives -1,-1,-1,-1, instead of -1,-1,1,-1
}
Should the error rate be zero, and why does it seem that only an input of 0 gives the right output?

To answer the question: it is very likely that there was something wrong with that particular example. Most examples have been updated to reflect the new .Learn() API that was put in place last year.
You may also notice that the documentation page for the Multi-label Support Vector Machine has moved because of the new API and is now at
http://accord-framework.net/docs/html/T_Accord_MachineLearning_VectorMachines_MultilabelSupportVectorMachine_1.htm
It now includes this example, among others:
// Let's say we have the following data to be classified
// into three possible classes. Those are the samples:
//
double[][] inputs =
{
    //               input         output
    new double[] { 0, 1, 1, 0 }, //  0
    new double[] { 0, 1, 0, 0 }, //  0
    new double[] { 0, 0, 1, 0 }, //  0
    new double[] { 0, 1, 1, 0 }, //  0
    new double[] { 0, 1, 0, 0 }, //  0
    new double[] { 1, 0, 0, 0 }, //  1
    new double[] { 1, 0, 0, 0 }, //  1
    new double[] { 1, 0, 0, 1 }, //  1
    new double[] { 0, 0, 0, 1 }, //  1
    new double[] { 0, 0, 0, 1 }, //  1
    new double[] { 1, 1, 1, 1 }, //  2
    new double[] { 1, 0, 1, 1 }, //  2
    new double[] { 1, 1, 0, 1 }, //  2
    new double[] { 0, 1, 1, 1 }, //  2
    new double[] { 1, 1, 1, 1 }, //  2
};
int[] outputs = // those are the class labels
{
    0, 0, 0, 0, 0,
    1, 1, 1, 1, 1,
    2, 2, 2, 2, 2,
};
// Create the multi-class learning algorithm for the machine
var teacher = new MulticlassSupportVectorLearning<Gaussian>()
{
    // Configure the learning algorithm to use SMO to train the
    // underlying SVMs in each of the binary class subproblems.
    Learner = (param) => new SequentialMinimalOptimization<Gaussian>()
    {
        // Estimate a suitable guess for the Gaussian kernel's parameters.
        // This estimate can serve as a starting point for a grid search.
        UseKernelEstimation = true
    }
};
// Configure parallel execution options
teacher.ParallelOptions.MaxDegreeOfParallelism = 1;
// Learn a machine
var machine = teacher.Learn(inputs, outputs);
// Obtain class predictions for each sample
int[] predicted = machine.Decide(inputs);
// Get class scores for each sample
double[] scores = machine.Score(inputs);
// Compute classification error
double error = new ZeroOneLoss(outputs).Loss(predicted);

LINQ Solution for Multiple Resolves

I have an array of MyClass, which can be simplified as follows:
public class MyClass {
    public int Id;
    public string Origin;
    public int Points;
    public DateTime RequestTime;
    public MyClass(int id, string origin, int points, DateTime requestTime) {
        Id = id;
        Origin = origin;
        Points = points;
        RequestTime = requestTime;
    }
}
Now, barring errors on the user side or during the input process, the array should never contain two MyClass instances with identical Id and Origin.
However, if there are any, I have to resolve them. Here are the resolving rules:
First, by Points: among the duplicates, take the one with the highest Points.
If the Points are the same, resolve further by RequestTime: take the latest.
If there is no difference in RequestTime either, I can take any one of the duplicates arbitrarily.
Here is the sample data input I have:
MyClass[] myarr = new MyClass[] {
    new MyClass(1, "Ware House 1", 5, new DateTime(2016, 1, 26, 14, 0, 0)), //[0]
    new MyClass(1, "Ware House 1", 7, new DateTime(2016, 1, 26, 14, 0, 0)), //[1] //higher points
    new MyClass(1, "Ware House 2", 7, new DateTime(2016, 1, 26, 14, 0, 0)), //[2]
    new MyClass(1, "Ware House 2", 7, new DateTime(2016, 1, 26, 14, 1, 0)), //[3] //later time
    new MyClass(1, "Ware House 2", 7, new DateTime(2016, 1, 26, 14, 0, 0)), //[4]
    new MyClass(2, "Ware House 2", 7, new DateTime(2016, 1, 26, 14, 0, 0)), //[5] //higher points
    new MyClass(2, "Ware House 2", 5, new DateTime(2016, 1, 26, 14, 1, 0)), //[6] //later time but less points
    new MyClass(3, "Ware House 1", 6, new DateTime(2016, 1, 26, 14, 0, 0)), //[7] //triplet, pick any
    new MyClass(3, "Ware House 1", 6, new DateTime(2016, 1, 26, 14, 0, 0)), //[8] //triplet, pick any
    new MyClass(3, "Ware House 1", 6, new DateTime(2016, 1, 26, 14, 0, 0))  //[9] //triplet, pick any
};
The final result should be [1], [3], [5], + any of [7]/[8]/[9]
I want to implement a LINQ solution for this, but I am stuck. I do not know how to do it in a single query.
Any idea?

Group by {Id, Origin} and take the first one from each group when ordered by Points and RequestTime:
var query = from x in myarr
            group x by new { x.Id, x.Origin }
            into g
            select (
                from z in g
                orderby z.Points descending, z.RequestTime descending
                select z).First();
In method syntax, this is:
var query =
    myarr.GroupBy(x => new { x.Id, x.Origin })
         .Select(g => g.OrderByDescending(z => z.Points)
                       .ThenByDescending(z => z.RequestTime)
                       .First());
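To check the grouping query against the rules from the question, here is a minimal self-contained sketch on a cut-down version of the sample data. The `Row` class, its `Tag` field, and the `DuplicateResolver.Resolve` helper are hypothetical names introduced only for this illustration; `Row` stands in for `MyClass`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for MyClass, with a Tag to identify each row.
class Row
{
    public int Id;
    public string Origin;
    public int Points;
    public DateTime RequestTime;
    public string Tag;
}

static class DuplicateResolver
{
    // Keeps one row per (Id, Origin): highest Points first,
    // latest RequestTime as the tie-breaker.
    public static List<Row> Resolve(IEnumerable<Row> rows)
    {
        return rows
            .GroupBy(r => new { r.Id, r.Origin })
            .Select(g => g.OrderByDescending(r => r.Points)
                          .ThenByDescending(r => r.RequestTime)
                          .First())
            .ToList();
    }
}
```

The anonymous type used as the key compares by value, so two rows with the same Id and Origin land in the same group. When all remaining candidates tie on both Points and RequestTime, `First()` simply returns whichever came first in the input, which matches the "pick any" rule.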

Try the following:
myarr.OrderBy(m => m.Points).ToList();
or
myarr.OrderBy(m => m.Points).ThenBy(m => m.RequestTime);

Split a string into substrings at variable widths in C#

I have a string of fixed length that has to be split at variable positions along the string to yield the substrings.
30849162 AUF3063100-2022031Doe Deanne 2610194031482100720081007200820000000000G43Z4206372 10 8 98282000000000911140000 00000000K6358Z8643K638 D126 Z099 320930090308009251519 132093 100720080071 0000000000000000000000000000000000000000000000000000000000000000000000002022031 000000000000000000000000000000000000000000000 00000000
The column break points are:
15, 18, 33, 61, 81, 89, 93, 94, 102, 110, 111, 114, 118,
Does anyone have an idea how I might do this? I have literally thousands of lines to parse.

Put the break points in an array and call Substring() in a loop over those numbers. This is roughly how you want to do it, though you will have to adjust it to match exactly where you want your column breaks.
int[] nums = { 0, 15, 18, 33, 61, 81, 89, 93, 94, 102, 110, 111, 114, 118 };
string input = "Long string here";
for (int i = 0; i < nums.Length - 1; i++)
{
    Console.WriteLine(input.Substring(nums[i], nums[i + 1] - nums[i]));
}
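One possible adjustment is sketched below, assuming the break points are strictly ascending end offsets: it keeps the text after the last break point and does not throw on lines that are shorter than expected. The `FixedWidth` class and `SplitAt` method are hypothetical names, not an existing library API.

```csharp
using System;
using System.Collections.Generic;

static class FixedWidth
{
    // Splits line at the given ascending break points. Segments that would
    // run past the end of the line are truncated, and any text after the
    // last break point is returned as a final segment.
    public static List<string> SplitAt(string line, int[] breaks)
    {
        var parts = new List<string>();
        int start = 0;
        foreach (int end in breaks)
        {
            if (start >= line.Length) break;       // nothing left to cut
            int stop = Math.Min(end, line.Length); // clamp to the line length
            parts.Add(line.Substring(start, stop - start));
            start = stop;
        }
        if (start < line.Length)
            parts.Add(line.Substring(start));      // remainder after the last break
        return parts;
    }
}
```

With thousands of lines to parse, the same `breaks` array can be reused for every line, for example inside a `foreach` over `File.ReadLines(path)`.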

Or you could use some nasty LINQ like so:
public string[] ReturnMyStrings(string str)
{
    int[] br = { 15, 18, 33, 61, 81, 89, 93, 94, 102, 110, 111, 114, 118 };
    return br.Select((x, i) =>
            str.Substring(br.ElementAtOrDefault(i - 1), x - br.ElementAtOrDefault(i - 1)))
        .ToArray();
}

If you wanted to make your code scalable, you could implement some classes to do this work.
static void Main(string[] args)
{
    string inputString = "30849162 AUF3063100-2022031Doe Deanne " +
        "2610194031482100720081007200820000000000G43Z4" +
        "206372 10 8 98282000000000911140000 00000000K" +
        "6358Z8643K638 D126 Z099 320930090308009251519" +
        "132093 100720080071 0000000000000000000000000" +
        "000000000000000000000000000000000000000000000" +
        "002022031 00000000000000000000000000000000000" +
        "0000000000 00000000";
    //myRecord will hold the entire input in its split form
    var myRecord = new StringSplitterRecord()
    {
        fields = new List<StringSplitterField>()
        {
            //define all the different fields (start offset, length)
            new StringSplitterField(inputString, 0, 15, "Name of field 1"),
            new StringSplitterField(inputString, 15, 3, "Name of field 2"),
            new StringSplitterField(inputString, 18, 15, "Name of field 3"),
            new StringSplitterField(inputString, 33, 28, "Name of field 4"),
            new StringSplitterField(inputString, 61, 20, "Name of field 5"),
            new StringSplitterField(inputString, 81, 8, "Name of field 6"),
            //the 89..93 column from the question's break points
            new StringSplitterField(inputString, 89, 4, "Name of field 7"),
            new StringSplitterField(inputString, 93, 1, "Name of field 8"),
            new StringSplitterField(inputString, 94, 8, "Name of field 9"),
            new StringSplitterField(inputString, 102, 8, "Name of field 10"),
            new StringSplitterField(inputString, 110, 1, "Name of field 11"),
            new StringSplitterField(inputString, 111, 3, "Name of field 12"),
            new StringSplitterField(inputString, 114, 4, "Name of field 13"),
        }
    };
}
class StringSplitterRecord
{
    public List<StringSplitterField> fields;
}
class StringSplitterField
{
    // Exposed as read-only properties so callers can query the parsed values.
    public string Contents { get; private set; }
    public string FieldType { get; private set; }
    public StringSplitterField(string originalString, int startLocation, int length, string fieldType)
    {
        Contents = originalString.Substring(startLocation, length);
        FieldType = fieldType;
    }
}
This will not only split your input string into the required pieces, but it will also put them all in a list with a name for each subsection. Then you can use LINQ etc. to retrieve the data that you need.
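As a rough idea of what such a LINQ lookup could look like, here is a minimal sketch. It uses its own hypothetical `NamedField` class (a variant of the field class above that exposes its name and contents publicly) and a hypothetical `RecordQueries.Find` helper, so it can be read and run on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of StringSplitterField that exposes its values.
class NamedField
{
    public string Name { get; private set; }
    public string Contents { get; private set; }

    public NamedField(string source, int start, int length, string name)
    {
        Name = name;
        Contents = source.Substring(start, length);
    }
}

static class RecordQueries
{
    // Returns the contents of the first field with the given name,
    // or null when no field has that name.
    public static string Find(IEnumerable<NamedField> fields, string name)
    {
        return fields.Where(f => f.Name == name)
                     .Select(f => f.Contents)
                     .FirstOrDefault();
    }
}
```

If field names are unique and looked up often, building a dictionary once with `fields.ToDictionary(f => f.Name, f => f.Contents)` may be a better fit than scanning the list each time.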

Get a specific value from a specific array when having two dropdown lists

I have 5 arrays, each representing one city. Each position in an array represents the distance to another city (all arrays share the same position for each specific city). I also have two dropdown lists from which the user is supposed to select two cities to calculate the distance between them.
It's set up like this:
// City0, City1, City2, City3, City4
int[] distanceFromCity0 = { 0, 16, 39, 9, 24 };
int[] distanceFromCity1 = { 16, 0, 36, 32, 54 };
int[] distanceFromCity2 = { 39, 36, 0, 37, 55 };
int[] distanceFromCity3 = { 9, 32, 37, 0, 21 };
int[] distanceFromCity4 = { 24, 54, 55, 21, 0 };
int cityOne = Convert.ToInt16(DropDownList1.SelectedValue);
int cityTwo = Convert.ToInt16(DropDownList2.SelectedValue);
Within the dropdown lists, each city has the corresponding ID (city0 = 0, city1 = 1, etc.).
I have tried a few different ways, but none of them really works.
So basically, how do I "connect" DropDownList1 to one of the arrays depending on the choice, then connect DropDownList2 to one of the positions in the selected array (from the DropDownList1 selection), and print the result to Label1?
Is it easier with a 2D array?
This probably looks easy for you, but I'm a noob in C#.

One way would be to combine distanceFromCity0 ... distanceFromCity4 into a single jagged array (an array of arrays) and use the two cities as indexes to the distance value:
int[][] distanceBetweenCities = {
    new[] { 0, 16, 39, 9, 24 },
    new[] { 16, 0, 36, 32, 54 },
    new[] { 39, 36, 0, 37, 55 },
    new[] { 9, 32, 37, 0, 21 },
    new[] { 24, 54, 55, 21, 0 }
};
int cityOne = Convert.ToInt32(DropDownList1.SelectedValue);
int cityTwo = Convert.ToInt32(DropDownList2.SelectedValue);
var distance = distanceBetweenCities[cityOne][cityTwo];

Yes, using a two-dimensional array is very easy. You can think of it as a matrix. Something like the code below:
int[,] distanceMatrix = new int[5, 5]
{
    { 0, 16, 39, 9, 24 },
    { 16, 0, 36, 32, 54 },
    { 39, 36, 0, 37, 55 },
    { 9, 32, 37, 0, 21 },
    { 24, 54, 55, 21, 0 }
};
int cityOne = Convert.ToInt32(DropDownList1.SelectedValue);
int cityTwo = Convert.ToInt32(DropDownList2.SelectedValue);
var distance = distanceMatrix[cityOne, cityTwo]; // the distance between cityOne and cityTwo
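The snippets above index the matrix directly with whatever the dropdown returns. A slightly more defensive sketch, assuming the selected values arrive as strings, is shown below. The `CityDistances` class and `TryGetDistance` method are hypothetical names and not part of ASP.NET; the matrix values are the ones from the question.

```csharp
using System;

static class CityDistances
{
    // Symmetric distance table: Matrix[a, b] is the distance from city a to city b.
    static readonly int[,] Matrix =
    {
        { 0, 16, 39, 9, 24 },
        { 16, 0, 36, 32, 54 },
        { 39, 36, 0, 37, 55 },
        { 9, 32, 37, 0, 21 },
        { 24, 54, 55, 21, 0 }
    };

    // Parses both selected values and looks up the distance.
    // Returns false if a value is not a number or is not a valid city ID.
    public static bool TryGetDistance(string first, string second, out int distance)
    {
        distance = 0;
        int a, b;
        if (!int.TryParse(first, out a) || !int.TryParse(second, out b))
            return false;

        int cities = Matrix.GetLength(0);
        if (a < 0 || a >= cities || b < 0 || b >= cities)
            return false;

        distance = Matrix[a, b];
        return true;
    }
}
```

In the page's code-behind this could be called with `DropDownList1.SelectedValue` and `DropDownList2.SelectedValue`, writing the distance to `Label1.Text` when the method returns true.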

multiple classification using Liblinear in Accord.net Framework

I need to implement a multi-class classifier using Liblinear. The Accord.NET machine learning framework provides all of Liblinear's features except Crammer and Singer's formulation for multi-class classification. This is the process.

The usual way of learning a multi-class machine is by using the MulticlassSupportVectorLearning class. This class can teach one-vs-one machines that can then be queried using either voting or elimination strategies.
As such, here is an example of how linear training can be done for multiple classes:
// Let's say we have the following data to be classified
// into three possible classes. Those are the samples:
//
double[][] inputs =
{
    //               input         output
    new double[] { 0, 1, 1, 0 }, //  0
    new double[] { 0, 1, 0, 0 }, //  0
    new double[] { 0, 0, 1, 0 }, //  0
    new double[] { 0, 1, 1, 0 }, //  0
    new double[] { 0, 1, 0, 0 }, //  0
    new double[] { 1, 0, 0, 0 }, //  1
    new double[] { 1, 0, 0, 0 }, //  1
    new double[] { 1, 0, 0, 1 }, //  1
    new double[] { 0, 0, 0, 1 }, //  1
    new double[] { 0, 0, 0, 1 }, //  1
    new double[] { 1, 1, 1, 1 }, //  2
    new double[] { 1, 0, 1, 1 }, //  2
    new double[] { 1, 1, 0, 1 }, //  2
    new double[] { 0, 1, 1, 1 }, //  2
    new double[] { 1, 1, 1, 1 }, //  2
};
int[] outputs = // those are the class labels
{
    0, 0, 0, 0, 0,
    1, 1, 1, 1, 1,
    2, 2, 2, 2, 2,
};
// Create a one-vs-one multi-class SVM learning algorithm
var teacher = new MulticlassSupportVectorLearning<Linear>()
{
    // using LIBLINEAR's L2-loss SVC dual for each SVM
    Learner = (p) => new LinearDualCoordinateDescent()
    {
        Loss = Loss.L2
    }
};
// Learn a machine
var machine = teacher.Learn(inputs, outputs);
// Obtain class predictions for each sample
int[] predicted = machine.Decide(inputs);
// Compute classification accuracy
double acc = new GeneralConfusionMatrix(expected: outputs, predicted: predicted).Accuracy;
You can also try to solve a multiclass decision problem using the one-vs-rest strategy. In this case, you can use the MultilabelSupportVectorLearning teaching algorithm instead of the multi-class one shown above.
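To make the one-vs-rest idea concrete: each class gets its own binary scorer, and the predicted class is the one whose scorer responds most strongly. The sketch below illustrates that decision rule in plain C#. It is a generic illustration and not Accord.NET's implementation; the `OneVsRest` class and the idea of passing scorers as delegates are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

static class OneVsRest
{
    // Given one scoring function per class (higher means "more likely this
    // class than the rest"), returns the index of the highest-scoring class.
    public static int Decide(IList<Func<double[], double>> scorers, double[] sample)
    {
        int best = 0;
        double bestScore = double.NegativeInfinity;
        for (int c = 0; c < scorers.Count; c++)
        {
            double score = scorers[c](sample);
            if (score > bestScore)
            {
                bestScore = score;
                best = c;
            }
        }
        return best;
    }
}
```

One-vs-one, by contrast, trains a classifier for every pair of classes and combines their pairwise decisions, for instance by voting as mentioned above.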
