deedle and linq FillMissing values

I'm trying to figure out how to use Deedle to fill in some missing values in a list, and it's proving to be quite a headache. I can't find any real examples in C# (everything centres around F#).
I have a collection of objects, and for each missing value I want to calculate the average of the value before and the value after it in the list and assign that to the missing value. I've created a frame as follows:
var dfObjects = Frame.FromRecords(prices);
So I now want to perform the calculations, but I just can't work out how. There is an F# example which supposedly does just what I'm after, but there is no C# version anywhere.
Here's the F# version:
// Fill missing values using interpolation function
ozone |> Series.fillMissingUsing (fun k ->
  // Get previous and next values
  let prev = ozone.TryGet(k, Lookup.ExactOrSmaller)
  let next = ozone.TryGet(k, Lookup.ExactOrGreater)
  // Pattern match to check which values were available
  match prev, next with
  | OptionalValue.Present(p), OptionalValue.Present(n) ->
      (p + n) / 2.0
  | OptionalValue.Present(v), _
  | _, OptionalValue.Present(v) -> v
  | _ -> 0.0)
Has anyone else done something similar?
I've managed to get all the series within my list like this:
var frameDate = dfObjects.IndexRows<object>("SettlementDate").SortRowsByKey();
Inspecting the object at runtime, the data property gives me 4 series, two of which are the ones with missing values. How do I interpolate the values that are missing in these series? The F# example above uses Series.fillMissingUsing, which isn't available in C#. Can anyone help?

The one notable difference is that the F# example shows how to fill missing data in a series, but you are interested in filling data in a whole data frame. The best way to do this is to process individual columns (those with missing data) independently.
Given a data frame df that contains a column named Whatever, you can fill the missing values using the above logic and replace the column in the data frame using:
var series = df.Columns["Whatever"].As<double>();
var filledSeries = series.FillMissing(dt => {
    // Nearest available values at or before, and at or after, the missing key
    var before = series.TryGet(dt, Lookup.ExactOrSmaller);
    var after = series.TryGet(dt, Lookup.ExactOrGreater);
    if (before.HasValue && after.HasValue) return (before.Value + after.Value) / 2.0;
    else if (before.HasValue) return before.Value;
    else if (after.HasValue) return after.Value;
    else return 0.0;
});
df.ReplaceColumn("Whatever", filledSeries);
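To see the fill rule in isolation, here is a minimal sketch of the same logic on a plain array of nullable doubles, with no Deedle involved. The class and method names (GapFiller, FillWithNeighbourAverage) are made up for illustration; it only mirrors the before/after averaging described above and is not how Deedle implements it.

```csharp
using System;

public static class GapFiller
{
    // Returns a copy of `values` in which every null is replaced by the average of
    // the nearest non-null neighbours, by the single available neighbour, or by 0.0
    // when the whole input is empty of values.
    public static double[] FillWithNeighbourAverage(double?[] values)
    {
        var result = new double[values.Length];
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i].HasValue) { result[i] = values[i].Value; continue; }

            // Search the ORIGINAL data, not the partially filled result
            double? before = null, after = null;
            for (int j = i - 1; j >= 0 && before == null; j--) before = values[j];
            for (int j = i + 1; j < values.Length && after == null; j++) after = values[j];

            if (before.HasValue && after.HasValue) result[i] = (before.Value + after.Value) / 2.0;
            else if (before.HasValue) result[i] = before.Value;
            else if (after.HasValue) result[i] = after.Value;
            else result[i] = 0.0;
        }
        return result;
    }
}
```

For example, the input { 1.0, null, 3.0 } would come back with 2.0 in the middle position, and a leading null simply takes the first value that follows it.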

Related

Microsoft ml.net Concatenate 2 columns as a label

I've been wondering if it was possible to concatenate 2 columns of the datatype string into the label column.
What I tried was:
pipeline.Add(new ColumnConcatenator("Label", "string1", "string2"));
But that just spits out a V2(text, 2). And a Label must be of type R4-R8.
The reason I need this is because I only have 2 input variables and I want to use regression to determine which is the best.
Thanks !

ColumnConcatenator is currently taking your two columns and producing a new vector typed column of width two. It's taking a and b and up-converting to a vector [a, b].
I think you're asking how to produce a new Label equal to a + b where a & b are strings. E.g.: var a = "Hello"; var b = "World"; var c = a + b; // c is HelloWorld.
There is no way currently in ML.NET to accomplish the second method (normal string concat). You may want to combine your strings before the ML.NET code. It's something we'll look into for future versions of ML.NET, and you're invited to submit an issue requesting it: https://github.com/dotnet/machinelearning/issues/new.
Update:
We have added the expression transform which can be used to concatenate strings (among many other things).
Usage:
pipeline.Append(ML.Transforms.Expression("Label", "(x, y) : concat(x, \"-\", y)", "LabelColOne", "LabelColTwo"))
For an input of LabelColOne="Cat" and LabelColTwo="Dog", it joins them together with a "-" to produce Label="Cat-Dog".
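If the expression transform is not available in your ML.NET version, the earlier suggestion of combining the strings before the ML.NET code could look roughly like this. It is a sketch in plain C# only: RawRow, LabelledRow and LabelBuilder are hypothetical types invented for the example, and the property names simply mirror the column names used above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical input row with the two string columns
public class RawRow
{
    public string LabelColOne { get; set; }
    public string LabelColTwo { get; set; }
}

// Hypothetical output row carrying the combined label
public class LabelledRow
{
    public string Label { get; set; }
}

public static class LabelBuilder
{
    // Joins the two string columns of every row into a single Label value
    public static List<LabelledRow> Combine(IEnumerable<RawRow> rows, string separator = "-")
    {
        return rows
            .Select(r => new LabelledRow { Label = r.LabelColOne + separator + r.LabelColTwo })
            .ToList();
    }
}
```

The resulting list can then be loaded into ML.NET in place of the original rows.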

Refer to a specific index of an array depending on user input

I'm trying to make a program that takes in an equation from the user and outputs the answer. I have to develop all the code for it myself, and I'm wondering how I could refer to the array values on either side of a specific index.
Example:
User input: X = 5 + 5 * 6
I want to be able to locate the * and retrieve the values 5 and 6 on either side of it. I've tried multiple things and searched here too, but cannot find an answer.
Thank you in advance to anyone who takes the time to help!

One way to do it is to use regular expressions. I would remove all whitespaces first to make it easier. For example:
string input = "X = 5 + 5 * 6";
Regex r = new Regex(@"(\d+)\*(\d+)");
Match m = r.Match(input.Replace(" ", ""));
var a = m.Groups[1].Value; // a = 5
var b = m.Groups[2].Value; // b = 6
Explanation of this regex can be found here: https://regex101.com/r/k7rthI/1
You would have to improve the regex to handle decimals and many other cases when the equation gets more complex. It is a long dark rabbit hole to go down if you get more complex than what you have. So you might be better off finding a math library you can use. No reason to reinvent the wheel.
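If you would rather work with array indexes, as the question asks, one possible sketch is to split the input into tokens and look at the elements on either side of the operator. This assumes the tokens are separated by spaces, as in the example input; OperandFinder and Neighbours are names made up for the illustration.

```csharp
using System;

public static class OperandFinder
{
    // Splits the expression on spaces and returns the tokens immediately to the
    // left and right of the first occurrence of `op`, or null when the operator
    // is missing or has no token on one of its sides.
    public static Tuple<string, string> Neighbours(string expression, string op)
    {
        string[] tokens = expression.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        int i = Array.IndexOf(tokens, op);
        if (i <= 0 || i >= tokens.Length - 1) return null;
        return Tuple.Create(tokens[i - 1], tokens[i + 1]);
    }
}
```

Calling OperandFinder.Neighbours("X = 5 + 5 * 6", "*") gives the pair ("5", "6"), which you can then parse with int.Parse or double.Parse.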

Filling and manipulating matrices using MathNet.Numerics

I'm working on code where I need to represent a small number of matrices (around 10) and do some operations on them (such as getting the inverse, the transpose, etc.). One of my co-workers recommended using the Math.NET Iridium library. The referred page said the project was discontinued and merged into MathNet.Numerics, found here.
I managed to install the package successfully, but now I'm struggling to use the operations properly.
To sum up, what I am asking is: how do I put data into matrices and manipulate them using MathNet.Numerics? For instance, how can I add a value at a specific row x and column y in a given matrix m1? Does it allow us to access a specific index?
One more thing to note: the matrices will always have the same number of columns and rows, but this number is known only at run time.
I've tried to google for tutorials, found this one, but I didn't get what I needed to know. Any help is appreciated.
--
PS: the method I was using so far was creating Nested Lists to represent each matrix, and using for loops to populate it. I believe I would have a hard time when the time to transpose/invert/multiply would come.

The answer is in the documentation linked in the question itself. http://numerics.mathdotnet.com/Matrix.html#Manipulating-Matrices-and-Vectors
The given example is:
var m = Matrix<double>.Build.Dense(3,4,(i,j) => 10*i + j);
m[0,0]; // 0 (row 0, column 0)
m[2,0]; // 20 (row 2, column 0)
m[0,2]; // 2 (row 0, column 2)
m[0,2] = -1.0;
m[0,2]; // -1
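Building on that example, here is a rough sketch of a square matrix whose size is chosen at run time, with one element updated and then the transpose and inverse taken. It assumes the MathNet.Numerics NuGet package is referenced; the class name MatrixDemo, the method BuildExample and the values are invented for the illustration.

```csharp
using System;
using MathNet.Numerics.LinearAlgebra;

public static class MatrixDemo
{
    // Builds an n x n matrix with 2 on the diagonal and 0 elsewhere,
    // then changes the element at row 0, column 1 through the indexer.
    public static Matrix<double> BuildExample(int n)
    {
        Matrix<double> m1 = Matrix<double>.Build.Dense(n, n, (i, j) => i == j ? 2.0 : 0.0);
        m1[0, 1] = 5.0;   // set a single element
        m1[0, 1] += 1.0;  // add to the existing value, giving 6
        return m1;
    }

    public static void Main()
    {
        int n = 3; // in the question this is only known at run time
        Matrix<double> m1 = BuildExample(n);

        Matrix<double> transposed = m1.Transpose();
        Matrix<double> inverse = m1.Inverse();
        Matrix<double> product = m1 * inverse; // close to the identity matrix

        Console.WriteLine(transposed);
        Console.WriteLine(product);
    }
}
```

This avoids the nested lists mentioned in the question: the library stores the data and provides the transpose, inverse and multiplication operations.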

C# Find the Next X and Previous Numbers in a sequence

I have a list of numbers, {1,2,3,4,...,End}, where End is specified. I want to display the X closest numbers around a given number Find within the list. If X is odd, I want the extra number to go on the greater-than side.
Example (Base Case)
End: 6
X: 2
Find: 3
The result should be: {2,3,4}
Another Example (Bound Case):
End: 6
X: 4
Find: 5
The result should be: {2,3,4,5,6}
Yet Another Example (Odd Case):
End: 6
X: 3
Find: 3
The result should be: {2,3,4,5}
I'm assuming it would be easier to simply find a start and stop value, rather than actually generating the list, but I don't really care one way or another.
I'm using C# 4.0 if that matters.
Edit: I can think of a way to do it, but it involves way too many if, else if cases.
if (Find == 1)
{
    Start = Find;
    Stop = (Find + X < End ? Find + X : End);
}
else if (Find == 2)
{
    if (X == 1)
    {
        Start = Find;
        Stop = (Find + 1 < End ? Find + 1 : End);
    }
    ...
}
You can hopefully see where this is going. I'm assuming I'm going to have to use (X % 2 == 0) for odd/even checking, and then some bounds like less = Find - X/2 and more = Find + X/2. I just can't figure out the path with the fewest if cases.
Edit II: I should also clarify that I don't actually create a list of {1,2,3,4...End}, but maybe I need to just start at Find-X/2.

I realise that you are learning, and out of respect for this I will not provide you with the full solution. I will, however, do my best to nudge you in the right direction.
From looking at your attempted solution, I think you need to figure out the algorithm before trying to code up something that may or may not solve your problem. As you say yourself, writing one if statement for every possible permutation of the input is not a manageable solution. You need to find an algorithm that is general enough that you can use it for any input you get and still get the right results out.
Basically, there are two questions you need to answer before you'll be able to code up a working solution.
How do I find the lower bound of the list I want to return?
How do I find the upper bound of the list I want to return?
Considering the example base case, you know that the given parameter X contains a number that tells you how many numbers around Find you should display. Therefore you need to divide X equally on both sides of Find.
Thus:
If I get an input X = 4 and Find = 3, the lower bound will be 3 - 4/2 or Find - X/2.
The higher bound will be 3 + 4/2 or Find + X/2.
Start by writing a program that runs and works for the base case. Once that is done, sit down and figure out how you would find the higher and lower bounds for a more complicated case.
Good luck!

You can look at the LINQ extension methods Skip and Take:
x.Skip(3).Take(4);
This will help you with what you are trying to do.
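As a rough sketch of how Skip and Take could be applied to the question's examples, the method below computes a start position and a count and then slices the range 1..End. The names Window and Around are made up, and it assumes 1 <= find <= end and x >= 0; the rule for shifting the window at the edges is inferred from the three examples in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Window
{
    // Returns up to x + 1 consecutive numbers from 1..end that include `find`,
    // with x / 2 numbers below it and the rest above, shifted to stay in range.
    public static List<int> Around(int end, int x, int find)
    {
        int count = Math.Min(x + 1, end);          // the window cannot exceed the list
        int start = find - x / 2;                  // integer division puts the odd extra above
        start = Math.Min(start, end - count + 1);  // shift down if it would run past `end`
        start = Math.Max(start, 1);                // shift up if it would start before 1
        return Enumerable.Range(1, end).Skip(start - 1).Take(count).ToList();
    }
}
```

With the question's inputs, Around(6, 2, 3) yields {2,3,4}, Around(6, 4, 5) yields {2,3,4,5,6} and Around(6, 3, 3) yields {2,3,4,5}.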

No more "If / Then"; Dictionary Use?

I am writing a piece of code that, to work, would require an extensive number of if/then statements. To avoid writing line upon line of if/then statements, can I use a dictionary or a list? If so, can someone direct me to a good web resource or show an instance where they have done it before?
Edit
Clarification: I have six inputs, each of which is a combo box with a group of selections. Below is a detail of the inputs and the selections.
(Amps) 1:1 - 1:12 (12 different selections)
(Cable Size) 2:1 - 2:13 (13 different selections) Certain items in this list will be excluded by the selection of the first input.
(Cable Type) 3:1 - 3:2 (2 different selections)
(Temp Rating) 4:1 - 4:3 (3 different selections)
(System Type) 5:1 - 5:2 (2 different selections)
(Conduit Type) 6:1 - 6:2 (2 different selections)
From the above input will come two outputs which will appear in two text boxes.
(Cable Qty) 7:1 - 7:16 (16 different outputs)
(Conduit Size) 8:1 - 8:8 (8 different outputs)
I hope this serves to help and not hinder.

It looks like you're trying to map each combination of the 6 inputs (12 * 13 * 2 * 3 * 2 * 2 possibilities) to one of the (16 * 8) outputs. If that's the case, you'll still have a lot of typing to do - but moving to a collection will allow you to easily externalize the mapping. I would guess that this would probably be best suited for a database table:
Amps | CableSize | CableType | TempRating | SystemType | ConduitType | CableQty | ConduitSize
You'd put a primary key on the 6 input columns, and then just do a simple SELECT:
SELECT CableQty, ConduitSize
FROM Table
WHERE Amps = @amps AND CableSize = @cableSize...etc
To do this in quick and dirty code, arrays would work:
const int AMPS = 0; const int CABLE_SIZE = 1; const int TEMP_RATING = 2; // etc.
var mappings = new Dictionary<int[], int[]>(12 * 13 * 2 * 3 * 2 * 2);
mappings.Add(
    new int[] { 1, 1, 1, 1, 1, 1 }, // inputs
    new int[] { 1, 2 } // outputs
);
// repeat...a lot
// Array keys compare by reference, so scan the entries instead of indexing by key
var outputs = mappings.First(entry =>
    entry.Key[AMPS] == myAmps
    && entry.Key[CABLE_SIZE] == myCableSize
    && entry.Key[TEMP_RATING] == myTempRating
    // && etc.
).Value;
It doesn't save you much typing - though you could use for loops and the like to populate the mappings if there's some sort of logic to it - but it's a hell of a lot more readable than 6 pages of if statements (I'd probably region off or partial class loading the mappings).
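A possible variation on the array approach, assuming a compiler that supports C# 7 value tuples: tuples compare by value, so they can be used directly as dictionary keys and looked up without scanning. The class name CableLookup and the two entries are placeholders, not real cable data.

```csharp
using System.Collections.Generic;

public static class CableLookup
{
    // Key: (amps, cableSize, cableType, tempRating, systemType, conduitType)
    // Value: (cableQty, conduitSize). The entries below are placeholders only.
    private static readonly Dictionary<(int, int, int, int, int, int), (int CableQty, int ConduitSize)> Mappings =
        new Dictionary<(int, int, int, int, int, int), (int CableQty, int ConduitSize)>
        {
            [(1, 1, 1, 1, 1, 1)] = (1, 2),
            [(1, 2, 1, 1, 1, 1)] = (1, 3),
        };

    // Returns false when no mapping exists for the given combination of inputs
    public static bool TryFind(int amps, int cableSize, int cableType, int tempRating,
                               int systemType, int conduitType,
                               out (int CableQty, int ConduitSize) result)
    {
        return Mappings.TryGetValue(
            (amps, cableSize, cableType, tempRating, systemType, conduitType), out result);
    }
}
```

The mapping still has to be typed in or loaded from somewhere, but each lookup becomes a single TryGetValue call.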

You might want to give some idea of what you're doing with the if/then statements. If you're just obtaining a value from a key then, yes, a dictionary probably would work.
Dictionary<string,string> map = new Dictionary<string,string>();
// ... populate the map with keys...
Then use it...
string value = "default value";
if (map.ContainsKey(key))
{
    value = map[key];
}

I actually suggest building an object model to store your settings. This will give you an opportunity to encapsulate your logic regarding what options are available at what times. Another benefit is that your Amp[1] control can bind to your SettingsContainer.Amp[1].Value, or however it ends up.

If you are always just obtaining a simple key-value lookup, then yes a dictionary lookup can replace a long string of if-then statements. However, if sometimes your logic is more complex than a simple key-value lookup, you may have to create a hybrid of if-then statements plus dictionary lookups. Sometimes these will be combined to make one logical statement.
The only correct answer in your case is to follow exactly what your business domain dictates. If you can simplify to dictionary lookups most of the time, then use them. Don't be too rigid to choose one over the other, though. Usually business logic is too messy to fall neatly into place like that.

A dictionary look up is possible but I don't believe it is feasible with the problem you have described.
David Thomas Garcia has given a good solution to your problem. I like that solution because it makes a nice encapsulation in a business object that you could possibly reuse and I would expect it to simplify maintenance/debugging for you as well.
Have the object model expose default lists for each list of choices, then as each choice is selected, have the lower levels of choices automatically filter.
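A minimal sketch of that idea, covering only the first two inputs: the model exposes the default lists, and the cable sizes are filtered by the selected amps. The class name CableSettings and its members are hypothetical, and the exclusion rule is passed in by the caller because the real rule is not given in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical settings model; names and structure are illustrative only
public class CableSettings
{
    // Default choices, numbered as in the question (12 amps options, 13 cable sizes)
    public IReadOnlyList<int> AmpOptions { get; } = Enumerable.Range(1, 12).ToList();
    private readonly IReadOnlyList<int> allCableSizes = Enumerable.Range(1, 13).ToList();

    // Caller-supplied rule: is this cable size allowed for these amps?
    private readonly Func<int, int, bool> isAllowed;

    public CableSettings(Func<int, int, bool> isCableSizeAllowed)
    {
        isAllowed = isCableSizeAllowed;
    }

    public int SelectedAmps { get; set; } = 1;

    // The lower-level list is re-filtered whenever it is read,
    // so it always reflects the current amps selection.
    public IEnumerable<int> AvailableCableSizes =>
        allCableSizes.Where(size => isAllowed(SelectedAmps, size));
}
```

A combo box for cable size could then be populated from AvailableCableSizes each time SelectedAmps changes, and the same pattern could be repeated for the remaining inputs.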
