Microsoft ml.net Concatenate 2 columns as a label

I've been wondering if it was possible to concatenate 2 columns of the datatype string into the label column.
What I tried was:
pipeline.Add(new ColumnConcatenator("Label", "string1", "string2"));
But that just spits out a V2(text, 2). And a Label must be of type R4-R8.
The reason I need this is because I only have 2 input variables and I want to use regression to determine which is the best.
Thanks !

ColumnConcatenator is currently taking your two columns and producing a new vector typed column of width two. It's taking a and b and up-converting to a vector [a, b].
I think you're asking how to produce a new Label equal to a + b where a and b are strings. E.g.: var a = "Hello"; var b = "World"; var c = a + b; // c is HelloWorld.
There is no way currently in ML.NET to accomplish the second method (normal string concat). You may want to combine your strings before the ML.NET code. It's something we'll look into for future versions of ML.NET, and you're invited to submit an issue requesting it: https://github.com/dotnet/machinelearning/issues/new.
Update:
We have added the expression transform which can be used to concatenate strings (among many other things).
Usage:
pipeline.Append(ML.Transforms.Expression("Label", "(x, y) : concat(x, \"-\", y)", "LabelColOne", "LabelColTwo"))
For input of LabelColOne="Cat" and LabelColTwo="Dog", it joins them together with a "-" to produce Label="Cat-Dog".
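For versions without the expression transform, the earlier suggestion of combining the strings before the ML.NET code could look roughly like the sketch below. It is plain C# and does not call ML.NET; the RawRow, LabeledRow and LabelBuilder names are hypothetical and only mirror the column names used in the usage line.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical input row; property names mirror the column names used above.
public class RawRow
{
    public string LabelColOne { get; set; }
    public string LabelColTwo { get; set; }
}

// Hypothetical row type with the combined label already in place.
public class LabeledRow
{
    public string Label { get; set; }
}

public static class LabelBuilder
{
    // Joins the two source strings with "-" so that "Cat" and "Dog" become "Cat-Dog".
    public static List<LabeledRow> Combine(IEnumerable<RawRow> rows)
    {
        return rows
            .Select(r => new LabeledRow { Label = r.LabelColOne + "-" + r.LabelColTwo })
            .ToList();
    }
}
```

The resulting list can then be loaded as the training data in place of the two separate columns.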

Related

deedle and linq FillMissing values

I'm trying to figure out how to use Deedle to fill in some missing values in a list, and it's proving to be quite a headache. I can't find any real examples in C# (everything centres around F#).
I have a collection of objects, and for each missing value I want to calculate the average of the value before and the value after it in the list and assign that to the missing value. I've created a frame below:
var dfObjects = Frame.FromRecords(prices);
So I now want to perform the calculations, but I just can't work out how. There is an F# example which supposedly does just what I'm after, but there is no C# version anywhere.
Here's the F# version:
// Fill missing values using interpolation function
ozone |> Series.fillMissingUsing (fun k ->
  // Get previous and next values
  let prev = ozone.TryGet(k, Lookup.ExactOrSmaller)
  let next = ozone.TryGet(k, Lookup.ExactOrGreater)
  // Pattern match to check which values were available
  match prev, next with
  | OptionalValue.Present(p), OptionalValue.Present(n) ->
      (p + n) / 2.0
  | OptionalValue.Present(v), _
  | _, OptionalValue.Present(v) -> v
  | _ -> 0.0)
Has anyone else done something similar?
I've managed to get all the series within my list like this:
var frameDate = dfObjects.IndexRows<object>("SettlementDate").SortRowsByKey();
Inspecting the object at runtime, the data property gives me 4 series, two of which are the ones with missing values. How on earth do I interpolate the values that are missing in these series? The F# example above uses fillMissingUsing, which isn't available in C#. Can anyone help?

The one notable difference is that the F# example shows how to fill missing data in a series, but you are interested in filling data in a whole data frame. The best way to do this is to process individual columns (those with missing data) independently.
Given a data frame df that contains a column named Whatever, you can fill the missing values using the above logic and replace the column in the data frame using:
var series = df.Columns["Whatever"].As<double>();
var filledSeries = series.FillMissing(dt => {
    // Nearest available values at or before, and at or after, the missing key
    var before = series.TryGet(dt, Lookup.ExactOrSmaller);
    var after = series.TryGet(dt, Lookup.ExactOrGreater);
    if (before.HasValue && after.HasValue) return (before.Value + after.Value) / 2.0;
    else if (before.HasValue) return before.Value;
    else if (after.HasValue) return after.Value;
    else return 0.0;
});
df.ReplaceColumn("Whatever", filledSeries);
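To see the fill rule in isolation, here is a minimal sketch of the same before/after averaging on a plain nullable array. It deliberately does not use Deedle; the GapFiller name and the array representation are assumptions made only for illustration.

```csharp
using System;

public static class GapFiller
{
    // Replaces each null with the mean of the nearest known values on either side,
    // or with the single known neighbour, or with 0.0 when no value is known at all.
    public static double[] Fill(double?[] values)
    {
        var result = new double[values.Length];
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i].HasValue) { result[i] = values[i].Value; continue; }

            // Walk outwards until a known value is found on each side.
            double? before = null, after = null;
            for (int j = i - 1; j >= 0 && before == null; j--) before = values[j];
            for (int j = i + 1; j < values.Length && after == null; j++) after = values[j];

            if (before.HasValue && after.HasValue) result[i] = (before.Value + after.Value) / 2.0;
            else if (before.HasValue) result[i] = before.Value;
            else if (after.HasValue) result[i] = after.Value;
            else result[i] = 0.0;
        }
        return result;
    }
}
```

For example, GapFiller.Fill(new double?[] { 1.0, null, 3.0 }) fills the middle slot with 2.0.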

Refer to a specific index of an array depending on user input

I'm trying to make a program that takes in an equation from the user and outputs the answer. I have to develop all the code for it myself, and I'm wondering how I could refer to the array values on either side of a specific index.
Example:
User input: X = 5 + 5 * 6
I want to be able to locate the * and retrieve the values 5 and 6 on either side of it. I've tried multiple things and searched here too, but cannot find an answer.
Thank you in advance to anyone who takes the time to help!

One way to do it is to use regular expressions. I would remove all whitespaces first to make it easier. For example:
string input = "X = 5 + 5 * 6";
Regex r = new Regex(@"(\d+)\*(\d+)"); // needs: using System.Text.RegularExpressions;
Match m = r.Match(input.Replace(" ", ""));
var a = m.Groups[1].Value; // a = 5
var b = m.Groups[2].Value; // b = 6
Explanation of this regex can be found here: https://regex101.com/r/k7rthI/1
You would have to improve the regex to handle decimals and many other cases when the equation gets more complex. It is a long dark rabbit hole to go down if you get more complex than what you have. So you might be better off finding a math library you can use. No reason to reinvent the wheel.
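As a rough illustration of one such improvement, the sketch below extends the pattern so that each operand may carry a decimal part. The OperandFinder name is hypothetical, and the method still only looks at the first multiplication, so it is not a general expression parser.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class OperandFinder
{
    // Matches "<number>*<number>", where a number may have a decimal part such as 2.5.
    private static readonly Regex Product =
        new Regex(@"(\d+(?:\.\d+)?)\*(\d+(?:\.\d+)?)");

    // Returns the operands on either side of the first '*', or null when there is none.
    public static Tuple<double, double> FindOperands(string input)
    {
        Match m = Product.Match(input.Replace(" ", ""));
        if (!m.Success) return null;
        double left = double.Parse(m.Groups[1].Value, CultureInfo.InvariantCulture);
        double right = double.Parse(m.Groups[2].Value, CultureInfo.InvariantCulture);
        return Tuple.Create(left, right);
    }
}
```

Parsing with CultureInfo.InvariantCulture keeps "1.5" from being misread on machines whose locale uses a comma as the decimal separator.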

.NET query getting inappropriate answer

I was at a career fair and was asked: "What does this query do, and which language is it in?"
I said that it is .NET LINQ, but was unable to work out what it does.
Can anyone help me?
I wrote it in .NET and tried it:
var youShould = from c in "3%.$@9/52@2%35-%@4/@./3,!#+%23 !2#526%N#/-"
                select (char)(c ^ 3 << 5);
Label1.Text = youShould.ToString();
And got this output :
System.Linq.Enumerable+WhereSelectEnumerableIterator`2[System.Char,System.Char]

First of all, don't feel bad that you didn't get the answer. I know exactly what's going on here and I'd have probably just laughed and walked away if someone asked me what this did.
There's a number of things going on here, but start with the output:
var youShould = from c in "3%.$@9/52@2%35-%@4/@./3,!#+%23 !2#526%N#/-"
                select (char)(c ^ 3 << 5);
Label1.Text = youShould.ToString();
>>> System.Linq.Enumerable+WhereSelectEnumerableIterator`2[System.Char,System.Char]
When you run a LINQ query, or use any of the equivalent methods like Select() that return sets, what you get back is a special, internal type of object called an "iterator", specifically, an object that implements the IEnumerable interface. .NET uses these objects all over the place; for example, the foreach loop's purpose is to iterate over iterators.
One of the most important things to know about this kind of object is that just creating one doesn't actually "do" anything. The iterator doesn't contain a set of things; rather, it contains the instructions needed to produce a set of things. If you call something like ToString on it, the result you get won't be very useful.
However, it does tell us one thing: it tells us that this particular iterator takes a source list of type char and returns a new set, also of type char. (I know that because I know that's what the two generic parameters of a "select iterator" do.) To get the actual results out of this iterator you just need to loop over it somehow, e.g.:
foreach (var c in youShould)
{
    myLabel.Text += c;
}
or, slightly easier,
myLabel.Text = new string(youShould.ToArray());
To actually figure out what it does, you have to also recognize the second fact: LINQ treats a string as a "set of characters". It is going to process each character in that string, one at a time, and perform the bit-wise operations on the value.
The long-form equivalent of that query is something like this:
var input = "3%.$@9/52@2%35-%@4/@./3,!#+%23 !2#526%N#/-";
var output = string.Empty;
foreach (var c in input)
{
    var i = (int)c;          // character code
    var i2 = i ^ (3 << 5);   // XOR with 96
    var c2 = (char)i2;       // back to a character
    output += c2;
}
If you did the math by hand you'd get the correct output message. To save you the brain-numbing exercise, I'll just tell you: it toggles bits 5 and 6 of the ASCII value, changing each character to one further up the ASCII table. The resulting string is:
SEND YOUR RESUME TO [an email address]
Demonstrative .NET Fiddle: https://dotnetfiddle.net/x7UvYA

For each character in the string, project the character by xor-ing it with (3 left shifted by 5), then cast the numeric value back to a char.
You could generate your own code strings by running the query again over an uncoded string, because if you XOR a number twice by the same value, you'll be left with the same number you started with. (e.g. x ^ y ^ y = x)
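The round trip can be checked with a small sketch like the following. XorCodec and Toggle are hypothetical names; the mask is the same 3 << 5 used in the query.

```csharp
using System;
using System.Linq;

public static class XorCodec
{
    // 3 << 5 is 96 (binary 1100000). The shift binds tighter than ^,
    // so c ^ 3 << 5 means c ^ (3 << 5).
    private const int Mask = 3 << 5;

    // Applies the mask to every character; running it twice restores the input.
    public static string Toggle(string text)
    {
        return new string(text.Select(c => (char)(c ^ Mask)).ToArray());
    }
}
```

Calling XorCodec.Toggle("HELLO") gives "(%,,/", and toggling that result again gives back "HELLO".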
I'll leave it as an exercise to the reader to figure out what the following is:
4()3@)3@!@$5-"@4%34
I suppose it tests:
Linq to objects
Understanding of IEnumerable<T> interface and how it relates to strings
Casting
Bitwise operations
Bitwise operator precedence
Personally, I think this is a useless test that doesn't really reflect real world problems.

linq orderby issue (sorting strings by custom rules)

I have a linq query which is not ordered the way I would like.
The Query:
return (from obj in context.table_orders
        orderby obj.order_no
        select obj.order_no.ToString() + '-' + obj.order_description).ToList<string>();
What happens is that my records are ordered alphabetically. Is there a Linq keyword I can use so my records are ordered correctly (so order 30 comes before order 100)?
I want the result to be a list of string since this is used to populate a ComboBox.
Also some of the 'order_no' in the DB are like '2.10' and '9.1.1'.

"What happens is that my records are ordered alphabetically, is there a Linq keyword I can use so my records are ordered correctly (so order #30 comes before order #100)?"
If I got a cent every time someone asked this, I would be rich.
Yes, there is - simple answer: ORDER A NUMBER, NOT A STRING.
"so order #30 comes before order #100"
But #30 comes AFTER #100 for the simple reason that they ARE sorted alphabetically, because THEY ARE STRINGS.
Parse the string, convert the number to - well - a number, and order by it.
WHOEVER had the idea that order_no should be a string WITHOUT A FIXED LENGTH (like 00030) should - well - ;) get a basic education on database modelling. I really like things like invoice numbers etc. to be strings (they are NOT numbers), but keeping them (a) in a definable pattern and (b) checksummed (so that data entry errors are easily caught) should be basics ;)
This is the kind of issue you get with junior people defining databases and data models and not thinking about the consequences.
You are in for some pain - parse the string, order by the parsing result.

If obj.order_no is a string, then convert it to a number for sorting
orderby Int32.Parse(obj.order_no)
or
orderby Decimal.Parse(obj.order_no)
Of course, this works only if the string represents a valid number.
If order_no is not a valid number (e.g. "17.7-8A"), then write a function that formats it to contain right-aligned numbers, like "00017.007-0008A", and sort using this function:
orderby FormatOrderNo(obj.order_no)
UPDATE
Since you are working with EF you cannot call this function in the EF part of your query. Convert the EF-result to an IEnumerable<T> and perform the sorting using LINQ-To-Objects
return (from obj in context.table_orders select ...)
    .AsEnumerable()
    .OrderBy(obj => FormatOrderNo(obj.order_no))
    .ToList();
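The answer does not show FormatOrderNo itself. One possible sketch, assuming every run of digits is padded to a fixed width of 5 (an arbitrary choice, unlike the mixed widths in the "00017.007-0008A" example), is:

```csharp
using System;
using System.Text.RegularExpressions;

public static class OrderFormatting
{
    // Left-pads every run of digits with zeros to an assumed width of 5,
    // so that ordinal string comparison agrees with numeric order for each part.
    public static string FormatOrderNo(string orderNo)
    {
        return Regex.Replace(orderNo, @"\d+", m => m.Value.PadLeft(5, '0'));
    }
}
```

With this version "17.7-8A" becomes "00017.00007-00008A". Digit runs longer than the chosen width are left unpadded, so the width has to be at least as large as the longest number that can occur.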

Based on what you said - that the numbers are not really numbers but rather custom sequence identifiers (i.e. you don't even know what level of depth you get) I would suggest implementing a custom comparer.
If you do this you can define exactly what you want, which I believe is something along these lines:
split the string on .
compare the sequences up to and including the array.length of the shorter sequence
if there was a shorter sequence and by now there is a tie, pick the shorter before the longer (i.e. 2.1 before 2.1.1)
if both sequences had the same length, by the end of the comparison you should know which one is 'bigger'
solved.
If you need inspiration on IComparer implementation an example below:
http://zootfroot.blogspot.co.uk/2009/09/natural-sort-compare-with-linq-orderby.html
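The steps listed above could be sketched as the comparer below. DottedNumberComparer is a hypothetical name, and the sketch assumes every part between the dots is a plain integer and that neither argument is null.

```csharp
using System;
using System.Collections.Generic;

public class DottedNumberComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        string[] a = x.Split('.');
        string[] b = y.Split('.');

        // Compare part by part, up to the length of the shorter sequence.
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            int byPart = int.Parse(a[i]).CompareTo(int.Parse(b[i]));
            if (byPart != 0) return byPart;
        }

        // Tie so far: the shorter sequence sorts first (2.1 before 2.1.1).
        return a.Length.CompareTo(b.Length);
    }
}
```

It can be passed to LINQ to Objects as orderNumbers.OrderBy(n => n, new DottedNumberComparer()), where orderNumbers is an in-memory sequence of strings.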

I would either change the data type if possible or add the numeric value as another field if applicable.
If that is a no-go you can look at the solutions mentioned by others in this question, but beware: the .ToList() option is nice for small tables if you are pulling everything from them, but getting used to it will eventually give you a world of pain. You don't want to fetch everything in the long run; use either a Where or a top criterion.
The other solutions are nice but too complex for my taste for this task.
You could go with sending SQL directly through LINQ to SQL: http://msdn.microsoft.com/en-us/library/bb399403.aspx
In the SQL you are free to convert and sort however you like. Some think this is a great idea and some will tell you it is bad. You lose strong typing and gain performance. You have to know WHY you make this kind of decision; that is the most important thing, imho.

Since nobody came up with a custom orderby function translatable into SQL, I went for the IComparer function like so:
public class OrderComparer<T> : IComparer<string>
{
    #region IComparer<string> Members
    public int Compare(string x, string y)
    {
        return GetOrderableValue(x.Split('-').First()).CompareTo(GetOrderableValue(y.Split('-').First()));
    }
    #endregion

    private int GetOrderableValue(string value)
    {
        string[] splitValue = value.Split('.');
        int orderableValue = 0;
        if (splitValue.Length.Equals(1))
            orderableValue = int.Parse(splitValue[0]) * 1000;
        else if (splitValue.Length.Equals(2))
            orderableValue = int.Parse(splitValue[0]) * 1000 + int.Parse(splitValue[1]) * 100;
        else if (splitValue.Length.Equals(3))
            orderableValue = int.Parse(splitValue[0]) * 1000 + int.Parse(splitValue[1]) * 100 + int.Parse(splitValue[2]) * 10;
        else
            orderableValue = int.Parse(splitValue[0]) * 1000 + int.Parse(splitValue[1]) * 100 + int.Parse(splitValue[2]) * 10 + int.Parse(splitValue[3]);
        return orderableValue;
    }
}
The values have a maximum of 4 levels.
Does anyone have a recommendation?

entering a string to be used as a formula , in c#

I've found the shunting yard algorithm, but it feels like my case is a little bit different.
What if the formula you're entering has variables that you want to use in another formula?
You enter
cout<<"enter string"<<endl;
cin>>f;
and you enter f as
3 * (X*X) + 2*X + 7
and you use that string and parse it to find the root, for example.

While it is hard to understand what you are looking for, it hints at the ability to evaluate strings as formulas. If you're looking for an Eval-like solution, take a look at NCalc. Stealing from their webpage:
Expression e = new Expression("2 + 3 * 5");
Debug.Assert(17 == e.Evaluate());
It also allows you to use parameters, like
e.Parameters["X"] = 10;
and then use X as part of your string to evaluate.
It lets you convert a string representing a formula into its final value. If you are not asking about interpreting strings as formulas then I am at a loss for what you are asking. I recommend revising your question to better articulate the inputs and outputs of your proposed functionality.
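If pulling in a library is not an option, the underlying idea of evaluating a formula string with named variables can be sketched with a small recursive-descent evaluator. This is not NCalc and not the shunting yard algorithm; FormulaEvaluator is a hypothetical class that only handles +, -, *, /, parentheses, numbers and variables, with no unary minus or functions.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public class FormulaEvaluator
{
    private readonly string _text;
    private readonly IDictionary<string, double> _vars;
    private int _pos;

    private FormulaEvaluator(string text, IDictionary<string, double> vars)
    {
        _text = text.Replace(" ", "");
        _vars = vars;
    }

    public static double Evaluate(string formula, IDictionary<string, double> variables)
    {
        var e = new FormulaEvaluator(formula, variables);
        double value = e.ParseSum();
        if (e._pos != e._text.Length)
            throw new FormatException("Unexpected input at position " + e._pos);
        return value;
    }

    // sum := product (('+' | '-') product)*
    private double ParseSum()
    {
        double value = ParseProduct();
        while (_pos < _text.Length && (_text[_pos] == '+' || _text[_pos] == '-'))
        {
            char op = _text[_pos++];
            double right = ParseProduct();
            value = op == '+' ? value + right : value - right;
        }
        return value;
    }

    // product := factor (('*' | '/') factor)*
    private double ParseProduct()
    {
        double value = ParseFactor();
        while (_pos < _text.Length && (_text[_pos] == '*' || _text[_pos] == '/'))
        {
            char op = _text[_pos++];
            double right = ParseFactor();
            value = op == '*' ? value * right : value / right;
        }
        return value;
    }

    // factor := number | variable | '(' sum ')'
    private double ParseFactor()
    {
        if (_pos >= _text.Length) throw new FormatException("Unexpected end of formula");
        if (_text[_pos] == '(')
        {
            _pos++;
            double inner = ParseSum();
            if (_pos >= _text.Length || _text[_pos] != ')') throw new FormatException("Missing ')'");
            _pos++;
            return inner;
        }
        int start = _pos;
        if (char.IsLetter(_text[_pos]))
        {
            // Variable name: look its value up in the supplied dictionary.
            while (_pos < _text.Length && char.IsLetterOrDigit(_text[_pos])) _pos++;
            return _vars[_text.Substring(start, _pos - start)];
        }
        while (_pos < _text.Length && (char.IsDigit(_text[_pos]) || _text[_pos] == '.')) _pos++;
        if (_pos == start) throw new FormatException("Unexpected character at position " + _pos);
        return double.Parse(_text.Substring(start, _pos - start), CultureInfo.InvariantCulture);
    }
}
```

With X set to 10, evaluating "3 * (X*X) + 2*X + 7" gives 327. Because the function takes the variable values as an argument, it can be called repeatedly with different X values, which is what a root-finding loop would need.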

One solution that comes to mind: you could generate valid C# code from that string at runtime, compile it, and execute it. For example, the generated code could look like this in memory:
namespace FormulaCalculator
{
    public class Calculator
    {
        public static object Calculate()
        {
            return 3 * (X*X) + 2*X + 7; // where X is replaced by its value at runtime
        }
    }
}
Compile this with CodeDOM into an in-memory assembly and run it. For example:
object resultOfFormula = FormulaCalculator.Calculator.Calculate();
