ML.NET MakePredictionFunction dynamic type? - c#

I am able to dynamically train and create my regression model just fine from a string[] of column names. However, when I try to pass in a dynamic object whose dictionary key/value pairs use the same parameter names, it throws the error:
System.ArgumentOutOfRangeException: 'Could not find input column '<MyColumn>'' where <MyColumn> is the first parameter the model is looking for.
private static void TestSinglePrediction(MLContext mlContext, dynamic ratingDataSample, int actual)
{
    ITransformer loadedModel;

    // Load the previously trained model from disk.
    using (var stream = new FileStream(_modelPath, FileMode.Open, FileAccess.Read, FileShare.Read))
    {
        loadedModel = mlContext.Model.Load(stream);
    }

    // The 'Could not find input column' error comes from this part: a dynamic
    // object carries no schema information ML.NET can map columns from.
    var predictionFunction = loadedModel.MakePredictionFunction<dynamic, RatingPrediction>(mlContext);
    var prediction = predictionFunction.Predict(ratingDataSample);

    Console.WriteLine($"**********************************************************************");
    Console.WriteLine($"Predicted rating: {prediction.Rating:0.####}, actual rating: {actual}");
    Console.WriteLine($"**********************************************************************");
}
I suspect this is because the dynamic object doesn't contain the [Column] attributes that the standard class object I would normally pass in has.
However, I will eventually have hundreds of columns that are auto-generated from transposing SQL queries, so manually typing each column isn't a feasible approach going forward.
Is there any way I could perhaps apply the attribute at run time? Or any other way I can generically approach this situation? Thanks!

This is a great question. Dynamic objects don't work at runtime because ML.NET needs something called a SchemaDefinition for the objects you pass in, so that it knows where to get the columns it expects.
The simplest way to solve your problem is to define an object holding only the columns you need at scoring time, annotated with Column attributes, and manually map your dynamic object onto it at runtime. The main advantage is that, since you do the mapping to the scoring object yourself, you can handle missing-data cases yourself without the ML.NET runtime throwing. While your SQL query may give you a large assortment of columns, you won't need the majority of them for scoring your model, and therefore don't need to account for them in the scoring object; you only have to account for the columns the model expects.
See this sample from the ML.NET Cookbook for an example of how to score a single row. Behind the scenes, ML.NET is taking the class you defined, and using attributes like Column to construct the SchemaDefinition.
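For illustration, a minimal sketch of that approach, assuming the dynamic sample is dictionary-backed (e.g. an ExpandoObject) and using made-up column names (Bedrooms, Bathrooms); the point is that the mapping and the missing-column handling live in your code rather than inside ML.NET:

// Hypothetical scoring class: only the columns the model expects, annotated so
// ML.NET can build its SchemaDefinition from the [Column] attributes.
public class RatingData
{
    [Column("0")] public float Bedrooms;
    [Column("1")] public float Bathrooms;
}

// Map the dictionary-backed dynamic sample onto the typed scoring object,
// defaulting any missing column instead of letting ML.NET throw.
static RatingData ToRatingData(IDictionary<string, object> sample)
{
    return new RatingData
    {
        Bedrooms  = sample.TryGetValue("Bedrooms", out var bed)   ? Convert.ToSingle(bed)  : 0f,
        Bathrooms = sample.TryGetValue("Bathrooms", out var bath) ? Convert.ToSingle(bath) : 0f
    };
}

// Then, inside TestSinglePrediction:
// var predictionFunction = loadedModel.MakePredictionFunction<RatingData, RatingPrediction>(mlContext);
// var prediction = predictionFunction.Predict(ToRatingData(ratingDataSample));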

Related

ProtoBuf.NET - Choose fields you want to load at runtime

We are using ProtoBuf.NET to serialize our report to file (using DataContract/DataMember attributes to mark up the fields we are interested in). Is there any way (at runtime) to mark which fields we want to deserialize back?
We need this feature because we are dealing with large data (1 million rows, each with 250+ fields). Depending on the LINQ query we run against it, we only want to load/populate the fields that are required (mainly to keep the memory footprint down).
Yes, we are retrieving the data via IEnumerable, but if you do any GroupBy in your LINQ it tries to load everything, which causes an OutOfMemoryException (because of how many fields there are).
Well, there is, but...
var model = RuntimeTypeModel.Create();
var metaType = model.Add(typeof(Foo), false); // false: don't apply the default attribute-based configuration
if (includeA) metaType.Add(1, "A");           // field number 1 -> property "A"
//...
if (includeXYZ) metaType.Add(42, "XYZ");      // field number 42 -> property "XYZ"
var foo = (Foo)model.Deserialize(source, null, typeof(Foo));
but note that this will cause it to do all the assembly generation etc. per RuntimeTypeModel instance - you would probably want to cache a model per field subset. This could be quite easy if your choice of fields is via a [Flags] enum, as you could just use a Dictionary<YourFields, RuntimeTypeModel>, along the lines of the sketch below.
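A rough, illustrative sketch of that caching idea, assuming a hypothetical [Flags] enum (FooFields) describing which fields to include, and the same Foo type as above:

[Flags]
public enum FooFields { None = 0, A = 1, XYZ = 2 }

static readonly Dictionary<FooFields, RuntimeTypeModel> _models =
    new Dictionary<FooFields, RuntimeTypeModel>();

static RuntimeTypeModel GetModel(FooFields fields)
{
    RuntimeTypeModel model;
    if (!_models.TryGetValue(fields, out model))
    {
        // Build (and cache) one model per distinct field subset, so the
        // assembly generation cost is only paid once per combination.
        model = RuntimeTypeModel.Create();
        var metaType = model.Add(typeof(Foo), false);
        if ((fields & FooFields.A) != 0) metaType.Add(1, "A");
        if ((fields & FooFields.XYZ) != 0) metaType.Add(42, "XYZ");
        _models[fields] = model;
    }
    return model;
}

// Usage:
// var model = GetModel(FooFields.A);
// var foo = (Foo)model.Deserialize(source, null, typeof(Foo));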

Linq select new with dynamic variable names

I want to make a generalized DataTable-to-LINQ collection conversion.
I am a beginner, so if it's not possible please let me know.
public void Something(DataTable dt)
{
    var data = from row in dt.AsEnumerable()
               select new
               {
                   Order = row["Order"].ToString(),
                   Something = row["Something"].ToString(),
                   Customer = row["Customer"].ToString(),
                   Address = row["Address"].ToString()
               };
}
That is the code for one table.
I want something like this:
public static void convertDatatable(DataTable dt)
{
    var results = from myRow in dt.AsEnumerable()
                  select new
                  {
                      foreach (DataColumn column in dt.Columns)
                          column.ColumnName              // LINQ variable name
                              = myRow[column.ColumnName]; // LINQ variable value
                  };
}
I know it doesn't work the way I wrote it, but is there another way?
Note: the reason I am doing this is that I can't convert a DataTable directly to JSON; it serializes to XML and then sends it as a string containing that XML.
If you want to stay with DataTables then there is an approach mentioned in another SO question: What should I use to serialize a DataTable to JSON in ASP.NET 2.0?
I highly recommend, however, that you consider moving away from DataTables and DataRows, replacing them with an ORM such as Entity Framework (EF Quick Start here) or LINQ to SQL - there are others, but since you are a beginner these offer the easiest learning curve; not least because of the full designer support in Visual Studio.
For the standard forms of JSON serialization offered by .NET (e.g. WCF's DataContractSerializer or the ASP.NET JSON serializer) you need concrete types. The ORM solution will create all your table wrapper types at design time, giving you a concrete type, potentially, for every table in your database.
As for the idea you've specifically outlined above, it is exceptionally difficult to achieve - because the compiler, in the first example, dynamically generates a type whose members match the names and types of the expressions you use. If you open your compiled code in ILSpy and switch to IL instead of C# you'll see what I mean.
Therefore, to reproduce it dynamically you would need to dynamically emit a class, probably using ILGenerator, doing the same thing; and then dynamically emit the expression tree (using the Expression class' static factory methods) to fill it out; and finally compile and execute it.
I would only look at doing something like that if I literally couldn't do it any other way - I'd be more likely to just write a routine to iterate through each column and write the JSON to a StringBuilder and return that! But if I could use an ORM, then I'd do that instead.
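For what it's worth, a rough sketch of that hand-rolled route (the method name is made up, and the escaping is deliberately minimal - a real implementation needs full JSON string escaping):

public static string DataTableToJson(DataTable dt)
{
    var sb = new StringBuilder("[");
    for (int r = 0; r < dt.Rows.Count; r++)
    {
        if (r > 0) sb.Append(",");
        sb.Append("{");
        for (int c = 0; c < dt.Columns.Count; c++)
        {
            if (c > 0) sb.Append(",");
            // Emit every value as a quoted string; numbers/dates would need their own handling.
            sb.AppendFormat("\"{0}\":\"{1}\"",
                dt.Columns[c].ColumnName,
                dt.Rows[r][c].ToString().Replace("\"", "\\\""));
        }
        sb.Append("}");
    }
    sb.Append("]");
    return sb.ToString();
}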

Convert data from object of class A to an object from class B

I have three different classes:
Task
Order
Transmission
Each class has properties with different types. Also, there is a possibility to attach data represented by custom fields (implemented as an array of IField, where an IField can be a text field or a list field). Each custom field has a name that represents the name of the attached data property.
I need to convert between each class to another:
Task -> Order
Order -> Transmission
Transmission -> Task
Transmission -> Order
Order -> Task
Task -> Transmission
For that I created:
A static class of string constants, where each constant represents the name of a property.
A "DataObject" that holds a dictionary mapping a property name to an object value.
Each class (Task, Order, Transmission) implements IDataShare interface:
public interface IDataShare
{
    DataObject ToDataObject();
    void FromDataObject(DataObject data);
}
For example, a Task object with the following properties:
WorkerId:5
CustomerId:7
VehicleId:null
StartDate:null
And with the following custom fields:
Subcontractor: {listId:5, Value:4} (this is a list field)
delivery Note: "abc" (this is a text field)
will be converted to the following dictionary:
{"WorkerId", 5}
{"CustomerId", 7}
{"VehicleId", null}
{"StartDate", null}
{"Subcontractor", {listId:5, Value:4}}
{"delivery Note", "abc"}
the string keys "WorkerId", "CustomerId", "VehicleId", "StartDate" were taken from static class that contains string consts where "Subcontractor" and "deliveryNote" are the names of the custom fields the user added (I don't know which fields the user might add so I just use the field name).
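To make the setup concrete, here is a hypothetical sketch of what Task.ToDataObject might look like for the example above (the Keys constants class, the CustomFields collection, and the DataObject indexer are assumptions based on the description):

public DataObject ToDataObject()
{
    var data = new DataObject();

    // Fixed properties, keyed by the constants in the static keys class.
    data[Keys.WorkerId] = WorkerId;
    data[Keys.CustomerId] = CustomerId;
    data[Keys.VehicleId] = VehicleId;
    data[Keys.StartDate] = StartDate;

    // Custom fields, keyed by the name the user chose; a list field keeps
    // its {listId, value} pair as the stored object.
    foreach (IField field in CustomFields)
        data[field.Name] = field;

    return data;
}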
When I fill an object using a DataObject I have to verify that the name of the property is the same as the name of the key, and also verify that the value has the correct type (a string cannot be inserted into a long).
In addition, a custom list field (Subcontractor) can't have only the itemId as its value, because I also have to verify that the listId of the custom field in the object is the same as the listId of the custom field in the DataObject.
I have many problems knowing the type of the value. I always have to use "X is Y" if statements or "X as Y" statements. In addition, I have to remember how the type of the value is stored when implementing the IDataShare interface, which makes the work harder.
Can anyone help me think about constraints I can add to the conversion process from an object to a DataObject? Can anyone help me improve this method of converting objects?
Thanks!
UPDATE
I want to explain a point. My biggest problem is that there are several ways to translate each property/custom field, so I need to remember the type of the value in the DataObject. For example, in the Transmission class I have a VehicleId property. I want to convert a Task object with a custom field named "VehicleId" to a Transmission. All I want is for the Task's custom field VehicleId value to be converted into the VehicleId property of Transmission. But, because it is a custom field - as I wrote before - a custom field based on a list is stored as {listId:5, Value:4}. Now, in the conversion process (FromDataObject in Transmission), in case the DataObject has a "VehicleId" key, I have to check whether the value is a long (vehicle id as a property) or an IListField (vehicle id as a custom list field).
All this type checking really makes a mess.
Well, if the number of classes you're converting between is really as limited as you've said, may I suggest just writing casting operators for your classes?
http://msdn.microsoft.com/en-us/library/xhbhezf4%28v=VS.100%29.aspx
It seems like the amount of logic that you're putting into the conversion is enough to warrant something like this.
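As an illustration only (the property names are made up, and the real mapping logic - including the custom fields - would live inside the operator), an explicit conversion operator might look like this:

public class Order
{
    public long? WorkerId { get; set; }
    public long? CustomerId { get; set; }

    // Explicit conversion so callers can write: var order = (Order)someTask;
    // keeping all of the Task -> Order mapping in one compiler-checked place.
    public static explicit operator Order(Task task)
    {
        return new Order
        {
            WorkerId = task.WorkerId,
            CustomerId = task.CustomerId
        };
    }
}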
On the other hand, it seems like there is a base set of fields being used across the different objects and you're just stuffing them into an untyped dictionary. If the fields are common across all types, could you use a conversion to a strongly typed common object?
Which also begs the question: could you use a common base class object?
If you have options of modifying the Task, Order, and Transmission definitions, I'd take a look at them again. This sort of scenario seems like a "code smell".
If I understand this correctly, ToDataObject is basically a serializer and FromDataObject is a deserializer. If the data contained by these objects is type compatible, then it seems that the very act of serializing it into untyped data is the source of your problem. Why do this, instead of just keeping the data in its native format?
If you need to use an adapter because there are incompatibilities between the objects that can't be resolved for some reason, I would think that you can make one that at least keeps the data in its native structures instead of serializing everything to a string. A dictionary in C# can contain anything; at a minimum you could be using a Dictionary<string, object>.
It's also unclear what all this verification is about - why would data be incompatible, if you are mapping properties of the same data types? E.g. assuming that this is an internal process, under what circumstance could (e.g.) a string from one object be trying to be assigned to a long in another object? Seems that would only be necessary if the data were strongly typed in one object, but not in another.
Have you considered using generics?
If Task, Order and Transmission all inherit from a base class like Property, then you could expose a common method for getting the values you need.
GetMyValue<T>() where T : Property
It's not very clear what you are trying to achieve.

How to create objects dynamically with C#?

I'm trying to create objects dynamically but I don't know how. What I need is: I have a class for that object, and the object's properties are stored in the database. Then I'll need to compare the properties of each object to get the desired result.
So I need to dynamically create objects on the fly with the properties loaded from database.
I don't think you need to create objects dynamically; just create one statically that matches your DB schema with the property details. Then you can compare the values of the properties across rows, or within an instance of your object.
I have been working on something similar to this. There are several things:
Include the System.Reflection namespace
Create an object dynamically using Activator
Get the object properties using the myObjectType.GetProperties() method
Here is an example of a generic object creation function using the above methods:
using System.Reflection;

public static Item CreateItem<Item>(object[] constructorArgs, object[] propertyVals)
{
    // Get the object type
    Type t = typeof(Item);

    // Create the object instance
    Item myItem = (Item)Activator.CreateInstance(t, constructorArgs);

    // Get and fill the properties
    PropertyInfo[] pInfoArr = t.GetProperties();
    for (int i = 0; i < pInfoArr.Length; ++i)
        pInfoArr[i].SetValue(myItem, propertyVals[i], null); // the last argument is for indexed properties

    return myItem;
}
Of course the above example assumes that the values in the property value array are arranged correctly, which is not necessarily the case, but you get the idea.
With the PropertyInfo class you can get properties, get property names, get attributes associated with the properties, etc. Powerful technology. You should be able to do what you need with the above info, but if not let me know and I will add more info.
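For example, a quick hypothetical usage of CreateItem, assuming a Person class with a parameterless constructor and property values supplied in declaration order:

public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

// No constructor arguments; the two property values fill Name and Age.
Person p = CreateItem<Person>(new object[0], new object[] { "Ada", 36 });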
If you have a number of objects you want to instantiate from database values it can be done something like this.
//database code goes here, results go in results
List<ClassName> l = new List<ClassName>();
foreach (Row r in results)
{
    l.Add(new ClassName() { ClassProperty1 = r.Property1, ClassProperty2 = r.Property2 });
}
Are you talking about Dictionary?
var dict=new Dictionary<string, string>();
dict.Add("property1", "val1");
dict.Add("property2", "val2");
var prop2val=dict["property2"];
Maybe Activator is what you're looking for?
http://msdn.microsoft.com/en-us/library/system.activator.aspx
Check this class; it compiles code at runtime, but its performance is not great.
http://msdn.microsoft.com/zh-cn/library/microsoft.csharp.csharpcodeprovider(VS.80).aspx
You could use reflection to dynamically build your objects:
Reflection msdn reference
I think you want to retrieve rows from the DB and directly assign them to an object, given that the properties of the object are equivalent to the columns of the DB table. If that's what you mean, then I believe you can't :)
Rob Conery did a small project called Massive that pretty much does what you're trying to accomplish. It's essentially a small ORM, in 400 lines of Dynamic C# 4.0 code.
Rob has been doing this kind of thing for quite some time with SubSonic, so you might find his approach with Massive quite interesting.
http://blog.wekeroad.com/helpy-stuff/and-i-shall-call-it-massive
Some of the code is explained here, with examples:
http://blog.wekeroad.com/microsoft/the-super-dynamic-massive-freakshow

Using Dynamic LINQ (or Generics) to query/filter Azure tables

So here's my dilemma. I'm trying to utilize Dynamic LINQ to parse a search filter for retrieving a set of records from an Azure table. Currently, I'm able to get all records by using a GenericEntity object defined as below:
public class GenericEntity
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }

    Dictionary<string, object> properties = new Dictionary<string, object>();

    /* "Property" property and indexer property omitted here */
}
I'm able to get this completely populated by utilizing the ReadingEntity event of the TableServiceContext object (handled by OnReadingGenericEntity). The following code is what actually pulls all the records and will hopefully filter them (once I get it working).
public IEnumerable<T> GetTableRecords(string tableName, int numRecords, string filter)
{
    ServiceContext.IgnoreMissingProperties = true;

    ServiceContext.ReadingEntity -= LogType.GenericEntity.OnReadingGenericEntity;
    ServiceContext.ReadingEntity += LogType.GenericEntity.OnReadingGenericEntity;

    var result = ServiceContext.CreateQuery<GenericEntity>(tableName).Select(c => c);
    if (!string.IsNullOrEmpty(filter))
    {
        result = result.Where(filter);
    }

    var query = result.Take(numRecords).AsTableServiceQuery<GenericEntity>();
    IEnumerable<GenericEntity> res = query.Execute().ToList();
    return res;
}
I have TableServiceEntity derived types for all the tables that I have defined, so I can get all properties/types using Reflection. The problem with using the GenericEntity class in the Dynamic LINQ Query for filtering is that the GenericEntity object does NOT have any of the properties that I'm trying to filter by, as they're really just dictionary entries (dynamic query errors out). I can parse out the filter for all the property names of that particular type and wrap
"Property[" + propName + "]"
around each property (found by using a type resolver function and reflection). However, that seems a little... overkill. I'm trying to find a more elegant solution, but since I actually have to provide a type in ServiceContext.CreateQuery<>, it makes it somewhat difficult.
So I guess my ultimate question is this: How can I use dynamic classes or generic types with this construct to be able to utilize dynamic queries for filtering? That way I can just take in the filter from a textbox (such as "item_ID > 1023000") and just have the TableServiceEntity types dynamically generated.
There ARE other ways around this that I can utilize, but I figured since I started using Dynamic LINQ, might as well try Dynamic Classes as well.
Edit: So I've got the dynamic class being generated by the initial select using some reflection, but I'm hitting a roadblock in mapping the types of GenericEntity.Properties into the various associated table record classes (TableServiceEntity derived classes) and their property types. The primary issue is still that I have to initially use a specific datatype to even create the query, so I'm using the GenericEntity type which only contains KV pairs. This is ultimately preventing me from filtering, as I'm not able to do comparison operators (>, <, =, etc.) with object types.
Here's the code I have now to do the mapping into the dynamic class:
var properties = newType./* omitted */.GetProperties(
System.Reflection.BindingFlags.Instance |
System.Reflection.BindingFlags.Public);
string newSelect = "new(" + properties.Aggregate("", (seed, reflected) => seed += string.Format(", Properties[\"{0}\"] as {0}", reflected.Name)).Substring(2) + ")";
var result = ServiceContext.CreateQuery<GenericEntity>(tableName).Select(newSelect);
Maybe I should just modify the properties.Aggregate method to prefix the "Properties[...]" section with the reflected.PropertyType? So the new select string will be made like:
string newSelect = "new(" + properties.Aggregate("", (seed, reflected) => seed += string.Format(", ({1})Properties[\"{0}\"] as {0}", reflected.Name, reflected.PropertyType)).Substring(2) + ")";
Edit 2: So now I've hit quite the roadblock. I can generate the anonymous types for all tables to pull all the values I need, but LINQ craps out on me no matter what I do for the filter. I've stated the reason above (no comparison operators on objects), but the issue I've been battling with now is trying to specify a type parameter to the Dynamic LINQ extension method to accept the schema of the new object type. Not much luck there, either... I'll keep you all posted.
I've created a simple System.Reflection.Emit based solution to create the class you need at runtime.
http://blog.kloud.com.au/2012/09/30/a-better-dynamic-tableserviceentity/
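For readers who don't want to follow the link, a rough, illustrative sketch of the Reflection.Emit idea - build a type at runtime with one read/write property per table column (the builder name and the columns dictionary are made up, and this uses the .NET Framework-era AppDomain API):

using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

public static class RuntimeEntityBuilder
{
    public static Type Build(string typeName, IDictionary<string, Type> columns)
    {
        var asmName = new AssemblyName("DynamicEntities");
        var asm = AppDomain.CurrentDomain.DefineDynamicAssembly(asmName, AssemblyBuilderAccess.Run);
        var module = asm.DefineDynamicModule("Main");
        var tb = module.DefineType(typeName, TypeAttributes.Public | TypeAttributes.Class);

        foreach (var col in columns)
        {
            // Backing field plus a public get/set property for each column.
            var field = tb.DefineField("_" + col.Key, col.Value, FieldAttributes.Private);
            var prop = tb.DefineProperty(col.Key, PropertyAttributes.None, col.Value, null);

            var getter = tb.DefineMethod("get_" + col.Key,
                MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig,
                col.Value, Type.EmptyTypes);
            var gil = getter.GetILGenerator();
            gil.Emit(OpCodes.Ldarg_0);
            gil.Emit(OpCodes.Ldfld, field);
            gil.Emit(OpCodes.Ret);

            var setter = tb.DefineMethod("set_" + col.Key,
                MethodAttributes.Public | MethodAttributes.SpecialName | MethodAttributes.HideBySig,
                null, new[] { col.Value });
            var sil = setter.GetILGenerator();
            sil.Emit(OpCodes.Ldarg_0);
            sil.Emit(OpCodes.Ldarg_1);
            sil.Emit(OpCodes.Stfld, field);
            sil.Emit(OpCodes.Ret);

            prop.SetGetMethod(getter);
            prop.SetSetMethod(setter);
        }

        return tb.CreateType();
    }
}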
I have run into exactly the same problem (with almost the same code :-)). I have a suspicion that the ADO.NET classes underneath somehow do not cooperate with dynamic types but haven't found exactly where yet.
So I've found a way to do this, but it's not very pretty...
Since I can't really do what I want within the framework itself, I utilized a concept used within the AzureTableQuery project. I pretty much just have a large C# code string that gets compiled on the fly with the exact object I need. If you look at the code of the AzureTableQuery project, you'll see that a separate library is compiled on the fly for whatever table we have, that goes through and builds all the properties and stuff we need as we query the table. Not the most elegant or lightweight solution, but it works, nevertheless.
Seriously wish there was a better way to do this, but unfortunately it's not as easy as I had hoped. Hopefully someone will be able to learn from this experience and possibly find a better solution, but I have what I need already so I'm done working on it (for now).
