ProtoBuf.NET - Choose fields you want to load at runtime - c#

We are using ProtoBuf.NET to serialize our report to a file (using DataContract/DataMember attributes to mark up the fields we are interested in). Is there any way, at runtime, to choose which fields we want to deserialize?
We need this because we are dealing with large data (1 million rows with 250+ fields each), and depending on the LINQ query we run against it, we want to load and populate only the fields that are required (mainly to reduce the memory footprint).
Yes, we are retrieving the data through IEnumerable, but any GroupBy in the LINQ query tries to load everything, which causes an OutOfMemoryException (because there are too many fields).

Well, there is, but...
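    // Build a model that knows only about the members you want to load.
    var model = RuntimeTypeModel.Create();
    var metaType = model.Add(typeof(Foo), false); // false: do not apply the default attribute-based configuration
    if (includeA) metaType.Add(1, "A");
    //...
    if (includeXYZ) metaType.Add(42, "XYZ");
    // Pass null as the existing instance so that a new Foo is created.
    var foo = (Foo)model.Deserialize(source, null, typeof(Foo));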
But note that this repeats all the assembly generation etc. for each RuntimeTypeModel instance, so you would probably want to cache one model per field subset. This could be quite easy if your choice of fields is expressed as a [Flags] enum, as you could just use a Dictionary<YourFields, RuntimeTypeModel>.
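The caching idea above could be sketched roughly as follows. This is a minimal illustration, not protobuf-net code: `ReportFields` and `ModelCache<TModel>` are hypothetical names, and the cache is kept generic over the model type so that the sketch depends only on the base class library. In real use `TModel` would be `RuntimeTypeModel`, and the factory delegate would build the model as shown in the answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical flags enum naming the fields a query may need.
[Flags]
public enum ReportFields
{
    None = 0,
    A = 1,
    B = 2,
    Xyz = 4
}

// Caches one model per field subset so the expensive build runs once per subset.
public sealed class ModelCache<TModel>
{
    private readonly Dictionary<ReportFields, TModel> _models = new Dictionary<ReportFields, TModel>();
    private readonly Func<ReportFields, TModel> _build;
    private readonly object _gate = new object();

    public ModelCache(Func<ReportFields, TModel> build)
    {
        _build = build ?? throw new ArgumentNullException(nameof(build));
    }

    // Returns the cached model for this subset, building it on first use.
    public TModel Get(ReportFields fields)
    {
        lock (_gate)
        {
            if (!_models.TryGetValue(fields, out var model))
            {
                model = _build(fields);
                _models[fields] = model;
            }
            return model;
        }
    }
}
```

With a cache like this, the factory would check each flag (for example `fields.HasFlag(ReportFields.A)`) before calling `metaType.Add(...)`, and every query that needs the same subset would reuse the same model.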

Related

Automapper transform source just temporarily

I am writing a generic way to convert a database object to a business object.
My business objects carry custom attributes, and depending on those I want to perform specific operations on the values.
When reading from the database this is quite easy because I can use AfterMap (not a perfect solution, because I have to do it by reflection and set the value accordingly).
When writing back to the database, however, I have to do it in BeforeMap, which would change the source permanently; I only want the change to be transient. In other words, apply the operation to the source on the fly without modifying the source object.
It is a generic solution, so I can't work with specific properties.
    protected static T MapFromDatabaseWithConversion<T, TSource>(TSource source) where T : MappingModel, new()
    {
        var config = new MapperConfiguration(cfg => cfg.CreateMap<TSource, T>().AfterMap((src, dest) => dest.ConvertFromDatabase()));
        return config.CreateMapper().Map<T>(source);
    }
Do you have a way to check a property's attribute on the fly and change the value depending on it, or an idea for changing the source only on the fly, so that the result of the operation is not written back to the source object?
Thank you very much.
I think you have to include value tracking in your objects. For each class member you would need a boolean that reflects whether the value changed, and a method that checks them all at once, such as isObjectChanged(). You can hard-code this or wrap your object in a proxy object at runtime, which is more complicated but does not clutter your class with value-tracking data and methods. Alternatively, Java Data Objects (https://db.apache.org/jdo/) can do this for you by re-compiling your class files to include value tracking within the class. It takes a bit to set up and may be overkill for your specific question, but I have used it many times when targeting multiple data sources in the same project, such as a database or a spreadsheet. JDO lets me use the same code with a different data type manager that can be swapped at runtime. You can also target NoSQL databases and other data stores.
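Since the question is about C#, the hard-coded variant of the value tracking described above might look something like the sketch below. `TrackedCustomer` and its members are hypothetical names chosen for illustration, and a set of changed member names is used here in place of one boolean per member.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical business object that records which members were assigned a new value.
public class TrackedCustomer
{
    private readonly HashSet<string> _changed = new HashSet<string>();
    private string _name;
    private int _age;

    public string Name
    {
        get { return _name; }
        set { if (_name != value) { _name = value; _changed.Add(nameof(Name)); } }
    }

    public int Age
    {
        get { return _age; }
        set { if (_age != value) { _age = value; _changed.Add(nameof(Age)); } }
    }

    // True if any tracked member changed since the last AcceptChanges call.
    public bool IsObjectChanged() { return _changed.Count > 0; }

    // True if the named member changed since the last AcceptChanges call.
    public bool IsChanged(string memberName) { return _changed.Contains(memberName); }

    // Resets tracking, e.g. after loading from or saving to the database.
    public void AcceptChanges() { _changed.Clear(); }
}
```

Assigning a value equal to the current one does not mark the member as changed, so only real modifications are reported.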

StackExchange.Redis Send Structs

I am new to Redis and I wonder how to send a class or a struct with StackExchange.Redis.
So let's assume I want to write
    var redis = ConnectionMultiplexer.Connect("localhost");
    var db = redis.GetDatabase();
    db.StringSet(key, value);
This is actually only possible if my value is primitive. So is there any other way to send complex types without serializing them as JSON?
Since Redis is not aware of your class or struct, you'll need to define how to store it. A recommended way is to store the object as a hash, where the key is the property name, and the value is the property value. Note that this does not support object graphs, e.g. nested collections or complex types.
As per the documentation on data types:
A hash with a few fields (where few means up to one hundred or so) is stored in a way that takes very little space, so you can store millions of objects in a small Redis instance.
Alternatively, you could serialize the object yourself and store it as a string/byte[]. JSON is one such format; it includes the property names in the data, which is great for versioning: if a new property is added, you don't need to go and change all existing data. The downside is that it takes up a lot of space. You could use any other form of serialization as well, e.g. binary serialization.
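The hash approach described above needs the object flattened into name/value pairs first. The sketch below shows one way to do that with reflection, using only the base class library; `HashFieldMapper` and `Player` are hypothetical names, and nested objects or collections are deliberately not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Reflection;

public static class HashFieldMapper
{
    // Flattens public readable instance properties into name/value string pairs.
    // Nested objects and collections are not expanded (no object graphs).
    public static Dictionary<string, string> ToHashFields(object source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        var fields = new Dictionary<string, string>();
        foreach (PropertyInfo p in source.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip write-only properties and indexers.
            if (!p.CanRead || p.GetIndexParameters().Length > 0) continue;
            object value = p.GetValue(source, null);
            if (value == null) continue; // leave out properties that have no value
            fields[p.Name] = Convert.ToString(value, CultureInfo.InvariantCulture);
        }
        return fields;
    }
}

// Hypothetical type used only for illustration.
public class Player
{
    public string Name { get; set; }
    public int Score { get; set; }
}
```

Each resulting pair could then be turned into a StackExchange.Redis `HashEntry` and written in one call with `db.HashSet(key, entries)`; reading the object back would use `db.HashGetAll(key)` and the reverse mapping.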

Generic serializer with ProtoBuf.net

I'm attempting to write a generic serializer using protobuf.net v2. However I'm running into some issues which make me wonder if perhaps what I'm doing is impossible. The objects to be serialized are of an indeterminate type to which I don't have access so I'm attempting to walk the object and add its properties to the type model.
    var model = TypeModel.Create();
    List<string> propertiesToSerialize = new List<string>();
    foreach (var property in typeToSerialize.GetProperties())
    {
        propertiesToSerialize.Add(property.Name);
    }
    model.AutoAddMissingTypes = true;
    model.Add(typeToSerialize, true).Add(propertiesToSerialize.ToArray());
For simple objects which contain only primitives this seems to work just fine. However when working with an object which contains, say, a Dictionary<string,object> I encounter an error telling me that no serializer is registered for Object.
I did look at serializing a Dictionary<string,object> in ProtoBuf-net fails but it seems the suggested solution requires some knowledge and access to the object being serialized.
Any suggestions on how I might proceed?
protobuf-net does not set out to be able to serialize every scenario (especially those dominated by object), in exactly the same way that XmlSerializer and DataContractSerializer have scenarios which they can't model. In particular, the total lack of metadata in the protobuf format (part of why it is very efficient) means that it is only intended to be consumed by code that knows the structure of the data in advance - which is not possible if too much is object.
That said, there is some support via DynamicType=true, but that would not currently be enabled for the dictionary scenario you mention.
In most cases, though, it isn't really the case that the data can be anything; more typically there are a finite number of expected data types. When that is the case, the object problem can be addressed in a cleaner way using a slightly different model (specifically, a non-generic base-type, a generic sub-type, and a few "include" options). As with most serialization, there are scenarios where it may be desirable to have a separate "DTO" model that looks closer to the serialization output than to your domain model.
A final note: the GetProperties()/Add() approach is not robust, as GetProperties() does not guarantee any particular order to the members; with protobuf-net in the way you show, order is important, as this helps determine the keys to use. Even if the order was fixed (sorting alphabetically, for example), note that adding a member could be a breaking change.
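To see why even a fixed alphabetical order is fragile, consider the small sketch below. It is plain C# rather than protobuf-net code, and `FieldNumbering.AssignByName` is a hypothetical helper that mimics handing out keys in sorted-name order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FieldNumbering
{
    // Assigns 1-based field numbers by sorting member names ordinally.
    public static Dictionary<string, int> AssignByName(IEnumerable<string> memberNames)
    {
        return memberNames
            .OrderBy(n => n, StringComparer.Ordinal)
            .Select((name, index) => new { name, number = index + 1 })
            .ToDictionary(x => x.name, x => x.number);
    }
}
```

For a type with members Name and Age this yields Age = 1 and Name = 2. After adding a City member it yields Age = 1, City = 2 and Name = 3, so data written with the old numbering would be read back into the wrong member. Keys that are stored explicitly with the type do not shift this way.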

C# - Get property in member class using Reflection

SHORT VERSION
What's the best way to use reflection to turn something like string prop = "part1.first_name"; into a System.Reflection.PropertyInfo, so that I can use the GetValue and SetValue functions?
LONG VERSION
I'm using ASP.NET MVC to build a questionnaire for my organization. It's very long, so it's divided into several different pages. Since it's not uncommon for us to get requests like, "Can you move this question to that page, and this other question to another page," I need to build this so that it is flexible enough for a junior programmer to change.
My model is a complex class (it's got five member classes that have mostly primitive-typed properties on them).
So, I access it by doing things like Model.part1.first_name or Model.part2.birth_date.
Since the same model is used on all of the pages, but not all of the questions are on every page, I have ActionAttributes that essentially clear out all of the properties that were submitted on the form except for the ones that were displayed on that page (so someone can't inject a hidden field into the form and have the value persist to the database).
I want to make sure that I only save valid field values and don't let the user proceed to the next page until the current one is entirely OK, but I also want to save the values that are valid, even if the user isn't allowed to proceed.
To do this, I have a function that takes two instances of my model class, a reference to the ModelStateDictionary, and a string[] of field names like "part1.first_name" and "part2.birth_date". That function needs to copy every value listed in the string array that has no validation errors from the first (i.e., form-submitted) object into the second (i.e., loaded from the db) object.
As stated above, what's the best way to use reflection to turn something like "part1.first_name" into a System.Reflection.PropertyInfo, OR, is there a better way to accomplish this?
    var infoParts = prop.Split('.');
    var myType = Type.GetType(infoParts[0]);
    var myPropertyInfo = myType.GetProperty(infoParts[1]);
This assumes "part1" is your type. It is very limited, though: it depends on the string being in the correct format and on the type name being one that Type.GetType can resolve.
I would probably handle this differently, using data. I would keep, in the database, which step each question belongs to. To render that step, I would select the questions that match that step and have a model that contains a list of question id/question pairs. Each input would be identified by the question id when posted back. To validate, simply compare the set of question ids with the expected ids for that step. This way, to change which question goes in which step is to only change the data in the database.
If you do end up going down that road, you'll need to split the string into parts and recursively or iteratively find the property on the object at each step.
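    PropertyInfo property = null;
    Type type = questionModel.GetType();
    object value = questionModel;
    object previousObj = null;
    foreach (var part in questionId.Split('.'))
    {
        // Stop early if an intermediate object is missing.
        if (value == null)
            throw new InvalidOperationException("Null value before '" + part + "'.");
        property = type.GetProperty(part);
        if (property == null)
            throw new ArgumentException("No property '" + part + "' on " + type.Name + ".");
        previousObj = value;
        value = property.GetValue(value, null);
        // Fall back to the declared type when the value itself is null.
        type = value != null ? value.GetType() : property.PropertyType;
    }
    // here, if all goes well, property should contain the correct PropertyInfo and
    // value should contain that property's value...and previousObj should contain
    // the object that owns the property, without which it won't do you much good.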
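The walk above can be wrapped into a pair of helpers that read and write a value by its dotted path. This is a minimal sketch under some assumptions: `PropertyPath`, `Questionnaire` and `Part1` are hypothetical names, the member objects are classes rather than structs, and every intermediate object is expected to be non-null.

```csharp
using System;
using System.Reflection;

public static class PropertyPath
{
    // Walks all but the last segment; returns the final PropertyInfo and the object that owns it.
    private static PropertyInfo Resolve(object root, string path, out object owner)
    {
        if (root == null) throw new ArgumentNullException(nameof(root));
        if (string.IsNullOrEmpty(path)) throw new ArgumentException("Path is empty.", nameof(path));
        string[] parts = path.Split('.');
        owner = root;
        for (int i = 0; i < parts.Length - 1; i++)
        {
            PropertyInfo step = owner.GetType().GetProperty(parts[i]);
            if (step == null) throw new ArgumentException("Unknown property: " + parts[i], nameof(path));
            owner = step.GetValue(owner, null);
            if (owner == null) throw new InvalidOperationException(parts[i] + " is null.");
        }
        string lastName = parts[parts.Length - 1];
        PropertyInfo last = owner.GetType().GetProperty(lastName);
        if (last == null) throw new ArgumentException("Unknown property: " + lastName, nameof(path));
        return last;
    }

    // Reads the value at the dotted path, e.g. "part1.first_name".
    public static object Get(object root, string path)
    {
        object owner;
        PropertyInfo property = Resolve(root, path, out owner);
        return property.GetValue(owner, null);
    }

    // Writes the value at the dotted path on the owning object.
    public static void Set(object root, string path, object value)
    {
        object owner;
        PropertyInfo property = Resolve(root, path, out owner);
        property.SetValue(owner, value, null);
    }
}

// Hypothetical model shapes mirroring the question's "part1.first_name".
public class Part1 { public string first_name { get; set; } }
public class Questionnaire { public Part1 part1 { get; set; } = new Part1(); }
```

Copying a validated field from the submitted model to the stored one then becomes `PropertyPath.Set(stored, name, PropertyPath.Get(submitted, name))` for each field name that has no validation errors.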

ML.NET MakePredictionFunction dynamic type?

I am able to dynamically train and create my regression model just fine from a string[] of column names. However, when I try to pass in a dynamic object that has the same parameter names as dictionary key/value pair properties, it throws the error:
System.ArgumentOutOfRangeException: 'Could not find input column '<MyColumn>'', where <MyColumn> is the first parameter that the model is looking for.
    private static void TestSinglePrediction(MLContext mlContext, dynamic ratingDataSample, int actual)
    {
        ITransformer loadedModel;
        using (var stream = new FileStream(_modelPath, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            loadedModel = mlContext.Model.Load(stream);
        }
        var predictionFunction = loadedModel.MakePredictionFunction<dynamic, RatingPrediction>(mlContext);
        var prediction = predictionFunction.Predict(ratingDataSample);
        Console.WriteLine($"**********************************************************************");
        Console.WriteLine($"Predicted rating: {prediction.Rating:0.####}, actual rating: {actual}");
        Console.WriteLine($"**********************************************************************");
    }
I suspect this is because the dynamic object doesn't contain the [Column] attributes that the standard class object I normally would pass in has.
However, I will eventually have hundreds of columns that are auto generated from transposing SQL queries so manually typing each column isn't a feasible approach for the future.
Is there any way I could perhaps apply the attribute at run time? Or any other way I can generically approach this situation? Thanks!
This is a great question. The dynamic objects don't work at runtime because ML.NET needs something called a SchemaDefinition for the objects that you pass in, so that it knows where to get the columns it expects.
The simplest way to solve your problem would be to define an object holding only the columns you need at scoring time, annotated with Column attributes, and manually cast your dynamic object at runtime. The main advantage is that, since you do the casting to the scoring object yourself, you can handle missing data yourself without the ML.NET runtime throwing. While your SQL query may give you a large assortment of columns, you won't need the majority of them for scoring your model, and therefore don't need to account for them in the scoring object; you only have to account for the columns the model expects.
See this sample from the ML.NET Cookbook for an example of how to score a single row. Behind the scenes, ML.NET is taking the class you defined, and using attributes like Column to construct the SchemaDefinition.
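The manual cast described above could look roughly like this. It is a sketch under assumptions: `RatingInput`, its `UserId` and `MovieId` members and the zero fallback are hypothetical, the dynamic row is assumed to be usable as an `IDictionary<string, object>` (as an ExpandoObject is), and the ML.NET Column attributes are left out so that the example depends only on the base class library.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical scoring type; in ML.NET its members would carry the Column attributes.
public class RatingInput
{
    public float UserId;
    public float MovieId;
}

public static class RatingInputMapper
{
    // Copies only the columns the model expects; extra columns in the row are ignored.
    public static RatingInput FromRow(IDictionary<string, object> row)
    {
        if (row == null) throw new ArgumentNullException(nameof(row));
        return new RatingInput
        {
            UserId = ReadFloat(row, "UserId", 0f),
            MovieId = ReadFloat(row, "MovieId", 0f)
        };
    }

    // Returns the fallback when the column is missing or null instead of throwing.
    private static float ReadFloat(IDictionary<string, object> row, string column, float fallback)
    {
        object raw;
        if (!row.TryGetValue(column, out raw) || raw == null) return fallback;
        return Convert.ToSingle(raw, CultureInfo.InvariantCulture);
    }
}
```

The typed `RatingInput` instance, rather than the dynamic object, would then be passed to the prediction function, and the choice of fallback for missing data stays under your control.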
