Direct memory access to underlying field data - c#

I'm looking for a way to avoid FieldInfo.Get/SetValue overhead, and access memory directly for a few select, known ahead of time, primitive types. (Most specifically, I'm looking to avoid any memory allocations in our custom serializer)
Basically, here's what the official way allows me to do:
System.Object o = someobject;
int inOut = 0;
var type = o.GetType();
var fieldInfos = type.GetFields(BindingFlags.Public | BindingFlags.Instance);
foreach (var fi in fieldInfos) {
fi.SetValue(o, inOut);
inOut = (int)fi.GetValue(o);
}
And here's roughly what I'd like to do:
foreach (var fi in fieldInfos) {
fixed(int* ip = o.basePointer + fi.fieldOffset) {
*p = inOut;
inOut = *p;
}
}
I would use this only for Int32, Single, and possibly bools. I'm primarily interested in getting this working on Mono, so if there's anything Mono specific available, that'd be fine.
Note: I'm well aware of the "you shouldn't be doing this", and "have you profiled it" etc. I know, and I have, which is why I'm looking into this. We have a very specific case, where we control all variables (and all code), but we would like it to work on any 'normal' class without requiring additional markup or explicit struct layout.
EDIT: I should point out that I'm not able to emit dynamic code to solve this. I'm ok with a solution requiring me to write and assemble IL up-front though.

I'm well aware of the "you shouldn't be doing this"
That is good - I'll skip this part of the explanation then, and go straight to a way of accessing fields that avoids memory allocation, while staying within the limits of managed code.
You can use LINQ expressions to construct a Func<ObjType,int> for a getter and Action<ObjType,int> for a setter. Calling these functors would let you get or set int fields as if you were accessing their methods directly.
Here is how you can make a wrapper-free getter:
public class Test
{
public int myfield;
public static void Main()
{
// Make a parameter expression to represent the object
var argExpr = Expression.Parameter(typeof(Test), "a");
// Get the field of your object (the same way as in your first example)
var field = typeof(Test).GetField("myfield", BindingFlags.Public | BindingFlags.Instance);
// Make an expression accessing the field from the parameter
var fieldExpr = Expression.Field(argExpr, field);
// Compile the expression into a functor
var getter = (Func<Test,int>)Expression.Lambda(fieldExpr, argExpr).Compile();
// Construct a test object
var tmp = new Test {myfield = 123};
// Use a wrapper to avoid "boxing"/"unboxing" of "GetValue"
int res = getter(tmp);
Console.WriteLine("Res={0}", res);
}
}
Demo on ideone.
Construct the setter in a similar way, using one more parameter of type int, and Expression.Assign. The resultant lambda will compile into an Action<Test,int> rather than Func<Test,int>, because setters do not return value.

You say, that you can't use dynamic code generation. Here are some other ideas:
If you can work with properties instead of fields, create a delegate to the property getter (https://stackoverflow.com/a/724427).
Generate IL code for your serializer at build time. Compile that into an assembly that you can load at runtime. Just generate accessor code for each and every field. I think you can access private members in IL when FullTrust and SkipVerification permissions are present.

Related

Is it possible to instantiate an object and pass it object initializer variables with Activator.CreateInstance? [duplicate]

I'm not sure of the terminology for this kind of code, but I want to know if it's possible to instantiate variables after the parentheses, but whilst using reflection.
I have a map which gets loaded from an XML file. This is a collection of (int X, int Y, string S) where the X,Y is the position of some terrain, and S is a string representing the type of the terrain. I have a dictionary to pass between the strings and the relevant types; for example one key-value pair might be "Tree", typeof(Tree).
When using reflection, although I know it's possible to instantiate with parameters, the only way I'm comfortable is just by using Activator.CreateInstance(Type t), i.e. with an empty constructor.
When I had the maps hard coded, I would originally instantiate like this (within some i,j for loop):
case: "Tree"
world.Add( new Tree(i,j) );
Whilst starting to think about reflection and my save file, I changed this to:
world.Add( new Tree() { X = i, Y = j }
However, I realised that this won't work with reflection, so I am having to do the following (Tree inherits from Terrain, and the dictionary just converts the XML save data string to a type):
Type type = dictionary[dataItem.typeAsString];
Terrain t = (Terrain)Activator.CreateInstance(type);
t.X = i;
t.Y = j;
world.Add(t);
I would prefer to do this using something like
Type type = dictionary[dataItem.typeAsString];
world.Add((Terrain)Activator.CreateInstance(type) { X = i, Y = j }
Is there any shortcut like this? I guess if not I could edit world.Add to take an X and Y and cast to Terrain in there to access those variables, but I am still curious as to a) what this {var1 = X, var2 = Y} programming is called, and b) whether something similar exists when using reflection.
This syntax is called Object Initializer syntax and is just syntactic sugar for setting the properties.
The code var result = new MyType { X = x } will be compiled to this:
MyType __tmp = new MyType();
__tmp.X = x;
MyType result = __tmp;
You will have to do that yourself using PropertyInfo.SetValue if you know the instantiated type only at runtime or use the normal property setters if the type is known at compile time.
The answer is no, because the object initialization syntax you mention (introduced with LINQ in 3.0) is an illusion of the compiler. As in, when you type this
var foo = new Foo { Bar = "baz" };
the compiler actually converts it into CLS-compliant IL which equates to
var foo = new Foo();
foo.Bar = "baz";
Phil Haack has a great blog post which not only covers the details of this rewriting done by the compiler, but also some side effects it can cause when dealing with types that implement IDisposable
As all of this is nothing but a feint by the compiler, there is no equivalent using reflection (i.e., Activator.CreateInstance(Type t)). Others will give you workarounds, but in the end there really is no direct equivalent.
Probably the closest generic hack you could manage would be to create a method that accepted an object, then used reflection in order to identify the properties of that object and their respective values in order to perform object initialization for you. It might be used something like this
var foo = Supercollider.Initialize<Foo>(new { Bar = "baz" });
and the code would be something like (this is off the top of my head)
public sealed class Supercollider
{
public static T Initialize<T>(object propertySource)
{
// you can provide overloads for types that don't have a default ctor
var result = Activator.CreateInstance(typeof(T));
foreach(var prop in ReflectionHelper.GetProperties(typeof(T)))
ReflectionHelper.SetPropertyValue(
result, // the target
prop, // the PropertyInfo
propertySource); // where we get the value
}
}
You'd have to get each property from the anonymous object, find a property in your target type with the same exact name and type, then get the value from that property in the anonymous object and set the value of your target's property to this value. Its not incredibly hard, but its absolutely prone to runtime exceptions and issues where the compiler chooses a different type for the anonymous type's property, requiring you be more specific (e.g., new { Bar = (string)null }), which screws with the elegance of the thing.
(T)Activator.CreateInstance(typeof(T), param1, param2, ...);
As described HERE.
public sealed class ReflectionUtils
{
public static T ObjectInitializer<T>(Action<T> initialize)
{
var result = Activator.CreateInstance<T>();
initialize(result);
return result;
}
}
public class MyModel
{
public string Name{get;set;}
}
And after that just make the call :
var myModel = ReflectionUtils.ObjectInitializer<MyModel>(m =>
{
m.Name = "Asdf"
});
The advantage is that in this way you will have type safety and use reflection as minimum required, because we all know that reflection is an expensive operation that should be avoided as much as possible.
You could create a constructor which takes those arguments, then use
Activator.CreateInstance(type, i, j)
But you won't be able to use the object initialization syntax. Which is just sugar candy for setting the properties.

Using Reflection to log the Properties of a type [duplicate]

I am trying to improve performance in the code below and kinda know how but not sure which is the best approach.
The first hit will take longer but subsequent hits should be quicker. Now, I could cache T (where T is a class) and then check the cache to see if "T" exists, if so - go ahead and get its related information (NamedArguments) and go through each of the NamedArguments and finally if the criteria matches, go ahead and set the value of the current property.
I just want to make it more efficient and performant. Any ideas?
var myProps = typeof(T).GetProperties(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Static).Where(prop => Attribute.IsDefined(prop, typeof(MyCustomAttribute)) && prop.CanWrite && prop.GetSetMethod() != null);
foreach (var currentProperty in myProps)
{
foreach (var currentAttributeForProperty in currentProperty.GetCustomAttributesData())
{
foreach (var currentNamedArgument in currentAttributeForProperty.NamedArguments)
{
if (string.Equals(currentNamedArgument.MemberInfo.Name, "PropName", StringComparison.OrdinalIgnoreCase))
{
currentAttribParamValue = currentNamedArgument.TypedValue.Value == null ? null : currentNamedArgument.TypedValue.Value.ToString();
// read the reader for the currentAttribute value
if (reader.DoesFieldExist(currentAttribParamValue))
{
var dbRecordValue = reader[currentAttribParamValue] == DBNull.Value ? null : reader[currentAttribParamValue];
// set it in the property
currentProperty.SetValue(val, dbRecordValue, null);
}
break;
}
}
}
}
DynamicMethods or ExpressionTrees will be much* faster than reflection. You could build a cache of all property getter/setters for a type, and then cache that information in a Dictionary (or ConcurrentDictionary) with type as the key.
Expression Tree Basics
Dynamic Methods
Flow
Discover type information (e.g. on app startup).
Compile dynamic methods for each property (do all properties at once).
Store those methods in a metadata class (example follows).
Cache the metadata somewhere (even a static field is fine, as long as access is synchronized). Use the type as the key.
Get the metadata for the type when needed.
Find the appropriate getter/setter.
Invoke, passing the instance on which you wish to act.
// Metadata for a type
public sealed class TypeMetadata<T> {
// The compiled getters for the type; the property name is the key
public Dictionary<string, Func<T, object>> Getters {
get;
set;
}
// The compiled setters for the type; the property name is the key
public Dictionary<string, Action<T, object>> Setters {
get;
set;
}
}
// rough invocation flow
var type = typeof( T);
var metadata = _cache[type];
var propertyName = "MyProperty";
var setter = metadata[propertyName];
var instance = new T();
var value = 12345;
setter( instance, value );
Example Setter
Excerpted from Dynamic Method Implementation (good article on the subject).
I can't vouch that this exact code works, but I've written very similar code myself. If you aren't comfortable with IL, definitely consider an expression tree instead.
public static LateBoundPropertySet CreateSet(PropertyInfo property)
{
var method = new DynamicMethod("Set" + property.Name, null, new[] { typeof(object), typeof(object) }, true);
var gen = method.GetILGenerator();
var sourceType = property.DeclaringType;
var setter = property.GetSetMethod(true);
gen.Emit(OpCodes.Ldarg_0); // Load input to stack
gen.Emit(OpCodes.Castclass, sourceType); // Cast to source type
gen.Emit(OpCodes.Ldarg_1); // Load value to stack
gen.Emit(OpCodes.Unbox_Any, property.PropertyType); // Unbox the value to its proper value type
gen.Emit(OpCodes.Callvirt, setter); // Call the setter method
gen.Emit(OpCodes.Ret);
var result = (LateBoundPropertySet)method.CreateDelegate(typeof(LateBoundPropertySet));
return result;
}
*25-100x faster in my experience
Reflection is notoriously slow in loops, so some kind of caching would probably help. But to decide what to cache, you should measure. As the famous saying goes: "premature optimization is the root of all evil"; you should make sure that you really need to optimize and what exactly to optimize.
For a more concrete advice, attributes are attached at compile time, so you could cache a type and a list of its propertyInfos for example.
I also had similar problem once - reflection is extreemely slow.
I used caching, like you are planning and performance grow more than 10 times. It was never again be a bottleneck in performance.
I've created similar logic before where I've cached an 'execution plan' for each type encountered. It was definitely faster for subsequent runs but you would have to profile your scenario to see whether it's worth the extra code complexity and memory usage.

Static(expression) alternative for type.GetFields()

I'm trying to implement static reflection for all fields in the class.
In other words, I have to create get and set using name for all these fields.
I use the answers on Set field value with Expression tree, and came to the following solution
public class ExpressionSetterGetter
{
public class SetterGetter<T> where T : class
{
public Delegate getter;
public Action<T, object> setter;
}
public static Dictionary<string, SetterGetter<T>> GetFieldSetterGetterExpressions<T>() where T : class
{
var dic = new Dictionary<string, SetterGetter<T>>();
SetterGetter<T> setterGetter;
var type = typeof(T);
var fields = type.GetFields(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public);
foreach (var fieldInfo in fields)
{
var targetExp = Expression.Parameter(type, "target");
var valueExp = Expression.Parameter(typeof(object), "value");
var fieldExp = Expression.Field(targetExp, fieldInfo);
var assignExp = Expression.Assign(fieldExp, Expression.Convert(valueExp, fieldExp.Type));
var fieldSetter = Expression.Lambda<Action<T, object>>(assignExp, targetExp, valueExp).Compile();
ParameterExpression objParm = Expression.Parameter(type, "obj");
MemberExpression fieldExpr = Expression.Field(objParm, fieldInfo.Name);
var fieldExprConverted = Expression.Convert(fieldExpr, typeof(object));
var fieldGetter = Expression.Lambda(fieldExprConverted, objParm).Compile();
setterGetter = new SetterGetter<T>() { setter = fieldSetter, getter = fieldGetter };
dic.Add(fieldInfo.Name, setterGetter);
}
return dic;
}
}
Now, I have these two issues.
I use type.GetFields(), but MSDN says it is a reflection method.
What means that the compiler doesn't know the type before runtime.
Am I right? If it is correct, what is the reason to use the
expression trees. As far as I know expression trees translate to the
code at compile-time time, what means almost no additional costs.
The same logic. What if I put as a parameter a list with name of
fields what have to be wrapped. In other words, instead of
type.GetFields() I simply put the names of fields as parameters.
public static Dictionary<string, SetterGetter<T>> GetFieldSetterGetterExpressions<T>(IEnumerable<string> fieldNames)
Obviously, the list can be not known at compile-time. Again, how the CLR will react?
Expression trees' output is not known in compile time, they are compiled at runtime, which is something different. Advantage of using (cached) expression trees compilation is that reflection heavy lifting is performed only once, during the expression construction and compilation. From there, invocation of the resulting delegate is very fast.
In case you are sure the list specifies only the fields that actually exist in your class and you provide only a subset of all the fields, you could, in theory, somewhat reduce the performance impact of the reflection. But you should keep in mind that you will still need the FieldInfo instances for each of them which could only be retrieved using reflection. So the overall price of invoking GetField for every known field will probably be even higher than just calling GetFields for all of them.
The easiest way to set fields value in compile time is converting your private fields to public properties. Then, you can set them as you wish.
And this is not a joke on you, it's an explanation of the difference - C# doesn't allow setting fields from outside, except using reflection API's. And those API's designed for runtime, not for compile time.
If you'd like to bend the rules and produce such a code at compilation time (precisely, at post-compilation time), you should use some third party library such as PostSharp or Fody.
And that's the end of the story, so long as you are using C# in it's current or one of the previous versions.

Improving performance by caching or delegate call?

I am trying to improve performance in the code below and kinda know how but not sure which is the best approach.
The first hit will take longer but subsequent hits should be quicker. Now, I could cache T (where T is a class) and then check the cache to see if "T" exists, if so - go ahead and get its related information (NamedArguments) and go through each of the NamedArguments and finally if the criteria matches, go ahead and set the value of the current property.
I just want to make it more efficient and performant. Any ideas?
var myProps = typeof(T).GetProperties(BindingFlags.Instance | BindingFlags.NonPublic | BindingFlags.Public | BindingFlags.Static).Where(prop => Attribute.IsDefined(prop, typeof(MyCustomAttribute)) && prop.CanWrite && prop.GetSetMethod() != null);
foreach (var currentProperty in myProps)
{
foreach (var currentAttributeForProperty in currentProperty.GetCustomAttributesData())
{
foreach (var currentNamedArgument in currentAttributeForProperty.NamedArguments)
{
if (string.Equals(currentNamedArgument.MemberInfo.Name, "PropName", StringComparison.OrdinalIgnoreCase))
{
currentAttribParamValue = currentNamedArgument.TypedValue.Value == null ? null : currentNamedArgument.TypedValue.Value.ToString();
// read the reader for the currentAttribute value
if (reader.DoesFieldExist(currentAttribParamValue))
{
var dbRecordValue = reader[currentAttribParamValue] == DBNull.Value ? null : reader[currentAttribParamValue];
// set it in the property
currentProperty.SetValue(val, dbRecordValue, null);
}
break;
}
}
}
}
DynamicMethods or ExpressionTrees will be much* faster than reflection. You could build a cache of all property getter/setters for a type, and then cache that information in a Dictionary (or ConcurrentDictionary) with type as the key.
Expression Tree Basics
Dynamic Methods
Flow
Discover type information (e.g. on app startup).
Compile dynamic methods for each property (do all properties at once).
Store those methods in a metadata class (example follows).
Cache the metadata somewhere (even a static field is fine, as long as access is synchronized). Use the type as the key.
Get the metadata for the type when needed.
Find the appropriate getter/setter.
Invoke, passing the instance on which you wish to act.
// Metadata for a type
public sealed class TypeMetadata<T> {
// The compiled getters for the type; the property name is the key
public Dictionary<string, Func<T, object>> Getters {
get;
set;
}
// The compiled setters for the type; the property name is the key
public Dictionary<string, Action<T, object>> Setters {
get;
set;
}
}
// rough invocation flow
var type = typeof( T);
var metadata = _cache[type];
var propertyName = "MyProperty";
var setter = metadata[propertyName];
var instance = new T();
var value = 12345;
setter( instance, value );
Example Setter
Excerpted from Dynamic Method Implementation (good article on the subject).
I can't vouch that this exact code works, but I've written very similar code myself. If you aren't comfortable with IL, definitely consider an expression tree instead.
public static LateBoundPropertySet CreateSet(PropertyInfo property)
{
var method = new DynamicMethod("Set" + property.Name, null, new[] { typeof(object), typeof(object) }, true);
var gen = method.GetILGenerator();
var sourceType = property.DeclaringType;
var setter = property.GetSetMethod(true);
gen.Emit(OpCodes.Ldarg_0); // Load input to stack
gen.Emit(OpCodes.Castclass, sourceType); // Cast to source type
gen.Emit(OpCodes.Ldarg_1); // Load value to stack
gen.Emit(OpCodes.Unbox_Any, property.PropertyType); // Unbox the value to its proper value type
gen.Emit(OpCodes.Callvirt, setter); // Call the setter method
gen.Emit(OpCodes.Ret);
var result = (LateBoundPropertySet)method.CreateDelegate(typeof(LateBoundPropertySet));
return result;
}
*25-100x faster in my experience
Reflection is notoriously slow in loops, so some kind of caching would probably help. But to decide what to cache, you should measure. As the famous saying goes: "premature optimization is the root of all evil"; you should make sure that you really need to optimize and what exactly to optimize.
For a more concrete advice, attributes are attached at compile time, so you could cache a type and a list of its propertyInfos for example.
I also had similar problem once - reflection is extreemely slow.
I used caching, like you are planning and performance grow more than 10 times. It was never again be a bottleneck in performance.
I've created similar logic before where I've cached an 'execution plan' for each type encountered. It was definitely faster for subsequent runs but you would have to profile your scenario to see whether it's worth the extra code complexity and memory usage.

Property / field initializers in code generation

I'm generating code in a visual studio extension using CodeDom and plain code strings. My extension reads a current classes declared fields and properties using reflection and generates contructors, initializers, implements certain interfaces, etc.
The generator class is simple:
public class CodeGenerator < T >
{
public string GetCode ()
{
string code = "";
T type = typeof(T);
List < PropertyInfo > properties = t.GetProperties();
foreach (PropertyInfo property in properties)
code += "this." + property.Name + " = default(" + property.PropertyType.Name + ")";
}
}
I'm stuck at field and property initializers in two ways.
Firstly, although default(AnyNonGenericValueOrReferenceType) seems to work in most cases, I'm uncomfortable with using it in generated code.
Secondly, it does not work for generic types since I can't find a way to get the underlying type of the generic type. So if a property is List < int >, property.PropertyType.Name returns List`1. There are two problems here. First, I need to get the proper name for the generic type without using string manipulation. Second, I need to access the underlying type. The full property type name returns something like:
System.Collections.Generic.List`1[[System.Int32, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]
Before I try to answer, I feel compelled to point out that what you're doing seems redundant. Assuming that you are putting this code into a constructor, generating something like:
public class Foo
{
private int a;
private bool b;
private SomeType c;
public Foo()
{
this.a = default(int);
this.b = default(bool);
this.c = default(SomeType);
}
}
is unnecessary. That already happens automatically when a class is constructed. (In fact, some quick testing shows that these assignments aren't even optimized away if they're done explicitly in the constructor, though I suppose the JITter could take care of that.)
Second, the default keyword was designed in large part to do exactly what you're doing: to provide a way to assign the "default" value to a variable whose type is unknown at compile time. It was introduced for use by generic code, I assume, but auto-generated code is certainly correct in using it as well.
Keep in mind that the default value of a reference type is null, so
this.list = default(List<int>);
does not construct a new List<int>, it just sets this.list to null. What I suspect you want to do, instead, is to use the Type.IsValueType property to leave value types at their default values, and initialize reference types using new.
Lastly, I think what you're looking for here is the IsGenericType property of the Type class and the corresponding GetGenericArguments() method:
foreach (PropertyInfo property in properties)
{
if (property.Type.IsGenericType)
{
var subtypes = property.Type.GetGenericArguments();
// construct full type name from type and subtypes.
}
else
{
code += "this." + property.Name + " = default(" + property.PropertyType.Name + ")";
}
}
EDIT:
As far as constructing something useful for a reference type, a common technique I've seen used by generated code is to require a parameterless constructor for any class that you expect to use. It's easy enough to see if a class has a parameterless constructor, by calling Type.GetConstructor(), passing in an empty Type[] (e.g. Type.EmptyTypes), and see if it returns a ConstructorInfo or null. Once that has been established, simply replacing default(typename) with new typename() should achieve what you need.
More generally you can supply any array of types to that method to see if there's a matching constructor, or call GetConstructors() to get them all. Things to look out for here are the IsPublic, IsStatic, and IsGenericMethod fields of the ConstructorInfo, to find one you can actually call from wherever this code is being generated.
The problem you are trying to solve, though, is going to become arbitrarily complex unless you can place some constraints on it. One option would be to find an arbitrary constructor and build a call that looks like this:
var line = "this." + fieldName + " = new(";
foreach ( var param in constructor.GetParameters() )
{
line += "default(" + param.ParameterType.Name + "),";
}
line = line.TrimEnd(',') + ");"
(Note this is for illustrative purposes only, I'd probably use CodeDOM here, or at least a StringBuilder :)
But of course, now you have the problem of determining the appropriate type name for each parameter, which themselves could be generics. And the reference type parameters would all be initialized to null. And there's no way of knowing which of the arbitrarily many constructors you can pick from actually produces a usable object (some of them may do bad things, like assume you're going to set properties or call methods immediately after you construct an instance.)
How you go about solving those issues is not a technical one: you can recursively apply this same logic to each parameter as far down as you're willing to go. It's a matter of deciding, for your use case, how complex you need to be and what kind of limits you're willing to place on the users.
If you are sure you want to use strings, you will have to write your own method to format those type names. Something like:
static string FormatType(Type t)
{
string result = t.Name;
if (t.IsGenericType)
{
result = string.Format("{0}<{1}>",
result.Split('`')[0],
string.Join(",", t.GetGenericArguments().Select(FormatType)));
}
return result;
}
This code assumes you have all necessary usings in your file.
But I think it's much better to actually use CodeDOM's object model. This way, you don't have to worry about usings, formatting types or typos:
var statement =
new CodeAssignStatement(
new CodePropertyReferenceExpression(new CodeThisReferenceExpression(), property.Name),
new CodeDefaultValueExpression(new CodeTypeReference(property.PropertyType)));
And if you really don't want to use default(T), you can find out whether the type is a reference or value type. If it's a reference type, use null. If it's value type, the default constructor has to exist, and so you can call that.

Categories