Getting the Local Variables, ExceptionHandlingClauses and IL array from a dynamic method - c#

From a MethodBase, I need to get an array of the IL instructions, the location and kind of that method's exception handlers, and the local variables in that method.
Currently, for non-dynamic methods, I just do this:
MethodBody pBody = pMethod.GetMethodBody();
Method = pMethod;
MethodName = pMethod.Name;
IsPublic = pMethod.IsPublic;
IsStatic = pMethod.IsStatic;
Instructions = pBody.GetILAsByteArray();
Variables = pBody.LocalVariables.ToArray();
Module = pMethod.Module;
MethodInfo pMethodInfo = pMethod as MethodInfo;
ReturnType = pMethodInfo == null ? typeof(void) : pMethodInfo.ReturnType;
ExceptionHandlers = pBody.ExceptionHandlingClauses.ToArray();
Position = 0;
mhashExceptionParameters = new HashSet<ParameterExpression>();
mdictLocalExpressions = new Dictionary<LocalVariableInfo, ParameterExpression>();
However, as has been noted before, GetMethodBody() doesn't work for dynamic methods (e.g. a method obtained by compiling a LINQ expression). I know certain answers here show how to get the IL instructions. But how do I get the other parts of a MethodBody, such as the local variables and exception handlers?
For reference, see this question where solutions to getting the IL byte array are given.
How do I get an IL bytearray from a DynamicMethod?

Related

Generating enumerator.Current instead of (int)((List<T>.Enumerator*)(byte*)enumerator)->Current

I am currently working on an API.
Whenever the user marks a type with a certain attribute, I want to create a new List<int> field and loop through it, performing some operations.
Here is some relevant code:
TypeReference intTypeReference = moduleDefinition.ImportReference(typeof(int));
TypeReference listType = moduleDefinition.ImportReference(typeof(List<>));
GenericInstanceType intListType = listType.MakeGenericInstanceType(intTypeReference);
var numberList =
    new FieldDefinition(
        name: "Numbers",
        attributes: field.Attributes,
        fieldType: moduleDefinition.ImportReference(intListType));
generatedType.Fields.Add(numberList);
Type enumeratorType = typeof(List<>.Enumerator);
var enumeratorTypeReference = moduleDefinition.ImportReference(enumeratorType);
GenericInstanceType intEnumeratorType = enumeratorTypeReference.MakeGenericInstanceType(intTypeReference);
var enumeratorVariable = new VariableDefinition(intEnumeratorType);
convertMethod.Body.Variables.Add(enumeratorVariable);
ilProcessor.Emit(OpCodes.Ldarg_0); // this
ilProcessor.Emit(OpCodes.Ldfld, numberList);
MethodReference getEnumeratorMethodReference =
    new MethodReference(
        name: "GetEnumerator",
        returnType: intEnumeratorType,
        declaringType: intListType)
    {
        HasThis = true
    };
ilProcessor.Emit(OpCodes.Callvirt, getEnumeratorMethodReference);
ilProcessor.Emit(OpCodes.Stloc, enumeratorVariable);
TypeDefinition enumeratorTypeDefinition = enumeratorTypeReference.Resolve();
MethodDefinition getCurrentMethod =
    enumeratorTypeDefinition.Properties.Single(p => p.Name == "Current").GetMethod;
MethodDefinition moveNextMethod =
    enumeratorTypeDefinition.Methods.Single(m => m.Name == "MoveNext");
MethodReference getCurrentMethodReference = moduleDefinition.ImportReference(getCurrentMethod);
MethodReference moveNextMethodReference = moduleDefinition.ImportReference(moveNextMethod);
// Call enumerator.Current
ilProcessor.Emit(OpCodes.Ldloc, enumeratorVariable);
ilProcessor.Emit(OpCodes.Callvirt, getCurrentMethodReference);
// Store it inside currentVariable
ilProcessor.Emit(OpCodes.Stloc, currentVariable);
ilProcessor.Emit(OpCodes.Nop);
Here is the relevant output:
List<int>.Enumerator enumerator = Numbers.GetEnumerator();
int value = (int)((List<T>.Enumerator*)(byte*)enumerator)->Current;
List<int>.Enumerator enumerator = Numbers.GetEnumerator(); is my desired result. However, int value = (int)((List<T>.Enumerator*)(byte*)enumerator)->Current; is obviously not what I want.
What should I do differently so that my output becomes int value = enumerator.Current, instead of the unreadable mess that it currently is?
List<T>.Enumerator is a value type. As such, you need to be calling methods on the address of the enumerator variable, rather than on its value.
You also can't use callvirt on value types (although you can do constrained virtual calls, which is useful for calling some methods from object). You need to use call here. This isn't a problem, because value types can't be subclassed, so you know the exact method you're calling.
Therefore you need to:
ilProcessor.Emit(OpCodes.Ldloca_S, enumeratorVariable);
ilProcessor.Emit(OpCodes.Call, getCurrentMethodReference);
This explains why you're getting the strange decompiled output: the decompiler knows that Current can only be called on the address of enumerator, but it also sees that you're actually calling it on the value, so it concocts the cast to turn enumerator into a pointer to a List<T>.Enumerator.
You can see that on SharpLab.
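The same rule can be checked without Cecil. Below is a minimal sketch that uses System.Reflection.Emit instead of the Mono.Cecil API from the question; the names EnumeratorAddressSketch and BuildFirstItem are made up for illustration. It stores the enumerator in a local, then uses ldloca plus call for both MoveNext and the Current getter, and it assumes the list passed in is not empty.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection.Emit;

public static class EnumeratorAddressSketch
{
    // Builds the equivalent of:
    //   int FirstItem(List<int> list) { var e = list.GetEnumerator(); e.MoveNext(); return e.Current; }
    public static Func<List<int>, int> BuildFirstItem()
    {
        var method = new DynamicMethod("FirstItem", typeof(int), new[] { typeof(List<int>) });
        ILGenerator il = method.GetILGenerator();
        LocalBuilder enumerator = il.DeclareLocal(typeof(List<int>.Enumerator));

        // List<int> is a reference type, so callvirt on the value is fine here.
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Callvirt, typeof(List<int>).GetMethod("GetEnumerator"));
        il.Emit(OpCodes.Stloc, enumerator);

        // The enumerator is a struct: load its address (ldloca) and use call.
        il.Emit(OpCodes.Ldloca, enumerator);
        il.Emit(OpCodes.Call, typeof(List<int>.Enumerator).GetMethod("MoveNext"));
        il.Emit(OpCodes.Pop); // discard the bool result; the sketch assumes a non-empty list

        il.Emit(OpCodes.Ldloca, enumerator);
        il.Emit(OpCodes.Call, typeof(List<int>.Enumerator).GetProperty("Current").GetGetMethod());
        il.Emit(OpCodes.Ret);

        return (Func<List<int>, int>)method.CreateDelegate(typeof(Func<List<int>, int>));
    }
}
```

Calling EnumeratorAddressSketch.BuildFirstItem() and invoking the result on a list returns that list's first element. Replacing either Ldloca with Ldloc reproduces the mistake described above, where the method is called on the value instead of its address.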

When compiling C# expression trees into methods, is it possible to access "this"?

I am trying to dynamically generate a class that implements a given interface. Because of this, I need to implement some methods. I would like to avoid directly emitting IL instructions, so I am trying to use Expression trees and CompileToMethod. Unfortunately, some of these methods need to access a field of the generated class (as if I wrote this.field into the method I am implementing). Is it possible to access "this" using expression trees? (By "this" I mean the object the method will be called on.)
If yes, what would a method like this look like with expression trees?
int SomeMethod() {
    return this.field.SomeOtherMethod();
}
Expression.Constant or ParameterExpression are your friends; examples:
var obj = Expression.Constant(this);
var field = Expression.PropertyOrField(obj, "field");
var call = Expression.Call(field, field.Type.GetMethod("SomeOtherMethod"));
var lambda = Expression.Lambda<Func<int>>(call);
or:
var obj = Expression.Parameter(typeof(SomeType));
var field = Expression.PropertyOrField(obj, "field");
var call = Expression.Call(field, field.Type.GetMethod("SomeOtherMethod"));
var lambda = Expression.Lambda<Func<SomeType, int>>(call, obj);
(in the latter case, you'd pass this in as a parameter, but it means you can store the lambda and re-use it for different target instance objects)
Another option here might be dynamic if your names are fixed:
dynamic obj = someDuckTypedObject;
int i = obj.field.SomeOtherMethod();

c# DynamicMethod exception

I have the following code
var dynamicAdd2 = new DynamicMethod("add",
    typeof(string),
    new[] { typeof(TestType) },
    typeof(Program).Module, false);
var add2Body = typeof(Program).GetMethod("add2").GetMethodBody().GetILAsByteArray();
var dynamicIlInfo = dynamicAdd2.GetDynamicILInfo();
var ilGenerator = dynamicAdd2.GetILGenerator();
dynamicIlInfo.SetLocalSignature(SignatureHelper.GetLocalVarSigHelper().GetSignature());
dynamicIlInfo.SetCode(add2Body, 1000);
var test2 = (Func<TestType, string>)dynamicAdd2.CreateDelegate(typeof(Func<TestType, string>));
var ret2 = test2(new TestType()); // <-- Exception
the add2:
public string add2(TestType digit)
{
    return digit.Name;
}
the testType:
public class TestType
{
    public string Name = "test";
}
I get an InvalidProgramException with no further information.
So I expect that the creation of the dynamic method fails. I think the dynamic method cannot find the references to TestType. What could be wrong in this case? Or what can I do to get a hint about where the problem lies? The exception doesn't carry the information I need.
You cannot directly copy the IL stream from an existing method to a dynamic method, because IL uses so-called tokens (32-bit numbers) to represent types, methods or fields. For the same field, the value of the token can differ between modules, so byte-copying a method's IL stream without replacing the tokens results in an invalid program.
The second problem is that, because add2 is an instance method (not static), you must add an instance of the type that the method belongs to as the first argument of the method. In C# this first argument of instance methods is hidden, but IL requires it. Alternatively, you can declare the method as static to avoid this.
The third problem is that the add2 method contains a (compiler-generated) local variable. You have to add this variable to the local signature (using the SetLocalSignature() method), otherwise your method would use an undeclared variable. (See the code below for how to do that.)
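To make the token point concrete, here is a small sketch that reads the operand of ldfld from a method's IL and resolves it against the module that owns the method. TokenSketchType, TokenSketch and ReadName are hypothetical names, not part of the question's code. The byte scan for 0x7B is deliberately naive, and BitConverter is assumed to run on a little-endian machine, which matches the byte order used in IL.

```csharp
using System;
using System.Reflection;

public class TokenSketchType
{
    public string Name = "test";
}

public static class TokenSketch
{
    public static string ReadName(TokenSketchType value)
    {
        return value.Name;
    }

    // Finds the field token that follows the first 0x7B (ldfld) byte in ReadName's IL
    // and resolves it against the module that owns ReadName.
    public static FieldInfo ResolveLdfldOperand()
    {
        MethodInfo method = typeof(TokenSketch).GetMethod("ReadName");
        byte[] il = method.GetMethodBody().GetILAsByteArray();

        // Naive scan: good enough for this tiny method, not a real IL parser.
        int opcodeIndex = Array.IndexOf(il, (byte)0x7B);
        int token = BitConverter.ToInt32(il, opcodeIndex + 1);

        // The token is only meaningful relative to this module's metadata tables.
        return method.Module.ResolveField(token);
    }
}
```

The number read here resolves correctly only through method.Module. Handing the same four bytes to a dynamic method, which keeps its own token table, is what produces the invalid program.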
Solution 1:
The first solution is to use GetILGenerator() instead of GetDynamicILInfo(), and rewrite the IL stream from scratch. You can use an IL disassembler (e.g. ILDASM, .NET Reflector) to get the list of instructions for any existing method. Writing these instructions to the ILGenerator using ILGenerator.Emit(...) should not be difficult.
static void Main(string[] args)
{
    // The first parameter (Program) plays the role of the hidden "this" argument of add2
    var dynamicAdd2 = new DynamicMethod("add",
        typeof(string),
        new[] { typeof(Program), typeof(TestType) },
        typeof(Program).Module,
        false);
    var ilGenerator = dynamicAdd2.GetILGenerator();
    ilGenerator.DeclareLocal(typeof(string));
    // Argument 1 is the TestType instance; argument 0 is the Program instance
    ilGenerator.Emit(OpCodes.Ldarg_1);
    var fld = typeof(TestType).GetField("Name");
    ilGenerator.Emit(OpCodes.Ldfld, fld);
    ilGenerator.Emit(OpCodes.Ret);
    // Bind the Program instance as the first argument of the delegate
    var test2 = (Func<TestType, string>)dynamicAdd2.CreateDelegate(typeof(Func<TestType, string>), new Program());
    var ret2 = test2(new TestType());
}
Solution 2:
If you cannot use ILGenerator and you require direct IL stream manipulation using GetDynamicILInfo, you have to replace the tokens in the IL stream with values that are valid for the generated dynamic method. Replacing tokens generally requires you to know the offsets of these tokens in the IL stream. The problem is that the exact offset depends on the compiler (and even differs between Release and Debug builds). So you either have to use an IL disassembler to get these offsets, or write an IL parser able to do that (which is not trivial; maybe you can find a library for it). The following code therefore uses a kind of "dirty hack" to make it work in this particular case, but it does not work in general.
public static void Main()
{
    var dynamicAdd2 = new DynamicMethod("add",
        typeof(string),
        new[] { typeof(Program), typeof(TestType) },
        typeof(Program).Module,
        false);
    var add2Body = typeof(Program).GetMethod("add2").GetMethodBody();
    var add2ILStream = add2Body.GetILAsByteArray();
    var dynamicIlInfo = dynamicAdd2.GetDynamicILInfo();
    var token = dynamicIlInfo.GetTokenFor(typeof(TestType).GetField("Name").FieldHandle);
    var tokenBytes = BitConverter.GetBytes(token);
    // This tries to find the index of the token used by ldfld by searching for its opcode (0x7B) in the IL stream.
    // The token follows this instruction, so I add +1. This works well for this simple method, but
    // will not work in the general case, because the IL stream could contain 0x7B in other unrelated places.
    var tokenIndex = add2ILStream.ToList().IndexOf(0x7b) + 1;
    Array.Copy(tokenBytes, 0, add2ILStream, tokenIndex, 4);
    // Copy the signature of the local variables from the original add2 method
    var localSignature = SignatureHelper.GetLocalVarSigHelper();
    var localVarTypes = add2Body.LocalVariables.Select(_ => _.LocalType).ToArray();
    localSignature.AddArguments(localVarTypes, null, null);
    dynamicIlInfo.SetLocalSignature(localSignature.GetSignature());
    dynamicIlInfo.SetCode(add2ILStream, 1);
    // The first parameter (Program) is bound to the instance passed here, as in Solution 1
    var test2 = (Func<TestType, string>)dynamicAdd2.CreateDelegate(typeof(Func<TestType, string>), new Program());
    var ret2 = test2(new TestType());
}

Direct memory access to underlying field data

I'm looking for a way to avoid FieldInfo.Get/SetValue overhead, and access memory directly for a few select, known ahead of time, primitive types. (Most specifically, I'm looking to avoid any memory allocations in our custom serializer)
Basically, here's what the official way allows me to do:
System.Object o = someobject;
int inOut = 0;
var type = o.GetType();
var fieldInfos = type.GetFields(BindingFlags.Public | BindingFlags.Instance);
foreach (var fi in fieldInfos) {
    fi.SetValue(o, inOut);
    inOut = (int)fi.GetValue(o);
}
And here's roughly what I'd like to do:
foreach (var fi in fieldInfos) {
    fixed (int* p = o.basePointer + fi.fieldOffset) {
        *p = inOut;
        inOut = *p;
    }
}
I would use this only for Int32, Single, and possibly bools. I'm primarily interested in getting this working on Mono, so if there's anything Mono specific available, that'd be fine.
Note: I'm well aware of the "you shouldn't be doing this", and "have you profiled it" etc. I know, and I have, which is why I'm looking into this. We have a very specific case, where we control all variables (and all code), but we would like it to work on any 'normal' class without requiring additional markup or explicit struct layout.
EDIT: I should point out that I'm not able to emit dynamic code to solve this. I'm ok with a solution requiring me to write and assemble IL up-front though.
I'm well aware of the "you shouldn't be doing this"
That is good - I'll skip this part of the explanation then, and go straight to a way of accessing fields that avoids memory allocation, while staying within the limits of managed code.
You can use LINQ expressions to construct a Func<ObjType,int> for a getter and an Action<ObjType,int> for a setter. Calling these functors lets you get or set int fields as if you were accessing the fields directly.
Here is how you can make a wrapper-free getter:
public class Test
{
    public int myfield;
    public static void Main()
    {
        // Make a parameter expression to represent the object
        var argExpr = Expression.Parameter(typeof(Test), "a");
        // Get the field of your object (the same way as in your first example)
        var field = typeof(Test).GetField("myfield", BindingFlags.Public | BindingFlags.Instance);
        // Make an expression accessing the field from the parameter
        var fieldExpr = Expression.Field(argExpr, field);
        // Compile the expression into a functor
        var getter = (Func<Test,int>)Expression.Lambda(fieldExpr, argExpr).Compile();
        // Construct a test object
        var tmp = new Test {myfield = 123};
        // Call the compiled getter; unlike "GetValue", this avoids "boxing"/"unboxing"
        int res = getter(tmp);
        Console.WriteLine("Res={0}", res);
    }
}
Demo on ideone.
Construct the setter in a similar way, using one more parameter of type int, and Expression.Assign. The resultant lambda will compile into an Action<Test,int> rather than a Func<Test,int>, because setters do not return a value.
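A minimal sketch of such a setter follows. SetterSketchTarget and SetterSketch are illustrative names standing in for the Test class above. Note that Expression.Assign itself has type int, which is allowed as the body of a void-returning delegate; the result is simply discarded.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public class SetterSketchTarget
{
    public int myfield;
}

public static class SetterSketch
{
    // Builds (target, value) => target.myfield = value
    public static Action<SetterSketchTarget, int> BuildSetter()
    {
        FieldInfo field = typeof(SetterSketchTarget).GetField("myfield", BindingFlags.Public | BindingFlags.Instance);
        ParameterExpression target = Expression.Parameter(typeof(SetterSketchTarget), "target");
        ParameterExpression value = Expression.Parameter(typeof(int), "value");
        BinaryExpression assign = Expression.Assign(Expression.Field(target, field), value);
        return Expression.Lambda<Action<SetterSketchTarget, int>>(assign, target, value).Compile();
    }
}
```

As with the getter, the compiled delegate should be built once and cached, since Compile() is far more expensive than the call it produces.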
You say that you can't use dynamic code generation. Here are some other ideas:
If you can work with properties instead of fields, create a delegate to the property getter (https://stackoverflow.com/a/724427).
Generate IL code for your serializer at build time. Compile that into an assembly that you can load at runtime. Just generate accessor code for each and every field. I think you can access private members in IL when FullTrust and SkipVerification permissions are present.
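The first idea can be sketched as follows, assuming the data is exposed through a property rather than a field. PropertySketchTarget and PropertyDelegateSketch are hypothetical names. Delegate.CreateDelegate binds the get accessor directly, so no code is generated at runtime.

```csharp
using System;
using System.Reflection;

public class PropertySketchTarget
{
    public int Value { get; set; }
}

public static class PropertyDelegateSketch
{
    // Binds the property's get accessor as an open-instance delegate:
    // the first delegate argument becomes "this" of the accessor.
    public static Func<PropertySketchTarget, int> BuildGetter()
    {
        MethodInfo getMethod = typeof(PropertySketchTarget).GetProperty("Value").GetGetMethod();
        return (Func<PropertySketchTarget, int>)Delegate.CreateDelegate(
            typeof(Func<PropertySketchTarget, int>), getMethod);
    }
}
```

The returned delegate takes the target object and returns the int without boxing; a setter can be bound the same way from GetSetMethod() with an Action<PropertySketchTarget, int>.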

Conversion problem with Expression Trees

I have an expression tree function from a previous SO question. It basically allows the conversion of a data row into a specific class.
This code works fine unless you're dealing with data types that come in different sizes (e.g. Int32/Int64).
The code throws an invalid cast exception when going from an Int64 to an Int32, even when the value would fit in an Int32 (e.g. numbers in the 3000s).
Should I:
Attempt to fix this in the code? (If so, any pointers?)
Leave the code as it is?
private Func<SqlDataReader, T> getExpressionDelegate<T>()
{
    // hang on to row[string] property
    var indexerProperty = typeof(SqlDataReader).GetProperty("Item", new[] { typeof(string) });
    // list of statements in our dynamic method
    var statements = new List<Expression>();
    // store instance for setting of properties
    ParameterExpression instanceParameter = Expression.Variable(typeof(T));
    ParameterExpression sqlDataReaderParameter = Expression.Parameter(typeof(SqlDataReader));
    // create and assign new T to variable: var instance = new T();
    BinaryExpression createInstance = Expression.Assign(instanceParameter, Expression.New(typeof(T)));
    statements.Add(createInstance);
    foreach (var property in typeof(T).GetProperties())
    {
        // instance.MyProperty
        MemberExpression getProperty = Expression.Property(instanceParameter, property);
        // row[property] -- NOTE: this assumes column names are the same as PropertyInfo names on T
        IndexExpression readValue = Expression.MakeIndex(sqlDataReaderParameter, indexerProperty, new[] { Expression.Constant(property.Name) });
        // instance.MyProperty = row[property]
        BinaryExpression assignProperty = Expression.Assign(getProperty, Expression.Convert(readValue, property.PropertyType));
        statements.Add(assignProperty);
    }
    var returnStatement = instanceParameter;
    statements.Add(returnStatement);
    var body = Expression.Block(instanceParameter.Type, new[] { instanceParameter }, statements.ToArray());
    var lambda = Expression.Lambda<Func<SqlDataReader, T>>(body, sqlDataReaderParameter);
    // cache me!
    return lambda.Compile();
}
Update:
I have now given up and decided it is not worth it. From the comments below, I got as far as:
if (readValue.Type != property.PropertyType)
{
    BinaryExpression assignProperty = Expression.Assign(getProperty, Expression.Convert(Expression.Call(property.PropertyType, "Parse", null, new Expression[] { Expression.ConvertChecked(readValue, typeof(string)) }), property.PropertyType));
    statements.Add(assignProperty);
}
else
{
    // instance.MyProperty = row[property]
    BinaryExpression assignProperty = Expression.Assign(getProperty, Expression.Convert(readValue, property.PropertyType));
    statements.Add(assignProperty);
}
I don't think I was too far off, feel free to finish it and post the answer if you figure it out :)
You could try to fix it with a checked conversion before assigning, i.e. by using Expression.ConvertChecked on the value instead of Expression.Convert.
I couldn't try it right now, but this should take care of the case you describe...
EDIT - as per comment this could be a boxing issue:
In this case you could try using Expression.TypeAs or Expression.Unbox for the conversion or use Expression.Call for calling a method to do the conversion... an example for using Call can be found at http://msdn.microsoft.com/en-us/library/bb349020.aspx
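To illustrate the boxing issue and the Expression.Call route, here is a hedged sketch; BoxedConversionSketch and its method names are made up, and Convert.ChangeType is one possible conversion method rather than the one the linked page uses. The indexer returns object, so Expression.Convert(readValue, typeof(int)) compiles to a plain unbox, which fails when the box holds an Int64. Converting first and unboxing afterwards avoids that.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class BoxedConversionSketch
{
    // Compiles value => (int)value: a plain unbox, which requires a boxed Int32.
    public static Func<object, int> BuildDirectCast()
    {
        ParameterExpression value = Expression.Parameter(typeof(object), "value");
        return Expression.Lambda<Func<object, int>>(Expression.Convert(value, typeof(int)), value).Compile();
    }

    // Compiles value => (int)Convert.ChangeType(value, typeof(int)):
    // the numeric conversion happens first, then the boxed Int32 is unboxed.
    public static Func<object, int> BuildViaChangeType()
    {
        ParameterExpression value = Expression.Parameter(typeof(object), "value");
        MethodInfo changeType = typeof(Convert).GetMethod("ChangeType", new[] { typeof(object), typeof(Type) });
        Expression converted = Expression.Call(changeType, value, Expression.Constant(typeof(int), typeof(Type)));
        return Expression.Lambda<Func<object, int>>(Expression.Convert(converted, typeof(int)), value).Compile();
    }
}
```

Passing a boxed 3000L to the delegate from BuildDirectCast throws InvalidCastException, while the delegate from BuildViaChangeType returns 3000. The sketch does not handle DBNull or nullable targets, and Convert.ChangeType still boxes its result, so it fixes the cast but not the allocation cost.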
What you're trying to build is actually much more complicated if you want to support 100% of the primitives in .NET and SQL.
If you don't care about some of the edge cases (nullable types, enums, byte arrays, etc), two tips to get you 90% there:
Don't use the indexer on IDataRecord, it returns an object and the boxing/unboxing will kill performance. Instead, notice that IDataRecord has Get[typeName] methods on it. These exist for all .NET primitive types (note: it's GetFloat, not GetSingle, huge annoyance).
You can use IDataRecord.GetFieldType to figure out which Get method you need to call for a given column. Once you have that, you can use Expression.Convert to coerce the DB column type to the target property's type (if they're different). This will fail for some of the edge cases I listed above, for those you need custom logic.
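The two tips can be combined roughly as follows. This is a sketch under assumptions, not a complete mapper: RecordReaderSketch and BuildReader are hypothetical names, only columns with a matching Get method on IDataRecord are supported, and null or DBNull values are not handled.

```csharp
using System;
using System.Data;
using System.Linq.Expressions;
using System.Reflection;

public static class RecordReaderSketch
{
    // Builds record => (TTarget)record.GetXxx(ordinal), where GetXxx is chosen
    // from the column type reported by GetFieldType.
    public static Func<IDataRecord, TTarget> BuildReader<TTarget>(IDataRecord schemaSource, int ordinal)
    {
        Type columnType = schemaSource.GetFieldType(ordinal);
        // The typed getter for Single is called GetFloat, not GetSingle.
        string getterName = columnType == typeof(float) ? "GetFloat" : "Get" + columnType.Name;
        MethodInfo getter = typeof(IDataRecord).GetMethod(getterName, new[] { typeof(int) });
        if (getter == null)
            throw new NotSupportedException("No typed getter for column type " + columnType);

        ParameterExpression record = Expression.Parameter(typeof(IDataRecord), "record");
        Expression read = Expression.Call(record, getter, Expression.Constant(ordinal));
        // Numeric conversion between value types, e.g. long -> int; no boxing involved.
        if (read.Type != typeof(TTarget))
            read = Expression.Convert(read, typeof(TTarget));
        return Expression.Lambda<Func<IDataRecord, TTarget>>(read, record).Compile();
    }
}
```

For an Int64 column this produces a call to GetInt64 followed by a long-to-int conversion, so a value such as 3000 arrives in an Int32 property without an invalid cast. Expression.Convert is unchecked here; Expression.ConvertChecked can be substituted if overflow should throw instead of truncating.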
