Mono.Cecil: Create Instruction from string

Mono.Cecil: Create Instruction from string - c#

How can i turn strings like these:
"call System.Console.WriteLine"
"ldstr \"hello\""
into Intructions with Operands?

If you now how to use Mono.Cecil (or Reflection.Emit), the question is more generally about parsing text to code actions.
You have some ways to do that and I can just show you an hint and you can choose your way.
First of all you need some prerequisites (If you IL text is a valid IL code these prerequisites already exist for you).
For example, you can't guess what Console.WriteLine is. Console is an assembly, type, method? same questions on WriteLine. And even if we know that WriteLine is a method, which overload we need to choose? and what about generics? So you need set a contract that define for example that the dot is the delimiter and the first section is assembly, the second is namespace, and so on.
For example:
"mscorlib.System.Console.WriteLine(string)" will be translated to System.Console.WriteLine(string)
After you have a strict contract, you need a few steps (for the WriteLine example):
Resolve the Assembly and get is ModuleDefinition
Resolve and import the Type
Resolve and import the MethodReference describe the requested method
Emit the call
One way to do that, is to keep structure of opcodes and their required action.
For example, we know that call instruction is need to emit call to static method (in most of the cases) so you need to send the operand to ParseStaticCall, there you need to parse the string and emit the call instruction
Pseudo code:
new Dictionary<string, Tuple<OpCode, Action<string>>>
{
{
"call",
Tuple.Create<OpCode, Action<string>>
(OpCodes.Call,ParseStaticCall)
}
};
static void ParseStaticCall(Opcpde opcode,
string call,
ILProcessor processor)
{
string assembly, namespaceName, type, method;
int numOfParameters;
var moduleDefenition = AssemblyResolver.Resolve(assembly).MainModule;
var methodReference =
new ReferenceFinder(moduleDefenition).
GetMethodReference(typeof (Console),
md => md.Name == methodName &&
md.Parameters.Count == numOfParameters);
processor.Emit(opcode, methodReference);
}
AssemblyResolver is an helper class to find assembly by name and path (can be a constant path). ReferenceFinder is an helper class that find type\method in a specific module.
So you need to create method and ILProccesor for the method body, then for every string you have, you need to separate the instruction opcode form the operand, then look in dictionary for the required action and pass the opcode, the operand as string and the ILProccesor.

Related

Read the value of an attribute from a method with parameters

I was looking for a way to get the value of an attribute and send it to a report I have to make. The short of it is I found an answer when a method has no parameters but any methods with paramaters throws an error.
My initial question of how to Read the value of an attribute from a method was answered by this question (Read the value of an attribute of a method)
Here is the code that has been working
public static void WriteStepNamesOfMethodToReport(Type classType, string methodName)
{
MethodInfo methodInfo = classType.GetRuntimeMethod(methodName, new Type[] { });
Attribute[] attributeList = (System.Attribute[])methodInfo.GetCustomAttributes(typeof(Step), true);
GaugeMessages.WriteMessage("---------------------");
foreach (Attribute attr in attributeList)
{
Step a = (Step)attr;
GaugeMessages.WriteMessage("Executed Step - {0}", a.Names.ElementAt(0));
}
GaugeMessages.WriteMessage("---------------------");
}
This is how I set up the variables to send (and yes I could make that one line, but I define it in one place and use it in many so that is the way it needs to be)
Type classType = typeof(AClassInTheProject);
GenericHelpers.WriteStepNamesOfMethodToReport(classType, nameof(AMethodNameFrom_AClassInTheProject));
The line of Code that starts with Attribute[] attribute.... is throwing an error when I try to provide a method (methodName) that has parameters in it. When I enter the "methodName" it is always just like that (no parenthesis as it will not accept those). The error produced says:
Object reference not set to an instance of an object.
I tried removing the parameter temporarily from the specific method that was throwing an error and it saw the Step attribute that I was looking for and output it to the report.
Here is the basic layout of the class I am using (same setup as all the non-parameter methods that work).
class AClassInTheProject
{
[Step("Perform the Step For AMethodNameOne"]
AMethodNameOne() // This one works
{
// Code
}
[Step("Perform the Step For AMethodNameTwo"]
AMethodNameTwo(string parameterA) // This one doesn't work
{
// Code
}
}
Background:
This is for a Gauge UIAutomation project. I need to run some steps in the UI Automation under logical conditions (If A Perform Step ...) which Gauge does not provide support for. All steps performed need to be output to the final report (GaugeMessages.....). This is a C# project. My need is not common among people int the Gauge community so it was not deemed priority enough to include a fix in the source code (which is why I'm doing this workaround). Hopefully that's detailed enough.

At the root, this is a NullReferenceException problem.
That call to GetRuntimeMethod is saying "give me a method by this name with no parameters". It's returning null because the method you want has parameters. It works when you remove the parameter because then it matches the "no parameters" condition.
If you want specific parameter types, specify them, e.g. new Type[] { typeof(string) }.
If you want any number and type of parameters, use the GetMethod overload that doesn't take a Type[] (assuming there's only one method by that name, otherwise you'll get a different exception) or use GetMethods and find the method you want from the array it returns.

Datatype of Methods

Question
How does a delegate store a reference to a function? The source code appears to refer to it as an Object, and the manner in which it invokes the method seems redacted from the source code. Can anyone explain how C# is handling this?
Original Post
It seems I'm constantly fighting the abstractions C# imposes on its programmers. One that's been irking me is the obfuscation of Functions/Methods. As I understand it, all methods are in fact anonymous methods assigned to properties of a class. This is the reason why no function is prefixed by a datatype. For example...
void foo() { ... }
... would be written in Javascript as...
Function foo = function():void { ... };
In my experience, Anonymous functions are typically bad form, but here it's replete throughout the language standard. Because you cannot define a function with its datatype (and apparently the implication/handling is assumed by the compiler), how does one store a reference to a method if the type is never declared?
I'm trying very hard to avoid Delegates and its variants (Action & Func), both because...
it is another abstraction from what's actually happening
the unnecessary overhead required to instantiate these classes (which in turn carry their own pointers to the methods being called).
Looking at the source code for the Delegate.cs, it appears to refer to the reference of a function as simply Object (see lines 23-25).
If these really are objects, how are we calling them? According to the delegate.cs trail, it dead-ends on the following path:
Delegate.cs:DynamicInvoke() > DynamicInvokeImpl() > methodinfo.cs:UnsafeInvoke() > UnsafeInvokeInternal() > RuntimeMethodHandle.InvokeMethod() > runtimehandles.cs:InvokeMethod()
internal extern static object InvokeMethod(object target, object[] arguments, Signature sig, bool constructor);
This really doesn't explain how its invoked if indeed the method is an object. It feels as though this is not code at all, and the actual code called has been redacted from source repository.
Your help is appreciated.
Response to Previous Comments
#Amy: I gave an example immediately after that statement to explain what I meant. If a function were prefixed by a datatype, you could write a true anonymous function, and store it as a property to an Object such as:
private Dictionary<string, Function> ops = new Dictionary<string, Function> {
{"foo", int (int a, int b) { return a + b } }
};
As it stands, C# doesn't allow you to write true anonymous functions, and walls that functionality off behind Delegates and Lambda expressions.
#500 Internal server error: I already explained what I was trying to do. I even bolded it. You assume there's any ulterior motive here; I'm simply trying to understand how C# stores a reference to a method. I even provided links to the source code so that others could read the code for themselves and help answer the question.
#Dialecticus: Obviously if I already found the typical answer on Google, the only other place to find the answer I'm looking for would be here. I realize this is outside the knowledge of most C# developers, and that's why I've provided the source code links. You don't have to reply if you don't know the answer.

While I'm not fully understanding your insights about "true anonymous functions", "not prefixed by a data type" etc, I can explain you how applications written in C# call methods.
First of all, there is no such a thing "function" in C#. Each and every executable entity in C# is in fact a method, that means, it belongs to a class. Even if you define lambdas or anonymous functions like this:
collection.Where(item => item > 0);
the C# compiler creates a compiler-generated class behind the scenes and puts the lambda body return item > 0 into a compiler-generated method.
So assuming you have this code:
class Example
{
public static void StaticMethod() { }
public void InstanceMethod() { }
public Action Property { get; } = () => { };
}
static class Program
{
static void Main()
{
Example.StaticMethod();
var ex = new Example();
ex.InstanceMethod();
ex.Property();
}
}
The C# compiler will create an IL code out of that. The IL code is not executable right away, it needs to be run in a virtual machine.
The IL code will contain a class Example with two methods (actually, four - a default constructor and the property getter method will be automatically generated) and a compiler-generated class containing a method whose body is the body of the lambda expression.
The IL code of Main will look like this (simplified):
call void Example::StaticMethod()
newobj instance void Example::.ctor()
callvirt instance void Example::InstanceMethod()
callvirt instance class [mscorlib]System.Action Example::get_Prop()
callvirt instance void [mscorlib]System.Action::Invoke()
Notice those call and callvirt instructions: these are method calls.
To actually execute the called methods, their IL code needs to be compiled into machine code (CPU instructions). This occurs in the virtual machine called .NET Runtime. There are several of them like .NET Framework, .NET Core, Mono etc.
A .NET Runtime contains a JIT (just-in-time) compiler. It converts the IL code to the actually executable code during the execution of your program.
When the .NET Runtime first encounters the IL code "call method StaticMethod from class Example", it first looks in the internal cache of already compiled methods. When there are no matches (which means this is the first call of that method), the Runtime asks the JIT compiler to create such a compiled-and-ready-to-run method using the IL code. The IL code is converted into a sequence of CPU operations and stored in the process' memory. A pointer to that compiled code is stored in the cache for future reuse.
This all will happen behind the call or callvirt IL instructions (again, simplified).
Once this happened, the Runtime is ready to execute the method. The CPU gets the compiled code's first operation address as the next operation to execute and goes on until the code returns. Then, the Runtime takes over again and proceeds with next IL instructions.
The DynamicInvoke method of the delegates does the same thing: it instructs the Runtime to call a method (after some additional arguments checks etc). The "dead end" you mention RuntimeMethodHandle.InvokeMethod is an intrinsic call to the Runtime directly. The parameters of this method are:
object target - the object on which the delegate invokes the instance method (this parameter).
object[] arguments - the arguments to pass to the method.
Signature sig - the actual method to call, Signature is an internal class that provides the connection between the managed IL code and native executable code.
bool constructor - true if this is a constructor call.
So in summary, methods are not represented as objects in C# (while you of course can have a delegate instance that is an object, but it doesn't represent the executable method, it rather provides an invokable reference to it).
Methods are called by the Runtime, the JIT compiler makes the methods executable.
You cannot define a global "function" outside of classes in C#. You could get a direct native pointer to the compiled (jitted) method code and probably even call it manually by directly manipulating own process' memory. But why?

You clearly misunderstand main differences between script languages, C/C++ and C#.
I guess the main difficulty is that there is no such thing as a function in C#. At all.
C#7 introduced the new feature "a local function", but that is not what a function in JS is.
All pieces of code are methods.
That name is intentionally different from function or a procedure to emphasize the fact that all executable code in C# belongs to a class.
Anonymous methods and lambdas are just a syntax sugar.
A compiler will generate a real method in the same (or a nested) class, where the method with anonymous method declaration belongs to.
This simple article explains it. You can take the examples, compile them and check the generated IL code yourself.
So all the methods (anonymous or not) do belong to a class. It's impossible to answer your updated question, besides saying It does not store a reference to a function, as there is no such thing in C#.
How does one store a reference to a method?
Depending on what you mean by reference, it can be either
An instance of MethodInfo class, used to reference reflection information for a method,
RuntimeMethodHandle (obtainable via RuntimeMethodInfo.MethodHandle) stores a real memory pointer to a JITed method code
A delegate, that is very different from just a memory pointer, but logically could be used to "pass a method reference to another method" .
I believe you are looking for the MethodInfo option, it has a MethodInfo.Invoke method which is very much alike Function..apply function in JS. You have already seen in the Delegate source code how that class is used.
If by "reference" you mean the C-style function pointer, it is in RuntimeMethodHandle struct. You should never use it without solid understanding how a particular .Net platform implementation and a C# compiler work.
Hopefully it clarifies things a bit.

A delegate is simply a pointer(memory location to jump to) to a method with the specified parameters and return type. Any Method that matches the signature(Parameters and return type) is eligible to fulfill the role, irrespective of the defined object. Anonymous simply means the delegate is not named.
Most times the type is implied(if it is not you will get a compiler error):
C# is a strongly typed language. That means every expression (including delegates) MUST have a return type(including void) as well as strongly typed parameters(if any). Generics were created to permit explicit types to be used within general contexts, such as Lists.
To put it another way, delegates are the type-safe managed version of C++ callbacks.
Delegates are helpful in eliminating switch statements by allowing the code to jump to the proper handler without testing any conditions.
A delegate is similar to a Closure in Javascript terminology.
In your response to Amy, you are attempting to equate a loosely typed language like JS, and a strongly typed language C#. In C# it is not possible to pass an arbitrary(loosely-typed) function anywhere. Lambdas and delegates are the only way to guarantee type safety.
I would recommend trying F#, if you are looking to pass functions around.
EDIT:
If you are trying to mimic the behavior of Javascipt, I would try looking at using inheritance through Interfaces. I can mimic multiple inheritance, and be type safe at the same time. But, be aware that it cannot fully supplant Javascript's dependency injection model.

As you probably found out C# doesn't have the concept of a function as in your JavaScript example.
C# is a statically typed language and the only way you can use function pointers is by using the built in types (Func,Action) or custom delegates.(I'm talking about safe,strongly typed pointers)
Javascript is a dynamic language that's why you can do what you describe
If you are willing to lose type safety, you can use the "dynamic" features of C# or refection to achieve what you want like in the following examples (Don't do this,use Func/Action)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
namespace ConsoleApp1
{
class Program
{
private static Dictionary<string, Func<int, int, int>> FuncOps = new Dictionary<string, Func<int, int, int>>
{
{"add", (a, b) => a + b},
{"subtract", (a, b) => a - b}
};
//There are no anonymous delegates
//private static Dictionary<string, delegate> DelecateOps = new Dictionary<string, delegate>
//{
// {"add", delegate {} }
//};
private static Dictionary<string, dynamic> DynamicOps = new Dictionary<string, dynamic>
{
{"add", new Func<int, int, int>((a, b) => a + b)},
{"subtract", new Func<int, int, int>((a, b) => a - b)},
{"inverse", new Func<int, int>((a) => -a )} //Can't do this with Func
};
private static Dictionary<string, MethodInfo> ReflectionOps = new Dictionary<string, MethodInfo>
{
{"abs", typeof(Math).GetMethods().Single(m => m.Name == "Abs" && m.ReturnParameter.ParameterType == typeof(int))}
};
static void Main(string[] args)
{
Console.WriteLine(FuncOps["add"](3, 2));//5
Console.WriteLine(FuncOps["subtract"](3, 2));//1
Console.WriteLine(DynamicOps["add"](3, 2));//5
Console.WriteLine(DynamicOps["subtract"](3, 2));//1
Console.WriteLine(DynamicOps["inverse"](3));//-3
Console.WriteLine(ReflectionOps["abs"].Invoke(null, new object[] { -1 }));//1
Console.ReadLine();
}
}
}
one more example that you shouldn't use
delegate object CustomFunc(params object[] paramaters);
private static Dictionary<string, CustomFunc> CustomParamsOps = new Dictionary<string, CustomFunc>
{
{"add", parameters => (int) parameters[0] + (int) parameters[1]},
{"subtract", parameters => (int) parameters[0] - (int) parameters[1]},
{"inverse", parameters => -((int) parameters[0])}
};
Console.WriteLine(CustomParamsOps["add"](3, 2)); //5
Console.WriteLine(CustomParamsOps["subtract"](3, 2)); //1
Console.WriteLine(CustomParamsOps["inverse"](3)); //-3

I will provide a really short and simplified answer compared to the others. Everything in C# (classes, variables, properties, structs, etc) has a backed with tons of things your programs can hook into. This network of backend stuff slightly lowers the speed of C# when compared to "deeper" languages like C++, but also gives programmers a lot more tools to work with and makes the language easier to use. In this backend is included things like "garbage collection," which is a feature that automatically deletes objects from memory when there are no variables left that reference them. Speaking of reference, the whole system of passing objects by reference, which is default in C#, is also managed in the backend. In C#, Delegates are possible because of features in this backend that allow for something called "reflection."
From Wikipedia:
Reflection is the ability of a computer program to examine,
introspect, and modify its own structure and behavior at runtime.
So when C# compiles and it finds a Delegate, it is just going to make a function, and then store a reflective reference to that function in the variable, allowing you to pass it around and do all sorts of cool stuff with it. You aren't actually storing the function itself in the variable though, you are storing a reference, which is kinda like an address that points you to where the function is stored in RAM.

Force my code to use my extension method

I'm using BitFactory logging, which exposes a bunch of methods like this:
public void LogWarning(object aCategory, object anObject)
I've got an extension method that makes this a bit nicer for our logging needs:
public static void LogWarning(this CompositeLogger logger,
string message = "", params object[] parameters)
Which just wraps up some common logging operations, and means I can log like:
Logging.LogWarning("Something bad happened to the {0}. Id was {1}",foo,bar);
But when I only have one string in my params object[], then my extension method won't be called, instead the original method will be chosen.
Apart from naming my method something else, is there a way I can stop this from happening?

The rules about how overloaded methods are resolved to one (or an error) are complex (the C# specification is included with Visual Studio for all the gory details).
But there is one simple rule: extension methods are only considered if there is no possible member that can be called.
Because the signature of two objects will accept any two parameters, any call with two parameters will match that member. Thus no extension methods will considered as possibilities.
You could pass a third parameter (eg. String.Empty) and not use it in the format.
Or, and I suspect this is better, to avoid possible interactions with additions to the library (variable length argument list methods are prone to this) rename to LogWarningFormat (akin to the naming of StringBuffer.AppendFormat).
PS. there is no point having a default for the message parameter: it will never used unless you pass no arguments: but that would log nothing.

Declared methods are always preceding extension methods.
If you want to call the extension regardless of the declared method, you have to call it as a regular static method, of the class that declared it.
eg:
LoggerExtensions.LogWarning(Logging, "Something bad happened to the {0}. Id was {1}",foo,bar);
I assume that the extension is declared in a class named LoggerExtensions

Provided that I think a method with a different name is the way to go (easier to read and maintain), as a workaround you could specify parameters as a named parameter:
logger.LogWarning("Something bad happened to the {0}.", parameters: "foo");

What is 'this' used for in C# language?

I've read open source c# code and there is a lot of strange grammar (to me).
They declare method arguments with the this keyword like this:
this object #object
What does it mean?
If I remove 'this' keyword where is before the data type, then will it work differently?

Sounds like an Extension Method.
The # symbol allows the variable name to be the same as a C# keyword - I tend to avoid them like the plague personally.
If you remove the this keyword, it will no longer be an extension method, just a static method. Depending on the calling code syntax, it may no longer compile, for example:
public static class IntegerMethods
{
public static int Add(this int i, int value)
{
return i + value;
}
}
int i = 0;
// This is an "extension method" call, and will only compile against extension methods.
i = i.Add(2);
// This is a standard static method call.
i = IntegerMethods.Add(i, 2);
The compiler will simply translate all "extension method calls" into standard static method calls at any rate, but extension method calls will still only work against valid extension methods as per the this type name syntax.
Some guidelines
These are my own, but I find they are useful.
Discoverability of extension methods can be a problem, so be mindful of the namespace you choose to contain them in. We have very useful stuff under .NET namespaces such as System.Collections or whatever. Less useful but otherwise "common" stuff tends to go under Extensions.<namespace of extended type> such that discoverability is at least consistent via convention.
Try not to extend often used types in broad scope, you don't want MyFabulousExtensionMethod appearing on object throughout your app. If you need to, either constrain the scope (namespace) to be very specific, or bypass extension methods and use a static class directly - these won't pollute the type metadata in IntelliSense.
In extension methods, "this" can be null (due to how they compile into static method calls) so be careful and don't assume that "this" is not null (from the calling side this looks like a successful method call on a null target).
These are optional and not exhaustive, but I find they usually fall under the banner of "good" advice. YMMV.

The 'this type name' syntax is used for extension methods.
For example if I wanted to add a UnCamelCase method to a string (so I could do "HelloWorld".UnCamelCase() to produce "Hello World` - I'd write this:
public static string UnCamelCase(this string text)
{
/*match any instances of a lower case character followed by an upper case
* one, and replace them with the same characters with a space between them*/
return Regex.Replace(text, "([a-z])([A-Z])", "$1 $2");
}
this string text means the specific instance of the string that you're working with, and text is the identifier for it.
The # syntax allows for variable names that are ordinarily reserved.

Is there a way to list all invocations and hardcoded string parameters of a method from a .Net assembly?

For example, assume that in my assembly, in Namespace A, Class B, there is an instance method with the following signature:
void Test(string someString, int someOtherParm, string someOtherString );
This method is called multiple times, from multiple places in the assembly.
I would like to be able build a list of all invocations of this method and the value of the someString/someOtherString (assuming they are hardcoded).
In other words, I like to extract a list of calls like the example one below, if they happen in the assembly somewhere:
Test("some text", 8, "some other text");
Thanks in advance,
R.

You could use the Cecil library, which is a very powerful IL inspection and modification API. You will want to create a "method visitor" that will scan for call instructions and attempt to locate constant strings loaded onto the stack.

We Keep Coding

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Mono.Cecil: Create Instruction from string - c#

How can i turn strings like these: "call System.Console.WriteLine" "ldstr \"hello\"" into Intructions with Operands?

Related

Read the value of an attribute from a method with parameters

Datatype of Methods

Force my code to use my extension method

What is 'this' used for in C# language?

Is there a way to list all invocations and hardcoded string parameters of a method from a .Net assembly?

Categories

Resources