I'm modifying an assembly using Mono.Cecil, and I want to check it for validity (whether the result will run at all). I'm trying to use PEVerify, but I'm having a problem.
It was designed to ensure that code is verifiable, so it just reports ERROR regardless of whether the IL is completely invalid and will not execute, or whether it is only a verifiability issue that would be ignored in full trust. Verifiability issues of the latter kind include:
Using pointers and the like.
Not setting .locals init when the method has locals.
Calling .ctor from a non-constructor method.
Issues that make the IL fail to run include:
Member isn't accessible from the location it is used in.
Member doesn't exist.
Is there a way to make it give me some indication of the severity of the issue? If not, is there another tool that can do this?
@HansPassant already tried to explain it, but just so that we all understand each other, here's what's going on.
PEVerify checks your assembly for constructs that are not okay. That said, PEVerify is not the JIT compiler. The JIT compiler itself doesn't check the IL assembly - it just grabs the method it's going to call, changes it into an SSA form, optimizes it, compiles it and then calls the resulting binary assembly.
Now, the compiler will evolve over time. Optimizations are changed and added, and the role of the compiler is not necessarily to check for errors (if it finds one as a by-product, it'll probably report it, but no guarantees). Remember, the JIT compiler is relentlessly optimized for just one thing: producing pretty good machine code quickly (because the code is JIT-compiled, the time it takes to compile something is really important). So, these are two different tools.
This basically results in the following:
The compiler will compile and execute what it was given.
PEVerify will tell you if the result of the method / assembly is defined.
If you ignore an error reported by PEVerify, this basically means that the result will be undefined behavior, which can be anything from a working executable to a hard crash. There is no such thing as a 'warning'.
Related
The problem is that I often get a TypeLoadReflectionException and my ability to find the cause for them is very limited, essentially nil actually.
Today I again had a problem like this and couldn't solve it. For some reason my assembly required another assembly which never happened before and even reverting the code to a previously working state did not fix the issue. I'm stumped, there must be deeper issues here.
Is there any way to debug this properly? Ideally a tool like Reflector or ILSpy would be able to tell me "dependent assemblies" and "type dependencies" per line.
Seeing as how the C# 4.5 Roslyn compiler is now open source, this should theoretically be doable; after all, the compiler at some point during the compilation process decides "hey, I need this type for this stuff to compile". Correct? However, these TypeLoadExceptions occur at runtime, so I'm not sure.
Could you please tell me what the differences are between the rules of StyleCop and Code Analysis? Should they be used together or not?
Thanks.
StyleCop essentially parses the file looking for formatting issues and other things that you could think of as "cosmetic". Code Analysis actually builds your code and inspects the compiled IL for characteristics of how it behaves when it runs, and flags potential runtime problems.
So, they are complementary, and you are perfectly fine to use them together.
Short answer:
StyleCop: takes your source code as input and checks for potential code style issues. For instance: using directives that are not alphabetically ordered, etc.
FxCop (now Code Analysis): takes a compiled assembly as input and checks for potential issues related to the executable/DLL itself when it is executed. For instance: your class has a member of type IDisposable that is not disposed properly.
However, there are some rules that are common to both tools, for instance rules related to naming conventions for publicly exposed types.
Anyway, using both is a good idea.
FxCop checks what is written. It works over the compiled assembly.
StyleCop checks how it is written. It works over the parsed source file, even without trying to compile it.
This leads to all the differences. For example, FxCop cannot check indentation, because it is absent from a compiled assembly. And StyleCop cannot perform code-flow checks, because it doesn't know how your code is really executed.
I've just disassembled a project to debug it using Reflector, but it seems to balk at decoding the 'compile results' of automatic properties, e.g. the next line gives me a syntax error. I've tried fixing these manually, but every time I fix one, more appear.
private string <GLDescription>k__BackingField;
Is there anything I can do about this?
Ha! Stupid me: all I had to do was set the disassembler optimization in Reflector's options to .NET 3.5. Mine was on 2.0.
The compiler generates fields with "unspeakable names" - i.e. ones which are illegal in C# itself, but are valid IL.
There's no exactly accurate translation of the IL into "normal" C# (without automatic properties). You can replace < and > with _ which will give legal code, but then of course it won't be exactly the same code any more. If you're only after the ability to debug, however, that won't be a problem.
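The renaming described above can be scripted over the decompiled source text. The following is only a sketch: the `sanitize` helper and the regular expression are assumptions of mine, based on the shape of commonly seen generated names (angle brackets, a single letter, two underscores, a suffix), and they are not part of Reflector or any other tool.

```python
import re

# Matches compiler-generated ("unspeakable") identifiers such as
# <GLDescription>k__BackingField or <>c__DisplayClass1: angle brackets
# around an optional name, then one letter, two underscores and a suffix.
# The pattern is an assumption, not an official grammar of generated names.
UNSPEAKABLE = re.compile(r"<(\w*)>([A-Za-z]__\w+)")


def sanitize(source: str) -> str:
    """Replace < and > in generated names with underscores."""
    return UNSPEAKABLE.sub(r"_\1_\2", source)


if __name__ == "__main__":
    print(sanitize("private string <GLDescription>k__BackingField;"))
```

Ordinary generic syntax such as `List<string> items` is left alone, because the closing bracket there is not followed by the letter-plus-double-underscore suffix.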
If you decompile iterators (i.e. methods using yield statements) you'll find more of the same, including the use of fault blocks, which are like finally blocks but they only run when an exception has occurred (but without catching the exception). Various other constructs generate unspeakable names too, including anonymous methods, lambda expressions and anonymous types.
On a broader note, do you have permission to decompile this code? If the author doesn't mind you doing so, they're likely to be willing to give you the source code to start with which would make your life easier. If they don't want you debugging their source code to start with, you should consider the ethical (and potentially legal) ramifications of decompiling the code. This may vary by location: consult a real lawyer for more definitive guidance.
EDIT: Having seen your own answer, that makes a lot of sense. I'll leave this here for background material.
When Visual Studio greys out some code and tells you it is redundant, does this mean the compiler will ignore this code or will it still compile this code? In other words, would this redundant code never be interpreted or will it be? Or does it simply act as a reminder that the code is simply not required?
If I leave redundant code in my classes/structs etc, will it have an impact on performance?
Thanks
If the code is redundant it's not necessary for compilation, but leaving it in won't have any impact on performance.
As the compiler has identified the code as redundant in Visual Studio it won't get compiled into the IL or machine code.
It's not good practice to leave redundant code in your project. If you need the code in the future you should get it from the older versions of the file in your source code repository.
C# is not an interpreted language, it's a JITted (Just-In-Time compiled) language, which means it's compiled from MSIL at runtime. Thus, the JITter can do analysis to determine whether code is redundant, and then remove it.
There will be two opportunities to remove redundant code:
1. Compiling C# to MSIL in Visual Studio.
2. JITting MSIL to assembly at run (or install) time.
Because the C# compiler itself has flagged this issue, that means the code will likely be removed during (1).
So yeah, it's just being nice and reminding you. Most compilers remove redundant code in many different and subtle ways without telling the programmer, but in certain obvious cases it's a good idea to tell the programmer.
No, it's not compiled.
It can drive me nuts sometimes when I'm testing and want to use the debugger's "set next statement" command to jump to some statement, and it wasn't compiled.
What is the difference between the JIT compiler and the CLR? If you compile your code to IL and the CLR runs that code, then what is the JIT doing? How has JIT compilation changed with the addition of generics to the CLR?
You compile your code to IL, which is compiled to machine code and executed at runtime; this is what's called JIT.
Edit, to flesh out the answer some more (still overly simplified):
When you compile your C# code in Visual Studio, it gets turned into IL that the CLR understands. The IL is the same for all languages running on top of the CLR (which is what enables the .NET runtime to use several languages and inter-op between them easily).
During runtime the IL is compiled into machine code (which is specific to the architecture you're on) and then it's executed. This process is called Just In Time compilation, or JIT for short. Only the IL that is needed is transformed into machine code (and only once; it's "cached" once it's compiled into machine code), just in time before it's executed, hence the name JIT.
This is what it would look like for C#:
C# Code > C# Compiler > IL > .NET Runtime > JIT Compiler > Machine code > Execution
And this is what it would look like for VB:
VB Code > VB Compiler > IL > .NET Runtime > JIT Compiler > Machine code > Execution
As you can see, only the first two steps are unique to each language, and everything after the code has been turned into IL is the same, which is, as I said before, the reason you can run several different languages on top of .NET.
The JIT is one aspect of the CLR.
Specifically, it is the part responsible for changing CIL (hereafter called IL) produced by the original language's compiler (csc.exe for Microsoft C#, for example) into machine code native to the current processor (and the architecture that it exposes in the current process, for example 32/64-bit). If the assembly in question was ngen'd, then the JIT process is completely unnecessary and the CLR will run this code just fine without it.
Before a method is used which has not yet been converted from the intermediate representation it is the JIT's responsibility to convert it.
Exactly when the JIT will kick in is implementation specific, and subject to change. However, the CLR design mandates that the JIT happens before the relevant code executes; JVMs, in contrast, would be free to interpret the code for a while, while a separate thread creates a machine code representation.
The 'normal' CLR uses a pre-JIT stub approach whereby methods are JIT compiled only as they are used. This involves having the initial native method stub be an indirection that instructs the JIT to compile the method and then modifies the original call to skip past the initial stub. The current compact edition instead compiles all methods on a type when it is loaded.
To address the addition of Generics.
This was the last major change to the IL specification and JIT in terms of its semantics as opposed to its internal implementation details.
Several new IL instructions were added, and more meta data options were provided for instrumenting types and members.
Constraints were added at the IL level as well.
When the JIT compiles a method which has generic arguments (either explicitly or implicitly through the containing class) it may set up different code paths (machine code instructions) for each type used. In practice the JIT uses a shared implementation for all reference types since variables for these will exhibit the same semantics and occupy the same space (IntPtr.Size).
Each value type will get specific code generated for it; dealing with the reduced or increased size of the variables on the stack/heap is a major reason for this. Also, by emitting the constrained opcode before method calls, many invocations on non-reference types need not box the value to call the method (this optimization is used in non-generic cases as well). This also allows the default(T) behaviour to be handled correctly and comparisons to null to be stripped out as no-ops (always false) when a non-Nullable value type is used.
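To make the sharing rule concrete, here is a toy model in Python, not anything the CLR actually exposes. The `ToyJit` class, the `VALUE_TYPES` table and the `__ref__` placeholder are all hypothetical names; the only point is that reference-type arguments map to a single cache entry while each value type gets its own.

```python
from typing import Dict, Tuple

# Hypothetical stand-ins for .NET type arguments: name -> is it a value type?
VALUE_TYPES = {"int": True, "double": True, "string": False, "object": False}


class ToyJit:
    """Toy model of generic code sharing; not how the real JIT is built."""

    def __init__(self) -> None:
        self.code_cache: Dict[Tuple[str, str], str] = {}
        self.compilations = 0

    def get_code(self, method: str, type_arg: str) -> str:
        # Reference types collapse to one shared key; value types keep their own.
        key_type = type_arg if VALUE_TYPES[type_arg] else "__ref__"
        key = (method, key_type)
        if key not in self.code_cache:
            self.compilations += 1
            self.code_cache[key] = f"native code for {method}<{key_type}>"
        return self.code_cache[key]
```

In this model, asking for `List.Add` with `string` and then with `object` triggers one compilation, while `int` and `double` trigger one each.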
If an attempt is made at runtime to create an instance of a generic type via reflection then the type parameters will be validated by the runtime to ensure they pass any constraints. This does not directly affect the JIT unless this is used within the type system (unlikely though possible).
As Jon Skeet says, JIT is part of the CLR. Basically this is what is happening under the hood:
1. Your source code is compiled into a byte code known as the common intermediate language (CIL).
2. Metadata from every class and every method (and every other thing :O) is included in the PE header of the resulting executable (be it a dll or an exe).
3. If you're producing an executable, the PE header also includes a conventional bootstrapper which is in charge of loading the CLR (Common Language Runtime) when you execute your executable.
Now, when you execute:
1. The bootstrapper initializes the CLR (mainly by loading the mscorlib assembly) and instructs it to execute your assembly.
2. The CLR executes your main entry.
3. Now, classes have a vector table which holds the addresses of the method functions, so that when you call MyMethod, this table is searched and then a corresponding call to the address is made. Upon start, ALL entries for all tables have the address of the JIT compiler.
4. When a call to one such method is made, the JIT is invoked instead of the actual method and takes control. The JIT then compiles the CIL code into actual assembly code for the appropriate architecture.
5. Once the code is compiled, the JIT goes into the method vector table and replaces the address with that of the compiled code, so that every subsequent call no longer invokes the JIT.
6. Finally, the JIT hands execution over to the compiled code.
7. If you call another method which hasn't yet been compiled, then go back to step 4... and so on...
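The table-patching steps above can be imitated with a small toy model in Python. This is only an illustration of the idea: `ToyRuntime`, `method_table` and `jit_log` are hypothetical names, the "IL" is just a string, and "compiling" merely wraps it in a function.

```python
from typing import Callable, Dict, List


class ToyRuntime:
    """Toy model of lazy JIT compilation through a patched method table."""

    def __init__(self, il_methods: Dict[str, str]) -> None:
        self.il_methods = il_methods
        self.jit_log: List[str] = []  # records which methods were compiled
        # Every table slot starts out pointing at a stub that invokes the JIT.
        self.method_table: Dict[str, Callable[[], str]] = {
            name: self._make_stub(name) for name in il_methods
        }

    def _make_stub(self, name: str) -> Callable[[], str]:
        def stub() -> str:
            compiled = self._jit_compile(name)
            self.method_table[name] = compiled  # patch the slot
            return compiled()  # hand execution over to the compiled code
        return stub

    def _jit_compile(self, name: str) -> Callable[[], str]:
        self.jit_log.append(name)
        body = self.il_methods[name]
        return lambda: f"ran {body}"

    def call(self, name: str) -> str:
        # Callers always go through the table, so they never see the stub swap.
        return self.method_table[name]()
```

Calling the same method twice compiles it only once; the second call goes straight to the patched entry, and a method that is never called is never compiled.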
The JIT is basically part of the CLR. The garbage collector is another. Quite where you put interop responsibilities etc is another matter, and one where I'm hugely underqualified to comment :)
I know the thread is pretty old, but I thought I might put in the picture that made me understand JIT. It's from the excellent book CLR via C# by Jeffrey Richter. In the picture, the metadata he is talking about is the metadata emitted in the assembly header, where all information about types in the assembly is stored:
1) While compiling the .NET program, the program code is converted into Intermediate Language (IL) code.
2) Upon executing the program, the Intermediate Language code is converted into operating system native code as and when a method is called; this is called JIT (Just In Time) compilation.
1. The Common Language Runtime (CLR) is the execution engine, while Just In Time (JIT) is the compiler in the .NET Framework.
2. JIT is the internal compiler of .NET, which takes Microsoft Intermediate Language (MSIL) code from the CLR and compiles it to machine-specific instructions, whereas the CLR works as an engine whose main task is to provide MSIL code to the JIT so that the code is fully compiled as per the machine specification.