IL code vs IL assembly: is there a difference? - c#

If I run a .NET compiler, it produces intermediate language (IL) code and puts it into a file, an .exe for instance.
If I then use a tool like ildasm, it shows me the IL code again.
However, if I write IL code directly into a file, I can use ilasm to produce an .exe file.
What does that file contain? IL code again? Is IL code different from IL assembly code?
Is there a difference between IL code and IL assembly?

Yes, there is a difference:
IL, also known as Microsoft Intermediate Language (MSIL) or Common Intermediate Language (CIL), is comparable to the bytecode produced by a Java compiler. I think this is what you are calling IL code in your question.
IL assembly (ILAsm) is the textual form of those instructions. Its instruction set is not that of a native assembly language; each mnemonic corresponds to an IL opcode. You can write ILAsm in any text editor, such as Notepad, and then assemble it with the command-line tool ilasm.exe that ships with the .NET Framework.
I think IL assembly can be considered a fully fledged .NET language (albeit a very low-level one), so when you assemble ILAsm with ilasm.exe you produce IL in much the same way, with fewer steps, as the C# compiler does from C# code.
As someone stated in the comments, IL assembly is basically a human-readable version of the .NET bytecode.

A .NET assembly does not contain MSIL text; it contains metadata and bytes that represent IL opcodes. Pure binary data, not text. Any .NET disassembler or decompiler, such as ildasm.exe, knows how to convert the bytes back to text. It is pretty straightforward.
The C# compiler generates the binary data directly; there is no intermediate text format. When you write your own IL code in a text editor, you need ilasm.exe to convert it to binary. That is pretty straightforward too.
The most difficult part of generating the binary data is the metadata, by the way. It is excessively micro-optimized to make it as small as possible, and its structure is quite convoluted. No compiler generates those bytes directly; compilers use a pre-built component to get that job done. Notably, Roslyn had to rewrite this from scratch, which was a big job.
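To make the "bytes, not text" point concrete, here is a minimal C# sketch that reads the raw IL bytes of a method through reflection. The class name IlBytesDemo and the Add method are hypothetical names chosen for illustration; the exact bytes printed depend on compiler settings, so no fixed output is claimed.

```csharp
using System;
using System.Reflection;

static class IlBytesDemo
{
    // Hypothetical sample method whose compiled body we inspect.
    static int Add(int a, int b) => a + b;

    static void Main()
    {
        // Look up the private static method by name.
        MethodInfo method = typeof(IlBytesDemo).GetMethod(
            nameof(Add), BindingFlags.NonPublic | BindingFlags.Static);

        // The method body is stored in the assembly as raw opcode bytes.
        byte[] il = method.GetMethodBody().GetILAsByteArray();

        // Print the bytes in hex. In an optimized build one would expect
        // something like ldarg.0 (0x02), ldarg.1 (0x03), add (0x58),
        // ret (0x2A); debug builds may contain extra instructions.
        Console.WriteLine(BitConverter.ToString(il));
    }
}
```

A disassembler such as ildasm.exe does essentially the reverse mapping, turning each opcode byte back into its mnemonic.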

Related

Decompile C# vs C++

I have read many posts about decompiling (though I have no experience with it) but did not understand why they generally say that it is easier to decompile a C# executable than a C++ one. Could anyone explain the difference?
C# compiles into CIL, not directly into native code as a C++ compiler normally would.
It produces a .NET assembly, which contains much more metadata than a C++ executable does (via the embedded manifest). This is metadata about the types contained in the assembly, what it references, and more, making it much easier to decompile than a "normal" executable.
As noted in the comments, CIL is itself a higher-level language than native assembly and is object oriented, making it easier to understand and decompile correctly.
It's simple.
A compiled C# assembly has the information needed to restore the source code; a compiled C/C++ executable does not.
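As a rough illustration of how much metadata survives compilation, the following C# sketch uses reflection to recover a method's signature, including parameter names, from a compiled type. The Invoice class and its Total method are hypothetical and exist only for this example.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical type used only to have something to inspect.
public class Invoice
{
    public decimal Total(decimal net, decimal taxRate) => net * (1 + taxRate);
}

static class MetadataDemo
{
    static void Main()
    {
        // Enumerate only the public instance methods declared on Invoice itself.
        foreach (MethodInfo m in typeof(Invoice).GetMethods(
                     BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly))
        {
            // Parameter types and names are read straight from the assembly metadata.
            string parameters = string.Join(", ",
                m.GetParameters().Select(p => p.ParameterType.Name + " " + p.Name));

            // Prints: Decimal Total(Decimal net, Decimal taxRate)
            Console.WriteLine($"{m.ReturnType.Name} {m.Name}({parameters})");
        }
    }
}
```

A decompiler reads the same metadata, which is why it can reproduce class, method, and parameter names that a native C++ binary would normally have lost.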

Is C# code compiled to native binaries?

I know that Java code is compiled into bytecode, which is executed by the JVM.
What is the case with C#? I have noticed that applications written in C# have the .exe extension, which would suggest they contain native machine instructions. But is it really so?
No.
Like Java, C# is compiled to an intermediate language (called MSIL or CIL).
Unlike Java, the IL is stored in EXE files, which contain just enough native EXE code to show a dialog box asking users to install .NET.
C# compilation is done in two steps:
1. Conversion from C# to CIL by the C# compiler.
2. Conversion from CIL to instructions that the processor can execute.
A component called the just-in-time (JIT) compiler performs this second step at run time, translating CIL to machine code.
What the .exe extension is supposed to tell you is that the file is executable. C# is compiled into bytecode, just as Java is, but .NET wraps this in a CLR executable.
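The second step can be observed from inside a program. The sketch below, which assumes the standard System.Reflection.Emit API, builds a tiny method out of IL opcodes at run time and lets the runtime compile it to machine code when it is turned into a delegate. The method name "Add" and the class name EmitDemo are hypothetical.

```csharp
using System;
using System.Reflection.Emit;

static class EmitDemo
{
    static void Main()
    {
        // Define a dynamic method: int Add(int, int).
        var dm = new DynamicMethod(
            "Add", typeof(int), new[] { typeof(int), typeof(int) });

        // Emit the IL body: push both arguments, add them, return the result.
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Add);
        il.Emit(OpCodes.Ret);

        // Creating the delegate completes the method; the runtime compiles
        // the IL to native code so the processor can execute it.
        var add = (Func<int, int, int>)dm.CreateDelegate(typeof(Func<int, int, int>));

        Console.WriteLine(add(2, 3)); // prints 5
    }
}
```

No C# source exists for the emitted method; the runtime only ever sees its IL, which is the same situation as for any method loaded from a compiled assembly.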
Look here for a more in depth look at CLR executable http://etutorials.org/Programming/.NET+Framework+Essentials/Chapter+2.+The+Common+Language+Runtime/2.2+CLR+Executables/
C# code is compiled to MSIL, which is like Java bytecode. The MSIL is converted to machine instructions at run time.
C# code is compiled to MSIL, and the MSIL is taken care of by the .NET CLR.
There is also a project that allows compilation of C# to standalone binary executables: CoreRT

Viewing MSIL as expression tree

I'm currently building a compiler for my language into MSIL, and use Reflector to inspect the IL.
Is there a way to visualise the IL as an Expression Tree that could be used to generate the IL instead?
You could use FxCop for this, with a custom rule that writes to a text file or something.
Note: FxCop works on compiled managed code (DLLs/EXEs); I'm not sure about starting from IL. I suggested this answer because you say you're using Reflector to get the IL, which implies you're starting from compiled managed code.

Does compiling to native code in .Net remove the MSIL completely?

I'm wondering whether, in the context of disassembling .NET code (Redgate .NET Reflector, etc.), it is more secure to compile your code to native using Ngen. That is, would someone now need IDA and assembly skills to disassemble (and make sense of) your code, versus the relatively trivial decompiling of MSIL?
Yes, I'm aware that MS provides an obfuscator for exactly this purpose, but I'm curious whether compiling to native is a better solution, with some tradeoffs (no JIT).
Thanks.
ngen doesn't remove the MSIL (or rather, the native binary produced by ngen is unusable without also having the MSIL file). MSIL is still used by the verifier to determine whether to load assemblies in partial-trust scenarios, and for reflection.
There's a lot of good information here.

What's the relation (if any) between MASM assembly language and ILASM?

What's the relation (if any) between MASM assembly language and ILASM? Is there a one-to-one conversion? I'm trying to incorporate Quantum GIS into a program I'm writing as I go along. I have GIS on my computer and I have RedGate Reflector, but neither it nor the Object Browser of Visual Studio 2008 could open one of the .dlls in Quantum (there are several, and I don't have a strong idea of how they behave). I used the MASM assembly editor and "opened" the same dll, and it spewed something I didn't expect to understand in the first place. How can I (or can I at all) convert that same "code" into something I can interact with in ILASM, and I'm assuming consequently in C#? Thanks a ton for reading and for all the responses to earlier questions. Please bear in mind I'm relatively new to programming in C#, and even fresher to MASM and ILASM.
MASM deals with x86 instructions and is platform/processor dependent, while ILASM refers to the .NET CIL (Common Intermediate Language) instructions, which are platform/processor independent. Going from something specific to something more general is hard to achieve; that's why, AFAIK, there is no converter from MASM to ILASM (the inverse does exist!).
IL is a platform-independent layer of abstraction over native code. Code written on the .NET platform in C#, VB.NET, or another .NET language compiles down to an assembly (.EXE/.DLL) containing IL. Typically, the first time a piece of IL code is executed, the .NET runtime runs it through the just-in-time (JIT) compiler, which compiles it once again down to native code that is then actually executed. This allows .NET platform code to be deployed to any platform supporting that .NET framework, regardless of the processor or architecture of the system.
As you've seen, Reflector is great for viewing the code in an assembly because IL can easily be previewed in C# or VB.NET form. This is because IL instructions are generally a little higher level and are accompanied by a lot of metadata that native code wouldn't normally have, such as class and method names.
It's also possible to precompile a .NET assembly to native code by calling Ngen.exe directly on the assembly. Once that is done, it's really difficult to make sense of the native code.
There is no relationship between MASM assembly language and ILASM. I don't see any way for you to convert native code to IL code. IL can be understood only by the CLR, while MASM assembly language is about native machine code. The CLR turns the IL into native code at run time.
