Does F# really allow specifying which functions to be inlined in code? - c#

When I am reading F# stuff, they are talking about inlining methods, but I thought .NET didn't expose this functionality to programmers. If it's exposed then it has to be in the IL? And so can C# make use of it as well?
Just wondering if this thing is the same as C++ inline functionality.

It is actually more complicated when compared to C++ inlining, because F# works on top of .NET, which has IL as an intermediate language, so there are actually two layers where some inlining can be done:
At the F# -> IL level - The inline keyword allows you to specify that an F# function should be inlined when generating .NET IL code. In this case, the IL instructions of the function are emitted in place of the IL instruction representing a method call.
At the IL -> assembly level - This is controlled by the JITter (the .NET just-in-time compiler), which compiles the IL (intermediate language) to actual executable machine code. This happens automatically, and the JITter decides on its own to inline some simple calls (such as calls to property getters and setters). When this answer was written you could not ask for inlining at this level at all; since .NET 4.5 you can give a hint with MethodImplOptions.AggressiveInlining, but the final decision is still the JITter's.
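To make the second layer concrete from the C# side, here is a minimal sketch, assuming .NET 4.5 or later. The class and method names (InlineHints, Square, Cube, FlagsOf) are made up for illustration. It shows that the hint is recorded in the method's metadata, which is roughly what the question was asking about; it does not prove that the JIT actually inlined anything.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class InlineHints
{
    // A hint only: the JIT may still decide not to inline this call.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int Square(int x) => x * x;

    // The opposite request: keep this method as a real call.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static int Cube(int x) => x * x * x;

    // Reads the implementation flags back from the compiled metadata.
    public static MethodImplAttributes FlagsOf(string name) =>
        typeof(InlineHints).GetMethod(name).GetMethodImplementationFlags();

    public static void Main()
    {
        Console.WriteLine(Square(4) + Cube(2));
        Console.WriteLine(FlagsOf(nameof(Square)).HasFlag(MethodImplAttributes.AggressiveInlining));
        Console.WriteLine(FlagsOf(nameof(Cube)).HasFlag(MethodImplAttributes.NoInlining));
    }
}
```

Note that this is not the same thing as F#'s inline keyword, which is applied by the F# compiler before any IL is produced.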

To answer some of your specific questions, inline is an F#-specific construct that interacts with both the type system (e.g. static member constraints) and code generation (inlining code for optimization purposes). The F# compiler deals with these things, and the information regarding inlining is stored in F#-specific metadata in the assembly, which enables functions to be inlined across F# assembly boundaries.

Guess I'll post as an answer... didn't really want to do such because I don't know anything about F# beyond the basics. :p
http://msdn.microsoft.com/en-us/library/dd548047%28VS.100%29.aspx

Related

What is Deconstructed Compiler ? How C# gains a dynamic language's ability to generate and invoke code at runtime via Roslyn?

After reading this article about Roslyn, I came across two things that I did not understand:
deconstructed compiler
C# gains a dynamic language's ability to generate and invoke code at runtime via Roslyn
I have searched a lot of posts on Stack Overflow and googled for it, but could not get the whole picture.
Can anyone please explain these topics to me, or point me to some links and resources about them?
Taken from the linked article:
Hejlsberg demonstrated a C# program that passed a few code snippets to
the C# compiler as strings; the compiler returned the resulting IL
assembly code as an object, which was then passed to the Common
Language Runtime (CLR) for execution. Voilà! With Roslyn, C# gains a
dynamic language's ability to generate and invoke code at runtime.
The part of:
[...] C# gains a dynamic language's ability to generate and invoke code at runtime.
...is just a very wrong assumption made by the blog post author...
Compiling code from within an application doesn't turn C# into a dynamic language, nor does it turn the new C# compiler into a substitute for an interpreter...
C# has been able to generate code at run-time since its inception using Reflection Emit. So the new compiler didn't add that capability, but it is easier to generate code from regular C# source with the new compiler than with Reflection Emit. In addition, as @hvd has noted in a comment, this has also been possible since C#'s inception using CSharpCodeProvider.
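For reference, the Reflection Emit route mentioned above looks roughly like this. It is a minimal sketch using DynamicMethod; the names EmitDemo and BuildAdder are made up for illustration. It builds a two-argument integer add at run-time and calls it through a delegate, with no Roslyn involved.

```csharp
using System;
using System.Reflection.Emit;

public static class EmitDemo
{
    // Builds "int Add(int a, int b) { return a + b; }" directly as IL at run-time.
    public static Func<int, int, int> BuildAdder()
    {
        var method = new DynamicMethod("Add", typeof(int), new[] { typeof(int), typeof(int) });
        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0); // push first argument
        il.Emit(OpCodes.Ldarg_1); // push second argument
        il.Emit(OpCodes.Add);     // add the two values on the stack
        il.Emit(OpCodes.Ret);     // return the result
        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        Func<int, int, int> add = BuildAdder();
        Console.WriteLine(add(2, 3));
    }
}
```

Writing IL by hand like this is exactly the part that gets easier when you can hand the compiler a string of C# instead.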
C#, since .NET 4.0, can interoperate with dynamic languages using the Dynamic Language Runtime, which was created to open the door to interpreted language implementations on top of .NET (and also to make COM interop easier...).
About the other question (the thing about deconstructed compiler), it means that the new C# compiler allows you to hook other code to perform actions based on C# compilation steps.
I would take a look at Roslyn Overview on GitHub where there're a lot of details that should give more depth on the topic.

Is JIT compiler a Compiler or Interpreter?

My question is whether JIT compiler which converts the IL to Machine language is exactly a compiler or an interpreter.
One more question :
Is HTML, JavaScript a compiled language or interpreted language?
Thanks in Advance
JIT (just in time) compiler is a compiler. It does optimizations as well as compiling to machine code. (and even called a compiler)
HTML, Javascript are interpreted, they are read as-is by the web browser, and run with minimal bug fixes and optimizations.
Technically, a compiler translates from one language to another language. Since a JIT compiler receives IL as its input and outputs native machine code, it easily fits this criterion and should be called a compiler.
Regarding Javascript, making a distinction here is more difficult. If you want to be pedantic, there's no such thing as a "compiled language" or "interpreted language". I mean, it's true that in practice most languages have one common way of running them and if that is an interpreter they are usually called interpreted languages, but interpretation or compilation are (usually) not traits of the language itself. Python is almost universally considered interpreted, but it's possible to write a compiler which compiles it to native binary code; does it still deserve the "interpreted" adjective?
Now to get to the actual answer: Javascript is typically run by an interpreter which, among other things, uses a JIT compiler itself. Is that interpreted or compiled, then? Your call.
From Wikipedia: a just-in-time (JIT) compiler, also known as a dynamic translator, is used to improve the runtime performance of computer programs.
Just-in-time compilation is the conversion of non-native code, for example bytecode, into native code just before it is executed. The JIT compiler is the one that compiles the IL code and outputs native code, which is cached, whereas an interpreter executes the code line by line,
i.e. in the case of Java the class files are the input to the interpreter.
More on JIT here :
Difference between JIT Compiler and Interpreter (a)
Difference between JIT Compiler and Interpreter (b)
JIT-Compiler in detail
What a JIT compiler do ?
Yes, HTML, JavaScript are interpreted languages since they aren't compiled to any code. It means that scripts execute without preliminary compilation.
Also a good read here on JavaScript/HTML not being the compiled languages.
JIT processors, like the one for IL, are compilers, mostly. JavaScript processors are interpreters, mostly. I understand your curiosity about this question, but personally I've come to think that there really isn't any 'right' answer.
There are JavaScript interpreters that compile parts or all of the code for efficiency reasons. Are those really interpreters?
JIT acts at runtime, so it can be understood as a clever, highly optimized interpreter. Which is it?
It's like "it's a plant" or "it's an animal" questions. There are live things that don't quite fit either mold very well: nature is what nature is, and 'classification' of things is a purely human intellectual effort that has its limitations. Even man-made things like 'code' are subject to the same considerations.
Ok; so maybe there is one right answer:
The way JavaScript is processed (say, as of 5 years ago) is called an 'Interpreter'. The way C++ is processed is considered a 'compiler'.
The way IL is processed is simply... a 'JIT'.
CIL (.NET bytecode) has features not found in native CPUs, so the JIT is most definitely a compiler. Contrary to what some write here, however, most of the optimizations have already been done.
HTML is not a programming language, so it is hard to say whether it is compiled or interpreted... In the sense of "is the result of compilation reused", HTML is not compiled by any browser (it is parsed every time the page is rendered).
JavaScript in older browsers is interpreted (preprocessed into an intermediate representation, but not into machine code). The latest versions of browsers have JavaScript JIT compilers, so it is now much harder to say whether it is an interpreted or a compiled language.
The JIT (Just In Time) compiler is a compiler only and not an interpreter, because a JIT compiler compiles or converts certain pieces of bytecode to native machine code at run-time for high performance, but it doesn't execute the instructions.
An interpreter, on the other hand, reads and executes the instructions at runtime.
HTML and Javascript are interpreted; they are directly executed by the browser without compilation.

C# reflection and auditing types

I'm trying to figure out if it's possible via reflection (or otherwise) to "audit" some code to enforce validation requirements -- such as checking whether or not code creates any threads (System.Threading.Thread) or uses other BCLs. The assumption is that the code is already compiled into a dll. Thanks!
Look at FxCop. It can load a compiled binary (dll or exe) and perform validation and compliance checking against that compiled IL, regardless of the .NET language used to write it.
You can write your own rules - which you would do in this case to catch cases of "= new Thread()" and the like.
You can do this with reflection if you are very well-versed in IL.
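// GetMethod needs Instance/Static and Public/NonPublic flags; BindingFlags.Default alone matches nothing and returns null.
MethodBody mb = this.GetType().GetMethod( "Method", BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic ).GetMethodBody();
byte[] bytes = mb.GetILAsByteArray(); // raw IL of the method body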
Probably way more trouble than it is worth; the resulting IL will need to be parsed.
An IL parser (but somewhat dated): http://www.codeproject.com/KB/cs/sdilreader.aspx which will generate a list of OpCodes for you (look for OpCodes.Newobj for instantiation of a Thread).
As others have said, reflection won't help you as it only describes the metadata of types.
However, the Mono.Cecil project is a runtime way of actually looking at the IL (Intermediate Language) of types within an assembly. Although a product of the Mono framework it is compatible with the Microsoft CLR.
Reflection does not allow inspection of the body of members, only their signatures. In other words, it won't tell you anything about what a particular method or property does, just what it looks like.
To do what you're after, you'll have to use something like ildasm.exe to turn the compiled .dll or .exe into IL, then go over the IL and see if it's doing anything to which you object.
Reflection will allow you to inspect the body of methods through MethodBase.GetMethodBody, which gives you a MethodBody to inspect.
However, at this level you are dealing with raw IL in a byte array, which you have to analyze start to end to find out calls to external methods and what they do etc.
So it won't be pretty or easy, but certainly it's possible.
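To give an idea of what "analyze start to end" means in practice, here is a rough sketch that walks the IL byte array and looks for newobj instructions whose constructor belongs to a given type (for example System.Threading.Thread). The names IlAudit, Constructs, CreatesThread and NoThread are made up for illustration. It assumes a little-endian machine (because of BitConverter), only checks direct newobj calls in the one method you give it, and does not handle tokens that need a generic context to resolve. A real audit would more likely use FxCop rules or Mono.Cecil as suggested above.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;
using System.Threading;

public static class IlAudit
{
    // Opcode value -> OpCode, built by reflecting over the OpCodes class.
    static readonly Dictionary<short, OpCode> OpCodesByValue = BuildTable();

    static Dictionary<short, OpCode> BuildTable()
    {
        var table = new Dictionary<short, OpCode>();
        foreach (FieldInfo f in typeof(OpCodes).GetFields(BindingFlags.Public | BindingFlags.Static))
            if (f.GetValue(null) is OpCode op) table[op.Value] = op;
        return table;
    }

    // Number of operand bytes that follow the opcode at position pos.
    static int OperandSize(OpCode op, byte[] il, int pos)
    {
        switch (op.OperandType)
        {
            case OperandType.InlineNone: return 0;
            case OperandType.ShortInlineBrTarget:
            case OperandType.ShortInlineI:
            case OperandType.ShortInlineVar: return 1;
            case OperandType.InlineVar: return 2;
            case OperandType.InlineI8:
            case OperandType.InlineR: return 8;
            case OperandType.InlineSwitch: return 4 + 4 * BitConverter.ToInt32(il, pos);
            default: return 4; // tokens, int32, float32, long branch targets
        }
    }

    // True if the method body contains "newobj" for a constructor declared on target.
    public static bool Constructs(MethodBase method, Type target)
    {
        MethodBody body = method.GetMethodBody();
        if (body == null) return false; // abstract or extern methods have no IL
        byte[] il = body.GetILAsByteArray();
        int pos = 0;
        while (pos < il.Length)
        {
            short value = il[pos++];
            if (value == 0xFE) value = unchecked((short)(0xFE00 | il[pos++])); // two-byte opcode
            OpCode op = OpCodesByValue[value];
            if (op == OpCodes.Newobj)
            {
                int token = BitConverter.ToInt32(il, pos);
                MethodBase ctor = method.Module.ResolveMethod(token);
                if (ctor.DeclaringType == target) return true;
            }
            pos += OperandSize(op, il, pos);
        }
        return false;
    }

    // Two sample methods to audit.
    public static void CreatesThread() { var t = new Thread(() => { }); GC.KeepAlive(t); }
    public static void NoThread() { Console.WriteLine("hi"); }

    public static void Main()
    {
        Console.WriteLine(Constructs(typeof(IlAudit).GetMethod(nameof(CreatesThread)), typeof(Thread)));
        Console.WriteLine(Constructs(typeof(IlAudit).GetMethod(nameof(NoThread)), typeof(Thread)));
    }
}
```

To audit a compiled dll you would load it with Assembly.LoadFrom and run the check over every method of every type, including the compiler-generated ones.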

How do languages like C# and Java avoid C/C++-like independent compilation?

For my programming languages class, I'm writing a research paper on some papers by some important people in the history of language design. One by CAR Hoare struck me as odd because it speaks against independent compilation techniques used in C and later C++ before C even became popular.
Since this is primarily an optimization to speed up compilation times, what is it about Java and C# that make them able to avoid reliance on independent compilation? Is it a compiler technique or are there elements of the language that facilitate this? And are there any other compiled languages that used these techniques before them?
Short answer: Java and C# don't avoid separate compilation; they make full use of it.
Where they differ is that they don't require the programmer to write a pair of separate header/implementation files when writing a reusable library. The user writes the definition of a class once, and the compiler extracts the information equivalent to the "header" from that single definition and includes it in the output file as "type metadata". So the output file (a .jar full of .class files in Java, or an .dll assembly in .NET-based languages) is a combination of binaries AND headers in a single package.
Then when another class is compiled and it depends on the first class, it can look at the metadata instead of having to find a separate include file.
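You can see this "header inside the binary" for yourself with reflection, which reads the same type metadata the compiler consults when you reference an assembly. A minimal sketch follows; MetadataDemo and PublicSignatures are made-up names, and System.Math is just a convenient type that is always available.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class MetadataDemo
{
    // Lists the public method signatures a type declares, read from assembly
    // metadata; no source code or header file is involved.
    public static string[] PublicSignatures(Type type) =>
        type.GetMethods(BindingFlags.Public | BindingFlags.Instance |
                        BindingFlags.Static | BindingFlags.DeclaredOnly)
            .Select(m => m.ToString())
            .ToArray();

    public static void Main()
    {
        foreach (string signature in PublicSignatures(typeof(Math)).Take(5))
            Console.WriteLine(signature);
    }
}
```

This is roughly the information a C++ compiler would need a separate .h file for.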
It happens that they target a virtual machine rather than a specific chip architecture, but that's a separate issue; they could put x86 machine code in as the binary and still have the header-like metadata in the same file as well (this is in fact an option in .NET, albeit rarely used).
In C++ compilers it is common to try to speed up compilation by using "pre-compiled headers". The metadata in .NET .dll and .class files is much like a pre-compiled header - already parsed and indexed, ready for rapid look-ups.
The upshot is that in these modern languages, there is one way of doing modularization, and it has the characteristics of a perfectly organised and hand-optimised C++ modular build system - pretty nifty, speaking ASFAC++B.
IMO, one of the biggest factors here is that both java and .NET use intermediate languages; that means that the compiled unit (jar/assembly) contains, as a pre-requisite, a lot of expressive metadata about the types, methods, etc; meaning that it is already laid out conveniently for reference checking. The runtime still checks anyway, in case you are pulling a fast one ;-p
This isn't very far removed from the MIDL that underpins COM, although there the TLB is often a separate entity.
If I've misunderstood your meaning, please let me know...
You could consider a java .class file to be similar to a precompiled header file in C/C++. Essentially the .class file is the intermediate form that a C/C++ linker would need as well as all of the information contained in the header (Java just doesn't have a separate header).
From your comment in another post:
"I'm basically meaning the idea in
C/C++ that each source file is its own
individual compilation unit. This
doesn't as much seem to be the case in
C# or Java."
In Java (I cannot speak for C#, but I assume it is the same) each source file is its own individual compilation unit. I am not sure why you would think it is not... perhaps we have different definitions of compilation unit?
It requires some language support (otherwise, C/C++ compilers would do it too)
In particular, it requires that the compiler generates self-contained modules, which expose metadata that other modules can reference to call into them.
.NET assemblies are a straightforward example. All the files in a project are compiled together, generating one dll. This dll can be queried by .NET to determine which types it contains, so that other assemblies can call functions defined in it.
And to make use of this, it must be legal in the language to reference other modules.
In C++, what defines the boundary of a module? The language specifies that the compiler only considers data in its current compilation unit (.cpp file + included headers). There is no mechanism for specifying "I'd like to call function Foo in module Bar, even though I don't have the prototype or anything for it at compile-time". The only mechanism you have for sharing type information between files is with #includes.
There is a proposal to add a module system to C++, but it won't be in C++0x. Last I saw, the plan was to consider it for a TR1 after 0x is out.
(It's worth mentioning that the #include system in C/C++ was originally used because it'd speed up compilation. Back in the 70's, it allowed the compiler to process the code in a simple linear scan. It didn't have to build syntax trees or other such "advanced" features. Today, the tables have turned and it's become a huge bottleneck, both in terms of usability and compilation speed.)
The object files generated by a C/C++ compiler are meant to be read only by the linker, not by the compiler.
As to other languages: IIRC Turbo Pascal had "units" which you could use without having any source code. I think the point is to create metadata along with compiled code which can then be used by the compiler to figure out the interface to the module (i.e. signatures of functions, class layout etc.)
One problem with C/C++ which prevents just replacing #include with some kind of #import is also the preprocessor, which can completely change the meaning/syntax etc of included/imported modules. This would be very difficult (if not impossible) with a Java-like module system.

CLR vs JIT

What is the difference between the JIT compiler and CLR? If you compile your code to il and CLR runs that code then what is the JIT doing? How has JIT compilation changed with the addition of generics to the CLR?
You compile your code to IL which gets executed and compiled to machine code during runtime, this is what's called JIT.
Edit, to flesh out the answer some more (still overly simplified):
When you compile your C# code in visual studio it gets turned into IL that the CLR understands, the IL is the same for all languages running on top of the CLR (which is what enables the .NET runtime to use several languages and inter-op between them easily).
During runtime the IL is compiled into machine code (which is specific to the architecture you're on) and then it's executed. This process is called Just In Time compilation, or JIT for short. Only the IL that is needed is transformed into machine code (and only once; it's "cached" once it's compiled into machine code), just in time before it's executed, hence the name JIT.
This is what it would look like for C#
C# Code > C# Compiler > IL > .NET Runtime > JIT Compiler > Machinecode > Execution
And this is what it would look like for VB
VB Code > VB Compiler > IL > .NET Runtime > JIT Compiler > Machinecode > Execution
And as you can see only the two first steps are unique to each language, and everything after it's been turned into IL is the same which is, as I said before, the reason you can run several different languages on top of .NET
The JIT is one aspect of the CLR.
Specifically it is the part responsible for changing CIL (hereafter called IL) produced by the original language's compiler (csc.exe for Microsoft C#, for example) into machine code native to the current processor (and the architecture that it exposes in the current process, for example 32/64bit). If the assembly in question was ngen'd then the JIT process is completely unnecessary and the CLR will run this code just fine without it.
Before a method is used which has not yet been converted from the intermediate representation it is the JIT's responsibility to convert it.
Exactly when the JIT will kick in is implementation specific, and subject to change. However, the CLR design mandates that the JIT happens before the relevant code executes; JVMs, in contrast, would be free to interpret the code for a while as a separate thread creates a machine code representation.
The 'normal' CLR uses a pre-JIT stub approach whereby methods are JIT compiled only as they are used. This involves having the initial native method stub be an indirection that instructs the JIT to compile the method and then modifies the original call to skip past the initial stub. The current compact edition instead compiles all methods on a type when it is loaded.
To address the addition of Generics.
This was the last major change to the IL specification and JIT in terms of its semantics as opposed to its internal implementation details.
Several new IL instructions were added, and more meta data options were provided for instrumenting types and members.
Constraints were added at the IL level as well.
When the JIT compiles a method which has generic arguments (either explicitly or implicitly through the containing class) it may set up different code paths (machine code instructions) for each type used. In practice the JIT uses a shared implementation for all reference types since variables for these will exhibit the same semantics and occupy the same space (IntPtr.Size).
Each value type will get specific code generated for it; dealing with the reduced / increased size of the variables on the stack/heap is a major reason for this. Also, by emitting the constrained opcode before method calls, many invocations on non reference types need not box the value to call the method (this optimization is used in non generic cases as well). This also allows the default(T) behaviour to be correctly handled and comparisons to null to be stripped out as no-ops (always false) when a non Nullable value type is used.
If an attempt is made at runtime to create an instance of a generic type via reflection then the type parameters will be validated by the runtime to ensure they pass any constraints. This does not directly affect the JIT unless this is used within the type system (unlikely though possible).
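The observable side of the last two points can be sketched in C#. The names GenericsDemo, IsNull, Default and CanClose are made up for illustration. The code only shows the behaviour; it cannot show which machine code the JIT generates or shares.

```csharp
using System;

public static class GenericsDemo
{
    // For a non-nullable value type T this comparison is always false,
    // so the JIT can drop it from the specialised code.
    public static bool IsNull<T>(T value) => value == null;

    // default(T) is null for reference types and zero-initialised for value types.
    public static T Default<T>() => default(T);

    // Closing a generic type via reflection makes the runtime check the constraints;
    // a violation surfaces as an ArgumentException.
    public static bool CanClose(Type openGeneric, Type argument)
    {
        try { openGeneric.MakeGenericType(argument); return true; }
        catch (ArgumentException) { return false; }
    }

    public static void Main()
    {
        Console.WriteLine(IsNull(42));
        Console.WriteLine(IsNull<string>(null));
        Console.WriteLine(Default<int>());
        Console.WriteLine(CanClose(typeof(Nullable<>), typeof(int)));
        Console.WriteLine(CanClose(typeof(Nullable<>), typeof(string))); // string is not a struct
    }
}
```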
As Jon Skeet says, JIT is part of the CLR. Basically this is what is happening under the hood:
1) Your source code is compiled into a byte code known as the common intermediate language (CIL).
2) Metadata from every class and every method (and every other thing :O) is included in the PE header of the resulting executable (be it a dll or an exe).
3) If you're producing an executable, the PE header also includes a conventional bootstrapper which is in charge of loading the CLR (Common Language Runtime) when you execute your executable.
Now, when you execute:
1) The bootstrapper initializes the CLR (mainly by loading the mscorlib assembly) and instructs it to execute your assembly.
2) The CLR executes your main entry.
3) Now, classes have a vector table which holds the addresses of the methods, so that when you call MyMethod, this table is searched and then a corresponding call to the address is made. Upon start ALL entries for all tables have the address of the JIT compiler.
4) When a call to one of these methods is made, the JIT is invoked instead of the actual method and takes control. The JIT then compiles the CIL code into actual machine code for the appropriate architecture.
5) Once the code is compiled the JIT goes into the method vector table and replaces the address with the one of the compiled code, so that every subsequent call no longer invokes the JIT.
6) Finally, the JIT hands execution over to the compiled code.
7) If you call another method which hasn't yet been compiled then go back to 4... and so on...
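The stub-and-patch idea in the run-time steps above can be modelled in a few lines of C#. This is a toy model only: ToyMethodTable, Register, Call and CompileCount are made-up names, a dictionary of delegates stands in for the method table, and a factory delegate stands in for the JIT. It is not how the CLR is implemented internally.

```csharp
using System;
using System.Collections.Generic;

public sealed class ToyMethodTable
{
    // Slot name -> code to run. Every slot starts out pointing at a "JIT stub".
    private readonly Dictionary<string, Func<int, int>> slots =
        new Dictionary<string, Func<int, int>>();

    // How many times the pretend JIT had to run.
    public int CompileCount { get; private set; }

    public void Register(string name, Func<Func<int, int>> compile)
    {
        slots[name] = arg =>
        {
            CompileCount++;                        // first call: the "JIT" takes control
            Func<int, int> compiled = compile();   // produce the "native code"
            slots[name] = compiled;                // patch the table to skip the stub
            return compiled(arg);                  // hand execution over to the compiled code
        };
    }

    public int Call(string name, int arg) => slots[name](arg);

    public static void Main()
    {
        var table = new ToyMethodTable();
        table.Register("Double", () => x => x * 2);
        Console.WriteLine(table.Call("Double", 21)); // goes through the stub
        Console.WriteLine(table.Call("Double", 4));  // goes straight to the patched slot
        Console.WriteLine(table.CompileCount);
    }
}
```

The second call never touches the stub, which is the whole point of patching the table.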
The JIT is basically part of the CLR. The garbage collector is another. Quite where you put interop responsibilities etc is another matter, and one where I'm hugely underqualified to comment :)
I know the thread is pretty old, but I thought I might put in the picture that made me understand JIT. It's from the excellent book CLR via C# by Jeffrey Richter. In the picture, the metadata he is talking about is the metadata emitted in the assembly header where all information about types in the assembly is stored:
1) While compiling a .NET program, the program code is converted into Intermediate Language (IL) code.
2) Upon executing the program, the Intermediate Language code is converted into native code for the operating system as and when a method is called; this is called JIT (Just in Time) compilation.
1. The Common Language Runtime (CLR) is the interpreter while Just In Time (JIT) is the compiler in the .NET Framework.
2. JIT is the internal compiler of .NET which takes Microsoft Intermediate Language (MSIL) code from the CLR and compiles it to machine-specific instructions, whereas the CLR works as an engine whose main task is to provide MSIL code to the JIT and to ensure that the code is fully compiled as per the machine specification.
