Throughout this site I commonly see people answer questions with such answers as "it works like that because the compiler replaces [thing] with [other thing]", so my question is, how do people know/learn this? Where can I learn these things?
The most definitive source for how the C# compiler interprets code is the C# language spec.
http://www.microsoft.com/download/en/details.aspx?id=7029
Also the following blogs provide a lot of more insight into the C# language. Mandatory reading for anyone who wants to become an expert in the language
http://blogs.msdn.com/b/ericlippert/
http://msmvps.com/blogs/jon_skeet/
One technique is to compile your code, and then decompile it using tools such as ILSpy. Using such a tool, you can view the raw IL and see for yourself what the compiler produces.
In addition to the other answers, I'd like to mention that LINQPad is my favorite tool for inspecting IL for quick snippets.
You can type a snippet of code, and immediately see the IL.
It's by far the easiest tool to use, and you can make changes and see the results instantly.
In addition to checking the Intermediate Language and reading the language specification, please allow me to add "CLR via C#" by Jeffrey Richter. Microsoft Press Library of Congress Control Number: 2009943026. This reference is amazing, and goes into complete detail on what's happening under the covers.
Niklaus Wirth's book Compiler Construction (PDF) is an introduction to the theory and the techniques of compiler construction. It gives you a general idea of what a compiler is and what it does.
Related
I've been using reflector to decompile a couple simple c# apps but I notice that though code is being decompiled, I still can't see things as they were written on VS. I think this is the way it is as the compiler replaces human instructions by machine code. However I thought I would give it a try and ask it on here. Maybe there is a decompiler that can decompile and show the coding almost identically to the original code.
That is impossible, since there are lots of ways to get the same IL from different code. For example, there is no way to know if an extension method was called fluent-style vs explicit on the declaring type. There is no way to know if LINQ vs regular code was used. All manner of implicit operations may or may not be there. Removed code may or may not have been there. Many primitives (including enums) up-to-and-including 4 bytes are indistinguishable once they are IL.
If you want the actual code, legally obtain the original code.
Existing .Net decompilers generally decompile to the best of their ability.
You appear to be asking for variable names and line formatting, which for obvious reasons are not compiled to IL.
There are several. I currently use JustDecompile found here http://www.telerik.com/products/decompiler.aspx?utm_source=twitter&utm_medium=sm&utm_campaign=ad
[Edit]
An alternative is .NET Reflector found here: http://www.reflector.net/
I believe there is a free version of it, but didn't take time to look.
Basically, no. There are often many ways to arrive at the same IL code, and there's no way at all for a decompiler to know which was used.
No, nor should there ever be. Things like comments and unreachable code would just add bloat with absolutely zero benefit. The very best you can ever do is approximate the compiled code.
I am researching ways, tools and techniques to parse code files in order to support syntax highlighting and intellisence in an editor written in c#.
Does anyone have any ideas/patterns & practices/tools/techiques for that.
EDIT: A nice source of info for anyone interested:
Parsing beyond Context-free grammars
ISBN 978-3-642-14845-3
My favourite parser for C# is Irony: http://irony.codeplex.com/ - i have used it a couple of times with great success
Here is a wikipedia page listing many more: http://en.wikipedia.org/wiki/Compiler-compiler
There are two basic aproaches:
1) Parse the entire solution and everything it references so you understand all the types involved in the code
2) Parse locally and do your best to guess what types etc are.
The trouble with (2) is that you have to guess, and in some circumstances you just can't tell from a code snippet exactly what everything is. But if you're happy with the sort oif syntax highlighting shown on (e.g.) Stack Overflow, then this approach is easy and quite effective.
To do (1) then you need to do one of (in decreasing order of difficulty):
Parse all the source code. Not possible if you reference 3rd party assemblies.
Use reflection on the compiled code to garner type information you can use when parsing the source.
Use the host IDE's (if avaiable - so not applicable in your case!) code element interfaces to provide the information you need
You could take a look at how http://www.icsharpcode.net/ did it. They wrote a book doing just that, Dissecting a C# Application: Inside SharpDevelop, it even has a chapter called
Implement a parser to provide syntax
highlighting and auto-completion as
users type
I am looking to write an interpreted language in C#, where should I start? I know how I would do it using fun string parsing, but what is the correct way?
It can be a pretty difficult endeavour to do right.
If you don't have much knowledge in compiler theory you should probably first start reading about it.
Just using "fun string parsing", if I understand that term correctly, isn't going to get you very far at all.
The first basic step is to write your language grammar that defines the valid syntax for the language.
A tool like ANTLR will help you get the pieces together, but I would suggest reading the Dragon book as it is the canonical starting point to get up to speed on the subject.
If you want to build an interpreted language on .NET, the DLR is the way to go - check out Martin Maly's LOLCODE sample at http://www.iunknown.com/2007/11/lolcode-on-dlr.html
Edit: Here's another link with more information from Scott Hanselman: http://www.hanselman.com/blog/TheWeeklySourceCode11LOLCodeDLREdition.aspx
Checkout the Phoenix compiler from Microsoft. This will provide many of the tools you will need to build a compiler targeting native or managed environments. Among these tools us a optimizing back end.
I second Cycnus' suggestion on reading Aho Sethi and Ullman's "Dragon Book" (Wikipedia, Amazon).
RGR
Certainly there's the difference in general syntax, but what other critical distinctions exist? There are some differences, right?
The linked comparisons are very thorough, but as far as the main differences I would note the following:
C# has anonymous methodsVB has these now, too
C# has the yield keyword (iterator blocks)VB11 added this
VB supports implicit late binding (C# has explicit late binding now via the dynamic keyword)
VB supports XML literals
VB is case insensitive
More out-of-the-box code snippets for VB
More out-of-the-box refactoring tools for C#Visual Studio 2015 now provides the same refactoring tools for both VB and C#.
In general the things MS focuses on for each vary, because the two languages are targeted at very different audiences. This blog post has a good summary of the target audiences. It is probably a good idea to determine which audience you are in, because it will determine what kind of tools you'll get from Microsoft.
This topic has had a lot of face time since .Net 2.0 was released. See this Wikipedia article for a readable summary.
This may be considered syntax, but VB.NET is case insensitive while C# is case sensitive.
This is a very comprehensive reference.
Since I assume you can google, I don't think a link to more sites is what you are looking for.
My answer: Choose base on the history of your developers. C# is more JAVA like, and probably C++ like.
VB.NET was easier for VB programmers, but I guess that is no really an issue anymore sine there are no new .NET programmers coming from old VB.
My opinion is that VB is more productive then C#, it seems it is always ahead in terms of productivity tools (such as intelisense), and I would recommend vb over c# to someone that asks. Of course, someone that knows he prefers c# won't ask, and c# is probably the right choice for him.
Although the syntax sugar on C#3 has really pushed the bar forward, I must say some of the Linq to XML stuff in VB.Net seems quite nice and makes handling complex, deeply nested XML a little bit more tolerable. Just a little bit.
One glaring difference is in how they handle extension methods (Vb.Net actually allows something that C# doesn't - passing the type on which the extension method is being defined as ref): http://blog.gadodia.net/extension-methods-in-vbnet-and-c/
Apart from syntax not that much any more. They both compile to exactly the same IL, so you can compile something as VB and reflect it into C#.
Most of the apparent differences are syntactic sugar. For instance VB appears to support dynamic types, but really they're just as static as C#'s - the VB compiler figures them out.
Visual Studio behaves differently with VB than with C# - it hides lots of functionality but adds background compiling (great for small projects, resource hogging for large ones) and better snippet support.
With more and more compiler 'magic' in C#3 VB.Net has really fallen behind. The only thing VB now has that C# doesn't is the handles keyword - and that's of debatable benefit.
#Tom - that really useful, but a little out of date - VB.Net now supports XML docs too with '''
#Luke - VB.Net still doesn't have anon-methods, but does now support lambdas.
The biggest difference in my opinion is the ability to write unsafe code in C#.
Although VB.NET supports try...catch type exception handling, it still has something similar to VB6's ON ERROR. ON ERROR can be seriously abused, and in the vast majority of cases, try...catch is far better; but ON ERROR can be useful when handling COM time-out operations where the error can be trapped, decoded, and the final "try again" is a simple one line.
You can do the same with try...catch but the code is a lot messier.
This topic is briefly described at wikipedia and harding.
http://en.wikipedia.org/wiki/Comparison_of_C_Sharp_and_Visual_Basic_.NET
http://www.harding.edu/fmccown/vbnet_csharp_comparison.html
Just go through and make your notes on that.
When it gets to IL its all just bits. That case insensitivity is just a precompiler pass.
But the general consensus is, vb is more verbose.
If you can write c# why not save your eyes and hands and write the smaller amount of code to do the same thing.
One glaring difference is in how they handle extension methods (Vb.Net actually allows something that C# doesn't - passing the type on which the extension method is being defined as ref): http://blog.gadodia.net/extension-methods-in-vbnet-and-c/
Yes VB.NET fixed most of the VB6 problems and made it a proper OOP language - ie. Similar in abilities to C#. AlthougnI tend to prefer C#, I do find the old VB ON ERROR construct useful for handling COM interop timeouts. Something to use wisely though - ON ERROR is easily abused!!
Certainly there's the difference in general syntax, but what other critical distinctions exist? There are some differences, right?
The linked comparisons are very thorough, but as far as the main differences I would note the following:
C# has anonymous methodsVB has these now, too
C# has the yield keyword (iterator blocks)VB11 added this
VB supports implicit late binding (C# has explicit late binding now via the dynamic keyword)
VB supports XML literals
VB is case insensitive
More out-of-the-box code snippets for VB
More out-of-the-box refactoring tools for C#Visual Studio 2015 now provides the same refactoring tools for both VB and C#.
In general the things MS focuses on for each vary, because the two languages are targeted at very different audiences. This blog post has a good summary of the target audiences. It is probably a good idea to determine which audience you are in, because it will determine what kind of tools you'll get from Microsoft.
This topic has had a lot of face time since .Net 2.0 was released. See this Wikipedia article for a readable summary.
This may be considered syntax, but VB.NET is case insensitive while C# is case sensitive.
This is a very comprehensive reference.
Since I assume you can google, I don't think a link to more sites is what you are looking for.
My answer: Choose base on the history of your developers. C# is more JAVA like, and probably C++ like.
VB.NET was easier for VB programmers, but I guess that is no really an issue anymore sine there are no new .NET programmers coming from old VB.
My opinion is that VB is more productive then C#, it seems it is always ahead in terms of productivity tools (such as intelisense), and I would recommend vb over c# to someone that asks. Of course, someone that knows he prefers c# won't ask, and c# is probably the right choice for him.
Although the syntax sugar on C#3 has really pushed the bar forward, I must say some of the Linq to XML stuff in VB.Net seems quite nice and makes handling complex, deeply nested XML a little bit more tolerable. Just a little bit.
One glaring difference is in how they handle extension methods (Vb.Net actually allows something that C# doesn't - passing the type on which the extension method is being defined as ref): http://blog.gadodia.net/extension-methods-in-vbnet-and-c/
Apart from syntax not that much any more. They both compile to exactly the same IL, so you can compile something as VB and reflect it into C#.
Most of the apparent differences are syntactic sugar. For instance VB appears to support dynamic types, but really they're just as static as C#'s - the VB compiler figures them out.
Visual Studio behaves differently with VB than with C# - it hides lots of functionality but adds background compiling (great for small projects, resource hogging for large ones) and better snippet support.
With more and more compiler 'magic' in C#3 VB.Net has really fallen behind. The only thing VB now has that C# doesn't is the handles keyword - and that's of debatable benefit.
#Tom - that really useful, but a little out of date - VB.Net now supports XML docs too with '''
#Luke - VB.Net still doesn't have anon-methods, but does now support lambdas.
The biggest difference in my opinion is the ability to write unsafe code in C#.
Although VB.NET supports try...catch type exception handling, it still has something similar to VB6's ON ERROR. ON ERROR can be seriously abused, and in the vast majority of cases, try...catch is far better; but ON ERROR can be useful when handling COM time-out operations where the error can be trapped, decoded, and the final "try again" is a simple one line.
You can do the same with try...catch but the code is a lot messier.
This topic is briefly described at wikipedia and harding.
http://en.wikipedia.org/wiki/Comparison_of_C_Sharp_and_Visual_Basic_.NET
http://www.harding.edu/fmccown/vbnet_csharp_comparison.html
Just go through and make your notes on that.
When it gets to IL its all just bits. That case insensitivity is just a precompiler pass.
But the general consensus is, vb is more verbose.
If you can write c# why not save your eyes and hands and write the smaller amount of code to do the same thing.
One glaring difference is in how they handle extension methods (Vb.Net actually allows something that C# doesn't - passing the type on which the extension method is being defined as ref): http://blog.gadodia.net/extension-methods-in-vbnet-and-c/
Yes VB.NET fixed most of the VB6 problems and made it a proper OOP language - ie. Similar in abilities to C#. AlthougnI tend to prefer C#, I do find the old VB ON ERROR construct useful for handling COM interop timeouts. Something to use wisely though - ON ERROR is easily abused!!