I am working on a reverse engineering school project, which requires me to translate and manipulate the AST of a compiled C# project. I have seen the post "Translate C# code into AST?" on this website, but it doesn't look like what I am looking for.
As far as I know, C# currently doesn't provide a library class like the one available for Java: http://help.eclipse.org/help33/index.jsp?topic=/org.eclipse.cdt.doc.isv/reference/api/org/eclipse/cdt/core/dom/ast/ASTVisitor.html. If such a library class exists for C#, my problem is solved.
I consulted someone, and here are the possible solutions, but I have problems with each of them as well:
Find another compiler that provides a library which exposes its AST for manipulation. But I can't find a compiler like that.
Use the ANTLR parser generator to come up with my own compiler that does that (it will be a much more difficult and longer process). The download there provides sample grammars for different languages but not C# (it can generate parsers in various languages, including C#, but has no grammar for C# itself). Hence the problem is that I can't find a C# grammar.
What is the shortest and fastest way to approach this issue? If I really have to take one of the alternatives above, how should I go about solving the problems I faced?
I know the answer for this one was accepted long ago. But I had a similar question and wasn't sure of the options out there. I did a little investigation of the NRefactory library that ships as part of SharpDevelop. It does generate an AST from C# code.
Here's an image of the NRefactory demo application that is part of the SD source code. Type in some C# code and it generates and displays the AST in a treeview.
Why don't you try NRefactory? I've seen it discussed for AST work on some SharpDevelop forums.
Here is an article on CodeProject regarding this topic.
ANTLR is not a good choice. I am now trying out Mono Cecil instead. Mono Cecil is good for analyzing any source code that can be compiled into Common Intermediate Language (CIL). The disadvantage is that it doesn't have proper documentation.
I've just answered another thread here on StackOverflow with a solution in which I implemented an API to create and manipulate an AST from C# source code.
A full C# 3.0 parser is available with our DMS Software Reengineering Toolkit (DMS for short). It has been used to process tens of thousands of C# files accurately. It provides automated AST building, tree traversals, surface-syntax pattern matching and transformation, and lots more.
As a commercial product it might not work out for a student project.
ANTLR arguably offers a C# parser, but I don't know how complete or robust it is, or whether it actually builds ASTs.
[EDIT Jan 25 2010: C# 4.0 parser now available for DMS with all the above properties]
[EDIT May 2016: C# 6.0 parser available for DMS.]
For a large VB6-to-C# porting job I wrote a tool which uses a murder of regular expressions to analyse a VB6 code base and extract the dependencies of all the functions in all the forms, .bas files and classes.
It allowed us to chop out blocks of code for the developers, generate graphs and extract all the SQL.
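The original tool is not shown, so the following is only a minimal sketch of the regex-based approach described above, written in Python. The VB6 snippet, the patterns and the `analyse` function are all illustrative assumptions; a real code base would need many more patterns (properties, line continuations, case-insensitive names, and so on).

```python
import re
from collections import defaultdict

# Hypothetical VB6-like input; not taken from the original project.
SOURCE = """\
Public Sub LoadCustomers()
    OpenConnection
    RunQuery "SELECT * FROM Customers"
End Sub

Private Sub OpenConnection()
End Sub

Private Function RunQuery(sql As String) As Boolean
    OpenConnection
End Function
"""

# Header of a Sub or Function; group 1 is the routine name.
DEF_RE = re.compile(r"^\s*(?:Public|Private)?\s*(?:Sub|Function)\s+(\w+)", re.IGNORECASE)
# End of a routine body.
END_RE = re.compile(r"^\s*End\s+(?:Sub|Function)\b", re.IGNORECASE)
# String literals that start with a SQL verb.
SQL_RE = re.compile(r'"((?:SELECT|INSERT|UPDATE|DELETE)\b[^"]*)"', re.IGNORECASE)
# Any string literal, used to blank strings out before looking for calls.
STRING_RE = re.compile(r'"[^"]*"')


def analyse(source):
    """Return (dependencies per routine, list of SQL strings found)."""
    lines = source.splitlines()

    # First pass: collect every routine name defined in the source.
    names = set()
    for line in lines:
        match = DEF_RE.match(line)
        if match:
            names.add(match.group(1))

    # Second pass: inside each routine, record references to other routines.
    deps = defaultdict(set)
    sql = []
    current = None
    for line in lines:
        match = DEF_RE.match(line)
        if match:
            current = match.group(1)
            deps[current]  # make sure routines without calls still appear
            continue
        if END_RE.match(line):
            current = None
            continue
        if current is None:
            continue
        sql.extend(SQL_RE.findall(line))
        code = STRING_RE.sub("", line)
        for word in re.findall(r"\w+", code):
            if word in names and word != current:
                deps[current].add(word)
    return dict(deps), sql


deps, sql = analyse(SOURCE)
for routine in sorted(deps):
    print(routine, "->", sorted(deps[routine]))
print(sql)
```

The resulting dictionary can be fed straight into a graphing tool, which is roughly how the dependency graphs mentioned above could be produced. Name matching here is case-sensitive, which VB6 itself is not.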
I could really use something that does the same thing for C# and although it would be a lot easier for C#, I don't have the time or budget to write it.
We are limited to VS2008.
Does anything like this already exist?
I'm not sure, but I think NDepend has this. If not, writing it yourself should be pretty straightforward using Roslyn or NRefactory.
If you take a look at the Roslyn project page, you'll find an example that shows about 70% of what you seem to be trying to achieve:
Walkthrough: Getting Started with Semantic Analysis – C#
It seems you are doing some kind of "refactoring". I find ReSharper most useful in this case: http://www.jetbrains.com/resharper/
Not sure if the title explains it correctly.
Anyways, I'm building a .NET WPF application which should go through the JavaScript and identify issues such as
1. Whether the variables defined are being nullified at the end
2. Whether try/catch/finally blocks are being used
3. Function calls
I went through the questions over here, which all revolved around C/C++. Now I regret bunking my compilers classes.
I want to know how to verify points 1-3 in C#. Is there any library out there which does this?
What you're looking for is an abstract syntax tree parser for JavaScript written in C#.
There are a few choices I know of:
Microsoft's Ajax Minifier library comes with its own AST parser (used to minify / optimize JavaScript files). You can find the source code for that on GitHub.
Esprima.net is another option. It's a port of the popular JavaScript library Esprima.
The good thing about Esprima is that it outputs the AST in a common format (defined by Mozilla here) that's used across a few parsers, making it really easy to port utilities for walking the tree, etc., since they all use the same underlying data structure.
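To illustrate why a shared tree format makes such utilities easy to port, here is a minimal sketch in Python that walks an ESTree-style tree and reports try blocks and function calls. The tree is hand-written for the snippet `try { init(); } catch (e) { log(e); }` rather than produced by a real parser, and the `walk` and `summarise` helpers are hypothetical names, not part of Esprima or any other library.

```python
# Hand-written ESTree-style tree (an assumption for illustration); a real
# tree would come from a parser such as Esprima.
TREE = {
    "type": "Program",
    "body": [{
        "type": "TryStatement",
        "block": {"type": "BlockStatement", "body": [{
            "type": "ExpressionStatement",
            "expression": {
                "type": "CallExpression",
                "callee": {"type": "Identifier", "name": "init"},
                "arguments": [],
            },
        }]},
        "handler": {
            "type": "CatchClause",
            "param": {"type": "Identifier", "name": "e"},
            "body": {"type": "BlockStatement", "body": [{
                "type": "ExpressionStatement",
                "expression": {
                    "type": "CallExpression",
                    "callee": {"type": "Identifier", "name": "log"},
                    "arguments": [{"type": "Identifier", "name": "e"}],
                },
            }]},
        },
        "finalizer": None,
    }],
}


def walk(node):
    """Yield every node (any dict with a "type" key) in depth-first order."""
    if isinstance(node, dict):
        if "type" in node:
            yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


def summarise(tree):
    """Count try statements and list the names of directly called functions."""
    tries = 0
    calls = []
    for node in walk(tree):
        if node["type"] == "TryStatement":
            tries += 1
        elif node["type"] == "CallExpression":
            callee = node["callee"]
            if callee["type"] == "Identifier":
                calls.append(callee["name"])
    return tries, calls


tries, calls = summarise(TREE)
print(tries, calls)
```

The same walk translates almost line for line to C#, because it depends only on the node `type` names and child fields of the shared format, not on any parser-specific classes.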
Check out IronJS; I know they have a pretty good JavaScript library for .NET.
IronJS
I need to generate Python code, or, to be more specific, IronPython code. I also need to be able to parse the code and load it into an AST. I just started looking at some tools. I played with "Oslo" and decided that it's not the right tool for me. I just looked very briefly at Coco/R and it looks promising.
Does anyone use Coco/R?
If you did, what's your experience with the tool?
Can you recommend some other tool?
The IronPython implementation itself includes a parser and an AST representation of Python programs which can be walked with a PythonWalker.
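As a rough idea of what parsing into an AST, walking it and generating code from a tree look like, here is a minimal sketch. It uses the `ast` module from CPython's standard library, not IronPython's own parser or its PythonWalker class, whose class and method names differ; `CallCollector` and `build_assignment` are hypothetical helper names.

```python
import ast

SOURCE = """
def greet(name):
    return "Hello, " + name

print(greet("world"))
"""


class CallCollector(ast.NodeVisitor):
    """Collect the names of plain function calls such as f(x)."""

    def __init__(self):
        self.calls = []

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name):
            self.calls.append(node.func.id)
        # Keep descending so that nested calls are found too.
        self.generic_visit(node)


def build_assignment(name, left, right):
    """Build the AST for `name = left + right` without writing source text."""
    assign = ast.Assign(
        targets=[ast.Name(id=name, ctx=ast.Store())],
        value=ast.BinOp(
            left=ast.Constant(value=left),
            op=ast.Add(),
            right=ast.Constant(value=right),
        ),
    )
    module = ast.Module(body=[assign], type_ignores=[])
    # Fill in the line/column information that compile() requires.
    return ast.fix_missing_locations(module)


# Parsing: source text -> AST -> visitor.
collector = CallCollector()
collector.visit(ast.parse(SOURCE))
print(collector.calls)

# Generating: AST -> code object -> execution.
namespace = {}
exec(compile(build_assignment("total", 1, 2), "<generated>", "exec"), namespace)
print(namespace["total"])
```

The visitor half corresponds to what a tree walker does; the second half shows that code can be generated by assembling nodes rather than concatenating strings.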
Not really my area of expertise but you might want to try ANTLR 4. It has support for generating Python 2 and Python 3.
I think you should look at the Dynamic Language Runtime. This will be a standard part of some later version of .NET and C# (.NET 4, from memory).
I've used it to compile and execute Python code generated at runtime, but I haven't played with all the AST stuff yet.
I'm looking for a turn-key ANTLR grammar for C# that generates a usable Abstract Syntax Tree (AST) and is either back-end language agnostic or targets C#, C, C++ or D.
It doesn't need to support error reporting.
P.S. I'm willing to do hardly any fix-up, as the alternative is not very hard.
This may be waaaay too late, but you can get a C# 4 grammar.
Here's a C# grammar link, as well as an overview of C# and ANTLR. There are others for the other languages you mentioned here.
The DMS Software Reengineering Toolkit provides a full, validated grammar for C# 1.2, 2.0 and 3.0 with generics and LINQ expressions.
It automatically builds ASTs, gives you programmatic access to the ASTs for analysis or transformation, and lets you apply source-to-source transformations that also directly manipulate the tree. The resulting AST can be prettyprinted back to source code, even retaining indentation and comments.
DMS also has mature front ends for other languages such as Java, PHP5, JavaScript, COBOL, C and C++.
EDIT: 1/31/2010: The DMS C# parser has been extended to handle full C# 4.0.
You can find a C# 6 ANTLR grammar in the official grammars repository.
I'm porting a Java library to C#. I'm using Visual Studio 2008, so I don't have the discontinued Microsoft Java Language Conversion Assistant program (JLCA).
My approach is to create a new solution with a similar project structure to the Java library, and then to copy the Java code into a C# file and convert it to valid C# line by line. Considering that I find Java easy to read, the subtle differences between the two languages have surprised me.
Some things are easy to port (namespaces, inheritance etc.) but some things have been unexpectedly different, such as visibility of private members in nested classes, overriding virtual methods and the behaviour of built-in types. I don't fully understand these things and I'm sure there are lots of other differences I haven't seen yet.
I've got a long way to go on this project. What rules of thumb can I apply during this conversion to manage the language differences correctly?
You're doing it in the only sane way you can. The biggest help will be this document from Dare Obasanjo that lists the differences between the two languages:
http://www.25hoursaday.com/CsharpVsJava.html
BTW, change all getter and setter methods into properties. There is no need to have the C# library function exactly like the Java library unless you are going for perfect interface compatibility.
A couple of other options worth noting:
J# is Microsoft's Java language implementation on .NET. You can access Java libraries (up to version 1.4*, anyways).
*actually Java 1.1.4 for java.io/lang, and 1.2 for java.util + keep in mind that J# end of life is ~ 2015-2017 for J# 2.0 redist
Mono's IKVM also runs Java on the CLR, with access to other .NET programs.
Microsoft Visual Studio 2005 comes with a "Java language conversion assistant" that converts Java programs to C# programs automatically for you.
One more quick-and-dirty idea: you could use IKVM to convert the Java jar to a .NET assembly, then use Reflector--combined with the FileDisassembler Add-in--to disassemble it into a Visual C# project.
(By the way, I haven't actually used IKVM--anyone care to vouch that this process would work?)
If you have a small amount of code then a line by line conversion is probably the most efficient.
If you have a large amount of code I would consider:
Looking for a product that does the conversion for you.
Writing a script (Ruby or Perl might be a good candidate) to do the conversion for you - at least the monotonous stuff! It could be a simple search/replace for keyword differences and renaming of files. Gives you more time/fingers to concentrate on the harder stuff.
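The scripted option above could start out as something like the following minimal sketch, written in Python rather than Ruby or Perl. The rule list is an illustrative assumption covering only a handful of keyword differences, and the function names are hypothetical; plain search/replace also rewrites matches inside comments and string literals, so the output still needs manual review.

```python
import re
from pathlib import PurePosixPath

# Illustrative Java -> C# rewrites only; a real port needs many more rules.
RULES = [
    (re.compile(r"\bboolean\b"), "bool"),
    (re.compile(r"\bString\b"), "string"),
    (re.compile(r"\bSystem\.out\.println\b"), "Console.WriteLine"),
    (re.compile(r"\s+extends\s+"), " : "),
]


def convert_line(line):
    """Apply every search/replace rule to a single line of Java."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line


def convert_source(text):
    """Convert a whole file's text line by line."""
    return "\n".join(convert_line(line) for line in text.split("\n"))


def target_name(java_path):
    """Map a .java path to the corresponding .cs path."""
    return str(PurePosixPath(java_path).with_suffix(".cs"))


JAVA = """\
class Dog extends Animal {
    boolean loud = true;
    void speak(String sound) {
        System.out.println(sound);
    }
}"""

print(target_name("src/Dog.java"))
print(convert_source(JAVA))
```

Keeping the rules in a single list makes it cheap to add a new mapping each time the manual conversion turns up another mechanical difference.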
I'm not sure that converting the code line by line is really the best way, especially if the obstacles become overwhelming. Of course the Java code gives you a guideline and the basic structure, but I think in the end the most important thing is that the library provides the same functionality as it does in Java.