I'm looking for a turn-key ANTLR grammar for C# that generates a usable Abstract Syntax Tree (AST) and is either back-end language agnostic or targets C#, C, C++ or D.
It doesn't need to support error reporting.
P.S. I'm willing to do hardly any fix-up, as the alternative is not very hard.
This may be waaaay too late, but you can get a C# 4 grammar.
Here's a C# grammar link, as well as an overview of C# and ANTLR. There are others for the other languages you mentioned here.
The DMS Software Reengineering Toolkit provides a full, validated grammar for C# 1.2, 2.0 and 3.0 with generics and LINQ expressions.
It automatically builds ASTs and gives you programmatic access to them for analysis or transformation, or you can apply source-to-source transformations that also directly manipulate the tree. The resulting AST can be prettyprinted back to source code, even retaining indentation and comments.
DMS also has mature front ends for other languages such as Java, PHP5, JavaScript, COBOL, C and C++.
EDIT: 1/31/2010: The DMS C# parser has been extended to handle full C# 4.0.
You can find a C# 6 ANTLR grammar in the official grammars repository.
Related
Is there any freeware C# parser that can be used as the parser in an ActiproSyntaxEditor control?
Your first choice should be Mono if you are looking for a C# parser, as it is always updated to stay in sync with Microsoft's C# compiler:
http://tirania.org/blog/archive/2010/Apr-27.html
However, you seem to be asking for one that can work with SyntaxEditor. This is strange, as this control already has C# support:
Syntax Languages
Syntax languages are a core piece of the text/parsing
framework. They basically encapsulate all functionality for a
particular code language that is being used within a SyntaxEditor
control. This is everything from various types of parsing all the way
to simpler features like determining word breaks or performing line
commenting.
Over 20 full source sample language definitions are included with
SyntaxEditor for common languages like Assembly, Batch files, C, C++,
C#, CSS, HTML, INI files, Java, JScript, Lua, MSIL, Pascal, Perl, PHP,
PowerShell, Python, RTF, SQL, VB.NET, VBScript, and XML. Custom
language definition can easily be created, thereby making it possible
to build a code editor for any proprietary language.
http://www.actiprosoftware.com/products/controls/windowsforms/syntaxeditor/editing
I've already tried to develop languages using C and C++, but how can I create an interpreted language using C#? Thanks.
PS: I want to build it to run in Windows Mobile devices.
Well... what have you tried in C and C++?
Developing a language isn't exactly child's play. Do you understand lexical analysis? Do you understand different types of parsers? Where an LR parser might be more appropriate than an LALR parser? Or vice-versa? Do you understand context-free grammars? Regular expressions?
That doesn't even begin to cover code generation, optimization, etc... (which may not all apply for an interpreted language, but you're still going to want to know a thing or two about them before you dive in).
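To make those terms a bit more concrete, here is a minimal sketch of the front half of an interpreter: a regular-expression lexer, a recursive-descent parser for a tiny context-free grammar, and a tree-walking evaluator. It is written in Python only for brevity. The grammar, the token names and the names `tokenize`, `Parser` and `evaluate` are all made up for this illustration; the same structure should carry over to C# more or less directly.

```python
import operator
import re

# Lexical analysis: one regular expression per token class.
TOKEN_RE = re.compile(r"(?P<NUM>\d+)|(?P<OP>[-+*/()])|(?P<WS>\s+)")

def tokenize(text):
    """Turn source text into a list of (kind, value) tokens."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":          # whitespace is skipped, not emitted
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

class Parser:
    """Recursive descent for the (hypothetical) grammar:
         expr   := term (('+' | '-') term)*
         term   := factor (('*' | '/') factor)*
         factor := NUM | '(' expr ')'
    """
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else (None, None)

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        tree = self.expr()
        if self.peek() != (None, None):
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.advance()[1]
            node = ("binop", op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.advance()[1]
            node = ("binop", op, node, self.factor())
        return node

    def factor(self):
        kind, value = self.advance()
        if kind == "NUM":
            return ("num", int(value))
        if (kind, value) == ("OP", "("):
            node = self.expr()
            if self.advance() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def evaluate(node):
    """Walk the AST (nested tuples) and compute its value."""
    if node[0] == "num":
        return node[1]
    _, op, left, right = node
    return OPS[op](evaluate(left), evaluate(right))
```

For example, `evaluate(Parser(tokenize("2 + 3 * (4 - 1)")).parse())` walks the tree built for that expression. A real language needs statements, variables, error recovery and so on, and a generated LR/LALR parser may suit it better than hand-written recursive descent; this only shows how the pieces fit together.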
You seem to be familiar with compiler construction, so I'll just point to the tools:
MPLEX and MPPG are respectively scanner and parser generators that generate a lot of what you need to build a compiler or interpreter using the C# language.
It seems that more documentation can be found in the .NET SDK, but I don't have it at hand so I'll just leave a pointer to MSDN.
If you want an interpreted language, developing it in C# is no different from developing it in C or C++.
If you want to compile to a (.Net) compiled language then .Net offers lots of possibilities through System.Reflection.Emit.
Is there any functionality built into the .NET framework somewhere to tokenize C# code? I'm not looking to build a tokenizer in C#, I'm looking for something that can tokenize C# source code.
The only thing that comes to mind is a parser generator like ANTLR, which has a sample C# grammar available. Bison/Flex also looks like it has a pretty decent C# grammar. Parsing any language and then actually making sense of it is fairly difficult, so I wish you the best of luck.
No, not built into the framework.
However, you may want to look at Irony and C# Parser on CodePlex, as they both provide a parser/lexer for at least simple C#.
The GOLD Parser too has a C# grammar (to parse C#), and run-time engines written in C# (so that you can execute that grammar using C# code).
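If all you need is a rough token stream rather than a full parse, a hand-rolled regular-expression tokenizer can go a long way. The following is only a sketch under heavy assumptions: it is written in Python for brevity, the keyword set and token classes are a small made-up subset, and it ignores verbatim strings, character literals, preprocessor directives, numeric suffixes and much else that the real C# lexical grammar defines. The tools mentioned above are the better choice for anything serious.

```python
import re

# Assumption: a handful of keywords, not the full C# keyword list.
KEYWORDS = {"class", "int", "string", "void", "return", "if", "else",
            "using", "namespace", "public", "static", "new"}

# Order matters: comments must be tried before the '/' operator.
TOKEN_SPEC = [
    ("COMMENT", r"//[^\n]*|/\*.*?\*/"),
    ("STRING",  r'"(?:\\.|[^"\\\n])*"'),
    ("NUMBER",  r"\d+(?:\.\d+)?"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"==|!=|<=|>=|&&|\|\||[-+*/%=<>!]"),
    ("PUNCT",   r"[{}()\[\];,.]"),
    ("WS",      r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,  # lets /* ... */ comments span several lines
)

def tokenize(source):
    """Return a list of (kind, text) pairs, dropping whitespace and comments."""
    tokens, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = m.lastgroup, m.group()
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        if kind not in ("WS", "COMMENT"):
            tokens.append((kind, text))
        pos = m.end()
    return tokens
```

Calling `tokenize('int x = 42; // answer')` yields a keyword, an identifier, an operator, a number and a punctuation token, with the comment dropped.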
I want to write a simple DSL in C#. Nothing too complicated. I'm looking for the .NET equivalent of Lex & Yacc. The easiest one I've found so far is GOLD Parser builder. The other choice is to use the lex & yacc available with F#, but I'm not keen to program in F# right now.
If you have any suggestions for the .NET version of Lex or Yacc, I'd love to hear them!
Thanks!
If you really want to stay in C#, I would recommend using the Irony toolkit - it allows you to specify grammars in C# code.
ANTLR 3 has a C# target.
How much F# programming do you need to do to take advantage of the Lex & Yacc? Can you throw what you need into an F# dll, and reference it from a C# project?
I don't know if it's what you're looking for, but Oslo has the ability to create a textual DSL.
It's got more features than that, but you can ignore all the repository and Grand Vision stuff and just produce a grammar you can use to parse your DSL into an AST. Alternatively, you can take advantage of the built-in support for parsing into a set of rows in a database.
There is also a C# Lex and C# CUP that I found on a Vienna University website
C# Lex Manual
I am working on a reverse engineering school project, which requires me to translate and manipulate the AST of a compiled C# project. I have seen the post "Translate C# code into AST?" on this website, but it doesn't look like what I am looking for.
As far as I know, C# currently doesn't provide a library class like this one for Java: http://help.eclipse.org/help33/index.jsp?topic=/org.eclipse.cdt.doc.isv/reference/api/org/eclipse/cdt/core/dom/ast/ASTVisitor.html. If there is such a library class in C#, everything here is solved.
I have consulted with someone, and here are the possible solutions. But I have problems working these out as well:
Find another compiler that provides a library which exposes its AST for manipulation. But I can't find a compiler like that.
Use the ANTLR Parser Generator to come up with my own compiler that does that (it will be a much longer and more difficult process). The download there provides sample grammars for different languages but not C# (it has grammars written in various languages, including C#, but none that describes C# itself). Hence the problem is that I can't find a C# grammar.
What is the shortest and fastest way to approach this issue? If I really have to take one of the alternatives above, how should I go about solving the problems I faced?
I know the answer for this one was accepted long ago. But I had a similar question and wasn't sure of the options out there. I did a little investigation of the NRefactory library that ships as part of SharpDevelop. It does generate an AST from C# code.
Here's an image of the NRefactory demo application that is part of the SD source code. Type in some C# code and it generates and displays the AST in a treeview.
Why don't you try NRefactory? I've seen it discussed for AST work on some SharpDevelop forums.
Here is an article on CodeProject regarding this topic.
ANTLR is not a good choice. I am now trying out Mono Cecil instead. Mono Cecil is good for analyzing any source code that can be compiled into Common Intermediate Language (CIL). The disadvantage is that it doesn't have proper documentation.
I've just posted an answer on another thread here on StackOverflow describing a solution where I implemented an API to create and manipulate an AST from C# source code.
A full C# 3.0 parser is available with our DMS Software Reengineering Toolkit (DMS for short). It has been used to process tens of thousands of C# files accurately. It provides automated AST building, tree traversals, surface-syntax pattern matching and transformation and lots more.
As a commercial product it might not work out for a student project.
ANTLR arguably offers a C# parser, but I don't know how complete or robust it is, or whether it actually builds ASTs.
[EDIT Jan 25 2010: C# 4.0 parser now available for DMS with all the above properties]
[EDIT May 2016: C# 6.0 parser available for DMS.]