Parsing C# and VB.Net code to automate the code review process - c#

The customer has millions of lines of code (VB.Net and C#) and wants us to develop a tool to estimate the quality of the code.
The information the customer wants includes:
1) how many lines of comments there are in one code file
2) how many functions are implemented in one class
3) whether every possible exception is wrapped in a try/catch block
4) how many attributes are attached to one function
5) ... (the customer said that the tool we provide should be configurable and extensible so that they can implement more ideas later)
We plan to write a VS.Net add-on that can parse the code of the currently open project in real time. The interesting part is that we need to parse both C# and VB.Net code.
Please kindly provide some tips on how to start this interesting task.
Thanks in advance!

You ask a very broad question, but you should begin by studying the APIs of existing parsers.
Once you do that, you're golden.
For example, look at this SO question, which lists some parsers for C#. Of course you could write your own, but I don't see any reason to, since the task is not easy.
The parser gives you an AST, and once you have that, you have all the information you want.
Keep in mind that if you reference a type that isn't declared in the file, you will have to get it from another file, and it could also be a type from .NET itself. So there is definitely more work to be done.
To go through your list:
1) how many lines of comments in one code file: You can find this through your C# parser of choice. Parsers recognize comments as well.
2) how many functions implemented in one class: Likewise, this should be very easy.
3) whether all possible exceptions have been wrapped by a try/catch block: Likewise, just find the places where exceptions are thrown (the parser is likely to have a special type for language keywords, so looking for throw should be easy).
4) how many attributes attached to one function: And... likewise.
5) ... (the customer said that the tool we provide should be configurable and extensible so that they can implement more ideas later): This shouldn't differ from any other project. Just make sure you're using good design principles: keep everything abstract, use interfaces wisely, build your work in layers, etc.
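To make item 1 concrete, here is a minimal sketch of counting comment lines without a full parser. It is written in Python purely for brevity and is not what any of the C# parsers mentioned above do internally. The function name count_comment_lines is made up for illustration, and the scanner assumes ordinary string and character literals only: verbatim strings (@"...") and preprocessor directives are not handled, which is exactly the kind of detail a real parser takes care of.

```python
def count_comment_lines(source: str) -> int:
    """Count lines that contain at least part of a // or /* */ comment.

    Assumption: only regular "..." and '...' literals occur; C# verbatim
    strings (@"...") are NOT handled by this sketch.
    """
    comment_lines = set()
    line = 1
    i = 0
    n = len(source)
    state = "code"  # code, line_comment, block_comment, string, char
    while i < n:
        ch = source[i]
        nxt = source[i + 1] if i + 1 < n else ""
        if state == "code":
            if ch == "/" and nxt == "/":
                state = "line_comment"
                comment_lines.add(line)
                i += 1  # consume the second slash
            elif ch == "/" and nxt == "*":
                state = "block_comment"
                comment_lines.add(line)
                i += 1  # consume the asterisk
            elif ch == '"':
                state = "string"
            elif ch == "'":
                state = "char"
        elif state == "line_comment":
            if ch == "\n":
                state = "code"
        elif state == "block_comment":
            comment_lines.add(line)
            if ch == "*" and nxt == "/":
                state = "code"
                i += 1  # consume the closing slash
        else:  # inside a string or character literal
            if ch == "\\":
                i += 1  # skip the escaped character
            elif (state == "string" and ch == '"') or (state == "char" and ch == "'"):
                state = "code"
        if ch == "\n":
            line += 1
        i += 1
    return len(comment_lines)
```

Calling count_comment_lines on the text of a file gives the number of lines touched by a comment. Comment markers inside string literals are ignored, which is the case a plain text search for "//" gets wrong.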

You can use Roslyn. For C# you can also use NRefactory.

Have a look at StyleCop; you may be able to add rules to get the information you want.
http://stylecop.codeplex.com/

Related

How to analyse usage of my C# code (and some SQL code also) - in terms of number of methods/classes and number of times they are called

At a high level, my problem is this:
We have a couple of applications with millions of lines of legacy code (C# and SQL). I need to figure out which areas of the code are used most.
It may not be possible to find exact figures (especially in apps where code is called based on the user's actions in the GUI).
However, to get some rough figures, my thoughts so far are to:
1) Find the list of classes and methods
2) Find the number of times they are called from within the code (by means of direct method calls, delegates, etc.)
3) Find all the stored procs/DB functions (this should be fairly straightforward)
4) Find all the calls to stored procs
Could you please let me know if you are aware of any tools to achieve this?
Or any other idea for fetching the above four details? Also, apart from these, is there any other way to do this analysis?
Thanks in advance!
I have used Red Gate's ANTS Profiler before:
http://www.red-gate.com/products/dotnet-development/ants-performance-profiler/
It's powerful and very easy to use (it comes with a Visual Studio plugin). 14 days free!
One way you could achieve this is using Aspect Oriented Programming (AOP). I have used this previously in Java with the Spring Framework, but haven't used it before on .NET projects.
You could check out something like:
http://blogs.msdn.com/b/morgan/archive/2008/12/18/method-entry-exit-logging.aspx
This will give you an idea of how frequently methods are being called. You will simply need to collate the data in the logs into some form that gives you an overall idea of the usage patterns of the codebase.
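The collation step could look something like the sketch below. It is in Python for brevity, and the log format is an assumption: each entry line is taken to contain the word ENTER followed by a fully qualified method name. The linked article's actual output format may differ, so the pattern would need adjusting. The names ENTRY_PATTERN and count_method_entries are invented for this example.

```python
import re
from collections import Counter

# Assumed (hypothetical) log line shape:
#   2011-03-01 10:00:00 ENTER MyApp.Orders.OrderService.Save
ENTRY_PATTERN = re.compile(r"\bENTER\s+(?P<method>[\w.]+)")


def count_method_entries(log_lines):
    """Return a Counter mapping method name to number of logged entries."""
    counts = Counter()
    for line in log_lines:
        match = ENTRY_PATTERN.search(line)
        if match:
            counts[match.group("method")] += 1
    return counts
```

With the counts in hand, counts.most_common(20) lists the twenty most frequently entered methods, which is a reasonable first approximation of the "most used" areas of the code.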
Edit:
Further information on using this method can be found in other SO posts:
Logging entry and exit of methods along with parameters automagically?
https://stackoverflow.com/a/25825/685760

Automating complex refactoring tasks

I have a situation where the same repetitive refactoring task has to be done for a huge number of methods in my code.
For example, imagine an interface with 100 methods, each of which has one or more parameters as well as a return value. For each of these methods I need to jump to the implementation, change the return type, and add a line of code which converts the old return value to its new type for callers of the interface method.
Is there any way to quickly automate such refactorings?
I even thought of writing a custom script to do it, but writing an intelligent script would probably take longer than doing it manually.
A tool supporting such a task could save a lot of time.
It's a good question, but in the time it took since you posted it (not to mention the time you spent searching for an answer before posting), you could have completed the changes manually.
I know, I know, it's utterly unsatisfying, but if you think of it as a form of meditation, and only do this once a year, it's not that bad.
If your problem is one interface with 100 methods, then I agree with another poster: just doing it may seem painful but it is limited in effort and you can be done really soon.
If you have this problem repeatedly, or you have a very large code base (many, many interfaces for which you want to perform this task), then what you need is a tool for implementing automated change: a program transformation engine. Such a tool can parse source code, build a program representation (an abstract syntax tree), and let you apply "scripted" operations to the tree, through procedural interfaces and/or through source-to-source transformation patterns.
Our DMS Software Reengineering Toolkit is such a program transformation system. It has a C# Front End that enables its application to C# code. Configuring such a tool for a complex task is not a matter of hours, so it is not useful for "small scale" changes. For large scale changes, such tools can make it possible to do things that are simply not practical by hand.
Resharper and CodeRush both have features which can help with this kind of task.
Resharper's change signature functionality is probably the closest match.
Can't you generate a new interface from the class you have and then remove the members you don't need, if it's that simple?
Change the return type: by changing... the return type, provided it is not a standard type (...), and the converter can be implemented by a TypeConverter.
When I have such a boring task to do, I often switch away from VS2010 and use a tool that allows regex search and replace. In your example, maybe replace 'return xxx;' with 'var yyy=convert(xxx); return yyy;'
(for example, the free editor Notepad++ already offers quite a few possibilities for changing everything in a project; use with caution)
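The regex replacement suggested above can be sketched as follows. This is Python rather than an editor's search-and-replace dialog, but the pattern idea is the same. The helper name wrap_return_values, the variable name converted and the default converter name convert are placeholders, not part of any real API. As the answer warns, this is purely textual: it will also rewrite matches inside comments or strings, and two rewritten returns in the same scope would declare the same variable twice, so the result needs review.

```python
import re

# Matches "return <expression>;" but not a bare "return;".
RETURN_STATEMENT = re.compile(r"\breturn\s+(?P<expr>[^;]+);")


def wrap_return_values(source, converter="convert"):
    """Rewrite 'return xxx;' as 'var converted = convert(xxx); return converted;'."""
    def replace(match):
        expr = match.group("expr").strip()
        return f"var converted = {converter}({expr}); return converted;"

    return RETURN_STATEMENT.sub(replace, source)
```

Running the function over the text of each implementing file and diffing the result before saving keeps the "use with caution" part manageable.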

Code parsing C#

I am researching ways, tools and techniques to parse code files in order to support syntax highlighting and IntelliSense in an editor written in C#.
Does anyone have any ideas, patterns and practices, tools or techniques for that?
EDIT: A nice source of info for anyone interested:
Parsing beyond Context-free grammars
ISBN 978-3-642-14845-3
My favourite parser for C# is Irony: http://irony.codeplex.com/ - I have used it a couple of times with great success.
Here is a wikipedia page listing many more: http://en.wikipedia.org/wiki/Compiler-compiler
There are two basic approaches:
1) Parse the entire solution and everything it references so you understand all the types involved in the code
2) Parse locally and do your best to guess what types etc are.
The trouble with (2) is that you have to guess, and in some circumstances you just can't tell from a code snippet exactly what everything is. But if you're happy with the sort of syntax highlighting shown on (e.g.) Stack Overflow, then this approach is easy and quite effective.
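As a rough illustration of approach (2), the sketch below classifies tokens using nothing but local, regex-based rules. It is in Python for brevity (the same idea carries over to an editor written in C#), the keyword set is deliberately tiny, and the names CSHARP_KEYWORDS, TOKEN_PATTERN and classify_tokens are invented for this example. Note the guessing problem described above: any word that is not a keyword is simply labelled an identifier, because a local scan cannot tell a type name from a variable name.

```python
import re

# A deliberately small, incomplete keyword list for illustration only.
CSHARP_KEYWORDS = {
    "class", "public", "private", "static", "void", "int", "string",
    "return", "if", "else", "new", "using", "namespace",
}

TOKEN_PATTERN = re.compile(
    r'''
    (?P<comment>//[^\n]*|/\*.*?\*/)      # line or block comment
  | (?P<string>"(?:\\.|[^"\\\n])*")      # regular string literal
  | (?P<number>\b\d+(?:\.\d+)?\b)        # integer or decimal number
  | (?P<word>[A-Za-z_]\w*)               # keyword or identifier
  | (?P<other>\s+|.)                     # whitespace and punctuation
    ''',
    re.VERBOSE | re.DOTALL,
)


def classify_tokens(source):
    """Return (kind, text) pairs suitable for driving simple highlighting."""
    tokens = []
    for match in TOKEN_PATTERN.finditer(source):
        kind = match.lastgroup
        text = match.group()
        if kind == "other":
            continue  # whitespace and punctuation are left unstyled
        if kind == "word":
            kind = "keyword" if text in CSHARP_KEYWORDS else "identifier"
        tokens.append((kind, text))
    return tokens
```

An editor would map each kind to a colour. Getting beyond this, for example colouring type names differently, is where the type information from approach (1) becomes necessary.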
To do (1) then you need to do one of (in decreasing order of difficulty):
Parse all the source code. Not possible if you reference 3rd party assemblies.
Use reflection on the compiled code to garner type information you can use when parsing the source.
Use the host IDE's code element interfaces (if available, so not applicable in your case!) to provide the information you need.
You could take a look at how http://www.icsharpcode.net/ did it. They wrote a book about doing just that, Dissecting a C# Application: Inside SharpDevelop. It even has a chapter called "Implement a parser to provide syntax highlighting and auto-completion as users type".

Preprocessing C# - Detecting Methods

I require the ability to preprocess a number of C# files as a prebuild step for a project, detect the start of methods, and insert generated code at the start of the method, before any existing code. I am, however, having a problem detecting the opening of a method. I initially tried a regular expression to match, but ended up with far too many false positives.
I would use reflection, but the MethodInfo class does not reference the point in the original source.
EDIT: What I am really trying to do here is to support pre-conditions on methods, that pre-condition code being determined by attributes on the method. My initial thought being that I could look for the beginning of the method, and then insert generated code for handling the pre-conditions.
Is there a better way to do this? I am open to creating a Visual Studio Addin if need be.
This is a .NET 2.0 project.
Cheers
PostSharp or Mono.Cecil will let you do this cleanly by altering the generated code, without getting into writing a C# parser, which is unlikely to be core business for you...
I haven't done anything of consequence with PostSharp, but I would guess it's more appropriate than Mono.Cecil for implementing something like preconditions or AOP. Alternatively, you might be able to do something AOP-like with a DI container such as Ninject.
But of course the applicability of this idea depends on details you haven't given: you didn't say much other than that you wanted to insert code at the start of methods...
EDIT: In light of your desire to do preconditions... Code Contracts in .NET 4 is definitely in that direction.
What sort of tool do you have? What's wrong with shipping a single Mono.Cecil.dll? Either way, something other than a parser is the tool for the job.
I am sure there is an easier way, but this might be a good excuse to take MGrammar for a spin.

How to make a "Call stack diagram"

Creating a call stack diagram
We have just recently been thrown into a big project that requires us to get into the code (duh).
We are using different methods to get acquainted with it: breakpoints, etc. However, we found that one useful method is to make a call tree of the application. What is the easiest/fastest way to do this?
By code? Plugins? Manually?
The project is a C# Windows application.
With the static analyzer NDepend, you can obtain a static method call graph, like the one below. Disclaimer: I am one of the developers of the tool
For that you just need to export to the graph the result of a CQLinq code query:
Such a code query can be generated for any method via the right-click menu illustrated below.
Whenever I start a new job (which is frequently, as I am a contractor) I spend two to three days reading through every single source file in the repository and keep notes against each class in a simple text file. It is quite laborious, but it means that you get a really good idea of how the project fits together, and you have a trusty map when you need to find the class that does something.
Although I love UML/diagramming when starting a project, I personally do not find diagrams at all useful when examining existing code.
Not a direct answer to your question, but NDepend is a good tool to get a 100ft view of a codebase, and it enables you to drill down into the relationships between classes (and many other features)
Edit: I believe Microsoft's CLR Profiler is capable of displaying a call tree for a running application. If that is not sufficient, I have left the link I posted below in case you would like to start on a custom solution.
Here is a CodeProject article that might point you in the right direction:
The download offered here is a Visual Studio 2008 C# project for a simple utility to list user function call trees in C# code.
This call tree lister seems to work OK for my style of coding, but will likely be unreliable for some other styles of coding. It is offered here with two thoughts: first, some programmers may find it useful as is; second, I would be appreciative if someone who is up-to-speed on C# parsing would upgrade it by incorporating an accurate C# parser and turn out an improved utility that is reliable regardless of coding style.
The source code is available for download - perhaps you can use this as a starting point for a custom solution.
You mean something like this: http://erik.doernenburg.com/2008/09/call-graph-visualisation-with-aspectj-and-dot/
Not to be a stuck record, but if I get it running and pause it a few times, and each time capture the call stack, that gives me a real good picture of the call structure that accounts for the most time. It doesn't give me the call structure for things that happen real fast, however.
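The pause-and-capture approach amounts to aggregating a handful of stack samples. A minimal sketch in Python, assuming each captured call stack has been written down as a list of function names from outermost to innermost (the function name frame_inclusive_share and the frame names used in the usage note are made up for illustration):

```python
from collections import Counter


def frame_inclusive_share(stack_samples):
    """Return the fraction of samples in which each function is on the stack."""
    if not stack_samples:
        return {}
    counts = Counter()
    for stack in stack_samples:
        # Count each function once per sample, so recursion is not double counted.
        counts.update(set(stack))
    total = len(stack_samples)
    return {name: hits / total for name, hits in counts.items()}
```

For example, given the four samples ["Main", "Load", "Parse"], ["Main", "Load", "Parse"], ["Main", "Render"] and ["Main", "Load", "ReadFile"], the function named Load is on the stack in three of four samples, so roughly three quarters of the time is spent somewhere underneath it. As noted above, anything that runs too quickly to be caught by a pause will not show up at all.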
