Static Code Analysis - Which ones to turn on first? - c#

We're using VS2008 with the built in static code analysis rule set.
We've got a relatively large C# solution (150+ projects) and while some of the projects (< 20) are using static code analysis religiously, most are not. We want to start enforcing static code analysis on all projects, but enabling all rules would create a massive distraction to our current projects. Which of the many static code analysis rules that are available should we turn on first? Which rules have the biggest bang for the buck? If you could give me your prioritized top 20, I'd greatly appreciate it.
Thanks in advance,
--Ed.S.

The very first rules you should activate for a project are those for which you don't yet have any violations in that project. This will allow you to avoid introducing new problems without costing you any additional clean-up effort.
As for the rest, given that you're already using code analysis on other projects, your best input for which rules are most likely to be broken with serious consequences is probably the developers who work on those projects. If you don't have enough overlap between projects to get meaningful feedback from developers, you might want to consider starting with the rules that are included in the Microsoft Minimum Recommended Rules rule set in Visual Studio 2010.
If you are planning on actually cleaning up existing violations in any given project, you may want to consider using FxCop instead of VS Code Analysis until the clean-up is complete. This would allow you to activate rules immediately while keeping "for clean-up" exclusions of existing violations outside your source code.

Given that the Studio ones are similar to FxCop's rules, I can tell you which ones I'd turn on last.
If internationalization is not on the horizon, turn off Globalization Rules.
Turn off Performance Rules initially. Optimize when you need to.
Fit the others to your team and your projects. Turn off individual rules that aren't applicable. In particular, Naming Rules may need to be adjusted.
EDIT: The most important thing is to reduce noise. If every project has 200 warnings and stays that way for months, everyone will ignore them. Turn on the rules that matter to your team, clean up the code to get 100% passing (or suppress the exceptions - and there will be exceptions; these are guidelines), then enforce keeping the code clean.
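For reference, in-code suppressions are just an attribute; a minimal sketch (the class and justification text are invented, and note the attribute only takes effect when the CODE_ANALYSIS symbol is defined):

```csharp
using System.Diagnostics.CodeAnalysis;

public class ReportImporter
{
    // CA1031 is a real FxCop/Code Analysis rule; recording a justification
    // makes the suppression document itself during review.
    [SuppressMessage("Microsoft.Design", "CA1031:DoNotCatchGeneralExceptionTypes",
        Justification = "The import loop must survive any single record failing; failures are logged.")]
    public void ImportAll()
    {
        // ...
    }
}
```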

If you are going to localize your project, or it is going to be used in different countries, then definitely enable the Globalization rules. They will find calls to all sorts of Format/Parse functions that do not specify a CultureInfo. Bugs involving an unspecified CultureInfo are hard to find in testing, but they will really bite you in the ass when your French client asks why your program does not work (or crashes) on numbers with "," as the decimal separator.
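A small illustration of the kind of bug these rules catch (minimal sketch):

```csharp
using System.Globalization;

class CultureDemo
{
    static void Main()
    {
        // User input should be parsed with the user's culture: in fr-FR,
        // ',' is the decimal separator, so "1,5" means one and a half.
        double fromUser = double.Parse("1,5", new CultureInfo("fr-FR"));   // 1.5

        // Data files and wire formats should pin a culture explicitly,
        // or round-tripping breaks the moment the machine's locale changes.
        string stored = 1.5.ToString(CultureInfo.InvariantCulture);        // "1.5"
        double back = double.Parse(stored, CultureInfo.InvariantCulture);  // 1.5
    }
}
```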

In my experience code analysis warnings of all types show 'hidden' bugs or flaws in your code. Fixing these can solve some real problems. I have not found a list of warnings that I would like to disable.
Instead, I would turn them on one project at a time and fix all the warnings in that project before moving to the next.
If you want to turn things off, I would consider skipping the Naming rules (unless you are shipping a library, APIs, or other externally exposed methods) and the Globalization rules (unless your applications make active use of globalization). Which of these makes sense depends a bit on your situation.

I somewhat agree with Jeroen Huinink's answer.
I would turn on all the rules that you think a project should follow and fix them as soon as possible. You don't have to fix them all now, but as you go through and fix a defect or refactor a method in a module, you can always clean up the issues found by static analysis in that method or module. New code should adhere to your rules and existing code should be transformed into adherence as quickly as possible, but you don't need to drop everything to make that happen.
Your development team can also look at the issues for a project and prioritize them, perhaps filing defects in your issue tracking system for the most critical problems so that they are addressed quickly and by the appropriate developer.

Related

Cross-reference code and business requirements

I was curious whether a tool exists to collect metadata references in code, to somehow link business requirements with sections of code.
I'm currently working on a legacy system that does not do anything like this, so I'm envisioning a Visual Studio extension that allows me to define Metadata tags. Once they are defined, I can add them to sections of code such that they are searchable.
So, for example, if I am working on the billing subsystem, perhaps I add the [Billing] tag so the next developer knows the specific part of the code that I used as an entry point into the code.
Is this a thing? Or could this even be leveraged to be useful? I just find that I am often lost for months learning a new system and always wished there was a way to search the code for business requirements. Or at least had a dictionary of search terms.
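To make it concrete, here is roughly the kind of tag I have in mind, sketched as a custom attribute (all names hypothetical):

```csharp
using System;

// Hypothetical requirement tag, as described above.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method, AllowMultiple = true)]
public sealed class RequirementAreaAttribute : Attribute
{
    public string Area { get; private set; }

    public RequirementAreaAttribute(string area)
    {
        Area = area;
    }
}

[RequirementArea("Billing")]
public class InvoiceGenerator
{
    // The next developer can find billing entry points by searching
    // for [RequirementArea("Billing")], or enumerate them via reflection.
}
```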
I think a problem is that business requirements may not map directly to any kind of module, but may be spread over the code-base. So where would you put your tags?
Requirements may also change, so you might have tags linking to requirements that may no longer accurately describe the current behavior.
I would propose to instead document such requirements through tests - preferably automated tests whenever possible, though in some cases manual tests might be appropriate. This should let you know whenever a requirement is no longer fulfilled, and that lets you either change the test or the product. Such tests can also be useful if you are new to the code base, to gain some understanding of how the code is intended to work.
Having a good architecture, appropriate code comments, and some kind of project architecture documentation are other common tools to make familiarization easier, but it is fairly rare that all of these things exist and are up to date.
Linking source code to external systems can be somewhat risky, since code tends to outlive systems and people. I have worked with code bases that have gone through at least 4 different source control systems and three different issue trackers. And even if you think having such tags is the best thing since sliced bread, your successor might consider it unnecessary bloat.

A New and Full Implementation of Generic Intellisense

I am interested in writing a generic Intellisense enabled editor for SQL and C# (et al. if possible!). I would like to do this in C# as an overridden or extended WPF richTextBox-type control. I know there are many example projects available and I have implemented a basic version of my own; but most of the examples that I have come across (and indeed my own) are just that, basic.
A couple of code examples are:
DIY Intellisense By yetanotherchris
CodeTextBox - another RichTextBox control with syntax highlighting and Intellisense By Tamas Honfi
I have, however, found a great example of an SQL editor with Intellisense: QueryCommander SQL Editor by Mikael Håkansson, which seems to work well. Microsoft must use an XML library of command keywords, but my question is: How (in detail) do Microsoft implement their Intellisense (as-you-type Intellisense) and how hard would it be for me to create my own of the same standard?
Edit A: A year on and I have managed to develop my own editor control with basic intellisense, mainly for my own "enjoyment". I thought I would come back and provide a list of freely available .NET projects that helped me with my own development and can be used out-of-the-box and free of charge:
ICSharpCode (WinForms)
AvalonEdit (WPF)
ScintillaNET (WinForms)
Query Commander [for example of intellisense implementation] (WinForms)
Edit B: 15 months after the question was asked I am still looking for new improved editors. This one is nice...
RoslynPAD is cool!
Edit C: 2 years+ on from the question, I have found the following projects, both using WPF and backed by AvalonEdit.
CodeCompletion for AvalonEdit using NRefactory. This project is really nice and has a full implementation of intellisense using NRefactory.
ScriptCS. ScriptCS makes it easy to write and execute C# with a simple text editor.
How (in detail) do Microsoft implement their as-you-type Intellisense?
I can describe it to any level of detail you care to name, but I don't have the time for more than a brief explanation. I'll explain how we do it in Roslyn.
First, we build an immutable model of the token stream using a data structure that can efficiently represent edits, since obviously edits are precisely what there are going to be a lot of.
The key insight to making it efficient for persistent reuse is to represent the character lengths of the tokens but not their character positions in the edit buffer; remember, a token at the end of the file is going to change position on every edit but the length of the token does not change. You must at all costs minimize the number of total re-lexings if you want to be efficient on extremely large files.
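To illustrate the idea, here is a toy sketch (nothing like the real Roslyn data structures) in which tokens carry only their widths, and absolute positions are derived on demand:

```csharp
// Toy sketch: tokens store widths, not absolute offsets, so an edit near
// the top of the file doesn't invalidate every token after it.
sealed class Token
{
    public readonly string Kind;
    public readonly int Width;   // length in characters; stable across edits

    public Token(string kind, int width) { Kind = kind; Width = width; }
}

sealed class TokenStream
{
    private readonly Token[] _tokens;

    public TokenStream(Token[] tokens) { _tokens = tokens; }

    // Absolute position is computed, not stored: sum the widths of the
    // preceding tokens. A real implementation would do this lazily over
    // a tree rather than a flat array, but the principle is the same.
    public int PositionOf(int index)
    {
        int pos = 0;
        for (int i = 0; i < index; i++) pos += _tokens[i].Width;
        return pos;
    }
}
```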
Once you have an immutable model that can handle inserts and deletions to build up an immutable token stream without re-lexing the entire file every time, you then have to do the same thing, but for grammatical analysis. This is in practice a considerably harder problem. I recommend that you obtain an undergraduate or graduate degree in computer science with an emphasis on parser theory if you have not already. We obtained the help of people with PhDs who did their theses on parser theory to design this particular bit of the algorithm.
Then, obviously, build a grammatical analyzer that can analyze C#. Remember, it has to analyze broken C#, not correct C#; IntelliSense has to work while the program is in a non-compiling state. So start by coming up with modifications to the grammar that have good error-recovery characteristics.
OK, so now you've got a parser that can efficiently do grammatical analysis without re-lexing or re-parsing anything but the edited region, most of the time, which means that you can do the work between keystrokes. I forgot to mention, of course you will need to come up with some mechanism to not block the UI thread while doing all of these analyses should the analysis happen to take longer than the time between two keystrokes. The new "async/await" feature of C# 5 should help with that. (I can tell you from personal experience: be careful with the proliferation of tasks and cancellation tokens. If you are careless, it is possible to get into a state where there are tens of thousands of cancelled tasks pending, and that is not fast.)
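As an illustration of that keystroke/cancellation pattern, a minimal sketch using Task and CancellationTokenSource (the analysis body is a stand-in):

```csharp
using System.Threading;
using System.Threading.Tasks;

sealed class BackgroundAnalyzer
{
    private CancellationTokenSource _pending;

    // Called on every keystroke: cancel the in-flight analysis instead of
    // letting stale work queue up behind the UI.
    public void OnTextChanged(string snapshot)
    {
        if (_pending != null) _pending.Cancel();
        _pending = new CancellationTokenSource();
        CancellationToken token = _pending.Token;

        Task.Run(() => Analyze(snapshot, token), token);
    }

    private void Analyze(string snapshot, CancellationToken token)
    {
        // Check the token at natural boundaries so cancelled work actually
        // stops, rather than piling up as thousands of zombie tasks.
        token.ThrowIfCancellationRequested();
        // ... lex, parse, and analyze the snapshot here ...
    }
}
```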
Now that you've got a grammatical analysis you need to build a semantic analyzer. Since you are only doing IntelliSense, it does not need to be a particularly sophisticated semantic analyzer. (Our semantic analyzer must do an analysis suitable for generating code from correct programs and correct error analysis from incorrect programs.) But of course, again it has to do good semantic analysis on broken programs, which does increase the complexity considerably.
My advice is to start by building a "top level" semantic analyzer, again using an immutable model that can persist the state of the declared-in-source-code types from edit to edit. The top level analyzer deals with anything that is not a statement or expression: type declarations, directives, namespaces, method declarations, constructors, destructors, and so on. The stuff that makes up the "shape" of the program when the compiler generates metadata.
Metadata! I forgot about metadata. You'll need a metadata reader. You need to be able to produce IntelliSense on expressions that refer to types in libraries, obviously. I recommend using the CCI libraries as your metadata reader, and not Reflection. Since you are only doing IntelliSense, obviously you don't need a metadata writer.
Anyway, once you have a top-level semantic analyzer, then you can write a statement-and-expression semantic analyzer that analyzes the types of the expressions in a given statement. Pay particular attention to name lookup and overload resolution algorithms. Method type inference will be particularly tricky, especially inside LINQ queries.
Once you've got all that, an IntelliSense engine should be easy; just work out the type of the expression at the current cursor position and display a dropdown appropriately.
how hard would it be for me to create my own of the same standard?
Well, we've got a team of, call it ten people, and it'll probably take, call it five years all together to get the whole thing done from start to finish. But we have lots more to do than just the IntelliSense engine. That's maybe only 40% of the work. Oh, and half those people work on VB, now that I think about it. But those people have on average probably five or ten years experience in doing this sort of work, so they're faster at it than you will be if you've never done this before.
So let's say it should take you about ten to twenty years of full time work, working alone, to build a Roslyn-quality IntelliSense engine for C# that can do acceptably-close-to-correct analysis of large programs in the time between keystrokes.
Longer if you need to do that PhD first, obviously.
Or, you could simply use Roslyn, since that's what it's for. That'll take you probably a few hours, but you don't get the fun of doing it yourself. And it is fun!
You can download the preview release here:
http://www.microsoft.com/download/en/details.aspx?id=27746
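As a minimal sketch of what "simply use Roslyn" can look like with the Microsoft.CodeAnalysis packages that eventually shipped (the snippet and names are illustrative, not the 2011 preview API):

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

class CompletionSketch
{
    static void Main()
    {
        // Note the code is deliberately broken (the statement after "s." is
        // incomplete); the parser recovers and the model still answers questions.
        string code = "class C { void M() { string s = \"\"; s. } }";
        SyntaxTree tree = CSharpSyntaxTree.ParseText(code);

        CSharpCompilation compilation = CSharpCompilation.Create("demo")
            .AddReferences(MetadataReference.CreateFromFile(typeof(object).Assembly.Location))
            .AddSyntaxTrees(tree);

        SemanticModel model = compilation.GetSemanticModel(tree);

        // Everything visible at the cursor position; a real engine would also
        // pass the container type to get member completion after "s.".
        int position = code.IndexOf("s.", StringComparison.Ordinal) + 2;
        foreach (ISymbol symbol in model.LookupSymbols(position).Take(10))
            Console.WriteLine(symbol.Name);
    }
}
```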
This is an area where Microsoft typically produces great results - Microsoft developer tools really are awesome. And there is a clear commercial advantage for sales of their developer tools and for sales of Windows to having the best intellisense, so it makes sense for Microsoft to devote the kind of resources Eric describes in his wonderfully detailed answer. Still, I think it's worth pointing out a few things:
Your customers may not actually need all the features that Microsoft's implementation provides. The Microsoft solution might be incredibly over-engineered in terms of the features that you need to provide to your customers/users. Unless you're actually implementing a generic coding environment that is intended to be competitive with Visual Studio, it is likely that there are aspects of your intended use that either simplify the problem, or allow you to make compromises on the solution that Microsoft feels they cannot make.
Microsoft will likely spend resources decreasing response times that are already measured in hundreds of milliseconds. That may not be something you need to do.
Microsoft is spending time on providing an API for others to use for code analysis. That's likely not part of your plan.
Prioritize your features and decide what "good enough" looks like for you and your customers, then estimate the cost of implementing that.
In addition to bearing the obvious costs of implementing requirements that you may not actually have, Microsoft also carries some costs that may not be obvious if you haven't worked in a team. There are huge communication costs associated with teams. It's actually incredibly easy to have five smart people take longer to produce a solution than it takes for a single smart person to produce the equivalent solution. There are aspects of Microsoft's hiring practices and organizational structure that make this scenario more likely. If you hire a bunch of smart people with egos and then empower all of them to make decisions, you too can get a 5% better solution for 500% of the cost. That 5% better solution might be profitable for Microsoft, but it could be deadly for a small company.
Going from a 1 person solution to a 5 person solution increases the costs, but that's just the intra-team development costs. Microsoft has separate teams that are devoted to (roughly) design, development, and testing even for a single feature. The project-related communication between peers across these boundaries has higher friction than within each of the disciplines. This not only increases communication costs between individuals, but it also results in larger team sizes. And more than that - since it's not a single team of 12 individuals, but is instead 3 teams of 5 individuals, there is 3x the upward communication cost. More costs that Microsoft has chosen to carry that may not translate to similar costs for other companies.
My point here is not to describe Microsoft as an inefficient company. My point is that Microsoft makes a ton of decisions about everything from hiring, to team organization, to design and implementation that start from assumptions about profitability and risk that simply do not apply to companies that are not Microsoft.
In terms of the intellisense thing, there are various ways of thinking about the problem. Microsoft is producing a very generic, reusable solution that doesn't just solve intellisense, but also targets code navigation, refactoring, and various other uses for code analysis. You don't need to do things the same way if your sole goal is to make it easy for developers to enter code without having to type much. Targeting that feature doesn't take years of effort and there are all sorts of creative things you can do if you're not just providing an API, but you actually control the UI too.

When should I implement globalization and localization in a .NET Application?

I am cleaning up some code in a C# app that I wrote and really trying to focus on best practices and coding style. As such, I am running my assembly through FXCop and trying to research each message it gives me to decide what should and shouldn't be changed. What I am currently focusing on are locale settings. For instance, the two errors that I have currently are that I should be specifying the IFormatProvider parameter for Convert.ToString(int), and setting the Dataset and Datatable locale. This is something that I've never done, and never put much thought into. I've always just left that overload out.
The current app that I am working on is an internal app for a small company that will very likely never need to run in another country. As such, it is my opinion that I do not need to set these at all. On the other hand, doing so would not be such a big deal, but it seems unnecessary and could hinder readability to a degree.
I understand that Microsoft's contention is to use it if it's there, period. Well, I'm technically supposed to call Dispose() on every object that implements IDisposable, but I don't bother doing that with Datasets and Datatables. I wonder what the practice in regards to globalization and localization on small-scale internal apps is "in the wild."
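For concreteness, the change FxCop is asking for is just the extra argument:

```csharp
using System;
using System.Globalization;

class FormatProviderDemo
{
    static void Main()
    {
        int count = 1234;

        // What FxCop flags: the culture silently comes from the current thread.
        string implicitCulture = Convert.ToString(count);

        // What it wants: say which culture you mean, even if the answer is
        // "whatever the user's machine is set to".
        string explicitCulture = Convert.ToString(count, CultureInfo.CurrentCulture);
    }
}
```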
I usually ignore those kinds of warnings for small internal apps. Remember that FxCop is meant to make sure that your code is good for a framework; not all of its rules might be relevant to you. I always disable the rules that I don't think fit the application as I build it.
Though I would call Dispose on any class that implements IDisposable; it doesn't matter if it does nothing now, as an upgraded version of the class might start leaking something essential, and it's a good habit to get into.
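The cheap way to build that habit is the using statement, which guarantees Dispose runs on the way out of the block, exception or not; a minimal sketch:

```csharp
using System.Data;

class DisposeDemo
{
    static void Main()
    {
        // Dispose is called automatically when the block exits,
        // even if an exception is thrown inside it.
        using (DataSet set = new DataSet())
        using (DataTable table = new DataTable())
        {
            set.Tables.Add(table);
            // ... fill and use the data ...
        }
    }
}
```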

.NET Team - Best Practices and Methods

What best practices and methods would you enforce on a new .NET development team?
Cheers
Use only Visual Studio
If you need a database, use a server (reduces SQL issues early on)
Use Version Control
Good question. I've had to deal with this very recently with my team. Here are a couple of quick points:
Come up with coding and documentation standards. A search for C# style guidelines will yield some good results. StyleCop and FxCop might be useful for enforcing your standards.
Source control. SVN is popular, but I prefer Mercurial.
Depending upon what type of projects you are working on, you might want to decide on a standard architecture. Typically, we use a UI - Application - Business Logic - Infrastructure architecture.
Put your database in version control.
Update
MSDN - Design Guidelines for Class Library Developers - All Versions
I had also assumed the OP was referencing coding standards. As for the more general practices:
Unified Development Environment (Visual Studio will probably net the best results)
Version Control (Team Foundation Server is great if you can afford it, if not SVN)
Team Collaboration (Trac if you go with SVN, TFS has some stuff as well)
You are asking for a shelf of books. I don't think you'd want to read an answer long enough to actually cover what you asked.
Microsoft's Patterns & Practices group may have some suggestions that could be useful as a resource for finding good practices.
Continuous Integration would be another practice I'd introduce, along with tracking technical debt.
I'd review various Agile practices and see what the team thinks is worth adopting and what isn't. Tribal Leadership would also be something I'd examine, to see what stage the tribe is at and try to bring it to stage 4 if possible.
If I could put some values into the team it would be to have some pride in our work, respect one another, and think of things in terms of good for the team rather than individual gain. Granted that culture wasn't part of the question it is a natural follow-up to my mind.
You need to use version control (SVN is great), but at the same time you shouldn't check everything into source control. Skip checking in compilation output and local configuration files; instead, check in the config files as app.config.template files and have each dev make his own copy of the config files called app.config. Check new changes into the .template file and have all devs regularly check and update their local version if it changes.
If possible, pair up junior members with more senior members. Either way, definitely have code reviews. I'd also encourage them to have scheduled workshops or discussions so that they can get more well-rounded skills and to increase their exposure to different areas that they might not currently be aware of.
I'd also encourage them to go to user group meetings.
I would start by looking through the MSDN Developer Centers site:
http://msdn.microsoft.com/en-us/aa937802.aspx
Since you are using C# I would recommend using StyleCop to maintain consistency in code layout. Since you've stated it's a new team, I'm assuming that the code base is new as well. Starting fresh with StyleCop is far easier than trying to get rid of warnings in an existing code base.
Most people would agree that having automated unit tests is a very good thing. You may want to go the TDD route and never code anything that doesn't already have a test, or you may want to write tests after the code and just focus on the key areas of concern rather than striving for 100% coverage. Either way, decide what you want to achieve with testing and make sure that it is adhered to. Without a strict rule on writing unit tests you may well find that some, if not all, of your code has no automated tests, and the only way that code gets tested is when someone goes into the UI and actually uses it.
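Whatever route you pick, the mechanics are cheap. A minimal NUnit-style test, with a hypothetical class under test:

```csharp
using NUnit.Framework;

// Hypothetical class under test - invented for illustration.
public static class PriceCalculator
{
    public static decimal ApplyDiscount(decimal price, decimal percent)
    {
        return price - (price * percent / 100m);
    }
}

[TestFixture]
public class PriceCalculatorTests
{
    [Test]
    public void ApplyDiscount_TenPercent_ReducesPrice()
    {
        Assert.AreEqual(90m, PriceCalculator.ApplyDiscount(100m, 10m));
    }
}
```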
In no particular order,
Agile / Scrum
A nice suite of tools - ReSharper, Redgate SQL Tools, FxCop, etc.
Test Driven Development
Continuous Integration

How do I find and remove unused classes to cleanup my code?

Is there a quick way to detect classes in my application that are never used? I have just taken over a project and I am trying to do some cleanup.
I do have ReSharper if that helps.
I don't recommend deleting old code on a new-to-you project. That's really asking for trouble. In the best case, it might tidy things up for you, but isn't likely to help the compiler or your customer much. In all but the best case, something will break.
That said, I realize it doesn't really answer your question. For that, I point you to this related question:
Is there a custom FxCop rule that will detect unused PUBLIC methods?
NDepend
Resharper 4.5 (4.0 merely detects unused private members)
Build your own code quality unit-tests with Mono.Cecil (some samples can be found in Lokad.Quality, an open source project)
Review the code carefully before you do this. Check for any uses of reflection, as classes can be loaded and methods can be dynamically invoked at runtime without knowing at compile time which ones they are.
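A minimal sketch of the trap (all names invented): nothing references the class statically, so it looks dead to any usage search, yet it runs at runtime:

```csharp
using System;

class PluginLoader
{
    static void Main()
    {
        // The type name comes from config or a database, so there is no
        // compile-time reference to ReportJob anywhere in the solution.
        string typeName = "MyApp.Jobs.ReportJob, MyApp.Jobs";
        Type jobType = Type.GetType(typeName);
        object job = Activator.CreateInstance(jobType);

        // Methods can be invoked dynamically too.
        jobType.GetMethod("Run").Invoke(job, null);
    }
}
```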
It seems that this is one of the proposed features for the next version of ReSharper. That doesn't help yet, but hopefully the EAP is just around the corner.
Be careful with this - you may remove things that are not needed in the immediate vicinity of the code you are working on, but you run the risk of deleting interface members that other applications may rely on without your knowledge.
