How to find unused parts of a large .NET application?


Consider a large multi-tier enterprise web application with many services and very complex functionality, mostly written in .NET (C#) on the server side and HTML and JavaScript on the client. It consists of many hundreds of pages, with service calls (actions) well into the thousands, is hosted on multiple servers, and has been developed over 15 years. Some parts are very new and modern, other parts are legacy.
Some parts of this application are obsolete and nobody actually uses those parts anymore. Whether these are whole unused sub-applications, unused pages, files, service calls, methods or even lines of code, doesn't matter. Older parts do not provide any usage statistics but do use dependency injection.
How can one automatically find out, based on access to production servers, which parts are unused, without changing the actual source code? So the question is not finding unreferenced / unreachable code. It's about finding parts that users don't actually use anymore.
One option could be looking at query logs. This discovers unused pages, but it is very difficult (a tedious manual process) to find out which parts in the background are used by those pages only.
Another option could probably be monitoring file access on servers. Does that make sense? Would that be feasible?
Yet another thought is doing something like test coverage tools do, but not during testing. Could coverage (something like lines of code executed) be measured in a live C#.NET application, assuming that debug symbols are available?

It is hard to give an answer without really knowing the situation. However, I do not think there is an automatic or easy way. I do not know the best solution, but I can tell you what I would do. I would start by collecting all log files from the (IIS?) server, covering at least a year since some code may only run once a year, and analyze those. This should give you the best insight into which parts are called externally. You do have those logs?
Also check the event logs. Sometimes there are messages like 'Directory does not exist', which could mean that the service hasn't been working for years but nobody noticed. And check for redundant applications; perhaps the same application is active on multiple servers.
Check inside tables with time indications and loginfo for recent entries.
Checking the dates on files and analyzing the database may provide additional information, but I don't think it will really help.
Make a list of all applications that you think are obsolete, based on user input or applications that should be obsolete.
Use your findings to create a list based on the probability that application /code is obsolete. Next steps, based on your list, could be:
remove redundant applications.
look for changes in the data model or file system and check whether these still match the code.
analyze the database for invalid queries. This could indicate that the data model has changed, causing the application to stop working. If nobody noticed, then this application or functionality is obsolete.
add logging to the code where you have doubts.
look at the application level and start by marking calls as obsolete, commenting out or removing unused code, or redirecting to (new) equivalent code.
turn off applications and monitor what happens. If there is a dependency, you can either take action to remove it or choose to let the application live.
Monitoring the impact of your actions will help you to sort things out. I hope this answer gives you some ideas.
-- UPDATE --
There may be logging available, but collecting, reading and interpreting may be hard and time consuming. To make it easier to monitor you could think of the following:
monitor the database: you can use the profiler tool, but it may be easier to create triggers that log insert, update and delete operations with all the information you need (triggers do not fire on plain reads, so those still need the profiler or similar tracing). Create a program that can read the schema of the database and filter the log by table, stored procedure or view to determine what isn't used. I didn't investigate, but perhaps you can monitor rollbacks and exceptions as well.
monitor IIS: there are of course the log files, but you can also think of adding a module to the website in which you can write custom code to monitor whatever you want. All traffic passes through the module. Take a look here: https://www.iis.net/learn/develop/runtime-extensibility/developing-iis-modules-and-handlers-with-the-net-framework. If I am not mistaken, all you have to do is add the module to the website and configure the website to use it. Create a program to filter the log on URL, status, IP, identification, etc. to determine what is used.
I think that is sufficient for first analysis. It then comes to interpreting the logs. Perhaps you'll see a way to combine the logs so you can link a request to certain database actions, without having to look in or change the code. Just some thoughts.
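As a rough illustration of the "program to filter the log" idea, here is a minimal sketch that counts hits and the most recent request date per URL path. It assumes the logs are in the W3C extended format with a `#Fields:` directive that includes `date` and `cs-uri-stem`; the class and method names (`UrlUsage`, `IisLogUsage.Summarize`) are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hit count and most recent request date for one URL path.
public sealed class UrlUsage
{
    public int Hits;
    public DateTime LastSeen;
}

public static class IisLogUsage
{
    // Assumes W3C extended log lines; column positions are read from "#Fields:".
    public static Dictionary<string, UrlUsage> Summarize(IEnumerable<string> lines)
    {
        var usage = new Dictionary<string, UrlUsage>(StringComparer.OrdinalIgnoreCase);
        int dateIndex = -1, uriIndex = -1;

        foreach (var line in lines)
        {
            if (line.StartsWith("#Fields:", StringComparison.Ordinal))
            {
                // The #Fields directive names the columns of the rows that follow.
                var names = line.Substring("#Fields:".Length)
                                .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
                dateIndex = Array.IndexOf(names, "date");
                uriIndex = Array.IndexOf(names, "cs-uri-stem");
                continue;
            }
            if (line.Length == 0 || line[0] == '#' || dateIndex < 0 || uriIndex < 0)
                continue; // other directives, blank lines, or rows with unknown layout

            var cols = line.Split(' ');
            if (cols.Length <= Math.Max(dateIndex, uriIndex))
                continue; // truncated row

            DateTime day;
            if (!DateTime.TryParseExact(cols[dateIndex], "yyyy-MM-dd",
                    CultureInfo.InvariantCulture, DateTimeStyles.None, out day))
                continue; // unparseable date

            UrlUsage entry;
            if (!usage.TryGetValue(cols[uriIndex], out entry))
            {
                entry = new UrlUsage();
                usage[cols[uriIndex]] = entry;
            }
            entry.Hits++;
            if (day > entry.LastSeen) entry.LastSeen = day;
        }
        return usage;
    }
}
```

Feeding it `File.ReadLines(path)` for each log file gives a table of paths that were requested at all; pages that exist in the deployment but never appear in a year of logs are candidates for the "probably obsolete" list. It says nothing about which background code those pages use.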

You can use ReSharper. It will flag such problems while you're coding.
However, you can also detect problems afterwards. In the menu you will find the entry "ReSharper > Inspect > Code Issues in Solution".
It will create a report; the relevant findings are listed under "Redundancies in Code".

Related

Hiding parts of my code from a programmer employee

I am working on a C# project and have two programmers to help me on parts of the project. The problem is that I don't trust these programmers, as they joined recently, and I need to protect my company's property.
I need to hide some parts of the code from the two programmers so they can't see it, while they are still able to work on their parts and run the full application to test it.
Is there such a thing? :)

Know a few things:
You Can't Hide Code Users Compile Against.
C# makes it incredibly easy to see what you're compiling against, but this is actually true for all programming languages: if they are required to compile it, compile against a dll, or they can run it, either as a DLL or as raw C#, they can get access to the logic behind it. There's no way around that. If the computer can run the program and it all resides on your PC, then the human can look it over and learn how to do it too.
HOWEVER! You can design your program in such a way that they don't need to compile against it.
Use Interfaces.
Make the code that the other employees must write a plug-in. Have them write their code as an entirely separate project to an interface that the core part of your API loads dynamically at run time.
Take a look at The Managed Extensibility Framework for a tool to do this.
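To make the plug-in idea concrete, here is a minimal sketch using plain reflection rather than MEF. The interface `IReportPlugin`, the loader `PluginLoader` and the sample `UpperCasePlugin` are hypothetical names invented for this example; in practice the interface would live in a small shared assembly and the plugin class in the employee's separate project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Contract shared with plugin authors; they compile against this interface only.
public interface IReportPlugin
{
    string Name { get; }
    string Run(string input);
}

public static class PluginLoader
{
    // Instantiates every concrete IReportPlugin in the assembly that has
    // a public parameterless constructor.
    public static List<IReportPlugin> LoadFrom(Assembly assembly)
    {
        return assembly.GetTypes()
            .Where(t => typeof(IReportPlugin).IsAssignableFrom(t)
                        && t.IsClass && !t.IsAbstract
                        && t.GetConstructor(Type.EmptyTypes) != null)
            .Select(t => (IReportPlugin)Activator.CreateInstance(t))
            .ToList();
    }
}

// Stand-in for a class that would normally live in the employee's own project.
public sealed class UpperCasePlugin : IReportPlugin
{
    public string Name { get { return "upper"; } }
    public string Run(string input) { return input.ToUpperInvariant(); }
}
```

The core application would call something like `PluginLoader.LoadFrom(Assembly.LoadFrom(pluginPath))` at start-up, where `pluginPath` is wherever the plugin DLLs are dropped. This only removes the need to compile against the core; anyone who can run the core assemblies locally can still decompile them.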
Use Web or Remote Services.
Components of particular secrecy can be abstracted away so the details of how they work are hidden, and then invoked via a web call. This only works in situations where the core details you want to protect are not time sensitive. It also doesn't protect the idea behind the feature: the employee will need to understand its purpose to be able to use it, and that alone is enough to rebuild it from scratch.
Build Trust Through Code Reviews.
If you don't currently trust your employees, you need to develop it. You will not be able to know everything that everyone does always. This is a key skill in not just programming, but life. If you feel that you can't ever trust them, then you either need to hire new employees that you can trust, or build trust in them.
One way to build trust in their capabilities is through code reviews. First, make sure you're using a version control system that allows for easy branching. If you aren't, switch immediately to Mercurial*. Have an "integration" area and individual development areas, usually through cloned branches or named branches. Before they commit code, get together with the employee and review the changes. If you're happy with them, then have them commit it. This will consume a little bit of time on each commit, but if you do quick iterations on changes, then the reviews will also be quick.
Build Trust Through Camaraderie.
If you don't trust your employees, chances are they won't trust you either. Mutual distrust will not breed loyalty. Without loyalty, you have no protection. If they have access to your repository, and you don't trust them, there's a good chance they can get at the code you want anyway with a little bit of effort.
Most people are honest most of the time. Work with them. Learn about them. If one turns out to be working for a hostile entity, they've probably already obtained what they wanted to get and you're screwed anyway. If one turns out to be a pathological liar or incompetent, replace them immediately. Neither of these issues will be saved by "protecting" your code from their eyes.
Perform Background Checks.
A further way to improve trust in your employee, from a security standpoint, is a background check. A couple hundred bucks and a few days, and you can find out all sorts of information about them. If you're ready to hide code from them, and you're the employer, you might as well do due diligence before they steal the secrets to the universe.
Your Code is Not That Important.
I hate to break it to you, but there's almost a 100% chance that your code is not special. Trying to protect it through obscurity is a waste of time and a known, poor, protection method.
Good luck!
*Why Mercurial? Just because it's one option that's easy to get started with. Feel free to use any other, like Git, if it suits your fancy. Which one you use is entirely beside the point and irrelevant to this overall discussion.

You can't do it.
Even if you only give them a DLL with your code, they can extract the code with decompilation tools, e.g. .NET Reflector.

Keep a separate backup and submit dummy placeholders to source control.

The complicated way: set up an application server with VS2010 and all the files they need, lock everything down so they cannot access any files directly and can only run VS2010 and the built application, and provide only DLLs for the protected code.
Theoretically, they would be able to work on the code they need to but would never have direct access to the DLLs, nor would they have the ability to install or use a tool such as .NET Reflector to disassemble the files... might still be some holes you'd need to look for though.
The right way: Hire trustworthy programmers. ;)

Put your code into a DLL and use Dotfuscator to obfuscate the internal workings.

The only way I can see is to give them compiled and obfuscated assemblies to reference. Because you can only obfuscate private members, you may need to modify your code so that public methods do little or nothing themselves. If there is any interesting code in a public method, you should rearrange your code like this:
    public bool ProcessSomething()
    {
        return this.DoProcessSomething();
    }

    private bool DoProcessSomething()
    {
        // your code
    }
Even the obfuscator that comes free with VS will do enough to make it non-trivial to look into your code. If you require more protection, you need a better obfuscator, of course.
But in the long run it is impractical, and it sends a bad signal to those developers by telling them that you do not trust them. Nothing good can come of this. If you're not the boss (or owner of the code) I would not worry that much; after all, it's not your property. You can talk to your boss to express your concerns. If you are the boss, you should not have employed people you do not trust in the first place.

Application Architecture For Ease of Application Customization

I'm looking for input on a direction to take for building an accounting application. The application needs to allow for high customization; sometimes entire processes will need to be changed.
I want a way to make changes without re-compiling the entire application when a customer has a specific modification request. The back-end will be a SQL database of some sort. Most likely SQL Server Express for cost reasons. The front-end will be C#.
I'm thinking of an event-based system that will have events for when different types of actions, such as entries, are made. I would then have a plugin system that handles the event. I may need to have multiple processes apply in a specific order to the data before it is finally saved. It will need to trigger other processes as well.
I want to keep my base application the same, which works for most customers, but have a graceful way of loading the custom processes that other specific customers have.
I'm open to all suggestions. Even if they are thinking of completely different ways of approaching the problem. Our current in-house development talent is .NET and MS SQL Server. I'm not aware of a software pattern that may fit this situation.
Additional Info:
This isn't a completely blank slate system, it will have functionality that works for a large number of the customers. For various reasons, requirements change based on states and even at the region and town level where customization may be necessary.
I'd like to be able to plugin additional pre-compiled modules. When I started looking into possible options, I was imagining an empty handler that I could insert code into through a plugin. So say for example, a new entry is made to the general ledger that raises an event. The handler is called, but the handler's code is coming from a plugin, which may be my original process that fits 80% of the customers. If a customer wants a custom operation, I could add a plugin that completely replaces the original one or have it add an additional post processing step through another plugin after the original runs. Sort of a layering process I guess.

You could look at the Managed Extensibility Framework.
It provides rich composition features that allow you to build loosely coupled plugin applications.
Update: it sounds like you need pre-defined modules for different geographic areas, and using the chain of responsibility design pattern might help you manage that kind of change.
Sorry, no code provided, just throwing out my thoughts.
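For illustration only, the layering described in the question (a default process plus optional customer-specific steps, applied in a fixed order) could be sketched as an ordered chain of handlers. Everything here is hypothetical: `LedgerEntry`, `IEntryHandler`, `EntryPipeline`, the two sample handlers, and the 2% rate are invented for the example and are not part of MEF or any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical ledger entry; a real one would carry accounts, dates, and so on.
public sealed class LedgerEntry
{
    public decimal Amount { get; set; }
    public string Region { get; set; }
    public List<string> Notes { get; private set; }

    public LedgerEntry() { Notes = new List<string>(); }
}

// One link in the chain. Returning false stops the remaining handlers.
public interface IEntryHandler
{
    int Order { get; }
    bool Handle(LedgerEntry entry);
}

public sealed class EntryPipeline
{
    private readonly List<IEntryHandler> _handlers;

    public EntryPipeline(IEnumerable<IEntryHandler> handlers)
    {
        // Handlers run in ascending Order, regardless of registration order.
        _handlers = handlers.OrderBy(h => h.Order).ToList();
    }

    // Returns true when every handler accepted the entry and it may be saved.
    public bool Process(LedgerEntry entry)
    {
        foreach (var handler in _handlers)
        {
            if (!handler.Handle(entry)) return false;
        }
        return true;
    }
}

// Default step intended for most customers: reject zero-amount entries.
public sealed class ValidateAmount : IEntryHandler
{
    public int Order { get { return 10; } }
    public bool Handle(LedgerEntry entry) { return entry.Amount != 0m; }
}

// Customer-specific step layered on after the default one.
public sealed class RegionalSurcharge : IEntryHandler
{
    public int Order { get { return 20; } }
    public bool Handle(LedgerEntry entry)
    {
        if (entry.Region == "TownA")
        {
            entry.Amount *= 1.02m; // illustrative rate only
            entry.Notes.Add("surcharge applied");
        }
        return true;
    }
}
```

The handler list could come from a plugin loader such as MEF, so a customer build differs only in which handler assemblies are present. Replacing the default process then means shipping a handler with the same `Order` slot instead of the original one.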

Windows Workflow Foundation (WF) (part of the .NET Framework) is a potential candidate for your requirements. It enables various actions, command-lets and script-lets to be composed dynamically so that you can more easily customize different workflows for different users/customers.
WF is used by BizTalk for large-scale systems integration and is hosted in-process by many other applications that need to be able to easily modify the orchestration of a number of smaller tasks and actions.
You might want to start with this tutorial on WF4.
HTH.

It's not just about plugins, or how you technically solve the plugin problem with MEF (+1 #laptop) or something else. You have to put most of your effort into defining the plugin "points" in your application. This is going to be the most important part: for example, where you put the empty "events" that your code hooks into, and what parameters those events or plugins will have.
For example, a useful plugin point would be a before-save event, but then you need a single place in the application that saves the various types of business documents, so you can call the plugins there with an abstract document object as the parameter.
So you have to think really hard about your system architecture so that it is abstract enough for the various plugin points, and design that architecture completely; don't design just a part of the system and start coding on that.
I hope that you understood what I meant to say, because English is not my native language.

SQL based storage vs SVN

My team is developing a new application (C#, .NET 4) that involves a repository for shared user content. We need to decide where to store it. The requirements are as follows:
Share files among users.
Support versions.
Enable search by tags and support further queries such as "all the files created by people from group X"
Different views for different people (team X sees its own content and nobody else can see theirs).
I'm not sure what's best, so:
Can I search over SVN using tags (not SVN tags of course, more like Stack Overflow's tags)?
Does it make sense to duplicate the content in both SVN and SQL?
Any other suggestions?
Edit
The application enables users to write validation tests that they later execute. Those tests are shared among many groups on different sites. We need versioning for the regular reasons - undo changes, sudden deletions etc. This calls for SVN.
The thing is, we also want to add the option to find all the tests that are tagged "urgent" and were executed by now, for tracking purposes.
I hope I made myself more clear now :)
Edit II
I ran into SvnQuery and it looks good, but does it have an API I can use? I'd rather use their mechanism with my own GUI.
EDIT III
My colleague strongly supports using only a database and forgetting about file-based storage. He claims it is better for persistence (which is needed, since a test is more than the list of commands to execute). I'd appreciate input on this issue, as I think it should be possible to do it either way.
Thanks!

Firstly, consider using Git rather than SVN. It's much faster, and I suspect it's more appropriate in your use case: it's designed to be distributed, meaning your users will be able to use it without internet access, and you won't have any overhead related to communicating with the server when saving documents.
Other than that, I'm not making full sense of your question but it seems like the gist of it might be better rephrased like so: "Can I do tag-based searches/access restriction onto my version control system, or do I need to create a layer on top to do so?"
If so, the answer is that you need a layer on top. Some exist already, both web-based (e.g., Trac) and desktop-based (e.g. GitX). They won't necessarily implement exactly what you need but they can be a good starting point to do what you're seeking.
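As a sketch of what such a layer on top might look like, the following keeps tag and group metadata next to the versioned paths and answers the two queries from the question (search by tag, restricted by team). It is an in-memory stand-in under stated assumptions: `FileRecord` and `TagIndex` are hypothetical names, and in a real system the records would more likely sit in a SQL table keyed by repository path.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Metadata kept alongside a versioned file; the file itself stays in the VCS.
public sealed class FileRecord
{
    public string Path { get; set; }
    public string AuthorGroup { get; set; }
    public HashSet<string> Tags { get; private set; }

    public FileRecord()
    {
        // Tags are matched case-insensitively.
        Tags = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    }
}

public sealed class TagIndex
{
    private readonly List<FileRecord> _records = new List<FileRecord>();

    public void Add(string path, string authorGroup, params string[] tags)
    {
        var record = new FileRecord { Path = path, AuthorGroup = authorGroup };
        foreach (var tag in tags) record.Tags.Add(tag);
        _records.Add(record);
    }

    // Returns paths carrying the tag, limited to files owned by the viewer's group.
    public List<string> Find(string tag, string viewerGroup)
    {
        return _records
            .Where(r => r.AuthorGroup == viewerGroup && r.Tags.Contains(tag))
            .Select(r => r.Path)
            .ToList();
    }
}
```

The version control system then only has to store and version the files; searching and per-team views are answered from the metadata, which is also where execution history such as "tagged urgent and already executed" could be recorded.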

You could use SVN.
Shared files: obvious and easy. It also supports the centralised locking that you might need for binary files.
Versions. Obviously.
Search... Now we're getting into difficult territory. There are Lucene-based add-ons that allow web searching of your repo, such as OpenGrok, SvnQuery or svn-search. These would be your best starting points for that.
There is no way to stop people seeing what's present in a svn repo, but you can stop them from accessing it. I don't know if the access control could be extended easily to provide hidden folders, you could ask the svn developers.
There's some great APIs for working with SVN, probably the most accessible is SharpSVN which gives you a .net assembly, but there's Python and C and all sorts available.
As mentioned, there are web tools which sit on top of SVN to provide a view into it, there's Trac, and Redmine and several repo-viewers like webSVN, so there's plenty of sample code to use to cook up your own.
Would you use a DVCS like Git or Mercurial? I wouldn't. Though these have good mechanisms in themselves, it doesn't sound like they're what you're after. They allow people to work on their own and share with others on a peer-to-peer basis (though you can set up a 'central' repo and work with that as everyone's peer). They do not work in a centralised, shared way. For example, if you and I both edit a test case locally and then push to the central repo, we might have issues merging. We will have issues merging if the file is a binary or otherwise non-mergeable file. In this case you have a problem with losing one person's changes. That's one main reason for not using a DVCS in your case.
If you're trying to get shared tests together, have you looked at some apps that already do this? I noticed TestRail recently, which sounds like what you're trying to do. It's not free (alas) but it's cheap.

What considerations need to be made when transitioning an application to support?

I will be taking on the role of support for a complex application that is transitioning from the development team. This application is a SharePoint solution that connects to several (7) web services. The development team is rolling off almost immediately and will be available only for small questions.
I'm new to this role, so I'm wondering what suggestions you have for me as I take on this large project. What are some considerations that should be made so that the transition to support is smooth and uninterrupted?
I've been reading the documentation, but I can already see some gaps that need to be filled. The application is very (perhaps overly) configurable and there is lots of injected code. Stepping through the code is about the only way I can gain an understanding of what is actually happening.

It sounds like you've already got your environment set up if you're able to debug the application, so that's the first thing I was going to suggest in a knowledge-transfer situation. Some general things that I would get from the developers before they depart:
A list of third-party components that the application uses, along with license information and website logins if applicable.
Access to every part of the environment that this thing runs on, both production and development. That means the source code management system, database server(s), etc. It sounds like you have some of these already but make sure you get access to absolutely everything.
If your development environment was given to you "as is" (i.e. you took it over from one of the departing developers), make sure you know how to rebuild it from scratch. They might have a document that describes the process of building a development box, but if not, maybe you can get them to show you how to set up a fresh machine.
The previous item will go a long way towards this, but if setting up a server to run the application is different in any way from setting up a development environment, you'd want to know how, so you can diagnose server configuration issues if they crop up, or even rebuild a server. This sort of thing may be someone else's responsibility, depending on your organization.
Once you have those, you probably want to get some understanding of why the application does the things that it does. That will give you the context you need to understand support and enhancement requests when they come in.
Are the original developers the only source of this information, or are there business people who you will be working with after the developers leave? One of the first things I try to do when starting on an existing application that's new to me is to find someone who knows the business well and have them give me a high-level run-down of the application's purpose in life. From there you can go into more detail on individual components/features/whatever as needed. The business people may be a better source for this information than the developers are, so you may want to try them first.
Hopefully some of that helps.

If you're not the systems admin (as opposed to the SharePoint admin), develop an understanding with them of what tasks you are able to do and what you need of them.
This may include things like stopping and starting services (IIS, Timer Service, etc.) and filesystem and DB monitoring and maintenance. Getting this sorted out up front saves a lot of pain later.
If the sys admins don't have some understanding of SharePoint, educate them. They will need to know what the deal is with things like code deployments.
It's best not to feel my pain.

Monitor Usage Statistic-- How it is done?

Windows, Firefox and Google Chrome all monitor usage statistics and analyze the crash reports that are sent to them. I am thinking of implementing the same feature in my application.
Of course it's easy to litter an application with a lot of logging statements, but this is the approach I want to avoid, because I don't want my functions to carry too many cross-cutting concerns. I am thinking about using AOP to do it, but before that I want to know how other people implement this feature.
Does anyone have any suggestions?
Clarification: I am working on a desktop application, and it doesn't involve any RDBMS.

Joel had a blog article about something like this - his app(s) trap crashes and then contact his server with some set of details. I think he checks for duplicates and throws them out. It is a great system and I was impressed when I read it.
http://www.fogcreek.com/FogBugz/docs/30/UsingFogBUGZtoGetCrashRep.html
We did this at a place I was at that had a public server set up to receive data. I am not a db guy and have no servers I control on the public internets. My personal projects unfortunately do not have this great feature yet.

In "Debugging .Net 2.0 Applications" John Robbins (of Wintellect) writes extensively about how to generate and debug crash reports (actually windbg/SOS mini dumps). His SuperAssert class contains code to generate these. Be warned though: there is a lot of effort required to set this up properly, including symbol servers and source servers as well as a good knowledge of VS2005 and windbg. His book, however, guides you through the process.
Regarding usage statistics, I have often tied this into authorisation, i.e. whether a user has the right to carry out a particular task. Put overly simply, this could be a method like this (ApplicationActions is an enum):
    public static bool HasPermission( ApplicationActions action )
    {
        // Validate user has permission.
        // Log request and result.
    }
This method could be added to a singleton SecurityService class. As I said, this is overly simple, but it should indicate the sort of service I have in mind.
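A slightly fuller sketch of that idea follows, under loud assumptions: the enum members, the hard-coded permission set and the in-memory counter are placeholders invented for this example, and the "log" is just a per-action request count rather than a real log sink or a record of each result.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Placeholder actions; a real application would list its own features here.
public enum ApplicationActions { OpenReport, ExportData, DeleteRecord }

public sealed class SecurityService
{
    public static readonly SecurityService Instance = new SecurityService();

    // Placeholder rule set: a real service would look up the current user's rights.
    private readonly HashSet<ApplicationActions> _allowed =
        new HashSet<ApplicationActions> { ApplicationActions.OpenReport, ApplicationActions.ExportData };

    // Usage statistics: how many times each action was requested.
    private readonly ConcurrentDictionary<ApplicationActions, int> _requests =
        new ConcurrentDictionary<ApplicationActions, int>();

    private SecurityService() { }

    public bool HasPermission(ApplicationActions action)
    {
        // Every permission check counts as one attempted use of the feature.
        _requests.AddOrUpdate(action, 1, (key, count) => count + 1);
        return _allowed.Contains(action);
    }

    public int RequestCount(ApplicationActions action)
    {
        int count;
        return _requests.TryGetValue(action, out count) ? count : 0;
    }
}
```

Because every feature already has to ask for permission, the usage counter comes for free at that single choke point, without logging statements scattered through the feature code. The counts could be flushed to a file or sent to a server periodically.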

I would take a quick look at the Logging Application Block that is part of the Enterprise Library. It provides a large number of the things you require, and is well maintained. Check out some of the scenarios and samples available; I think you will find them to your liking.
http://msdn.microsoft.com/en-us/library/cc309506.aspx
