I am working on an application that is about 250,000 lines of code. I'm currently the only developer working on this application, which was originally built in .NET 1.1. Pervasive throughout is a class that inherits from CollectionBase. All database collections inherit from this class. I am considering refactoring it to inherit from the generic collection List<T> instead. Needless to say, Martin Fowler's Refactoring book has no suggestions. Should I attempt this refactor? If so, what is the best way to tackle it?
And yes, there are unit tests throughout, but no QA team.
Don't, unless you have a really good business justification for putting your code base through this exercise. What are the cost savings or revenue generated by your refactor? If I were your manager I would probably advise against it. Sorry.
How exposed is CollectionBase from the inherited class?
Are there things that Generics could do better than CollectionBase?
I mean this class is heavily used, but it is only one class. Key to refactoring is not disturbing the program's status quo. The class should always maintain its contract with the outside world. If you can do this, it's not a quarter million lines of code you are refactoring, but maybe only 2500 (random guess, I have no idea how big this class is).
But if there is a lot of exposure from this class, you may have to instead treat that exposure as the contract and try and factor out the exposure.
If you are going to go through with it, don't use List<T>. Instead, use System.Collections.ObjectModel.Collection<T>, which is more of a spiritual successor to CollectionBase.
The Collection<T> class provides protected methods that can be used to customize its behavior when adding and removing items, clearing the collection, or setting the value of an existing item. If you use List<T>, there's no way to override the Add() method to handle when someone adds to the collection, because Add() is not virtual.
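As a rough sketch of what that customization could look like: the class name EntityCollection and the null check are hypothetical, chosen only for illustration, and your real base class would carry whatever logic it currently has in its CollectionBase hooks. The four overridden methods are the protected virtual members of Collection<T>.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical stand-in for the existing CollectionBase-derived base class.
public class EntityCollection<T> : Collection<T>
{
    // Add() and Insert() both funnel through InsertItem.
    protected override void InsertItem(int index, T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        base.InsertItem(index, item);
    }

    // The indexer setter (collection[i] = value) funnels through SetItem.
    protected override void SetItem(int index, T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        base.SetItem(index, item);
    }

    // Remove() and RemoveAt() funnel through RemoveItem.
    protected override void RemoveItem(int index)
    {
        // Custom logic (for example, change tracking) would go here.
        base.RemoveItem(index);
    }

    // Clear() funnels through ClearItems.
    protected override void ClearItems()
    {
        // Custom logic would go here.
        base.ClearItems();
    }
}
```

Callers keep using Add(), Remove(), Clear() and the indexer as before, so the public contract of the collection can stay the same while the base type changes underneath.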
250,000 lines is a lot to refactor, plus you should take into account several of the following:
1. Do you have a QA department that will be able to QA the refactored code?
2. Do you have unit tests for the old code?
3. Is there a timeframe around the project, i.e. are you maintaining the code as users are finding bugs?
If you answered no to 1 and 2, I would first and foremost write unit tests for the existing code. Make them extensive and thorough. Once you have those in place, branch a version and start refactoring. The unit tests should help you refactor in the generics correctly.
If 2 is yes, then just branch and start refactoring, relying on those unit tests.
A QA department would help a lot as well, since you can hand them the new code to test.
And lastly, if clients/users are needing bugs fixed, fix them first.
I think refactoring and keeping your code up to date is a very important process to avoid code rot/smell. A lot of developers suffer from either being married to their code or just not confident enough in their unit tests to be able to rip things apart and clean it up and do it right.
If you don't take the time to clean it up and make the code better, you'll regret it in the long run because you have to maintain that code for many years to come, or whoever takes over the code will hate you. You said you have unit tests and you should be able to trust those tests to make sure that when you refactor the code it'll still work.
So I say do it, clean it up, make it beautiful. If you aren't confident that your unit tests can handle the refactor, write some more.
I agree with Thomas.
I feel the question you should always ask yourself when refactoring is "What do I gain by doing this vs doing something else with my time?" The answer can be many things, from increasing maintainability to better performance, but it will always come at the expense of something else.
Without seeing the code it's hard for me to tell, but this sounds like a very bad situation to be refactoring in. Tests are good, but they aren't fool-proof. All it takes is for one of them to have a bad assumption, and your refactor could introduce a nasty bug. And with no QA to catch it, that would not be good.
I'm also personally a little leery of massive refactors like this. Cost me a job once. It was my first job outside of the government (which tends to be a little more forgiving; once you get 'tenure' it's damn hard to get fired) and I was the sole web programmer. I got a poorly written legacy ASP app dropped in my lap. My first priority was to get the darn thing refactored into something less...icky. My employer wanted the fires put out and nothing more. Six months later I was looking for work again :p Moral of this story: Check with your manager first before embarking on this.
Related
I have just inherited a legacy C# & VB.Net project which I will have to maintain and augment from now on.
There are no interfaces and obviously no Dependency Injection.
The first thing I am thinking of doing is creating interfaces and adding NInject, which would then make it possible to unit test the project eventually.
Is it a good idea, or should I leave it alone?
What are the best practices for implementing DI when it comes to legacy projects?
Thanks
I don't think there's a set best practice, other than use common sense - it's kind of a case by case scenario. A few important questions to ask yourself:
How much effort is going to be required to create interfaces for the current classes?
How much additional effort is going to be required to write proper unit tests? Will these unit tests add more value than the time spent?
How long is this legacy system even going to be maintained? There's nothing worse than doing a huge upgrade (requiring testing not only by the development staff, but by the product user) to replace it in 18 months.
Also, how long has this legacy system been in place without issue? There's no reason to invent work if it appears stable and really has low maintenance.
Is TDD a good approach for small, short projects done by small teams of up to 4 people? Is it really a profitable effort?
What about unit testing? TDD implies unit testing, but not conversely. Wouldn't unit testing be sufficient, done from time to time during the project development life cycle, until code coverage is reasonable?
For me, it doesn't boil down to whether the project is small or short. TDD, done correctly, is about being able to quickly run a set of tests that provide full confidence in the code. There also has been a lot written about TDD helping to drive out the appropriate design for projects as well.
So, you could argue that TDD is best on small and short projects because you end up only writing the code that you need to make the tests pass and nothing else, which keeps the cost down. Then there is the added benefit of having confidence in the tests and code when you make changes later.
The next small point I would make is that a lot of projects start small and short. These interim solutions have a way of becoming strategic platforms for development (if successful).
I recently read a nice blog post from Szymon Pobiega, titled If you are not doing TDD…. I really encourage you to take a look at it. I'm a bit skeptical about TDD and about treating it as one of the great habits of the development life cycle, an ultimate solution for your project's safety regardless of the nature of the project. I like this fragment:
TDD is most useful when you have absolutely no idea about the shape of the code. In such case you better not hack the code as fast as you can. Rather then make one small step at a time, each time validating the outcome. Ideally, do it in a pair. Each step you make allows you to discover more tests that need to be written. I love the analogy Dan used in his presentation: it’s just like swimming versus walking in 1.50 meter deep pool. You can walk all the way through ensuring every time one of you feet is on the ground or you can just swim which is more risky (no contact with ground) but makes you move a lot faster. TDD is like walking.
He is referring to Dan North's session at NDC 2012.
The answer is YES! And for the following reason.
Picture the following scenario. You and your project team have created a great application and decide to put it in production. After a while a user comes to you saying: "Hey guys, I found a bug in your application, can you fix it please?" Of course you say yes, fix the bug, and the user is happy. Only after a while he comes back saying: "Great that you fixed the problem, but now something that did work doesn't anymore. Can you repair it?"
If you had used TDD and had written unit tests for the application, the scenario wouldn't go like this, because you would have tests for your entire application. After fixing a bug, you simply run the unit tests again to see whether your "fix" broke anything else in your application.
The above is aimed more at the use of unit tests. The following is about TDD itself.
What TDD makes you as an individual do is think about what you are going to code next (object, class, function and so on), and about the edges of your code: which if/else statements you need and what you need to check for. Whether you are by yourself or in a project team of 20 people doesn't really matter. TDD is about your own thinking process, not that of the other people in your project.
If you are already doing unit testing, it's only a small step to adopt TDD, but you will benefit much more from it.
Just one example: TDD defines that you implement your tests first. This implies that you think about the goal that you want to achieve before you start to implement. This leads to better code.
TDD and Unit Testing are not alternatives to each other. In every project you should test your code, this is where you do Unit Testing and Integration and System Testing.
TDD, however, is a development model. Like Waterfall and other development methods, TDD is a method too. In TDD you have some basic requirements and you write unit tests to ensure that the requirements are implemented and working. But when you are writing unit tests you realize that in order to achieve the major requirements, you need to implement more functions and classes. So unit tests in this context make other requirements clearer.
Suppose that you need to write an application that prints the name of the computer the application is running on. First you write a unit test:
    [Test]
    public void ProducedMessage_IsCorrect()
    {
        // Classic NUnit-style assertion: expected value first, then the actual value.
        Assert.AreEqual(System.Environment.MachineName, BusinessLibrary.ProduceMessage());
    }
and then you realize you need to implement the ProduceMessage function. You implement it as simply as possible so that the unit test passes. You write the method like this:
    // Static, to match how the test calls it on BusinessLibrary.
    public static string ProduceMessage()
    {
        return "MyComputer";
    }
Then you run the test and it passes. After that you submit your code and other members of the team get the code. When they run the test, it fails, because you hardcoded the name of your computer in the code.
So some wise member of your team changes the code to the correct form and you keep going.
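For illustration, the corrected form might look something like the sketch below. It assumes BusinessLibrary is a static class, which the answer does not actually state; the only change from the hardcoded version is that the name is read from the runtime.

```csharp
using System;

// Assumed shape of the class; the original answer only shows the method.
public static class BusinessLibrary
{
    // Ask the runtime for the machine name instead of hardcoding
    // the name of one developer's computer.
    public static string ProduceMessage()
    {
        return Environment.MachineName;
    }
}
```

With this version the same test passes on every team member's machine, which is the point of the story: the failing test on someone else's computer exposed a requirement the first implementation had quietly skipped.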
It is all about choosing developers who have TDD experience. At least some of them should be experienced TDD developers I think.
A client of mine does not care about elegant and well structured code. I am writing the applications from scratch with minimal 3rd party tools. I am using L2S, reCAPTCHA, TinyMCE, Lucene, and StructureMap.
I would like to get the client's product to market as fast as possible, even at the cost of elegant code. Are there any tools out there that will enable me to rush the product to market?
No client ever cares about elegant and well structured code. That's not why you write elegant and well structured code. You write it because it's shorter, simpler, it's faster to write, contains fewer bugs and it is easier to find those bugs.
ADD: I know what I wrote above sounds contradictory. When I started working, I didn't believe that either. I had to learn the hard way. So to make the point clearer: This is what typically happens when you don't try to write elegant, well structured code:
You'll introduce subtle bugs that lead to weird, unreproducible behavior and take 10 times longer to find than it took to write the code in the first place.
You'll solve the same problem multiple times. Or, the other way round: The elegant solution would have solved a set of problems while the ugly solution will only solve one problem. Or part of one problem.
You'll repeat yourself a lot. That means more code to write, more code to maintain, more bugs.
You'll write code that you don't understand a week later. So instead of adding new features or fixing bugs, you'll waste your time trying to figure out why some piece of code works (or doesn't work).
You'll solve the wrong problems. This is by far the worst danger, and it happens too often if you try to save the time you need to plan the thing properly.
Good code is for you and not your client.
I don't think there exists such a magic pill.
But I would rather recommend you check out Joel's 12 Steps to Better Code. Not all principles may apply to you (if you are not working with at least a certain number of people), but things like version control, how you deal with bugs, testing and others will help more than you think.
Assuming that adding more team members is not an option, you can either:
1. work longer hours (live on coffee and pizza until either you or the project is finished)
2. defer some features for version 2.
3. sacrifice quality.
4. deliver late.
The choice is yours, but option 2 would be my recommendation. A program with fewer features that works is better than a feature-laden product that can't be relied on.
Bit of a cliché, but there are no tools that can get you the result you're looking for, only people. For that matter, there are no tools that can guarantee you robust, reliable, well-designed and appealing products that people will actually want to use – those are all problems that can only be tackled by meatware. Respectfully, I'd be careful about the whole concept of "rushing a product to market", if I were you: I'm sure you have your reasons for taking that approach, but more haste really does often make for less speed, and less desirable results.
When you don't have time to build a product to reasonable standards then it's important to know which parts you can cut corners on and which you can't.
The most important thing to get right is the interfaces between components. Make sure that they are correct and that coupling between components is as little as you can make it.
If, for example, you have a report generator that sometimes crashes, occasionally generates the wrong results, and has missing and broken features, you can repair it later on when you do have time, or even scrap the whole module and do it properly.
If you've hacked the interface, though, and it relies on other components storing their data in certain ways, or relies on the internal workings of other modules, it becomes significantly harder to rip it out and replace it cleanly.
Don't skimp on the design of the high level modules and the interfaces between them. Keep asking yourself: if I have to rip out this module and do it a different way, will it affect any of my other modules? The answer should, as much as possible, be no. It's "easy" to go and fix code, but not if it's all one big tangled mess. The small components don't need to be good as long as you can replace them easily later.
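A minimal sketch of that idea, using the report generator example from above. All of the names here (IReportGenerator, QuickReportGenerator, ReportScreen) are hypothetical and exist only to show the shape: the caller depends on a small interface, so the rushed implementation behind it can be scrapped later without touching anything else.

```csharp
using System.Collections.Generic;

// The contract the rest of the application is allowed to depend on.
public interface IReportGenerator
{
    string Generate(IReadOnlyList<string> rows);
}

// Quick first version written under deadline pressure. It can be
// repaired or thrown away later without affecting its callers.
public class QuickReportGenerator : IReportGenerator
{
    public string Generate(IReadOnlyList<string> rows)
    {
        return string.Join("\n", rows);
    }
}

// A caller that knows only the interface, not how reports are built
// or how the generator stores its data.
public class ReportScreen
{
    private readonly IReportGenerator _generator;

    public ReportScreen(IReportGenerator generator)
    {
        _generator = generator;
    }

    public string Show(IReadOnlyList<string> rows)
    {
        return _generator.Generate(rows);
    }
}
```

Swapping in a better generator later then means writing a new class that implements IReportGenerator and changing the one place where ReportScreen is constructed.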
Obligatory comment - I'm not suggesting anyone writes bad code of course. Just that sometimes there are essential business requirements that make deadlines such that you can't do a good job of everything, and it is an important skill to know which things you can fix up later and which you can't.
So anyway, to answer your question: design tools such as UML drawing tools are probably of more use than coding tools.
Mostly you end up with this:
[image; source: scottsimmons.tv]
Another one that I heard a project manager once say about adding extra people to a team:
"It's not because you have 9 women that the time to grow a baby will only take one month."
I would recommend using a continuous integration tool such as CruiseControl.NET or Hudson and writing many JUnit tests (or the C# equivalent).
This way even if you develop quickly and don't spend enough time working out all the pieces, the CI server should prevent you from producing bugs which will take you an extremely long time to find.
That said, I agree with what the others stated, you write elegant code so you (or your teammates) will understand it and not so your client is satisfied.
Hi! I recently tried developing a small-sized project in C# and during the whole project our team used the Test-Driven-Development (TDD) technique (xunit, moq).
I really think this was awesome, because (when paired with C#) this approach allowed me to relax when coding, relax when designing and relax when refactoring. I suspect that all this TDD stuff actually simplifies the coding process and, well, it allowed me (eventually) to get the same result with fewer brain cells working.
Right after that I tried using TDD paired with C++ (I used Google Test and Google Mock libraries), and, I don't know why but I actually think that TDD here was a step back in terms of rapid application development.
I had some moments when I had to spend huge amounts of time thinking of my tests, building proper mocks, rebuilding them and swearing at my monitor.
And, well, I obviously can't ask something like "what did I do wrong?" or "what was wrong in my approach?", because I don't know what to describe. But if there are any people who are used to TDD in C++ (and probably C# too), could you please advise me how to do this properly?
Framework recommendations, architecture approaches, plain coding advice - if you are experienced in TDD & C++, please respond.
I think TDD is much harder to do in C++ than C#. The lack of reflection, and the common (and often well-justified) reluctance to rely on dynamic polymorphism (interfaces and inheritance) compared to static polymorphism, does make it harder to mock out many classes.
There are some extremely clever unit test frameworks for C++, but the thing that's so clever about them is mainly that they try to bypass the language limitations.
TDD works best in dynamic languages. It's a great way to work in Python. It's doable in C# (which isn't dynamic, but has comprehensive reflection capabilities).
In C++, it's often problematic. That doesn't mean it can't, or shouldn't, be done, but when you do it, expect to have to work a bit harder at it. And sometimes, you may be better off using another approach entirely.
TDD is something that takes some practice to get right, regardless of the platform. What some people don't seem to realize is that the nature of the problem you're trying to solve can have a big impact on how easily you can apply TDD to the solution. I've had problems in the past where I knew the solution I wanted to move towards, but it was extremely difficult to figure out how to break the problem up in a way that seemed to fit the TDD model. Now there are several reasons why this may happen, and it's impossible to say what the "right" way to handle that situation is.
At this point my first reaction to running into this sort of problem is to re-examine my original assumptions about the problem. Am I making it more complex than it needs to be? Am I trying to write tests to arrive at a design I've already decided on instead of letting the tests guide the design? Is it just a funky problem, and I need to accept that the typical TDD approach isn't going to work in this case?
For an interesting discussion of this you can look at this blog post from Uncle Bob Martin, where he talks about an attempt by Ron Jeffries to create a Sudoku Solver using TDD, and it doesn't really work. Now because this attempt did not produce a good solution doesn't mean that TDD is useless, it just means that the problem being solved is more complex, and does not lend itself to the emergent design approach of TDD.
Try the easiest - CxxTest.
I find test driven development very hard to do properly all the time; sometimes the tests just flow, sometimes a bit of a jump is required. To keep things fast I frequently step away from the TDD approach. That isn't a problem for me, as I maintain a full set of unit tests for all the code I've 'completed' (allowing the relaxation while coding the new bits and refactoring).
I've noticed that the majority of enterprise web apps I've worked on over the past few years have seemingly misused the powers of OO.
That is, what once would have been perhaps 1000 lines of HTML and script has often now morphed into 10,000 lines of code, 50 classes, and 2000 method calls to do basically the same thing. I.e. OO and layered architecture appear to be over-used and/or ill-used, often leading to longer development times, higher cost, and often nightmarish maintenance.
How often are other people seeing this happen?
How can OO be effectively utilized to, as the Buddha himself has said: as much as possible try not to harm...as much as possible try to help...
"The road to hell is paved with the best of intentions."
I haven't personally encountered this, but every time I've heard such stories it seems to be an issue of architecture astronauts (people who spend too much time thinking) or bad developers (people who spend too little time thinking).
In the earlier days of programming, you didn't see as much of this because of the limits of the hardware, languages, etc.
However, developers are now trying to write code that's understandable by humans, loosely coupled, and more maintainable by incorporating as many design patterns and OO principles as they can, but like everything, it can be overdone.
On the other hand, some developers might just not be thinking enough about the problems they're attempting to solve and writing extra code just because it gets the job done and not thinking about the bigger picture.
In either case, developers might not be malicious or even incompetent and want the best for the projects they're working on, but they still overdo principles simply because they are trying too hard.
So I would say the solution is to remind developers to use OOP principles as guidelines, but just that. There comes a point when you have to find a happy medium between thinking and programming and just stop thinking and start programming.
See: Jeff wrote a good blog post about just this kind of thing: KISS and YAGNI.
I see this all the time :( Basically, if people are going to make a mess, they will do it whether or not they use OO design. It gets equally awful in either case.
Update 1: it is important to understand how and what will be reused (without going crazy about it, as that would hinder productivity), since we don't want to end up with tons of classes where none of them is reused and each fulfills tons of different functions.
Basically the main issue is understanding and caring about what is being built. You could apply OO, TDD, DDD, anything, and if the devs don't understand what they are doing it will end up in the same mess ... or worse :(
Bottom line: these things do help, but they aren't magic, and they won't replace the developers' skill at creating maintainable code.
Update 2: Also note that a checklist or some bullet points won't do it. I love SOLID and plenty of the other ideas going around, and I think they really clear things up, but they usually make the most impact on the people who have already been trying to avoid the mess.