Thoughts on Legacy Code
I was listening to a recent Hanselminutes podcast episode in which Scott spoke with Michael Feathers of Object Mentor on the topic of legacy code. They touched on a number of really effective techniques for approaching a legacy codebase and I wanted to echo some of their thoughts and add a few of my own.
First, the definition of legacy code (as was discussed in the show) is really much broader than the face value of the phrase itself. In reality it’s harder to define “legacy code” than it is to know it when you see it. Generally I know I’m dealing with legacy code when any of the following happen:
- I’m fearful of making changes…the magic box where this code lives might get angry
- I’m confused and lost after more than a few clicks of “Go to Definition”
- I find few or no tests, or the quality of tests is poor
- There are tests but I get mixed or untrustworthy results from running them
- I have to write more lines of setup code to write a test than the lines of code that I want to test
- I can find nobody who will admit to understanding (or writing) this code
Doubtless there are more indicators than these to tell you that you’re dealing with legacy code, but these are red flags for sure. So what are some high-level techniques that we can use to help bring legacy code into the present? Well, here are a few:
Under The Rug
We’ve all done it at home…there’s that last little bit of something that you missed with the broom. So, you lift up the corner of the living room rug and kick it under there with your foot. While your mom probably wouldn’t approve, sometimes with code this can be an acceptable approach, provided you acknowledge what you’re doing. What I mean by sweeping it under the rug is that you wrap a section of “legacy code” in a bit of more modern, testable code. That way you can continue to use the legacy sections, but the rest of the application (and your tests) only ever talks to a cleaner interface.
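To make the idea concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: `legacy_calc` stands in for some crusty routine with positional flags and a pipe-delimited string result, and `PricingService` and `Quote` are hypothetical names for the clean wrapper around it.

```python
from dataclasses import dataclass
from typing import Callable


def legacy_calc(cust_id, amt, flag1, flag2):
    # Hypothetical legacy routine: cryptic flags, string-encoded result.
    if flag1 == 1 and amt > 100:
        return "OK|" + str(round(amt * 0.9, 2))
    return "OK|" + str(round(amt, 2))


@dataclass(frozen=True)
class Quote:
    customer_id: int
    total: float


class PricingService:
    """Thin wrapper: callers see typed inputs and outputs, not the legacy quirks."""

    def __init__(self, legacy_fn: Callable[..., str] = legacy_calc):
        # The legacy function is injected so a test can pass in a fake.
        self._legacy_fn = legacy_fn

    def quote(self, customer_id: int, amount: float, loyal: bool = False) -> Quote:
        raw = self._legacy_fn(customer_id, amount, 1 if loyal else 0, 0)
        status, _, value = raw.partition("|")
        if status != "OK":
            raise ValueError(f"legacy pricing failed: {raw!r}")
        return Quote(customer_id=customer_id, total=float(value))
```

The mess is still there under the rug, but new code only depends on `PricingService.quote`, and a test can swap in a fake `legacy_fn` without touching the old routine at all.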
The Strangler Application
This is a pattern that was introduced to me in the Hanselminutes podcast. The basic premise is an extension of the “Under the Rug” concept. But instead of just wrapping the code in a clean exterior, we actually take that opportunity to slowly “choke” out the old code.
One technique that can be employed to slowly transform the codebase is using the “I” in SOLID…Interface Segregation. Applying one or more interfaces to each legacy class allows you to incrementally swap out implementations with clean, new code.
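As a rough sketch of what that swap might look like (in Python, with invented names throughout), `TaxCalculator` below is the narrow interface, `LegacyTaxCalculator` is a thin adapter over the old code, and `NewTaxCalculator` is the clean replacement. None of these come from a real codebase, and the 7% rate is just a placeholder.

```python
from typing import Protocol


class TaxCalculator(Protocol):
    """Narrow interface: only the one operation callers actually need."""

    def tax_for(self, amount: float) -> float: ...


class LegacyTaxCalculator:
    """Adapter over the old code (hypothetical stand-in)."""

    def tax_for(self, amount: float) -> float:
        return round(amount * 7 / 100, 2)


class NewTaxCalculator:
    """Clean replacement that satisfies the same interface."""

    RATE = 0.07  # placeholder rate for the example

    def tax_for(self, amount: float) -> float:
        return round(amount * self.RATE, 2)


class Invoice:
    def __init__(self, calculator: TaxCalculator):
        # Invoice depends on the interface, never on a concrete class.
        self._calculator = calculator

    def total(self, amount: float) -> float:
        return round(amount + self._calculator.tax_for(amount), 2)


def make_calculator(use_new: bool) -> TaxCalculator:
    # Flip this per feature or per environment as the migration proceeds.
    return NewTaxCalculator() if use_new else LegacyTaxCalculator()
```

Because `Invoice` only knows about `TaxCalculator`, you can run both implementations side by side, compare their results, and retire the legacy one once you trust the new one.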
Embrace Continual Design
One thought that came out in the Hanselminutes episode was that design is never over, and I think this is really important to realize. Even in production/support mode, design is something that needs to be addressed. When extending an application, are we setting it up for continued success and maintainability? When refactoring something or transforming it into un-legacy code, are we taking care that it won’t regress back into legacy code? If a team is focused on continual design, it’s less likely for code to become legacy in the first place.
Posthumous Test Coverage
A fourth approach to legacy code is testing the crap out of it. One could even make the argument that this is a prerequisite to the previous three approaches. Spending a lot of time covering legacy code in tests has two major benefits. First, it helps the developer really understand the intent of the code: its style, quirks, and patterns. Second, it does what tests do best…it helps make code changes safer. If a codebase is adequately covered in tests, then making changes becomes less scary and the risk of regression is reduced.
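One way to start is with tests that simply pin down whatever the code does today, quirks included (often called characterization tests). Here is a small Python sketch; `legacy_shipping_cost` is a made-up function standing in for real legacy code, and the expected values were obtained by running it, not by reading a spec.

```python
def legacy_shipping_cost(weight_kg, express):
    # Hypothetical legacy function with an undocumented quirk.
    cost = 5 + weight_kg * 2
    if express:
        cost = cost * 1.5
    if weight_kg > 20:
        cost = cost - 3  # nobody remembers why this discount exists
    return cost


# Each case records observed behavior: (weight_kg, express, observed cost).
OBSERVED_CASES = [
    (1, False, 7),
    (1, True, 10.5),
    (25, False, 52),
    (25, True, 79.5),
]


def test_shipping_cost_matches_observed_behavior():
    for weight_kg, express, expected in OBSERVED_CASES:
        actual = legacy_shipping_cost(weight_kg, express)
        assert actual == expected, (weight_kg, express, actual)
```

The point is not that the heavy-package discount is correct. The test records that it exists, so any refactoring that accidentally changes it fails loudly instead of slipping into production.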
Forklift
Lastly, there is one approach to legacy code that is rarely mentioned, and rightly so: it’s the most aggressive and can be the most costly. Start over…sometimes it’s the best approach.
Once in my life I had to deal with a massive codebase that had been through many, many cycles of attempted refactorings. The years had left this poor application in a state of wild and unruly disrepair. After discussing all the options with the client, it became evident that actually starting over was the best (and in this case, cheapest) way to accomplish the goal.
Starting over can be daunting, but if you attack it in a practical and measured way, and possibly mix it with the other techniques described here, it can be liberating. It’s important to have a good requirements phase when starting over; you can’t just look at legacy code and start rewriting it. You also need to have all the trappings of a well-staffed project in place, including strong QA.
~~
So those are some thoughts on legacy code. As always, I’d appreciate hearing what everyone else thinks…what other techniques have you used?