Refactoring is a process of rewriting, reworking and re-architecting of the program’s codebase without changing its observable behaviour. It is often used to prevent or correct code duplication#, preserve the orthogonality# of the software, update the code base on the changes of requirements or better understanding on the underlying technology, or optimise. The ultimate goal of Refactoring is to make the software easy to read, have all logic specified in one and only one place (no duplications#), doesn’t allow changes to alter existing behaviour, and allow only simple conditional logic.
Refactoring speeds up the software development by making the process of understanding a codebase easier. The code should be readable so the next guy who picks up the codebase will not waste much time on comprehending the code before becoming productive. This is with extra benefit where we can spot bugs more easily as we tend to spend more time in debugging#. Make the code speaks its purpose with more clarity. But it might drag the performance due to indirection.
Hunt et al. advised to refactor the codebase sooner than later before the task itself grow more tedious and larger. Regular refactoring should be encouraged as the drastic improvement on the design due to cumulative of small changes on the project (Fowler), and do not be slave to the history. Be ready to replace any code in the project.
A hint to do refactor is when we want to add function (if it is not as easy), fix a bug (because it is hard to spot a bug) and/or doing a code review (especially with pair programming). A simple rule is “Three strikes and you refactor”, meaning if we’ve encountered the duplication or similarities three times we should do a refactor.
However, Refactoring is not a magic cure to all troubles a software engineer will encounter. It is not encouraged by Fowler to refactor an application that is highly coupled to the database schema. It is largely due to the high cost of data migration. Try Persistence Programming instead. Or instead build an intermediate layer between the object model and database model, so changes happen in one model will not affect another.
The user experience should be taken into account before refactoring especially for library developers and/or maintainers on published interface. The changes must be made to be known, and if such changes are scheduled, the users need to know when will that take place. The Pragmatic Programmer
If starting from scratch is an easier approach, don’t refactor. The sign to look at is when the codebase doesn’t work at all and is full of bugs which is impossible to stabilise it. That being said, one could try to refactor the software into smaller components with strong encapsulation, and make a refactor-versus-rebuild decision for one component at a time.
Most importantly, don’t do refactoring when deadline is coming. You’ll not make it in time with refactoring.
Fowler suggested three rules for Refactoring in order to preserve the codebase quality:
- Don’t add new functionality while refactoring
- Set up self-checking tests# for the soon-to-be refactored code
- Only do localised changes which are small as it is less prone to go wrong
Note: Self-checking test should print out “OK” when it passed and a list of failures otherwise.
Note: Don’t add tests unless you miss a case.
As recommended by Fowler, run the tests that exercise the code where refactoring is taking place so that it doesn’t slow down the process.
There are some technologies that work around Refactoring process: