One thing I've noticed is that a lot
of programmers say they're practicing refactoring. However, if one observes their behavior, many of them don't really use the technique described in MartinFowler
book. Fowler talks about two distinct aspects of refactoring:
- Transforming the structure of code to improve it.
- Breaking each major transformation (ex. ReplaceConditionalWithPolymorphism) into a series of small steps that allow the programmer to compile and test frequently.
Most programmers I know (and myself, frequently) forget the second aspect. This leads to lengthy sessions where the system won't pass the tests (or even compile). In my case, these are often high-stress sessions, and I've noticed they frequently end with just chucking everything and starting over.
This topic came up on the DoesYagniInterruptFlow
discussion, but I've been interested in the issue for some time.
My questions are:
- What is the opinion of this forum?
- How often do you take small steps?
- Does language have anything to do with it?
- Do refactoring browsers change anything?
Language is definitely related to it. A statically typed language is more likely to demand big steps then a dynamically typed one. Also, since there's an overhead to the busy work you have to do to satisfy your compiler, you're more likely to take bigger steps to reduce busy work.
On reflection, type-change situations are the most common cause for rat-hole
refactorings that I've recently been down (I'm working in Java). It's often really simple things (like RenameMethod
) where the number of repetitive changes throughout the code-base can be huge (for a frequently used method). Often, it may take a number of minutes to just to get the compiler to run. My guess is that these are also the refactorings that a RefactoringBrowser
would be most effective at performing -- GeoffSobering
Fowler himself in the book writes that he only follows all the steps when he is doing something complicated, or when he tries a refactoring and things stop working. He suggests others do the same, although one should be more careful the first few times one does a particular refactoring.
When I refactor, I don't always follow the exact mechanics Martin describes. When I don't, I often regret it. I do usually manage to conform to the guideline of never being more than a few minutes away from a clean integration with all tests passing. This makes high-stress, lengthy refactoring sessions (i.e. anything beyond one hour) a rare occurrence for me. If you're able to chuck everything and start over though, you've already mastered the hard part. Congrats. -- LaurentBossavit
The important thing is not whether you break refactorings into small steps or not. It is whether you return the code to a working state frequently or not. Small steps help to achieve this.
I rarely go for more than a couple of minutes without running the tests when refactoring. When I, once in a while, use an interpreted language I can get those cycles down to seconds. If I'm adding new features I use longer cycles, but I try to shorten them too. -- AndersBengtsson
Unfortunately the code most often in need of refactoring is gnarly old legacy code for which there are no tests.
Clean code with tests needs quite a lot of refactoring, too, otherwise it quickly turns into gnarly old legacy code with too few tests. -- lb
And of course, one always writes tests to capture the insights obtained from the CodeArcheology? of the legacy code base. -- gss
See also UnitTestingLegacyCode
You can certainly split refactoring into many small steps using a static typing language. We do this in C++. It's easier when you don't have big chunks of legacy code, so it's worth taking time out to rip such code apart when you first get it. We do that too. A lot of the time, not much is left afterwards apart from a few core algorithms.
We have a few perl scripts to automate particularly troublesome refactorings, in particular MoveClass
. Not as good as a full-on browser, but better than nothing.
I used to relish multi-hour, multi-area refactorings that strained my brain to breaking point. I now get more done, in less time, with higher quality, simply by taking smaller steps. This is not very intuitive however, and most people never stop trying to do as much as possible in each session.
I think this is at least partly because they fear the pain of integrating their changes with the code repository. The remedy to this is to make integration less painful, of course - a big part of this can be doing some work to decouple areas that different teams are working on.
...and one way to do that is to integrate more often. ;)
I'm always trying to persuade people of that one. :)
That's exactly one of the major benefits (IMHO) provided by small steps. You have many more opportunities to integrate. At the most extreme (i.e. with a zero-cost integration process) you could integrate after every step. With a non-zero-ocst integration process, you have many more points to choose to integrate (ex. a team-member asks if they can start a task that depends on some of the work done earlier in the task you're small-stepping through). Viva Integration
One issue that can dissuade people from integrating often is a high-cost (i.e. lengthy) integration cycle. I'm sensitive to that situation because the project I'm currently working on has an integration procedure that takes, on average, 40 minutes (and I know this because we've been timing it for the past few days). Needless to say, we're working pretty hard to fix the integration-time problem, but for the time being it takes a fair amount of discipline to integrate early and often
's baby-step approach strikes me as RefactoringForDummies?
. The refactoring philosophy has made my code much more readable and maintainable. I refactor in larger chunks, and rarely have difficulty keeping things stable. I don't fault MartinFowler
; it's easier to combine small steps than to split up a large step. -- JaredLevy
See also: ContinuousIntegration RefactoringInVerySmallSteps