Refactoring And Rewriting

Refactoring is cheaper than rewriting. This belief seems to be at the heart of XP practices, so perhaps it should be held up to the light.

I'm motivated by the first couple of comments of TimeToDoItOver. On the one hand: if you don't have time to do it right, when are you going to have TimeToDoItOver? On the other: We're not smart enough to do it right the first time, nor dumb enough to believe we'll get to do it over. So we refactor it as we go. Notice the response is talking about refactoring not rewriting. At least some of the arguments are due to confusing the two. See also HowDoAntsWalkInaStraightLine and a few other places.

Refactoring is cheaper than rewriting. With refactoring you are just moving old code around - in some sense you are not writing new code, or inventing new algorithms. Mostly you are adding layers of indirection and decoupling, and InformationHiding - what some call ShieldPattern(s). Once the shields are in, rewriting can take place behind them.

It should be cheap in any system, but some make it cheaper than others. Renaming a method is easier, for example, if your environment will quickly tell you every place it is used, and bring up an editor window on each one so that you can examine the context. Renaming classes is easier in latantly typed languages, where you don't have to update every variable declaration and method argument to have the new type name. Refactoring pretty much relies on the infrastructure of object oriented programming - the implicit indirection of message-sends, the substitutability of inheritance. ExtremeProgramming is partly a reaction to the realisation of this cheapness.

Refactoring is the moving of units of functionality from one place to another in your program. It encompasses things as simple as renaming methods, and as complex as adding a helper class, moving methods to classes where they better belong, and creating subclasses and superclasses to reduce the overall amount of code in the system.

Refactoring has as a primary objective, getting each piece of functionality to exist in exactly one place in the software. Once that has happened, if there is something wrong with a particular method, there's just one method to "rewrite". What emerges is that continuous refactoring makes most rewriting unnecessary.

KentBeck has written a short paper where he refactors a procedural implementation of "100 bottles of beer on the wall" until it becomes a very elegant object-oriented solution. At no point does he ever break it ... at every step the program still sings the song. (Kent, where is that paper?)

What I've learned in this focus on refactoring, over the past couple of years, is that there is very little that really needs the big rip-it-out-and-do-it-over that I've felt I had to use in the past. Proper factoring and small steps can do, what, 90% of the work that needs doing. The only couple of places where we have had a "need" to do something radical on C3 are places where we waited too long to break something up ... if we had started when we first felt Smalltalk resisting, we wouldn't have had to do radical surgery even on those two.

Best of all, often they won't let you rip it out and do it over. Refactoring you can do as you go along! --RonJeffries

"Refactoring has as a primary objective, getting each piece of functionality to exist in exactly one place in the software." - although it's clear that having each piece of functionality exist in exactly one place, is not a sufficient condition for WellFactoredCode.

Another part of it would seem to be, "each routine does only one thing." This is what leads to a reduction in comments - the name of the routine can be the name of that one thing. This naming doesn't work if it does two things. It also leads to potential reuse because it avoids some coupling.

It's interesting that XP describes alternating cycles of refactoring and rewriting (or at least, writing, ie adding new code). Further, that the refactoring is at least sometimes directed towards enabling specific new code to be added. -- DaveHarris

More commonly we add the capability, then refactor. Yesterday a team added some important new function (paying advance paychecks in a system that only pays the current one) by adding three subclasses each with only one method. They showed it to me, we talked about it, then caused the bottom-most method to forward to the first of the three subclasses. Removed the other two. That's how it usually goes: the code is well-factored when you start. You add your capability, then see whether refactoring it to make it better is called for. --RonJeffries

OK - some of the examples on these pages have it the other way about, though. -- DaveHarris

having each piece of functionality exist in exactly one place, is not a sufficient condition for WellFactoredCode Yes, it is. It may even be sufficient for having good design. Sure, you can screw up the naming, but that's what the CodingConventions (SystemOfNames, IntentionRevealingSelector, RoleSuggestingVariable, SystemMetaphor, SimpleSuperclassName, QualifiedSubclassName) are for. Your design is there to give you one and only one place for each bit of logic in the system. If you can honestly say you satisfy OnceAndOnlyOnce, even to the smallest details, you have a system that is perfectly adapted to do what you need to do and (this is the big part) handle the kind of variation you have to handle.

One of the concerns people raise about XP is that it might produce "brittle, fragile, just-barely-good-enough code". This is why that isn't true. You need to support a certain amount of variation in your code, in order for it not to be brittle, fragile, and just-barely-good-enough. If you refactor until you satify OnceAndOnlyOnce, your system does support variation, but only the kind and amount that actually occurs.

The emergent property of such code is that change becomes easier over time. The paradox is that you don't spend gallons of skull sweat making it so up front. Instead, the system uses you to evolve exactly those sophisticated mechanisms it needs. The rest of it stays simple. --KentBeck

My comment was motivated by pages like CommentExample, which shows a refactoring that does not reduce repetition of code - at least that's how it seems to me. -- DaveHarris

I can't speak for Kent (who could), but fact is we refactor based on a number of clues from Smalltalk. For example, if we find that some method is sending multiple messages to the same object, we use that as a clue that the other object should be supporting a bigger chunk of functionality that does those two things (perhaps differently) internally to the receiver. This is certainly a refactoring. It may, in the longer run, reduce duplicate code when the next person uses that method, but it might not reduce duplicate code in the instant. I think' I think that you do need, as is suggested above, OnceAndOnlyOnce, plus aa certain amount of "each method does only one thing". This latter notion isn't entirely clear to me: obviously the method I just described does two things; but it performs one function, with the meaning of whatever those two things are. (If they didn't make sense together, we probably wouldn't make the refactoring to begin with.) --RonJeffries

If I write everything in one large chunk of code using a sufficient number of goto's, I am writing everything OnceAndOnlyOnce, so it must be good code, right? :-)

In a real app you can't make it come out that way. Try it. Don't forget you can only do the assign of the goto in one place. --RonJeffries

So, I would add the condition that this "Once" should be encapsulated (more generally: shielded -> ShieldPattern) in some way (method, object, bridge, interface...). --FalkBruegmann

If it's encapsulated, eg in a routine, it then satisfies the extra "each routine does only one thing" condition which KentBeck seems to be saying is unnecessary. -- DaveHarris

"In a real app you can't make it come out that way." When compiled/interpreted to machine instructions, even C3 is reduced to humble goto's, and that's a real app, isn't it? <naughty grin>

I guess I made my point poorly. If I define a method in one place, and Smalltalk (or C) decides to expand it in line, the source code meets the OnceAndOnlyOnce criterion, and the object code doesn't. That's not a problem, it's why G*d gave us compilers.

What I was trying to say with "you can't make it come out that way" is that if you try to write an app of any size as a monolith, you can't get it to meet the OnceAndOnlyOnce criterion. I freely state that I can't prove that ... but in any language I know of, reducing all the duplicate code to one occurrence will (tend strongly to) produce "cohesion" i.e. each method/subroutine/module will (tend strongly to) wind up doing just one logical function. --RonJeffries

I see! So OnceAndOnlyOnce and "each routine does only one thing" are probably really two sides of the same coin. After all, you can turn above sentence around to read: "Striving for cohesion, that is, having each method/subroutine/module perform just one logical function, will (tend strongly to) reduce redundancy (by way of making it obvious and easy to remove), i.e. each logical function will (tend strongly to) be carried out by just one piece of code.

CodeNormalization picks up these ideas (not the goto's, the others!) and looks at them from a different angle. --FalkBruegmann

EditText of this page (last edited May 20, 2000)
FindPage by browsing or searching

This page mirrored in WikiPagesAboutRefactoring as of April 29, 2006