What Is Reworking

Reworking is what school children might call, "Do over!" It means, quite simply, redoing some work done previously. On this site it has two main usages:

1. Reworking writing.

As explained briefly on WhatIsRefactoring, it is possible to refactor the semantics (denotative meaning) of a work. However, natural language isn't limited to semantics (or functionalism). It contains connotative meaning as well, normally called style (e.g. rhetoric). The style of a work comes from the structure of the writing, the relationship of the words. It is strictly not possible to refactor a piece of writing and preserve the style.

Consider the difference between, "Quickly, John ran to Jane." and "John ran quickly to Jane." Functionally the two sentences are equivalent, but stylistically they are quite different. The first sentence emphasizes the speed at which John ran whereas the second (more weakly) emphasizes the characters in the narrative.

But more practically, refactoring text is really rare (although it is possible, say, by moving a section of discussion to a separate page on a wiki). Reworking is much more common. Restating what was said, removing what's not interesting, explaining points in further detail.

So, when someone says they are refactoring a page, they are more likely to be reworking it so it reads better.

A big goal when reworking is to preserve the relationship of the work under revision to other works in the system. Thus, when reworking a page on a wiki, one doesn't want to have to adjust the relationships it has to other pages. (This could be false if the discussion becomes meta, but we'll ignore those cases because they are self-defeating anyway.)

Think about renovating a basement bathroom in the house. A good contractor can do the job without changing the upstairs bedrooms. A really good contractor may not even affect the surrounding rooms.

2. Reworking code.

Typically, this is done in order to improve the system, say by cleaning up code or by optimizing. With respect to cleaning up code, reworking fits on the spectrum between CowboyCoding and DesignByContract from refactoring to a little bit towards CowboyCoding. Strictly speaking, refactoring is a special case of reworking (see below, WhatIsRefactoring).

Refactoring is good when you want to guarantee functional equivalence. Say when you are confident in how a subsystem works but not how it is written. Strict refactoring prevents additional bugs from being introduced. On the other hand, it preserves existing bugs.

The looser version of reworking is to just rewrite it to improve the quality. Really, this almost assumes that the programmer is perfect (ala CowboyCoding), and thus won't introduce bugs. But it's also limited in scope, thus reducing the amount complexity that the programmer has to worry about. So, by virtue of its increased flexibility, it has the potential to be more optimal than refactoring, even if refactoring is guaranteed to work. People who can afford to trade defects for time to market may appreciate this approach. Or those who understand the solution (and how to write it) well enough that any bugs introduced will be easy to detect. How that's done is another issue, of course. It could be false, too, as happens whenever one moves out of guarantees into the riskier world of contingent flexibility.

Another question is at what level do we want to guarantee functional equivalence. If you consider the system to be an onion of interfaces (not really true in many cases, but let's pretend), then you can select one layer and lock that down with RegressionTests. Anything beneath that you can rework without maintaining equivalence to corresponding units in the previous revision of the system. This once again balances flexibility against scope. While this may seem like refactoring, strict refactoring would preserve equivalence down to the atomic units in the system as well.

Why do you say this? Have you read MartinFowler's book? There are lots of places in it where equivalence is not preserved "down to the atomic units in the system." As I see it, the salient difference between refactoring and reworking (as described above) is that refactoring focuses on the idea of taking small steps and running the tests after each step, which makes it very easy to find where a bug crept in.

Yes, I've read his book. If you read WhatIsRefactoring, there is a distinction between strict definitions and practical applications of refactoring. As it says above, "strict refactoring would preserve equivalence down to the atomic units," emphasis added. What people normally call refactoring is different, but this page is to help clarify the distinctions. Actually, please do define the "grey zone" between the strict definition and the sane definition somewhere. That would improve things immensely.

There's only one definition of refactoring: "Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure." (cf. WhatIsRefactoring)

MartinFowler follows the "strict" definition (which is also the "sane" definition) throughout the book. Not everything in the book is an example solely of refactoring. He shows how to refactor and change functionality in interleaved, small steps. This refactor-change interleaving is one way to carry out a reworking. But refactoring and reworking don't compete head-to-head.

When you refactor, you don't change the external behavior of the code. So you need to pick a boundary -- the interface of some little subsystem, say. Then you don't change anything about the subsystem's external (outside that boundary) behavior, but you do change things about its internal (inside that boundary) behavior. Usually, you end up picking something you want to change (a method name, say) and this gives you an implicit boundary (every piece of the system that calls that method, but nothing else). This could get messy if you tried to do too much at once, so Martin continually warns the reader not to take steps that are too big.

It can't be functionally equivalent if the functional points change. The functional points are the atomic units, not the lines of code. Semantic atoms, not syntactic atoms. This is just like how on NarrowTheInterface, the interface talked about isn't the set of method signatures, but some semantic boundary. Does that make sense? I guess an example would do. You can lock an API down, but the side effects can change. The side effects are often the problems in the first place after all. I don't think we're disagreeing. I think that what you're saying should be folded into the mainline of this page. Care to do that?

One thing that reworking can do, as well, is remove defects or add features. This can be easy to do even by cleaning up code. It may be difficult to preserve a defect from one revision to the next, especially one that you weren't necessarily aware of. Or it may be more elegant to add functional points. If you believe in complete RegressionTests, you have to be mindful of that latter case. It's possible to make all the RegressionTests pass while ending up with a superset of features; thus, you have to write new tests for anything you've consciously decided to add or anything that emerges from the system and gets used.

If you're refactoring and you find a bug, you stop refactoring (you have a running system at any point), fix the bug, maybe fix or extend the UnitTests, then go back to refactoring. That's one of the good things about taking small steps: you don't have to keep track of whether you're trying to work around something or change it.

More generally, the key insight of refactoring is that you can separate your behavior into two modes: 1) adding features/removing bugs and 2) rearranging the code to make it easier to do (1). Switching back and forth between modes rapidly is not a problem, but paying attention to when you're in each mode pays off. Many people who have tried refactoring like this separation. When is reworking (which doesn't have this separation) preferable to refactoring-interleaved-with-changing (the true comparison)?

Refactoring is reworking. What would be nice is some description of how refactoring can be done in the stories section (or links out to the other pages). By the way, where's that reference to a RefactoringCap?? That's a perfect metaphor/story to describe what you are talking about.

Theoretic mumblings.

Fundamentally, we can view all linguistic systems as a graph G = (V,E). Each functional (logical) point can be represented as a vertex. Refactoring transforms G to G' = (V,E'), where E' does not equal E. Reworking transforms G to G' = (V',E') where G' does not equal G, but V' may or may not equal V and E' is not equal E. Clearly, refactoring is a special case of reworking when V' = V.

However, if G was abstracted as a node in a metagraph H, a good reworking on G will make a commensurate transformation on H to H. That is, H shouldn't change.


How do you rework cruft in the system? Do you refactor or do something else? Add your story below.

I don't like BigDesignUpFront because I can't think that far ahead. Usually, at most I do a small design up front to map out what the problem is and my initial solution. Then I jump in and code it. I don't like coding things in isolation, though. I will immediately hook it into the rest of the system so I can run it. If I can't, I'll rely on a test harness, but I don't like having only tests. Running the code is important. Running the code as it would be run is even more important.

The first revision of the system is usually bad. Sometimes it's really bad. So I may just delete it. Then I'll rewrite it. BurnTheDiskpacks.

I may still have problems solving the problem. The new revision may suck. So I'll just delete that too. As I do this, deleting the interface calls from the existing system becomes annoying, as well as implementing large, complex interfaces. So, the interface to the subsystem becomes simpler. NarrowTheInterface, I say.

Eventually, I decide I understand the interface well enough that I'm not going to change it. So, I delete the implementation. This leaves me with a set of stubbed methods or functions. As always, I start to write from the interface in, only writing what I need. I borrow on my experience with the previous revisions. I know I'll need a stack. I know there will be incidental system code I'd like to separate from the implementation.

This is how I rework to reduce complexity. There's more to it than this, like how to make simple interfaces, and design strategies to contain and reduce complexity. For instance, there are SelfDocumentingCode, SynchronizationStrategies and (occasionally) DesignPatterns. But this is the overall heuristic.

Note that this only works if the subsystem has an interface that fully defines its perimeter (say via TightGroupsOfClasses or the FacadePattern). Note that you cannot do this without a verification method to exercise the interface. Note that bugs will be introduced if too large a chunk is reworked. Note that interfaces that are complicated--say with a lot of boundary conditions, exceptions thrown, methods defined--will be too hard to implement. The ideal is to simplify until it's almost impossible to do it wrong. -- SunirShah

Quandary: I've just spent a year designing and implementing a reasonably complex system. Now I know a lot more than I did a year ago. The system works, but has a lot of warts. Some fundamental architectural decisions were wrong. Fixing them would break just about every test in the system (nearly 200 of them, both functional and unit). Am I stuck forever? Am I on the inexorable path to bit rot? Or am I just being a spoiled brat, and I should take my lumps and keep evolving this system, since it meets CustomerNeeds?? -- AlainPicard, architect on an XP project at Memetrics.

Solution 1: Do nothing. Solution 2: Rewrite. Set a new architectural direction and switch everything over in one marathon coding session. Solution 3: Incrementally evolve the architecture. Set a new architectural direction and incrementally switch over code blocks, making sure to OperateInTenMinuteCycles. All of the above solutions are based on my experience with applying them to non-trivial real-world projects. I wrote up a paper about using solution three; you can find it linked from the DenaliProject page. --JimLittle

Solution 4: Forget architectural purity. Make changes as needed, refactor, and let the architecture evolve on its own.

Jim, thank you for an excellent exposition. After reading this, it's clear that solution 3 is a no-brainer. But I do not want to embark upon it unless I know that I'll get to finish it, so I'll have to wait until such a time as is fitting to put in 2-3 weeks of solid work into the system (i.e. after release 2, or even 3 of the software). One question though, you say: be sure that your new architecture is simple. The thing is our current architecture is quite simple, but I feel it's hackish. This is one of these situations where it's not clear if I'm trying to satisfy my own sense of elegance (something some wikizens call stealing from the customer) or I'm having the foresight to prevent a future disaster (which, as architect, is my job, after all.) So, for now I'll proceed along 1) Do nothing, and, when/if it becomes possible, 3) incremental evolution. --AlainPicard

Glad I could help! The question of how much foresight to include in system architecture is particularly interesting to me. Would you mind if we discussed your project's architecture further off-wiki? My email is jlittle@titanium-it.com. --JimLittle Not at all. I'll drop you a line soon. --ap

I think we should apply some context here. In a manufacturing context, rework occurs when one of the things is not like the others. You either need to rework or scrap the item produced. Changing a design, simplifying the design, modifying an assembly line to produce a variation, etc., are not considered rework. If you CD burner makes a mistake and miscopies bytes in a file, hand patching the CD would be rework. Changing software to add a capability, add clarity, or even to fix a bug would not fit with the manufacturing definition of rework. -- WayneMack

EditText of this page (last edited August 14, 2010) or FindPage with title or text search