Reworking is what school children might call, "Do over!" It means, quite simply, redoing some work done previously. On this site it has two main usages:
1. Reworking writing.
As explained briefly on WhatIsRefactoring
, it is possible to refactor the semantics (denotative meaning) of a work. However, natural language isn't limited to semantics (or functionalism). It contains connotative meaning as well, normally called style (e.g. rhetoric). The style of a work comes from the structure of the writing, the relationship of the words. It is strictly
not possible to refactor
a piece of writing and preserve the style.
Consider the difference between, "Quickly, John ran to Jane." and "John ran quickly to Jane." Functionally the two sentences are equivalent, but stylistically they are quite different. The first sentence emphasizes the speed at which John ran whereas the second (more weakly) emphasizes the characters in the narrative.
But more practically, refactoring text is really rare (although it is possible, say, by moving a section of discussion to a separate page on a wiki). Reworking is much more common. Restating what was said, removing what's not interesting, explaining points in further detail.
So, when someone says they are refactoring a page, they are more likely to be reworking it so it reads better.
A big goal when reworking is to preserve the relationship of the work under revision to other works in the system. Thus, when reworking a page on a wiki, one doesn't want to have to adjust the relationships it has to other pages. (This could be false if the discussion becomes meta, but we'll ignore those cases because they are self-defeating anyway.)
Think about renovating a basement bathroom in the house. A good contractor can do the job without changing the upstairs bedrooms. A really good contractor may not even affect the surrounding rooms.
2. Reworking code.
Typically, this is done in order to improve the system, say by cleaning up code or by optimizing. With respect to cleaning up code, reworking fits on the spectrum between CowboyCoding
from refactoring to a little bit towards CowboyCoding
. Strictly speaking, refactoring is a special case of reworking (see below, WhatIsRefactoring
Refactoring is good when you want to guarantee functional equivalence. Say when you are confident in how a subsystem works
but not how it is written. Strict refactoring prevents additional bugs from being introduced. On the other hand, it preserves existing bugs.
The looser version of reworking is to just rewrite it to improve the quality. Really, this almost assumes that the programmer is perfect (ala CowboyCoding
), and thus won't introduce bugs. But it's also limited in scope, thus reducing the amount complexity that the programmer has to worry about. So, by virtue of its increased flexibility, it has the potential
to be more optimal than refactoring, even if refactoring is guaranteed to work. People who can afford to trade defects for time to market may appreciate this approach. Or those who understand the solution (and how to write it) well enough that any bugs introduced will be easy to detect. How that's done is another issue, of course. It could be false, too, as happens whenever one moves out of guarantees into the riskier world of contingent flexibility.
Another question is at what level do we want to guarantee functional equivalence. If you consider the system to be an onion of interfaces (not really true in many cases, but let's pretend), then you can select one layer and lock that down with RegressionTests
. Anything beneath that you can rework without maintaining equivalence to corresponding units in the previous revision of the system. This once again balances flexibility against scope. While this may seem like refactoring, strict refactoring would preserve equivalence down to the atomic units in the system as well.
Why do you say this? Have you read MartinFowler's book? There are lots of places in it where equivalence is not preserved "down to the atomic units in the system." As I see it, the salient difference between refactoring and reworking (as described above) is that refactoring focuses on the idea of taking small steps and running the tests after each step, which makes it very easy to find where a bug crept in.
- Yes, I've read his book. If you read WhatIsRefactoring, there is a distinction between strict definitions and practical applications of refactoring. As it says above, "strict refactoring would preserve equivalence down to the atomic units," emphasis added. What people normally call refactoring is different, but this page is to help clarify the distinctions. Actually, please do define the "grey zone" between the strict definition and the sane definition somewhere. That would improve things immensely.
- There's only one definition of refactoring: "Refactoring is the process of changing a software system in such a way that it does not alter the external behavior of the code yet improves its internal structure." (cf. WhatIsRefactoring)
- MartinFowler follows the "strict" definition (which is also the "sane" definition) throughout the book. Not everything in the book is an example solely of refactoring. He shows how to refactor and change functionality in interleaved, small steps. This refactor-change interleaving is one way to carry out a reworking. But refactoring and reworking don't compete head-to-head.
- When you refactor, you don't change the external behavior of the code. So you need to pick a boundary -- the interface of some little subsystem, say. Then you don't change anything about the subsystem's external (outside that boundary) behavior, but you do change things about its internal (inside that boundary) behavior. Usually, you end up picking something you want to change (a method name, say) and this gives you an implicit boundary (every piece of the system that calls that method, but nothing else). This could get messy if you tried to do too much at once, so Martin continually warns the reader not to take steps that are too big.
- It can't be functionally equivalent if the functional points change. The functional points are the atomic units, not the lines of code. Semantic atoms, not syntactic atoms. This is just like how on NarrowTheInterface, the interface talked about isn't the set of method signatures, but some semantic boundary. Does that make sense? I guess an example would do. You can lock an API down, but the side effects can change. The side effects are often the problems in the first place after all. I don't think we're disagreeing. I think that what you're saying should be folded into the mainline of this page. Care to do that?
One thing that reworking can do, as well, is remove defects or add features. This can be easy to do even
by cleaning up code. It may be difficult to preserve a defect from one revision to the next, especially one that you weren't necessarily aware of. Or it may be more elegant to add functional points. If you believe in complete RegressionTests
, you have to be mindful of that latter case. It's possible to make all the RegressionTests
pass while ending up with a superset of features; thus, you have to write new tests for anything you've consciously decided to add or anything that emerges from the system and gets used.
If you're refactoring and you find a bug, you stop refactoring (you have a running system at any point), fix the bug, maybe fix or extend the UnitTests, then go back to refactoring. That's one of the
good things about taking small steps: you don't have to keep track of whether you're trying to work around something or change it.
More generally, the key insight of refactoring is that you can separate your behavior into two modes: 1) adding features/removing bugs and 2) rearranging the code to make it easier to do (1). Switching back and forth between modes rapidly is not a problem, but paying attention to when you're in each mode pays off. Many people who have tried refactoring like this separation. When is reworking (which doesn't have this separation) preferable to refactoring-interleaved-with-changing (the true comparison)?
reworking. What would be nice is some description of how refactoring can be done in the stories section (or links out to the other pages). By the way, where's that reference to a RefactoringCap?
? That's a perfect metaphor/story to describe what you are talking about.
Fundamentally, we can view all linguistic systems as a graph G = (V,E). Each functional (logical) point can be represented as a vertex. Refactoring transforms G to G' = (V,E'), where E' does not equal E. Reworking transforms G to G' = (V',E') where G' does not equal G, but V' may or may not equal V and E' is not equal E. Clearly, refactoring is a special case of reworking when V' = V.
However, if G was abstracted as a node in a metagraph H, a good reworking on G will make a commensurate transformation on H to H. That is, H shouldn't change.
How do you rework cruft in the system? Do you refactor or do something else? Add your story below.
I don't like BigDesignUpFront
because I can't think that far ahead. Usually, at most I do a small design up front to map out what the problem is and my initial solution. Then I jump in and code it. I don't like coding things in isolation, though. I will immediately hook it into the rest of the system so I can run it. If I can't, I'll rely on a test harness, but I don't like having only tests. Running the code is important. Running the code as it would be run is even more important.
The first revision of the system is usually bad. Sometimes it's really
bad. So I may just delete it. Then I'll rewrite it. BurnTheDiskpacks
I may still have problems solving the problem. The new revision may suck. So I'll just delete that too. As I do this, deleting the interface calls from the existing system becomes annoying, as well as implementing large, complex interfaces. So, the interface to the subsystem becomes simpler. NarrowTheInterface
, I say.
Eventually, I decide I understand the interface well enough that I'm not going to change it. So, I delete the implementation. This leaves me with a set of stubbed methods or functions. As always, I start to write from the interface in, only writing what I need. I borrow on my experience with the previous revisions. I know
I'll need a stack. I know
there will be incidental system code I'd like to separate from the implementation.
This is how I rework to reduce complexity. There's more to it than this, like how to make simple interfaces, and design strategies to contain and reduce complexity. For instance, there are SelfDocumentingCode
and (occasionally) DesignPatterns
. But this is the overall heuristic.
Note that this only works if the subsystem has an interface that fully defines its perimeter (say via TightGroupsOfClasses
or the FacadePattern
). Note that you cannot do this without a verification method to exercise the interface. Note that bugs will be introduced if too large a chunk is reworked. Note that interfaces that are complicated--say with a lot of boundary conditions, exceptions thrown, methods defined--will be too hard to implement. The ideal is to simplify until it's almost impossible to do it wrong. -- SunirShah
: I've just spent a year designing and implementing a reasonably complex system. Now I know a lot more than I did a year ago. The system works, but has a lot of warts. Some fundamental architectural decisions were wrong. Fixing them would break just about every test in the system (nearly 200 of them, both functional and unit). Am I stuck forever? Am I on the inexorable path to bit rot? Or am I just being a spoiled brat, and I should take my lumps and keep evolving this system, since it meets CustomerNeeds?
? -- AlainPicard
, architect on an XP project at Memetrics.
: Do nothing.
- Advantages: Free
- Disadvantages: You'll continue to accumulate TechnicalDebt in your architecture. The long-term cost to relieve that debt will be larger than the short-term savings. I suspect that the growth rate of the debt is at least geometric and possibly exponential. The debt will eventually accumulate to the point that bugs are introduced more quickly than they are removed. At this point, only a dedicated effort to EliminateTechnicalDebt will save the software.
- Evaluation: This is a sound strategy if the software is going to be thrown away soon. But people often underestimate the life of software.
: Rewrite. Set a new architectural direction and switch everything over in one marathon coding session.
- Advantages: Easy, fun (at first).
- Disadvantages: High risk. No additional software releases can be made while the rewrite is taking place unless you use expensive revision branching. It will take longer to rewrite than you thought. The larger the rewrite, the more likely it is that you'll never get tests passing again.
- Evaluation: This is practical if the amount of work to be done is small. The more I learn, though, the more I limit the amount of time. A few days used to be the limit, then half a day. Nowadays I try to keep the rewrite down to ten minutes or less. I have used this approach frequently in the past (I'm a slow learner) and have often gotten stuck so badly that I had to discard everything I did. Other times, I've persevered and ended up with a whole new set of technical debt. Sometimes it works, but it often doesn't.
: Incrementally evolve the architecture. Set a new architectural direction and incrementally switch over code blocks, making sure to OperateInTenMinuteCycles
- Advantages: Cheap, effective, low risk.
- Disadvantages: Patchy code. Part of your code will be on one architecture and the other part on another. There's usually too much work to migrate everything in one iteration. If you're not careful, you can end up with several different architectures at the same time. This approach requires a lot of discipline.
- Evaluation: Despite the disadvantages, I've found this to be the best solution. The key is to aggressively apply this approach as much as your schedule allows. Also, be sure that your new architecture is simple! Make it as simple as possible and don't plan for the future. It's much cheaper to flex simple architectures than complex ones, and you will find additional improvements after you've worked within your new architecture for several months. Continue to apply this approach for the remainder of your project as new architectural improvements are discovered.
All of the above solutions are based on my experience with applying them to non-trivial real-world projects. I wrote up a paper about using solution three; you can find it linked from the DenaliProject
: Forget architectural purity. Make changes as needed, refactor, and let the architecture evolve on its own.
- Advantages: You don't have to tackle the job all at once and you will not let architectural changes get in the way of addressing the customer's needs. You only make changes as needed. You don't need to have an architectural expert to direct every change and the system can grow beyond the understanding of any individual.
- Disadvantages: You can't be the "expert" who tells everyone else what the one true way should be.
- Evaluation: You can both improve the code and continue to meet customers' demands. You do not need to do a secret architectural change that may involve more risk than value. If something is ugly, but stable, leave it be. When something needs to be changed, refactor, make the change, then refactor again, but don't make changes beyond the scope of the request.
Jim, thank you
for an excellent
exposition. After reading this, it's clear that solution 3 is a no-brainer. But I do not want to embark upon it unless I know that I'll get to finish it, so I'll have to wait until such a time as is fitting to put in 2-3 weeks of solid work into the system (i.e. after release 2, or even 3 of the software). One question though, you say: be sure that your new architecture is simple.
The thing is our current
architecture is quite simple, but I feel it's hackish. This is one of these situations where it's not clear if I'm trying to satisfy my own sense of elegance (something some wikizens call stealing from the customer) or I'm having the foresight to prevent a future disaster (which, as architect, is my job, after all.) So, for now I'll proceed along 1) Do nothing, and, when/if it becomes possible, 3) incremental evolution. --AlainPicard
Glad I could help! The question of how much foresight to include in system architecture is particularly interesting to me. Would you mind if we discussed your project's architecture further off-wiki? My email is firstname.lastname@example.org. --JimLittle Not at all. I'll drop you a line soon. --ap
I think we should apply some context here. In a manufacturing context, rework occurs when one of the things is not like the others. You either need to rework or scrap the item produced. Changing a design, simplifying the design, modifying an assembly line to produce a variation, etc., are not considered rework. If you CD burner makes a mistake and miscopies bytes in a file, hand patching the CD would be rework. Changing software to add a capability, add clarity, or even to fix a bug would not fit with the manufacturing definition of rework. -- WayneMack