[From ScreechinglyObviousCode, and a number of other places.]
We seem to need a large real-world example of well-factored code, to show people that the idea works in non-toy systems:
Could someone please show me a largish project with this obvious wonderfully refactored code? And I mean let's see the code.
Is there an OpenSource project that could showcase some (mostly) well factored code? -- EricHerman
- HotDraw probably fits the bill: http://sourceforge.net/projects/jhotdraw/ (Perhaps it would be helpful to have a HotDrawDiscussion? on how well factored, maintainable and/or usable it is? (I've seen comments that people sometimes have trouble figuring out how to use and extend HotDraw.))
- JavaUnit isn't a bad place to look either. (Well factored, but certainly not large.)
- Mike Wutka's DTD parser is pretty fabulous, although it isn't "large". http://www.wutka.com/dtdparser.html. (Look at Scanner.java and say that again if you dare. The readNextToken method should be split up. There are lots of chains of ifs that are screaming their desire to be implemented in a more data-directed way: look at isCombiningChar, for instance. Observe that in isBaseChar exactly that is done; something similar should be used for isCombiningChar and its friends, and then all those methods could be improved at once if necessary by switching to a more sophisticated searching technique. Or would all that be violating DTSTTCPW?)
- SqueakSmalltalk sources... it's not perfect, but Squeak includes many built-in applications. Smalltalk-80 was better in that most of the experiments had been removed. (Trying to find "this is the application X code" in the image and "this is the application Y code" in the image is sometimes difficult, because of all the code-re-use going on.)
- A good portion of the Java Core API deserves the appellation, IMHO. As well as the .NET Framework API. Which parts of the standard API are well-factored? Which classes, which version and which implementation?
- The EclipseIde
- TAO ( http://www.cs.wustl.edu/~schmidt/TAO.html ) is large and has some academic papers available regarding the patterns used within. It is not the prettiest thing in the world, using a lot of C++ macros to achieve portability, but most of it does make sense.
- eCos ( http://sources.redhat.com/ecos/ ) is probably a large enough OS. It is sufficiently well-factored that all of the dependencies are statically switchable, so you can use only what you need, and it abstracts a lot of different hardware quite consistently.
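To make the "data-directed" suggestion in the DTD parser comment above concrete, here is a minimal sketch of the idea. It is in Python rather than Java for brevity, and it is not taken from Scanner.java. The names make_range_checker and COMBINING_RANGES are hypothetical, and the table is deliberately abbreviated to a handful of illustrative code point ranges rather than a complete character class.

```python
from bisect import bisect_right

# Hypothetical, abbreviated table of inclusive code point ranges.
# Illustrative only: a real character class would list many more ranges.
COMBINING_RANGES = [
    (0x0300, 0x0345),
    (0x0360, 0x0361),
    (0x0483, 0x0486),
    (0x0591, 0x05A1),
]


def make_range_checker(ranges):
    """Build a predicate that tests membership in a set of inclusive ranges."""
    ranges = sorted(ranges)
    starts = [lo for lo, _ in ranges]

    def check(ch):
        code = ord(ch)
        # Find the last range whose start is <= code, then test its upper bound.
        i = bisect_right(starts, code) - 1
        return i >= 0 and code <= ranges[i][1]

    return check


# Every character class becomes one table plus one shared lookup routine.
is_combining_char = make_range_checker(COMBINING_RANGES)
```

The point of the sketch is that each isXxxChar method reduces to a table, so a change to the search technique (here a binary search) is made once in make_range_checker and every predicate picks it up.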
It's not about being "well-factored", in a static sense. All that matters is if it's GettingBetter or if it's getting worse.
Perhaps. But there is a level of factoring below which it is very difficult to make improvements. See the DebtMetaphor?
Not a very impressive list. And if the code has gone through several iterations we must assume it has been factored enough to be a good example. The implication otherwise is that refactoring is a holy grail that is used as a basis for a methodology but never actually happens.
Or we can assume that it most often happens within the context of an ExtremeProgramming project where the code is written specifically because it has value to a customer and that customer is not giving it away. -- WardCunningham
Maybe. But for the more scientific minded of us proof is required regardless of the reasons why proponents say it can't be provided. I'm reminded of the explanations of why ESP doesn't work in controlled conditions...
Scientific minded people don't require proof; truly scientific minded people conduct experiments and find out for themselves. Proof for anything in this industry is scarce, because it isn't science. ExtremeProgramming works; you only have to try it to find out. Science isn't about proof; it's about being able to repeatedly verify theories via experiments. That's the only proof to be had. Since all the ExtremeProgramming guys say it does work, I guess their experiments are validating the theory. If you don't believe them, then conduct your own experiment; that's what science is about. -- RamonLeon
Or it may even be that if you were to look at an open source project like JBoss you might find some very well factored code (a hunch), but because of the way you're presented the challenge, there's not one soul motivated to help you find it. -- WaldenMathews
Or it could be that it doesn't exist and the attitude is just cover.
Perhaps well factored is an emotional judgement about a complex situation that escapes complete understanding. As such we could expect to know it when we see it but have trouble finding a sharp (scientific) definition. The same could be said of love. The pessimist will say, show me two people in love and I will find the self interest beneath it. He may be right, but that doesn't negate the value of the concept.
A statement of undisputable fact (except by saying that I lie about what I think.)
OK, I will get down to the metal and stop dancing around the issues. I looked at JUnit (JavaUnit) and I don't think it is well factored. I can see that someone tried, which is a lot more than can be said for most code. Note that my experience is more in C++ than in Java.
Could you outline why it isn't well factored? Is this a value judgment or are there some metrics you applied. Thanks --TomAyerst
Metrics are also the production of a long series of value judgements. I don't think there is any appeal to objectivity possible here.
If there are no objective standards for well factored code, what is the point of asking for examples? Why is JUnit not well factored? (Real question. I haven't looked at it for a while; I will now.) --TomAyerst
The JUnit presented in JunitCooksTour is well-factored. You don't actually get the full source code, but I've been able to write two-and-a-half JavaUnitClones based on that document, and the code I produced was some of the better stuff I've done -- quite well factored. Admittedly, a non-GUI StarUnit isn't a 'large' program.
Now what inferences can be drawn or hypotheses made?
- It is not well factored, and the author did not do it right through lack of desire to write well factored code.
- It is not well factored, and the author did not do it right through lack of skill.
- It is not well factored, and the author did not do it right through lack of a pair?
- It is well factored, and my judgment is flawed.
- It is well factored, and I am just a C++ whiner.
- It is well factored, and I am just a TROLL.
- It is well factored, and I am the author seeking accolades by TROLLING.
All this debate of mine however is moot. This page seems to be arguing about well factored code and whether or not it exists. That point is interesting, in that if it is not possible to write a program that is relatively cheap to modify to a user's needs then XP and RAD and agile methodologies in general are hoopla and VaporWare. The critical word in the previous sentence is relatively. I have seen unbelievably poorly factored code (legacy code) that is now impossible to modify agilely. I have seen much better code that is modifiable, relatively cheaply. If some critical factor like cost of making changes varies significantly, then the optimal SE process changes too.
I think a much more interesting question is whether these examples of relatively well-factored programs are a result of the XP method and its practices or of the expertise of the early adopters who have tried it so far. What worries me about all silverish bullets that experts claim to work, by personal experience, is that I do not have evidence about what happens when Joe Bloggs, who just worked on and contributed to 5 failed projects, uses the method. Even if it is granted that XP is not a lead bullet, is it sufficient in and of itself to result in good code? Do we also need wise managers? Is it true that if we lack any of the things on my wishlist for making any project by any methodology work then we can't do XP and therefore can't fail at XP?
Bottom line: I have written code for a lot of years. I have a lot of experience. XP seems to contain far too little direction for new programmers on how to achieve well refactored unsmelly code. It is hard. People with good intent try and fail. I can see that without a more concrete basis the jump from, say, "OnceAndOnlyOnce" to good code is way too far. I like (adore) "once and only once" but it is kind of like a meta rule. I would however have thought that terms such as highly decoupled, highly encapsulated, written in terms of historical invariant problem domain entities, layered, tiered, decoupled GUI and backend, and risk identification might also be referenced more often by XP advocates, since they are instantiations of XP rules. Except that as far as I can tell "written in terms of historical invariant problem domain entities" often violates YAGNI.
I suggested before that the magical benefits experienced by XP proponents may be the product of their own skill rather than the method per se. Consider the following literal quote from XpSimiplicityRules?
- My experience is quite the opposite. If I apply the once and only once rule to code until I can't any more, I certainly get lots of little pieces. This takes commitment. If I am committed to communicating through my code (SystemOfNames, etc), though, the result is not confusing. And it certainly isn't tightly coupled. It is radically decoupled, in that changes tend to be very localized, and all without much in the way of planning. -- KentBeck
If I apply the once and only once rule to code until I can't any more, the code will then as a consequence be good. What I don't know is whether, if I ask any programmer I meet to do that, the code will then as a consequence be good. I suspect (know) not. Sometimes it will even get worse, where worse is defined as more hours of work to make it good than when it started.
(Note: apologies and thanks to specific authors of software I gratefully use.)
-- another AnonymousDonor
(refactor at will these thoughts)
Really large programs get that way by being poorly factored. I've seen nicely sized solutions to large problems. By nicely sized I mean that they were large enough to have something to say about problems I wanted to understand while being internally organized well enough that I could understand them. Here are two examples from the field of IC-CAD that should be available to someone who wanted to look.
I'd like to look, but I can't track down Bill Croft's email address. Anyone know?
- DPL from the MIT AI lab - This large mask layout system was used to make parameterized cells and assemble them into working circuits. This was the first program I ever studied that I could open anywhere, start reading, and make sense of what I read. This is in spite of being written in an unfamiliar language (Flavors) and having a listing that was four inches thick.
- The mask layout program by BillCroft at Purdue EE department - This is a truly awesome C program that could do VLSI scale designs on a PDP-11. The implementation included the command processing, high-resolution graphics, and custom database. Amazingly the program was only about half an inch thick and could be read in an afternoon. (Contrast this to my own company's graphics drivers for the same device, which ran ten times this for the drivers alone.)
Most CAD programs I've examined are written as if the authors just wanted to get done and didn't know that (re)factoring would make that happen sooner. -- WardCunningham
Well, libart ( http://www.levien.com/libart/ ) seems a good candidate for a LWFP to me. Of course, the key thing in a CAD package is the maths (libart's is fudgy but ok), not the program structure, but that's probably not a popular view here.
I just went and looked at a short sampling of JBoss code (inspired by Ward's calm generosity above). It's about what I expected. Lots of classes, mostly small. Methods mostly small, nesting levels small. Readability not too bad -- maybe you could line stuff up a little neater in places, but that's not at the core of refactoring. But the question "how can you tell if it's well factored?" was definitely there. I can't confidently assert that JBoss is well factored, although I have the feeling it's pretty good. I remember, though, working with the McCabe tool several years ago. That's a unit testing tool, by the way. It has a simple means of computing the complexity of a method or procedure, and it can give you a graphical picture of the shape of that method according to the logical paths through it. These shapes are the kind of thing you could learn to read with a little practice (talk about patterns!). I believe the McCabe tool also has an advanced feature in which it identifies modules of very similar path structure and can suggest that these are in fact the "same" module, so that you can factor and combine. Unfortunately, my experience with the tool stopped a little short of playing with that. -- WaldenMathews
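For readers who want a feel for the kind of per-method complexity number mentioned above, here is a rough sketch. It is not how the McCabe tool itself works (its internals aren't described here). It uses the common approximation that, for structured code, cyclomatic complexity equals the number of decision points plus one. The function name cyclomatic_complexity and the choice of which syntax nodes count as decisions are assumptions of this sketch.

```python
import ast

# Syntax nodes treated as decision points. This selection is an assumption
# of the sketch, not a definitive rule set.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity of Python source: decisions + 1."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # Each extra operand of "and"/"or" adds one short-circuit path.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = """
def f(x):
    if x > 0 and x < 10:
        return 1
    for i in range(x):
        pass
    return 0
"""

print(cyclomatic_complexity(SAMPLE))
```

A number like this says nothing about naming or duplication, so at best it flags methods worth a second look; it does not settle whether code is well factored.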
Well-factored is not an absolute. If you can't show me realistic examples of large programs without bugs, that doesn't mean that reliable software isn't a useful concept, even for comparing programs neither of which is perfectly reliable or perfectly well-factored. It's unusual for programs in practice to reach, or even closely approach, the ideal either for well-factoredness or bug-freedom.
Most programs have both a to-do list and a bug list. Attention paid to the to-do list makes it hard either to eliminate the bug list or to put the program in a perfectly factored state. I've worked on two sizable programs since my exposure to XP, the open source Lisp compiler http://sbcl.sourceforge.net/
and a closed source program to play the GameOfGo. Both are very far from being perfectly factored, containing various design decisions which were either suboptimal from the start or have become suboptimal as times have changed. But both exhibit the benefits of substantial amounts of refactoring, and of various change-friendly early design decisions that got important things right in the first place.
It might be better to look at a class of software, and consider which examples within that class are relatively well factored, and to what extent it was worth the trouble. Probably at least two classes should be chosen, one in a relatively static problem domain (e.g. XML parsers) and one in a relatively fluid problem domain. The Lisp compiler mentioned above is effectively in the second class. Even though it now aims to implement a fixed ANSI standard, back in the 1980s it compiled a rather different dialect of Lisp, taking advantage of the idiosyncrasies of the Mach operating system and optimizing its compilation strategies for machines with reasonably large register sets, and was rebuilt using a weird bootstrapping approach which optimized rebuild time at the expense of flexibility and reliability. Since then it has been mostly converted to compile ANSI Common Lisp, is now used on (several modern CPUs but mostly on) the register-poor X86 family, runs on Unixes of all kinds, and has a radically different build system which is slower (which on modern machines is tolerable) but handles change much more reliably and routinely.
I think you never need a large well-factored program. If you need a large well-factored system, you are better off with a well-factored set of small well-factored programs. -- NikitaBelenki
Is this an oxymoron? If a program is well factored, then it should not be large. -- MikeRettig
I recently read the source code for AWK and was truly impressed. The revision history shows that it has been getting better for more than a decade, and they've never had to rewrite the book. Could this be considered refactored code? http://cm.bell-labs.com/cm/cs/awkbook/index.html
I think BrianKernighan really groks the UnitTest concept. -- ChrisGarrod
Or perhaps they, you know, designed it first?
Brian's views on testing can be found here:
"In an attempt to keep the program working, and to maintain my sanity as bugs are fixed and the language slowly evolves, I have created somewhat over 1000 tests" --BrianKernighan
Here's the code for the longest well-factored program in history:
See also: GreatProgramsToRead