The hardware world underwent a revolution with the process quality control
practices of Shewhart and WilliamEdwardsDeming.
Briefly, here's one common practice they debunked. A tool operator
produces parts, and drops them onto a conveyor belt. Some way down the
belt, a "quality control" operator picks up each part, measures it,
and tosses the ones that are out of tolerance.
In this model, managers expected to get higher yield by ordering the
tool operator to go faster, and by hiring more QC operators to throw
away the resulting extra defects. The tool is the expensive part, so
making it go faster returns value on its investment.
This practice has a parallel in some modern software shops (such as
Microsoft's). A programmer writes code, and a test engineer (in
another office) picks up each integration and writes unit tests for it.
This practice creates errors under the assumption that they must be
created, so they must be removed downstream. It overproduces, and then
censors out the bad parts.
Shewhart and Deming changed this practice, for hardware. They ignored errors and focused on variation. They separated variation into two classes, assignable cause and common cause, and created a tool (the control chart) to help differentiate between the two. Assignable cause variation indicates that an underlying cause for the variation can be found. Common cause variation indicates that the variation is due to normal system operation and no unique cause can be identified.
Assignable cause variation is addressed by identifying the cause and, if necessary, taking steps to prevent its recurrence. Common cause variation can only be reduced through experimentation: measure the current level, try a different approach, measure the new results, and if they are an improvement, implement the new approach.
There are equations that can be used to determine whether it is cost effective to do either 100% inspection or 0% inspection of the result. Essentially, if the cost of inspecting everything exceeds the cost of handling actual problems after the fact, eliminate inspection.
The point is to focus on constant, measurable improvement and let errors take care of themselves.
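For anyone who wants to see the arithmetic, here is a minimal Python sketch of the two ideas above. It assumes the simplest kind of control chart, an individuals (XmR) chart, and applies only the "point outside the limits" rule; real charts usually apply more rules than that. The all-or-none inspection test is written as the plain cost comparison described above (it is often called Deming's kp rule). All function and parameter names are made up for illustration.

```python
from statistics import mean


def xmr_limits(samples):
    """Return (lower, center, upper) control limits for an individuals chart.

    Limits are center +/- 2.66 * average moving range, where 2.66 is the
    conventional constant 3 / 1.128 for moving ranges of two points.
    """
    if len(samples) < 2:
        raise ValueError("need at least two measurements")
    center = mean(samples)
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(samples, samples[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def assignable_cause_points(baseline, new_points):
    """Return (index, value) pairs in new_points that fall outside the
    limits computed from baseline. Those are candidates for an assignable
    cause; everything inside the limits is treated as common cause."""
    lower, _, upper = xmr_limits(baseline)
    return [(i, x) for i, x in enumerate(new_points) if x < lower or x > upper]


def inspect_everything(fraction_defective, cost_to_inspect_one,
                       cost_of_escaped_defect):
    """All-or-none inspection: inspect 100% only when the expected
    per-item cost of letting defects through exceeds the per-item cost
    of inspection. Otherwise inspect nothing."""
    expected_failure_cost = fraction_defective * cost_of_escaped_defect
    return expected_failure_cost > cost_to_inspect_one


if __name__ == "__main__":
    # Illustrative numbers only, not real measurements.
    baseline = [10.0, 10.2, 9.8, 10.0, 10.2, 9.8]
    print(xmr_limits(baseline))
    print(assignable_cause_points(baseline, [10.1, 11.0, 9.9, 9.0]))
    print(inspect_everything(0.01, 1.0, 50.0))
```

The limits come from the process's own history, not from the spec, which is why a point outside them says "go look for a cause" rather than "this part is bad".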
In the new model, tool operators themselves learn to calculate a few
statistical formulas, and they themselves rate the odds that their own tool will
yield product out of spec. Then, when those odds are safely under some
threshold, say 1 in ten million, the company can safely take the QC
people off the conveyor belt. The entire process now operates with
much less industrial waste.
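A rough sketch of the kind of calculation an operator might do, in Python. It assumes the tool's output is stable and roughly normally distributed, which is a big assumption that a control chart would have to justify first. The function name and the spec limits are hypothetical; the 1 in ten million threshold is the one mentioned above.

```python
from statistics import NormalDist, mean, stdev


def out_of_spec_probability(measurements, lower_spec, upper_spec):
    """Estimate the chance that the next part falls outside the spec
    limits, assuming a stable, normally distributed process."""
    dist = NormalDist(mean(measurements), stdev(measurements))
    below = dist.cdf(lower_spec)        # probability of an undersized part
    above = 1.0 - dist.cdf(upper_spec)  # probability of an oversized part
    return below + above


if __name__ == "__main__":
    # Illustrative measurements only.
    parts = [9.9, 10.0, 10.1]
    odds = out_of_spec_probability(parts, 9.4, 10.6)
    threshold = 1e-7  # "1 in ten million"
    print(odds, odds < threshold)
```

With only a handful of measurements the estimate of the spread is itself very uncertain, so in practice the sample would need to be far larger than this before anyone pulled inspectors off the belt.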
This is the QualityIsCheaper
Because it also employs fewer statisticians, they fought it ;-)
Programming is a design activity. Our hard drives themselves take care
of the statistical aspects of production for us, in their own way. But
these parallels between waste in production efforts, and waste in
design efforts, are compelling.
They indicate we should seek ways to waste less. In a design effort,
chronic bad design is waste. The broken output is FireFighting, long bug
hunts, and hours beyond a FortyHourWeek. So we try to literally stop the
bugs before they start.
But in a design space, we can't use statistics, because each element
of product should be measurably different from every other. So we
tune our production tools using another entire human being: PairProgramming.
ThankYou PhlIp, I hadn't spotted this link. -- MatthewAstley
Actually, in a design space, we need to use statistics. Some ways of coding things are more error-prone, or more likely to cause errors when modified. Hence the need for a CodingStandard. If nothing were repeatable, experience would be meaningless.
Natch. But attempts to pencil them on graph paper, hang them next to the "tool", and calculate their StandardDeviation would miss the point and risk measuring the wrong things. YouGetWhatYouMeasure.
Without some sort of measurement how do we 1) ensure a change actually makes an improvement, and 2) move CodingStandards beyond being the opinion of the loudest debater? Sans measurement, how should we decide?
I do not understand the metaphor. Programming is not the production of lots of identical parts, is it? In programming, each "part" is different. So there is no comparison with conveyor-belt production processes. I mean, do those large companies throw away any failed code? I don't think so. It is probably repaired by the original programmer or by his/her colleague.
I'm a fan of TestFirstDesign, but mainly because of reusability and traceability of errors. -- WillemBogaerts
If every program is completely unique, then what part does experience play? How can we have any standard ways of doing things (such as TestFirstDesign)? These things imply an underlying commonality of process (not the product being produced). If there is a commonality, it can be measured. Note even the title of this page implies a measurement. I don't believe the question should be "Can we measure software development characteristics?" but rather "What characteristics?" and "How?"