Extreme Programming Challenge Nineteen

XYZCorp are a company with 4 developers and 1 product. The company owns a large quantity of valuable digital map data, and the product allows their thousands of customers to print maps at various sizes and scales, with various kinds of information displayed. The most complex part of the product is the bit that interacts with the Windows printer devices to determine what their capabilities are and hence how to produce the best picture for the customer.

Most of their problems are to do with zany printer drivers that either (i) crash unexpectedly or (ii) have combinations of capabilities that have never been seen before.

The company is too small to buy the few hundred different printers it would need to test exhaustively, and in any case a few hundred printers times a few hundred combinations of printing options is too much to contemplate. Plus, an example of the kind of problem that soaks up a lot of time is an HP driver that GPFs about 30 minutes after you ask it a particular question. The current solution is lots of (mostly untested) exception code, and lots of code that compares two things that should match just to check that the printer driver is still working.

How are they supposed to write automated tests for this kind of thing? (answers that talk about "other" benefits miss the point: the issues described represent 90% of the "bugs" that users report). -- DaveCleal
<seven days later> Must be too difficult, I suppose. -- DaveCleal
Great challenge! Why don't they create a fault-injectable-driver, and inject different types of faults. For every fault, their system should respond in a predetermined way.
I _do_ know that I _don't_ know what I'm talking about in this context, but... How much of the printer driver is exercised when you print to file with the (not connected) printer defined? If the driver's involvement is just to return printer characteristics, you don't gain anything, but if the driver participates more, testing with copies of the drivers (contributed by the customers who's printers failed) might at least avoid the hardware expense of accumulating bunches of printers for testing purposes. Or, at least create your own "MVP" program, asking customers with an assortment of printers to serve as beta printer testers running the candidate automated tests.
Testing to file is partly helpful - you can tell whether the driver GPFs - but only partly - because you can't tell whether the printout appears. The biggest problem is that you really want to find the problem *before* the customer's printer fails.
Hmm, I don't remember seeing this challenge before. It happens that I've had this problem. Of course you focus on using a minimal set of printer features, doing all the math inside the program, etc. I don't know of any way of finding, fixing, or testing for a bug that doesn't occur on the developer's machine. In the printer driver case, it is exacerbated by the fact that a high percentage of the problems are in fact in the drivers. No amount of testing without the printer, no amount of inspection, no amount of anything I know of will find a bug that isn't in your code. -- RonJeffries
OK, that makes sense. What the (real) XYZCorp actually do is to write lots of code that they, somewhat strangely, call 'belts and braces' code(see BeltAndBraces). This is code that continually checks the reasonableness of various aspects of the data structures that pass back and forth to the driver, and pops ups and says �please call support and say...' whenever anything unexpected appears. None of this code ever gets tested, and most in fact has never been executed, but the bits that have have saved a lot of time and embarrassment.

In summary, they are very sceptical about automatic testing, because of (i) not knowing what printer driver people will be using and (ii) this difficult-to-exercise code. So, they rely upon manual code inspection. And so, they would argue, XP isn't for them, because without automatic tests, a lot of the rest doesn't work too well. -- DaveCleal
You could still unit test the belts and braces code. It has some expected behavior in a given set of circumstances. If it doesn't, you have no business writing it.

JimCoplien gave an example of this kind of situation. "What if you add an element to a set, then have a undetectable memory fault so the element is not in the set. The program must run correctly in this case." I can't even imagine what sort of strategies I would use to execute "correctly" in this case, but if I had a strategy I would damn well unit test it to death. (You could probably buy a particle accelerator and point it at your memory chips if you couldn't think of anything smarter.) -- KentBeck
But the problem is that this system has some "extremely simple" behavior in a given set of "relatively costly to simulate" circumstances. Also, the code's failure won't be too disastrous: as the code is only executed when the environment is already breaking assumptions the program needs to be true if the user is to get what they want (in this case, an accurately printed map). To me, there is a boundary where the cost of testing gets so high relative to the likelihood and cost of failure that the XP position on testing becomes untenable. -- DaveCleal
It is OK to UnitTest other people's code. It is also OK to distribute UnitTests to your clients and have them run them, and save your resources. Write a UnitTest for a printer driver interface and have your clients provide the actual printer drivers, and have the UnitTest fail the printer driver if necessary.

Then there is the test of the actual output. But GuiTesting is already documented as difficult to automate. I suspect there is a fundamental reason for that. -- MattRickard
code that continually checks the reasonableness of various aspects of the data structures

I would attack this by looking at the definition of "reasonableness". In this context I assume that a good data structure is defined either in terms of a syntax tree or in terms of domain boundaries. (The combination might be a semantic net). You can then test your protocol-checker by running round the edges of the network and checking that the appropriate exception is thrown (or error-message is dispatched) for each violation (and that no error is generated for good values). You test the behaviour of your exception/error handlers separately.

Finally, you can test that the normal-case breaks properly by generating random errors. If the number of places where errors are checked during the normal-case code is small then you could test them exhaustively.

See also: RandomTesting

-- DaveWhipp

It is OK to UnitTest other people's code.

I'm not convinced. What's the point of a UnitTest if you can't fix it if it fails? Perhaps I'm being picky and you really meant something like a CharacterizationTest suite instead. That might be ok, but it raises a question about what impact a failure in this suite would have on the customer? If they couldn't upgrade to a working driver (because it doesn't exist?) then they will still carry on using your system, won't they?

It's my experience that users will report known bugs unless you tell them at the time of their experience of the bug that it's a known bug and it's not your fault. Therefore the (solvable part of the) problem isn't really the bugs in the printer drivers, it's your handling of those bugs, and you can test that: I think my solution would be to put a (very thin) wrapper around the printer drivers so that I could replace them with FakeObject which inject failures in my test suite. Then the only untested code is the wrapper which is thin enough to be reasonably transparent.

Not that I'm sure about how one would convincingly fake a GPF in a printer driver, nor how one would handle it!
When you write code, you can and should test it immediately by putting a breakpoint in the debugger, running to the point where the new code was introduced, and if necessary modifying a variable used in an if() statement so that the new branch is taken. You should do this for your BeltAndBraces code.

You can take this one step further, and learn from the hardware design ASIC guys: DesignForTestability (and debuggability). If good testing is part of the spec, then the code can have specific parts that are ONLY there to allow testing, and still be the simplest possible solution. You could add code to each if () test in your code which decremented a counter, and when the counter came to 1, it would corrupt the data structure (or whatever). Then run the code starting the counter from 1, 2, 3, 4, ... until you get through the entire program, making sure that the expected behaviour happens for each of these cases.

View edit of November 17, 2011 or FindPage with title or text search