Debugging And The Scientific Method


One day I had the sudden insight that debugging is an awful lot like the scientific method. This now strikes me as so obvious that I'm sure many other people have had it as well, but I haven't seen anything about it on Wiki, or in the computer science literature I've personally read. (I'm definitely not the only person to make this observation, though.

For example, see
 ***  ( BrokenLink 2002/12/09 )

There's already a page about the ScientificMethod on Wiki, but for those readers who need a refresher, the scientific method goes something like this:

  1. Identify the behavior you want to explain.
  2. Form a hypothesis that might explain that behavior.
  3. Conduct an experiment that tests the hypothesis.
  4. If the experiment contradicts the hypothesis, revise it or replace it with a new hypothesis and go back to step 3.

The point of this exercise is to construct a hypothesis that not only explains the behavior but one which can also be confirmed by a suitable experiment. (Picking a testable hypothesis and an appropriate test for that hypothesis is a topic worthy of its own essay.)

The debugging process consists of first finding the cause of a bug and then fixing it. It is this first part which struck me as being a lot like applying the ScientificMethod. Here are the steps I use for finding the cause of a bug:

  1. Identify the bug you want to fix.
  2. Make a guess as to what the program is doing to cause the bug.
  3. Conduct a test to see if the guess is correct.
  4. If the test contradicts the guess then revise it or replace it with a new guess and go back to step 3.

The point of this exercise is to understand why a bug is happening. Once you know that, it's usually (but not always!) straightforward to fix the bug.

The scientist uses experiments to determine the validity of a hypothesis, and the computer programmer performs tests to see if he understands what that the buggy code is actually doing.

In both these cases, the problem is to understand some underlying process that is not visible itself but which can only be inferred from its side-effects. The scientist performs experiments and the computer programmer performs tests. Of course, tests are experiments.

At this point, I'm inclined to argue that the process used to find the cause of a bug isn't just like the ScientificMethod, it is a direct application of the ScientificMethod.

It turns out that the ScientificMethod doesn't just turn up in debugging. If you work with computers, and especially if you're a programmer, then almost every day you find you need to figure out how something new works. You may be trying to learn a new programming language or a new API, and you might have some basic documentation (if you're lucky), or you may just have some unexplained examples to work from, but either way, you don't know exactly how the thing works, and you need to figure it out. So you gather as much information as you can, and then you make your best guess. Then you think of some experiment to perform to see if your guess is right. Often it's not right at all or only partially right, so you come up with a better guess. With a little bit of luck you are able to construct a model of how the thing works and that you know is correct because you've performed experiments which confirm that model is accurate.

If you're a computer programmer, you've very likely performed more experiments and used the ScientificMethod more often than most scientists. And just think: you've been told all these years that there wasn't much science in computer science.

(-- CurtisBartley)

I agree with Curtis wholeheartedly.

I work with scientists, and it is kind of fun to point this out to them when I help them solve some malfunction that could be in any of four subsystems (only one of them software). Typically, they'll say "why don't we try this, this, and this." And I say, "one at a time, controlled experiment." Most of them appreciate the humor. I've made the observation outside of the throes of a problem to a few scientists and they tend to agree.

Another thing that I've noticed that helps programmers generate a little bit of excitement about debugging, is pointing out that it is like detective work (which of course is a variation of the ScientificMethod). -- MichaelFeathers

Keeping the scientific method in mind works well for me when debugging any new piece of code that leverages pre-existing code. But I find the nature of experimentation required quite different between C libraries and C++ frameworks (on average). With a C library, I can experiment with supplying different arguments to a function, and through trial-and-error arrive at a workable solution (or move on to a different library). With a C++ framework, I need to spend time analyzing the behavior of the pre-existing code in the context of the new code, stepping through many layers of method calls in a source debugger. The debugging is harder in C++, but the scientific method still works if I have access to the all the source I need for the investigative process. -- ScottJohnston

Debugging this way is lots of fun and it's good (and enjoyable) to be good at it. Even better is to have such an extensive network of tests, and to work so incrementally, that when something doesn't work there's no deduction or experimentation involved: it's just, Oh, that thing I just typed doesn't work, look at it, oh, there's the problem.

One of the biggest downsides to ExtremeProgramming is that it gets rid of all those opportunities for HeroicProgramming and HeroicDebugging. -- RonJeffries

The other extreme (as it were) of thorough upfront design (don't type something that isn't going to work) also reduces the need for day-to-day HeroicDebugging. I'm sure there are other approaches.

Even in the most bug-free code, HeroicDebugging and the ScientificMethod can still be useful when something outside your control goes wrong: a slow diode in your hardware, a bug in the compiler/runtime system/OS, something somewhere dropping whole or partial network packets... -- JimPerry

Oh, that thing I just typed doesn't work, look at it, oh, there's the problem.

Sometimes this happens, but sometimes the problem you are trying to solve is more complex. Even in a fairly well-known problem domain (payroll for example), problems can arise that are not as simple as making an '=' into a '!=' because you happened to noticed that you were hitting keys and accidentally forgot to hit a few other keys.

When doing ComponentBasedDevelopment, sometimes I buy an off-the-shelf component (say a grid control) and the thing works fine. Then as my requirements solidify and we understand the problem we're trying to solve better, we try something that breaks the system. Is it our code that's causing the problem? Are we misusing the component? Is it a bug in their grid control?

A problem in our code or misuse are the easier ones. But if it's a problem in their grid, and switching to another grid is far more costly than working around it, then the task of working around their bug can be hard and non-obvious. The ScientificMethod is a natural method for arriving at a solution.

Another area not so straightforward is when you have many class instances operating concurrently. Sometimes a design makes sense but you run into deadlock or something else that is not obvious, and solving it is difficult for mere mortals. Here too, the ScientificMethod can be applied.

-- PhilipEskelin

Interesting notion. Let's remember a couple of things about the ScientificMethod, though - before it was understood, there were two ways of handling what we would call "Scientific" inquiry:
  1. The WitchDoctor? approach. Invoke the gods. Sacrifice a small fuzzy animal. If that doesn't work, try sacrificing something else until it does work and then state that the gods are appeased. Unfortunately, a lot of debugging uses this approach today. Turn off the machine. Reboot. Try rearranging the order of your statements, etc...
  2. The NaturalPhilosopher? approach. If you can't explain it, at least write down everything you know about it. Categorize your bugs. Ask anyone else if they've seen this bug. List the places the bug occurred in alphabetical order. Write down the relative humidity every time the bug occurs... It doesn't help you solve the problem, but it makes you feel as though you're accomplishing something. :)

-- KyleBrown

In order to be good at finding and fixing bugs, you need to have the methods and qualities of a good detective. The world's greatest detective is ... SherlockHolmes.

-- KentSchnaith

Beginners often use a disorganized "trial and error" debugging approach instead of using logic (or "scientific method"). To be successful with non-trivial (non-local) bugs, you need to gather and analyze information. This leads to my favorite quote to give to beginners ...
"If you fully understand the conditions under which the program works correctly and incorrectly (IE: if you find the boundary between the two), then you'll know what's wrong and how to fix it - without ever looking at the source code." -- JeffGrigg
I've even had this experience, from time to time.

I think that Francis Bacon was the first to suggest looking at such boundaries.

There is a book on debugging by A R Brown & W A Sampson (1973) which refines this boundary technique. They suggest that you need to have a sharp boundary between
  1. what the symptom is and what it is not,
  2. where it is and where it is not,
  3. when it happens vs when it does not happen,
  4. how much it happens vs does not happen.

I need to look at the source code with a debugger to fully understand this boundary. Someone who doesn't need to look at the source could probably attain the Dr. Seuss ideal of reading a book with their eyes closed as well...

Or else you have read a lot of code and know the kind of mistakes that are made. I swear I can smell the interrupt vector foul up in the IO system of my laptop.

These suggested boundary conditions are interesting. I think that in most of my successful debugging episodes, I conduct experiments to get a clear idea of where the boundary is for at least 2 or 3 of those criteria, and I've found that usually narrows it down enough that I can pinpoint the bug (when debugging our own codebase, at least).

Easy bugs are... well, easy. Hard bugs are the ones where you do 10 experiments trying to understand the conditions under which it occurs, only to realize you don't remember the results from the first few experiments. Therefore, jot something down every time you change something and run another experiment. Having a list of what you tried, with what results, will allow you to make hypotheses. Having a hypothesis will allow you to chose your next experiment with something better than just guessing.

This is especially helpful when debugging hardware problems.

Don't worry about your lab notes being neat, or readable by anyone but you. The lab books of many famous scientists look like so much chicken scratch. They were working tools. Yours should be too.

If you have any doubts about the truth of the above suggestion, try debugging disassembly or a tricky graphics routine. Whenever I have to do either, I make sure I have a pad of graph paper handy. Or at least the backside of an envelope if I'm desperate.

Agreed. For me, debugging consists of 'probing' the problematic functionality to collect information about it; when it works, when it doesn't, specific ways it doesn't work in simple cases, etc. In order to hypothesize intelligently about what might be wrong, you need both some vague ideas of what things the bug is affecting, and some concrete observations of how it affects them. There are occasional ZenDebugging? situations where you make a conceptual leap and arrive directly at the answer (FlashOfBrilliance? or something?) but most situations are probably more like detective work. You collect information and then use that information to reason about where and what the problem can be/can't be.

Some bugs don't admit of any scientific method... My colleague spent all night trying to fix a bug with Netscape thing. His partner came in in the morning and said, "Oh, you still have version 3.2 installed on your machine. That won't work, you need version 3.3. (or whatever version)." No amount of any work would have sorted that out. One had to know the answer. These are the ones I truly detest.
That is simply not true. You cannot debug in a vacuum. Sometimes it suffices to look at your code. Sometimes you have to consider its interaction with 3rd party code. And sometimes (unfortunately), you have to disassemble other code (or worse, kernel drivers) and figure out what it does. With sufficient logical debugging on the problem you describe, your colleague would have discovered that the problem was due to a bug in Netscape. The next logical question is 'have they fixed it, or do I have to?'.
I first learned about using the Scientific Method to debug when I read Robert Ward's book Debugging C. I think it is out of print now though, but was of inestimable help in my early career. -- MichaelCrawford
I would be very hesitant to describe debugging as using the scientific method.

The scientific method looks at many data points and tries to construct a general theory to cover them. Debugging looks at a single data point and tries to construct a specific theory to cover specifically that point.

Yes, using a structured approach to debugging is extremely valuable, but so far years and years of debugging have provided limited insight to the nature of bugs and theories about their occurrence. Most of the theory for improved software has been made in the areas of language development and coding standards and, in these areas, the changes are more based on opinion than any research showing statistically significant results.

I welcome the scientific method to software development, but let's not delude ourselves into believing the current state of software development reflects a scientific approach.

-- WayneMack

Turn it around - of what method of software development would you say that it definitely reflects an unscientific approach ?

Are there, say, "artistic" approaches to software developments that work ? Is it valid to oppose "scientific" and "artistic" ? What other categories are there ?

If we think as scientists would when we approach a software development problem, can we then validly say that we are using a ScientificMethod? If not, why?

-- LaurentBossavit

I am not trying to propose an alternative to the scientific method, I am just lamenting the lack of "science" in "computer science."

I cannot really think of any theories on software bugs. Here I mean "theory" in the scientific sense of a hypothesis which is generally applicable, has predictive value, and has been widely tested and verified.

One hypothesis is that bugs are linearly related to the number of lines of code created or changed. Another hypothesis, implied by refactoring, is that the number of bugs can be reduced by creating many short routines, thereby slightly increasing the number of lines of code. This leads to real world conflicts between those who believe that changes should be tightly focussed on very specific problems with minimal extraneous changes, and those who believe that general, code-wide improvement has significant benefits. Where is the supporting evidence verifying the claims and showing the limitations of either belief?

A bug in software is merely an instance of a problem. Science is not about solving a single instance of a problem, but rather about solving the problem in general. This is what I see missing from software development.

-- WayneMack

Debugging doesn't need to be a science in order for it to be scientific. Stay at the lower level. Your software doesn't work. You formulate a hypothesis as to why that should be. You then formulate an experiment that will affirm the hypothesis if it comes out one way, infirm it if another outcome is observed. These experiments are (manual) tests.

The experiment to be devoutly wished for is a code change. Your theory about your software's failure mode is best formulated as "This bit of code needs to be changed just so, to make it into a better description of what we want to happen. When we do that, the code will work as expected." When you perform that experiment, you aren't just fixing a bug; you are validating an entire body of thought about how your code interacts with other code and its environment.

An "artistic" approach might be to say, "This is how I feel the software should work (or not work). I'll make it so; anyone else is entitled to their own views, though my aim is to express a deep truth by showing how my code solves that particular problem; or doesn't solve it."

Don't try to elevate trial and error debugging to being "science." Finding your car keys when you lose them is not science. Understanding why your car keys get lost and reducing the time to find them or reducing the number of times they get lost, now that would be science. Avoid the misuse of the term, because we do need true science in software development, and the lack of science is obscured by people claiming to do science when they are not. Debugging is challenging and important, but it is not science.

I'm not sure I understand why calling debugging scientific is "elevating" it, as if science is something higher than debugging. They are different things, almost certainly; and, also almost certainly, there is an overlap in the skills required to do either.

Actually, debugging follows the ScientificMethod, while science does not. Real scientific research is a complex combination of performing experiments, interpreting those experiments, suggesting theories, evaluating theories, convincing others, modifying mental models, and so on. The ScientificMethod that you learned in school is a definite oversimplification. -- JaredLevy

Quite. Read e.g. PaulFeyerabend's AgainstMethod.

This was also explored in ZenAndTheArtOfMotorcycleMaintenance
See TrialAndErrorProgramming

View edit of July 29, 2014 or FindPage with title or text search