What Is Science

[Refactored from The TwoIrreparableMistakesOfTheSoftwareField by BillTrost]

Is BubbleSort a property of the universe that a scientist can discover, or an invention that could have been patented if a mechanism was in place for patenting software early on? -- MichaelFeathers

Computer progress would have been [bleeped] if such patents had been allowed in the 50's through 70's. Battles over airplane patents almost ruined that industry. The military had to force a compromise.

Is Programming Science?

I am not sure I would call programming a "science". DisciplineEnvy

According to the definition given in: I'd never thought of it like that. I had thought that Science was explanation, and that the explanation was tentative, incomplete, and in development, yielding somewhat predictable results.

That fits programming, or at least "some programming".
What are the boundaries for Science?

Are scientists limited to discovering aspects of the universe? There's sort of a naive description of physics, which goes like this: there's a universe out there and our physicists, gloriously large-brained beings, are observing, describing, and predicting it.

The problem? We, being small-brained bipeds, can't quite comprehend something as big as the universe. So we make models of the observable phenomena and then ponder on them (in the words of Frye, cited above, "You don't learn Nature. You learn Physics"). Eventually, we come to things like quarks and so on and it is no longer quite so obvious (if it ever was) that we are observing and describing properties of the universe (instead of properties of the way we interact with and model the universe).

But the notion of 'Science' is much wider than the above would indicate. We also call things like Psychology and Biology sciences. Both have the core scientific pattern of "observe, deduce, experiment to verify", but neither smells very much like Physics at all. They're interesting because they both deal with abstraction a lot more.

Biology, in particular, has many levels of properties. There's biochemistry, at the molecular level, there's cell biology which, by and large, builds on the biochem [and ignores a lot of the biochem details] to talk about structural organization and function of the cell, there's ecology, which completely ignores cell biology to talk about much higher level things, .... All these things are observable, true, but there's a very explicit modeling step and a very explicit ignoring of relevant information and, in many cases, a notable lack of experimentation. Experiments are partially replaced with "study and statistically analyze" and causal mechanisms are replaced with correlations.

And once we've seen this layering explicitly, in biology, it becomes clearer that it's present in physics too. And, maybe, necessarily present. You can't talk about Geophysics if you're going to worry about turbulence and its effect on the stability of stream beds (is there an effect? Probably not). So not only does physics abstract and construct models, it makes many models, all of which are useful in some area and most of which are slightly inconsistent.

And then it gets even worse as we consider things like Paleontology. There's darn few experiments in Paleontology. Or maybe it's that all the experiments are "thought experiments" (imagine this creature running...). Sure, there are digs going on all the time. And there are lots of brilliant people engaged in reconstructing our saurian heritage. But it's a science that fundamentally relies on serendipity ("A new type of ankle-bone! And one suited to swimming. Wow! What a great scientific discovery.").

Why am I babbling about this? Because I think "the universe" in MichaelFeathers' formulation is completely wrong.

Try this for a better notion of Traditional Science: The study of observable repeated phenomena, geared towards prediction and employing a reasoning cycle consisting of an observation stage, a deduction and hypothesizing stage, and a verification stage.

It's still a horribly naive definition. But it opens the doors to all sorts of things that I consider scientific (or that I think could be done scientifically), while preserving the traditional approach (which is, I would claim, the really important thing).

Why have I made such a big deal about this? I work in Artificial Intelligence and I would claim to be a scientist. So would Paul Cohen, who wrote an absolutely wonderful book called EmpiricalMethodsForArtificialIntelligence (ISBN 0262032252 ), which, while it has a slightly different notion of science from the above, nonetheless talks a lot about how we can be more scientific in our work.

And I would claim that each science, as it has emerged, has used a different method of investigation (related, to be sure). And the claim that CS is not a science, because the way it proceeds doesn't correspond to a traditional view of the way physics proceeds, is just plain wrong.

Having said all that ....

Now on to Sorting Algorithms. That's a very neat question. The question is, basically, account for a new sorting algorithm. Is the beastie a scientific discovery? Or simply engineering? I'm ignoring the patent issue because, by and large, patents are evil. (See PatentsAreEvil)

(Caveat: The boundary between science and engineering is incredibly fuzzy, and positing a binary choice between them is also a false dichotomy. But, perhaps, a useful one.) The answer: It depends. Either one. For example, tweaking an algorithm. There's a (very obvious) rule of thumb that says "When merge-sorting a file, minimize disk i/o because disks are going to be the performance bottleneck." Use that rule and you may wind up inventing a new variant of merge-sort. But we'll all recognize it as merge-sort and, unless you did something very clever, nobody will even notice it.

A slightly harder example: Merge-sort a singly-linked list in place. There's an example of how to do this in Sedgewick's book on algorithms that does the trick but involves (roughly) twice as much list traversal as the merge-sort used by LISP hackers. The LISP hacker algorithm is a simple tweak on the Sedgewick approach, but it's a clever tweak and clever enough that, many times, when it is used the inventors are credited. But still, this is engineering. Though engineering that required detailed observation and some hard thought (closer to science).
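For readers who haven't seen the problem, here is a minimal Python sketch of merge-sorting a singly-linked list by relinking its existing nodes. It is neither Sedgewick's version nor the LISP hackers' tweak (this page reproduces neither); it is just one generic top-down variant, and the names `Node`, `merge`, `merge_sort`, `from_list` and `to_list` are made up for illustration. "In place" is used loosely here: no nodes are copied, but the recursion uses O(log n) stack and the merge uses one throwaway dummy head.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None


def merge(a, b):
    """Merge two sorted lists by relinking their nodes."""
    dummy = Node(0)              # throwaway head, dropped on return
    tail = dummy
    while a and b:
        if a.value <= b.value:   # <= keeps equal elements in order (stable)
            tail.next, a = a, a.next
        else:
            tail.next, b = b, b.next
        tail = tail.next
    tail.next = a or b           # append whatever is left over
    return dummy.next


def merge_sort(head):
    """Sort a singly-linked list; returns the new head."""
    if head is None or head.next is None:
        return head
    # Find the middle with slow/fast pointers, then cut the list in two.
    slow, fast = head, head.next
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
    second = slow.next
    slow.next = None
    return merge(merge_sort(head), merge_sort(second))


def from_list(values):
    """Build a linked list from a Python list (helper for the demo)."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def to_list(head):
    """Flatten a linked list back into a Python list (helper for the demo)."""
    out = []
    while head:
        out.append(head.value)
        head = head.next
    return out
```

Note that finding the middle is itself a list traversal on every level of recursion; avoiding that kind of repeated walking is the sort of thing a clever tweak would go after.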

Hard example: Getting to merge-sort from bubble-sort. Does the above notion of "Science" cover it then? Maybe not. One reading of my version of the traditional notion of Science requires us to have an observable bubble-sort before we can talk about bubble-sorts (once we have a bubble-sort, we can analyze variants of it, but we do want it to be observable). But, like I said above, each new science invents its own variant of the scientific method.

This is simply science of the CS sort. -- WilliamGrosso

Last night, I posted something here and then trashed it in favor of the BubbleSort question. I figured that it cut to the crux. Anyway, naive or not, my notion of science does seem to match yours to a degree. If you are attempting to discover something to increase knowledge, it is science. If you are trying to make something for people, it is engineering. Software makes things interesting because the material is information. That said, I tend to think that there are aspects of science in engineering. Requirements analysis is a micro-science, IMHO (see GuerrillaDomainAnalysis). -- MichaelFeathers

I would first change the description of the scientific pattern from "observe, deduce, experiment to verify" to "observe, invent, experiment to verify". The "deduce" angle on describing science got tossed out decades ago (before Kuhn, even). Then there is an aspect of science involving computers. I would put automata theory in math, and compiler writing in engineering, and methods of development and HumanComputerInteraction in sociology or psychology. Then when designing an undergraduate curriculum, we can select what aspects of math, engineering and sociology are appropriate for what future lives of undergraduates. -- AlistairCockburn

Good point. Changing that one word makes everything much simpler. I'd also like to add HerbertSimon's SciencesOfTheArtificial? ISBN 0262691914 as relevant to TwoIrreparableMistakesOfTheSoftwareField's claim 1 (and a great book by a Nobel Laureate in Economics). -- WilliamGrosso

I think "Invent" is really the wrong word to use in the "Observe, <word>, experiment to verify" phrase when talking about science. I'd like to suggest that a better phrase to describe science is: "observe, hypothesize, experiment to verify". Further, Alistair's "observe, invent, experiment to verify" is probably closer to a description of engineering, rather than science. The important distinction here is that the ScientificMethod is about creating (aka "discovering") higher-level abstractions (i.e. "Laws") about the world. I'm not sure "invent" is a good substitute for "discover" (although I find that Merriam-Webster disagrees: http://m-w.com/cgi-bin/dictionary?book=Dictionary&va=invent). I guess if we agree to use the "to devise by thinking" definition of "invent", I would have to concede that "inventing" a hypothesis makes sense. However, I think that the common use of "invent" connotes a more physical operation (e.g. "Thomas Edison invented the light bulb" more so than "Einstein invented the special theory of relativity"). http://www.soci.niu.edu/~phildept/Dye/method.html -- GeoffSobering

"Observe, <word>, experiment to verify"

How about "Observe, synthesize, experiment to verify" - except that this kind of rolls "hypothesize" into "synthesize" - I think it's seductive, though not necessarily correct, to encapsulate this into a neat three-word quip. A longer form might be "observe, describe, hypothesize, synthesize, experiment to verify" though I'm sure the tendency will be to roll "describe" into "observe" and one of hypothesize & synthesize into the other. Clearly, you start with observe and you wind up with experiment, and what you do in between determines the quality of your science. -- GarryHamilton

I'd put it something like this: So, I'd call the invention of the BubbleSort engineering, but the characterization of the properties of the BubbleSort mathematics. I wouldn't call it science though. Even the best scientific theory can be overturned by new evidence later, but this is not true for the properties of the BubbleSort. I'm not entirely sure what the utility of defining "mathematics", "science" and "engineering" is though. -- AndyPierce
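To make the "characterization is mathematics" point concrete, here is a minimal Python sketch, assuming the plainest textbook BubbleSort (no early-exit optimization). The function name `bubble_sort` and the counter are illustrative only. The property on display, that a list of n items always costs exactly n(n-1)/2 comparisons, follows from the loop structure alone, so no later observation can overturn it.

```python
def bubble_sort(items):
    """Return (sorted copy, number of comparisons) for plain bubble sort."""
    data = list(items)
    comparisons = 0
    n = len(data)
    # After each outer pass the largest remaining item has bubbled to `end`.
    for end in range(n - 1, 0, -1):
        for i in range(end):
            comparisons += 1
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
    return data, comparisons


# The count depends only on the length, never on the contents.
for sample in ([], [1], [3, 1, 2], [5, 4, 3, 2, 1], [1, 2, 3, 4, 5, 6]):
    result, count = bubble_sort(sample)
    n = len(sample)
    print(n, count, n * (n - 1) // 2, result == sorted(sample))
```

Running the loop is only an illustration; the claim itself is settled by counting the inner-loop iterations (n-1) + (n-2) + ... + 1, not by the print-out.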

Things like the algorithmic properties of BubbleSort are essentially mathematics, as is most "hard" CS. Mathematics is not science. It is a tool used by scientists, but mathematicians are not scientists. Who says so? Every scientist and every mathematician I've ever met. -- AlainPicard

Perhaps Mr Picard should get out more and meet some more scientists and mathematicians.

Hmm. There is a strong correlation and much cross-over. Science, at its heart, is trying to understand things. The way it does this is, in principle, as follows. You gather data, collate it, and try to find patterns. Having found patterns you use them to make predictions. Predictions in hand, you then perform experiments to test them, confirming or refuting your hypothesized pattern. Then, if necessary, revise your ideas about what the patterns are and repeat.

It's a science when you make predictions and test them.

In mathematics we do exactly the same thing, except that having tested our predictions a few times we then set out to prove that the pattern is what it seems to be.
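A small worked illustration of that "test a few times, then prove" habit, as a Python sketch. The conjecture chosen here (the sum of the first n odd numbers is n squared) is only an example picked for this page, and the function names are hypothetical.

```python
def sum_of_first_odds(n):
    """Sum of the first n odd numbers: 1 + 3 + ... + (2n - 1)."""
    return sum(2 * k - 1 for k in range(1, n + 1))


def prediction_holds(n):
    """The pattern we think we have spotted: the sum equals n squared."""
    return sum_of_first_odds(n) == n * n


# The "experiment": try the prediction on a handful of cases.
failures = [n for n in range(50) if not prediction_holds(n)]
print("counterexamples found:", failures)
```

An empty list of counterexamples is where a scientist would have to stop. The mathematician goes one step further and proves it: if the first n odd numbers sum to n^2, adding the next one, 2n + 1, gives n^2 + 2n + 1 = (n + 1)^2, so the pattern holds for every n.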

It's true that mathematics is not science, and most scientists and mathematicians will agree. It's false to characterize mathematics simply as a tool that scientists use. That's the view of a scientist, not a mathematician.

The literal definition of "Science" (from Latin, I think) is simply "Knowledge". So I regard "science" and "scientific" as any process that provides the ability to increase knowledge in any area (though note I said "increase knowledge", not "increase your knowledge"); merely being able to learn what others have previously learned is not science, but education. Science is being able to take your own learned concepts, couple them with external observations, and produce new knowledge. -- GavinLambert

Latin scientia. But remember that words change meanings all the time, even within a single language.

I am concerned with discussing two extreme kinds of science. I believe environmental geochemistry, a combination of environmental science with geochemistry, fits both requirements, since it seems to be at a turning point.

I thus propose to use this space to start adding new pages in both directions, of a SpecialityEnvironmentalGeochemistry? and a GeneralityEnvironmentalGeochemistry?. -- FrancoLevi?

doesn't quite cut it for science these days because all the stuff you can easily observe has already been observed. Try extending the process to something like: Example: We guess there will be interesting astronomical observations we can't make from the ground. We invent the space telescope. We observe that: hey yeah, look at all that cool stuff we didn't see before! We <word> about all that cool stuff. We experiment to verify that we were right (or wrong) about what we <word>ed.

With time and training the initial guesses become (hopefully) more educated and more likely to be correct, but it still frequently happens that the process is derailed, either because invent isn't yet technologically possible (YouCantGetThereFromHere), or because after a successful invent we fail to observe anything useful (the guess that the newly possible observations would be interesting was wrong). This is also called a "FishingExpedition?" and granting agencies in general hate this kind of proposal. That's why you have to write the grant according to the standard "observe, hypothesize (the word the granting agencies use), experiment to verify" model and then do the guessing and inventing on the sly.
Science is Testing Models of Reality

Science is the process of gathering, comparing, and evaluating proposed models against reality:

    theoryRank = fit x simplicity
Math is "playing with" axioms and science is the testing of axioms against reality. Note that one can do a lot with axioms that are not necessarily tied to reality (see MisuseOfMath) because the axioms may be partially or completely wrong to begin with. GiGo.

-- top
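A rough Python sketch of how the theoryRank formula above could be turned into numbers. Everything specific here is an assumption made for illustration, not part of the proposal: "fit" is taken to be 1 / (1 + mean squared error), "simplicity" is taken to be 1 / (1 + parameter count), and the `Model` class, the toy data and the three candidate models are all invented.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Model:
    name: str
    predict: Callable[[float], float]
    n_params: int  # assumed stand-in for how complicated the model is


def fit_score(model, data):
    """Assumed measure of fit: 1.0 for a perfect match, smaller as error grows."""
    mse = sum((model.predict(x) - y) ** 2 for x, y in data) / len(data)
    return 1.0 / (1.0 + mse)


def simplicity_score(model):
    """Assumed measure of simplicity: fewer parameters scores higher."""
    return 1.0 / (1.0 + model.n_params)


def theory_rank(model, data):
    """theoryRank = fit x simplicity."""
    return fit_score(model, data) * simplicity_score(model)


# Toy observations of "reality" (they happen to lie on y = 2x + 1).
data = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]
table = dict(data)

models = [
    Model("constant", lambda x: 5.0, 1),
    Model("linear", lambda x: 2 * x + 1, 2),
    Model("lookup table", lambda x: table[x], len(table)),
]

ranked = sorted(models, key=lambda m: theory_rank(m, data), reverse=True)
for m in ranked:
    print(f"{m.name}: {theory_rank(m, data):.3f}")
```

With these particular choices the lookup table fits as well as the line but is penalized for memorizing every point, and the constant is simple but fits badly. Different definitions of fit and simplicity could reorder the list, which is part of what the argument further down is about.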

What do you make of this, top?


Do you have any idea why empirical Science is often contrasted with the Mathematical Sciences?

Math is step 2, empirical is step 3.

Do you claim that CS is not systematic?

It is often missing step 3. Step 3 can indeed be tough, but truth does not grade on effort.

Do you claim it is not empirical? Do you claim that Science must be one or the other but never both?

Both step 2 and 3 are necessary for "decent" science.

Do you really think it necessary to redefine every word you use? Do you think doing so will advance or impede communication? Have you heard of private languages? -- have you heard of Wittgenstein?

I found the existing definitions lacking. For example, "Systematic" can be interpreted many different ways. I tried to tailor one that better fits the IT world using an algorithm(s), words from our world such as "model", and StepwiseRefinement.

Have you a defined epistemology? Does Quine ring a bell? Can you deal with this: http://www.ditext.com/quine/quine.html

Before we explore other models of science, do you have any specific complaint about the one I proposed? - t

I find it insufficient. For some reason I am not compelled to redefine every term I use. Your claim is 'Computer Science does not conform to my view of what Science is. I reject any alternate views in common use because I can. I particularly reject the definitions which would allow Computer Sci to be a Science, because the more outlandish my claim the more likely some fool will feed my narcissism.' I am not compelled to entertain you now that I have exposed your intent. Nor even am I amused. Either read Quine and answer his criticisms or do not.

The existing definitions as stated are too vague. Dictionary writers are not necessarily interested in "clinical" precision. Their primary goal is to mirror the usage of terms as found. And I'm not necessarily saying that computer science is "not science", but merely that so far it has focused too narrowly, getting stuck on minor or narrow variables. A wider and more useful view would look more like economics and psychology than "traditional" science. It's a "smell" that much of what's called "computer science" doesn't measure anything. Science without measuring is highly suspicious. - t

{ComputerScience is a branch of mathematics that deals with computational theory and algorithms. ComputerScience no more deals with economics and psychology than, say, SetTheory deals with geology. Economics and psychology are academically well covered in the fields of SoftwareEngineering, Information Systems, Information Technology, Humanities Computing, Human Computer Interaction and what is sometimes called Computer Studies. All of these are occasionally lumped-in, terminologically (and arguably erroneously), with Computer Science. That may be the source of your confusion. Because, say, Human Computer Interaction occasionally is mentioned in the context of Computer Science, it appears that Computer Science cares little about Human Computer Interaction. In fact, HCI is a strong field on its own.}

If ComputerScience is sufficient for answering "is tool X better than tool Y", as some of you imply, then why do we need those other fields? Or are you now agreeing with me that CS is one piece of a vast puzzle? CS has served us well with regard to machine performance, but is having a hard time moving beyond that. - t

{I think you've been very confused about what "Computer Science" means. I don't recall anyone else here being confused about it. There are certainly cases where ComputerScience is sufficient to answer "is tool X better than tool Y?", but only for certain definitions of "better".}

"Certain definitions" is the operative word. A practical definition of "better" would contain a good many variables, but nobody is brave or smart enough to attack that problem, instead hiding behind obscure and narrow pet factors, insulting those who don't buy into their magic key. (Related: TooManyVariablesForScience)

{Any study encompassing "a good many variables" is going to be complex and expensive. It is difficult to justify (i.e., get research funding for) a study to find out whether, say, Java or C# is "better", or to find out which is the "best" of OO vs Functional vs Procedural programming. Even agreeing on an appropriate definition of "better" or "best" is likely to prove highly contentious, and the objects under study are all moving targets. Which version of Java? Which version of C#? That makes research difficult, and the results of that research currently appear to be of little interest to the industry. As long as there are other questions that are easier to answer, with results that are considered more important (e.g., does network routing algorithm X exhibit higher performance than Y?, etc.), they will get funded and "is tool X better than tool Y?" questions will largely be ignored.}

I generally agree with that assessment. That is why one shouldn't go around insisting that their pet GoldenHammer is truly the way and insulting anybody who rejects it.

{You mean, like you do with TableOrientedProgramming? :-)}

I don't claim it is objectively better, only that it fits my own WetWare better, and other ExBase colleagues have reported the same to me.

. . . . . .

I submit we have enough to do in our own field....

-- I really have no idea what you think you are saying, top

On the other hand, math and science have testing/falsification in common, it's just that in pure math (in its rigorous formal form), the test is in regards to logic and consistency first and foremost, whereas in science, the test is in regards to physical reality, first and foremost, and math (including logic) are used towards that end.

It could be said that math tests for internal consistency and science for external consistency. - t

And both are somewhat similar to engineering, including programming, in which the testing/falsification is in regards to "what works" for some set of goals.

The less math that backs up a field of study, the less rigorous it is, ...

Not necessarily. Perhaps the field simply has a lot of axioms and the evidence is based on testing these axioms against reality rather than what the axioms imply.

...and the more of an art it becomes, and the more that one tends to have unfalsifiable heuristic methods (and indeed, methods based simply in taste and opinion) rather than provable algorithms.

The field of software is backed by theoretical CS, but not all programmers approach it from that angle nor are trained in that background, so we have some programmers who think software is a mathematical field and others who disagree and think it's an art. It's both, it's neither, it depends on what the individual person does.

People who spend their careers doing business apps usually seem relatively more likely to have the most outright disdain for theoretical CS and math, compared with programmers in other areas, as far as I can tell, and don't seem to notice that other areas of programming outright require theory and math background. Some areas do, some areas don't - it's the blind men and the elephant. -- DougMerritt

Example? Most of our education seems geared toward the physical world. However, most of business is about human-made rules and preferences. There does not seem to be a lot of "science" in this.

I think you may have misunderstood me. I'm not saying that theoretical math and CS is necessarily applicable to the writing of business apps (it may be in some cases, but that would seem to be rare for the common real-world cases). I'm saying that, precisely because the need for such theory is far more rare in areas of software such as business apps, people who do mostly the latter tend to under-appreciate that such theory is nonetheless very important in other areas of software, possibly even in most other areas.

Thus I observe arguments here (and elsewhere) between people who do one kind of software and people who do another kind, and thus I claim the "blind men and the elephant" parable applies.

BTW I believe that many business "human-made rules and preferences" actually should be informed by theory more often than they actually are in small and medium-size companies.

Theory does not always pan out in the real world. Many ideas are tried, but only a few seem to be usable in the end. Small businesses don't have the resources to experiment via trial-and-error.

The international conglomerates do so, although they don't talk about it all that often. Competitive advantage.

Note for instance WalMart staying on top of (from what I hear) relatively state of the art data mining, which to do well, involves plenty of hairy theory; one cannot do state of the art data mining by sheer artisanship and native intelligence, it requires heavy duty mathematical machinery as well (non-state of the art can of course get by without the math, and still get sometimes-valuable results; I'm not trying to draw a line in the sand here).

Most companies will let the marketplace test ideas before handing over big bucks. After being burned or let down by promises such as ExpertSystems, neural networks, etc., they are understandably skeptical of yet more "lab toys". Thus, they'll let others try them, keep their ear to the ground, and then copy successful projects from other companies.

I believe that similar observations apply to some (but certainly not all) business rules applied in database schema and query design. -- Doug

In theory, the machine should optimize queries, not people. In theory, you are supposed to only ask for what you want, not how to get it - that was to be the machine's job. Even that turned out not to be fully realizable.

There are aspects of ComputationScience? that are entirely mathematical. These include such things as BubbleSort and TypeTheory, fundamental cryptography.

There are aspects of ComputationScience? that are entirely psychological. These include such things as design of HumanComputerInterface?, extraction of requirements, UseCase stories, etc. There is often a great deal of guessing, observation, iterative refinement here. It also includes aspects of Steganography (hidden in plain sight), social-engineering of multi-agent systems (where agents include people - e.g. PeerToPeer systems that penalize the people who aren't uploading).

There are aspects of ComputationScience? that fall closer to engineering. Actual programming to a set of requirements, for example, has dozens of approaches... and choosing one over another certainly isn't math. Network design, hardware mass manufacture, .... Also, choice of heuristics, tweaking of fuzzy parameters, all that sort of tinkering.

There are aspects of ComputationScience? that fall squarely within 'Science' - observe, hypothesize, experiment, abduct, (and sometimes repair). Identifying the cause of bugs certainly follows this pattern. However, all components of ComputingScience are still science... from those involving hard math to those involving soft psychology and polls on user satisfaction.
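As a hedged illustration of bug-hunting as "observe, hypothesize, experiment", here is a small Python sketch that bisects an ordered list of changes to find the one that introduced a failure. The function `find_first_bad`, the change names and the `is_bad` test are all hypothetical; the sketch assumes the system works with no changes applied, fails with all of them applied, and stays broken once it breaks.

```python
def find_first_bad(changes, is_bad):
    """Return the index of the change that first makes is_bad(prefix) true.

    Assumes is_bad([]) is False, is_bad(changes) is True, and that once a
    prefix is bad every longer prefix is bad too.
    """
    lo, hi = 0, len(changes)  # changes[:lo] is known good, changes[:hi] known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Hypothesis: the bug is already present after the first `mid` changes.
        if is_bad(changes[:mid]):   # the experiment
            hi = mid                # confirmed: culprit is among the first mid
        else:
            lo = mid                # refuted: culprit comes later
    return hi - 1


# Observation: something broke somewhere in this (made-up) series of changes.
changes = ["rename-field", "new-cache", "tidy-logging", "swap-parser", "bump-deps"]


def is_bad(applied):
    """Stand-in for a real test run; here 'swap-parser' is the secret culprit."""
    return "swap-parser" in applied


culprit = changes[find_first_bad(changes, is_bad)]
print("first bad change:", culprit)
```

Each loop iteration is one small hypothesis put to a test, and the answer to each test narrows what the next hypothesis can be.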
"The difference between screwing around and science is writing it down" -Adam Savage
An interesting cross-over discussion: TestFirstDesignIsLikeTheSocraticMethod. -- GeoffSobering
I thought science was just what Newton describes in Mathematical Principles of Natural Philosophy.
The fact that bubble sort works is a mathematical characteristic of the universe; the specifics of building a machine to perform it is an invention and covered by patents (which are evil). If your machine is a universal computation machine (patentable in its own right) then the program to transform it into a bubble-sort machine is protected by copyright and does not need a patent, nor should it be as that leads to effectively patenting the discovery.

Discovering the bubble sort algorithm is science. Everything else is a simple matter of implementation :)

Wouldn't that be a mathematical discovery? It's not a physical thing. Science is the study, analysis, and modelling of the physical world; at least by my notion of "science". True, it is influenced by machinery concerns, but unless we make a "problem" to be solved in terms of machines, such as minimizing computing resources and time, I classify it as math.
See Also: ApplyingScienceDiscussion, TooManyVariablesForScience, ScienceAndTools
CategoryScience CategoryDefinition

Last edited November 5, 2014