Oo Empirical Evidence

AdVerecundiam is probably the best we can do. There is some alleged empirical evidence of OO over procedural, functional, etc. in the past, but still tends to rely on AdVerecundiam or ProofByAnecdote? in the end.

Perhaps software engineering is too complex to have empirical evidence. We might not be happy about this, but we must come to grips with it. Maybe someday we will have it, but we cannot rely on it to promote good OO practices and design.

The problem is not complexity; people regularly perform more complex tasks with far more predictability. The problem is that software engineering doesn't yet really exist. The practice of writing software today is not really comparable to what engineers do. We are learning, but there is a lot of catching up to do. Even worse, all the engineering process in the world can't make up for a lack of basic theory. You can't engineer what isn't understood. Likewise, you can't measure what you don't understand. So "argument from authority" and "anecdotal evidence" are the best we can manage. This shouldn't be a surprise; it is a symptom of a larger malady.

So if a customer asks, "Why should I choose OO over foobar?", what should I answer? Say that I like it and it works for me? There's not a lot of mileage in that.

Why is the customer even asking the question? OO or not is a technical issue, not a business one.

What is the difference? If you pick a bad paradigm or methodology, it may hurt profits.

If you put the customer in charge of making technical decisions, it certainly will.

What about an IT department asking for advice about whether to "go OO" or not?

The IT department is asking the wrong question. It should be asking what languages it should go with. Its decision will be based on more pragmatic concerns, such as available talent, available tools, and possibly the hardware platforms it must support. I don't think the "goodness" or "badness" of OO would play a part in this decision.

Even with an OO-capable language, the issue of how heavily to use OO still looms.


Here are some forms that such evidence may take:
  1. Satisfaction surveys of customers and/or managers.
  2. Code comparison metrics. For example, lines-of-code (or token counts) and change-impact analysis (on real or sample changes in requirements).
  3. Reasoning and philosophical ponderings.

First, it might be helpful to focus on what OO allegedly improves. Common candidates that I encounter include:

A. Increased Reuse (see ReuseHasFailed).

B. Reduces the impact (on code) due to changes in requirements (CodeChangeImpactAnalysis).

C. Improves the "grokkability" or mental navigation of the software code.

Item 1 would only be good evidence if a wide sample was taken. Many people are biased one way or the other over OO. A simpler method would be to simply take a poll as to how many Fortune 500 software companies use OO.

But ProofByPopularity is sticky. Technologies such as Amiga and the OS/2 operating system received rave reviews, but were rejected by the market. It is hard to distinguish between fads and merit using popularity. Besides, using OO and using an OOP programming language are not necessarily the same thing.

Amiga and OS/2 never got much market share with Fortune 500 software companies. The benefit of looking at Fortune 500 companies is that these are the companies making money. The ultimate test of the business value of a methodology is profitability.

But the F500 seem to hop on fads just like anybody else. Their IT decisions may not be enough to float or sink them anyhow, unless perhaps they bet big on a real stinker. They also tend to pick where the momentum is because that is what the tools and training is focused on. IOW, the QwertySyndrome.

I agree that there are significant problems with copy-catting F500 companies. The point I was trying to make is that it is better than the proposed method of satisfaction surveys (reason being the small sample size and lack of grounding in profitability). In my opinion of course. I think anecdotal evidence from trusted software gurus is better than either of these, but it certainly isn't empirical evidence. My personal technique is to read a book, if it's a good convincing book, I look in the bibliography for other good books. After reading a bit about the new fad from different trusted perspectives, I try it out myself. This is how I learned about XP, refactoring, patterns, etc. The problem is that this only works for one person (and only if that person likes reading technical books/articles); it doesn't do much to convince a manager to say, "Well, I've read a lot of books on it and tried it myself. We should do it." If you've got a manager that is convinced by such propositions, keep your job as long as you can, because they are rare in my experience.

Item 2 is just eval, in my opinion. Code metrics are like IQ tests, they measure how well the code scores against the metrics, which may or may not have any relevance to RealLife.

While I agree that using a limited set of metrics is bad, I don't think the idea should be totally dismissed. Sports enthusiasts have found *many* ways to measure players and teams beyond simple scoring. The biggest remaining issue is how big a weighting factor to put in each metric.

...And you still get sports fans arguing about who the best baseball player or boxer is. And that still doesn't tell you whether they are the best because of their training method or their natural talent. Knowing that MichaelJordan? is the best basketball player doesn't help JoeSchmoe? who's trying to learn basketball.

But battles over the top sports stars are usually over a few percent. Most will agree that a player averaging 20 points, 10 rebounds, 8 assists, 46 percent accuracy, and 2 "mistakes" (turnovers, etc) per game is a more valuable player than one who averages 5,3,3,35,7 respectively. Perhaps different coaches will weigh such stats a bit differently, but at least they have stats.

I also do not see how "trying to learn" applies here. This topic is not about training.

Item 3 won't convince anyone in business.

A, B, and C are not relevant to any customers, who are concerned only about feasibility and cost.

This page seems torn between two separate issues: What evidence is there for technical decision makers, and how do we keep business decision makers from messing up what are essentially technical decisions. The second issue requires framing the decision in business terms, keeping the technical stuff out of it.

I don't see how business and technical decisions are vastly different. If Boeing wants plane sales, they have to use good plane engineering. The question is how do we tell if a plane (design) is good? You then have things like fuel-efficiency measures and wind tunnel tests. Should we give up on the idea of using science and engineering thinking to compare paradigms and software engineering techniques? If so, then software is somehow "special". I find this a bit hard to swallow. I think the software discipline (oxymoron? :-) is just naive or too new rather than special. Until a way out is found, we should keep our minds wide open

The technical tests Boeing might use to test out its designs test the validity of a given design, not validity of a given design method. SoftwareEngineering already has processes that are analogous to Boeing's wind-tunnel test: One of them is UnitTests.

The kind of empirical evidence being discussed on this page have to do with ObjectOrientedProgramming, as a method, and whether or not it's more efficient and robust than another programming method. Does Boeing have empirical evidence for whatever design methods or methodologies it uses in-house? Does anybody ever say there "Hey, I have numbers that prove that if we switched to using this rapid-prototyping tool, then our productivity would go up 15 percent" ?

They might actually have numbers, but 99:1 they came from the vendor of the tool.

If they do enough designing, then it should be measurable. If they turn out hundreds of different designs that turn into products, then one can start comparing. Have different teams for each methodology compete for the same aerospace project. Over time one can then count how many accepted designs came from each methodology. Not a perfect metric perhaps, but if one approach is clearly better, then it should stand out. It is things that are say only a few percent different that are hard to detect. IOW, if the signal is smaller than the noise.

Does such an approach measure how long it takes to write functioning code, or just a design that claims to represent functioning code?

I assume the military will choose a tested design.

"If they do enough designing, then it should be measurable..."

How much time are you willing to spend on this? Paradigms, especially in software, rarely last 20 years, or even 5 in many cases. If you're going to reap the benefits, you want to be one of the early adopters not one of the late majority.

1. This assumes there are benefits, and 2. new paradigms sometimes fizzle quickly. It is not much different than trying to pick stocks; you don't know at the time whether you are looking at the beginning of an upward trend, or a short hiccup. Early adopters just may be the only adopters.

Sure, in twenty years, we'll be able to see with perfect hindsight whether OO was good or not, but that doesn't help us today. Although OO has been around for a long time, it's only become common in the last 10 years or less.
The benefit of looking at Fortune 500 companies is that these are the companies making money.

Wrong, wrong, wrong, wrong, wrong. The companies in the Fortune 500 are defined as the 500 U.S. corporations with the largest revenue, which is completely totally different from the largest profit. It's possible, and it happens all the time, for a small 10-person company to show a 150% return-on-investment, whereas Wal-Mart might show a 2.3% return.

Every so often, I have a customer ask me to show that I'm improving productivity. In one case, it was because the company wanted productivity to improve over time; in another, it was because they were considering paying for another programmer to be added to the project.

Whenever this happens, I'm put into a quandary. I don't know how to objectively measure productivity in the programming world. I can define it -- "productivity = features / effort" -- and I can measure effort -- but I can't measure features. And that stumps me every time.

I'm aware of the function point metric, but last time I looked at it, it wasn't entirely measuring features. It included stuff like "number of database tables" and "number of forms." A sufficiently clever team might spend more time per function point and come up with fewer forms and database tables, while still satisfying the base requirement. They would be penalized, because they would seem less productive. BehavioralEffectOfMetrics (blatant plug) says that you'll get what you measure. If you measure productivity using function points, then you'll get lots of forms and database tables, and that will lead to crappy software.

Count table columns and form fields instead. Also, if there are no paycheck rewards, then there is little incentive to fudge the design, unlike say Soviet factory quotas. They don't even have to know that things are being compared.

This is a very basic problem. If we can't objectively measure productivity in software projects, then we have no way to say whether or not OO is better than procedural. We can't say whether or not anything is better than anything. We have nothing but anecdotal evidence and personal experience. If that's all we have, fine, I can use that with some success. But it's very hard to defend to a business person, and a hell of a way to run a profession.

-- JimLittle

I think you may have set an artificially high goal. Few things in life, especially business related tasks, have any objective measure related to them. The best justification relies on theory, not experimental examples and not anecdotal examples. Explain the rationale behind OO (I'm little unsure exactly how and why you believe "OO" and "procedural" are at odds or exactly what you are referring to).

Theory is only useful when it's backed up by evidence. Without evidence, you can't tell which theories are correct and which ones are hogwash. I've seen far too many software theories that sound reasonable but haven't worked in practice.

When I said "procedural", I was referring to languages like C and Pascal. --JimLittle

Evidence is not all-or-nothing. See EvidenceTotemPole.

If someone does not believe the theory, he will also find a way to disbelieve the evidence. Also, very few theories are so bad that they do not have some supporting evidence. Theories can be best evaluated based on their internal consistency and scope, neither of which can be shown with "evidence."

You can't assume that everybody is stubborn. Some will be, some won't. Also, it is often not a matter of outright disbelieving a theory, but rather that software engineering is so complex that it is hard to consider all tradeoffs and variables at once. One approach may make certain things easier, but it may also make other things harder. Those "other things" are often not readily apparent. How many times have you seen an example in a trade publication that showed some technology or technique simplifying something. However, when you actually try it, you find the examples or theories are often just "best case" scenarios (EvidenceByBestCaseScenario), and the article never mentioned the down-sides. After a while people get somewhat shell-shocked at new claims based on catchy theories.

I believe the above is echoing my point. For anything complex, an example will never match an individual situation. If one understands and accepts the theory, he will find ways to adapt the example to his situation. If one does not understand or doubts the theory, he will find numerous differences between the example and his situation. This is not stubbornness; I am merely saying that if one does not believe the underlying theory is at least possibly correct, examples based on that theory will just appear to be "wrong." [Note: When I say theory, I am implying something more than just opinion. I am talking about something that is well thought out, has internal consistency, and reflects reality.]

So you seem to think that accepting base concepts or philosophies is the key? As a case study, I don't believe that global hierarchical taxonomies are very useful for code design (LimitsOfHierarchies) and I find relational structures more manageable than their common counterpart, NavigationalDatabases, which also generally reflect the nature of OO designs. It appears to me that OO hinges on these two base assumptions, but I reject them. Does this make me "stubborn"? If you can show them superior, then I might change my mind. (http://geocities.com/tablizer/core1.htm)


Beware the wishy-washy proof. Paradoxically, I have found through trial and error that when it comes to proof, "all or nothing" is my best approach. I use proof or I use judgment and opinion - I don't try to mix the two. For me, good sound proof (hard evidence) is better than good sound judgment and opinion (based on sketchy evidence) is better than weak, poor, wishy-washy proof (based on sketchy evidence).


How do we best use anecdotal evidence and personal experience?

EdYourdon has allegedly taken surveys of manager satisfaction of projects. I don't believe OOP projects stood out from the crowd on those.

Given the immaturity of software the only viable basis for making decisions are past experience and recommendations by people we trust. If I'm writing text processing software then I'll use those 2 criteria to decide which language to use and what paradigm to apply.

The evidence seems to suggest that personal preference makes a much bigger difference than any objective feature or "magic law" of the competing paradigms. Thus, what you probably will get if you ask for recommendations is a personal preference. Besides, that is not much different than what Yourdon has done, just at a smaller scale. And being trustworthy does not necessarily ward off accidental variations of PersonalChoiceElevatedToMoralImperative. For, humans seem to have a hard time telling the difference between subjective and objective. I suppose that consulting with people who think like you can help you find products and techniques that will better fit your personality. In that sense, "asking around" is useful, but does not solve the problem at hand here.

But until the problem at hand here is solved (finding a good way to make software business/technical decisions), what is the best poor way to make software business/technical decisions? For me, it is "personal experience" and "asking around".

There are objective metrics of sorts. For example, comparing code size (lines of code or number of tokens, etc.) and CodeChangeImpactAnalysis. However, there are subjective issues still involved in them, such as how to weigh tokens.


Ways of evaluating technical alternatives (either by a technical person or by a manager) in no particular order:


Things that can interfere with being objective, or receiving objective information from others:


Please distinguish between RealOo? and FakeOo?. FakeOo? is when they do use ObjectOrientedLanguages? but the don't use ObjectOrientation.

RealOo? needs UnitTests, has FewShortMethodsPerClass and follows some DesignPatterns. FakeOo? does not use UnitTests, certianly does not use DesignPatterns, has FearOfAddingClasses and does not use PolyMorphism.

-- GuillermoSchwarz

You're confusing good practices with "real OO". UnitTests are orthogonal to OO; they are good practice under any paradigm, and conversely, AlanKay did perfectly good OO that sometimes lacked unit tests. Design patterns were invented (or if you prefer, recognized) long after OO was invented. Polymorphism is a perfectly natural fit with OO.

You're making up your own idiosyncratic terminology by insisting on this. What good does it do to speak a language that the rest of the industry doesn't share? Just call it good practice, or best practice, or something.

Although millions of people would call you just plain wrong and misguided for being against polymorphism, no matter how you label that stance.

-- DougMerritt

Note that I did not write the text you are replying to. Also, you are using ArgumentByPopularity?. However, it does hilite that NobodyAgreesOnWhatOoIs. Some OO proponents do indeed reject heavy use of polymorphism for the same reasons I rail against it. -- top
Above, DougMerritt unnecessarily reprimanded GuillermoSchwartz? for being in opposition of polymorphism. When reviewing Guillermo's own sentence, removing intervening and irrelevant sub-expressions, we find he says, "FakeOo? does not use . . . and does not use PolyMorphism." Any reader with good comprehension must come to the conclusion that Guillermo is not in opposition to polymorphism, but is in fact in favor of it. As a regular user of OO languages and analysis techniques, I find myself wholeheartedly agreeing with Guillermo's statement.

Now, on to Guillermo's other requirements, some of which I disagree with as well.

RealOo? does not need unit tests, but it sure is recommended if you want a program that is both understandable 5 years later (thanks to unit test's inherent value as documentation) and which actually still works in the same time frame (thanks to unit test's inherent value in regression testing). Attempting to conduct unit testing without RealOo? is a horrifying exercise in pain and suffering. Taking procedural programming as an example, if your code makes heavy use of the sockets API, you'll need to stub out the sockets functions your module calls. But, since they're often implemented at the system call level, it's nigh impossible to link your code against a library implementing the same named functions while also linking against libc. You have to write your code to use different names (in my code, I used IPC_xxx, where xxx was the same name as the corresponding sockets API call). Given unique names for the sockets module API, it's doable (I've done it before), but painful. It hurts. It took a lot of time and energy to get my stuff right. (On the positive side, I now understand how sockets work much better than I ever did before). It demonstrates that if you want to a module M1 that calls functions in M2, you need to create a stub module S12, which stubs just the functionality you need in M2 from the context of testing M1. Then you have to find a way to vector M2's API, so that it calls the procedures in S12 instead if those in M2. This involves function pointers -- precisely the same kind of things you get for free in RealOo?.

So, in essence, one can make the argument that, while RealOo? doesn't need unit tests, proper unit tests does need RealOo?, even if built on top of non-OO languages.

With regards to few short methods per class, this is something that is equally difficult (but, again, not impossible) to do in procedural languages. Having a language feature designed expressly to support the creation of abstract data types (namely, classes and interfaces) obviously is a huge enabler to writing small collections of small methods. The succinctness of many of these short methods comes from the freedom of concerning oneself with the concrete types of the arguments, instead concentrating on the abstract. Without this support, you'll find yourself writing many modules whose sole purpose is to multiplex functionality across several different concrete types (indeed, this is precisely what module S12 does in the aforementioned unit testing example). This is something that a RealOo? language does automatically for you. That is their raison detre.

I'll disagree somewhat with the design patterns requirement; I've written plenty of procedural code that implemented a variety of design patterns. Nothing new there. --Samuel A. Falvo II
What about GNOME and KDE? Two Desktop Environments roughly equal in success, features, maturity. Both written in the OO:ed paradigm, but one in a non-OO language and the other in an OO one. Doesn't that indicate that there is no significant gain in using an OO language over a non-OO one?

Just because it is written in a non-OO language does not necessarily mean it does not use OO principles. I don't know if this was done in this case, but it is something to check.

GTK+ and Gnome are heavily object oriented in spite of being writen in C. They implement signals (a.k.a. message passing/virtual methods), inheritance, encapsulation, data hiding, and polymorphism.

I don't know about systems software or low-level GUI controllers, but for custom application GUI frameworks, more use of declarative techniques are the way to go in my opinion. For one, OOP frameworks are hard to share among different languages. Related: TooMuchGuiCode.


http://csdl2.computer.org/persagen/DLAbsToc.jsp?resourcePath=/dl/proceedings/&toc=comp/proceedings/esem/2007/2886/00/2886toc.xml&DOI=10.1109/ESEM.2007.46

Fee-based, but here is the abstract:


See Also: DisciplineEnvy, OoIsPragmatic, IsObjectOrientationMoreComplex
CategoryEvidence

EditText of this page (last edited August 1, 2008) or FindPage with title or text search