Object Oriented Purity

This is not an argument about whether ObjectOrientedProgramming = Good. Or whether Purity = Good, even. It is perfectly okay to have a language which makes compromises, and you should pick the language which makes the right compromises for your project.

Most OO languages are not considered 'pure' for some reason or another. See, for example: http://www.research.att.com/~bs/oopsla.pdf for a discussion of purity as it relates to C++. But what is purity exactly? Well, ObjectOrientedProgramming, by definition, has three requirements:

All OO languages have some implementation of these three features. However, some of them don't quite fit, usually because they are MultiParadigmProgrammingLanguages. Lets take a look at them in turn.

Encapsulation

Which is just this: you don't get access to the internal state of an object. Or, more strongly, clients aren't allowed knowledge of how services are implemented. C++ and Java make steps in this direction by having 'private' parts. Other languages use closures to do this. Some don't bother at all (e.g. Perl5). Public member variables are a violation of encapsulation and languages (even Java) that allow them may be considered less than pure. However, if the access mechanism for the public member is syntactically identical to a function call (as in the EiffelLanguage), then the decoupling promised by encapsulation remains intact and the language can be considered to be significantly more pure than if it did not support this feature. Languages that allow access to member variables exclusively are still more pure. Self [SelfLanguage] is an example of this kind of language.

Some languages go further than C/C++ and make encapsulation "by convention", i.e. voluntary. PythonLanguage (widely considered PureOo?) is one; you mention PerlLanguage (widely considered a hybrid) as another. Other than multiple languages which make everything voluntary, I can't think of two languages which have the same encapsulation rules. EiffelLanguage has a very flexible set of encapsulation rules; C++ and Java are less flexible (they are similar to each other, but different in key respects). Using a particular encapsulation policy as a criterion for OO-ness seems to be artificial. At a minimum, encapsulation should be part of the culture of the language (something admittedly hard to specify in a language standard), if it is enforced by the tools so much the better. But many feel that compiler-forced encapsulation gets in the way of things like UnitTests (many of which like to frob the internals of an object).

Inheritance/Delegation/Prototyping

One reason that this is fundamental to OO is because it provides a significant mechanism for reuse. Type hierarchies are a (un?)happy side effect of inheritance. It's an optimization that is an implementation detail and not a fundamental property of inheritance/delegation/prototyping. TypeInference, for instance, provides a solution to the same problem without forcing the programmer to do book keeping in the code in order to help the compiler.

Inheritance is really thorny territory. It gets you into the whole implicit type vs explicit type vs implementation malarkey, which you will find beaten to death on CircleAndEllipseProblem debates. And then there's multiple inheritance, 'become' type-changing operations, and exceptions to types (think of a platypus - see ImperfectHierarchies?). Delegation has fewer problems than inheritance, but the main difference is that delegators do not get access to the internal state of the classes they inherit from. Self springs to mind as one of the 'purest' languages in this respect: using prototyping instead of straight inheritance or delegation means you can support platypuses (platypi?). It's not without its problems though, especially as a practical language.

While strong typing does force a significant amount of programming overhead it does allow the compiler to detect some errors that would otherwise be discovered only at run time.

Polymorphism

The ability to handle objects of different types as if they were objects of the same type. Often constrained to handling objects with a common ancestral type as if they were all of that type. Enables LiskovSubstitution. In the most common languages polymorphism is implemented by having an explicit type hierarchy. Slot- and prototype-based OO languages (think perl5, JavaScript, Self (again)) let pretty much anything go and fail if the slot/method called does not exist (not quite true: perl5 'auto-vivifies' things that don't exist). FunctionalProgrammingLanguages and some OO languages that are based on them (notably CommonLisp with the CommonLispObjectSystem) provide a feature called ExternalPolymorphism which moves the definition of the implementation from the type definition of the object to the definition of the function. ExternalPolymorphism can lead naturally to MultipleDispatch.

See also: PrinciplesOfObjectOrientedDesign HowObjectOrientedIsClos



Discussion:


Strong typing

It seems to me that strong typing (like DBC below) is an issue that is tangential to purity. It's really polymorphism that enables LiskovSubstitution. Strong typing can help to resolve ambiguity, but it can also be a big pain when there isn't any ambiguity to resolve. My personal opinion is that type inference could be combined with strong typing to provide the best solution. That way the compiler would be responsible for determining which type is in use in all of the obvious cases and the programmer would be required to be more explicit in any case where there is ambiguity - which would actually aid future readers of the code as well. In this way the typing structure of the language could actually be of more help than hindrance in most cases.

But my point is really that the typing strategy is an implementation detail. It's one that OO inevitably raises, but it is not itself a defining characteristic of a "pure" OO language. -- PhilGoodwin

Is StrongTyping or StaticTyping intended here? The latter debate (static vs dynamic) seems to be more heated in terms of OO-ness; some feel that OO required dynamic typing (I disagree on this point). See TypingQuadrant


External Polymorphism

"Languages with strong types can also have 'external' polymorphism (the ability to handle a range of return types from methods). Again, many do this by having explicit subtyping."

I removed this from the Polymorphism section because I don't think that it's true. Please see ExternalPolymorphism and correct me if I'm wrong. -- PhilGoodwin


DesignByContract

[I]t... [seems to me] like slot/prototype based languages are 'the most OO'. But there's a more recent fly in the ointment: DesignByContract. You could see strongly typed languages as enforcing one form of contract, and it captures the idea of what we're trying to do in 'optimizations' of the original three concepts which make testing easier. Potentially, this is the one that got away in OO - something that should really have been added to the definition, although it is orthogonal, and would benefit other paradigms. Eiffel stands out in the DBC arena, but none of the more 'purely OO' in the senses described above do. If we ignore DBC, then Self and SmallTalk are benchmarks against which other OO languages can measure purity; with DBC in the mix, then I can't think of a single language that comes up to scratch. These are just my opinions, so flame on. Remember, though, I'm not describing what's good here, just what's pure. -- BrianEwins.

I think that DBC is a good thing. But it's not part of the definition of OO. It's interesting, like typing, because it addresses an issue that naturally falls out out of the definition of OO without actually being directly addressed by it. In fact, I think that DBC is a part of the type issue. Types are usually defined as a set of methods and their signatures (called an "interface" by environments such as Java and COM). If that definition were to be extended to include DBC contracts then Liskov substitutability would be enforceable at compile time and strong typing would become much more valuable. You could still use type inference in cases where there is no ambiguity of course. -- PhilGoodwin


Some lines in a program are talking to the next human reader. Some lines are talking to the compiler. Some lines are talking to the execution machine. Finally, some are descriptions of what to do next... computation in motion.

In the above paragraph, "the most OO" seems to refer to the computation in motion parts. Comments talk to the reader; compiler statements such as macros and type definitions talk to the compiler; DBC assertions talk to the execution machine.

In Smalltalk, a class definition is a funny thing, because it is an execution statement directed to the execution universe that the development environment is sitting in. A method definition and temporary variable declarations are talking to the compiler. Self has fewer of these, and so perhaps seems more OO.

I am in favor of separating description of computation from talking about the program (as in assertions and discussions about the type of a value). In fact, I'd like the four types of statements to be distinct overlays (three overlays on the execution description). -- AlistairCockburn


Encapsulation is not about preventing access, but rather controlling it. An example would be a output_file class that opens the file for output when a method that requires output is called. In such a case, part of the internal state is the "openness of the file". The user is not prevented from accessing it, since he can change it to is_open at anytime (by calling a method that writes). Rather he is prevented from making arbitrary change to the state, i.e. his access is controlled. -- ThaddeusOlczyk

An example of an object hiding its state would be an object that caches data. The first read of the data incurs the overhead of, for example, a database access. Following reads return almost immediately, until for some reason the cache is invalidated. This is a common example where the state of an object is hidden from the caller. -- WayneMack


The whole argument about ObjectOrientedPurity to me is a RedHerring... it amounts to little more than proponents of some languages (Smalltalk chiefly, with a few SmugLispWeenies occasionally - and ironically - chiming in) throwing rocks at other languages (C++ and Java seem to be the main targets) for failing to dot some perceived i or cross some imaginary t - i's and t's which may or may not be important in the GrandSchemeOfThings?. It seems to follow from the reasoning that "OO == good, C++ == bad, therefore C++ != OO".

Ignoring design flaws which may or may not be present in C++ or Java (which probably have little bearing on its OO-ness; one could certainly design a language which was purely OO and horrible to use if one wanted), why precisely should we regard "pure" OO languages as "good", and hybrid languages (which allow easy use of other programming paradigms) as bad? I like OO very much; however I think the notion that OO is the OneTrueWay and that "non-OO" code ought to be banished to the nether regions is premature. Our discipline is still too immature for such closed-mindedness. I see no reason to declare a language bad simply because it has things like free functions, an encapsulation model that differs from Smalltalk, etc.

In other words...if a language is a hybrid rather than pure OO - who cares? It's goodness is largely orthogonal to its purity.

Isn't this last section itself the only "red herring"? Or perhaps instead a strawman - an argument propped up in order to then be knocked down. First is the strawman itself - the claim that this page is just a collection of people "throwing rocks" at hybrid languages. The page, itself, explicitly and gingerly steps around making value judgements. Instead, it very carefully focuses on what ObjectOrientedPurity means - for better or worse. The second paragraph is itself the only paragraph on the page that makes any claims about whether any language is "good" or "bad". Similarly, the only suggestion that "OO is the OneTrueWay" or that non-OO code "ought to banished to the nether regions" is in this paragraph itself. The writer sees "no reason to declare a language bad ..." - and neither to any of the other contributors on this page. Finally, the writer once again agrees with the premise of the rest of the page, namely, that a language's "goodness is largely orthogonal to its purity". I think the strawman is again flat on the ground.

An argument can be made that this particular page - which is generally invective-free - shouldn't be home to the above. However, the comments above need to be said somewhere on Wiki, to counter some of the (IMHO specious) arguments on pages like IsCeePlusPlusObjectOriented or IsJavaObjectOriented, where advocates of certain other languages (including the inventor of one prominent OO language) go about making claims that C++ and Java (and perhaps a few others - though these two get singled out for abuse) aren't ObjectOriented. Much of the debate centers around the proper definition of OO; many folks invoke AlanKaysDefinitionOfObjectOriented (which is more restrictive than the encapsulation/inheritance(delegation, prototypes)/polymorphism definition given above) on the grounds that HeInventedTheTerm (thus he and he alone gets to decide what is OO and what ain't.)

Is C++ "pure" OO? Certainly not; it's an extension to a decidedly procedural language (CeeLanguage) and is chock full of non-OO features. Java is a tougher call; the usual argument made against the "pure" OO-ness of Java is the presence of non-object types like int and bool (in a pure OO language, the argument goes, every first-class type is an object, without exception). CsharpLanguage (which so far hasn't been flamed quite as much) lies between the two, though with its BoxingConversions the distinction between objects and StackBasedValueTypes? becomes less important. Java will of course get BoxingConversions in JDK 1.5.

Do both languages have OO features, and can one write code in the OO style? Absolutely. In Java (in particular) it is difficult to write code in any other style. Can be done, of course, but it can be done in Smalltalk as well.

So...where do we draw the line for whether or not a language is OO? At some level of purity? Do we admit hybrids? If we make the definition too restrictive and start excluding things like Java or CsharpLanguage - then the term starts to lose its usefulness. Absolute convincing evidence (from carefully-conducted research; not the mere personal experience of certain programmers) that pure OO languages are inherently superior to their hybrid cousins, the pure/hybrid dichotomy is an interesting taxonomy but of little use in evaluating programming languages. What is more important to me, as an OO programmer, is whether or not a language supports (easily; PointerCastPolymorphism doesn't count) OO programming. If it's got a non-OO feature or two (or even is a full-fledged hybrid like C++), that doesn't necessarily reduce its capability to enable OO.

I think the page would be improved if this last section - including my comments - were deleted.

Or moved somewhere else; I do think they need to remain somewhere on Wiki. Given that this last section has a higher heat content than the rest of this page; I've no argument with moving it away from here.

EditText of this page (last edited June 2, 2008) or FindPage with title or text search