There is evidence that ReuseHasFailed -- that means ObjectOrientedProgramming, thus far, has failed. So this page is about ReFactoring ObjectOriented. Let's begin.
Starting from AlanKaysDefinitionOfObjectOriented:
1) "Everything is an object."
That's a great idea, and this is where the OOP paradigm has taken us. It made a fabulous abstraction away from the machine. But did it get us what we wanted?
It depends what "we" wanted. Note that "everything is an object" applies to only some OO languages.
Yes, well, one of the things "we" wanted was reusability, but despite OOP's promise of reusability, it never happened. The proof of this is the concept of the standard library, which exists to do all sorts of common tasks in a standardized (i.e. bureaucratic) way. If Objects were really easily reusable, one wouldn't worry about that. We would have several ad hoc data primitives (see UnifiedDataModel) that can be glued together in more and more complex ways to create MashUps (like the Unix command line with pipes). I want a DataEcosystem.
There may have been some notional idea of a grand pool of objects that we all draw upon, and of course that didn't happen, but object re-use certainly happens within projects. And without, in the form of software components.
[OOP has not been better for reuse than other paradigms. Objects aren't very compositional, and tend to be highly specialized for their context.]
That's right. What we should have is loosely-coupled, modular objects that are easy to assemble into complicated patterns, like Unix did with its simple set of commands.
The degree to which a collection of objects is compositional depends on how well it was designed specifically to be compositional. Objects aren't inherently bad at this, but they aren't inherently good at it either.
Right, but the idea of refactoring OOP is to design objects to be inherently compositional by agreeing to a uniform message-passing syntax, using the >> and << operators, for example.
2) "Objects communicate by sending and receiving messages."
Great, so why don't we have a standardized way to do this? Currently, every programmer invents their own mini DomainSpecificLanguage for their objects and modules, and it's generally not at all obvious which methods are designed to be message passers. Why not have a standardized syntax for this?
In most OO languages, we do have a standardised way to do this -- a method invocation.
Sorry, that isn't good enough. Firstly, methods are used for other things besides message or data-passing [Edit: so there's a conceptual block in parsing the other programmer's purpose with his/her multi-use methods]. Secondly, there is no standardized way to name the methods you're invoking and, in effect, you have to learn each module's DomainSpecificLanguage, along with each programmer's programming style.
Method invocation is the typical way for OO languages with static classes to implement the OO notion of "message passing". The message is the invocation, not just parameters. How do you propose to enforce standardised naming?
- Easy: Standardized syntax. It is proposed to copy the I/O syntax of C++ with the use of >> and << operators. [Edit: No more method names to learn.]
- Isn't the dot "." already standardised syntax for the same purpose?
- No. Because I can't pass in arbitrary messages. [Edit: Also, here's another neat thing: you don't need to worry about parameter lists. You enforce simple, modular objects that only use the message-passing operators. Otherwise you make compositional objects for parsing more complicated data.]
- If you're passing arbitrary messages, how do you expect your receiving object to usefully recognise them?
- By having a new paradigm of "bottom -> upwards". Instead of "Animal.talk("meow")", a top-down approach, you wouldn't even conceive of creating an "animal" object -- who could possibly create such a beast anyway? It's the same mistake that OldSchoolArtificialIntelligence made, trying to create high-level intelligence involving symbolic logic and high-level constructs that it never could deliver on. Compare that to low-level ArtificialNeuralNetworks, where you have small, low-level components that interact in complex, emergent ways to create complex behaviors, organically. The latter was more successful in AI, and apart from data structure objects (Trees, Graphs, etc.), it will be more successful than the Objects currently in the wild. Like in NeuralNets, simple data gets passed between neurons and it works. The key is that there are intelligent actors (users) noticing smart behavior of objects in the community and gluing things together to do even more interesting things (MashUps). Unlike web objects or database models, these objects will work with user-stored data in their native form. [Edit: you create a convention for easy type-checking of objects being received and throw an exception if it isn't the expected type. Also you make a convention that an object passed as a message has to be an "atomic" type as defined by the receiving object OR a tuple of such that will be processed serially by the receiving object (perhaps on its own thread).] Boom. ...now you're cooking with gas.
- But that doesn't answer my question. In NeuralNetworks, "simple data gets passed between neurons and it works" because the output of an artificial neuron is designed, by the developer, to be compatible with the input of an artificial neuron. The same is true of objects and method invocations -- we design them to work with each other, precisely so we don't have to deal with the (probably requiring StrongAi, and possibly intractable) complexity of processing arbitrary data. Indeed, that's why we define an "animal" object in a hierarchy of objects -- if we define methods common to all animals in a base "animal" class, then we can write code that works with all animals.
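The point about defining methods common to all animals in a base class can be made concrete with a minimal Python sketch. The names Animal, Cat, Dog, sound and chorus are hypothetical and chosen only for illustration; nothing here is taken from a real library.

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    """Hypothetical base class holding behaviour common to all animals."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def sound(self) -> str:
        """Each concrete animal supplies its own sound."""

    def talk(self) -> str:
        # Written once in the base class, shared by every subclass.
        return f"{self.name} says {self.sound()}"


class Cat(Animal):
    def sound(self) -> str:
        return "meow"


class Dog(Animal):
    def sound(self) -> str:
        return "woof"


def chorus(animals):
    """Code written against Animal works with any subclass."""
    return [animal.talk() for animal in animals]
```

Here chorus never needs to know which concrete animals it receives, which is the sense in which the sender and receiver were "designed to work with each other".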
[The 'method name' can generally be understood to be part of the message, especially in languages that allow access to generic method calls (e.g. Smalltalk, Ruby, Python). In that sense, we can argue there is a standard way to send a message (e.g. the `.` operator). But the argument about a DomainSpecificLanguage for each object is relevant; it is difficult to create generic object-glue, so composition or integration of software components becomes a chore.]
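As a rough illustration of "the method name is part of the message", here is a small Python sketch in which the selector travels as data and is looked up at run time with the built-in getattr. The Account class and the send helper are hypothetical, invented for this example.

```python
class Account:
    """Hypothetical receiver with one ordinary method."""

    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


def send(receiver, selector, *args):
    """Send a message made of a selector (the method name) plus arguments."""
    handler = getattr(receiver, selector, None)
    if not callable(handler):
        # The receiver does not understand this message.
        raise AttributeError(
            f"{type(receiver).__name__} does not understand {selector!r}"
        )
    return handler(*args)
```

The sending mechanism is uniform, but the vocabulary of selectors is still specific to each class, which is the "one DSL per object" complaint.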
Ah, but in this case, I'm suggesting imposing a complete, universal data model that is inherent in, and coupled with, the new message-passing syntax for the language. It seems intractable, but it's done. The only job is to get programmers to see the value in the universal DataModel. My marketing ploy for them is that it will allow the creation of a DataEcosystem for the entire Internet. Open-sourced and uncoupled from any proprietary vendor (unlike CORBA, for example).
3) "Objects have their own memory, which consists of other objects."
Ah, but here the ObjectModel has not been adequately defined. There has to be a "base case" to this InfiniteRegress. What objects do integers consist of? I'm suggesting that the OOP paradigm needs to define atomic objects, which are not defined by other objects (its "types"), and then start the data taxonomy from there with the first object (i.e. grouping or container) built from them. Prototypes, to me, seem quite an interesting development here. There it is clear, just like with C structs, that one is building the Object space upwards, not trying to make CastlesInTheSky. The machine forms a common basis for interaction, over everyone's meta-abstractions.
The "OOP paradigm" doesn't need to define anything; it's neither a standard nor a model. Individual languages may or may not define primitives, but "infinite regress" is simply not a problem.
Well, perhaps you have a great little data universe, but if I wanted to use it and share it, I bet it would be more cumbersome than what it's worth.
Sorry, I don't understand.
[InfiniteRegress is not a problem; actually, it's quite elegant (cf. MetaCircularInterpreter). But it does need to be bootstrapped. In any case, not every object needs to be stateful, and there are good arguments in favor of object-systems that push all state to the edges (i.e. into external databases).]
If an object isn't stateful, then it belongs to the machine types, otherwise why do you have an Object?
If by "isn't stateful" we mean objects whose state is immutable, then we may usefully create immutable ValueObjects. If by "isn't stateful" we mean no state at all, then classes are sometimes used as façades (see FacadePattern), or as abstract base classes to define the least specific type of an inheritance hierarchy. Objects can be non-stateful without being machine types.
No, I mean "isn't stateful" in the sense of a (non-static) function.
How does that relate to your question above, "if an object isn't stateful ..."? Do you mean a FunctorObject?
That wasn't a question, that was a challenge to you. Implicit in that challenge was a claim (from me) that all objects should be stateful. Perhaps there are all sorts of strange, toy or novel objects which are stateless, but they would seem useless. In fact, in my refactoring of OOP, there is added a special state-query operator ("?"). That, plus a cloning function (like in PrototypeProgramming), would be the only methods every single object may need. With perhaps a couple of others to add persistence. I'm suggesting (without a thorough argument) that all the other types of programming styles (using Functors, etc.) are simply dead-ends after making a UnifiedDataModel coupled with a DataEcosystem that is completely free and open-sourced.
[Stateless objects still have constructors, still encapsulate behavior and data. Updates can be modeled indirectly: instead of mutating state within the object, return a new object of the same type. LucaCardelli indeed starts by exploring such an approach to objects. If you combine these objects with LinearTypes, you can even enforce update-like behaviors, though you'd lack aliasing and shared state (i.e. where you have multiple direct references or pointers to the same object). Aliasing can always be modeled, of course, with another LayerOfIndirection for explicit names (cf. ConceptOrientedProgramming); ad-hoc aliasing isn't an essential property for OOP, but it is a common one. In any case, I didn't suggest removing all state from the system, just pushing it outside our programs (e.g. into theoretical objects outside our programs - again, InfiniteRegress is not an issue, just a bootstrapping problem). Doing so is highly advantageous for LiveProgramming, i.e. such that we can tweak our code or the data from which the objects were constructed, and recompute the objects on the fly. The vast majority of ObjectOriented design patterns remain accessible even using externalized state. If you still consider such stateless objects 'useless', I suggest that's a limitation of your imagination and experience rather than a limitation of stateless-object models. If it doesn't match your vision, that's fine, just remember that other people have other visions for ObjectOrientedRefactored.]
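The idea of modelling an update by returning a new object of the same type, rather than mutating in place, might look like this in Python. This is only a sketch: the Counter class is hypothetical, and it relies on the standard dataclasses module (frozen=True and replace).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Counter:
    """Hypothetical immutable object: it still has a constructor, data and behaviour."""

    count: int = 0

    def increment(self) -> "Counter":
        # No mutation: build and return a fresh object of the same type.
        return replace(self, count=self.count + 1)
```

A caller keeps both values: c0 = Counter() followed by c1 = c0.increment() leaves c0 unchanged. Where the "current" counter lives is then a matter for whatever sits outside the object, which matches the suggestion of pushing state to the edges.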
4) "Every object is an instance of a class (which must be an object)"
Here, I read the word "class" much like prototypes or C structs (or "types" in other environments). Here again, there must be a clear view to avoid the infinite regress. CeePlusPlus makes this distinction by having types separate from the instantiation of same. But that regress must end somewhere, and for C++, it's types. One does not ask: "Well, then what is the type of type?" There isn't one! The ObjectModel in C++ ends there! Types are rooted in the machine architecture, but in Python and "pure" OOP languages they are not. And this is where things get hairy and magikal.
[No, you don't need the regress to end somewhere. You could, as I understand Smalltalk does, have a fixed point.]
Yes, and that fixed point would be the end to that regression. Huh?
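For what it's worth, Python (mentioned above as a language whose types are not rooted in the machine) has exactly such a fixed point: every class is an instance of type, and type is an instance of itself. The helper type_chain below is a hypothetical name used only to walk the chain; it shows Python's arrangement and makes no claim about Smalltalk's.

```python
def type_chain(obj):
    """Follow type(obj), type(type(obj)), ... until the type is its own type."""
    chain = []
    current = type(obj)
    while True:
        chain.append(current)
        parent = type(current)
        if parent is current:
            # Fixed point reached: asking "what is the type of this type?"
            # gives back the same object.
            break
        current = parent
    return chain
```

For an integer the chain is int, then type, and it stops there because type(type) is type. The question "what is the type of type?" has an answer, and the answer does not recurse forever.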
5) "The class holds the shared behavior for its instances (in the form of objects in a program list)"
In other words, the "class" holds the template or the definition of each object instantiated (or "created") from it. Hence, I think the word "prototype" is very apropos.
Sure. But this is a terminological quibble rather than a conceptual problem.
I think the terminology here is actually a bit vague: do we really want Object methods to be in the form of [smaller] objects in a program list? No. I don't think so. I think I want functions to be different from objects.
Functions are different from objects, in most object oriented languages. What do you mean by "do we really Object methods to be in the form of [smaller] objects in a program list"?
Sorry, I missed a word: "do we really *want* Object methods..., etc.". But maybe the notion from AlanKay is itself vague.
Of course, it is possible -- and perfectly reasonable, and sometimes useful -- to have object oriented languages where functions are also objects.
Well, this is the point where computer science left reality. See ComputerScienceVersionTwo
Why do you feel "computer science left reality" here? There are practical reasons for functions to be objects.
There seem to be, but I think not. This is where the conceptual confusion between WhatIsAnInterpreter and what is a compiler appears. In a compiler you sometimes have to have functions as objects to maintain any level of indirection and abstractibility, but in the proposed revamping of OOP, all those types of flexibility stay within the (pre-compiled) interpreted language that is holding your DataEcosystem. Your newly refactored OOP paradigm imposes an implicit DataModel, but that imposition is ignorable because it encodes a universal UnifiedDataModel, making any academic criticisms about the loss of some pet key terminology TLA ignorable also.
6) "Classes are organized into a singly-rooted tree structure, called the inheritance hierarchy. Memory and behavior associated with instances of a class are available to any class associated with a descendant in this tree structure."
This is also where, it is argued, purity hammered practicality into oblivion.
In what way?
This can be hard to see because the community has maladapted to just coding up one's own little domain languages. This is the common practice. In the ZenOfPython: "There should be one-- and preferably only one --obvious way to do it."
It was a great idea at first, to get completely away from the machine (i.e. "wouldn't that make inter-operability so much easier?"). But no, it hasn't happened. After the unification, things got more complex in most ways. And the reason: the object model must be rooted in data, something concrete, not in the abstract, which means many things to everybody and anything to most others. We're not making physical objects in a simulated environment (which could have served as an alternate "grounding" source), so let's not do it anymore. See the ObjectOrientedLandscape.
Sorry, you've lost me. It's true that we're not making physical objects in a simulated environment -- unless we're making simulations -- but we are making computational objects in a computational environment. In short, we're making virtual machines to solve data processing and control problems. See OopBizDomainGap and ComputationalAbstractionTechniques.
But that's inverted and OldParadigm. We don't have "data processing and control problems"; businesses have them, because they're generating all sorts of "business" data. We have a great tool called the Internet that doesn't have a way to generate value out of the vast resources of data out there, where AllDataRelatesToOtherData. Great silos of cool data with no real way to link it all together -- for a better, wiser, more informed society, not "for business". [Edit: IOW, we don't want "computational objects" from the user's point of view, any more than you want your view of your file system to show the tree of i-nodes which link the various data groups together. From the point of view of the user of the DataUniverse, we want a ThreeDimensionalVisualizationModel, making the complexity of the DataUniverse
I think you've interpreted "data processing and control problems" in a more narrow way than was intended. All programming deals with data processing and control problems, and all programs are data processors. The terminology may evoke "business" associations, but it applies to software and computing in general.
The hierarchical object model is not about inter-operability, but about being able to create a consistent approach to storing objects in containers and collections in the absence of generics or templates, and being able to leverage the LiskovSubstitutionPrinciple to code against generic interfaces with multiple concrete implementations.
Yes, but don't miss the forest for the trees: you don't want to deal with the mess of protocols that make TCP/IP and the Internet work -- you want a WWW browser that lets you see it all from the outside-in.
Well, that's the essence of what I meant by a "grounding" source. But my argument is perhaps only a personal one: it is better founded to have an Object hierarchy rooted in the machine, not in some abstraction.
So what's to replace it? Here are proposed rules for ObjectOrientedRefactored:
1) Objects exist within the concreteness and limitations of the machine. (Just GeneralPurposeComputer here, not LambdaCalculi.)
2) Everything outside the concrete types is an Object.
3) Objects talk to other Objects; otherwise they would be a machine type.
4) Objects should communicate to other objects via uniform syntax ("<<" or ">>").
5) Since Objects are stateful, there should be a simple, universal way to query state.
6) In a DataEcosystem, Objects should have a clone (or copy) function to transfer a copy to your own machine.
These simple operators should be the only methods for every object. Object names will inform and do everything else for documenting most everything needed for the class (see NameConceptualUnits). This will create a "plug and play" OOP model, making it much easier to re-use code. There may be no need for any other method or function, ever. The only other primitives would be functor objects. A functor object is for giving a result, given an input, and may or may not contain state. --MarkJanssen
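One possible reading of the proposed rules, sketched in Python. Everything here is an assumption made for illustration: the class name Node, the choice of int, float and str as the "atomic" types, the use of a query() method to stand in for the "?" operator (Python has no such operator), and the decision that >> forwards a node's contents to another node. The proposal itself does not specify these details.

```python
import copy


class Node:
    """Hypothetical object whose whole interface is <<, >>, query() and clone()."""

    # Assumed set of "atomic" types this receiver accepts.
    atomic = (int, float, str)

    def __init__(self):
        self._items = []

    def __lshift__(self, message):
        """node << message: accept one atomic value, or a tuple of them in order."""
        parts = message if isinstance(message, tuple) else (message,)
        # Check every part first so a bad message changes nothing.
        for part in parts:
            if not isinstance(part, self.atomic):
                raise TypeError(f"unexpected message type: {type(part).__name__}")
        self._items.extend(parts)
        return self  # returning self allows chaining: node << 1 << 2

    def __rshift__(self, other):
        """node >> other: pass this node's contents on to another receiver."""
        other << tuple(self._items)
        return other

    def query(self):
        """Stand-in for the proposed '?' operator: report current state."""
        return tuple(self._items)

    def clone(self):
        """Return an independent copy, as rule 6 asks for."""
        return copy.deepcopy(self)
```

With this sketch, a << 1 << "two" << (3.0, 4) fills a node, a >> b copies its contents into another node, and a << [1] raises TypeError because a list is not among the assumed atomic types. Whether this amounts to more than a renamed method call is the question raised elsewhere on this page.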
Isn't this simply the OO model used in, say, C++ and Java?
- What should be the "state" of a collection object? A recreatable reproduction of its contents. A flag within the interpreter environment can set whether objects should be sorted for display or not, rather than have each class creator debate with him/herself.
No, not really. With the uniform syntax, there is no longer parameter passing, for example. It's a completely different frame of mind. You build objects up from simple primitives that only have >>, <<, and ? operators. The standardization on MessagePassing makes everything different.
One could argue that it is partially the model used in C/C++ and in Python prior to the great TypeClassUnification, but for Python no longer. The point of this treatise is that Python's journey into abstraction went too far. I don't know what other programming languages use an abstract Object type for the base of all other types. The answer probably has to do with languages that have DynamicTyping.
Java has an abstract Object type for the base of all other types. It has StaticTyping.
My bad, I guess I mean something like lexical typing - types purely maintained by the language definition, not by the machine.
What do you mean by "types purely maintained by the language definition, not by the machine"?
It is proposed to copy the I/O syntax of C++ with the use of >> and << operators.
What is this nugget of irrelevantly-specific syntax doing in an otherwise conceptual treatise?
Ummm, because I like it and didn't want anyone else to use it? :^> -- MarkJanssen
The main problem with object orientation is TightFieldCoupling; in other words, the members of each object are tightly coupled to each other. This means that if you want to use only some of them, or you want to add more, you have to do this in multiple places in your application and then recompile. Worse yet, if you want to do clever things with your fields, like display a list of them, you must use reflection or some other clumsy approach to get at the compiled definitions. The problem isn't really the fault of object orientation because it actually predates object orientation. The problem was introduced when we introduced structures. The structures paradigm has been useful for many years but it is now running out of steam. My KnowledgeDatabase idea might help. -- JonGrover
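To make the "display a list of your fields" point concrete, here is what the reflection route looks like in Python using the standard dataclasses.fields function. The Person class and the field_names helper are hypothetical. In a dynamic language this is less clumsy than in a compiled one, but it is still reflection over a fixed structure definition, which is the coupling being described.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    """Hypothetical structure whose members are declared together."""

    name: str
    age: int


def field_names(obj):
    """List the declared field names by inspecting the class definition."""
    return [f.name for f in fields(obj)]
```

Adding a field still means editing the Person definition and every piece of code that constructs one.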
- Thanks, Jon, for your input. The point of "shoe-horning" into the message-passing model is to make everything loosely-coupled and modular. You build up types from simpler primitives, which only have >>, <<, ?, and copy operators, into complex objects whose very same operators now do much more complicated things or take/give much more complicated inputs/outputs. The resulting "complex" type that your argument says is tightly coupled is, I think, a minor issue, mostly because the uniform syntax makes it easier to conceptually parse what the object is doing at every level -- there's no linguistic verbiage to distract and decode. Additionally, the community would probably create naming conventions for objects that make it unnecessary, kind of like HungarianNotation did for C.
[What does it mean for objects to "talk" to other objects? Is this more interrogative and interactive, negotiations and commitments, or something? How do we model a conversation, or concurrent conversations? Will we need a RobertsRulesOfOrder for OO systems, and a model for filibustering?]
That is already handled by TCP protocols. A DataEcosystem only survives by CrowdSourcing the management. It becomes an AdHocracy where errors eventually bubble up, to be corrected by highly-ranked users (cf. UserRanking and GlassBeadGame).
Cryptically short story of a long answer: Get rid of closures. See ClosuresConsideredHarmful.
Why? They're just a tool. I find your suggestion rather peculiar, as it's akin to "get rid of phillips screwdrivers" or "get rid of while loops". Why closures and not, say, default parameter arguments or "do until" loops?
Because people argue that ClosuresAndObjectsAreEquivalent
They are conceptually equivalent, but it depends on the specific language what that equivalence means. In some LISP dialects (for example) they are the same thing, and distinguishing objects/instances from closures -- either conceptually or syntactically -- is largely meaningless. In some OO languages with static classes, they provide a means for capturing lexical context by an anonymous method or function but are by no means equivalent to defining classes. In either case, they are useful. See http://en.wikipedia.org/wiki/Closure_(computer_science)
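A small Python sketch of the conceptual equivalence: the same counter written once as a closure capturing its lexical context and once as an object holding a field. The names make_counter and CounterObject are hypothetical and used only for this comparison.

```python
def make_counter():
    """Closure version: the state lives in the captured variable `count`."""
    count = 0

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


class CounterObject:
    """Object version: the state lives in an instance attribute."""

    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count
```

Both are called the same way and return 1, 2, 3 on successive calls. The difference is in what each makes visible: the object exposes count as a named attribute, while the closure hides it.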
Perhaps I'm being thick but what actually is the problem with "just coding up one's own little domain languages"? I have a general purpose language, whether C, C++, Java, Smalltalk, Forth, or Python, and a problem to solve. An inescapable part of solving any substantial problem is transforming the language environment into the tool for the job, and then doing that job. It's perhaps harder to see this in languages with reserved words, but surely the overall progress of BaseLanguage or Library->toolCrafting->applyingTools is pretty universal? What is the alternative other than starting with a domain specific language?
Coding up a mini domain language is de rigueur for solitary programmers, but the issue is how to create a community of reusable, sharable code. To test yourself, ask: How many mini-domain languages of others do you use (outside the standard library)? We have a giant network called the Internet, I argue that we need loosely-coupled, modular objects that talk to each other in a standardized way so we can have a DataEcosystem.
But data is about something, and the universe that we collect data about is too complex to ever be standardized in a meaningful way (assuming that Gödel doesn't simply preclude it anyway). The only hope to manage it is to create domains which are specific to problems. That's why industries have jargons rather than just using "standard English". Jargon is just another form of DSL and exists for the same reason - a language which covered all possible manipulations of meaning for all possible data would be so massive and complex as to be useless. We manage data processing (whether in computing or not) by defining ad hoc which operations are important and deserving of "head space", based on what aspect of the data we're interested in at a particular time. If those definitions are useful they get passed around - as words or code libraries or interface definitions - amongst the community who share the same concerns, perhaps leading to a Standard being formally defined.
Aye, that's also why there is a UnifiedDataModel to bring order to the mess and be able to scale. There is also, by necessity, a VotingModel -- all tending to an ultimate, social FractalGraph like the WWW itself. Only a ThreeDimensionalVisualizationModel could handle the scaling of the conceptual complexities. Your point is noted: the complexities are vast, but with CrowdSourced data and the VotingModel it is tractable.
Consider your criticism about the complexity of the data universe to be a bit like the world of physics and chemistry before the development of the PeriodicTable.
I think the generalized DataEcosystem you're referring to is utopian.
That is correct. That's the good news. But to bring it into believability, consider the LibraryOfCongress cataloging system used in academic libraries -- it holds all the world's knowledge, as diverse as it is, in a systematic, usable way. Now take it further until you get down to chapters, paragraphs and individual words (like WikiWords) and you start to see that it is completely possible.
See also GlassBeadGame
[I think what you're describing, then, is not necessarily utopian but something we've seen before on a smaller scale: an OODBMS. A distributed OODBMS (a DOODBMS?) for general-purpose knowledge management could be an interesting thing. I imagine it would be like OpenCyc combined with http://archive.org/stream/implementationof00wyri#page/n7/mode/2up]
Yes, interesting! It is a bit like that; rather than a DOODBMS, I would call it a folksonomy.
[As a Java developer, I make extensive use of libraries developed by others, and those libraries make extensive use of other libraries, and in my products I make libraries available to other developers. This appears to be typical of C, C++, Java, .NET, Python, Ruby, Perl, Haskell and LISP development. I'm sure it's true of every popular language, as no individual has the time, speed or patience needed to develop all necessary components from scratch. There are projects specifically about creating a community of reusable, shareable code, such as Apache, RubyForge, CPAN, etc. Of course, this is all geared toward creating software, not a DataEcosystem per se. If you're looking to create an environment of live, interacting, distributed objects in a more global sense (and you're not re-creating CORBA! :-), perhaps you want to look at the ActorsModel.]
Thanks for those tips! I may indeed be re-creating CORBA, but I think not.... <Cheeky grin and duck...>
See also: DataModel