Date And Darwens Type System

In TheThirdManifesto, ChrisDate and HughDarwen propose ways of "marrying" OO and the RelationalModel; a key part of that is a type system with some interesting properties. (It is claimed by others that this type system is insufficiently rigorous to be useful; see bottom of this page.)

Actually, there are three versions described in TheThirdManifesto. In the base version, there is no inheritance or subtyping. In the "second" version, subtype polymorphism is added, but only single inheritance is allowed ("inheritance" is used loosely here; as the mechansim for subtyping isn't necessarily inheritance). In the "third" version, a limited form of MultipleInheritance (SingleMeetMultipleInheritance) is allowed.

The text below applies to the second and third versions (unless indicated otherwise).

Key claims thereof:

(More refinement to come. -- DanMuller)

[Stuff moved from NominativeAndStructuralTyping; needs ReFactoring]

[[Then I think you'll be very interested to see the view in the type theory linked to OoLacksMathArgument -- DougMerritt]]

On inconsistency and undecidability:

Well, D&D do have a seemingly complete treatment, except where it leaks: here and there, especially in the essential parts. Most notably a compiler/database system, type checker, etc, will not be able to decide if the type system declared by the user is sound. It is also unclear what exactly defines a type, what are the programmer's responsibilities with regards to declaring types, subtypes, "posreps", operations, etc. Well, with a little bit of imagination, one can fill in the dots, and arrive at a type system that will have unsuitable properties for building software systems. The jury is still out until they do actually come up with a rigorous and complete proposals for a type system, rather than sketches of this or that principle. Their documentation with regards to type theory is especially weak.

The main thing to remember about types, however is that they are a syntactic tool for abstraction. A type system will have the essential role of realizing type judgement, assign a type to a legal syntactic element in the language, and that type assignment will ensure that the expression can be evaluated, and when evaluated it will yield a value in a certain domain. But the key thing to remember is that they are a tool for abstraction, and a syntactic tool at that. The typing judgement should be deterministic, and computationally feasible.

Hear, hear. This is one reason why DependentTypeSystems? have not become popular.

All the other consideration cannot be entirely separated from language design pragamatics, there's no such thing like the right way of constructing a type system, which D&D naively imply (they even have a pseudo-theorem stating that if "type inheritance" is to be supported, then it will have to conform to their model). Different languages take different decisions that are motivated by many trade-offs in the overall exercise of language design for real.

[[One of the most basic tradeoffs, in fact, is whether the type system's power should be purposefully crippled to force it to be decideable (which some think is the only way to go), or whether it should be as powerful as possible, and thus undecideable (as others think is the only way to go).]]

[[It doesn't seem to me to be possible to "prove" that one or the other is the only way to go, because both have been done, and both have separate strengths and weaknesses, and so far, some prefer the one while some prefer the other.]]

On convertibility:

One thing I have wondered about is whether a "looser" type system based more on convertibility between types, rather than superset/subset relationships, would work within their relational algebra context. (Sorry if this is getting a bit off-topic for the page -- feel free to move this discussion elsewhere.) -- DanMuller

[[What do you have in mind? The most general kind of "convertibility" is any function whatsoever that takes as parameter an object or value of type T1 and returns one of type T2, but that's too general.]]

Well, I haven't thought it through much, but I was thinking in terms of discarding the notions of supertype/subtype and IS-A, and instead focusing just on programmer-defined implicit convertibility between types. In other words, the programmer defining a type can define explicitly how values of that type can be converted to or from other types. Thus you get a web of connections based on convertibility rather than IS-A or set relationships. This seems like it might be a bit more -- flexible? pragmatic? I dunno. It doesn't directly address the semantic-substitutability view of types, I guess, but remember that I'm coming from a focus on D&D's point of view. Also, I've been developing a dislike for the object-message type of OOP in favor of the more flexible and general multimethod style of polymorphism; this probably also has implications for how one prefers to view types, by eliminating the special status of the message receiver and putting all "objects" back in the role of data passed as arguments or returned as values.

[[Interestingly, PerlSix's ConvenientObjectOrientedLoopbacks and implicit conversions let you specify things like this, and it also has multimethods. Coincidence? Is LarryWall editing this wiki?]]

On Everything but RelationalVariables should be immutable

Yes, DateAndDarwen's type system is based on immutable values. They make a sharp distinction between values and variables, eschewing the term "object". I find that their insistence on a sharp distinction between "value" and "variable" clarifies certain things; the definitions of these terms are more easily agreed on than that of "object", and are sufficient. They do allow for indirect references to variables, via in/out parameters of operators and via a downcast operator. (They call these references "pseudo-variables".) Binding a variable to an in-out parameter requires that the type of the variable be a supertype of the parameter's. Downcasting a variable requires a run-time check of the variables value to ensure that the value conforms to the target subtype. -- DanMuller

Since the only primitive mutator of a variable is assignment (other notations being merely a shorthand for some combination of value-construction and variable-assignment), it seems to me that type coercion might suffice in place of supertype/subtype relationships.

Demonstrating liveness isn't that difficult, because in D, all objects are stack allocated. DateAndDarwen never explicitly mention this, but it's a consequence (possibly unintended) of their variable/value distinction.

Surely you mean, an implementation of D could always stack-allocate objects.

Variables in D are never aliased: there is always exactly one copy in scope at a given time. If they were, you'd run into the difficulties described below (and in Appendix G). It'd also violate OO Proscription 2, "No object identities".

Therefore, you know that when the variable containing a value gets assigned to, mutated, or goes out of scope, the existing value is dead. Thus, you can always use in-place mutation for assignment, syntactic sugar for "update" operators, and stack allocation of AllocationRecord?s.

A corollary of this is that any value with indefinite extent must be global. In other words, we're programming in pointer-less C (with a database, of course). Whether this is a good thing is somewhat debatable.

This doesn't apply if D supports LexicalClosures or any other form of non-LIFO scoping, but if you have any sort of nested scopes, you're allowing aliasing in the presence of update operators. I have a moderately long analysis prepared on how this would make TheThirdManifesto inconsistent. I snipped it because I realized that as written, the prescriptions in the manifesto aren't inconsistent, so it would just be wankery on my part to post it. -- JonathanTang

On Pointers are excluded:

One thing that I feel might be missing from DateAndDarwen's type system is a means of having one variable refer to another; in other words, there is nothing directly recognizable as a pointer. I think that they might argue that the relational aspects of the system, which define tuples and relations as the only aggregate types, can fulfill the roles that pointers or GeneralizedReferences would. It is indeed a very relational-model-centric view of programming. -- DanMuller

I thought they were pretty clear why pointers didn't work, and it had nothing to do with physical addresses. It's all about aliasing. If you have any sort of reference, then you may have multiple variables that might be affected by any sort of mutation. That's the whole point of references. In fact, you can think of a mutable "object" as being a set of variables that all change their values simultaneously when one of them is updated.

This runs into problems with SpecializationByConstraint, as D&D explain in Appendix G. To recap their argument (almost verbatim), consider the following code:

C := CIRCLE(LENGTH(3.0)); E := C; THE_MINOR_AXIS(E) := LENGTH(4.0));
This is all legal TutorialDee code. The first assignment just invokes a Circle selector to assign a particular value. The second assignment is legal because Circles are substitutable for Ellipses: every Circle is an Ellipse. The third assignment is legal because the declared type of E is Ellipse, and THE_MINOR_AXIS is an operator defined on Ellipses.

Under normal TutorialDee semantics, this presents no problem. GeneralizationByConstraint? occurs on the third assignment, so E now holds an Ellipse. And only values are stored in variables, so this doesn't affect C at all.

But if you allow references or ObjectIdentity, then C would have to change with the mutation of E. But Circles don't have a minor axis! How can you set the minor axis when it doesn't have one? You'd either have to break the constraint that a circle's minor axis equal its major axis, or you'd have to change the major axis too, which isn't what the code asked for.

That's why D&D speak of the "FourOutOfFiveRule". You can have 4 of the following 5 language features: [stuff refactored to FourOutOfFiveRule]

[Moved from TutorialDee]

One of the anonymous writers here has a rather dim view of D&D's work, at least partly unjustifiable, I think. In particular, this statement: "It is also unclear what exactly defines a type, what are the programmers responsibility with regards to declaring types, subtypes, "posreps" [sic], operations..." is patently incorrect. These issues are made quite clear in the TheThirdManifesto, and the suggested syntax for the admittedly (by the authors) incomplete language "Tutorial D" even includes all the constructs necessary for this. I'm under no illusions that their type system is the only possible right one, nor am I even convinced that all of their stated requirements are really necessary to support their goals with respect to relational rigor, but I've yet to see (here or elsewhere on this wiki) any significant, valid criticisms of it. -- DanMuller

Valid criticisms are to be expected when we will have a formally defined language with that particular type system. There's nothing to criticize about language D, as there's no such thing as language D. No one should torture himself to reconstruct a decent semantics of D from all kinds of informal discussion, that would be just a waste of time. Take the following example:

 TYPE point is
   POSSREP cartesian { x double, y double}
   POSSREP polar { r double, theta double }

Is this enough to properly define the type point? It is, according to Tutorial D syntax. The semantics however, are left as an exercise:

 POINT p= cartesian ( 1, 2);
 THE_x(p) := 2;
 THE_theta(p) := PI;

How about defining a + operator?

OPERATOR '+' ( P1: POINT, P2: POINT ) RETURNS POINT BEGIN /* what goes in here? */ END

Similar holes are there with regards to subtyping and many other crucial aspects. Of course a language without formal semantics and without a serious implementation exercise is doomed to have many holes, so the presumption should be that it is incomplete until proven otherwise. A collection of BNF forms together with half a book of "informal discussions and explanations" do not a language make.

The fact that some parts of the language are not specified does not automatically invalidate other parts of the language. And the semantics of the language are fairly well defined in the book. Make sure that you're looking at the second edition of TheThirdManifesto, because it was considerably revised and expanded compared to the first.

DateAndDarwen explicitly declined to fill in details of defining and manipulating actual representations in Tutorial D's syntax, although in spite of this they give some speculative examples.

They do not define the semantics of the constructs in a rigorous way.

In fact, you can easily define an addition operator with what is given in the book -- you don't need actual representations to do that. (Although they do not address the definition of infix operators, nor is that in any way necessary.):

    RETURN cartesian(THE_x(P1) + THE_x(P2), THE_y(P1) + THE_y(P2));

What is THE_x and THE_y for a point value constructed as polar(r, theta)?

So your specific criticism is invalid (and the language need be no more complete than it is to determine this).The specific criticisms raised on NominativeAndStructuralTyping (where my comments originally appeared) are addressed in the book, contrary to the claims made on that page.

Try again.

I really don't understand what you consider informal about DateAndDarwen's treatment; it's far more rigorous than the definitions of many other languages that have actually been implemented. (Cf. for instance Ruby or Perl.)

Part 3 of the book that is suppose to define the "meat" of the language has the title "informal discussions and explanations", and that's exactly what it is, they are not a systematic attempt in any way shape or form to define the semantics of the language. At the very least Ruby and Perl are defined by their implementation, and have a large body of documentation. Implementation of Tutorial D would be difficult not because of a lack of completeness or rigor in the language's definition, but rather because of the demands imposed by its relational operations. Some aspects of the type system that might be difficult to implement (e.g. rigorous type system verification) are quality-of-implementation rather than correctness issues, and are mentioned explicitly by D&D in footnotes or appendices.

That "quality of implementation" issues may very well turn out to be a provably impossible task, or unnecessary complexity that the system leaves as an exercise for the user. The idea that somebody can draw a "model" and implementation issues are left as an exercise for an implementor has proven consistently bad.

It's also worth noting that at least one company has a commercial product based on TheThirdManifesto and TutorialDee; I haven't been following that product's progress closely enough to know how the extent to which they adhere to Tutorial D's syntax or feature set. -- DanMuller

Well, then do not bring it. In any case, debating a language that is undefined is not worth the exercise, but it is close to ridiculous that RelationalWeenies present it as a model of language design or a model of a good type system, taking advantage that not many people have time to read TheThirdManifesto by DateAndDarwen. -- CostinCozianu

So are you saying that such books should not be written, or read, because they're worthless without working all these issues out in implementations? By this logic, E. F. Codd would never have published his works on relational theory. In other words, what's your point?

The difference is that whatever Codd published had a well defined semantics, he formulated in mathematical terms what a relation is, what are the relational operators, he also offered the pragmatic justification.

Part 3 of the book is a set of informal discussions expanding on the terse formal prescriptions, proscriptions, and suggestions of Part II, Chapter 3. They do not stand in place of formal descriptions. (Did you really read the book?) did you?

[Factually inaccurate discussion of the inheritance syntax removed. The BNF rules for type inheritance are given on page 261]

If you find it objectionable that the details of how to get from the actual representation, whatever it is, to either polar coordinates or cartesian coordinates, then you're quite missing the point of the book. If some people are holding TheThirdManifesto up as an exemplar of good programming language design, then they also are missing the point. The book is about investigating the requirements for relational database systems. The authors acknowledge that there are open issues that require further investigation; if you can come up with concrete arguments as to why those issues are unresolvable or debilitating, I'm sure that the authors would be interested in hearing them. As it stands, I think that their type system is quite interesting. I agree that a concrete implementation based on these ideas would be illuminating. -- DanMuller

[Whoa, hold up, guys, things are drifting out of control. As a not-so-innocent bystander, it seems clear to me that the central argument was already clearly stated: ``"It is also unclear what exactly defines a type, what are the programmers responsibility with regards to declaring types, subtypes, "posreps" [sic], operations..." is patently incorrect''.]

[Everything else that doesn't appear to address that head-on is a digression from the current argument.]

[Costin's point appears to be that, since the semantics are left as an exercise, therefore you shouldn't claim that a type theory is actually being presented after all, if I understand correctly. You are saying, "it is too complete enough on that topic", I believe. Followups presumably should stick to specifics supporting those two views, right?]

[I haven't read the book in question, so I don't have an opinion, but it seems like both of you would be more clear if you did stick to very specific examples and such.]

OK, then I'll have to wait for Costin to post more specific objections that haven't already been refuted. The issue of how to define and work with actual representations is immaterial to the type system, so if he's objecting to the type system per se, this was an irrelevant digression.

You already missed one objection, it is in the code above. What exactly does THE_theta(p) means for a value p that was created with cartesian(x,y). The issue of how to define and work with actual representation in presence of multiple possreps, and in conjunction with subtyping (that can introduce further possreps), and operator definitions, is paramount to the design of the type system.

{ I haven't read TheThirdManifesto yet, but Date addresses this issue in AnIntroductionToDatabaseSystems (the Types chapter, p. 119 in 8th ed.). The idea is that you have to create functions to translate between possreps:

 operator the_theta (p point) returns rational;
     return ( arctan ( the_y ( p ) / the_x ( p ) ) );
 end operator;

- SomeGuy? }

[Actually, the type implementer has to choose an actual implementation and then to provide operators to convert between the actual representation and the possible representation(s). If there is only one possible representation, the system can provide the operators by using the possible representation as the actual representation, as the operators are trivial in this case. That's the way things work in Duro. -- ReneHartmann?]

You either have a type system that works or one that doesn't. And the only way to prove that an actual type system works is to integrate it with a complete language design, and show how it addresses the major problems that any language design has to face. In absence of that, all you have is castles made of sand, and nobody is going to take that kind of thing seriously.

If he objects specifically to the incompleteness language design, I can only say that designing a specific language was not the author's primary goal, and he's right that arguing over the language's details is likely to be wasted effort. I disagree with his apparent opinion that this negates any value that the type system discussions of the book might have.

Actually, I'm hoping to see some more discussion here (was: on NominativeAndStructuralTyping); I think that was developing in a direction that was more interesting and productive. -- DanMuller

Thanks for the insightful comments, Jonathan. After reviewing that part of Appendix G, I understood your explanation of the consequences of aliasing and nested lexical scopes immediately.

Note that there is, of course, a form of aliasing implied by virtual relvars. I think that using a TREAT_DOWN operator in a relational expression that defines a virtual relvar can cause an interesting problem. Inserting a tuple containing supertype value in a referenced base relvar would cause the virtual relvar to become unreadable, due to run-time type errors. One could imagine a similar approach to aliasing of variables, although the overhead would be high. Given the defined parts of the language, and allowing nested lexical scopes, a TREAT_DOWN operator can facilitate aliasing of a variable via an update parameter, as you implied. The initial invocation of TREAT_DOWN does a run-time check of the source to make sure that its MST conforms. In analogy to the use of TREAT_DOWN in virtual relvars, I can imagine tagging the referencing variable (parameter) in a fashion that requires the check to be repeated every time that the variable is read. This approach could, of course, be extended to pointers, too. Not very attractive from a performance POV, perhaps, but it seems consistent with the aliasing that occurs in relvars. -- DanMuller

View edit of March 2, 2011 or FindPage with title or text search