Nominative And Structural Typing

One key attribute of typing systems is the issue of nominative vs structural typing. (Like many things, this isn't strictly a binary choice - there are typing systems which have attributes of both; however it is useful to consider the two boundary cases). For any typing system, there are two important predicates that must be answerable:

In structural subtyping, the answers to the above questions are dependent on the structure of types. In nominal subtyping, the answers are dependent on explicit (or sometimes implicit) declarations by the programmer. Consider the following declarations, in a C-like syntax ("record" rather than "struct" is used intentionally, as C structs impose an ordering on their members).

 record PolarComplexNumber 
    double phase, magnitude;

record VelocityVector { double phase, magnitude; };
In a structural typing system, the two types would be equivalent, as they have identical structure. In a nominal typing system, the two types would not be equivalent, as they have different names (PolarComplexNumber vs VelocityVector). [There is a mix here of nominal within structural; a true structural Velocity would be Velocity == { double, double }, i.e. no names to the columns of the tuple] Next, consider the following:

 record VelocityVector3d
    double phase, magnitude, azimuth;
This new record adds a third field (azimuth); it otherwise looks like VelocityVector (and PolarComplexNumber, for that matter). Is is a subtype of VelocityVector? In structural typing system, the answer is yes [Disagree: in structural type systems phase is (double, double) -> double, first projection, and therefore not applicable to (double, double, double)]. In a nominative typing system, the answer would be no; to get the subtype declaration one must explicitly declare it:

 record VelocityVector3d
    inherit VelocityVector;
    double azimuth;

Nominal typing is a "subset" of structural typing, in that two types cannot be equivalent in a nominal typing system unless they are structurally equivalent (else, the typing system would be unsound). Likewise for the subtype relationship.

Most statically-typed production languages, including AdaLanguage, CeePlusPlus, JavaLanguage, and CsharpLanguage, are (for the most part) nominally typed. Many of the statically-typed FunctionalProgrammingLanguages, such as HaskellLanguage and the numerous variants of ML, are structurally typed. C++ is an interesting hybrid, actually. At a base level, C++ is nominatively typed--two classes with different names are not considered equivalent; and subtyping must be explicitly declared (via inheritance). However, the template system uses structural typing--any type can be an argument to any template; and the compiler won't complain unless a particular instantiation of a template is incompatible with its declaration. (See LatentTypesSmell for more on this).

For languages with DynamicTyping, such as SmalltalkLanguage, the issue is somewhat moot. The runtime typechecking done by such languages is usually of the form "does object X have method/attribute Y", as opposed to "is object X a subtype of type Z" - so the fundamental questions discussed above are less often asked in a dynamically-typed language (though they do get asked). In general, most dynamically-typed languages behave like structurally typed languages; as the "does object X have method/attribute Y" question is a structural query, and not a name-based query.

Which is better? There are advantages to both approaches. Structural typing is arguably more flexible - one common complaint in JavaLanguage is that two classes with features in common cannot be used polymorphically unless they both share a base class/interface which lists those features. Expect JavaGenerics to amplify these complaints. On the other hand, it is very common that two objects that are structurally equivalent are semantically different--the canonical case of this are "units" objects--objects intended to represent some physical quantity, expressed in some particular UnitOfMeasure?. As NASA found out, passing an object of type DistanceInInches? to a function expecting DistanceInCentimeters? can be problematic--especially if the requisite conversion is not performed. These two abstractions might well have the same internal structure:

 record DistanceInInches {
    double d;

record DistanceInCentimeters { double d; };
In a structural type system, the two types would be considered equivalent.

Support for labeled records allow structural typing to regain the flexibility nominative typing possesses in distinguishing similar types. The 'label' isn't a declared name implicit to the environment or unique to a given type; instead, it is part of the structure and must be part of each value and included for pattern-matching. For example:
 T1 = polar(phase=Double magnitude=Double)
 T2 = velocity(phase=Double magnitude=Double)
 T3 = velocity(phase=Double magnitude=Double azimuth=Double)
 T3 < T2 structurally (you may use T3 wherever T2 should appear)
 T1 != T2 due to different labels (which affects pattern-matching functions)
These are, generally, the same labels used to distinguish constructors in a union type:
 Maybe A = just(A)|nothing
 List A  = {type L: list(head=A rest=L)|nil}

Is there a specific citation saying that NASA's bug came about because of a problem modeling measurements in their typing? Without something specific, it might be worth pursuing a better example than the one given here, since sample code like you mention above would be considered poor design by many programmers pursuing either a nominative or structural mindset. -- francis

Right, I'm aware that the MarsOrbiter bug had to do with units of measurements, but I think you could argue that the bug was simply due to some design mistakes that should've been corrected regardless of what language was being used. Basically I think it might be useful to come up with a less-arguable example in that case. -- francis

[Not to mention that the MarsOrbiter problem was, IIUC, a human failure to generate data according to the specified unit system, not a software failure as such.]

I was aware of this distinction in typing, but hadn't heard of these names for the two kinds of typing. Thanks for the lucid definitions.

I have an issue with your second example, however. At the beginning of this page, you appear to subscribe to the definition of types as sets of values. (Cf. definition of "subtype predicate".) How can VelocityVector3d by a subtype of VelocityVector, if in fact a variable of type VelocityVector3d can hold a wider range of values than a variable of type VelocityVector? This question points to a whole can of worms opened by treating "class" and "subclass" as equivalent to "type" and "subtype", a practice common to languages such as C++ and Java, and a topic discussed e.g. by ChrisDate and Darwen in TheThirdManifesto. -- DanMuller

I think you are confused by the seemingly logical and principled approach with regards to typing in TheThirdManifesto. First of all, you repeat a mistake from one of ChrisDate's article at dbdebunk - VelocityVector3d does not hold a "wider" range of values. If double has unlimited precision (i.e. is not contained within 64 bits or 80 bits or something like that), then from a set theoretic perspective, the sets (double), (double x double), (double x double x finite_set), have all exactly the same cardinality. But the point is that the set theoretic perspective has only very little to do with typing. Types are not sets of values, nor is type theory the same as set theory.

To cut a long story short, in presence of subtyping aVelocityVector defined above as:

 record VelocityVector
    double phase, magnitude;
Can be elaborated as "the type that comprises all valid syntactic elements E for which the expressions E.phase and E.magnitude are legal. As such it is obvious why VelocityVector3d is a subtype of VelocityVector?.

[To answer both sets of questions: There is a lot of disagreement within the academic community at large, as to when subtyping is appropriate. Many texts on typing theory, as well as most OO and many functional programming languages, do treat "extended aggregates" (where one aggregate type includes all the fields of another, and then some) as subtypes. It is often questioned whether or not this is a good thing to do. From a type-theory point of view (which is only concerned with making sure that operations on a supertype are also valid operations on a subtype - semantics notwithstanding), this is a correct thing to do. In other words, subtypes are not subsets - instead they can be supersets.

ChrisDate, and others, prefer to limit subtyping with subsets. The positive integers is a valid subtype of the integers (being a valid subset) in this view; but VelocityVector3d is not a valid subtype of VelocityVector? - despite the same operations (projection of "phase" and "magnitude") being available.

Many operations on elements of a set are not valid operations on a subset (or are otherwise not closed over the subset) unless the operation is redefined. Subtraction is closed over the set of integers; but not over the set of positive integers, for example--although 3 and 4 are positive ints, 3-4 is not. Where mutable objects are concerned, taking subsets and declaring them to be subtypes causes all sorts of problems - the CircleAndEllipseProblem is a well-known example.

A good treatment of the various views of subtyping can be found in ObjectOrientedSoftwareConstruction.

My personal belief: it depends, and semantics are exceedingly important.]

Thanks very much for both replies. Interesting. I have been very much influenced recently by DateAndDarwensTypeSystem, which they seem to develop rigorously and fairly completely. (But perhaps I'm not qualified to judge that yet.) -- DanMuller

Nominative and Structural Typing under Type Abstraction

There are two common forms of abstraction over types (and values, too). One is parametric abstraction, meaning a type is provided as a parameter. The other common form of abstraction is InformationHiding (or implementation hiding, as some would call it).

One would typically see these sorts of abstract types in ML style languages and others that never bought heavily into ObjectOriented idioms. (OOP tends to deprecate the 'hidden abstract type', which has a common implementation for all instances within a module, in exchange for a more flexible 'hidden abstract value', each instance of which can have a different implementation. Both of these achieve InformationHiding, just at different granularity. The OOP approach generally hurts when time for optimization.)

Example signature for a functional stack:

 signature Stack<Elt> // Elt is a parametric abstract type 
   type S; // S is a hidden abstract type
   new   : S;
   empty : S -> Boolean;
   push  : Elt`e -> S -> S`s {post: (top s) == e};
   pop   : S`s -> S   {pre: not empty s};
   top   : S`s -> Elt {pre: not empty s};
 // ... and a simple implementation ...
 module ListStack?<Elt>
    type S = List of Elt;
    new    = nil;
    empty  = L   => (new == L);
    push   = E L => list(head:E rest:L);
    pop    = list(head:E rest:L) => L;
    top    = list(head:E rest:L) => E;
 assert forall Elt . Eq Elt => ListStack?<Elt> implements Stack<Elt>

The above ML-style modularity (perhaps barring a few tweaks for conditionals) should be familiar to anyone vaguely interested in such things as distinctions between NominativeAndStructuralTyping. I bring it up in this context to discuss how NominativeAndStructuralTyping fit into the picture.

Nominative typing is a very natural fit when typing 'S'. For purpose of modularity, one codes against a <i>signature</i> rather than against the implementation, and the only thing code developed against the signature for Stack knows about 'S' happens to be its name. Importantly, each instance of S is a distinct type. That is, there might be many instances of the Stack<E> signature for a given Elt E, and each instance has its own unique 'S' (even if the same implementation is used).

So the questions are: Should we attempt to support this modularity pattern with structural types? How do we do so?

I posit that in most cases, the answer to the former question is 'no'. Type systems can easily be hybrids of nominative and structural typing.

However, as the programming system scales upwards, 'pure' structural typing grows highly desirable. Attempting to share and maintain names across code ownership boundaries is bureaucratic, difficult to represent across languages, expensive, insecure. Getting the above 'each instance of S' type analysis right is a hassle, especially if we want to distribute some code for performance and disruption tolerance. Structure can be optionally verified, extensible, more robust. Open distributed systems and CrossToolTypeAndObjectSharing are easier when the types being transported are purely structural.

So, then, the question is: how do we represent the hidden type abstraction using structural typing?

I won't claim to have the only answer, but the approach I'm taking is related to EeLanguage's RightsAmplification by sealer/unsealer pairs. Basically, if I really want a type to be 'hidden', I 'instantiate' a sealer/unsealer pair as part of the module. This allows me to 'seal' values, which may then be passed around by arbitrary external modules (and even external processes). When the value comes back to me, I unseal it to operate on it.

Usefully, the sealer/unsealer pairs creates what (in type-based cryptographic analysis and zero-knowledge proofs) is often called a 'phantom type'. That is, a compiler can actually recognize that each 'instance' of a sealer/unsealer will serve as a unique type template similar to how each 'instance' of the Stack<Elt> signature will have a unique 'S' type, and perform the same sort of static analysis one would perform using nominative typing. If the compiler can prove a value remains confined to a given host, it could remove the sealer/unsealer entirely. Finally, even if the value has the potential to cross host boundaries, no actual encryption is needed until one crosses to a host that lacks the unsealer capability. This allows a great deal of performance and offers some powerful safety analysis without hindering a more dynamic implementation.

Effectively, the sealer/unsealer is providing a structurally enforced variation of 'hidden' abstract types. It seems promising to me, since I am aiming for a purely structurally typed system.


View edit of July 9, 2010 or FindPage with title or text search