Types Are Types

After the controversy, diatribes, insults and general guff on the TypesAreSideFlags page, I'm reluctant to enter the mess. However, by way of trying to find something positive from the unremittingly negative impression given by that page, I'm going to give a purely personal view on types.

As an experiment, may I suggest that people append their comments, questions and thoughts, rather than interspersing them through the initial document. Then I'll address the issues raised by editing the original, enhancing it where necessary, and the responder can remove or replace their contribution based on my refinements.

I'm sure it won't last - people here are rampant individualists and won't stick to any structure - but we can try.


Types are difficult.

TypeTheory has made enormous strides, allowing fantastic facilities such as provided by HaskellLanguage with its polymorphism, first-class functions, type inference and complete type safety (provided you treat it properly). But types are controversial. People don't know how classes and types interact, and there's still the question of whether a square is a rectangle, a rectangle is a square, both are the either, or never the mane shall tweet.

Caught in all this is the poor student, who has to learn to use types as embodied in C, C++, Java, Python, Ruby, PHP, and other systems, with a total hotch-potch of "types", classes, coercion, inheritance and lack of true type-safety.

So what is a "type"?

One way of thinking about types.

In complexity theory a "problem" is a collection of "questions". Each "question" is said to be an "instance" of the problem. A particular graph is a single instance of the TravelingSalesmanProblem, a particular integer is an instance of the Factorisation Problem, and a particular program is an instance of the HaltingProblem.

Consider, then, that a "type" is a collection of "values". Each "value" is said to be an "instance" of the type. The number five is an instance of the Integer type, The number 3.14 is an instance of the FiniteDecimal type, and 3/4 is an instance of the FractionalNumber type.

It's important to realise early, and throughout, that a "value" is an abstract thing, separate from (although obviously related to) its representation in the program text, and equally separate from its representation in computer memory during run-time. Keeping these different concepts separate is crucial.

We can think of each type as being a set of values, but each value is permitted to belong to several types. We can create equivalences, and define conversions - note, these are different - but each "type" is a set of values. Further, the values concerned are not limited to numbers, strings, or characters, they could be lists, maps, dictionaries, functions, or even entire programs.

We can now give functions that take pairs (or more) of values, and define what the result is. In the case of integers we can easily define "add", "subtract" and "multiply", which always result in an element of the Integer type (ignoring overflow and underflow), and "divide", which gives an element of the FractionalNumber type. We can define functions to carry values from one type to a value in another type, sometimes exactly (as in Integer -> FiniteDecimal) but more often approximately.


When we implement a language, we have to represent values as collections of bytes, and whenever we act on a collection of bytes, we need to know what it represents, and therefore what to do with it. Note also that a particular string of symbols in the program could represent a value in any one of several types. The symbol "1" could be any of Integer, FiniteDecimal, FractionalNumber or Float. The way that value gets converted to a collection of bytes in memory will probably depend on which type it belongs to. The permitted operations, and the results thereof, will also depend on that.

In statically typed compiled languages, the collection of bytes representing a value needs no extra associated explicitly stored information regarding the type(s) of the value represented. The binary code being executed knows what to do, because that part of the code already "knows" what "type" is the one to use; the associated type(s) is/are implicitly encoded by virtue of the code being statically type-correct. The type containing the value represented by the string of symbols in the program has been definitively deduced, and that type information has been completely applied to the generation of the data-manipulating code. If that were not the case, then the generated code would not be statically TypeSafe.

In dynamically typed languages, it is always necessary (in at least some, but not all, subprograms which occur both in theory and in practice) to have extra explicitly-stored type information, beyond the simple bare representation of the value. Such a "tag" ensures that when the collection of bytes is accessed, the program uses it in an appropriate manner. Without the tag, the program has no way of knowing whether a particular collection of bits is, for example, a raw 64-bit Integer, or a Float.

In Python, (dynamically but strongly typed) for example, the collection of bytes holding a value additionally has a tag to say what type the collection of bytes (or more correctly, the value represented by the collection of bytes) belongs to.

Learning about Types

All this is totally indigestible for the poor student who has no experience and misleading intuition. The student needs to be given something more concrete to help get them started. Talking about one means of implementation - for example using "side-flags" - can ease them into the idea that a value might belong to any one of several types, and the representation of them in bytes, and the actions performed on them, vary according to that type. Examples, and many, many of them, need to be given.

The problem is that by focussing on an example implementation technique can result in a very blinkered understanding, inappropriate and perhaps even impossible for extension and enhancement. The ability to step beyond the concrete implementation and into the abstract is not given to everyone, but perhaps some students are being unnecessarily hamstrung, damaged by overly concrete presentation.

On the other hand, if a student can't, when the time comes, transcend their concrete examples and move to the abstract, then perhaps they actually would never understand anyway.

You might think you understand all this, and you might be right. But what happens when you allow functions as first class objects, and take non-termination into account. Suddenly there is a whole slew of problems about what types things are and can be, and it's not enough to think that nothing will go wrong. Experience shows that things do go wrong, and spectacularly so.

Simplifying the Problem

Types are hard, and simple ad hoc schemes have all so far proven to be either overly-limited, inadequate, inconsistent, incomplete, or some combination of them all. To present a simple scheme as an introduction, with plenty of examples, may be good pedagogy, but it should carry a health warning, making it clear that types are much more complex than they appears.

A simple understanding is fine for most people, and for most day-to-day programming, but don't underestimate the problems. If you haven't programmed in OcamlLanguage, or written a compiler that does type deduction, or implemented a language with functions as first-class objects, then the literature suggests that there are things you haven't thought of.


Types are subtle, and quick to snare the unsufficiently wary.

Brilliantly worded, author!

I have seldom seen such a well-done summary of such a hopeless ThreadMess. And a summary it is, though summary is much too weak a word really. And the title is - well - particular. It is a good title no question. Just not what I would have expected. But then again it's somewhat misleading, as is the introduction. Who can expect to add a personal view of types to this? But maybe this will keep the quality of this page unexpectedly high.

In response to the opening statement: someone added: There is sufficient evidence in the form of confused arguments to show that the "idea of a type" is understood differently by many people. Coming to a common understanding is difficult, so the field/area/topic of types is difficult. Hence the opening.

After the heading "Learning about Types" someone also added: It is looking in a standard dictionary for clear and precise technical terminology that is ludicrous. It's exactly the woolly-minded, imprecise, intuitive "definitions" that cause trouble when thinking about technically complex areas. Beware the person who advises you thus.

Dictionaries give a general notion and examples, but it doesn't answer tough questions such as how types differ from set-theory, "category", "validation", etc. (More questions are at QuestionsForTypeDefinitions) -t

In the above it is stated that: Someone replied: This is simply wrong. An instance of a [value-]type is not a variable, or a constant. It is a value. The concept of a "variable" is not referenced in the above discussion. In a program we create variables which can then hold values. A variable may hold the value "red", or a constant may be set to (or represent) the value "red". That does not mean that the variable is of type color. It means the variable is of type "variable that holds values of type color". The author of this comment appears to have ignored completely the paragraph that came next, which said: Someone commented on the comment, saying: This seems exactly right, including the idea that a given variable is an instance of the type "Variables". This, however, creates a potential for deep confusion of levels. The original commenter then wrote: This continues to confuse the representation in the program with the abstract concept of types and values. Talking about a literal in the program, and referring to what it is called, is exactly the confusion of levels that makes types a difficult subject.

What makes types a difficult subject is when ones here have not done as much research as they could have, from respected scientists such as Kernighan, Wirth, and others (i.e. backing up claims with quotes and references). I'd refer people some more web pages for further understanding of instance, literals, values, and types.. but I've spent enough time here and provided many quotes and references above. Please read the quotes from the respected computing scientists, and consider using the term "member of a type" instead of "instance".

A type is a classification system, as clearly defined in the dictionary. See also TypesAreNotTypesTheyAreTypes. What is math? Who can prove math is really math? What is a type? What is an apple? The first place to look is the dictionary. It is a group, classification, domain.

The dictionary is irrelevant? A classification system is exactly what a type system is.. which apparently 99 percent of the people don't understand. Instead people ramble on and on about how Type Systems are actually Pilons or Flags or Pieces of Mint flavored Gum that people carry in their pockets. Type systems exist to classify (domains) and constrain to a certain group (again, domains). This, is exactly what the experts say too. It is very sad, that today computer scientists (as we call them) will not take a look at their dictionary.. in cases where they are unsure of the definition of a simple phrase.. such as type. It is, very very sad.

{The dictionary should provide a layman's comprehension of types and computer programs. The expert's definition of the word ought to at least bear some resemblance to and not utterly contradict 'in spirit' the dictionary definitions, lest the expert create artificial confusion in choosing that particular word for use in his or her domain. But expert definitions need to be formal; use of informal words won't help one expert communicate with another. It just so happens that 'Type' isn't defined even for experts. 'TypeSystem' is defined, and 'TypeTheory' (the formal study of TypeSystems) is defined, but 'Type' is defined on a per-TypeSystem basis. Experts can, however, agree on the essential properties of 'Type' - those things that are true for every possible consistent and applicable TypeSystem, and they can do so for types in TypeSystems applied as broadly as observations over the 'real world' (or even imaginary worlds, like mathematical models) and add additional properties when they know additional constraints (such as types as applied strictly over and within programming languages). Necessarily, these will be high-level and broadly applicable properties, lest some expert could come along and create a TypeSystem just to contradict them. These essential properties will have a basis strictly in information theory (i.e. a TypeSystem is not 'applicable' if one cannot acquire the information necessary to apply it) and logic (to ensure consistency).}

{The author here focuses on types as applied over values, which is a bit distinct from types over mutable state, types over behavior, and sub-structural types over program components as a whole (e.g. ensuring that there is only one copy of any value in use in the program at a given time). The domain of TypeTheory is vast and quite deep and it is easy for someone to get lost in it, especially if they have no desire to perform or study the math necessary to keep up. As a consequence, it should be noted that most computer scientists - even those with master's or doctorate degrees - do NOT qualify as 'experts' in TypeTheory. (Or CategoryTheory, or Mathematics in general...). TypeTheory is deep enough to be a Master's degree and provide plenty of Ph.D. material all on its lonesome.}

{Fortunately, programmers and those in computer science don't need to become experts, as that would add at least two years to their degree. Programmers can, for the most part, remain laymen when it comes to use of 'types' ... and the dictionary's definition can, for the most part, properly satisfy their need for a definition. However, programmers should at least know enough to be humbled by the field, just as the majority of computer scientists are invariably humbled by the advanced components of post-calculus math and non-monotonic logics.}

{This humility is desirable because when a programmer, who seems like an expert to younger programmers and other laymen, gets idea in head to design a language or design a new TypeTheory (e.g. TypesAreSideFlags) without really understanding why it would fail. This programmer can lead others astray and simply add to the general confusion surrounding 'types'. Lacking skill is not generally a problem for those who aren't UnskilledAndUnawareOfIt and choosing to preach their views or if they're SnakeOilSalesmanAttemptingToMakeAQuickBuckSellingBooksForIdiotsByIdiots. Most of the big Type debates on this WikiWiki are a consequence of one or both of the latter (it's really difficult to distinguish the benign fools from the malign charlatans).}

{The only programmers that benefit a great deal from advanced study of TypeTheory are those involved in (a) language design, (b) compiler implementation, and (c) correctness proofs over programs. The rest only need to learn enough about types to use their programming languages of choice and perhaps take advantage of certain language features that derive from the chosen TypeSystem (e.g. TypefulProgramming).}

Demand a Simple Explanation of Type

Perhaps "type" should be separated from "type theory", "type systems", and "type implementation". A lot of people here are not looking for a "type system definition" or a "type theory page". People are asking "what is a type?" and saying "types are this?". Precisely if one wishes to grasp what a "type" is, he should grasp the "concept" of a type (domain, grouping, model) rather than studying full fledged "type theory", "type systems", and "how to implement types".

Consider an integer or byte type. When one declares a byte or integer, he shall think of these as a "limited number type". How hard is this to grasp? It is not hard to grasp, despite what some claim. An integer type is a way to constrain and group a number. A byte is a very limited model that we can work with since it can only hold values from 0..255. A byte is not just some pylon or side flag, it is a model to follow. A type is a simple concept for even a layman to understand.. and it should be.

To make types easier to understand, it is also important not to introduce confusing terms such as "instance" when discussing "values", since "instance" is also a confusing phrase commonly used in OOP. Rather "member" should be used when discussing "possible values of a type" since the word "member" is less overloaded in computing science.

Re: "simple ad hoc schemes have all so far proven to be either overly-limited, inadequate, inconsistent, incomplete, or some combination of them all."

Can you provide an example? Re: "making it clear that types are much more complex than they appears."

"Are complex" and "can be made complex" may be two different things. It is probably possible to take ANY concept and create a complex and/or convoluted version or model of it. But that act does not necessarily make the concept itself complex. For example, proponents of Smalltalk are often appalled at what languages like C++ and Java have done to OOP.

The basic idea of 'types' is very simple, just like the concept of 'object'. But talking 'type theory' is more like discussing the entire field of ObjectOriented PatternLanguage, not merely the simple concept of 'object'. In the same sense as used above, 'objects' are also hard (or "complex") once you have to start putting them together to create bigger objects or solve real problems, or start creating first-class objects in a distributed system, or deal with the issues surrounding the creation and destruction and naming of objects. As with a simplistic notion of objects (e.g. in the same vein as "TypesAreSideFlags", consider "ObjectsAreIdentifiers"), any simplistic or "ad hoc" notion of types will not make predictions or provide any help at all when it comes to dealing with complex or combinatorial problems involving types, such as those involved with type inference and first-class functions.

Ultimately, the statement above refers more to the domain of 'types' - including its theory, the teaching of 'types' to students, what it would mean to define the use of types in a type-sensitive programming language, etc. All these same problems apply to 'object': the theory of objects and pattern languages, the teaching of 'objects' and their relevance to students, what it would mean to define object in an object-oriented programming language. The gestalt context of the statement ("making it clear [...]") in the above discussion should have made it clear that the word 'types' was referring to more than merely the 'concept' of type.



View edit of October 19, 2014 or FindPage with title or text search