"We should not impose regularity where it does not exist." --TedNelson
Giving relational databases a dynamic character that is roughly analogous to dynamically-typed or type-free programming languages. (The "Oracle model" is more analogous to statically- or strongly-typed languages.) Column existence, and perhaps table existence, can also optionally be dynamic. Thus, a DBA is optional. Such a tool could be useful for RapidApplicationDevelopment, and for experimental or demo applications.
(Topic title not to be confused with DynamicSql.)
MultiParadigmDatabase describes the basic ideas behind DynamicRelational, but adds (or subtracts) features to accommodate OOP. Since the mixture of dynamism and OO features tended to confuse people, I felt a dedicated topic was in order. (Dynamism seemed necessary there to allow such a database to accommodate more paradigms.) Thus, for a working definition of dynamic relational, let's exclude these features from MultiParadigmDatabase:
- Optional table designation (entity attribute). Tables are assumed mandatory here.
But these features are kept:
- Values are essentially stored as strings, with an almost PerlLanguage-like typing system. There is no "side flag" or required formal schema to indicate the "type" of a column. Explicit types via schemas can be optionally added as needed, as described later. (As in Perl, this may require more explicit operators, such as not overloading "+" to mean both addition and concatenation. See ComparingDynamicVariables.)
- There are no explicit nulls. If an application wants to treat zero-length strings as null, or some other symbol/keyword, it can. (The string "[nul]" is a suggested convention if needed for compatibility with null-using applications.)
- There is no way to distinguish between a "missing" column and an empty, zero-length column. (This is perhaps a debatable feature, but I personally like it for reasons I'm still trying to articulate.)
- Dynamic schema via "Create-on-Write" - Columns and tables can be created just by putting or updating data in them. No explicit schema alteration is required. These "non-limits" can be incrementally revoked, per below. (The "non-missing-distinguish" rule above is what gives this behavior to columns.) An insert or update is sufficient to "create" a column and/or table. There would never be a "no such column" error (and perhaps tables). Creating indexes would perhaps be the only explicit "Create" command in a dynamic environment (if not automatic). The need for DDL commands would thus be greatly reduced.
- An operation to list all columns of a given table has to do an entire table scan to find which columns hold values. (However, such an action is generally against the philosophy of DynamicRelational.) One can view each row as an independent map (dictionary array). The schema for the table is simply the unique union of all map keys of the table rows.
- There may be ways to automatically track the existence status of columns, but this probably creates overhead and thus should be optional.
- Every row automatically gets a read-only auto-number "row_id" field.
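The features above can be sketched in a few lines. This is a minimal illustration only, with invented names such as "DynTable" (no real product's API is implied), showing create-on-write, the automatic row key, the empty-equals-missing rule, and schema-as-union-of-keys:

```python
# Sketch: each row is an independent dict of strings; columns spring
# into existence on write, and the "schema" is derived, not declared.

class DynTable:
    def __init__(self):
        self.rows = []
        self.next_id = 1

    def insert(self, **cells):
        # Any column named here is "created" implicitly; no DDL needed.
        row = {k: str(v) for k, v in cells.items()}
        row["row_id"] = str(self.next_id)  # automatic read-only key
        self.next_id += 1
        self.rows.append(row)
        return row["row_id"]

    def get(self, row_id, column):
        # A "missing" column is indistinguishable from an empty one.
        for row in self.rows:
            if row["row_id"] == str(row_id):
                return row.get(column, "")
        raise KeyError(row_id)

    def schema(self):
        # Requires a full table scan: the union of all row keys.
        cols = set()
        for row in self.rows:
            cols.update(row)
        return cols

t = DynTable()
t.insert(color="blue", doors=4)
t.insert(color="red", notes="demo")   # new column "notes", on the fly
print(t.get(1, "notes"))              # "" -- empty and missing look alike
print(sorted(t.schema()))             # ['color', 'doors', 'notes', 'row_id']
```

Note how no "no such column" error is possible on reads, and the only way to enumerate columns is the scan in schema(), exactly the trade-off described above.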
Optional features include:
- Optional schemas or "rigidity flags" on tables, which require some columns to be statically defined, required, and/or have their types defined. A trigger-like mechanism would be used, similar to "On row change, if field foo is less than 2 characters in length, then abandon the change with the message 'foo is bad...'". This allows a gradual transition to the static model if needed. Thus, one could make a given database instance "static", just like a current RDBMS, by supplying sufficient restrictions/validation. For example, rapid prototyping may favor the dynamic approach, but later we may want to put some rules (such as types, validation, "required") in place as the model solidifies and matures. Maybe some columns will be made "static" and others left dynamic. See below for a list of possible incremental constraints. The approach is generally analogous to DualTypingLanguages.
- Uniqueness guarantees - The option of having key- or record-based unique record verification. (Without such, the DB may not qualify as "relational" in a pure sense, but rather BagAtational. However, every row automatically gets an "auto-number" key.)
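The trigger-like "rigidity flag" mechanism could look something like the following sketch. The names (add_constraint, check_row) are invented for illustration; the point is only that lock-down rules are ordinary predicates bolted on incrementally, not part of a mandatory up-front schema:

```python
# Sketch of incremental "rigidity": validation hooks run before any
# insert/update, abandoning the change with a message on failure.

constraints = []  # list of (predicate, message) pairs for one table

def add_constraint(predicate, message):
    constraints.append((predicate, message))

def check_row(row):
    # Trigger-like check: run on row change, abort on first failure.
    for predicate, message in constraints:
        if not predicate(row):
            raise ValueError(message)

# Gradually "lock down" a formerly dynamic column:
add_constraint(lambda r: len(r.get("foo", "")) >= 2,
               "foo is bad: must be at least 2 characters")

check_row({"foo": "ok"})     # passes silently
try:
    check_row({"foo": "x"})
except ValueError as e:
    print(e)                 # foo is bad: must be at least 2 characters
```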
One can argue that dynamic records don't really violate the "tuple" requirement of relational, because one can conceptually view them as an ever-changing traditional "rigid" rectangular view. It may hurt your brain a little to think of it that way, or burn CPU to get a full rectangle, but it still manages to keep it technically honest to relational rules. Relational does not dictate that schemas never change.
Tips for adapting ODBC drivers and/or SQL for dynamism are given in links near the bottom of MultiParadigmDatabase.
SqLite seems the closest existing product to this idea, although only the cell "types" are dynamic, not the schema. Schemas must be declared and updated explicitly.
- Dynamic schema is a key feature. If SqLite lacks that, it doesn't qualify in my book.
- A database does not contain "cells" and "tables of cells" as you speak. It contains relvars according to dbdebunk, Date, et al.
- But then most readers wouldn't know what the heck we were talking about.
- Then make a RelVar page and beat it into them, because the current industry is in a sad state, and most people on here are smart enough to grasp the idea of a relvar after reading about it anyway.
- [We can always point them at the page that already exists: RdbRelVar]
- I didn't feel such was necessary. The "street" terms are good enough for the topic at hand, because this topic is not about formal definitions. (Do I smell a documentation-style HolyWar brewing?)
Some also suggest that ProLog
creates or infers ad-hoc relational or relational-like structures.
Yeah dbdebunk.com really thinks highly of SQLite... http://www.dbdebunk.com/quotes2004.html
Comparison operator discussion and examples moved to ComparingDynamicVariables
This article describes DynamicRelational
-like techniques that are becoming popular:
But it does not call the alleged alternatives "relational". One possible reason can be seen in one of the examples, which violates column-name uniqueness per record, such as having two colors for a car. If they instead supplied "color_1" and "color_2", then the unique-column-name requirement would not be violated, keeping it true relational (or at least closer to it). I wonder if they have a good reason to abandon unique attribute names per row. And some of them look more like an EssExpressionDatabase
(rows of nested lists), which is a curious idea, but not "relational". --top
Examples of optional/incremental "lock-down" constraints:
- Require that a given column have data (be non-empty) in a given table.
- Require that a given column is of a certain type (or passes a "parse test" against typical "type" templates or reg-ex's).
- Disallow "instant" new columns (for non-DBA)
- Disallow "instant" new tables (for non-DBA)
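The "parse test" variant of type lock-down could work roughly as follows. The regex templates here are illustrative assumptions, not a proposed standard; the idea is just that a column passes or fails a pattern rather than carrying a stored type flag:

```python
# Sketch: check string-stored values against regex "type" templates.
import re

TYPE_TEMPLATES = {
    "numeric": re.compile(r"^-?\d+(\.\d+)?$"),
    "date":    re.compile(r"^\d{4}-\d{2}-\d{2}$"),  # ISO dates only, for brevity
}

def passes_parse_test(value, type_name):
    return bool(TYPE_TEMPLATES[type_name].match(value))

print(passes_parse_test("3.14", "numeric"))      # True
print(passes_parse_test("3.14.15", "numeric"))   # False
print(passes_parse_test("2024-01-31", "date"))   # True
```

Such a test could be attached to a column as one of the incremental constraints listed above.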
Potential uses include:
- Table-oriented GUI system
- INI file replacement
- RAD for experimental applications or prototyping
- In-house short-term projects
- User/customer insists on more self-service flexibility
To use SQL with DynamicRelational
, explicit type indicators for comparisons and the like are strongly recommended. There are different ways to go about this, as described in ComparingDynamicVariables
. As a draft suggestion, I'll propose "type symbols" borrowed somewhat from old-style BASIC.
WHERE columnA $< columnB // $ = string compare (resembles an "S" for String)
WHERE columnA #< columnB // # = numeric compare
WHERE columnA @< columnB // @ = date/time compare (at)
ORDER BY $columnA, #columnB, @columnC // first on string, second on numeric, third on date/time
I propose these symbols be required for all comparisons with dynamic columns, to prevent static-database SQL habits from leaking into the dynamic version. If the type of the column(s) participating in an expression is known (explicit types via DDL are optional, but possible), then the symbols would not be required. Thus, if one imports data and schema from a traditional RDBMS, which would give each column an explicit type, then the SQL originally used in that traditional RDBMS would work as-is (barring dialect differences) and not need type symbols. If symbols are given that conflict with explicitly-defined column type(s), then a coercion of the operands to the operator's (symbol's) type is attempted, with an error if not convertible. Thus, "numberColumnA $> numberColumnB" would compare string versions of the numbers, equivalent to "toString(numberColumnA) > toString(numberColumnB)" in a traditional RDBMS.
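The symbol-driven comparison semantics could be sketched as below. The function name dyn_compare is invented for illustration; it shows the key behavior above: the operator, not the operand, determines the comparison type, and non-convertible operands error out:

```python
# Sketch: dispatch a comparison of string-stored values on the
# proposed "$" (string), "#" (numeric), "@" (date/time) symbols.
from datetime import datetime

def dyn_compare(a, b, symbol):
    """Return -1, 0, or 1; coerce operands to the symbol's type."""
    if symbol == "$":                      # string compare
        return (a > b) - (a < b)
    if symbol == "#":                      # numeric compare
        x, y = float(a), float(b)          # ValueError if not numeric
    elif symbol == "@":                    # date/time compare
        x = datetime.fromisoformat(a)
        y = datetime.fromisoformat(b)
    else:
        raise ValueError("unknown type symbol: " + symbol)
    return (x > y) - (x < y)

print(dyn_compare("9", "10", "$"))   # 1: "9" sorts after "10" as strings
print(dyn_compare("9", "10", "#"))   # -1: 9 < 10 numerically
```

The first result illustrates exactly why the "$>" example in the text compares numbers as strings.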
The two system-required and system-defined columns, "entity" and "row_id", are defined as string and numeric types respectively, and are not normally changed by users nor the DBA. (I won't rule out a low-level custom tweak.) Thus, explicit type-comparison indicators are not needed for comparisons such as "...WHERE row_id=7 AND entity='foo'". ("Entity" is essentially the table name and thus is not normally referenced as an explicit column, but typically comes from the FROM clause. An advantage of column-based referencing is that the table can be "computed" at query run-time, adding to the dynamism.)
If two columns are of predefined types (via DDL or system-defined), then comparisons of such columns don't need indicator symbols.
...WHERE numberX > numberY // both are predefined as numbers
...WHERE numberX #> thingA // a dynamic column is involved such that symbol needed
See also: ObjectsAreFromMarsTablesAreFromVenus