God Table

The "God Table" is similar to the GodClass. A "God Table" is an table that represents [allegedly] way too many other entities in the system and may grow beyond all logic to become The Table That Holds Everything.

A God Table it is typically created to avoid having lots of tables to represent "catalogs" (for example, the catalog of countries, the catalog of states and the catalog of cities are combined in to a single "locations_catalog") sometimes this is used for all catalogs in a system (units of measure, location information, names, valid statuses in the life cycle of other entities and even flavors of ice cream are stored all in the same table), the justification for doing this is that generally all this tables end up being pretty similar (all them have 3 fields: id (primary key), name, description) and sometimes the name is the pk.

I think God Tables are created because of FearOfAddingTables (the relational cousin of FearOfAddingClasses)

A big question about this, is that have seen this "pattern" in huge enterprise systems (like OracleFinancial?) and, while I don't really like creating God Tables, I am not sure if it is a GoodThing or a BadThing.

I do know it goes against Foreign Key verification, because makes you have a GodCatalogId? in your Tables instead of a CountryId? or a FlavorOfIceCreamId?, so I guess it is not a GoodThing, but I would like to know the opinion of the community.

[While a "God Table" is almost invariably a bad idea for the reason noted above, there may occasionally be requirements or restrictions that dictate otherwise. Database and modeling issues should always be carefully considered on a case-by-case basis. Be wary of "one size fits all" directives about modeling!]

I disagree. Lots of small tables often results in artificial separation of concepts that are really more or less the same thing and may grow the same over time. The real world is often hybrid stuff. It fits the philosophy of set theory under which something can be more than one thing at the same time based on shiftable set memberships, something that splitting prevents. One is not forced into hard taxonomies, keeping the system open and flexible. DelayClassification. I've seen messes made by mad splitters and I wanted to slap them. The SQL statements needed kajillion JOINs and UNIONs to do anything, and ran really slow. A virtual separation is easier to code up (and undo) than the other way around. (In my experience it is the other way around)

[Providing this sort of advice is contrary to good modeling practice. The recommendation to avoid "lots of small tables," or that a "God Table" is good or bad, depends entirely on the requirements. Although one can probably show statistically that a "God Table" is on average more likely a bad idea than a good idea, following naive advice to use one approach or another -- without knowledge of the requirements -- is doomed to failure. A "God Table" may be perfectly appropriate if queries need to performed frequently on the entire table. A lot of small tables may be perfectly appropriate if they are conceptually disjoint, and combining them via UNION or the like is infrequent or not required at all. And so on. Furthermore, the implication that JOINs should be avoided smacks of casual denormalization or even laziness; again, this must always be considered in light of the requirements. Defining rules for modeling -- aside from those of normalization itself -- is meaningless without the context provided by requirements, because it is the requirements that determine functional dependencies, conceptual groupings that define entities, and so on.]

But people tend to have partitioning preferences and I am not sure that my preferences are objectively better or rather fit the way particular way I design software and systems. For example, heavy splitting may provide greater MentalIndexability to some people that justifies the more complex SQL (more joins and unions) because it speeds up their thinking in general even if they have to read longer queries.

It just may come down to individual psychology. If you can prove big tables are "bad normalization or laziness", please do. Even how people measure and perceive "duplication" (part of normalization) varies per individual. I've been in some crazy debates, and people just think different. Even duplication can be a psychology issue. CommonSenseIsAnIllusion.

[But that's exactly my point. I can't prove bad normalization or laziness or lack of common sense or anything else without knowing the requirements. A naive edict to "avoid JOINs" may smell of laziness or bad normalization, but to prove it in the absence of requirements, I have to guess at functional dependencies based on table and column names -- a highly questionable practice. Data modeling is not dependent on psychology (though that may be a factor), it is dependent on one and only one thing: The requirements. If "keep the queries as simple as possible" is a general requirement, then it will have a predictable effect on the data model, as will "maximise performance" or "maximise flexibility" or "minimise redundancy and update anomalies."]

I am saying that I disagree that there is one and only one solution for any detailed set of requirements (that don't pre-dictate implementation). There are a lot of "open" implementation decisions. Here is one that I just encountered:

 table: address_1

table: address_2 (2nd alternative) ------- street unit_type // 'apt', 'bldg', etc. unit_num zip etc...

I would lean toward the 2nd, but the final deciders went with #1, probably because it avoids having to track "unit types", and roughly 99.9% of "units" are either apartments or building numbers. (This example is not really about splitting, thus is perhaps destined for movement to another topic.) There is no one right decision here. They both have merits and downsides. I will not call #1 "wrong" even though it was not my pick.

As far as "keep simple as possible", there is no good objective measure of simplicity. Code size is not it. For example, lots of nesting (ThickBreadSmell) may be less code, but I wouldn't always call it "simpler" because it is often more difficult to mentally chunk-ify, at least for me (some dig them, but my brain is not that powerful).

One Table to rule them all, and in the darkness bind them.

I suggest this be merged with FearOfAddingTables. It is not big enough to justify a separate topic. Is somebody ignoring FearOfAddingLotsOfLittleWikiTopics?? Also, MultiParadigmDatabase does indeed suggest a GodTable, and it may be a good idea where flexibility is needed.

I believe they should be kept separate, FearOfAddingTables is the cause and GodTable is the effect (the same way that FearOfAddingClasses causes GodClass) and I am sure the must be other examples in this Wiki where cause and effect are separated
See also: MultiParadigmDatabase, MaspBrainstorming, TableQuantityVersusAppSize
See also: CategoryRelationalAntiPattern, AttributeTable, WebGodObjectDiscussion

View edit of December 2, 2013 or FindPage with title or text search