Pernicious Ingrown Sql

In my travels I've noticed a troubling disease. At first I thought it was just the occasional isolated case, but now I'm worried that it's an epidemic.

I've observed that a lot of people have the notion that a web application should be looked at as a contraption for generating SQL strings and then wrapping HTML around SQL result sets.

(This is in stark contrast to my perspective, that a database is a place where good objects go to rest when we don't need them, and that the fact that a database speaks SQL mainly influences how I optimize certain operations.)

Symptoms include some or all of of the following:

Email WilliamPietri sent to the XpMailingList, reproduced here by RobSartin.

My point in posting this wasn't really a request for technical solutions; I'm looking for ways to wake programmers up, to get them to understand that there are other ways of looking at their code. Once people get it, I can get them to use an ObjectRelationalMapping tool. But how do I get them to think about things in the right way? -- WilliamPietri


I HaveThisPattern. True story: in giving a series of tutorials on a ModelViewController-style web application framework, explaining the SeparationOfConcerns, how to configure the controller to invoke actions in response to specific request parameters, database interaction via a simplified ObjectRelationalMapping layer, and presentation generation, the most frequent questions is, "Where do I put my SQL?" -- StevenNewton


What I do to avoid this is use a data-driven ObjectRelationalMapping framework. It's a lot easier than writing or maintaining SQL to do various queries, and it winds up making applications more flexible (for instance, by providing database independence). The one I use is the EnterpriseObjectsFramework included with WebObjects; it's 100% Java, and you don't have to use it with WebObjects if you'd rather use some other web application server. (WebObjects is built on the EnterpriseObjectsFramework, not the other way around.)

One of the things that's great about the EnterpriseObjectsFramework is that you can reverse-engineer an existing database into objects very easily. You point the graphical EOModeler tool at your database, select which tables and stored procedures you want to use in your object model, and it will build an object model for you automatically. Then you can customize it and start using it in your application quickly. -- ChrisHanson


I HaveThisPattern, too. We required from a subcontractor that an application had to have a distinct object-relational mapping layer. They said, sure, they did this every time anyway, and had a finished framework ready for that. Later, it turned out that their "framework" for "object-relational mapping" consisted of a class that took a sql query string, executed the query and converted the result set into a vector of String arrays - never mind whether a field was a number, a boolean or a date, everything ended up as Strings. These were then sent to the servlets / JSPs for processing and display (<%=d[i][5]%> etc.). After all, Strings are objects, right? So it must be an object-relational mapping, right? - It would be more funny for me if it wasn't true. -- FalkBruegmann


Amazing! I've seen the same thing as FalkBruegmann talks about, except the ResultSet was converted to a two dimensional String array. Of course this was on a EJB project, with tons of duplicated code and no tests. Fun stuff. -- TomCopeland


I guess this Pattern / AntiPattern originated in the days of cgi-script web applications, where the database was the only place for data to persist between requests within an interactive session. If every piece of data with a lifetime of greater than one request had to get onto the database within the request that created it, it is easier to think of the data in SQL terms, rather than craft them into objects that would be mapped to SQL and destroyed within milliseconds anyway. -- AndrewMcGuinness


I think there is an ObjectRelationalPsychologicalMismatch (philosophical divide) issue here. OO proponents tend to see SQL as a mere storage mechanism, but relational proponents see relational as a powerful tool that can simplify code and structure management, and thus not something to be merely "hidden away" but used to its fullest potential. If you only use RelationalDatabases as a storage mechanism, I can understand such a distaste for "naked" SQL. But I tend to side with the ChrisDate viewpoint on this one. OO proponents are making a mistake by pushing aside relational's potential and end up reinventing the database in their own application code without realizing that they are doing such. (I agree that SQL is not the ideal relational interface language though. But, still better than hand-coding DB features from scratch.)

Further, in order to have "swappable vendor" database code, you would have to use a rather wimpy subset of RDBMS features, creating more code to do what the DB might otherwise do. Plus, the SQL standard is often not closely followed or leaves too many gray areas. The alternative is go with an open-source RDBMS or roll-your-own from scratch, which OO proponents seem fond of doing anyhow it appears to me and some other relational proponents.


SQL text found in applications is mostly monotonous. See WhereAndAnd. So IMHO creating "vendor swappable" code is not a big issue.


This problem is not limited to web applications. I have seen many VB applications that is also a contraption for generating SQL strings and then wrapping VB gui around SQL result sets. As the saying goes, you can write Fortran in any language, this kind of bad practice can be seen in any kind of system. Web apps are just the latest fad that attracted the newest batch of newbie programmers.


This problem may not be limited to web applications, but the issue mentioned above with objects being destroyed at the browser still exists and is a very strong factor in modern web app architecture decisions. Cookies and sessions are both unreliable on the browser. Is it any wonder we cling to the one data storage method over which we have the most control?

Besides, a well-designed relational database will take care of so much work for you. Maybe this is just laziness talking, but I trust an established database to police variable types more than my own code. Provided you don't ScatterSqlEverywhere, I see no problem in letting it do its job. See DatabaseIsSoftwareToo.

By the way...

"Web apps are just the latest fad that attracted the newest batch of newbie programmers."

Does this comment belong on this page?

-- LaraKramer


I would like to see a semi-realistic example of why it is a bad practice. I have seen some cases where wrapping adds and updates may be prudent because most RDBMS don't allow updatable views. But beyond this, I don't see read-based queries a problem. The maintenance effort caused by wrapping is equal or greater than that of not wrapping it in my observation. SQL used to its full potential is too powerful to be wrapped by a "flat" API anyhow.

One can make utilities to generate the repetitious parts of SQL (See HelpersInsteadOfWrappers for examples). This is automating the repetitous parts without trying completely push SQL into a closet. Another way of saying this is that automating repetition and hiding SQL are not necessarily the same thing.


Ok, I HaveThisPattern as well. I would love to write code in a nice, totally non-coupled fashion. Unfortunately, I have to react to business need and I'm sure I would end up taking too long writing code in this fashion. Sometimes having totally decoupled code with processing engines, data engines and the like are function of institutions that have a tight relationship with an IvoryTower or a very large corporation. Let's face it the medium and small businesses of the world (most of them), don't have the resources for this kind of intellectual twiddling. With that said, I would like to say I would l love to be in a spot where I had the luxury how to do it TheRightWay?. I don't believe this is a newbie or intellectual problem. It's probably more of a time-resource problem. -- JoshuaRobinson


I don't really agree with the overall premise of this page, but there are some legitimate concerns.

I'm not sure what an "untyped string" is, but I think the mention of integers probably refers to enumerated type codes in the database. One way of addressing this sort of thing is to accompany the schema with code that defines such things symbolically, and being consistent about using the symbolic names.

These are all legitimate concerns. Centralizing the definitions of non-trivial queries, sticking to standard SQL whenever possible, and having a flexible compatibility layer to hide details of the DBMS API all help here.

This, I don't see as a problem at all. For many processes, especially those involving large updates to the database, the DBMS can do things much more efficiently than application code can.

These don't sound like problems with the use of SQL, unless it's perhaps an indirect consequence of trying to force the output of every query into an object.

That just sounds like a characteristic of programming large systems. :)

Wrapping every database in business objects discourages the use of queries tailored to the immediate needs of the application, which in turn leads to gross inefficiencies. A pattern I've seen often, for instance, occurs if you have need to display a list of only a few columns of some business entity; loading each business object in order to display only a few attributes in each row can be terribly inefficient, especially if the business objects contain complex hierarchical structure that's not needed for this particular display. (Consider for instance bank account transactions with detail records, when you only want to display a summary list of check numbers, description, and total amount.)

[That's an excellent point. Display shouldn't be considered business objects. Perhaps operations from the display are on business objects.]

The problem isn't limited to display, that's just a common example. Any sort of batch process or "macro" process that might have to examine numerous records in the database will be up against the same problem. Yes, you could use business objects to encapsulate some of this, as they also often encapsulate basic loading and updating of their corresponding database entity. But eventually, you'll run into some process that relates to multiple business objects. Personally, I think that attempts to map the entire database schema to business objects causes more code organization headaches than it solves, and, as implied above, discourages use of the inherent flexibility of a relational database system.

On the other hand, the inherent flexibility of SQL is what makes using it so difficult. When any statement can at any time, access and modify data, and you give up specific data structures for generic datastructures like resultsets, code becomes essentially weakly typed and data completely unprotected. Every time you write a new statement, you have to hope you aren't violating any business rules. With objects you know you aren't violating any business rules, they are all contained within the model. Trading Object.Name for Record["Name"], means giving up strong types, for those of us who like that. It means no longer having the compiler assist us when changing things around. It means resorting to hunting down field names with search instead of compile errors. IMHO, this leads to a lot of broken brittle code, it also leads to spending copious amounts of time chasing SQL statements just to find out what fields exist in the record you're working with. Using business objects, one can generate a base system to work with, and tune and optimize from there, you can always drop back to SQL when the need arises, and it always does. Starting with SQL on the other hand, forces one to start from scratch. I'd rather start with something and tune the slow parts, that start from nothing each time. It's not like using the object model forces one to give up SQL, it simply let's you only have to write the slow parts by hand, rather than all by hand.

I've been working as a DB designer for ~20 years now and on the way dabbled with OO (my favourite language being Smalltalk). This is mainly to try and overcome some of the difficulties people experience connecting OO with (relational) databases: nowadays mostly working with Java/JDBC. There are a lot of interesting issues involved but the above comments about SQL are both true and a little unfair. SQL is a flexible language precisely because, as Hugh Darwen states, a database is a set of axioms (the data asserted as tuples in a relation [translation: rows in a table]). The result of a query is a theorem (new data derived from the orignal facts/axioms/data). The process of deriving the theorem is a proof (we can prove a piece of SQL correct because it is based on relational calculus/algebra). Now, would you want your mathematical theorem generator to be inflexible? How would that limit you to what you could prove correct? In practise, not many people bother to prove things, but the point is that RDBMS's are based on a sound mathematical footing and hence one could prove things if you wanted to. -- PaulHorth?

"...code becomes essentially weakly typed and data completely unprotected. Every time you write a new statement, you have to hope you aren't violating any business rules."

If you're using an industrial-strength DBMS, you put the database constraints in the database, where they will be checked regardless of who accesses the database from where. This ability is fundamental to relational databases, for exactly the reasons you cite. Many database constraints can be easily expressed declaratively.

I agree, but find that most business rules can't be enforced with simple constraints. Most rules aren't simple validation, they are far more complex and the business objects make them easy to express. So I stand by my original statement, you basically have to hope you aren't violating rules when you allow sql instead of business objects.

Well, as a database is a set of axioms (see above), I would like to ensure they are correct and not inconsistent before I use them. At least with SQL, being based on relational calculus/algebra, can be 'proven to be correct. In the DB I use (Oracle) complex constraints can be enforced. I like to stick constraints that are company/system wide in the DB. Application-level constraints should go in the application layer. -- PaulHorth?

If you rely on business objects to enforce all the rules, then you will have a very hard time indeed when you have to "start tuning the hard parts" - you'll have to heavily refactor your validation code to decouple it from the business objects, or you'll have to replicate a lot of validation in different forms.

I use the database for simple validation that constraints can handle, but a system which works and is slow is far better than a system that doesn't work at all, and I find most SQL approaches are so riddles with bugs and violations of the rules that I consider them not working sufficiently, it doesn't matter how fast they are.

My experiences have been colored by working on a proprietary system that uses such an object-based approach. I've been "tuning" this system for several years.

I'm sympathetic to many of the concerns cited, though. I also preferred - vehemently - static type checking several years ago. I'm very concerned about maintainability and vendor lock-in. However, what we have done, and it seems to be serving us relatively well, is to add a heavy dose of unit tests to the system, along with centralizing the definition of non-trivial queries so that it's easy to automate many of the unit tests. In addition, we have our own DatabaseAbstractionLayer that isolates us from SQL dialectic differences -- but this was admittedly an expensive architecture to create, and the write path through this layer is only rudimentary at the moment, although the read path is working very well. For writing data, we currently do still rely on the business objects, where all of our data validation is encapsulated. (We're not using a very capable DBMS at the moment, although we hope to address this problem soon. We're using MS Jet, via DAO. You may laugh at me now, I won't be offended. Jet's numerous SQL idiosyncrasies are a major source of headaches for our abstraction layer, which generates SQL from relational expressions. The good news is that once I determine a workaround for a problem, nobody else has to worry about it again.)

I won't laugh, databases are a pain for most of us, and I know sql quite well. Knowing sql has little to do with the problems in using it. I've even considered not using objects and unit-testing every single procedure as an alternative method, I'm still on the fence. I like how fun and easy programming is with objects. I like pretending I simply have unlimited memory and no database. As a language, I think SQL sucks, it has little in abstraction capabilities and makes reuse a pain, maybe one day I'll find an approach I like on all fronts, until then, I'll take objects.

You may know SQL, but knowing it and knowing why RDBMSs are important and how to design a proper relational model and how to interface with objects and how to optimise it are two different things. Too many people spend a lot of effort trying to tie OO and DB together when nowadays it is probably cheaper for most medium to large scale projects to buy in (or get for free) a third-party tool like TopLink or Hibernate to do the ORM mapping. -- PaulHorth?

Maintenance of references to database names (tables and fields) is definitely an issue, and it would be nice to have the compiler warn you about invalid references. However, searching for these names does not have to be so difficult, especially if you use a simple trick. Use your DatabaseAbstractionLayer to enforce a uniform syntax for referring to them. For instance a small class named "fld" could be used to represent field names, so all field references appear as fld("FOO"). Now a simple regexp such as "\bfld\s*\(\s*"FOO"\s*\)" (Perl-style) can reliably find all references to field FOO, outside of SQL.

If you leave performance tuning to the end, in many applications it will end up consuming a huge amount of effort.

I disagree. I find it's quite easy to trace the database, find the most repetitive hits, and combine them, cache them, or write a proc to replace them. Few parts of an application really need that much effort, at least imho.

I keep telling our application-level programmers the same thing: retrieve only what you need from the database. This is the single most important guideline for performance when using a database, and following it requires bypassing business objects. Our small group of DB experts spend a lot of time rewriting application code to fix performance problems, but with decent schema documentation and well-designed schema, it should rarely require an expert to do this right in the first place.

Personally, I think dba's are a waste of time. I've yet to find a dba that knew as much as a solid programmer did. I've always found dba's to be programmers who couldn't cut it, of course I'm sure they're not all like that, just speaking from personal experience.

That's interesting, I've yet to find a programmer who knew as much as a solid DBA/designer about designing databases. Most programmers seem to treat an RDBMS as a place to shove data in a 'flat-file' structured manner: with all the horrific consequences we identified in the 1960's from that approach. Please do not judge people on how well they know your area of expertise, particularly when you may be ignorant of their area. -- PaulHorth?

[That's a dangerous attitude to take, whether it is supported or not. A similar attitude is that testers are programmers who couldn't cut it. The problem is that testing and DBA work and lots of other things are all essential, and it is counterproductive to sneer at those other professions.]

[A famous example is garbagemen and janitors. Civilization would collapse without them. It's much better to appreciate what they do for us, than to sneer that they can't do anything else - which may be true, may be false, but is irrelevant. We need them, and should be very glad they exist.] [Of course, good DBAs are quite skilled compared to garbagemen and other unskilled/semi-skilled professions, and that should be appreciated in its own right.]

[http://www.thefirstpost.co.uk/index.php?storyID=7092 (external link) The photos here illustrate the pressing requirements for garbage men and janitors quite effectively. Sorry if external links are frowned upon given that they are likely to disappear, but I was reading this page trying to decide how to tackle my current DB-based application (in terms of access layers) and while reading about janitors above I simultaneously ran into this link on another tab - A coincidence too good to pass up. Remove it if you will. More on topic, I will be the DBA and developer for this application, given the small size of the company I work for and I've previously built a (my first) RDBMS-based application using a single DatabaseAccessLayer object that dealt with ALL queries and attempted to return "business objects" representing the various outputs. I had mixed feelings about this - it helped in some ways but not others. More reading required... -- Chris R-M]

Like I said, I know it's not all like that, just recounting my experiences. I think dba's hamper development. I think programmers work better without them. I don't see a need to not allow the programmer to do the work himself, at least the one's who are good enough to do the dba's job, and there are always several. I like having a dba to maintain productions systems, upgrades, maintenance kind of stuff, but not in development. I'm sure other opinions will differ, but that's my take. No, I don't realize that, you said that it was a dangerous position to take, so I justified why I take that position, how is that unrelated? Except that I disagree with your assertion that it's a dangerous position to take.

You said you think it's counter productive, I disagree, I think it's a more productive attitude. I think having that attitude is what makes one get the work done himself, rather than depending on a dba. I think programmers who don't need nor want dba's and can do the work themselves are far better programmers. I don't sneer at dba's, that's your projection, I just think they aren't needed, at least in development.

Ah, I didn't realize you were just disagreeing with the "programmers who couldn't cut it" piece. OK, I was arguing the first distinction you made. So basically we agree, because I didn't say all dba's were programmers who couldn't cut it... I just said the ones I knew.

DbasGoneBad

Actually, I'm not sure how DBAs got into this conversation. I used the term "database experts", and the next response was about DBAs. The database experts in my work group are all programmers. I'm one of them.

I haven't worked with enough people titled "DBA" to have a strong opinion one way or the other. The ones I've met seem long on implementation-specific knowledge and short on fundamentals and theory, but the same could be said of most of the programmers that I've worked with. -- DanMuller


To you fans of heavily wrapping the DB behind API's and methods, I am curious about something. Would you still recommend it if the RelationalLanguage was consistent across vendors? If so, why? I am trying to figure out if it is the differences in vendor implementations that you are trying to protect from, or just the strings and their lack of compile-time checking, or being that it is a different paradigm than your favorite language? -- top

It's because it's strings. Even with a single perfect implementation, it'll get wrapped. Consider that the OO wrapper is likely to be an interpreter pattern that takes an abstract syntax tree of objects and creates sql out of it, the point of which is to make the sql operators into language elements rather than strings. I'd rather have to type ...
    new Or(
        new Eq("value", aVar), 
        new Eq("another value", aVar))
consider the lisp version
    (or
      (eq? "value"  aVar)
      (eq? "another value" aVar))
notice that wrapping queries doesn't necessarily weaken them, making insert, update, select, and delete into objects doesn't somehow cripple sql, it just makes them into language elements rather than strings.
    "'value'='"+aVar+"' or 'another value'='"+aVar+"'";
the object version is structured, and can be manipulated, changed, and can be translated directly to equivalent sql. It's just our way of making SQL native to the language. Strings suck.

But less than the alternative. I have not seen any SQL wrapper language that is simple, uncluttered and flexible. Sometimes there are patterns such as WhereAndAnd that one can take advantage of, but if something more complex comes along you may end up with a DiscontinuitySpike (have to rework the whole thing). Thus, I keep such wrappers light-duty so that they are easy to abandon if more complex SQL expressions are needed. They might resemble something such as:

  appendAndClause(clauseName, varName, value, validationOptions)

This simply adds an expression to a criteria clause (string). If simple appending no longer works, then I don't use such anymore and work with more direct strings. This way one can take advantage of patterns in SQL expressions, but resort to raw strings if/when needed without having to use wrappers for every other expression. Your approach seems all-or-nothing.

Boolean expressions in general don't convert to OOP API's very well.

Oh, explain to me how this is less functional ....

  new Or(
    new Eq(1,4),
    new And(
      new Eq(3,3),
      new Eq(5,5)))
than this
   (1==4)||(3==3&&5==5) 
Oh that's right, it's not, is it more too type, certainly, it's called the interpreter pattern and it let's us have mini languages directly in the code. Sql's a mini language, thus we can interpret it from objects to sql, without losing any of its capabilities, so you don't have a clue what you're talking about.

They are harder to read and expand the code by about a factor of 3. I see no net advantage to bloating up the code with such.

Because it let's us work with a language that has semantics, rather than strings concatenated together.

If they make you feel comfortable, so be it. Just don't extrapolate your personal favorites to everybody else. I find them much harder to read if there is no trivial pattern. They slow me down, period! I have seen others laugh at such code also. It might be digestable for simpler expressions, but conditionals can get pretty elaborate and are hard enough to read in their original form. See ExpressionApiComplaints.

This wildly bloated code, isn't wildly bloated, and [?] be quite easy to work with, and is used for far more than SQL, those same expressions can be passed to HigherOrderFunctions and applied to collections, and lists. That's OO using functional programming idioms, unfortunately, languages like Java and CSharp don't have expressions, so we fake them. It's harder to read because it's prefix formatted, just like lisp and scheme, but once you get used to it, it's quite natural.

[Take a look at the C# 3.0 specification, where one can write this kind of code
  table.Where(m => m.Field != 'Value')
They've added expressions to the language and a generalized query syntax (lookup LINQ in google) -- PabloMontilla? ]

One can say that about anything ugly. Anyhow, do you have a specific example of it being wonderful and shining shoes? Further, what is your recommendation for those using weak meta-abilitied languages like Java?


To answer top's questions from my own perspective, here were some of my motivations for creating a relational API wrapper for SQL:

-- DanMuller


What's wrong with pushing computation onto the db? Unless I can use domain specific knowledge to short circuit some of the computation, the optimized db routines are likely to be a lot more performant than what I would otherwise write. Yes. working with strings is a pain in the ass.. but given that the industry seems to have standardized on p_programs_ expressed as strings (at least at the programmer level) I'm hard pressed to see the disadvantage, in practice...


I'd say that this entry also points to it's corollary, especially talk about getting developers to think about things in the 'right' - i.e. OO - way. This doesn't encourage developers to consider alternative approaches, but in my experience swap from one default pattern to another.

I work with a team of programmers and designers with many years experience, but most of it OO (while my programming is mostly server-side). They have translated the rule 'database access is expensive' - to become variously 'databases aren't performant' and 'to avoid going to the database, get as much information as possible every time you do'. Our default pattern for web search screens is to go off, query every possible result (possible hundreds of rows) back from the database to the Java layer, then let the user navigate through the record set a page at a time. Given that in 95% of cases the result to user wants is on the first result page, this is a complete waste of time. Of course, the second we want to make the application allow, say, flexible reordering or filtering of the results on the page, then it's NOT a waste of time.

This isn't specifically an OO/relational thing - it would be perfectly possible to have defined things all the way through to get the Top 5, but in my experience it does stem from a combination of always wanting to use the same approach, rather than thinking what is the most appropriate approach, and secondly an ignorance about databases - i.e. how to write an efficient query to get the nth page of data from a set in a stateless system - or even knowing that you can. Lack of knowledge of what your tools offer is the biggest cause of re-inventing the wheel and inappropriate decision making.

Of course, doing an efficient nth page query might be something your framework or O-R mapping tool provides, but as with any framework, how it operates should be something you understand. A framework should be a tool to save writing repetitive code, rather than a tool you use to avoid understanding 'what lies beneath'. It's good to know that doing operation X might be badly performing, but it's better to know that it's actually causing the server to page 6 Gb of virtual memory, not just waiting to lock a row - it gives a sense of perspective.

For hand-tuned SQL (i.e. queries, multi-row updates) I recommend storing it on the database - where views won't do, most implementations of SQL support functions / procedures that can accept bind parameters and either return result sets or pointers to result sets. This allows responsibility for SQL performance to be passed off to a smaller expert team / individual, who may not even know the target language. It also allows the same query to be shared across multiple languages (useful if you need to provide .NET and Java interfaces into the same system), or the same query to be tuned for different RDBMS versions (more useful), or even RDBMS (currently difficult due to variation in implementation).

I still think most of the problem is with Java's static typing (it's a Static/Dynamic conceptual split as much as Object/Relational). Ideally you want something that can automatically extend classes when the database structure is changed (or deprecate on column or table removal) - the sort of syntax that makes Oracle's PL/SQL a very concise language - you can define a collection of %ROWTYPE, fetch into it, and iterate through it, without having to know anything about the underlying table structure.

I like your statement:

"A framework should be a tool to save writing repetitive code, rather than a tool you use to avoid understanding 'what lies beneath'."

Seconded. It's so powerful a statement that it perhaps deserves its own topic. I boldified it, if you don't mind.

This is a quote in ASP.NET 4 Unleashed, page 935, that chapped my hide:

"A huge amount of developer time is spent performing brain-dead, tedious translations between the object and relational universes. I cringe when I think of the number of hours I've spent declaring classes that contain one-to-one mapping between properties and database columns. This is time I could have devoted to going to the park with my children, seeing a moving, walking my dog, and so on."

"LINQ to SQL promises to finally enable us to put SQL to a well-deserved death. Or more accurately, its promises to make SQL a subterranean language that we never need to interact with again. (SQL is plumbing, and I am not a plumber.) This is a good thing. Death to SQL"

(end quote)

Spot on. SQL is, at best, an unpleasantly-exposed low-level protocol for communicating with certain DBMSs.

My solution would be to make app languages more like SQL (or a better query language), not SQL more like OOP. LINQ is a partial step in the right direction, but it cannot go the whole way because encapsulation inherently prevents it. RelationalBreaksEncapsulation. Yes, one could break encapsulation to "escape" into collection-orientation land with enough tweaking, but it will lead to GreencoddsTenthRuleOfProgramming. --top

Absolutely, we should "make app languages more like ... a better query language", as in (for example) TutorialDee. LINQ is also a step in that direction. Your bit about encapsulation, however, is utter nonsense.

If you like the ADA family of languages. If it's nonsense, you haven't articulated clearly why.

It's nonsense because "Relational" doesn't break encapsulation. DateAndDarwensTypeSystem demonstrates that inheritance, polymorphism, and encapsulation can all exist effectively in the context of a pure implementation of the RelationalModel.

Not as OOP is currently being used in most shops. Further, I vote to separate the type system from the relational engine (DoesRelationalRequireTypes). I know others prefer a kind of GodLanguage approach that integrates everything, but that prevents mix-and-match progress, experiments, and sharing in my opinion. Let's not rehash those debates here, just offer links for those interested in those debates.

You know how OOP is currently being used in most shops?

As an informal observation, yes. I've taken no formal survey. If your personal observation suggests otherwise, you are free and welcome to state so.

My personal observation suggests otherwise. Furthermore, my personal observation here is that except for rare and brief rays of lucidity, most of your "observations" are heavily coloured by what you wish was true, rather than what is true.

You mean my observation that OOP is more popular currently than data-driven techniques really means that deep-down I prefer OOP? This makes no sense. Anyhow, we are all subject to bias, bias being a human trait. Thus, you are not immune. CommonSenseIsAnIllusion.

Huh? How could you possibly derive that from what I wrote, and where have you acquired the apparent strange (and incorrect) notion that OOP and data-driven programming are an "either/or" relationship?

I'll get back to the "bias" discussion another day. As far as either/or, it is because they have fundamentally different underlying philosophies and base assumptions. In short, RelationalBreaksEncapsulation.

Utter nonsense. You are viewing the world through a foggy, distorting haze of your own peculiar biases and wishes. RelationalBreaksEncapsulation is a dire soup of misinterpretation and misunderstanding except for the clear bits written by yrs trly. As I noted above, DateAndDarwensTypeSystem demonstrates that inheritance, polymorphism, and encapsulation can all exist effectively in the context of a pure implementation of the RelationalModel.

Most OOP shops make the so-called FirstGreatBlunder, and this creates many of the mismatch problems. I am approaching this from the perspective that I am one of many developers, probably with zilch rank, and that I likely could not change the practice of FirstGreatBlunder. Thus, my statement is essentially correct in such an environment.

The fact that some shops code badly does not diminish my point -- or make your claim correct -- by even one iota.

I'm also skeptical that DateAndDarwensTypeSystem would be practical, but that's secondary.

I'm giving your skepticism, accompanied by no evidence whatsoever, precisely the weight it deserves: None.

Can you point to a successful production system that uses DateAndDarwensTypeSystem?

Depending on your definition of "successful" and "production", yes: The RelProject is the first successful implementation of the majority of the version 3 specification, though Alphora's Dataphor was apparently strongly influenced by an earlier version. The professional DBAs to whom it has been demonstrated have applauded it for eliminating accidental erroneous joins between unrelated types (e.g., invoice number joined to customer number) and for trivially handling multiple units in a single type. E.g., a LENGTH type can handle both metres and feet, or a TEMPERATURE type that can represent Fahrenheit, Kelvin, and Celsius. Programmers to whom it has been demonstrated have applauded it for supporting MultipleDispatch and workable specialisation by constraint.

I didn't mean implementation of a language that supplies the concepts, but rather a production domain application built with such a language. That's where most of the complaints about PerniciousIngrownSql come from.

I am aware of two such applications. One is used for birdwatching. The other was part of an evaluation comparing TutorialDee to SQL done by one of my students, based on a real-world scenario I supplied. In the latter, DateAndDarwensTypeSystem worked effectively and resulted in an implementation that was subjectively considerably more readable (hence, more easily maintained and perhaps less likely to result in user/developer error) than the SQL equivalent.

As far as joining invoice number to customer number, those kinds of bugs are usually caught early in testing of an app. It's "almost right" implementations that tend to cause the biggest bugs. I suppose it could be useful for ad-hoc querying for one-off requests, although it would have to be weighed against the "bloat" of most type systems. Further, a built-in natural join mechanism or "common join catalog" (AlternativesToNaturalJoins) could also prevent problem joins.

Any so-called "bloat" will be offset by the clarity of the schema design, and the elimination of any need for a "common join catalog".

But how is customized column types related to breaking encapsulation? It's not really a problem at the column-level.

There is no breaking of encapsulation. That's my point. Encapsulation, however, is only meaningful in the context of column types. It has no meaning at a relation or tuple level unless you're deliberately trying to make the FirstGreatBlunder, and even then it's questionable.

But a practical buggaboo at the column level is how does RDBMS type info "get into" an existing app language? This is one reason why "type dumb" scriptish languages seem to work better for RDBMS interaction.

The canonical<->custom type value conversions that must occur between the RDBMS and client language are trivially specified. DateAndDarwensTypeSystem defines explicit mechanisms by which values in a given type may be instantiated (or "selected", in Date & Darwen's parlance) from values of more primitive types, or by which the explicitly exposed (i.e., not violating encapsulation) component values of a given type's value may be extracted. That said, research is being done into converting RDBMS type definitions into native client language definitions, in a manner not unlike the object broker mechanisms of CORBA.


Meanwhile, PerniciousIngrownRun-onORMBuilder-NotationCalls are now widely considered a "GoodThing".


See also: ScatterSqlEverywhere, HelpersInsteadOfWrappers, SeparationAndGroupingAreArchaicConcepts
JanuaryEleven

CategoryAntiPattern

EditText of this page (last edited January 27, 2011) or FindPage with title or text search