Some basic questions about OODBMS architecture:
Do all OODBMS systems have a common set of underlying principles? (In the same way as relational DBMS have underlying principles) By "common set of principles" I mean roughly the same architecture.
If OODBMS don't have a common set of underlying principles, are there groupings of OODBMSes that work similarly? What are the names of those groupings?
Who are the major commercial players in the OODBMS market, and how do they compare given answers to the above questions?
Any thoughts or articles that might answer these questions appreciated.
is a real object-oriented database. -- EricUlevik
[I'm trying to get at what an OODBMS is here, rather than how to wrap objects up in an RDBMS.]
Read the CrossingChasms patterns by KyleBrown. He deals with the use of relational database systems to support persistence and how to handle the various problems that one runs into in mapping object structures onto an RDBMS. An OODBMS is essentially intended as a solution to the same problem by removing the RDBMS entirely and therefore getting rid of the mapping problem.
So you could say that the underlying principle of an OODBMS is to provide a one-to-one mapping between in-memory object structures and persistent object structures. Of course this may not be achievable in all cases and this is not necessarily a panacea for persistence.
Do all OODBMS systems have a common set of underlying principles? (in the same way as relational DBMS have underlying principles)
I don't think so. Off the top of my head, there are at least 3 approaches to building an OODBMS and they all work somewhat differently. The 3 I can think of are:
- RDBMS vendors coming out with OO extensions. For example, Informix's (formerly Illustra's) "universal server." Back when it first came out (I haven't looked recently), the whole thing was built around the notion of "data blades." What a data blade amounted to was a map from a data type to the integers, i.e. it provided a total order on the data type so that all the traditional things an RDBMS does, such as sorting and indexing, could be done.
- Frame-based systems. Lots of work in AI is done on knowledge representation. One of the central notions, first proposed in the early 1970's, is that of "Frames." A frame turns out, to a good approximation, to be an object that has no behavior but has very explicitly stated axioms (e.g. many frame systems have a constraint language that is very close to first order logic). Frame representation systems tend to have very powerful reasoning capabilities but limited scalability (and awkward query languages).
- "Pure Object" DBMS. These (Versant, ObjectStore, Poet et al) grew out of a desire to provide persistent objects. These things work by hooking into the OS virtual memory mechanisms and mapping pointer types to indexes. Some of them are code-intrusive, some aren't, but none of them scale very well because virtual memory doesn't scale very well.
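To make the "map from a data type to the integers" idea in the first bullet concrete, here is a minimal sketch. It is not Informix's data blade API; the `IPv4` type, `order_key` function, and `OrderedIndex` class are all hypothetical names invented for illustration. The point is only that once a custom type has an integer ordering, an ordinary sorted index can answer range queries over it.

```python
import bisect
from dataclasses import dataclass


@dataclass(frozen=True)
class IPv4:
    """A custom data type the database knows nothing about (hypothetical)."""
    a: int
    b: int
    c: int
    d: int


def order_key(ip: IPv4) -> int:
    """Hypothetical 'blade' function: map the custom type to an integer."""
    return (ip.a << 24) | (ip.b << 16) | (ip.c << 8) | ip.d


class OrderedIndex:
    """A sorted index that only needs an integer key for each value."""

    def __init__(self, key):
        self._key = key
        self._keys = []     # integer keys, kept sorted
        self._values = []   # values, kept parallel to _keys

    def insert(self, value):
        k = self._key(value)
        pos = bisect.bisect_left(self._keys, k)
        self._keys.insert(pos, k)
        self._values.insert(pos, value)

    def between(self, low, high):
        """Return all values whose key lies in [key(low), key(high)]."""
        lo = bisect.bisect_left(self._keys, self._key(low))
        hi = bisect.bisect_right(self._keys, self._key(high))
        return self._values[lo:hi]


index = OrderedIndex(order_key)
for ip in (IPv4(10, 0, 0, 5), IPv4(192, 168, 1, 1), IPv4(10, 0, 0, 1)):
    index.insert(ip)

in_ten_net = index.between(IPv4(10, 0, 0, 0), IPv4(10, 255, 255, 255))
```

Here the index never inspects the fields of `IPv4`; it relies entirely on the integer key, which is roughly the property the bullet attributes to data blades.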
Now, in addition to these three approaches to building an OODBMS, there are also people who use an extra software layer to allow RDBMS's to deal with objects. There are lots of approaches to this. The simplest, and most common, notion is to define a map using the following heuristic: a class is a table and an instance is a tuple.
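The "a class is a table and an instance is a tuple" heuristic can be sketched in a few lines. This is only an illustration under simple assumptions (flat classes, no inheritance, no references between objects); the `Person` class and the `create_table`, `save`, and `load_all` helpers are hypothetical, and real mapping layers have to deal with much more.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class Person:
    name: str
    age: int


def create_table(conn, cls):
    """One table per class, one column per field."""
    cols = ", ".join(f.name for f in fields(cls))
    conn.execute(f"CREATE TABLE IF NOT EXISTS {cls.__name__} ({cols})")


def save(conn, obj):
    """One row (tuple) per instance."""
    cls = type(obj)
    marks = ", ".join("?" for _ in fields(cls))
    conn.execute(f"INSERT INTO {cls.__name__} VALUES ({marks})", astuple(obj))


def load_all(conn, cls):
    """Rebuild instances from rows, in insertion order."""
    cols = ", ".join(f.name for f in fields(cls))
    rows = conn.execute(f"SELECT {cols} FROM {cls.__name__} ORDER BY rowid")
    return [cls(*row) for row in rows]


conn = sqlite3.connect(":memory:")
create_table(conn, Person)
save(conn, Person("Ada", 36))
save(conn, Person("Grace", 45))
people = load_all(conn, Person)
```

The sketch works only because `Person` is flat. As soon as an object holds references to other objects, or classes inherit from one another, the mapping layer has design decisions to make.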
A very good book that discusses a lot of the design decisions involved in this approach (and offers a fairly nice discussion of a working system) is Building Scalable Database Applications.
C.J. Date has also recently written a book on integrating objects into relational databases.
Chris Date once recounted to me a conversation he had with Dave Childs, the inventor of set-theoretic databases and a common acquaintance.
Childs: (goes on at length about set theoretic stuff) "Do you understand what I'm saying?"
Date: "Only the words."
has questions about ObjectsVsRdbmsPerformance
The primary difference between object databases and relational databases relates to keys. Relational databases explicitly key tuples with tuple data, whereas object databases usually have to generate synthetic keys as persistent equivalents of memory addresses. Explicit keys in tuples allow meaningful subsetting to be done inline with the query, whereas a synthetic key, being unrelated to the data, cannot support this.
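A small sketch of the distinction, using plain dictionaries as stand-ins for the two kinds of store. Everything here (`orders_by_key`, `Order`, `persist`, the OID counter) is hypothetical and invented for illustration; no real product works exactly this way.

```python
import itertools

# Relational style: the key is made of tuple data (customer, order number).
orders_by_key = {
    ("acme", 1): 250.0,
    ("acme", 2): 90.0,
    ("zenith", 1): 40.0,
}

# Subsetting can be done by looking at the keys alone.
acme_keys = [key for key in orders_by_key if key[0] == "acme"]


class Order:
    def __init__(self, customer, number, total):
        self.customer = customer
        self.number = number
        self.total = total


# Object style: each object gets a synthetic identifier (OID) that stands in
# for its memory address and carries no information about its contents.
_next_oid = itertools.count(1)
objects_by_oid = {}


def persist(obj):
    oid = next(_next_oid)
    objects_by_oid[oid] = obj
    return oid


for (customer, number), total in orders_by_key.items():
    persist(Order(customer, number, total))

# The OIDs 1, 2, 3 say nothing about the customer, so the same subset
# requires dereferencing every object (or maintaining a separate index).
acme_oids = [oid for oid, order in objects_by_oid.items()
             if order.customer == "acme"]
```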
So, a relational database tends to have greater functional power than a pure object persistence layer. The result is that we tend to use hybrids, which causes pain and pollution of pure object architectures such as EJB.
I have a lot of experience with several OODBMS products, but the large ones include:
1. Versant: Very powerful, but a little wonky.
Versant does work and scales very well, but it requires some time and knowledge to properly install, configure, and distribute it. Coding in all of their language bindings is completely transparent. Versant supports extents (in effect, a collection of a particular kind of class), and root named object lookups.
2. POET ObjectServer Suite: Very powerful, but not the most powerful of all OODBMS products on the market.
POET's strength comes from its extreme ease-of-use and great backwards compatibility. With POET, you can use ODBC to connect to the database and query it using SQL or OQL. Very nice. POET scales well, up to a point. Its biggest scalability bottleneck is that it doesn't include distributed databases yet. I've worked with POET a lot and their folks tell me that a federated database design is coming down the line. Coding in all of their language bindings is completely transparent. POET supports extents (in effect, a collection of a particular kind of class), and root named object lookups.
POET is inexpensive, and really easy to use. If you're a Java or C++ programmer, I highly recommend taking a look at POET as your first foray into OODBMS products.
3. Objectivity/DB: Very powerful, scalability is best here. However, it's a bit more complicated than other OODBMS products because it is so flexible.
I've used Objectivity with Smalltalk development and I love it. Objectivity also supports Java and C++. Their language bindings are transparent, but some more so than others. The Smalltalk transparency is there if you ignore their generated subclass code, which is easy enough to do. You never have to touch their generated code. Java transparency is achieved - like with the other products - by munging byte-codes in the class file.
Objectivity has, in my experience, the most scalable architecture of all the mentioned products. They use the concept of a "Federation" of databases, where each database can contain a cluster of object Containers. Locking is controlled by this distribution of databases. So, in effect, you can isolate high insert/update containers from high query containers. Objectivity also supports what they call MROW transactions, which allow object lookups and navigation and object update/writing at the same time. POET does something similar; Versant is much more complex in this category, but to no great advantage.
The biggest pitfall with Objectivity is that one must really think through how best to design and distribute one's databases and their respective containers to gain maximum performance. This is true of all OODBMS products, but with Objectivity there is so much flexibility it can be a little daunting to approach the problem. However, finding an adequate solution is almost always possible and implementing it is then very straightforward.
The complaint I lodge against Objectivity is that they do not support Extents. The only way you can get at an object graph is via a named root entry. This is fine, and almost all systems fit into this paradigm of storage and lookup, but Extents are helpful some of the time. Container scanning is a close analog in this case, but isn't exactly the same thing (unless you only have one object class per container).
Objectivity/DB has great tools and their documentation is fantastic.
4. ObjectStore: Works, but isn't very compliant with standards. Very geared toward C++.
I've only evaluated ObjectStore, and I'm not that impressed. NuffSaid. If others disagree, let me know.
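The product notes above mention two ways of getting at persistent objects: extents (all instances of a class) and named root lookups (a named entry point from which you navigate the object graph). The toy in-memory model below shows the difference. It is not the API of Versant, POET, or Objectivity; `ToyObjectStore`, `Employee`, and every method name are hypothetical.

```python
class ToyObjectStore:
    """In-memory stand-in for an object database (illustration only)."""

    def __init__(self):
        self._extents = {}   # class -> list of persisted instances
        self._roots = {}     # name -> root object

    def persist(self, obj):
        self._extents.setdefault(type(obj), []).append(obj)
        return obj

    def extent(self, cls):
        """Every persisted instance of cls, however it is linked."""
        return list(self._extents.get(cls, []))

    def bind_root(self, name, obj):
        self._roots[name] = obj

    def lookup_root(self, name):
        """A named entry point; everything else is reached by navigation."""
        return self._roots[name]


class Employee:
    def __init__(self, name, reports=None):
        self.name = name
        self.reports = reports or []


store = ToyObjectStore()
bob = store.persist(Employee("Bob"))
alice = store.persist(Employee("Alice", [bob]))
store.bind_root("ceo", alice)

# Extent: ask for all Employees without knowing how they are linked.
everyone = [e.name for e in store.extent(Employee)]

# Named root: start at "ceo" and follow references.
first_report = store.lookup_root("ceo").reports[0].name
```

With only named roots, an object that no root can reach by navigation is effectively invisible to application code, which is one reason extents are helpful some of the time.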
Performance and Relational vs. OODBMS
Ok, here is my position: for almost every object-oriented system I've written, an OODBMS is a better match than any relational database.
I base this on the following criteria:
- Lookups and queries (not SQL or external query requirements, but for my application code.)
- Insert and updates.
- Ease of use and the development leverage gained therefrom.
The single biggest benefit one can get from using an OODBMS is that one's object-oriented code, be it C++, Smalltalk, or Java, will
- be cleaner
- be, on average, faster
- be easier to read and extend
- be properly designed; not compromised for the benefit of a relational database
Why not use the products that are going to help get you into production that much sooner? I cannot see the advantage of using an RDBMS. Business folks shouldn't be running SQL anyway. It's time we moved past that.
Today one can acquire hardware that easily accommodates hundreds of millions of objects in main memory cache, for a few days of development cost. If we held our objects in main memory, writing through to a slow persistent medium as required, it would seem that most business-type applications could completely forgo a relational database. And be much faster. And easier to build. And understand. And maintain...
-- Tim Biernat
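One way to picture "objects in main memory, writing through to a slow persistent medium" is an append-only log that is replayed at startup. The sketch below is an assumption-laden illustration, not a description of any product: `WriteThroughStore` is a hypothetical class, values must be JSON-serializable, and there is no concurrency control, compaction, or crash-safety beyond appending a line per write.

```python
import json
import os
import tempfile


class WriteThroughStore:
    """All reads come from memory; every write is also appended to a log."""

    def __init__(self, path):
        self._path = path
        self._objects = {}
        # Rebuild the in-memory state by replaying the log, oldest first.
        if os.path.exists(path):
            with open(path) as log:
                for line in log:
                    key, value = json.loads(line)
                    self._objects[key] = value

    def get(self, key):
        return self._objects[key]

    def put(self, key, value):
        # Write through to the log first, then update memory.
        with open(self._path, "a") as log:
            log.write(json.dumps([key, value]) + "\n")
        self._objects[key] = value


path = os.path.join(tempfile.mkdtemp(), "objects.log")
store = WriteThroughStore(path)
store.put("account:1", {"owner": "Ada", "balance": 100})
store.put("account:1", {"owner": "Ada", "balance": 75})

# A fresh instance recovers the latest state from the log.
reopened = WriteThroughStore(path)
```

Queries in such a design are ordinary in-memory traversals; the persistent medium is touched only on writes and at startup.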