Objects And Data Are Separate

Extracted from ObjectRelationalImpedanceMismatchDoesNotExist.

One argument is that objects are about behaviour, not data, so there can be no mismatch. Associating an object with a specific data representation may make an OODBMS easy, but it is a design mistake, much as putting behaviour in stored procedures is a design mistake.

Whoever you are, please do elaborate on how "associating an object with a specific data representation" is a "design mistake", especially as it pertains to OODBMSes. To the best of my knowledge, the "data representation" that GemStone uses to store objects on disk is approximately the same "data representation" that VirtualMachines use to store objects in memory. Does this make Java a "design mistake"? (I might not disagree with you there ;-) Does this make Smalltalk a "design mistake"? -- RandyStafford


This view separates behaviour and data completely. Data are seen as a sea of attributes connected together by relationships. Behaviours wash over the attribute sea, changing state whenever a behaviour has changed the sea. It's not relevant which object(s) have changed state, because in any relatively complicated system it ends up being very difficult to assign data directly to a particular object. If you can [?], the behaviours tend to be uninteresting and the object is basically just a data encapsulator, which isn't much of an object at all.

Take, for example, a person. A person has a great number of roles: multiple roles at work, a role at the hospital where they were born, a role at the grocery store, a role with the company from which they bought their house, and so on. The number of contexts a person is involved with is truly huge.

So what is a person? Commonly, we kind of cheat and have every context keep its own version of the person, so it may not be an issue. But really, a person is the identity that is the nexus of the relationships in which the person is embedded. A person has no real intrinsic data or behaviours; a person is identity, and everything else is generated by relationships.

The behaviours belong to the objects implementing the relationship, and data will often span relationships. Sure, you can make a data encapsulator and call that an object, but it's not a very satisfying object IMHO. Multiple objects may mutate a person's location attributes, but in what object does a person's location belong? Position is indexed by time, so it may not belong in the base person object; again it is a function of a relationship, even if in this case the relationship is between space and time.


<< Comments regarding normalization withdrawn: On further inspection (plus a thorough re-read of the laws), it appears the problems I had in mind as a model are not in fact normalization issues, but are instead caused by poor database design. >> -- SeanMcNamara

No, the laws of normalization are there to stay in the relational model, and you don't need any 'relaxation'. This is a logically flawed argument. Once you relax the laws of normalization, you have a flawed relational schema. What exactly bothers you about the normal forms? -- CostinCozianu


I think Costin is referring to the separation of data and function so the two may vary with respect to each other. This is the most common refactoring away from monolithic/homogeneous execution/persistence environments like GemStone. Accept this fact once, for the persistent objects, and call them mementos. They can be decorated with their functionality as they enter and leave the execution environment. Pretending the execution environment is persistent is denying reality to some extent, and taking the pain at the database once can make things a lot easier. See ValueObject (and StateObject). Of course, this could be another tangential leap :). -- RichardHenderson

Well, that's exactly what I think of GemStone and most ObjectDatabases that I know of: they are persistence toolkits. Some of them have to modify the execution environment of the client application in order to do all kinds of tricks. They are by no means databases as described in DatabaseDefinition.

That is patently false. GemStone is, in every respect, a database management system as described in DatabaseDefinition. The databases it manages are, as Date defines, "collection[s] of persistent data that [are] used by the application systems of a given enterprise". The data definition languages used by GemStone are ObjectOrientedProgrammingLanguage's (namely, the SmalltalkLanguage for GemStone/S and the JavaLanguage for GemStone/J). These ObjectOrientedProgrammingLanguage's are also the data manipulation languages within GemStone. Both GemStone/S and GemStone/J provide for "data security and integrity". Both provide ACID transactions, with recoverability from transaction logs up to the last commit before a crash, and with backup and restore capabilities on the database itself. Both allow for the establishment of security constraints on persistent objects based on the identity of the accessing user. In GemStone/S security is based on a fixed set of privileges that can be granted or denied users, and on the fact that every persistent object belongs to a "segment" whose accessibility can be restricted on a user or group basis (see http://wiki.cs.uiuc.edu/VisualWorks/DOWNLOAD/papers/GSAdminNotes.zip). In GemStone/J security is based on the JavaSecurityArchitecture (http://java.sun.com/j2se/1.3.0/docs/guide/security/). Both contain a "data dictionary" and tools to browse it: in GemStone/S it is the set of all objects reachable from the global collection AllUsers, whose constituent Users each have a SymbolDictionary containing all persistent objects resolvable by name by that user (including classes, which persist as first-class objects in the database). In GemStone/J the "data dictionary" is a well-defined JNDI tree containing all persistent objects committed into the database and the classpaths defining all classes that have been deployed into the server. Furthermore, the reflective capabilities of the ObjectOrientedProgrammingLanguage's that serve as data manipulation languages are available for data dictionary purposes.

On the point of execution environments, it may be of interest to you to note that, from a process architecture perspective, GemStone and Oracle are very similar. Both have processes to move disk pages between disk and memory, for example. The main difference is that in GemStone's case, the disk pages contain binary "data representations" of object state, as previously discussed, whereas in Oracle's case they contain rows of tables. More to the point, both contain server-side processes for executing "application" code - object methods, in GemStone's case, and stored procedures, in Oracle's case. Personally I don't consider this server-side code to be "client" code in the traditional sense, so Costin's remark is off the point. True, it could be regarded as "client" code of the "database", but on the other hand, it is deployed into the DBMS and, in GemStone's case, due to object encapsulation, it is one with the persistent objects. Richard's comments about "decorating" "mementos" are conceptually true, but are hidden from the application programmer by the way class loading and method dispatch work in the ObjectOrientedProgrammingLanguage's that serve as the "data manipulation languages" in GemStone. In this regard running server-side application logic in GemStone is no different than running application logic in any other VirtualMachine; except for the fact that the movement of objects into that virtual machine is managed by the DBMS (as is movement of data into the server-side Oracle processes that run stored procedures).

This avoidance of the separation of data and function, in GemStone's case, is exactly the point. It allows us to refactor a persistent object model with minimal impact. I can delete or change the types of instance variables and still preserve an object's contract to its clients. By contrast, if I change a table definition, by deleting columns or changing their types, I then have to go find every stored procedure that uses that table and make the corresponding changes. -- RandyStafford 2001/07/23
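The refactoring claim above can be sketched roughly as follows. This is only an illustration in Python, not GemStone code; the class names PersonV1 and PersonV2 and the client function can_vote are hypothetical. The stored representation changes from an age field to a birth date, while the age() contract seen by clients stays the same.

```python
from datetime import date


class PersonV1:
    """Original representation: the age is stored directly."""

    def __init__(self, age: int) -> None:
        self._age = age

    def age(self) -> int:
        return self._age


class PersonV2:
    """Refactored representation: the age field is gone, a birth date is stored."""

    def __init__(self, born: date, today: date) -> None:
        self._born = born
        self._today = today  # injected so the example is deterministic

    def age(self) -> int:
        # Subtract one year if this year's birthday has not happened yet.
        before_birthday = (self._today.month, self._today.day) < (
            self._born.month,
            self._born.day,
        )
        return self._today.year - self._born.year - int(before_birthday)


def can_vote(person) -> bool:
    """A client that depends only on the age() contract (18 is an assumed threshold)."""
    return person.age() >= 18
```

The client can_vote works unchanged against either class, which is the sense in which the object's contract survives a change of instance variables.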

What you say is fair enough, except that you forgot to specify the ways that might be usable within Gemstone to achieve data integrity, an essential function of the database. Not that Oracle really abides by the relational model here (it is in deep suffering too), but it does its best, and we can hope for better in the next versions. I can hardly see how you take a general purpose programming language, even worse, an imperative style programming language, and make it a good data manipulation language.

As far as I know, neither Smalltalk nor Java supports logical operations on sets; neither allows a declarative approach to data manipulation that would let an optimizer choose the most efficient course of action; and neither is handy for specifying or enforcing declarative integrity constraints. Maybe GemStone/Eiffel would have been closer here. -- CostinCozianu


Then we agree in substance. It can easily be demonstrated that systems designed to be isomorphic to the underlying database architecture can be optimal "simply" by eliminating the ImpedanceMismatch, designing it out. We may have limited control of the database architecture, but we have complete control of the higher level design. Thus the knot is moved out of the system. -- RichardHenderson.

Gosh, you always jump to conclusions. Database systems that are 'isomorphic' to, or designed to fit, the underlying physical architecture have been logically flawed ever since Codd published his 1970 paper 'A Relational Model of Data for Large Shared Data Banks'. This is what databases deal with: LARGE and SHARED data banks. If your problem deals with data which is not large or not shared, then by all means use whatever pleases you (even a file system is the best solution in many of these cases), but don't complain about ObjectRelationalImpedanceMismatch.

Please also note that even the ODMG 3.0 model is not designed to be 'isomorphic' with the underlying physical architecture; on the contrary, it is specified only at a logical level (with two exceptions: object identity and transaction handling). -- CostinCozianu

I pruned it. We are fencing over words, I think, Costin. Work with me a bit. -- RIH


I'm having trouble with the core of your argument. Aren't objects the combination of attributes and behavior? If I have no attributes, and simply behavior, what I have are functions. It is the aggregation of the collection of attributes and their behavior that are classified as Objects so I don't see how you can suggest we remove one-half of the required criteria.

Required by whom? There is data. No one is saying there isn't data. But why does it physically need to be bound to an object? If the object can deliver on its interface contract then why do you care if the attributes are physically contained with the object? If we had a separate data model operated on by objects you could not tell the difference.

Required by definition. You do not need to combine data and functions, but if you do not, don't call the result an object.

Arguments from anonymous authorities are always persuasive. Why you care where the data an object uses resides is not clear. Isn't it more important that the object meets its interface contract? If you ask for a person's age and get it, why does it matter whether it's a member variable or comes from elsewhere?

The reasons are "Cohesion" and "Coupling." It is true that at one level, a user does not care how something is coded, but on another level, he cares how reliable and maintainable the resulting software is.

I think there is some confusion here. The statement being made in the italicized segment above is what I was trying to say in my segment prior to that. Objects are defined as the combination of attributes and associated behavior for a given entity of a given type (Class). I believe that is the widely accepted definition for an object. Of course it is important that an object obey its interface, but an interface is exactly that, a contract for interactive behavior, and is not by definition an object itself. Where an object stores its attribute data is not at issue, as it could be stored within a database, or within memory, without affecting the fundamental nature of an object, provided the object itself has all the information required to extract its state without input from an external source. -- SeanMcNamara

The bit about an object having data is completely redundant. An object says what it provides in its interface. Whether it contains data is only important if you allow public access to data members. -- AnonymousDonor

Where are you getting this definition from? I've never heard that an object IS its interface, since that would make the two terms redundant, leaving no reason for both. In counterpoint, here are some definitions that have been offered:

	From Jacobson: Object-Oriented Software Engineering:
	"An object is characterized by a number of operations and a state which remembers the effect of these operations."

	From Booch: Object-Oriented Design with Applications:
	"An object has state, behavior and identity; ... the term instance and object are interchangeable."

Both of these definitions include the notion of state, which is what I believe we are talking about in terms of the object's data. -- SeanMcNamara

Objects have state, state that is specified by the interface contract. But there is nothing that says that the state and the object must be stored together. The object merely has to meet its interface contract. The data model and object model can be completely separate and still meet the behavioural promises of an object. If a Manager object says it can return the age of the manager, the age does not have to be an attribute of Manager or even be clearly related to Manager. As long as Manager knows how to get its age, that's all that counts from a program point of view. If that age attribute is part of some data model that doesn't even know about Managers, it doesn't really matter. -- AnonymousDonor

Hey AnonymousDonor: you're ignoring the point. The definition of "object" in ObjectOrientedProgramming is the encapsulation of state and behavior. It doesn't matter whether objects and data can be separate. -- RandyStafford

The key to an object is that it owns its state and no other object (or anything else) accesses that state without going through the contract. To enforce this encapsulation of state and behavior, it is very convenient to implement state and behavior together in the same place. In fact, it is so hard to avoid breaking encapsulation when there isn't a strong link between state and behavior, I doubt it occurs in any system built with more than 4 programmers with a single turnover. -- MarkAddleman

Anonymous, how would you describe an object (say Person) that has a few methods (celebrateBirthday:void, canApplyForDriversLicense:boolean) that affect its internal state, but make use of an internal attribute (age) that is not specifically included in the interface? An object may have state not directly accessed through its exposed interface. -- SeanMcNamara
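The Person described here might look like the following Python sketch. The method names are transliterated from the ones in the question; the DRIVING_AGE threshold of 16 is an assumption made purely for illustration.

```python
class Person:
    """State (age) is internal; the interface exposes only behaviour."""

    DRIVING_AGE = 16  # assumed threshold, purely for illustration

    def __init__(self, age: int) -> None:
        self._age = age  # not part of the public interface

    def celebrate_birthday(self) -> None:
        # Changes internal state without ever exposing it.
        self._age += 1

    def can_apply_for_drivers_license(self) -> bool:
        # Answers a question that depends on the hidden state.
        return self._age >= self.DRIVING_AGE
```

No caller ever reads the age, yet the answers it gets depend on it, which is the sense in which the object has state that is not part of its interface.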


A nice philosophical point, but that doesn't change the facts of implementation.

Clearly it does. The difference is dramatic. The attribute physically being part of an object is radically different than an object operating on a separate data model. -- AnonymousDonor


CostinCozianu has a good point in that it would be possible to hold state in the database and have behavior in the code.

To do this with a relational database, each object instance would hold only the primary key values for the record it represents. "Property get" methods for primary key values could directly return the values, but "property get" methods for all other attributes would have to issue a SELECT statement to the database. Likewise, "property set" methods would issue UPDATE statements. You can't update primary keys; this is a relational database rule - because primary keys "identify" the records / object instances. This would support having many physical object instances in memory for the same business object; "object identity" is the concatenation of class/table id and the primary key values.

This would be like using the HandleBody or EnvelopeLetter patterns - the objects in memory are "only" stateless pointers to the "body" - the record holding the real data in the database.
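A minimal sketch of this arrangement, using Python's sqlite3 module with an in-memory database. The person table and the PersonHandle class are hypothetical names invented for the example; the handle stores only the primary key, reads go through SELECT, writes go through UPDATE, and equality is defined as class plus key.

```python
import sqlite3


class PersonHandle:
    """In-memory handle that stores only the primary key of its row."""

    def __init__(self, conn: sqlite3.Connection, person_id: int) -> None:
        self._conn = conn
        self._id = person_id

    @property
    def id(self) -> int:
        # The key is returned directly and deliberately has no setter.
        return self._id

    @property
    def name(self) -> str:
        row = self._conn.execute(
            "SELECT name FROM person WHERE id = ?", (self._id,)
        ).fetchone()
        return row[0]

    @name.setter
    def name(self, value: str) -> None:
        self._conn.execute(
            "UPDATE person SET name = ? WHERE id = ?", (value, self._id)
        )

    def __eq__(self, other: object) -> bool:
        # Identity is class plus primary key, not the in-memory address.
        return isinstance(other, PersonHandle) and other._id == self._id

    def __hash__(self) -> int:
        return hash((PersonHandle, self._id))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO person (id, name) VALUES (1, 'Ada')")

first = PersonHandle(conn, 1)
second = PersonHandle(conn, 1)  # a second physical instance for the same row
second.name = "Ada Lovelace"    # visible through `first` as well
```

Every attribute read costs a query here, which is the kind of per-access overhead the performance concern in this thread is about.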

It would be possible to do this, in theory. But in real-world implementations using current relational databases, the performance impact would be very dramatic. -- AnonymousDonor2

Not to mention the additional development-time effort that would be required to design and program this way and ensure a functionally correct, scalable, performant, multi-user safe architecture. -- RandyStafford

I agree that despite the performance problems cited above, this approach is certainly possible, acceptable, and could perhaps even be warranted in certain situations; however, this is simply an implementation detail. If that is what is being suggested, this page should be renamed ObjectMethodsAndObjectDataAreSeparate, since the Object is still (per all the definitions provided) the combination of state and behavior (i.e. attributes and methods). -- SeanMcNamara

The above argument about performance is gratuitous, and it is supported at most by anecdotal facts. Your arguments on ObjectRelationalImpedanceMismatch are an eloquent example: you can't blame relational databases for you not being able to use them, and not having a sound design, and not having a specialized DBA on the job.

So you're telling me that I don't know how to use relational databases, and that I haven't had sound designs, and that none of these projects have had a specialized DBA on the job (how the hell would you know?), and that I should throw away my past decade of experience in favor of your specious arguments? Get real. Your arrogance is unbelievable. -- RandyStafford

<quote> caveat: these were anecdotal observations, not a scientific study, and no effort was spent doing tuning and optimization in either persistence mode </quote> Likewise, the same applies to the other example, which was built on J2EE, making matters worse (does this mean including EnterpriseJavaBeans?). If you are quoting meaningless performance figures (2x, object per second) and pretend them as arguments, I don't know who is really to blame. For something reliable and verifiable, you can visit the link below.

Just because the observations were anecdotal does not make them meaningless. On the contrary, they are quite meaningful, and as arguments they are intended, not pretended. The fact is that every time I have seen an ObjectOrientedProgrammingLanguage DomainModel mapped to a relational database in my experience of the last decade, it has been expensive, painful, slow, and ridiculously consumptive of time and effort.

 Patient: "Doctor, it hurts when I do this (patient makes rotating motion with right shoulder)."
 Doctor: "Then don't do that"
Unless the doctor is CostinCozianu, in which case he says "oh, you're not doing it the right way - I can't tell you what the right way is, but you have to disregard all other opinions except mine". -- RandyStafford

On the other hand, I know a lot of anti patterns related to the OO prejudice that the database should be some kind of an extension of the runtime environment, and its sole and unique mission is to provide object persistence. Such anti patterns are abundant in OO circles, see EnterpriseJavaBeans for example. Empirically I found your views very close to this trend. Of course you get the worst performance approaching database application development this way. Chances are that you approached the development of the mentioned application as such. And then you blame it on ObjectRelationalImpedanceMismatch.

Yes, I do blame the performance problems I've referenced on the ObjectRelationalImpedanceMismatch. And if you've bothered to familiarize yourself with the content on this Wiki before so obnoxiously jumping in, which I doubt, you know that I'm not a fan of EntityBeans (see EntityBeansAreEvil, for example). So here's your big opportunity to enlighten me, Costin (or should I wait for your authoritative book on the subject - where is it, BTW?). Funny that I've been successful building object-oriented systems all these years without the benefit of enlightenment by your authoritative views on the subject. Please do tell me, if I want to use an ObjectOrientedProgrammingLanguage to build an application with persistent storage requirements, what is your esteemed approach to that problem? Or do you not advocate using ObjectOrientedProgrammingLanguages at all? -- RandyStafford

Randy, for how many more times do I have to remind you that WikiWritersDontGetPaid, so you'll have to be more patient with me? I don't intend to write a book as there are already way too many bad books on the subject. And a lot more are on the way. Add to that official documentation, marketing propaganda material and what-not, and you'll see why ANYBODY at all should not feel like he should be writing a book !!!

But there are already people with more experience and more theoretical vision than you and me combined who have already written books on the subject. And I wait for their future work, while I am doing some modest Java programming against relational databases, AND anecdotally not feeling any mismatch. I volunteered with this discussion to kind of clear up the mess and the incommunicado state between the two sides. If somebody better than me is willing to take on the arguments on the relational side of the story, I'll be MORE THAN HAPPY to hand over the task.

By the way, in '95 I was learning to do that with C and Embedded SQL.

There is no point discussing individual examples unless you identify specific performance problems, rather than telling us "it happened to me this way" (and, by the way, no optimization was done for the relational database) and pretending you have a real argument. I invited you to do this, but so far you have failed to, while you assert a generic performance problem. Based on the arguments you presented, at best you can assert empirical performance problems.

I do have a real argument, Costin, which is repeated observation over years of experience. It may be of some interest to you that I worked for Oracle Corporation in 1995, where I developed and put into production an e-commerce system, including designing the schema and writing all the business logic in PL/SQL - so I do know how to use relational databases. What were you doing in 1995? How much experience have you had implementing DomainModel's in an ObjectOrientedProgrammingLanguage and mapping them to a relational database? I seriously doubt you've had any, since a few days ago you didn't even know what a DomainModel was (and now you assert that the example I took the time to provide is not a DomainModel). And FYI, I've been familiar with TPC for years and it has had absolutely no relevance to the successful development of production software systems that I've been involved in. Although it is funny, don't you think, that the top performers involve J2EE application servers. It's also funny, in an ironic way, that their website is so unresponsive - they must be doing some ObjectRelationalMapping under there. -- RandyStafford

FYI: If you are so familiar with TPC please note that J2EE application Servers ARE NOT involved in top performers at TPC transactional benchmarks (neither TPC-C, nor TPC-W), unless you consider badge engineering as such. Please have a look at the source code. It's true that it is awfully non-OO, however it apparently performs well.

Empirically, I suggest that you take a close look at http://www.tpc.org, and tell us what you see there in terms of verifiable performance results.

I'll try to move the performance discussion to where it belongs (EmpiricRelationalVsObjectPerformanceMatch). If you feel you also have theoretical arguments (I do, but on the opposite side), you can start a TheoreticalObjectvsRelationalPerformanceMatch or something similar. -- CostinCozianu


...you can't blame relational databases for you not being able to use them, and not having a sound design, and not having a specialized DBA on the job.

I contend this is evidence enough of an ImpedanceMismatch. -- MarkAddleman


I'd like to suggest we refocus this discussion a little bit, and move the extraneous segments to other pages. The initial discussion was whether objects are separate from their data (state). I think it has been shown that this cannot be true by the definition of the term Object. The discussion has since devolved into the ObjectRelationalImpedanceMismatch discussion, which already has a page. Agreed? -- SeanMcNamara

I think the extraneous discussion would belong to EmpiricRelationalVsObjectPerformanceMatch, and probably should be moved there.

What is the true definition of the term object? Common usage (folklore) is object = data + behaviour, but the question of what the true definition of object is turns out to be far more subtle than that. And anyway, no matter which exact definition we choose from the many approaches of today, it will probably not affect our ObjectRelationalImpedanceMismatch discussion in a significant way.

Please don't draw a conclusion from this discussion (i.e. Object and Data are separate). My take on the subject is that the object model belongs to the client application space and contains only a "transient" copy of the persistent data (the copy is only valid within transactional boundaries) plus the behaviour associated with the data, while the logical data model belongs in a relational database. Thus I can easily say that ObjectsAndDataAreSeparate.

The mapping between the world of objects in the application and the world of logical data model should be straightforward, if we relax some object oriented principles that are very dear to us, such as the object identity, and if we don't look at the database as only a persistence engine. -- Costin Cozianu

Costin, can you please provide a reference to a definition of object that does not include the notion of encapsulation of data (state)? As I said earlier, if what you are proposing is that in implementation, behavior should be kept completely separate from the data, then perhaps the entire page should be renamed to reflect that. I would think you would object to the statement that "Well designed relational databases should not be normalized", yet you seem to expect us to accept a similar statement for objects that is dependent upon redefinition of fundamental terms. -- SeanMcNamara

Sean, an object in the application object model can be seen as some kind of encapsulation, and almost by force of nature this kind of encapsulated object stays in the application space. An entity as referred to in data modeling is by necessity NOT encapsulated and lives inside the database. So the object and the data (i.e. the entity or, with an even better term, the tuple) are separated when programming against a relational database, whether we like it or not.

I say some kind of encapsulation because you cannot achieve perfect encapsulation even in the application space. I tried to do that in a recent project, realized I was mistaken, and had to refactor and back off. The problem is that even if you have a perfectly encapsulated class in a common language (Java, Smalltalk), you may have several clients operating on the object. So if the encapsulation is to be perfect, we'd have to make sure that in the application space we also enforce object identity, that is, only one object instance is active for any given entity in the database at a given time. This is a bad application of the object identity principle, which you can find presented as Heaven-sent in almost any OO book.
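One reading of "only one object instance is active for any given entity" is an identity map. The Python sketch below is an illustration under that assumption; the names Account, IdentityMap, and load_from_database are hypothetical, and the loader is a stand-in for a real query.

```python
class Account:
    def __init__(self, account_id: int, balance: int) -> None:
        self.account_id = account_id
        self.balance = balance


class IdentityMap:
    """Hands out at most one Account instance per key within one runtime."""

    def __init__(self, loader) -> None:
        self._loader = loader  # callable: account_id -> Account
        self._cache = {}

    def get(self, account_id: int) -> Account:
        if account_id not in self._cache:
            self._cache[account_id] = self._loader(account_id)
        return self._cache[account_id]


def load_from_database(account_id: int) -> Account:
    # Stand-in for a real query; always returns a fresh copy.
    return Account(account_id, balance=1000)


runtime_a = IdentityMap(load_from_database)
runtime_b = IdentityMap(load_from_database)  # e.g. a second client process
```

Within runtime_a, repeated lookups of account 100 return the same instance, but runtime_b holds its own copy, so the guarantee stops at the boundary of a single runtime.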

Well, in multi-client, transactional database applications it is not a good principle. I'll go into detail with more examples in RelationalHasNoObjectIdentity shortly.

You have to just let go of this principle and accept that you can have several physical instances of an entity, let's say the one with AccountId=100, that are either spread across different runtimes in a client/server architecture, or across multiple runtimes in a three tier architecture, or EVEN within a single runtime in a three tier architecture. Each of those instances will operate on copies of data, or will send direct commands to the relational database engine (such as UPDATE Accounts SET balance=balance+5000 WHERE AccountID=100), and you'll have a perfectly consistent architecture because you substitute the perfect encapsulation required by OO principles (including object identity) with data integrity constraints and transactional integrity constraints that guarantee you'll always have demonstrably correct results. This model is far more scalable than any model based on strict object identity and object encapsulation. -- Costin
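A small sketch of the idea, using Python's sqlite3 module. The Accounts table and the UPDATE statement follow the example above; the CHECK (balance >= 0) rule and the AccountCommand class are assumptions added for illustration. Both instances share one connection here, so this shows the constraint doing the guarding, not real concurrency between clients.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts ("
    " AccountID INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # assumed integrity rule
)
conn.execute("INSERT INTO Accounts (AccountID, balance) VALUES (100, 1000)")
conn.commit()


class AccountCommand:
    """One of possibly many instances that refer to the same account row."""

    def __init__(self, conn: sqlite3.Connection, account_id: int) -> None:
        self._conn = conn
        self._id = account_id

    def adjust(self, amount: int) -> bool:
        """Apply a relative change; return False if the database rejects it."""
        try:
            with self._conn:  # commits on success, rolls back on error
                self._conn.execute(
                    "UPDATE Accounts SET balance = balance + ? WHERE AccountID = ?",
                    (amount, self._id),
                )
            return True
        except sqlite3.IntegrityError:
            return False

    def balance(self) -> int:
        row = self._conn.execute(
            "SELECT balance FROM Accounts WHERE AccountID = ?", (self._id,)
        ).fetchone()
        return row[0]


teller = AccountCommand(conn, 100)
atm = AccountCommand(conn, 100)  # second instance, same entity

deposited = teller.adjust(5000)  # balance becomes 6000
overdrawn = atm.adjust(-7000)    # would go negative, so the CHECK rejects it
```

Neither instance holds the balance, so neither can hold a stale one; correctness rests on the constraint and the transaction rather than on there being a single in-memory object.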

The only principle I'm holding onto is the definition of an object as something that encapsulates behavior and state. All the arguments you have given are purely implementation detail, and simply extend the database into the runtime environment. Here's a question: How do you propose to handle transient objects (i.e. objects that have no persistence outside of the runtime window)? For every object created (say VoipPacket) would you create an entry in the database for it? Once again, this seems like nothing more than implementation detail, and is not fundamental to the notion of an Object in general. An Object is defined as the combination of behavior and state, but the definition makes no statement regarding where these reside in an implementation sense. That's why I think the topic (at least in name) is misguided. In terms of using an RDBMS for storage of all attribute data, people currently do that, so your recommendation seems to be nothing more than to ALWAYS do the manipulation on the database, never keeping attribute data resident in memory. Some others have offered their thoughts on why this is not a good thing, usually based on notions of performance, which, although nobody has given hard numbers, seem intuitively correct. With recent RDBMS advances (memory resident tables, partition tables, solid-state-storage systems, etc) I'm sure that databases can be made to perform at levels quite close to direct memory access, but again, no hard performance numbers have been given on either side. Perhaps you would do well to list in bullet-point form the specific advantages you feel using the RDBMS for runtime storage provides. I think it possible to make some fairly strong arguments for your recommendation, but I also feel it's important to acknowledge that it is merely an implementation issue, rather than one that addresses the fundamental notion of what an Object is. -- SeanMcNamara


Sean, if you think the problem is that simple, would you like to attempt a standard definition for Object? Believe me, there's no broad consensus in the OO world on what exactly an object is or what exactly an object should be. Also, I'd like to know how you define behavior.

I've already provided definitions from Booch and Jacobson, and would provide another from Coad if I was near my bookshelf. I asked you above to provide some definitions of Object that do not include the notion of both behavior and state. You have failed to do this. I have suggested that a major part of the disagreement centers around what the nature of your point is, suggesting that it is perhaps an implementation issue. You have not addressed this suggestion in any way other than saying that the definition of object I am using is incorrect, but have not provided any other definition. Without providing a constructive point to move forward on, there is no use in continuing this discussion. -- SeanMcNamara

Well, then I suggest that you should season your bookshelf with something more serious. What worries me these days is that I often find people who think that the real and final authorities on OO are only the three amigos. You can do a search on the web for LucaCardelli, BarbaraLiskov, Benjamin C. Pierce, MartinAbadi (I can provide a dozen more names if you want). From their online papers you can find even more references and more names. -- Costin

The definitions you are using are correct in their own way, although it is desirable that they should be more precise. What is incorrect is either for you to pretend it is THE correct definition or for you to ask me what THE correct definition is. I presented one sketch of an alternate definition in LiskovSubstitutionPrinciple; it doesn't deal with objects per se, but splits them into object values and object variables, with very beneficial consequences. -- Costin


I'm not saying to extend the database into the run-time environment, but neither am I advocating the opposite approach (making the database an annex of the run-time that ideally will auto-magically persist the objects that need persistence). We have to balance the two sides: client run-time and database. The database therefore has absolutely nothing to do with transient objects. I didn't say never keep attribute data resident in memory, nor did I say always do manipulations only in the database.

The key is to achieve a balance: for some use-cases you load objects from the database, modify them, and put them back; for other use-cases you'll just have to give up your OO ego and throw an SQL statement at the database.

There's no question of making a database perform at the same speed as optimized code generated by a language such as C++, or even Smalltalk or Java.

But the problem is that you simply CANNOT use the concepts of C++, Java, or Smalltalk and build applications for LARGE SHARED DATABANKS with the same "speed" as you would build a single-client application. Their model just doesn't support it, because they have no notion of shared access to an object's attributes, they don't understand transactions, concurrency control, and isolation, and they don't handle large amounts of data efficiently. Of course you could write some class libraries in the host language to deal with these issues, but you would be reinventing a big wheel.

An alternative, simplistic approach that lets you program your business application as if you had no database at all, and still have transactional consistency and concurrent shared access, is to lock on instances. See for example:

 transaction.begin();
 (...)
 entityBean.ejbLoad();
 (...)
 entityBean.doSomething();
 (...)
 entityBean.ejbStore();
 (...)
 transaction.commit();
 (...)

(These are not the exact method names, but this is what happens with entities: even if you only call entityBean.doSomething(), the rest will be called for you by the application server in this exact order.) Moreover, the majority of object databases operate in a very similar way, transparently performing load() and store() operations under the hood and offering you the illusion that you don't have any database at all.
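The call order described above can be sketched in plain Java. This is only an illustration, not the EJB API: AccountTable, AccountEntity, SketchContainer and their methods are hypothetical names, the "database" is an in-memory map, and the transaction boundaries are merely recorded in a trace rather than enforced.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical stand-in for the database: one balance per account id. */
class AccountTable {
    private final Map<String, Integer> balances = new HashMap<>();
    int read(String id) { return balances.getOrDefault(id, 0); }
    void write(String id, int balance) { balances.put(id, balance); }
}

/** Hypothetical entity: an in-memory copy of a row plus business behaviour. */
class AccountEntity {
    private final String id;
    private int balance;

    AccountEntity(String id) { this.id = id; }

    void load(AccountTable table) { balance = table.read(id); }    // plays the role of ejbLoad
    void store(AccountTable table) { table.write(id, balance); }   // plays the role of ejbStore
    void deposit(int amount) { balance += amount; }                // the business method
}

/** Hypothetical application server: wraps every business call in load/store. */
class SketchContainer {
    private final AccountTable table;
    final List<String> trace = new ArrayList<>();

    SketchContainer(AccountTable table) { this.table = table; }

    void invoke(AccountEntity entity, Consumer<AccountEntity> businessCall) {
        trace.add("begin");            // a real server would begin a transaction here
        entity.load(table);
        trace.add("load");
        businessCall.accept(entity);   // the only step the client actually asked for
        trace.add("business");
        entity.store(table);
        trace.add("store");
        trace.add("commit");           // and would commit the transaction here
    }
}
```

The client writes only something like container.invoke(account, e -> e.deposit(50)); the load and store around it are what produce the illusion of having no database.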

In order to achieve consistency in the above example you have several options, all of which are bad if that is THE only way you deal with ALL your objects in ALL your transactional use-cases. The only safe approach is to request the SERIALIZABLE isolation level from the relational database. Optimistic concurrency control just doesn't cut it, because there are some subtle ways in which your results may be incorrect even if nobody else modified the entity concurrently. Some application servers are very funny in that they take matters into their own hands about concurrency control by locking directly on instances in the middleware; some object databases are also funny in that they require you to explicitly request locks on objects.
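One way results can go wrong even though nobody else modified the entity is the anomaly usually called write skew. The text does not say this is the case the author had in mind, so treat the following as an assumed illustration. It supposes per-row version checks as the optimistic scheme, and the names (DoctorTable, VersionedRow, WriteSkewSketch) and the "at least one doctor on call" rule are hypothetical. The two transactions are interleaved by hand in a single thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical versioned row, as used by per-entity optimistic locking. */
class VersionedRow {
    final boolean onCall;
    final int version;
    VersionedRow(boolean onCall, int version) {
        this.onCall = onCall;
        this.version = version;
    }
}

/** Hypothetical table with a compare-version-then-write update. */
class DoctorTable {
    private final Map<String, VersionedRow> rows = new HashMap<>();

    void insert(String name, boolean onCall) { rows.put(name, new VersionedRow(onCall, 0)); }
    VersionedRow read(String name) { return rows.get(name); }

    /** Succeeds only if the row still carries the version the caller read. */
    boolean updateIfUnchanged(String name, int expectedVersion, boolean onCall) {
        if (rows.get(name).version != expectedVersion) {
            return false;
        }
        rows.put(name, new VersionedRow(onCall, expectedVersion + 1));
        return true;
    }

    long countOnCall() { return rows.values().stream().filter(r -> r.onCall).count(); }
}

class WriteSkewSketch {
    /** Intended rule: at least one doctor stays on call. Returns how many remain. */
    static long run() {
        DoctorTable table = new DoctorTable();
        table.insert("alice", true);
        table.insert("bob", true);

        // Both transactions read both rows before either one writes.
        VersionedRow aliceSeenByT1 = table.read("alice");
        VersionedRow bobSeenByT1 = table.read("bob");
        VersionedRow aliceSeenByT2 = table.read("alice");
        VersionedRow bobSeenByT2 = table.read("bob");

        // T1 takes Alice off call, justified by Bob being on call.
        boolean t1Committed = bobSeenByT1.onCall
                && table.updateIfUnchanged("alice", aliceSeenByT1.version, false);
        // T2 takes Bob off call, justified by its (now stale) read of Alice.
        boolean t2Committed = aliceSeenByT2.onCall
                && table.updateIfUnchanged("bob", bobSeenByT2.version, false);

        if (!t1Committed || !t2Committed) {
            throw new IllegalStateException("a version check rejected an update");
        }
        return table.countOnCall();
    }
}
```

Each version check passes because each row was written by only one transaction, yet the rule that spans both rows is broken. A serializable execution would have forced one of the two to fail or to see the other's write.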

The approach I described is guilty on several counts, so while it may be the only way to approach a few particular use-cases (where you would have had to do something similar even if you programmed in the old C + Embedded SQL fashion), I assert that in the general case it should be avoided. -- CostinCozianu

The topic of the page gave the impression that the statement was being put forward as a tautology, yet you now say that it is simply an extreme opposite to the no-database approach. If you are simply trying to document a potential pattern, then I recommend we refactor the page using the standard pattern template to discuss the context in which the pattern is valid, along with benefits and drawbacks. The content of this thread as a whole seems to be concerned with disputing the topic-as-tautology. If that wasn't what was being proposed, we've been wasting time. -- Sean


Anonymous, how would you describe an object (say Person) that has a few methods (celebrateBirthday:void, canApplyForDriversLicense:boolean) that affect its internal state, but make use of an internal attribute (age) that is not specifically included in the interface? An object may have state not directly accessed through its exposed interface. -- SeanMcNamara
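The Person in question might look like the following minimal sketch. The method names come from the comment above; the constructor and the minimum age of 16 are assumptions made only so the example runs.

```java
/** Sketch of the Person under discussion: age is state, but not part of the interface. */
class Person {
    // Hypothetical threshold; the discussion does not name one.
    private static final int MINIMUM_LICENSE_AGE = 16;

    private int age;   // internal attribute: no getter, no setter

    Person(int age) { this.age = age; }

    /** Changes internal state without revealing it. */
    void celebrateBirthday() { age++; }

    /** Depends on internal state without revealing it. */
    boolean canApplyForDriversLicense() { return age >= MINIMUM_LICENSE_AGE; }
}
```

A caller can observe the effect of the hidden attribute (the answer flips after enough birthdays) but cannot read or name it through the interface.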

If the object says it can perform those behaviours then it can. There may be dozens of attributes it doesn't own yet makes use of to perform its behaviours. How the object gets the age for the person is not a concern to the user and in no way should the user of an object expect to see an age attribute in the class definition. In fact, depending on the object exposing the age attribute could be a sign of bad design because it may indicate this class is just a bunch of getter and setter methods. -- AnonymousDonor (who is not CostinCozianu)

RE: There may be dozens of attributes it doesn't own yet makes use of to perform its behaviours

It seems to me that this quite clearly breaks encapsulation. Other than parameters explicitly passed into a method invocation, an object should own all attributes that its behavior depends on.

I agree that the user should not care how the object gets the age. Those details are encapsulated within the object itself. My point was not that an object needs to expose all of its attributes through its interface, but quite the opposite: that the interface of an object is not sufficient to describe all its attributes, as was suggested above (although it might be hard to spot since the comments have moved from their original location). -- SeanMcNamara


It seems to me that this quite clearly breaks encapsulation. Other than parameters explicitly passed into a method invocation, an object should own all attributes that its behavior depends on.

This may be possible in simple objects, but many objects need to collaborate with or use other objects to implement their behaviour. To print a document, for example, the printer object uses a physical printer that is in no way owned by the printer object. The printer will invariably use memory, queueing, screen, and file system resources that it doesn't own as well. None of these resources will be persisted with the printer object. They are runtime services used by the printer object. Facts like age are really no different. -- AnonymousDonor
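The printer argument can be sketched as an object whose behaviour depends on a service it holds a reference to but does not own. SpoolQueue and PrinterObject are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical shared runtime service; no single printer object owns it. */
class SpoolQueue {
    private final List<String> jobs = new ArrayList<>();
    void enqueue(String job) { jobs.add(job); }
    int size() { return jobs.size(); }
}

/** Hypothetical printer object: its behaviour depends on a queue it was handed. */
class PrinterObject {
    private final String name;
    private final SpoolQueue queue;   // a collaborator, not part of this object's own data

    PrinterObject(String name, SpoolQueue queue) {
        this.name = name;
        this.queue = queue;
    }

    void print(String document) { queue.enqueue(name + ": " + document); }
}
```

Two PrinterObject instances can be built over the same SpoolQueue, and persisting either one would not be expected to persist the queue.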

It depends on exactly what you mean by "owned." The precise term is encapsulated. This merely means that either by convention or through actual enforcement by the language, all accesses to a data item are through a restricted set of methods in some common unit. This unit is usually known as either a class or a module. A printer can be encapsulated if you restrict direct access to it. It also can be directly accessed if desired. I suspect you probably encapsulate your printer with a driver module, and you can do higher level encapsulations.
moved discussion of concurrency, object identity and encapsulation to ConcurrencyIssuesAreOrthogonalToEncapsulation. Also included there is some discussion of performance issues.
comments on ObjectRelationalImpedanceMismatchDoesNotExist moved there
comments on DatabaseApplicationIndependence moved there
see GemStonejConcurrencyMechanism

Last edited August 21, 2004