Collection Oriented Programming

From :

The term for a ProgrammingParadigm that involves operations on entire collections, avoiding loops. Programming languages supporting this paradigm are intrinsically suitable to both sequential and parallel/distributed execution. Assuming the operations in question can be performed in parallel...

Several noticeable subcategories of CollectionOrientedProgramming:

Often times, in a collection-oriented language, the collection (of whatever shape) is superior to the scalar. (In AplLanguage and JayLanguage, a scalar is viewed as a zero-dimensional collection).

For example, here is some pseudo-code to get the average of column 3 in a 2D array:

  x = average(myArray, column=3)
If we wanted to get the average from a rectangular section of an array, we may have something like:

  x = average(myArray, columns=3:8, rows=2:5)
The "traditional" way to do this would resemble:

  sum = 0
  count = 0
  for column = 3 to 8
  ...for row = 2 to 5 
  ......sum = sum + myArray[row, column] 
  ......count = count + 1
  ...end for
  end for
  average = sum / count
(Dots added to prevent TabMunging)

An optimizer would know that it could take an array-oriented language version and assign different rows or columns to multiple processors to divide up the workload. With the "traditional" approach, the optimizer may have a harder time because it does not know whether the order of the calculation makes a difference or not without complex analysis. Our "average" function knows this by definition.

Typical features found in CollectionOrientedProgramming languages or APIs:

As an example of the difference between treating an array as a single object, and coding or generating loop logic, consider the AplLanguage statement

  A <- A + A[2,3]  
In PliLanguage, the equivalent statement would be executed one element at a time, with the rightmost dimension cycling the most rapidly. So the sequence will be A[1,1], A[1,2], ..., A[1,n], A[2,1], A[2,2], ..., and so on. When these additions hit A[2,3], all "later" elements (those following A[2,3]) will have the new (doubled) value added to them, rather than the original one. APL broke new ground (and simplified things considerably) by treating arrays as single objects. -- PaulMorrison

To do: comparison with TableOrientedProgramming.

And with SqlLanguage

CollectionOrientedProgramming? Is that a "recognized" paradigm? I can hazard a guess as to what it is, but it smells to me like TableOrientedProgramming--a DesignPattern that someone is itching to promote to a full-fledged paradigm. :)

Aficionados of languages like JayLanguage, KayLanguage and AplLanguage prefer to think of them as array-based or VectorProcessing languages.

Well for a start is this a new concept or a novel and relatively unknown name for an old concept? Also arrays and vectors are homogeneous do you claim that CollectionOrientedProgramming is a generalization of this concept to cells of arbitrary complexity? What does it mean to process implicitly over a tree of trees of type < real , int, string>? How do you handle the case of <int, real, string> ,< real,int, string> < null, null,real> for example? Do you intend to constrain the collection types at all. If so how, if not how do you plan to define correct semantics? Vectors and Arrays answer to the linear algebra -- do you or anyone else have a programing model in mind. I have no issue with implicit looping and lang based on them. Awk is a good thing. And APL does a very good job at being APL. I like functional languages as well, But I do not see how this can be generalized to the extent this comment seems to imply. -ANNON
And the comments above seem terribly confused about TuringEquivalence?. If a language is only TuringEquivalent when you add another TuringEquivalent language on top then, it's not really TuringEquivalent.

So Perl 5 RegularExpressions aren't TuringEquivalent just because you can plug in arbitrary bits of PerlLanguage code into them.

You don't need a special language or significant pre-built library to do CollectionOrientedProgramming. By simply employing collection-oriented thinking, I've improved the expressive power, compactness, and maintainability of my code significantly, and that's just using what I can easily do in VbClassic. The big part of the trick is to just stop worrying about optimal processing of loops or doing more than one thing inside a single loop until there's a measurable performance issue that would be addressed (and there usually isn't). Pretend it's a FunctionalProgramming paradigm, and just don't worry about the fact that evaluation is not lazy.

Example: I want a comma-delimited list of the values from a field in a DAO recordset. The obvious thing to do is loop through the records until EOF is reached, and within that loop, concatenate the field's value to a string, inserting a delimiter. Thinking collection-oriented, though, I write a function to get values from a column of a recordset as a VBA.Collection, then a function to return a delimited list from a collection's items. Now, with 2 lines of code, I can get my delimited list by calling these 2 very reusable functions.

Another example: In MicrosoftAccess, I want a comma-delimited list of the names of fields bound to any controls on a form that follow a particular control naming pattern. I write one function to get a filtered subset of a collection of controls (or any other class that has a Properties collection) based on one of its properties matching a text pattern. I write another function to get a collection of values of a named property from the objects in a collection. As in the example above, another function gets a delimited string from a collection's items. Now, we can get the controls with Name properties matching my pattern, get a collection of the ControlSource property values from those items, then get a comma delimited list of those values. That's 3 lines of code to get my comma-delimited list of field names that can be used in, say, a SQL INSERT statement.

To make this more ObjectOrientedProgramming (still in VbClassic without inheritance), I could create a wrapper class for a collection, and employ methods of the wrapper instead of passing collections to functions. I just haven't gotten around to that yet.

-- SteveJorgensen

I've done similar stuff in webby scripty languages, but I feel that a table would be better than lists of lists. For example, often one wants to use the same field list, but with a few modifications for stuff not used in another set, and extra info such as type (string, number, date) to know if to reformat for dates or put quotes around strings. And maximum allowed length is needed for validation. And a title for report columns or error messages is often needed. Without a table, this results in 3 to 6 different lists or maps. In short, a full-blown DataDictionary of sorts is needed if one wants to do a collection-oriented column management right. But without native tables, it is not smooth. --top

Anybody know much about the LuaLanguage? Is it table-oriented or array-oriented, or a little of both?

Lua doesn't seem to be table- or array-oriented, as far as I can tell.

Lua tables are associative arrays (like Python dicts or AwkLanguage's arrays, and you can nest them), but with an optimization for the common case of tables indexed by consecutive integers: a table with cells indexed at 1, 2, ... N is actually implemented as an array, and has O(1) access. Each table has an array part and a key/value part. In practice, Lua tables are sufficiently expressive to implement almost all other collection ADTs (particularly when combined with metatables), though there is a bit more overhead than using struct pointers or ML-style variant types. Lua is not bulk vector-oriented in the same sense as (say) KayLanguage, though. --ScottVokes

R, the statistical language, is a good example of a language with implicit looping.

Create a collection containing the numbers 1, 2 and 3:
  x <- c(1,2,3)
What does x contain?
  1 2 3
And what do we need to do to multiply each item by 2? (Note: the lack of a for loop)
  x <- x * 2
What does x contain now?
  2 4 6
-- Mike Judge

This is the part where the Lisper pops in and shows an EssExpression implementation of such...

EuphoriaLanguage has similar collection effects:

  x = {1,2,3}  
  x = x * 2   -- x is now {2,4,6}
and treats strings in the same way:

  puts( 1, "LOWERCASE" + 32 )  -- displays the word "lowercase"
Operators can be applied to nested hierarchical collections containing a mixed set of scalars and sequences, in which case it could also be seen as implicit recursion/traversal of the hierarchy tree.

The list monad (OnMonads) in HaskellLanguage works along similar lines, although it must be explicitly invoked:

    f = do x <- [1, 2, 3]
           return (x * 2)
    f == [2, 4, 6]

In general, this monad is more useful when multiple lists are to be used, in which case the CartesianProduct is produced:

    g = do x <- [1, 2, 3]
           y <- [4, 5, 6]
           return (x * y)
    g == [4,5,6,8,10,12,12,15,18]

See Also: DataIdiomAndBehaviorIdiomQuantity, CollectionOrientedVerbs

EditText of this page (last edited May 18, 2012) or FindPage with title or text search