Use Enumerations Instead Of For Loops

This pattern is wrapped up inside another pattern that should probably be named "Use Vector and Hashtable instead of arrays".

Most long-time Smalltalk programmers are comfortable with the idea of representing collections of things using the rich collection hierarchy available in Smalltalk. For arbitrarily-sized collections of things, we just naturally use the class OrderedCollection. For places where it seems that the index of an object in a collection can be something meaningful *itself*, we naturally represent that by a Dictionary.

Likewise in Java, we have the classes Vector and Hashtable that are equivalent to OrderedCollection and Dictionary. For an old-time Smalltalker like myself it's natural to use these classes in the way described above. However, this seems to be very different than the way that programmers with more of a C or C++ background use Java. They seem to tend to use arrays more, and handle all of the issues of size, enumeration and lookup themselves.

For example, I have yet to find a single use of Vectors or Hashtables in any of the introductory Java books I've read (Just Java, Teach yourself Java in 21 days, Java in a Nutshell). On the other hand, in the 100+ Java classes I've written to date, I have used an array exactly *once*.

The benefit of using these other classes is that I don't have to think about how big my arrays are -- I also don't have to worry about how "System.arraycopy()" works, since I'll never be resizing an array. I also have simpler code - I think (this is a personal preference) that a while loop based on an enumeration (an instance of the Iterator pattern) looks cleaner than a "for" loop.

Perhaps another difference is that my classes tend to contain collections of other objects and not just primitive types. Again, this is probably due to my Smalltalk background where I tend to use more abstractions and don't worry as much about something actually being an object.

-- KyleBrown


Watch out!

Contrary to intuition, Enumeration in Java does not operate on a copy or a snapshot of the Vector. Removing items from a vector has the side effect of breaking all Enumerator objects currently iterating on the vector: they will miss an item in the vector.

Solution: If the vector can be modified in another thread or in the process of iteration, get the Enumeration object from a clone of the vector, not the original.

-- BillKayser


Just like Smalltalk :-)

Mario Wolczko (http://research.sun.com/people/mario/) wrote a goodie for one of the ParcPlace Smalltalks which detected attempts to change collections while the collections were being iterated over using InternalIterators, and raised an exception. Something similar could probably be done for JavaTm?'s ExternalIterators.

JamesNoble


I just came across something similar to the previous in a couple of classes I'm writing. I've written two classes, DoublyLinkedList and SortedDoublyLinkedList. Don't ask me why I'm not using a Vector - the answer is that we can't spare the space and time constraints of growing and shrinking the internal array. Our problem is that we are ASSURED that there will be elements added and removed to these lists while Enumerations are enumerating over them. Again, for space and speed concerns we can't afford to make a copy of the list - also a business rule in our domain depends on not doing that.

So what I've done is to use the ObserverPattern. Each Enumerator is an Observer on the DoublyLinkedList? (which inherits from Observable). When an element is about to be removed (but right before it happens) the Enumerations are all notified through the observer pattern - they then get a chance to clean up and get the next element if the element to be removed happens to be the one they are currently holding on to.

-- KyleBrown

Why not java.util.Collections.unmodifiableCollection(collection)


After working some time now in Java I prefer using Arrays in many cases. In particular, when an object is immutable, or the array size is going to remain fixed. The advantages of using arrays (when convenient) include the fact that you don't need to do downcasting and you don't have to worry about elements being added or removed (since you can't change the length of an array). Also, performance of arrays is better. (Yikes! Did I actually list performance as a possible design consideration?)

-- BillKayser


Watch out for Hashtable and Vector - their performance leaves much to be desired. Take a look at Vector's source code, for example, and you'll see that it's nothing more than an array with add/remove operations implemented via System.arrayCopy. Hashtable's hashing/collision algorithm is also far from being efficient. I've been using ObjectSpace's JGL library for all my collection needs. As of this writing, I am not aware of a better Java collections' framework (JGL is FREE, btw). Do give it a spin: http://www.recursionsw.com/products/jgl/ (link corrected) (and corrected again. JGL is now maintained by Recursion Software.)

Andrew Smith?


Although the name of the Idiom is "...NotForLoops" I found it very easy to use enumerations for loops, especially if you want to Iterate nodes in a HashTable. I also suggest (as written above) to use the enumeration on a clone of the HashTable.

Jonathan Chashper (yonathan_chashper@icomverse.com)

I think you misunderstood the name of the idiom. It's referring to using enumerations and Java's while keyword to build loops instead of using arrays with the Java for keyword.

KyleBrown

The point is the idiom is misnamed. It's "int index versus enumeration" rather then "while versus for". -- DaveHarris
Since some folks have mentioned performance, getting an enumeration from a Vector and using while(e.hasMoreElements()) is slower that iterating through the Vector itself with a for loop. Myself, I find the latter as clear as the former (OK, I admit, I've been doing C/C++ for years).

Cassio

Really? I just glanced at the 1.1.5 source code, and I'd expect them to be about the same. VectorEnumerator accesses the Vector representation directly, and so doesn't have any extra function call overhead, nor does it do the access checks twice. Have you measured it? -- DaveHarris

On the subject of clarity - I find the 'for form' clearer too. As a C++ programmer, I was taught that a for statement expresses that the programmer knows how many iterations will be performed:

 for (int i = 0; i < array.length; i++)
means that we want exactly as many iterations as there are elements where as a while statement expresses that the programmer doesn't know how many iterations will be required:

 while (c != End'Of'File)
However, as a C++er, turned Smalltalker, turning Javaer (!) I prefer to use Vector and Hashtable which leads me to use a while loop which makes me feel uncomfortable :-(. Strange how something as simple as a syntactic form can mean so much many years on. -- PaulDyson


For performance reasons, JavaSoft's tricks and tips newsletter recently recommended using:

  int size = vector.size();
  for( int index = 0; index < size; ++index )
	elementAt(index);
  }
I prefer to use collections, but I find that the type-safety provided by arrays generally outweighs the disadvantages, provided I can determine the necessary size in advance. If I can't get this information, then I'm forced to use Vector and Hashtable.

-- ArickAnderson

Re: performance. I recently ran across http://www.ibm.com/java/education/javahipr.html. A little long in the tooth at this point, but quite interesting.
There are some interesting angles on Enumerations that haven't been addressed yet:

1. Has anyone noticed that StringTokenizer implements Enumeration? When I saw this, It really hit me how powerful the Enumeration pattern is. Forget about hiding what kind of container you have: You can hide the fact that there's no container at all! This is a really strong decoupling.

2. Enumeration is a pattern, and we shouldn't feel restricted to the one Enumeration class packaged with Java. In the absence of genericity, we can make iterators that deal with particular types for particular domains.

-- RobbShecter
"A for statement expresses that the programmer knows how many iterations will be performed" - I've never accepted that. I used to delight in such code as:
	for (Node *node = list->head; node != NULL; node = node->next)
	// Use node.
	for (int i = 0; array[i] >= 0; i++)
	// Use array[i].
in my C days.

Further on speed: I often find the Vector is hidden inside another class but needs to be exposed somehow. For example, a person might have a list of children:
	class Person {
	private Vector children = new Vector();
	int getChildCount() { return children.size(); }
	Object getChild( int i ) { return children.elementAt(i); }
	Enumeration getChildren() { return children.elements(); }
	// other methods
	}
Now, using getChildCount()/getChild() has to go through a layer of indirection which getChildren() short-circuits. This should make it faster for a naive language implementation. -- DaveHarris

I rather like the idea of using both Enumerators (well, actually the Java2 Iterator) with a for loop. For example:

 for(Iterator iter = vect.iterator(); iter.hasNext();) {
   tempVar = (SomeClass)iter.next();
   //operations on tempVar here
 }
Notice the semicolon after hasNext(). It means that the increment portion of the for loop is empty. The iter.next() in the body of the loop handles this function (provided it is called once per cycle of the loop). The advantage I see to this (and I can't claim original credit - a colleague showed this to me) is that the scope of iter is the body of the for loop, and therefore is available for garbage collection sooner than if the scope is the entire method.

Anybody want to shoot this down? -- GregVaughn

Since you asked, I do ;-) I can appreciate the scoping argument, but your method should not be long enough to make much of a difference in this respect, and the empty part of the for() statement suggests a mismatch to me. I find the explicit while() statement easier to see. -- SteveFreeman

Not me (as you well know, that's how I typically write things, too). I've never even heard the bit about "for means you know how many iterations you want" and I disagree with it. I like for loops (as opposed to while loops) for this situation because they bundle all of the loop control into one statement, and they allow me to scope the iterator variable local to the loop so that it doesn't clutter the namespace outside the loop. (The garbage collection issue is actually the least of my worries.) If you use a while loop with an Iterator or Enumeration, you have to initialize your iterator variable in an extra statement before the loop proper. Note, though, that this is all fairly minor coding style compared to the main point here, which is to use Enumerations or Iterators in preference to looping by indices.

For those working with Java 1.2 (er ... Java 2), you should really use the new collections classes and their Iterators. Among their many advantages over Vectors and Hashtables, the iterators detect the situation mentioned above where the collection is modified during the iteration by another thread (and throw a ConcurrentModificationException). This has led me to detect several obscure synchronization bugs that would otherwise have been much more trouble. -- GlennVanderburg


I have two major objections to the discussion thus far:

1-Performance is *always* a concern when using object iterators, regardless of how they are implemented (Check out the implementation of the Enumeration that comes back from a Hashtable). If the code you are writing is to be executed potentially hundreds of times per second, than the added abstraction is certainly a liability. Besides, it is possible to express the intent in a design document in a way that leaves the implementation details open for the best fit. Not every line of code has to participate in a proper abstraction.

2-I really dislike using for loops as mentioned above because I think that they are misleading. I do not agree that the construct (I can't figure out why this is screwing up my formatting):

 for(Iterator iter = vect.iterator(); iter.hasNext();) {
   tempVar = (SomeClass)iter.next();
   //operations on tempVar here
 }
bundles the control logic in one statement - it doesn't. The statement that advances the iterator is in the body of the for loop, hidden in a method call. This code could be expressed much more clearly in this form (there should be a carriage return before the 'while' statement):

 Iterator iter = vect.iterator();
 while (iter.hasNext())
 {
   Object obj = iter.next();
   //code
 }
for the simple reason that this is the very purpose of a while loop. If you are seriously concerned about garbage collection, either add brackets before and after or explicitly set the reference to null when you are through with it.

-- BradleySmith?
For me, for loops are much more intuitive than iterators. I can read and write them without thinking, and I never mess up the boundary conditions. Iterators require more brain cells, distracting me from the problem at hand. My programming background (6 years of C, preceded by Basic, Pascal, and Fortran) is probably why.

In my current Java, I use arrays if I know the size in advance and ArrayLists? if I don't. Either way, for loops work just fine. I only use iterators when the class forces it (Hashtables, StringTokenizers?, etc.) or there's a major performance cost in determining all the values in advance. -- JaredLevy


Hmmm. The point of using iterators rather than using an integer index (that's the real question - in a language like CeeLanguage or JavaLanguage, you can use the for loop construct with an iterator) is that instead of you creating an essentially meaningless dummy integer and hand-controlling the loop, you leave it up to the iterator. You trust the iterator to know when to go on and when to stop, and to hit each element of the collection exactly once. So the traversal information is encapsulated in the iterator, rather than you being responsible for it. Also, it avoids off-by-one errors and that sort of thing--confusion between postfix and prefix ++ operators (for those in languages like CeeLanguage or JavaLanguage or PhpLanguage) and confusion between starting at zero and at one, or between using < and <= for the test.

Also, you can change the thing you are iterating over from an array to a linked-list to some other structure back to an array, without being forced to change the loop code. Or switch from processing one-line-at-a-time from some file on disk vs loading the entire file at once and processing the in-RAM copy. This makes it easier to quickly change the implementation and measure the performance. (EditHint: Is there another name for this pattern? RefactoringCppToReduceDependencies? ReduceUnimportantInformation? I could have sworn I read something like this before somewhere on this Wiki, but I don't remember / can't find that page just now.)

It also allows you to *dynamically* do it differently during the same run - perhaps files located on the local disk are processed a line-at-a-time, while files on a remote server are loaded all at once and process the in-RAM copy.


The new for loop construct in Java 1.5 helps remove this drudgery. For example:

 Object[] array;
 for (Object o : array) { ... }

Collection<Object> collection; for (Object o: collection) { ... }
As you can see, there's basically no difference in the format. Perfect for your everyday forward iterations.

-- JamesHollidge


wrt: "Performance is *always* a concern", I was taught to design and implement code that most clearly conveyed the design of the solution to the problem at hand.

Many's the time I was reminded of Knuth's admonition: "premature optimization is the root of all evil", and the similar sentiments.

Of course, if there's a known inefficiency with a particular idiom, or an implementation of a particular class or pattern, then attention must be paid.

All in all, though, my strong preference is for clarity first, second, and third. Problems with inefficiencies can be addressed when evidence for their existence manifests itself, and the ancillary costs of improving the efficiency don't overwhelm the benefits.

-- ChrisGerrard


Perhaps related to CollectionAndLoopVsSelectionIdiom.


See also: UseClosuresNotEnumerations, BetterForLoopConstruct
CategoryJava CategoryCollections CategoryIdiom

EditText of this page (last edited February 6, 2011) or FindPage with title or text search