Smooth Sort

According to its creator, E W Dijkstra: "Like Heapsort —which inspired it— SmoothSort is an algorithm for sorting in situ. It is of order N∙log N in the worst case, but of order N in the best case, with a smooth transition between the two. (Hence its name.)"


The implementation of SmoothSort is somewhat more complicated than that of HeapSort, on which it is loosely based -- as is the proof of its correctness and computational complexity.

It's a bit of a monster to implement. The implementation is not aided by the fact that the paper gives it in GuardedCommandLanguage? (a language with obscure syntax and DynamicScoping), with one-character variable names, and that the descriptive part of the paper goes into great detail over trivia, but skims over the difficult bits.

The approach is this.

The Leonardo numbers l[0] = 1, l[1] = 1, l[n] = l[n-2]+l[l-1]+1.

A heap is a section of an array, the size of one of the Leonardo numbers L[n], with a root (one node) and two child heaps of size L[n-1], L[n-2]. We put the root on the right. So a heap of size 9 will be structured like so:
 { {{{1}, {2}, 3}, {4}, 5}, {{6}, {7}, 8}, 9 }

The heap condition is that a root must be greater than both of the roots of its subtrees. Note that this means that the rightmost node of a root is always guaranteed to be the greatest node of the heap. Note also that a sorted array is already a valid heap and does not need any fooling about to make it so.

Of course, we need to be able to sort arrays of any size. So we break it up into heaps by taking the largest Leonardo number we can get, from the left to right. So an array of 22 elements is broken up into heaps of size 15, 5, 1 and 1. Furthermore, we impose a condition: the root of each heap must be greater than (or equal to) the root of the heap to the left. This means that the rightmost node of the whole ball of wax will always contain the greatest maximum and, as before, an array that is already sorted needs no extra work to make it valid.

Another very important point: the sizes of the heaps making up the array will never have two consecutive Leonardo numbers - except possibly the final two, because if there are two consecutive numbers and there's more stuff on the end of the array, we could have used one item of that stuff to have made a larger heap.


Ignoring Dijkstra's optimisations, the sort algorithm is: start with a string of on heap of size 1 at the start of the array, and extend that to the right one element at a time - always keeping the "valid" condition. Once you've done that to the entire array, the largest node will be at the end, so shrink the heap one node at a time, keeping the valid condition, until you are back to a single heap of one. Since each time you shrunk, the rightmost node being lost was the largest i the heaps, the array is sorted.

Now then.

To extend a string of heaps by one: if the last two heaps are consecutive Leonardo numbers, then add the node and the three parts (the two heaps and the extra node) become a single bigger heap. This heap will need to be "filtered" like a mergesort, but we don't need to do anything else, as we know both root nodes of both heaps were greater than the nodes of the heaps to the left, so after the filter, the top node of the final heap will also be (this is without Dijkstra's optimisations, of course).

if the last two heaps are not consecutive Leonardo numbers, then just add a new heap of size 1. However, the root of this heap may be less than the root of the heap to the left, so swap the two roots, filter this heap, and move over to the left (rinse and repeat).

To reduce a string of heaps by one: if the right hand heap has a size of 1, there's nothing to do.

Otherwise, strip off the right hand (root) node, which leaves us with two heaps. The roots of these two heaps don't necessarily satisfy the second valid condition, so do the root-fixup on the left-hand sub heap and then the right hand subheap.

View edit of August 7, 2009 or FindPage with title or text search