Radix Sort

See other SortingAlgorithms

A category of sorting algorithms (not just a single algorithm, although it is usually casually referred to as if it were) based on characteristics of values rather than just on comparisons. As such, can be faster (e.g. O(n)) than the O(n log n) limit of comparison-based sorts.

It is therefore preferable when the speed of sorting is important; however, it cannot be used on completely random data, since the values won't be found to have any relevant patterns that can be taken advantage of.

Also see McIlroy and Bostic's "Engineering Radix Sort" (http://citeseer.ist.psu.edu/mcilroy93engineering.html) for coding details. It can be an extremely fast string sorter.
There are two types of RadixSorts: MSD RadixSort and LSD RadixSort. LSD RadixSort works on the least significant digit first. As such, it right-aligns its values, so it is suitable for sorting integers. MSD RadixSort works on the most significant digit first, so it left-aligns its values. It is often used to quickly sort a list of strings. See en.wikipedia.org/wiki/Radix_sort for more information.

LSD RadixSort works as follows:

1. Sort by the least significant digit of the key, usually with CountingSort or BucketSort. The sort must be stable; you'll see why next.

2. Repeat for the rest of the digits, working up to the MSD. Because it's stable, the ordering from the previous iterations remain. The end result will be sorted.

MSD RadixSort works differently, and could be considered a special case of BucketSort.

1. Partition the list into buckets by the most significant digit.

2. Recursively MSD RadixSort each non-empty bucket.

3. Concatenate the results.


Strictly speaking, it's not O(n) except in the subset of cases where there are at most 'k' bits needed to describe an element's position in the sorted set, and 'k' is constant. If 'k' is not constant, the complexity becomes O(kn). It's likely that we don't need to add a bit for every new element to be distinguished (which would result in O(n^2), rather one only adds a bit where it is required to differentiate an ambiguous key (bounding case, each comparison in a conventional sort becomes a bit in the key: k = log(n)), and so the complexity becomes O(n log n). Does that complexity sound familiar to anyone? Remember, big-O isn't average case. It's a guide to see what you have in common with an infinite number of monkeys to sort. -- WilliamUnderwood
Moved here from SortingAlgorithms (RefactorMe):

RadixSort Kicks ass when applicable. O(n) complexity. More information, please? I've heard of this one but rather forgotten... If the key is all there is to the data, and k << n, then this is the fastest possible sort. (someone correct me if I mess up this explanation:) A StableSort. RadixSort is useful when the key you are sorting on is a very small integer (0..k), but you have such a large data set keys tend to be repeated many times. For example, if you are sorting strings, you can use RadixSort to (partially) sort by the first 2 letters of the string (interpreted as a number between 0 and 65 535). You start with an temporary array t[0..k] initialized to zero. Make 2 passes through the data. On the first pass, count how many times each key occurs:
  for(i=0; i<n; i++){
    t[A[i]]++;
  };
. If the key is all there is to the data, we can generate the output file directly from the temporary array, and we're done. (See "Address Calculation: The Forgotten Sort" by Douglas Davidson http://massmind.org/techref/method/sort/addrcalc.htm for details.)

But usually there's more data associated with each key, so ... First we find out where each key should start in the output array:
  sum=0;
  for(i=0; i<k; i++){
    sum += t[i];
    t[i] = sum;    
  }
At this point, we know that data with the key "0" starts at location 0, data with the key "1" starts at location t[0], data with the key "65 535" starts at location t[65 534]. At this point, t[k] should equal n.

Next we make the second complete pass through the data and move each item directly to the proper place in the output array.
  for(i=0; i<n; i++){
    current_key = A[i];
    destination = t[current_key]++;
    OutArray?[destination] = A[i];
  };

2 passes through the input data, one pass through the temporary key table: O(n+k).

(Is this stable if we sort-in-place ?).

Actually, that's just one pass of RadixSort, which actually means that it is an unrelated sort being used as part of RadixSort. Specifically, this is a CountingSort. The idea of RadixSort is that you sort the list by the first digit, then sort it again by the second digit, and so on.
CategoryAlgorithm

EditText of this page (last edited May 26, 2011) or FindPage with title or text search