Is Sedgewick's Shell Sort the most optimal one?

1k Views Asked by At

The following is Sedgewigck's version of the Shell Sort. Is it the most optimal or are there more efficient ones?

import java.util.Arrays;
public class ShellSortSedgewick {

public static void main(String[] args) {
    { // Sort a[] into increasing order.
        int [] a={5,8,44,77,23,81,90,52,25,21,35};
        int N = a.length;
        int h = 1;
        while (h < N / 3) {
            h = 3 * h + 1; // 1, 4, 13, 40, 121, 364, 1093, ...
        }
        while (h >= 1) { // h-sort the array.
            for (int i = h; i < N; i++) { // Insert a[i] among a[i-h], a[i-2*h], a[i-3*h]... .
                for (int j = i; j >= h && a[j] < a[j - h]; j -= h) {
                 int temp=a[j];
                 a[j]=a[j-h];
                 a[j-h]=temp;
                }
            }
            h = h / 3;
            System.out.println(Arrays.toString(a));
        }
    }
          // See page 245 for less(), exch(), isSorted(), and main().
}
1

There are 1 best solutions below

3
Olof Forshell On

I guess you would need to define what "optimal" means in this context.

If if means spending as little time as possible in determining a sequence of gaps for a specific number of units (n) to be sorted/ordered, there are any number of not quite grabbed-out-of-thin-air sequences floating around.

If it means the best sequence of gaps for a specific n things become much more complicated.

Consider a few low n and their optimal sequences:

  • 2,3,4,5 {1} an insertion sort
  • 6 {4,1}
  • 7,8,12 {5,1}
  • 9 {6,1}
  • 10 {9,6,1}
  • 11 {10,6,1}

These sequences of gaps have been determined by testing every single ordering of the different n against every possible gap sequence for that n and where the sequence of gaps produces the lowest average number of comparisons required.

For a given n there will be n! unique orderings to test against 2^(n-2) gap sequences.

Although the n are small what is interesting is how the number of gaps vary.

  • 1 gap works best for n in the range 2 to 5
  • 2 gaps work best for some n in the range 6 up to at least 12
  • 3 gaps work best for some n in the range 10, 11 and possibly beyond

The conclusions I draw from this is that in all probability

  1. there is a unique set of optimal gaps for every n
  2. from one n to the next the gap sequence will vary in terms of numerical values and possibly also in the number of gaps in the sequence

Returning to the issue of what you mean by "optimal" I guess it could also mean which gap-determination algorithm is ... "best?". I would say it depends and given what I have just written you may agree. For some n Knuth beats Sedgewick and for others Sedgewick beats Ciura and so on. The issue is that for a fairly large n - say 200 or more - the test will be performed using "random" samples and not the entire set. So you may be faced with the situation where the random sample is ordered more efficiently by Knuth whereas had it been the complete 200! set Ciura would, on average, order it using fewer comparisons. This is the situation where the random sample is better suited to the algorithm and where the entire set is not.

But I digress ... you asked a question and I attempted to impart some facts to aid you in responding to it.