CPSC 225, Spring 2011
Lab 2: Sorting Benchmarks

The point of this lab is to run some experiments to compare the run time of some sorting algorithms. The programming for this lab is not particularly difficult, but you will have to think about the design of the experiments. This lab, unlike most labs, has a written component. In addition to programming the experiments, you will have to write up a short lab report to present your results.

Both the programming and the written parts of the lab are due at the beginning of lab next week. The program should be copied into your homework folder in /classes/s11/cs225/homework. For the written part, you can either turn in your report on paper or you can include it in a file along with your program.

You will need a copy of the file SlowSort.java, which you can find in the directory /classes/s11/cs225. SlowSort contains several static methods for sorting arrays. You should start a new Eclipse project and copy SlowStart.java into that project.

Experiments with Sorting

The SlowSort class defines the following methods for sorting arrays of various types:

   public static void selectionSort(double[] A)    // sort the entire array
   public static void insertionSort(double[] A)    // sort the entire array
   public static void selectionSort(double[] A, int count)  // sort A[0]..A[count-1]
   public static void insertionSort(double[] A, int count)  // sort A[0]..A[count-1]

The class also includes the same set of four methods for sorting arrays of type int[] and arrays of type String[]. All of these methods have average case run time Θ(N2), where N is the number of items that is being sorted. (When sorting random arrays, the average run time is what you expect to see.)

Now, there is a standard class in Java, java.util.Arrays, that includes some built-in methods for sorting arrays. Arrays.sort(A) can be used to sort an entire array A of numbers or strings. Arrays.sort(A,0,N) can be used to sort the first N items in an array A of numbers or strings. The sorting methods used by the Arrays class have run time Θ(N*log(N)), at least in the average case. (In fact, the sorting methods used by this class uses are versions of MergeSort for sorting Objects and of QuickSort for sorting numbers.)

The goal of this lab is to investigate the run times of the various sorting algorithms and to get a feel for the real difference between Θ(n2) and Θ(n*log(n)). To do this, you will construct arrays filled with random values, use various methods to sort them, and measure the run times.

The basic idea is to make an array filled with random numbers or random strings. (You will have to give some thought to the question of what it means to generate a "random string." Very short strings are not a good idea, since you will get too many duplicate strings. Insertion sort, for example, will perform much better than average on an array that contains many duplicates.) It's a good idea to make multiple copies of each array, so that you can try several sorting algorithms on the same data. Remember that just saying "B=A" does not make a copy of the array A; it just makes B point to the same array as A.

To measure the time that it takes to sort an array, you can use the following pattern:

          long startTime = System.currentTimeMillis();
            ... // Sort the array
          long runTime = System.currenTimeMillis() - startTime();

This measures the run time in milliseconds, where 1 millisecond equals 1/1000 second. For a short array, the time might be measured as 1 or 2 or even 0 milliseconds, which is only a very crude measurement of the actual run time. You really want to use arrays that will give a run time of at least, say, a dozen milliseconds.

Make sure that you only measure the time that it takes to sort the array. Don't include the time that it takes to fill the array with random values.

You should apply SlowSort.insertionSort(), SlowSort.selectionSort(), and Arrays.sort() to arrays of type double[] and to arrays of type String[]. You can do int[] too, if you want, but it's not required. You should sort arrays of size 1000, 10000, and 100000, at least. For Arrays.sort(), you can also sort arrays of several million items -- or even larger if you ask how to increase the amount of memory that is made available to your Java program.

You can either write one program that does all the work, or you can write several very similar programs. You should output the results in a clear format, and you should follow the usual style rules for programming, including the use of comments.

Results and Analysis

For the written lab report, you should first present the results. Tables might be a good format for this.

Second, you should have a brief discussion of the comparisons between the run times of different algorithms. How does InsertionSort compare to SelectionSort? How do they compare to the algorithms used by Arrays.sort()? Is there any difference between the sort times for double[] and String[]?

Third, you should try to come up with some explanations for the observations that you have made.

Fourth and finally, I would like you to do a bit of math. Saying that the run time of an algorithm is Θ(n2) means that the run time is approximately proportional to n2. So, we can expect that there is some constant K such that the run time is approximately equal to K*n2. If you have measured the run time, it is possible to get an estimate for K by saying that K is approximately equal to the run time divided by n2 (and we can expect the approximation to be better for larger values of n).

This means that you can use your run-time measurements for a particular algorithm to find an approximate value for the constant K for that algorithm. Once you have K, you can use the formula K*n2 to predict the run time for array sizes n that you have not measured. Similar computations apply, of course, to an algorithm with run time Θ(n*log(n)).

I would like you to do calculations of this type to make predictions about the following:

  1. How long would it take to sort an array of one million doubles using insertion sort?
  2. How long would it take to sort an array of one billion doubles using insertion sort?
  3. How long would it take to sort an array of one billion doubles using Arrays.sort?

Present your actual numerical predictions. It would be nice to convert from milliseconds to a more convenient measure of time.

David Eck, for CPSC 225