## CPSC 225, Spring 2010 Lab 2: Sorting Benchmarks

The point of this lab is to run some experiments to compare the run time of some sorting algorithms. The programming for this lab is not particularly difficult, but you will have to think about the design of the experiments. In addition to programming the experiments, you will have to write up a short lab report to present your results.

You will need the file SlowSort.java, which you can find in the directory /classes/s10/cs225. The SlowSort has several static methods for sorting . You should start a new Eclipse project and add SlowStart.java to that project.

### Experiments with Sorting

The SlowSort class defines the following methods for sorting arrays of various types:

```   public static void selectionSort(double[] A)    // sort the entire array
public static void insertionSort(double[] A)    // sort the entire array
public static void selectionSort(double[] A, int count)  // sort A[0]..A[count-1]
public static void insertionSort(double[] A, int count)  // sort A[0]..A[count-1]
```

The class also includes the same set of four methods for sorting arrays of type int[] and arrays of type String[]. All of these methods have run time Ω(N2), where N is the number of items that is being sorted.

Now, there is a standard class in Java, java.util.Arrays, that includes its own methods for sorting arrays. Arrays.sort(A) can be used to sort an entire array A of numbers or strings. Arrays.sort(A,0,N) can be used to sort the first N items in an array A of numbers or strings. The sorting methods used by the Arrays class have run time Ω(N*log(N)), at least in the average case. (In fact, this class uses versions of MergeSort for sorting Objects and of QuickSort for sorting numbers.)

The goal of this lab is to investigate the run times of the various sorting algorithms and to get a feel for the real difference between Ω(n2) and Ω(n*log(n)). To do this, you will construct arrays filled with random values, use various methods to sort them, and measure the run times.

The work from this lab is due before the next lab, on Thursday, February 4. You should copy the program's project folder from your Eclipse workspace into your homework folder in the directory /classes/s10/cs225/homework. For the written lab report, you can hand it in in class or include it in the material that you copy into your homework folder.

The basic idea is to make an array filled with random numbers or random strings. (You will have to think about how to generate random strings.) It's a good idea to make multiple copies of an array, so that you can try several sorting algorithms on the same data.

To measure the time that it takes to sort an array, you can use the following pattern:

```          long startTime = System.currentTimeMillis();
... // Sort the array
long runTime = System.currenTimeMillis() - startTime();
```

This measures the run time in milliseconds, where 1 millisecond equals 1/1000 second. For a short array, the time might be measured as 1 or 2 or even 0 milliseconds, which is only a very crude measurement of the actual run time. You really want to use arrays that will give a run time of at least, say, a dozen milliseconds.

Make sure that you only measure the time that it takes to sort the array. Don't include the time that it takes to fill the array with random values.

You should apply SlowSort.insertionSort, SlowSort.selectionSort, and Arrays.sort to arrays of type double[] and String[], at least. You can do int[] too, if you want. You should sort arrays of size 1000, 10000, and 100000, at least. For Arrays.sort, you can also sort arrays of several million items -- or even larger if you ask how to increase the amount of memory that is made available to your Java program.

You do not have to write a single program to run all the arrays. You can write several programs, or a program that can be easily modified to perform different experiments. Your program should, however, output the results in a clear format.

### Results and Analysis

For the written lab report, you should first of all present the results. Tables might be a good format for this.

Second, you should have a brief discussion of the comparisons between the run times of different algorithms. How does InsertionSort compare to SelectionSort? How do they compare to the algorithms used by Arrays.sort? Is there any difference between the sort times for double[] and String[]?

Third, you should try to come up with some explanations for the observations that you have made.

Finally, I would like you to do a bit of math. Saying that the run time of an algorithm is Ω(n2) means that the run time is approximately proportional to n2. So, we can expect that there is some constant K such that the run time is approximately equal to K*n2. If you have measured the run time, it is possible to solve for K and say that K is approximately equal to the run time divided by n2 (and we can expect the approximation to be better for larger values of n.

This means that you can use your run-time measurements for a particular algorithm to find an approximate value for the constant K for that algorithm. Once you have K, you can use the formula K*n2 to predict the run time for array sizes n that you have not measured. I would like you to do this to make predictions about at least the following:

1. How long would it take to sort an array of one million doubles using insertion sort?
2. How long would it take to sort an array of one billion doubles using insertion sort?
3. How long would it take to sort an array of one billion doubles using Arrays.sort?

Present your actual numerical predictions. It would be nice to convert from milliseconds to a more convenient measure of time.

David Eck, for CPSC 225