Lab 2: Sorting Benchmarks

The point of this lab is to run some experiments to compare the run time of some sorting algorithms. The programming for this lab is not particularly difficult, but you will have to think about the design of the experiments. In addition to programming the experiments, you will have to write up a short lab report to present your results.

You will need the file SlowSort.java, which you can find in
the directory */classes/s10/cs225*. The `SlowSort` has several static methods for
sorting . **You should start a new Eclipse project and add SlowStart.java to that project.**

The `SlowSort` class defines the following methods for sorting arrays of various types:

public static void selectionSort(double[] A) // sort the entire array public static void insertionSort(double[] A) // sort the entire array public static void selectionSort(double[] A, int count) // sort A[0]..A[count-1] public static void insertionSort(double[] A, int count) // sort A[0]..A[count-1]

The class also includes the same set of four methods for sorting arrays of type `int[]`
and arrays of type `String[]`.
All of these methods have run time Ω(N^{2}), where N is the number of items
that is being sorted.

Now, there is a standard class in Java, `java.util.Arrays`, that includes its own
methods for sorting arrays. `Arrays.sort(A)` can be used to sort an entire array A of numbers or
strings. `Arrays.sort(A,0,N)` can be used to sort the first N items in an array A of numbers
or strings. The sorting methods used by the `Arrays` class have run time Ω(N*log(N)),
at least in the average case. (In fact, this class uses versions of MergeSort for sorting Objects and of
QuickSort for sorting numbers.)

The goal of this lab is to investigate the run times of the various sorting algorithms and to
get a feel for the real difference between Ω(n^{2}) and Ω(n*log(n)). To do this,
you will construct arrays filled with random values, use various methods to sort them, and measure
the run times.

**The work from this lab is due before the next lab, on Thursday, February 4. You should copy the program's project
folder from your Eclipse workspace into your homework folder in the directory /classes/s10/cs225/homework.
For the written lab report, you can hand it in in class or include it in the material that you copy into
your homework folder.**

The basic idea is to make an array filled with random numbers or random strings. (You will have to think about how to generate random strings.) It's a good idea to make multiple copies of an array, so that you can try several sorting algorithms on the same data.

To measure the time that it takes to sort an array, you can use the following pattern:

long startTime = System.currentTimeMillis(); ... // Sort the array long runTime = System.currenTimeMillis() - startTime();

This measures the run time in milliseconds, where 1 millisecond equals 1/1000 second. For a short array, the time might be measured as 1 or 2 or even 0 milliseconds, which is only a very crude measurement of the actual run time. You really want to use arrays that will give a run time of at least, say, a dozen milliseconds.

Make sure that you only measure the time that it takes to sort the array. Don't include the time that it takes to fill the array with random values.

**You should apply SlowSort.insertionSort, SlowSort.selectionSort, and Arrays.sort
to arrays of type double[] and String[], at least. You can do int[] too, if you want.
You should sort arrays of size
1000, 10000, and 100000, at least. For Arrays.sort, you can also sort arrays of several million
items -- or even larger if you ask how to increase the amount of memory that is made available to your
Java program.**

You do not have to write a single program to run all the arrays. You can write several programs, or a program that can be easily modified to perform different experiments. Your program should, however, output the results in a clear format.

For the written lab report, you should first of all present the results. Tables might be a good format for this.

Second, you should have a brief discussion of the comparisons between the run times of different algorithms.
How does InsertionSort compare to SelectionSort? How do they compare to the algorithms used by `Arrays.sort`?
Is there any difference between the sort times for `double[]` and `String[]`?

Third, you should try to come up with some explanations for the observations that you have made.

Finally, I would like you to do a bit of math. Saying that the run time of an algorithm is
Ω(n^{2}) means that the run time is approximately proportional to n^{2}.
So, we can expect that there is some constant K such that the run time is approximately
equal to K*n^{2}. If you have measured the run time, it is possible to solve for K and
say that K is approximately equal to the run time divided by n^{2} (and we can expect the
approximation to be better for larger values of n.

This means that you can use your run-time measurements for a particular algorithm to find
an approximate value for the constant K for that algorithm. Once you have K, you can use the formula
K*n^{2} to **predict** the run time for array sizes n that you have not measured. **I would
like you to do this to make predictions about at least the following:**

- How long would it take to sort an array of one million
`doubles`using insertion sort? - How long would it take to sort an array of one billion
`doubles`using insertion sort? - How long would it take to sort an array of one billion
`doubles`using`Arrays.sort`?

Present your actual numerical predictions. It would be nice to convert from milliseconds to a more convenient measure of time.