CPSC 327, Spring 2004
Final Assignment

This assignment is due on the last day of class. It is an individual assignment, and you should not work with other people on it. There are two parts to the assignment. The first part is a long written assignment that brings together a lot of ideas from the course and applies them to an important practical problem. We will talk about this part of the assignment in class. The second part is a programming assignment that asks you to program your choice of several graph algorithms that we have looked at recently.


Part 1: Sparse Matrices

A matrix is just a two dimensional array of numbers. We will consider only matrices that have the same number of rows as columns. This number will be denoted by n. A vector is just a one-dimensional array of numbers. We will consider vectors that have n elements. Using the standard representation for matrices and vectors, we might declare some matrices and vectors as follows (where n is a constant):

            double A[n][n], B[n][n], C[n][n];
            double V[n], W[n];

It is possible to multiply one matrix by another; the result is another matrix. It is also possible to multiply a matrix by a vector, giving another vector. Using the standard representation given above, we can define these operations with the following code:

            // Set matrix C equal to matrix A multiplied by matrix B
            
            for (int i = 0; i < n; i++) {
               for (int j = 0; j < n; j++) {
                  double sum = 0;
                  for (int k = 0; k < n; k++)
                     sum  +=  A[i][k] * B[k][j];
                  C[i][j] = sum;
               }
            }
            
            // Set vector W equal to matrix A multiplied by vector V
            
            for (int i = 0; i < n; i++) {
               double sum = 0;
               for (int k = 0; k < n; k++)
                  sum  +=  A[i][k] * V[k];
               W[i] = sum;
            }

Note that the run time of this standard matrix multiplication algorithm is Theta(n3) and the run time for multiplying a vector by a matrix is Theta(n2).

A sparse matrix is a matrix in which most of the entries are zero. This problem deals with representing sparse matrices and with multiplying them by other matrices and by vectors. Let's define the fill ratio of a sparse matrix as the fraction of entries that are non-zero.

(a) A sparse matrix, A, can be represented as a one-dimensional array of n linked lists. The ith list will store all the non-zero entries in the ith row of the matrix. Give C++ definitions for the data types needed for this representation. For which fill ratios, r, will your representation result in a savings in the total storage space used for the array? (Your answer will depend on the specific data types and data structures used.)

(b) Discuss how the basic matrix operations -- reading and setting the value of an element in the array -- would be handled using the representation from part (a). Note that the value of the element might change from non-zero to zero. What is the complexity of the operations? How does this compare with the standard matrix representation?

(c) Suppose that you want to multiply a sparse matrix by a vector, where the representation from part (a) is used for the matrix (and the usual representation is used for the vector). Give an algorithm in C++ code or pseudocode for this operation. Analyze the algorithm.

(d) Now suppose that you want to multiply two sparse matrices, A and B, which each have n rows and columns. Write a pseudocode algorithm for performing the multiplication using the representation from part (a). Analyze your algorithm. Consider the possibility of adding a second array of linked lists to the sparse matrix representation, where the second array is used to represent the columns of the matrix rather than the rows. Could this be used to improve the performance of matrix multiplication?

(e) Suppose you are writing a program that will work with sparse matrices. (They are very common, for example, in simulations of dynamical systems such as the weather.) Would you use the standard array representation, or the representation from part (a), or possibly the double linked-list representation from part (d)? How would you decide? How would your answer depend on the expected fill ratios of the matrices? How would it depend on the size, n, of the arrays? How would it depend on the operations that you want to perform on the matrices? (It is not possible to give a completely general, definitive answer to these questions. Give me the type of answer you would give if your job depended on it.)


Part 2: Programming a Graph Algorithm

You should select any one of the following programming problems to work on for this part of the assignment. Your grade will not be completely based on whether your program works. The grade will also be based on program design, style, and generality.

Option 1: Kruskal's Algorithm. Write a program that implements Kruskal's minimal spanning tree algorithm. The data for the undirected weighted graph should be read from a file. You should determine a file format that will allow you to represent any undirected weighted graph. For extra credit, implement the efficient UNION/FIND operation that we discussed in class for checking whether two vertices are in the same component.

Option 2: Dijkstra's Algorithm. Write a program that implements Dijkstra's algorithm for finding shortest paths in a directed weighted graph. The data for the undirected weighted graph should be read from a file. You should determine a file format that will allow you to represent any undirected weighted graph. You will also need some way of specifying the starting vertex for the algorithm For extra credit, use a priority queue to hold the candidate vertices, as discussed at the very end of Chapter 9.

Option 3: Graph Coloring. Write a program for finding a coloring of an undirected graph, using the "Sequential Coloring" approximation algorithm given on pages 346-347 of the NP-completeness handout. The data for the graph should be read from a file. You should determine a file format that can represent any undirected graph. Apply your program to the graph defined in the file /home/cs327/us_map_graph.txt. (This file represents a map of the United States, minus Alaska and Hawaii, and coloring the graph is equivalent to coloring the map. The file /home/cs327/us_map_graph_key.txt shows which vertex is associated with each state. These files were prepared by Oscar Barney) You might have to convert the file format to your own input file format. For extra credit, try sorting the vertices by degree before applying the algorithm, and report on how this affects the number of colors used.

Option 4: Random Maze. In class, we discussed how a random maze can be created by finding a minimal spanning tree for a certain undirected graph. Implement this algorithm. Your program should output some visual representation of the graph. (Consider writing your program in Java and drawing the graph.) The edges of the graph can be considered in any order. A random order will give an OK maze. For extra credit, develop some strategies for making a more attractive maze by considering the edges in other orders. For example, you might want the maze to contain some long straight corridors or corridors that spiral for a while. The strategies should only affect the order in which the edges are considered for inclusion in the minimal spanning tree. You can also get extra credit by implementing the efficient UNION/FIND operation that we discussed in class for checking whether two vertices are in the same component.


David Eck