CPSC 327, Spring 2019
Homework 6: Graph Programs

This homework is a programming assignment. It will be due sometime after Spring break. Note there will also be a take-home exam some time soon after the end of Spring. We will discuss timing for the exam and for the due date for this assignment.

You can work on this assignment alone or with two or three other people from the class. However, team projects will require more programming.

For the assignment, you will write several programs that implement graph algorithms. To represent graphs in the program, you are required to use the Graph class defined in Graph.java. You can find a copy in /classes/cs327. You should read the Java file carefully to learn the API for the class. You will also need a copy of weighted-graph.txt, which contains data for a small, 16-vertex weighted undirected graph. You will use it as a test case for minimal spanning tree algorithms.

To submit your work, copy all the files required to run your programs to your homework folder in /classes/cs327/homework. If you are working as part of a team, only one member of the team should submit the work; please make sure that all team members' names are included at the comments at the top of each program that you write. Your homework folder has been emptied; I will print all Java files in homework directories, except for Graph.java and (just in case anyone uses it) TextIO.java.

About Term Projects

In addition to working on the programming assignment, you should be thinking about term projects. You should meet with me sometime before Spring break or, at the latest, in the week after Spring break, to discuss and choose a topic. You should research possible topics before that meeting.

Connected Components in Random Graphs

This part of the assignment will be done both by individuals and by teams.

What happens when you create a random undirected graph by adding random edges to it? In particular, how likely is it that the vertices will be all, or mostly, connected? The number of edges that you need to get an almost connected, or even a fully connected, graph is probably smaller than you might guess. Here is a function that creates a random graph, using a seeded random number generator so that the result will be reproducible:

    /**
     * Creates a random undirected graph with specified numbers of vertices
     * and edges, using a pseudo-random number generator initialized with a
     * given seed.
     */
    private static Graph randomGraph( int vertexCt, int edgeCt, int seed ) {
        Random rand = new Random(seed);
        Graph grph;
        grph = new Graph(vertexCt);
        while (grph.getEdgeCount() < edgeCt) {
            int a = rand.nextInt(vertexCt);
            int b = rand.nextInt(vertexCt);
            if ( a != b && ! grph.edgeExists(a, b) ) {
                grph.addEdge(a, b);
            }
        }
        return grph;
    }

You will want to open this page in a web browser so you can copy-and-paste this method into your program.

Write a Java method for finding the size of the largest connected component in an undirected graph. To do this, you will have to find all connected components. You should use breadth-first search. (For the queue that you need to implement breadth-first-search, it is acceptable to use Java's LinkedList class.) We did an example in class of finding the size of every connected component, and you can base your program on that example. You can try your method on the graph generated by calling randomGraph(100,100,1). The answer should be 86. Your method should also work for a fairly large graph, such as the one generated by randomGraph(250000,300000,1). The answer should be 219570. You should write a program that begins by running your method for these two graphs; it should print out the number or vertices in the largest component in each of the two graphs.

Next, you want to investigate the following question: For a random graph with some given number of vertices, N, how does the size of the maximum connected component depend on the number of edges in the graph? You should look at how the size of the largest connected component varies as the number of edges increases from about (1/10)*N to about 2*N. You might, for example, look at graphs with 10000 vertices and with 1000, 2000, 3000, ..., 20000 edges. Note that if the number of edges is the same as the number of vertices, then each vertex is connected, on average, to just one other vertex. If the number of edges is 2*N, then each vertex is connected on average to just two other vertices. So, the graphs that you are looking at are not highly interconnected.

Finally, you also want to investigate the question: For a random graph with a given number of vertices, N, how many edges does the graph need for it to be almost certain that the graph has only one connected component? That is to say, how many edges have to be added in order to be almost certain that all of the vertices are contained in a single connected component whose size is equal to the total number of vertices in the graph. The answer will be somewhat more than 2*N, but will not be huge. Use a fairly large graph, such as 10000 vertices, or try it for several different numbers of vertices.

You should design a program (or several programs, if you prefer) that you can use to investigate these questions. You should report the findings of your investigation either in a long comment in your program or in a separate hard-copy document that you hand in in class. You should not simply write a program, run it once, and report the result! I want to know what interesting things you discover about the behavior of random graphs, and what evidence you have.

Minimal Spanning Trees

For the second part of the assignment, you will work with Prim's algorithm and/or Kruskal's algorithm for finding a Minimal Spanning Tree in a weighted, undirected graph. There are different requirements for individual and for teams. Specifically,

We will discuss all of these algorithms in class before Spring break.

For whatever MST algorithm you are implementing, you should first apply the algorithm to the graph specified in the file weighted-graph.txt. This file defines a connected, weighted, undirected graph. The format of the file is as follows: The first line of the file contains two integers giving the number of vertices and the number of edges in the graph. After that, there is one line for each edge, containing the vertex numbers of the two endpoints of the edge and the weight of the edge. The vertex numbers are integers, but the weight can be a real number. Your method for reading a graph from a file should be able to read any file in a similar format. (The MST for this graph has total weight 45.)

The second graph is a random graph generated by calling randomWeightedGraph(100,300,10,1) where randomWeightedGraph is the following method:

    static Graph randomWeightedGraph(int v, int e, int maxWeight, long seed) {
        Random rand = new Random(seed);
        Graph g = new Graph(v);
        while (g.getEdgeCount() < e) {
            int x = rand.nextInt(v);
            int y = rand.nextInt(v);
            if (x != y && ! g.edgeExists(x, y))
                g.addEdge(x, y, 1 + rand.nextInt(maxWeight));
        }
        return g;
    }

This graph generated by randomWeightedGraph(100,300,10,1) has a MST with total weight 264.

As a third example, you should run your algorithm on the graph generated by calling randomWeightedGraph(50000,300000,100,8).

Finally, except for the basic version of Prim's algorithm, you should try your method on the random graph generaged by randomWeightedGraph(300000,5000000,1000,2). It took my (old) computer about 7 seconds to generate this graph, about 2 seconds to find the MST using Kruskal's algorithm, and about one second to find the MST using the improved version of Prim's algorithm. (The basic version of Prim's algorithm is too slow to run on this graph.)