## CPSC 327, Spring 2019

Homework 6: Graph Programs

This homework is a programming assignment. It will be due sometime after Spring break. Note there will also be a take-home exam some time soon after the end of Spring. We will discuss timing for the exam and for the due date for this assignment.

You can work on this assignment alone or with two or three other people from the class. However, team projects will require more programming.

For the assignment, you will write several programs that implement graph algorithms. To represent graphs
in the program, you are required to use the Graph class defined in *Graph.java*.
You can find a copy in /classes/cs327. You should read the
Java file carefully to learn the API for the class. You will also need a copy
of weighted-graph.txt, which contains data for a small, 16-vertex
weighted undirected graph. You will use it as a test case for minimal spanning tree algorithms.

To submit your work, copy all the files required to run your programs to your homework folder in /classes/cs327/homework. If you are working as part of a team, only one member of the team should submit the work; please make sure that all team members' names are included at the comments at the top of each program that you write. Your homework folder has been emptied; I will print all Java files in homework directories, except for Graph.java and (just in case anyone uses it) TextIO.java.

### About Term Projects

**In addition to working on the programming assignment, you should be thinking about
term projects. You should meet with me sometime before Spring break or, at the latest,
in the week after Spring break, to discuss and choose a topic. You should research
possible topics before that meeting.**

### Connected Components in Random Graphs

This part of the assignment will be done both by individuals and by teams.

What happens when you create a random undirected graph by adding random edges to it? In particular, how likely is it that the vertices will be all, or mostly, connected? The number of edges that you need to get an almost connected, or even a fully connected, graph is probably smaller than you might guess. Here is a function that creates a random graph, using a seeded random number generator so that the result will be reproducible:

/** * Creates a random undirected graph with specified numbers of vertices * and edges, using a pseudo-random number generator initialized with a * given seed. */ private static Graph randomGraph( int vertexCt, int edgeCt, int seed ) { Random rand = new Random(seed); Graph grph; grph = new Graph(vertexCt); while (grph.getEdgeCount() < edgeCt) { int a = rand.nextInt(vertexCt); int b = rand.nextInt(vertexCt); if ( a != b && ! grph.edgeExists(a, b) ) { grph.addEdge(a, b); } } return grph; }

You will want to open this page in a web browser so you can copy-and-paste this method into your program.

Write a Java method for finding the *size of the largest connected component* in an
undirected graph. To do this, you will have to find all connected components. You should
use breadth-first search. (For the queue that you need to implement breadth-first-search,
it is acceptable to use Java's *LinkedList* class.)
We did an example in class of finding the size of every connected
component, and you can base your program on that example.
You can try your method on the graph generated by calling *randomGraph*(100,100,1). The
answer should be 86. Your method should also work for a fairly large graph, such
as the one generated by *randomGraph(250000,300000,1)*. The answer should be 219570.
**You should write a
program that begins by running your method for these two graphs; it should print out
the number or vertices in the largest component in each of the two graphs.**

Next, you want to investigate the following question:
* For a random graph with some given number of vertices, N, how does the size of the maximum connected component
depend on the number of edges in the graph?*
You should look at how the size of the largest connected component varies as
the number of edges increases from about (1/10)*N to about 2*N. You might, for example,
look at graphs with 10000 vertices and with 1000, 2000, 3000, ..., 20000 edges.
Note that if the number of edges is the same as the number of vertices, then
each vertex is connected, on average, to just one other vertex. If the number of
edges is 2*N, then each vertex is connected on average to just two other vertices.
So, the graphs that you are looking at are not highly interconnected.

Finally, you also want to investigate the question: * For a random graph with
a given number of vertices, N, how many edges does the graph need for it to be almost certain that
the graph has only one connected component?* That is
to say, how many edges have to be added in order to be almost certain that

**all**of the vertices are contained in a single connected component whose size is equal to the total number of vertices in the graph. The answer will be somewhat more than 2*N, but will not be huge. Use a fairly large graph, such as 10000 vertices, or try it for several different numbers of vertices.

You should design a program (or several programs, if you prefer) that you can use to investigate these questions. You should report the findings of your investigation either in a long comment in your program or in a separate hard-copy document that you hand in in class. You should not simply write a program, run it once, and report the result! I want to know what interesting things you discover about the behavior of random graphs, and what evidence you have.

### Minimal Spanning Trees

For the second part of the assignment, you will work with Prim's algorithm and/or Kruskal's algorithm for finding a Minimal Spanning Tree in a weighted, undirected graph. There are different requirements for individual and for teams. Specifically,

- If you are working alone can can do Kruskal's algorithm (including the UNION-FIND) operation, or a basic version of Prim's algorithm.
- If you are working with a partner, the two of you should do
**both**Kruskal's algorithm (including the UNION-FIND) operation, and a basic version of Prim's algorithm. - If you are working with two other people, the three of you should do
**both**Kruskal's algorithm (including the UNION-FIND) and a version of Prim's algorithm that uses a heap to speed up the selection of edges to add to the tree. Since the heap that is required has some non-standard behavior, this is somewhat more difficult than either Kruskal's algorithm or the basic version of Prim's algorithm.

We will discuss all of these algorithms in class before Spring break.

**For whatever MST algorithm you are implementing, you should first apply
the algorithm to the graph specified in the file weighted-graph.txt.**
This file defines a connected, weighted, undirected graph.
The format of the file is as follows: The first line of the file contains two integers
giving the number of vertices and the number of edges in the graph. After that, there is
one line for each edge, containing the vertex numbers of the two endpoints of the edge
and the weight of the edge. The vertex numbers are integers, but the weight can be a real number.
**Your method for reading a graph from a file should be able to read any file in a similar
format.** (The MST for this graph has total weight 45.)

The second graph is a random graph generated by calling
*randomWeightedGraph*(100,300,10,1) where *randomWeightedGraph* is the
following method:

static Graph randomWeightedGraph(int v, int e, int maxWeight, long seed) { Random rand = new Random(seed); Graph g = new Graph(v); while (g.getEdgeCount() < e) { int x = rand.nextInt(v); int y = rand.nextInt(v); if (x != y && ! g.edgeExists(x, y)) g.addEdge(x, y, 1 + rand.nextInt(maxWeight)); } return g; }

This graph generated by *randomWeightedGraph*(100,300,10,1) has a MST with total weight 264.

As a third example, you should run your algorithm on the graph generated by calling
*randomWeightedGraph*(50000,300000,100,8).

Finally, **except for the basic version of Prim's algorithm**,
you should try your method on the random graph generaged by
*randomWeightedGraph*(300000,5000000,1000,2). It took my (old) computer about 7 seconds
to generate this graph, about 2 seconds to find the MST using Kruskal's algorithm,
and about one second to find the MST using the improved version of Prim's algorithm.
(The basic version of Prim's algorithm is too slow to run on this graph.)