## CPSC 225: Intermediate Programming, Spring 2002 Programming Assignment 7: Binary Sort Trees

ASSIGNMENT SEVEN asks you to do some work with Binary Sort Trees. It is due next Wednesday, April 3.

We discussed Binary Sort Trees (BST's) in class and looked at recursive subroutines for traversing a BST and for inserting an item into a BST. If items are inserted into a BST in a random order, then the process of searching for an item in the tree is very efficient -- much more than searching a linked list. Searching for an item in a linked list with N items takes N/2 steps on average, even if the list is sorted. For a linked list, the number of steps is closer to log2(N), which is much smaller. For this assignment, you will verify this fact experimentally by constructing and analyzing random BST trees. The trees will contain integers as items, so the node type for the trees is:

```                 struct TreeNode {
int item;
TreeNode *left;
TreeNode *right;
};
```

You will need the insertBST method that we looked at in class. To make a random BST tree with a given number, N, of nodes, start with an empty tree. Then generate N random ints and insert them into the tree one by one. When you are working on this part of the program, you might want to use the printBST method to make sure that the items in the tree are in fact in the correct order, but the final version of the program will not print the trees.

Once you have a random BST, you can ask how many steps it takes to find an item in the tree. There are two different questions you can ask: What is the maximum number of steps, and what is the average number of steps. Two terms can be helpful here.

• The depth of a node in a binary tree is the number of nodes on a path from the root node to the given node. The root is at depth 1. The children of the root are at depth 2, the children of the children of the root are at depth 3, and so on.
• The height of a binary tree is the maximum depth of any node in the tree. It is the number of nodes on the longest path starting from the root.

The height of a tree is the maximum number of steps it will take to find an item in the tree. The average number of steps is equal to the average depth of all the nodes in the tree; it can be found by adding up the depths of all the nodes and dividing by the number of nodes. To find these values, you will need two recursive functions: a function to find the height of a tree and a function to add up the depths of all the nodes in the tree. To write the height function, you should think about how the height of a non-empty tree can be computed from the heights of its sub-trees. In the sum-of-depths function, you can keep track of the depths of the nodes that the function visits by using an int parameter (named depth perhaps) that goes up by one every time the function calls itself recursively.

You should also have some way of deleting all the nodes of a tree. Since your program will be building a lot of trees, you should reclaim the memory used for each tree after you are done with it. You can write a recursive function to delete the nodes of a tree. (It is not sufficient just to apply the delete operator to the root node, since that will not delete the nodes in the subtrees.)

The object of the program is to run experiments in which you create random BST's and determine their heights and average depths. You should write a function

```       void experiment(int numberOfNodes, int &height, double &averageDepth);
```

to run one experiment. The first parameter gives the number of nodes in the tree. The second parameter is used to return the height of the tree that is created. The third parameter is used to return the average depth of all the nodes in the tree.

To review, here are the functions you will need:

• find the height of a binary tree
• find the sum of the depths of the nodes in a binary tree
• delete all the nodes of a binary tree
• Create a random BST with N nodes, and find its height and average depth

Your main program should create 1000 random trees containing a given number of items. Do this by calling the last of the above functions 1000 times. The program should report the average height of the 1000 trees and the average average depth of the 1000 trees. You might also want to print out the maximum height that was observed for all the trees.

You should do this for a variety of sizes (either by having the program do it automatically or by running the program several times for given sizes). Write a paragraph describing how the answers depend on the number of items in the tree, and turn in this paragraph along with a printout of your program. You should place a copy of the program in your homework directory.

### Final Project

As I mentioned in class, the final assignment for this class is to come up with an idea for a programming project and to do it. It will only count for 30 points, and it does not have to be much longer than the rest of the assignments for the class. The main point is for you to design a project that you want to do. I might throw out ideas for projects in class, but I will not specify any details. The project can be either a program or a small library of classes that could be useful in a variety of programs. You are encouraged to discuss your ideas with me. The project can be turned in any time up to the day after the final exam. (The exam is on Saturday, May 4.)

David Eck, 25 March 2002