CS 124, Spring 2021
Lab 5b: Array of Words

Today's lab asks you to do some work with an array. This lab combines with last week's lab. The work for both labs is due by the end of the day next Monday, March 8, and can be turned in late with a 10% penalty until noon on Saturday, March 13. You will turn in two files on the submission web site: Art.java and Words.java.

The exercise for this lab involves some basic word processing: Read all of the words from a file. Count the number of different words that were found. Test whether words entered by the user were in the file or not.

You should add a copy of the file Words.java to the src folder in an Eclipse project. (It could be the same project that you use for Art.java.) You should also add a copy of the file alice-in-wonderland.txt to the same project, but that file should be outside the src folder, as discussed below.

This lab is an individual assignment. Keep in mind that you are required to design your own algorithm and write your own program. You can discuss the assignment with other people, but you must design and code your own program.

Reading Words from a File

The file Words.java already reads the words from the file that contains the book Alice in Wonderland, although it doesn't do anything with the words that it reads. For reading words, it uses a subroutine readNextWord() that is already defined in the same file. (We will cover writing subroutines in class this week.) You can read the comment on the subroutine definition to find out exactly what it does, but you don't need to understand how it works to do this lab. All you need to know is that it returns one word at a time from the file.

The program uses TextIO to read from the file. Using TextIO with files is discussed in Subsection 2.4.4 in the textbook. The command TextIO.readFile(filepath) tells TextIO to start reading from the file specified by filepath instead of from the user. The command TextIO.readStandardInput() tells it to go back to reading from the user. But again, you don't really need to understand this to do this lab.

You do need to understand what to do with the file alice-in-wonderland.txt, which has to be in the right place in order for TextIO to read it. The parameter in TextIO.readFile(filepath) is a path to the file. If you use a simple file name such as "alice-in-wonderland.txt" as the filepath, then the file must be in the "current" or "working" directory when the program is run. When you run a program in Eclipse, the current directory is the directory that contains the project. This means that the file must be in the project, but not inside the src directory.
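
If the program cannot find the file, one way to investigate is to print the working directory and check whether the file is visible from there. The following is only a sketch that uses the standard java.io.File class instead of TextIO; the class name WorkingDirectoryCheck and the helper existsInWorkingDirectory are hypothetical names chosen for illustration and are not part of Words.java.

```java
import java.io.File;

class WorkingDirectoryCheck {

    // Returns true if a regular file with the given simple name can be
    // found relative to the current working directory.
    static boolean existsInWorkingDirectory(String fileName) {
        return new File(fileName).isFile();
    }

    public static void main(String[] args) {
        // "user.dir" is the standard system property that holds the working directory.
        System.out.println("Working directory: " + System.getProperty("user.dir"));
        String name = "alice-in-wonderland.txt";
        if (existsInWorkingDirectory(name)) {
            System.out.println(name + " was found.");
        } else {
            System.out.println(name + " was NOT found; check that it is in the project but outside src.");
        }
    }
}
```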

You can download alice-in-wonderland.txt to your computer and copy-and-paste it into the project that contains Words.java. When pasting, right-click on the project name, and select Paste from the popup menu. If you accidentally paste it into a folder inside the project, you can drag it from there onto the project name to move it out of the folder.

Exercise: Array of Words

The first step for this lab is to store the words from the file into an array. Add an array of String to the program. You should not assume that you know how many words are in the file, so make a big array that you are sure will be large enough to hold all of the words. (There are certainly fewer than 100000 words in the book.) You will be using the "partially full array" idea from Subsection 3.8.4, so you also need a counter to keep track of how many words are in the array. As each word is read from the file, you should add it to the array.
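
The "partially full array" idea can be illustrated with a small stand-alone sketch. It is not the code for Words.java: the words here come from splitting a short string rather than from readNextWord(), and the names PartiallyFullArrayDemo and storeWords are hypothetical. The point is only that the array is created larger than needed and a separate counter records how many positions are actually in use.

```java
class PartiallyFullArrayDemo {

    // Copies the whitespace-separated words of text into the array,
    // starting at position 0, and returns how many words were stored.
    // Assumes the array is large enough to hold all of the words.
    static int storeWords(String text, String[] words) {
        int count = 0;  // number of array positions currently in use
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;  // happens only when text contains no words at all
            }
            words[count] = word;  // the next free position is always words[count]
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] words = new String[100];  // deliberately bigger than needed
        int count = storeWords("down the rabbit hole went the rabbit", words);
        System.out.println("Words stored: " + count);
        for (int i = 0; i < count; i++) {  // loop to count, not to words.length
            System.out.println(words[i]);
        }
    }
}
```

Positions from count up to the end of the array still hold null, which is why later loops should stop at count rather than at the array's length.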

To make it easier to find the number of different words, you should sort the array. You do not need to write the code to do that. Java has a built-in method for sorting an array of strings. It is defined in a class named Arrays in the standard package named java.util. To use the subroutine, you have to import that class into your program by adding

import java.util.Arrays;

to the top of Words.java. To sort a partially full array, you can then use a command of the form

Arrays.sort( array, 0, count );

where array is the name of the array variable, and count is the name of the variable that counts the number of items stored in the array. (Note: "array" is not a good name for a variable in a real program! Use the correct names for the variables in your program.)
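
As a small illustration of the three-parameter form, the sketch below sorts only the used part of a partially full array. The class name SortRangeDemo, the helper sortFilled, and the sample words are made up for this example. The second parameter of Arrays.sort is the index where sorting starts and the third is the index just past the last item to be sorted, so positions from count onward are left alone.

```java
import java.util.Arrays;

class SortRangeDemo {

    // Sorts positions 0 through count-1 of the array into the natural
    // String order; positions from count onward are not touched.
    static void sortFilled(String[] words, int count) {
        Arrays.sort(words, 0, count);
    }

    public static void main(String[] args) {
        String[] words = new String[6];  // room for 6 words, but only 4 are used
        words[0] = "pear";
        words[1] = "apple";
        words[2] = "fig";
        words[3] = "apple";
        int count = 4;

        sortFilled(words, count);
        System.out.println(Arrays.toString(words));
    }
}
```

Restricting the sort to the used part matters because the unused positions of a String array contain null, and null entries cannot be compared with strings.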

After you have sorted the array, you can go through the array and count the number of different words that it contains. As a hint, a word in the sorted array should be counted only if it is not equal to the word that precedes it in the array (but be careful about how you handle the very first word in the array). Remember that when testing two strings for equality, you should use str1.equals(str2), not str1 == str2.
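
The hint can be tried out on a tiny, already sorted array. The following is one possible sketch under that assumption, not the required design; the names DistinctCountDemo and countDistinct are hypothetical. It treats the first word as a special case (it has no predecessor, so it always counts) and then compares each later word with the one before it.

```java
class DistinctCountDemo {

    // Counts the different words among sorted[0] .. sorted[count-1].
    // Assumes that part of the array is already sorted, so that equal
    // words are next to each other.
    static int countDistinct(String[] sorted, int count) {
        if (count == 0) {
            return 0;  // no words at all
        }
        int distinct = 1;  // the first word has no predecessor, so it always counts
        for (int i = 1; i < count; i++) {
            if (!sorted[i].equals(sorted[i - 1])) {  // equals, not ==
                distinct++;
            }
        }
        return distinct;
    }

    public static void main(String[] args) {
        String[] sorted = new String[10];
        sorted[0] = "alice";
        sorted[1] = "alice";
        sorted[2] = "cat";
        sorted[3] = "hatter";
        sorted[4] = "hatter";
        int count = 5;
        System.out.println("Total words: " + count);
        System.out.println("Different words: " + countDistinct(sorted, count));
    }
}
```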

Your program should report the total number of words read from the file and the number of different words.

Finally, the program should go into a loop where it lets the user enter words and checks whether they are in the array or not. In the loop, read a word from the user, test whether it is in the array, and tell the user either that it occurred in the book or that it did not occur in the book. There has to be some way for the user to end the loop; you can decide how to do that.
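
The general shape of such a loop is sketched below. This is an illustration under several assumptions rather than a model solution: it reads input with java.util.Scanner instead of TextIO so that it is self-contained, it uses a three-word sample array in place of the words from the book, and it ends the loop on an empty line, which is just one possible choice. The names WordLookupDemo and contains are hypothetical.

```java
import java.util.Scanner;

class WordLookupDemo {

    // Returns true if target is equal to one of words[0] .. words[count-1].
    // This is a simple linear search through the used part of the array.
    static boolean contains(String[] words, int count, String target) {
        for (int i = 0; i < count; i++) {
            if (words[i].equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] words = new String[10];  // small sample in place of the book's words
        words[0] = "alice";
        words[1] = "hatter";
        words[2] = "rabbit";
        int count = 3;

        Scanner in = new Scanner(System.in);
        System.out.println("Enter words to look up; enter an empty line to quit.");
        while (in.hasNextLine()) {
            String word = in.nextLine().trim();
            if (word.isEmpty()) {
                break;  // an empty line ends the loop
            }
            if (contains(words, count, word)) {
                System.out.println("\"" + word + "\" occurred in the book.");
            } else {
                System.out.println("\"" + word + "\" did not occur in the book.");
            }
        }
    }
}
```

Note that equals is case-sensitive, so "Alice" and "alice" are different strings to this test. Because the array in the lab is sorted, Arrays.binarySearch(words, 0, count, target) from java.util.Arrays is an alternative to the linear search; it returns a non-negative index when the word is present.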