CPSC 225, Spring 2010
Lab 1: Spellchecking (plus Eclipse)


The lab this week has a programming assignment and a related written assignment. The program will be a basic spellchecker, and the written assignment will ask you to analyze the program. For those of you who are not already familiar with it, this lab also includes an introduction to programming with the Eclipse IDE (Integrated Development Environment).


Programming Assignment: Spellchecking

To help you get back into programming at the start of a new semester, this lab includes a moderately complex programming assignment. The assignment is due next Thursday, before the start of the next lab. Remember that your program will be graded for style as well as for correctness. It should follow all the rules in the style guide that was handed out on the first day of class, including the use of Javadoc comments (see below).

This assignment is to write a basic "spell checker" program that can check words entered by the user. The program will use a long list of correctly spelled words. Your program should read a word from the user and convert it to lower case. It should check whether the word is in the list of correctly spelled words. If it is, you should tell the user that the word is OK. If it is not, you should show the user a list of similar words that are in the list (if there are any).

You will need the files WordList.java and unsorted_words.txt, which you will find in the directory /classes/s10/cs225. You should add them to your Eclipse project. (For those of you who don't already know Eclipse, you will find instructions below for doing this.) The WordList class represents a list of correctly spelled words. Your program will use an object of type WordList, which has an instance method

public boolean contains(String lowerCaseWord)

that tests whether a given word is in the list. The file unsorted_words.txt contains the list of words and is read by WordList.java; you won't need to do anything with this file except add it to your project.

If you want to use TextIO.java for input, you can add that to your project too. You will find a copy in /classes/s10/cs225.

The hard part of this program is finding correctly spelled words that are "similar" to an incorrectly spelled word. The idea is to modify the incorrectly spelled word in certain ways, and look up the result in the list of correctly spelled words. Any correctly spelled words that you can find in this way go into the list of possible corrections.

You should implement the following ways of modifying the incorrectly spelled word:

  1. Delete a character. Try this for each character in the original word.
  2. Add a character. Try putting each of the 26 letters of the alphabet in each of the possible positions in the original word.
  3. Change a character. Try substituting each of the 26 letters of the alphabet for each of the characters in the original word.
  4. Swap two characters. Try reversing the order of each consecutive pair of characters in the original word.
  5. Insert a space. Try inserting a space into each possible position in the original word, breaking it into two words. In this case, you have to check that both of the words that you make are in the list of correctly spelled words.

Note that to make the modified words, you will have to "disassemble" the original word and put it back together with some changes. To do this, you will need to work with substrings of a string. Recall that if str is a variable of type String, then str.substring(0,n) is the substring consisting of the first n characters from the string, that is, the characters in positions 0 through n−1. And str.substring(n) is the substring consisting of the characters in position n through the end of the string.
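As a minimal sketch of this substring technique, the helper methods below each build one modified word from a position i in the original word. The class name EditSketch and the method names are hypothetical, chosen only for illustration; they are not part of the files provided for the lab.

```java
class EditSketch {

    /** Removes the character at position i, which must be a valid index. */
    public static String deleteAt(String word, int i) {
        return word.substring(0, i) + word.substring(i + 1);
    }

    /** Inserts the character c so that it ends up at position i. */
    public static String insertAt(String word, int i, char c) {
        return word.substring(0, i) + c + word.substring(i);
    }

    /** Replaces the character at position i with the character c. */
    public static String changeAt(String word, int i, char c) {
        return word.substring(0, i) + c + word.substring(i + 1);
    }

    /** Reverses the order of the characters at positions i and i+1. */
    public static String swapAt(String word, int i) {
        return word.substring(0, i) + word.charAt(i + 1) + word.charAt(i)
                + word.substring(i + 2);
    }

    public static void main(String[] args) {
        String word = "hte";
        System.out.println(deleteAt(word, 0));       // te
        System.out.println(insertAt(word, 3, 'm'));  // htem
        System.out.println(changeAt(word, 0, 't'));  // tte
        System.out.println(swapAt(word, 0));         // the
    }
}
```

Each method only produces a candidate string. Whether that candidate is a real word still has to be checked against the list of correctly spelled words.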

For an "A" on this assignment, your program should let the user input a series of words and should spellcheck each of them as discussed here. When it presents a list of alternative words to the user, it should not include any duplicates in the list. The program should be nice to the user: Prompt for inputs, label the output, and format the output nicely. The program should break up its task into subroutines. Furthermore, the program should follow all the rules of good programming style listed in the style guide.
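One possible way to keep duplicates out of the list of alternatives is to collect them in a java.util.Set, which ignores an attempt to add a value that it already contains. The sketch below is only an illustration under some assumptions: it requires Java 8 or later, it applies just two of the five modifications, and the isWord parameter is a hypothetical stand-in for a call to contains() on the WordList object. The names SuggestionSketch and suggest are also made up for this example.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

class SuggestionSketch {

    /**
     * Returns the correctly spelled words that can be made from word by
     * deleting one character or by swapping two neighboring characters.
     * No word appears more than once in the result.
     * @param word the misspelled word, in lower case
     * @param isWord a stand-in for the dictionary lookup (an assumption)
     * @return the suggestions, in the order in which they were first found
     */
    public static Set<String> suggest(String word, Predicate<String> isWord) {
        // A LinkedHashSet remembers insertion order and drops repeats.
        Set<String> found = new LinkedHashSet<>();
        for (int i = 0; i < word.length(); i++) {
            String candidate = word.substring(0, i) + word.substring(i + 1);
            if (isWord.test(candidate)) {
                found.add(candidate);
            }
        }
        for (int i = 0; i < word.length() - 1; i++) {
            String candidate = word.substring(0, i) + word.charAt(i + 1)
                    + word.charAt(i) + word.substring(i + 2);
            if (isWord.test(candidate)) {
                found.add(candidate);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Deleting any one of the three o's in "boook" gives "book",
        // but the set keeps only one copy of it.
        Predicate<String> isWord = w -> w.equals("book");
        System.out.println(suggest("boook", isWord));  // [book]
    }
}
```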

What to turn in and how to turn it in: To turn in your program, you should copy the program's project folder from your Eclipse workspace into the directory /classes/s10/cs225/homework/zz9999, where you should replace "zz9999" with your own user name. It is not necessary to make a printout of your work. If you have any problems doing this, ask for help during the lab next Thursday.


Written Assignment: Analysis of the Spellcheck Algorithm

The writing assignment is to write an essay discussing the run-time analysis of the spellcheck algorithm that is used in the programming assignment. You should take into account all the operations that are performed by your program, including the calls to the contains method of the WordList class. You should note that contains operates in the dumbest possible way: It simply does a linear search through an array of words, looking for the word in question. (We will discuss linear search and alternatives to it in class on Friday.)
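For reference, a linear search in general looks roughly like the following. This is only a sketch of the technique; the actual code in WordList.java may differ in its details, and the class and parameter names used here are hypothetical.

```java
class LinearSearchSketch {

    /**
     * Tests whether target occurs in the array words by examining the
     * array elements one at a time, from the first to the last.
     * @param words the array to search
     * @param target the word to look for
     * @return true if some element of words equals target
     */
    public static boolean contains(String[] words, String target) {
        for (int i = 0; i < words.length; i++) {
            if (words[i].equals(target)) {
                return true;   // found it; no need to look any further
            }
        }
        return false;  // every element was examined without a match
    }

    public static void main(String[] args) {
        String[] words = { "pear", "apple", "fig" };
        System.out.println(contains(words, "fig"));   // true
        System.out.println(contains(words, "plum"));  // false
    }
}
```

Note that a search for a word that is not in the array has to examine every element before it can answer. That case is worth keeping in mind for the analysis, since many of the modified words that the spellchecker tries will not be real words.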

I don't want just a "Big-Oh" analysis of the run time. You should be as specific as you can about the run time. Discuss the run time in terms of the length of the word that is being spellchecked and the number of words in the list of correctly spelled words. You can try to give some formulas that involve these quantities, and some other constants, but the main thing is to be clear about how the run time depends on the length of the words and the size of the list.
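As one possible starting point, here is a count of the dictionary lookups made for a misspelled word, not a complete analysis. The symbols are assumptions introduced for this sketch: n is the length of the word, N is the number of words in the list, and L(n) is the number of lookups. The count also assumes that all 26 letters are tried at each position, that a space is inserted only at the n-1 interior positions, and that both halves of a split are always looked up.

```latex
\begin{align*}
\text{deletions:} &\quad n \\
\text{additions:} &\quad 26(n+1) \\
\text{changes:}   &\quad 26n \\
\text{swaps:}     &\quad n-1 \\
\text{splits:}    &\quad 2(n-1) \\
L(n) &= n + 26(n+1) + 26n + (n-1) + 2(n-1) = 56n + 23
\end{align*}
```

Under these assumptions, and counting the first unsuccessful lookup of the original word, the program makes at most 56n + 24 lookups. With a linear search, each lookup compares the candidate against at most N words. Your own program might differ in the details, and a full analysis should also account for the cost of building the candidate strings and of comparing strings.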

Note that while you should work on the programming assignment on your own, you are allowed to discuss this writing assignment with other students. You should, however, write your own essay.

What to turn in and how to turn it in: For written assignments like this one, you can either turn in a hard copy in class, or you can submit a file to the homework directory in /classes/s10/cs225/homework. You can add the file to your Eclipse project if you want. Files submitted as attachments to email messages are also acceptable. The written assignment for this lab is due along with the programming assignment, before the beginning of lab next Thursday.


Running Eclipse

For those of you already familiar with Eclipse: You can skim through the rest of this lab and get right to work on the assignments. If you have set up Eclipse to open a default workspace automatically, and if you would like to use a different workspace for this course, then you should: Start Eclipse. Go to the "Preferences" dialog. Open "General", then open "Startup and Shutdown", then click on "Workspaces". Check the box labeled "Prompt for workspace on startup". Restart Eclipse. You will be able to browse for a new workspace location as it starts up.

If you do not know Eclipse, read on...

Eclipse is already installed on the Linux computers that you will use during the lab. In KDE, you can access it in the "Development" sub-menu of the application menu. In Gnome, it is in the "Programming" sub-menu. You might want to add an "Eclipse" button to the panel at the top or bottom of the screen to make it easier to get to. (Ask for help on how to do this.)

Although it is not required for the course, I encourage you to consider running Eclipse on your own computer as well. It is available for Linux, Windows, and Mac OS.

To run Eclipse, you must first have a Java JDK Version 5.0 or higher working on your computer. If you do not already have it on your computer, you can download it. For Linux, the JDK should be included in your distribution's available software. Windows users can get the JDK 6 from Sun's Java download site, http://java.sun.com/javase/downloads/index.jsp. You need to download "JDK 6"; a link to installation instructions can be found on the same web page. Recent versions of Mac OS already include Java.

Once you have Java working, you can download the most recent version of Eclipse from the Eclipse downloads page, http://www.eclipse.org/downloads/. You can download the "Eclipse IDE for Java Developers"; you won't need the much larger "Eclipse IDE for Java EE Developers."

If you would like some help installing Java or Eclipse on your computer, ask for it.


Eclipse organizes your programming projects into "workspaces." Each workspace corresponds to a directory. These workspace directories hold all the files that are used by your projects, including Java source code files and compiled class files. Usually, you should let Eclipse have complete control over the workspace directories, and you should not directly change any files that are stored in them. It is possible to have as many workspaces as you want, but for this course, you will probably keep all of your projects in a single workspace.

When Eclipse starts up for the first time, it asks you to select a workspace. By default, the name of the workspace will be a directory named "workspace" in your home directory, but I usually change it to "eclipse-workspace". There is a box that you can check if you don't want to be asked to select a workspace each time you run Eclipse.

The first time that you run Eclipse, its window will be filled with a "Welcome" screen. The icons on this screen link to a large amount of information about Eclipse. You can browse through this information some time, if you like, but for now, just close the "Welcome" screen by clicking the "X" next to the word "Welcome" near the top-right corner of the window. If you want to get back to the "Welcome" screen, just select "Welcome" from the "Help" menu.

Eclipse uses the terms view and perspective to describe the way that information is organized and presented in its window. The window typically contains several views. Each view contains a certain type of information. All the views that are visible in the window constitute a perspective. A perspective is meant to contain all the views that are used in some particular activity. For now, you will just use the Java perspective, which is used for Java programming. Later, you will learn about the Debug perspective, which is for debugging programs. Each perspective is highly customizable. You can add and delete views and move them around. If you ever delete a view accidentally, you can get it back by selecting it from the "Show View" submenu in the "Window" menu.

After you have closed the Welcome screen, the window will show the Java perspective. Initially, all the views are empty. Here is what the Java perspective might look like after a little work has been done:

[Figure: Eclipse window showing the Java perspective]

In this window, I have closed a few of the views that I don't use: the "Outline" view, the "Task List" view, and the "Hierarchy" view. Remember that a view can be closed by clicking the small "X" next to the name of the view, and it can be reopened using the "Window/Show View" menu.

The "Package Explorer" view, on the left, is central to much of the work that you do in Eclipse. It contains a list of your programming projects and the Java files and resources that are contained in those projects. In the above picture, there is just one project, Assg1_Math_Quiz. Clicking on the small triangle next to the project name will show/hide the resources contained in the project.

The lower right section of the window contains several views. Currently, the "Console" view is showing. To see one of the other views, such as "Problems" or "Javadoc", just click on the tab that contains the name of the view. Sometimes, another view will pop up automatically in this area to show the output of some command. When you run a program, standard input and output are done in the "Console" view. Errors and warnings from the Java compiler are displayed in the "Problems" view.

The central area of the window is occupied by editor views. Here, I've opened two Java files, MathQuiz.java and TextIO.java, for editing. Currently, MathQuiz.java is showing; to see TextIO.java instead, click its name.

The view of MathQuiz.java shows several of the nifty features of the Java editor. The source code that you type is checked for syntax errors as you type. Errors are marked with small red carets at the bottom of the line, like the one at the end of the first line in the main() routine. The error is also marked by a red rectangle in the right margin of the editor; if you hover your mouse over the red rectangle, you see the error message for the error. In this case, you are told that a semicolon is missing. In the third line of main(), the word "prinltn" is underlined with error markers. (It's a misspelling of "println".) This error has an error light bulb in the left margin of the editor. The light bulb means that Eclipse is not just telling you about the error but is offering you some ideas for fixing it. If you click the light bulb, you get a list of actions that can be taken to fix the problem. For the above picture, the list is:

[Figure: pop-up list of fixes, showing "Change to println()" and other options]

Double clicking on an action, such as "Change to println()", will apply that option automatically. Sometimes, you will see a warning light bulb. A warning indicates a potential problem, but not an error. Warnings will not prevent the source code from being compiled successfully.

In fact, Eclipse might be a little too enthusiastic in marking warnings and errors. You do not necessarily have to fix every warning. And you do not have to fix every error as soon as it appears. In fact, it's impossible to do so, and in some cases the error will go away by itself after you've typed in more of your program. And remember that the fix for a programming error does not always go at the location of the error; sometimes the problem is elsewhere in the file. Furthermore, Eclipse's error system is only effective if you routinely get most of your program correct in the first place -- don't expect Eclipse to make solid Java programming skills unnecessary!


Your First Project

It's time for you to start using Eclipse. Start up Eclipse, as described above. Close the "Welcome" screen (and maybe the "Outline" view as well).

To create your first project, right-click in the "Package Explorer". In the pop-up menu that appears, go to the "New" submenu, and select "Java Project." A "Create Java Project" wizard will pop up. All you have to do is enter a name for the project, in the box labeled "Project Name". Enter "Lab1", or any other name that you like. (Warning: Don't use funny characters such as "&" or ":" in the name of the project, since they can cause problems when you work with the files outside Eclipse. The project name will be used as the name of a directory inside your workspace directory.) After entering the project name, click the "Finish" button. The project will be added to the "Package Explorer", and a directory of the same name will be created in your Eclipse workspace directory.

Now, you need to create a new Java class file in your project. To do this, right-click the project name in the "Package Explorer" view. In the pop-up menu, go to the "New" submenu again, and select "Class". A class creation dialog box will pop up. Again, you just have to fill in the name of the class, and click the "Finish" button. Note that you have to enter the name of the class, not the name of the file, so don't try to add a ".java" extension to the name. The name must be a valid Java class name. Use "SpellCheck" as the name. The class will be added to the package in the "Package Explorer" view, and the Java file will be created in the project directory. The class will be inside a "src" directory, which contains the source files for your project. Furthermore, a Java editor will open, showing you the initial contents of the file. As you can see, Eclipse has already added the declaration of the class. All you have to do is fill it in! Note that you can close the editor in the usual way, by clicking the X next to the file name; to open the file again, double-click its name in the "Package Explorer."

As you can see in the "Package Explorer," the class is added to the "default package." This just means that the class is not declared to belong to any package. Later in the term, you will be working with packages, but for now the default package is OK, even though the Eclipse class creation dialog will display a warning that "use of the default package is discouraged."

Eclipse often has alternative ways of doing things. Another way to create projects and classes is with the "New" submenu in the "File" menu. Even easier is to use the "create" buttons in the toolbar. These are a group of three small buttons at the top of the Eclipse window:

[Figure: toolbar buttons for creating projects, packages, and classes]

Click on the left button in this group to create a new project. Click the right button to create a new class. The middle button is for creating packages. (Note that when you create a class using the button, you should first click in the Package Explorer to select the location of the class that you are creating. Otherwise, you'll have to enter the location by hand in the class creation dialog box.)

Now, to create and run a program. Add the following main() routine, or something similar, to the class that you created above:

     public static void main(String[] args) {
         System.out.print("Welcome to SpellCheck");
     }

The program is compiled automatically. To run it, right-click either on the name of the class in the "Package Explorer" or on the editor window that contains the program. In the pop-up menu, go to the "Run As" submenu, and select "Java Application" from the submenu. The program will start. The output from the print statement appears in the "Console" view in the bottom right area of the Eclipse window.

After running the program once, you can run it again simply by clicking on the "Run" button in the toolbar. Clicking this button will re-run the program that was run most recently. If you click the little black triangle to the right of the button, you'll get a list of all the programs that you have run; select a program from this list to run it.


There are several files that you will need to add to your project. They can be found in the directory /classes/s10/cs225. You will certainly need the files WordList.java and unsorted_words.txt. The WordList.java file defines a class that you will need for the assignment, and unsorted_words.txt is a "resource" that is used by WordList. You might also want to use TextIO.java in the assignment, for reading the user's input (although you are welcome to use the standard Scanner class instead).

To add an existing file to the Eclipse project, you can copy-and-paste it from a directory window: Open a directory window that shows the files. Right-click the file icon in the directory window and select "Copy". Go back to the Eclipse window and right-click in the "Package Explorer" on the location where you want to put the file -- that will be the name of the project or the "src" directory inside that project in this case. Select "Paste" from the pop-up menu. A copy of the file should appear in your project.

You can now start working on the assignment. The main program should create an object of type WordList. It should read a word from the user, and convert that word to lower case. It should call the contains() method in the WordList object to test whether or not the word entered by the user is in the list. To test that much of the program, you can simply print out "yes" or "no" to tell the user whether the word exists. Once that is working, you can go on to the main part of the assignment, as described above.
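The shape of this first step might look something like the sketch below. Since WordList.java is not reproduced in this handout, the sketch uses a hypothetical stand-in for the word list: a Predicate<String> named isWord that accepts only two words. In your own program, you would create a WordList object and call its contains() method instead. The sketch requires Java 8 or later, and the names FirstStepSketch and check are made up for this example.

```java
import java.util.Scanner;
import java.util.function.Predicate;

class FirstStepSketch {

    /**
     * Converts the input to lower case and reports whether it is a word.
     * @param input the word as typed by the user
     * @param isWord a stand-in for the dictionary lookup (an assumption)
     * @return "yes" if the lower-case word passes the test, "no" otherwise
     */
    public static String check(String input, Predicate<String> isWord) {
        String word = input.toLowerCase();
        return isWord.test(word) ? "yes" : "no";
    }

    public static void main(String[] args) {
        // Hypothetical two-word dictionary, used only for this test run.
        Predicate<String> isWord = w -> w.equals("hello") || w.equals("world");
        Scanner in = new Scanner(System.in);
        System.out.print("Enter a word: ");
        if (in.hasNext()) {
            System.out.println(check(in.next(), isWord));
        }
    }
}
```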


You have surely already noticed that the Java editor in Eclipse does a certain amount of work for you. It automatically inserts indentation. When you type a "{" and press return, the matching "}" is inserted in the correct position. Furthermore, when you type a period, a list of things that can follow the period will pop up after a short delay, and you can select an item from the list instead of typing it in. This is called Content Assist. You can invoke Content Assist at any time while you are typing by pressing Control-Space. If you do this in the middle of a variable name or method name, pressing Control-Space will either complete the name or offer a list of possible completions. It can be used in other ways as well. For example, if you press Control-Space after typing the "(" at the beginning of a method's parameter list, you will get information about the parameters of the method. By the way, when Content Assist pops up a list, you can get rid of the list by pressing the Escape key.

Content Assist is a good thing, but I find the way it pops up automatically while I am typing to be very annoying. You can turn off this feature in the Eclipse Preferences. Select the "Preferences" command from the "Window" menu. In the Preferences dialog box, click the little triangle next to "Java" to open the list of Java preferences (if necessary), then click the triangle next to "Editor", and finally click "Content Assist." In the Content Assist preferences, uncheck the box labeled "Enable Auto Activation" and click "OK". It looks like this:

Eclipse Preferences Dialog

Eclipse has a huge number of preference settings that can be used to customize the environment. Most of the default settings are OK, but there are a few that I usually change. If you want to do the same: Under "Java / Editor / Mark Occurrences", turn off "Keep marks when the selection changes". Under "Java / Compiler / Errors/Warnings / Potential Programming Problems", change "Serializable class without serialVersionUID" from Warning to Ignore, and change "Possible accidental boolean assignment", "'Switch' case fall-through" and "Null reference" from Ignore to Warning.


Eclipse has a lot of other useful features. We will encounter more of them as time goes on, and you can undoubtedly discover a few new ones on your own. But here are a few of the most useful:


Javadoc

Javadoc is a standard syntax for comments in Java source code. Javadoc was introduced in Section 4.5 of the CPSC 124 textbook, and you can read about it there. Javadoc comments can be processed to produce documentation of your code's API in the form of web pages. The API documentation for Java's standard classes was created with Javadoc.

Javadoc is thoroughly integrated into Eclipse. If you highlight the name of a class, method, or variable, its Javadoc documentation (if any) appears in the "Javadoc" view. If you just hover your mouse over a name, the Javadoc documentation appears in a pop-up window. When Content Assist shows a list of variable or method names, it will also show Javadoc documentation for items in the list. This makes it very worthwhile to use Javadoc comments in your code, even if you don't plan to generate web pages from the comments.

Furthermore, Eclipse will generate the outline of a Javadoc comment for you. To have it do this, just position the cursor on the line before a declaration, type /** (the first three characters of a Javadoc comment), and press return. An outline of the comment, including any appropriate Javadoc tags (such as @author, @param, @return, and @throws) will appear. Another way to add a comment is to hit Shift-Alt-J while typing in an editor window. A comment will be added to the item that contains the cursor.

Some things to keep in mind: A Javadoc comment always comes before the declaration that is being commented. Any Javadoc tags, such as @param and @return, must come at the end of the comment, after any other text in the comment. Javadoc comments can contain HTML markup tags such as <p> or <b>...</b>. This means that you should not use the characters "&" or "<" in a Javadoc comment; write them as "&amp;" or "&lt;" instead.
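As a small illustration of these rules, here is what a Javadoc comment on a method might look like. The class JavadocSketch and the method isLess are hypothetical, made up just for this example. Note that the comment comes before the declaration, the tags come last, and the "less than" sign in the comment text is written as &lt;.

```java
class JavadocSketch {

    /**
     * Tests whether one number is strictly smaller than another,
     * that is, whether <code>a &lt; b</code>.
     * @param a the first number to compare
     * @param b the second number to compare
     * @return true if a is less than b, false otherwise
     */
    public static boolean isLess(int a, int b) {
        return a < b;
    }
}
```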

In this course, you are required to use Javadoc comments for any non-private classes, member variables, and methods that you write. This requirement starts now, with the first programming assignment.


David Eck, for CPSC 225