CPSC 225, Spring 2012
Lab 1: Spellchecking (and Eclipse)
The first lab of the semester consists of a programming assignment and a related written assignment. The program will be a basic spellchecker, and the written assignment will ask you to analyze the program. For those of you who are not already familiar with it, this lab also includes an introduction to programming with the Eclipse IDE (Integrated Development Environment).
Programming Exercise: Spellchecking
To help you get back into programming at the start of a new semester, this lab includes a moderately complex programming assignment. The assignment is due next Thursday, before the start of the next lab. Remember that your program will be graded for style as well as for correctness. It should follow all the rules in the style guide that was handed out on the first day of class, including the use of Javadoc comments (see below).
This assignment is to write a basic "spell checker" program that can check words entered by the user. The program will use a long list of correctly spelled words. Your program should read a word from the user and convert it to lower case. It should check whether the word is in the list of correctly spelled words. If it is, you should tell the user that the word is OK. If it is not, you should show the user a list of similar words that are in the list (if there are any).
Your program will use the files WordList.java and unsorted_words.txt, which you will find in the directory /classes/cs225. The WordList class represents a list of correctly spelled words. Your program will use an object of type WordList, which has an instance method
public boolean contains(String lowerCaseWord)
that tests whether a given word is in the list. The file unsorted_words.txt contains the list of words and is read by WordList.java; you won't need to do anything with this file except add it to your project folder.
The hard part of the program is finding correctly spelled words that are "similar" to an incorrectly spelled word. The idea is to modify the incorrectly spelled word in certain ways, and look up the result in the list of correctly spelled words. Any correctly spelled words that you can find in this way go into the list of possible corrections.
You should implement the following ways of modifying the incorrectly spelled word:
- Delete a character. Try this for each character in the original word.
- Add a character. Try putting each of the 26 letters of the alphabet in each of the possible positions in the original word.
- Change a character. Try substituting each of the 26 letters of the alphabet for each of the characters in the original word.
- Swap two characters. Try reversing the order of each consecutive pair of characters in the original word.
- Insert a space. Try inserting a space into each possible position in the original word, breaking it into two words. In this case, you have to check that both of the words that you make are in the list of correctly spelled words.
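As a rough illustration of the five modifications listed above, here is a minimal sketch. It is not the required solution. The class name EditSketch and the helper names are made up for this example, and a java.util.Set<String> stands in for the WordList object so that the code can run on its own. In your program you would call the contains method of WordList instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class EditSketch {

    /**
     * Returns the dictionary entries that can be made from word with one
     * modification. The Set is a stand-in (an assumption) for WordList.
     */
    static List<String> suggestions(String word, Set<String> dictionary) {
        List<String> found = new ArrayList<String>();
        int n = word.length();

        // Delete a character: one candidate for each position.
        for (int i = 0; i < n; i++) {
            check(word.substring(0, i) + word.substring(i + 1), dictionary, found);
        }
        // Add a character: each of 26 letters in each of the n+1 positions.
        for (int i = 0; i <= n; i++) {
            for (char ch = 'a'; ch <= 'z'; ch++) {
                check(word.substring(0, i) + ch + word.substring(i), dictionary, found);
            }
        }
        // Change a character: each of 26 letters in place of each character.
        for (int i = 0; i < n; i++) {
            for (char ch = 'a'; ch <= 'z'; ch++) {
                check(word.substring(0, i) + ch + word.substring(i + 1), dictionary, found);
            }
        }
        // Swap two characters: reverse each consecutive pair.
        for (int i = 0; i < n - 1; i++) {
            check(word.substring(0, i) + word.charAt(i + 1) + word.charAt(i)
                    + word.substring(i + 2), dictionary, found);
        }
        // Insert a space: both halves must be correctly spelled words.
        for (int i = 1; i < n; i++) {
            String left = word.substring(0, i);
            String right = word.substring(i);
            if (dictionary.contains(left) && dictionary.contains(right)) {
                found.add(left + " " + right);
            }
        }
        return found;
    }

    /** Adds candidate to found if it is in the dictionary. */
    static void check(String candidate, Set<String> dictionary, List<String> found) {
        if (dictionary.contains(candidate)) {
            found.add(candidate);
        }
    }

    public static void main(String[] args) {
        Set<String> dictionary = new HashSet<String>(
                Arrays.asList("the", "ten", "hen", "lot", "alto", "a"));
        System.out.println(suggestions("teh", dictionary));
        System.out.println(suggestions("alot", dictionary));
    }
}
```

Note that this sketch can add the same word to the list more than once (for example, deleting either of two doubled letters gives the same result), and it makes no attempt to sort the list.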
Note that to make the modified words, you will have to "disassemble" the original word and put it back together with some changes. To do this, you will need to work with substrings of a string. Recall that if str is a variable of type String, then str.substring(0,n) is the substring consisting of the first n characters from the string, that is, the characters in positions 0 through n−1. And str.substring(n) is the substring consisting of the characters in position n through the end of the string.
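To make the index conventions concrete, here is a small example of taking a string apart and putting it back together with substring. The class name SubstringDemo and the two helper methods are invented for this illustration; only substring itself is part of the standard String class.

```java
public class SubstringDemo {

    /** Returns str with the character at position i removed. */
    static String deleteCharAt(String str, int i) {
        // Characters 0 through i-1, followed by characters i+1 through the end.
        return str.substring(0, i) + str.substring(i + 1);
    }

    /** Returns str with ch inserted so that it ends up at position i. */
    static String insertCharAt(String str, int i, char ch) {
        // When i equals str.length(), substring(i) is the empty string.
        return str.substring(0, i) + ch + str.substring(i);
    }

    public static void main(String[] args) {
        String word = "spell";
        System.out.println(word.substring(0, 2));         // prints sp
        System.out.println(word.substring(2));            // prints ell
        System.out.println(deleteCharAt(word, 1));        // prints sell
        System.out.println(insertCharAt(word, 5, 's'));   // prints spells
    }
}
```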
Your program should let the user input words, one at a time, and it should spellcheck each of the words, as discussed here. For extra credit, when your program presents a list of alternative words to the user, it should not include any duplicates in the list, and the list should be in alphabetical order. The program should be nice to the user: Prompt for inputs, label the output, and format the output nicely. The program should break up its task into subroutines. And don't forget good style!
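For the extra credit, one possible approach (an assumption on my part, not the only way to do it) is to let a java.util.TreeSet do the work, since a TreeSet ignores duplicate entries and keeps its contents in sorted order. The class and method names below are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class SortedSuggestions {

    /** Returns the items of raw with duplicates removed, in alphabetical order. */
    static List<String> sortedUnique(List<String> raw) {
        // A TreeSet stores each string once and iterates in sorted order.
        TreeSet<String> set = new TreeSet<String>(raw);
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        List<String> raw = Arrays.asList("ten", "the", "ten", "hen");
        System.out.println(sortedUnique(raw));   // prints [hen, ten, the]
    }
}
```

You could also add each suggestion directly to a TreeSet as you find it, instead of building a list first.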
What to turn in and how to turn it in: To turn in your program, you should copy the program's project folder from your Eclipse workspace into the directory /classes/cs225/homework/zz9999, where you should replace "zz9999" with your own user name. It is not necessary to make a printout of your work. If you have any problems doing this, ask for help during the lab next Thursday.
Written Assignment: Analysis of the Spellcheck Algorithm
The writing assignment is to write an essay discussing the run-time analysis of the spellcheck algorithm that is used in the programming assignment. You should take into account all the operations that are performed by your program, including the calls to the contains method of the WordList class. You should note that contains operates in the dumbest possible way: It simply does a linear search through an array of words, looking for the word in question. (We will discuss linear search and alternatives to it in class.)
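To give you an idea of what a linear search looks like, here is a sketch. It is not the actual code from WordList.java, which you should read for yourself; the class name, the method names, and the use of a plain array parameter are assumptions made for the illustration. The second method counts comparisons, which is the kind of quantity your essay should discuss.

```java
public class LinearSearchSketch {

    /** Tests whether target occurs in words by checking each entry in turn. */
    static boolean contains(String[] words, String target) {
        for (int i = 0; i < words.length; i++) {
            if (words[i].equals(target)) {
                return true;   // Found it; no need to look further.
            }
        }
        return false;          // Looked at every entry without finding it.
    }

    /** Returns how many entries a linear search examines when looking for target. */
    static int comparisonsUsed(String[] words, String target) {
        int count = 0;
        for (int i = 0; i < words.length; i++) {
            count++;
            if (words[i].equals(target)) {
                break;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String[] words = { "banana", "apple", "cherry" };
        System.out.println(contains(words, "cherry"));         // prints true
        System.out.println(comparisonsUsed(words, "banana"));  // prints 1
        System.out.println(comparisonsUsed(words, "zebra"));   // prints 3
    }
}
```

Notice that a word that is not in the array is the expensive case: every entry has to be examined before the search can give up.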
I don't want just a "Big-Oh" analysis. You should be as specific as you can about the run time. Discuss the run time in terms of the length of the word that is being spellchecked and the number of words in the list of correctly spelled words. You can try to give some formulas that involve these quantities, and some other constants, but the main thing is to be clear about how the run time depends on the length of the words and the size of the list.
Note that while you should work on the programming assignment on your own, you are allowed to discuss this writing assignment with other students. You should, however, write your own essay.
What to turn in and how to turn it in: For written assignments like this one, you can either turn in a hard copy in class, or you can submit a file to the homework directory in /classes/cs225/homework. You can add the file to your Eclipse project if you want. The written assignment for this lab is due by the beginning of class next Friday, January 28.
If You've Already Used Eclipse
The rest of this lab is an introduction to using Eclipse. If you have already used Eclipse, you can start right in on the assignment. However, you might want to read through the next section for some pointers on using Eclipse. In particular, if you have never set up your Eclipse preferences, you should read the suggestions about preference settings. Also, you might want to create a new workspace for your CS225 projects. (If you have set Eclipse to automatically use some existing workspace, start up Eclipse then use the command "File" / "Switch Workspace" / "Other..." to enter a new workspace name.)
To start the programming assignment, create a new project named lab1 or lab1-cs225. Copy the files WordList.java and unsorted_words.txt from /classes/cs225 into the src folder in your new project. If you want to use TextIO.java for input, you should also copy that file from /classes/cs225 into your src folder; the alternative is to use the standard Scanner class, as discussed in Section 2.4.6. You should then create a new class to represent the program described in the first section of this lab.
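If you decide to use Scanner rather than TextIO, reading words might look something like the following sketch. The class name ReadWords, the normalize helper, and the convention that an empty line ends the program are all assumptions made for this example; your program can handle input however you like.

```java
import java.util.Scanner;

public class ReadWords {

    /** Strips surrounding white space and converts the input to lower case. */
    static String normalize(String rawInput) {
        return rawInput.trim().toLowerCase();
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);
        System.out.print("Enter a word (or press return to quit): ");
        while (in.hasNextLine()) {
            String word = normalize(in.nextLine());
            if (word.isEmpty()) {
                break;   // An empty line ends the program.
            }
            // This is where the word would be spellchecked.
            System.out.println("You entered: " + word);
            System.out.print("Enter a word (or press return to quit): ");
        }
    }
}
```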
Running Eclipse
The rest of this lab introduces you to the Eclipse IDE and gets you started on this lab's programming assignment. Note that Eclipse and IDEs in general are also discussed in Section 2.6 of the textbook. The most recent version of Eclipse is already installed on the Linux computers that you will use during the lab. You can access it in the "Programming" sub-menu of the application menu. You might want to add an "Eclipse" button to the panel at the top of the screen to make it easier to get to. To do that, simply drag the command from the application menu to the panel.
Although it is not required for the course, I encourage you to consider running Eclipse on your own computer as well. It is available for Linux, Windows, and Mac OS. You can download the most recent version of Eclipse from the Eclipse downloads page, http://eclipse.org. You should get the "Eclipse IDE for Java Developers". Note that you need to have Java on your computer before you install Eclipse. If you would like some help installing Java or Eclipse on your computer, ask for it.
Eclipse organizes your programming projects into "workspaces." Each workspace corresponds to a directory. These workspace directories hold all the files that are used by your projects, including Java source code files and compiled class files. You should let Eclipse have complete control over the workspace directories, and you should not directly change any files that are stored in them. It is possible to have as many workspaces as you want, but for this course, you will probably keep all of your projects in the same workspace. (In fact, you can keep all the programs that you ever write in one workspace, if you want.)
When Eclipse starts up for the first time, it asks you to select a workspace. (There is a box that you can check if you don't want to be asked to select a workspace each time you run Eclipse.) By default, the name of the workspace will be "workspace", but I usually change the default name. If you expect to devote a workspace to CS225 only, you might name it "workspace-cs225".
The first time that you use a workspace, its window will be filled with a "Welcome" screen. The icons on this screen link to a large amount of information about Eclipse. You can browse through this information some time, if you like, but for now, just close the "Welcome" screen by clicking the "×" next to the word "Welcome" near the top left corner of the window. If you want to get back to the "Welcome" screen, just select "Welcome" from the "Help" menu.
Eclipse uses the terms view and perspective to describe the way that information is organized and presented in its window. The window typically contains several views. Each view contains a certain type of information. All the views that are visible in the window constitute a perspective. A perspective is meant to contain all the views that are used in some particular activity. For now, you will just use the Java perspective, which is organized for Java programming. Later, you will learn about the Debug perspective, which is for debugging programs. Each perspective is highly customizable. You can add and delete views and move them around. If you ever delete a view accidentally, you can get it back by selecting it from the "Show View" submenu in the "Window" menu. If you mess up a perspective, you can get back the original, default setup with the "Reset Perspective" command in the "Window" menu.
After you have closed the Welcome screen, the window will show the Java perspective. Initially, all the views are empty. Here is what the Java perspective might look like after a little work has been done:
In this window, I have closed a couple of the views that I don't use: the "Task List", "Outline", and "Hierarchy" views. Remember that a view can be closed by clicking the small "×" next to the name of the view, and it can be reopened using the "Window/Show View" menu.
The "Package Explorer" view, on the left, is central to much of the work that you do in Eclipse. It contains a list of your programming projects and the Java files and resources that are contained in those projects. In the above picture, there is just one project, named lab1. Clicking on the small triangle (or plus sign) next to the project name will show/hide the resources contained in the project. In a new project, there will be a directory named src where the source files for the project will be stored.
The lower right section of the window contains several views. In the picture, the "Console" view is showing. To see one of the other views, such as "Problems" or "Javadoc", just click on the tab that contains the name of the view. Sometimes, another view will pop up automatically in this area to show the output of some command. When you run a program, standard input and output are done in the "Console" view. Errors and warnings from the Java compiler are displayed in the "Problems" view.
The central area of the window is occupied by editor views. Here, I've created a Java file, Spellcheck.java, and have opened it for editing.
The view of Spellcheck.java shows several of the nifty features of the Java editor. The source code that you type is checked for syntax errors as you type. Errors are marked with small red carets at the bottom of the line. The error is also marked by a red rectangle in the right margin of the editor and sometimes by an icon in the left margin of the editor; if you hover your mouse over any error or error marker, you see an error message for the error. For the first error in the picture, for example, you would be told that a semicolon is missing. In the third line of main(), the word "printlnl" is underlined with error markers. (It's a misspelling of "println".) This error has an error light bulb in the left margin of the editor. The light bulb means that Eclipse is not just telling you about the error but is offering you some ideas for fixing it. If you click the light bulb, you get a list of actions that can be taken to fix the problem. You will also see the list if you hover the mouse over the underlined error itself. For the above picture, the list is:
Double clicking on an action, such as "Change to println()", will apply that option automatically. Sometimes, you will see a warning light bulb. A warning indicates a potential problem, but not an error. Warnings will not prevent the source code from being compiled successfully.
In fact, Eclipse might be a little too enthusiastic in marking warnings and errors! You do not necessarily have to fix every warning, and you do not have to fix every error as soon as it appears; that would be impossible, and in some cases the error will go away by itself after you've typed in more of your program. Remember that the fix for a programming error does not always go at the location of the error; sometimes the problem is elsewhere in the file. Furthermore, Eclipse's error system is only effective if you routinely get most of your program correct in the first place -- don't expect Eclipse to make solid Java programming skills unnecessary!
Many of the fixes that Eclipse offers will do things to your code that you won't understand. Not every fix is a good idea! Don't let Eclipse fix your code unless you understand what it wants to do and why!
Note that Eclipse will spell-check your comments. It will mark a misspelled word in a comment with a red underline. If you hover the mouse over a misspelled word, you will get a list of possible corrections. Double-click an item in this list to apply the correction to the word.
Your First Project
It's time for you to start using Eclipse. Start up Eclipse, as described above. Close the "Welcome" screen (and maybe the "Outline" view as well).
To create your first project, right-click in the "Package Explorer" view. In the pop-up menu that appears, go to the "New" submenu, and select "Java Project." A "New Project" wizard will pop up. All you have to do is enter a name for the project, in the box labeled "Project Name". Enter "lab1", or "lab1-cs225", as the name of the project. After entering the project name, click the "Finish" button. The project will be added to the "Package Explorer", and a directory of the same name will be created in your Eclipse workspace directory.
Now, you need to create a new Java class file in your project. To do this, right-click the project name in the "Package Explorer" view. In the pop-up menu, go to the "New" submenu again, and select "Class". A class creation dialog box will pop up. Again, you just have to fill in the name of the class, and click the "Finish" button. Note that you have to enter the name of the class, not the name of the file, so don't try to add a ".java" extension to the name. The name must be a valid Java class name. Use "Spellcheck" as the class name; it will eventually be the main class for this lab's programming assignment. The class will be added to the src folder in your project, in the "default package." Furthermore, a Java editor will open, showing you the initial contents of the file. As you will see in the editor window, Eclipse has already added the declaration of the class. All you have to do is fill it in! Note that you can close the editor in the usual way by clicking the little ×. To open the file again, double-click its name in the "Package Explorer."
To see the file in the "Package Explorer," open the project by clicking the little triangle next to its name, then open the src folder in the same way, and finally the "default package." You will see Spellcheck.java listed in that package. "Default package" just means that the class is not declared to belong to any package. Later in the term, you will be working with packages, but for now the default package is OK, even though Eclipse will display a warning that "use of the default package is discouraged" when you create the class.
Next, you want to make TextIO available in the project. To do that, you have to add TextIO.java to the project. You can find a copy of TextIO.java in the directory /classes/cs225. The easiest way to get this file into your project is to copy-and-paste it from a directory window: Open a file browser window for the /classes/cs225 directory. Right-click the icon for TextIO.java and select "Copy" from the pop-up menu. Then right-click the src folder in your project in Eclipse's "Package Explorer" view, and select "Paste" from the pop-up menu. TextIO should appear in the default package, inside src. (Note that you should not just copy TextIO.java into the Eclipse workspace directory in the file system; if you do that, it will not automatically appear in your project.)
Eclipse often has alternative ways of doing things. Another way to create projects and classes is to use the "create" buttons in the toolbar. These are a group of three small buttons at the top of the Eclipse window:
Click on the left button in this group to create a new Java project. Click the right button to create a new class. The middle button is for creating packages. (Note that when you create a class using the button, you should first click in the Package Explorer to select the location of the class that you are creating. Otherwise, you'll have to enter the location by hand in the class creation dialog box.)
Now, to create and run a program. Add the following main() routine, or something similar, to the class that you created above:
public static void main(String[] args) {
    System.out.print("What's your name? ");
    String name = TextIO.getln();
    System.out.println("Pleased to meet you, " + name);
}
The program is compiled automatically. To run it, right-click either on the name of the class in the "Package Explorer" or on the editor window that contains the program. In the pop-up menu, go to the "Run As" submenu, and select "Java Application" from the submenu. The program will start. The output from the first print statement appears in the "Console" view in the bottom right area of the Eclipse window. In order to type a response, you must first click the Console view! Type in your response and press return. The last line of the program will be executed, and the program will end.
Alternatively, if an editor window for the program is currently the selected view, you can run the program by clicking on the "Run" button in the toolbar. If you click the little black plus sign to the right of the "Run" button, you'll get a list of all the programs that you have run; select a program from this list to run it.
You have surely already noticed that the Java editor in Eclipse does a certain amount of work for you. It automatically inserts indentation. When you type a "{" and press return, the matching "}" is inserted in the correct position. Furthermore, when you type a period, a list of things that can follow the period will pop up after a short delay, and you can select an item from the list instead of typing it in. This is called Content Assist. You can invoke Content Assist at any time while you are typing by pressing Control-Space. If you do this in the middle of a variable name or method name, pressing Control-Space will either complete the name or offer a list of possible completions. It can be used in other ways as well. For example, if you press Control-Space after typing the "(" at the beginning of a method's parameter list, you will get information about the parameters of the method. By the way, when Content Assist pops up a list, you can get rid of the list by pressing the Escape key.
Content Assist is a good thing, but I find the way it pops up automatically while I am typing to be very annoying. You can turn off this feature in the Eclipse Preferences. Select the "Preferences" command from the "Window" menu. In the Preferences dialog box, click the little triangle next to "Java" to open the list of Java preferences (if necessary), then click the triangle next to "Editor", and finally click "Content Assist." In the Content Assist preferences, uncheck the box labeled "Enable Auto Activation" and click "OK". It looks like this:
Eclipse has a huge number of preference settings that can be used to customize the environment. Most of the default settings are OK, but there are a few that I usually change. If you want to do the same: Under "Java / Editor / Mark Occurrences", turn off "Keep marks when the selection changes". Under "Java / Compiler / Errors/Warnings / Potential Programming Problems", change "Serializable class without serialVersionUID" from Warning to Ignore, and change "Possible accidental boolean assignment", "'Switch' case fall-through" and "Null reference" from Ignore to Warning.
Eclipse has a lot of other useful features. We will encounter more of them as time goes on, and you can undoubtedly discover a few new ones on your own. But here are a few of the most useful:
- Eclipse can fix the indentation of your source code. Just hilite a segment of code and press Control-I. Note that fixing the indentation can help you find mis-matched braces and incorrectly nested blocks of code in your program. (With this feature available, there is no excuse for badly indented programs. I suggest that you apply this to all the source code files that you submit for grading in this course!)
- Hilite a segment of code and hit Control-/ to "comment out" that code by adding "//" to the beginning of each line. If the code was already commented out, it will be commented back in, by removing the //'s.
- In an editor, if you double click just after a "{" or "}", the entire block will be hilited, making it easier to figure out just where the block begins or ends.
- Hilite the name of a class, method or variable and hit F3. You will be taken to the declaration of the hilited name, even if it is in another file. (Unfortunately, this doesn't work with the keyboards on the iMacs in Lansing 310.)
- When an un-caught exception occurs in your program, the stack trace of the exception appears in the Console view. The stack trace contains links to lines in the program, and clicking on the link will take you to that line in an editor view. You should always look for the first line in the stack trace that refers to one of the classes that you wrote, rather than to one of Java's built-in classes.
- Eclipse can do smart renaming. When you rename a variable, method, or class, all references to that item will be changed to use the new name. To do a smart rename, just hilite the name, and select the "Rename" command from the "Refactor" menu. (Refactoring is a general term that refers to rearranging or modifying code to improve it or make it more general.)
- The "Source" menu offers several options for automatically generating code. I probably use "Generate Getters and Setters" most often, but the commands for overriding methods and for generating constructors are also useful.
- As you edit a file, Eclipse keeps many old versions of each file in a "history", and you can revert to earlier versions from this history if you need to. To see the history for a file, right-click the file and choose "Replace With / Local History".
Getting Started on the Spellchecker
To get started on the spell checker assignment, you need to add two files to your project. The files are WordList.java and unsorted_words.txt, and they can be found in the directory /classes/cs225. Copy these files into the src directory of your project, using the same technique that you used above for TextIO.java. You will not have to edit either of these files, but your program will have to create and use an object of type WordList, as discussed at the start of this lab.
You already have the Spellcheck class which you can use as the main class for this assignment. Just delete the content of the main() routine that you typed in earlier.
Javadoc
Javadoc is a standard syntax for comments in Java source code. Javadoc was introduced in Section 4.5 of the textbook, and you can read about it there. Javadoc comments can be processed to produce documentation of your code's API in the form of web pages. The API documentation for Java's standard classes was created with Javadoc.
Javadoc is thoroughly integrated into Eclipse. If you hilite the name of a class, method, or variable, its Javadoc documentation (if any) appears in the "Javadoc" view. If you just hover your mouse over a name, the Javadoc documentation appears in a pop-up window. When Content Assist shows a list of variable or method names, it will also show Javadoc documentation for items in the list. This makes it very worthwhile to use Javadoc comments in your code, even if you don't plan to generate web pages from the comments.
Furthermore, Eclipse will generate the outline of a Javadoc comment for you. To have it do this, just position the cursor on the line before a declaration, type /** (the first three characters of a Javadoc comment), and press return. An outline of the comment, including any appropriate Javadoc tags (such as @author, @param, @return, and @throws) will appear.
Some things to keep in mind: A Javadoc comment always comes before the declaration that is being commented. Javadoc comments should only be used for classes, methods, and instance variables. Any Javadoc tags, such as @param and @return, must come at the end of the comment, after any other text in the comment. Javadoc comments can contain HTML markup tags such as <p> or <b>...</b>. This means that you should not use the characters "&" or "<" in a Javadoc comment; write them as "&amp;" or "&lt;" instead.
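Here is a short example of a Javadoc comment that follows these rules. The class JavadocSketch and the method isShorter are invented purely for illustration and have nothing to do with the assignment.

```java
public class JavadocSketch {

    /**
     * Tests whether one string is shorter than another, that is, whether
     * the length of the first string is &lt; the length of the second.
     * <p>Neither argument may be null.</p>
     *
     * @param a the first string
     * @param b the second string
     * @return true if a has fewer characters than b, false otherwise
     * @throws NullPointerException if a or b is null
     */
    public static boolean isShorter(String a, String b) {
        return a.length() < b.length();
    }

    public static void main(String[] args) {
        System.out.println(isShorter("ab", "abc"));   // prints true
    }
}
```

The comment comes directly before the method, the descriptive text comes first, the tags come last, and the "less than" sign in the text is written as an HTML entity.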
In this course, you are required to use Javadoc comments for any non-private classes, member variables, and methods that you write. This requirement starts now, with the Spellcheck programming assignment.