Section 10.3
Programming With Files


IN THIS SECTION, we look at several programming examples that work with files. The techniques that we need were introduced in Section 1 and Section 2.

The first example is a program that makes a list of all the words that occur in a specified file. The user is asked to type in the name of the file. The list of words is written to another file, which the user also specifies. A word here just means a sequence of letters. The list of words will be output in alphabetical order, with no repetitions. All the words will be converted into lower case, so that, for example, "The" and "the" will count as the same word.

Since we want to output the words in alphabetical order, we can't just output the words as they are read from the input file. We can store the words in an array, but since there is no way to tell in advance how many words will be found in the file, we need a "dynamic array" which can grow as large as necessary. Techniques for working with dynamic arrays were discussed in Section 8.3. The data is represented in the program by two static variables:

        static String[] words;   // An array that holds the words.
        static int wordCount;    // The number of words currently 
                                 //    stored in the array.

The program starts with an empty array. Every time a word is read from the file, it is inserted into the array (if it is not already there). The array is kept at all times in alphabetical order, so the new word has to be inserted into its proper position in that order. The insertion is done by the following subroutine:

   static void insertWord(String w) {

      int pos = 0;  // This will be the position in the array 
                    //      where the word w belongs.

      w = w.toLowerCase();  // Convert word to lower case.
      
      /* Find the position in the array where w belongs, after all the
         words that precede w alphabetically.  If a copy of w already
         occupies that position, then it is not necessary to insert
         w, so return immediately. */

      while (pos < wordCount && words[pos].compareTo(w) < 0)
         pos++;
      if (pos < wordCount && words[pos].equals(w))
         return;
         
      /* If the array is full, make a new array that is twice as 
          big, copy all the words from the old array to the new,
          and set the variable, words, to refer to the new array. */

      if (wordCount == words.length) {
         String[] newWords = new String[words.length*2];
         System.arraycopy(words,0,newWords,0,wordCount);
         words = newWords;
      }
      
      /* Put w into its correct position in the array.  Move any
         words that come after w up one space in the array to
         make room for w. */

      for (int i = wordCount; i > pos; i--)
         words[i] = words[i-1];
      words[pos] = w;
      wordCount++;

   }  // end insertWord()
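
To see the insertion logic at work on its own, here is a minimal sketch that wraps the same algorithm in a small class. The class name SortedWordArray and the method toArray() are hypothetical names introduced only for this illustration, and the starting capacity of 2 is chosen just so that the array-doubling step is exercised after a couple of insertions.

```java
import java.util.Arrays;

// Hypothetical stand-alone wrapper around the insertWord() algorithm above.
public class SortedWordArray {

    private String[] words = new String[2];  // Small on purpose, to force growth.
    private int wordCount = 0;               // Number of words currently stored.

    /** Insert w, in lower case, at its alphabetical position unless it is already present. */
    public void insertWord(String w) {
        w = w.toLowerCase();
        int pos = 0;
        while (pos < wordCount && words[pos].compareTo(w) < 0)
            pos++;
        if (pos < wordCount && words[pos].equals(w))
            return;  // Duplicate, so there is nothing to insert.
        if (wordCount == words.length)
            words = Arrays.copyOf(words, words.length * 2);  // Double the capacity.
        for (int i = wordCount; i > pos; i--)
            words[i] = words[i - 1];  // Shift later words up one space.
        words[pos] = w;
        wordCount++;
    }

    /** Return a copy of just the filled part of the array. */
    public String[] toArray() {
        return Arrays.copyOf(words, wordCount);
    }

    public static void main(String[] args) {
        SortedWordArray list = new SortedWordArray();
        for (String w : new String[] {"The", "cat", "saw", "the", "Apple"})
            list.insertWord(w);
        System.out.println(Arrays.toString(list.toArray()));  // prints [apple, cat, saw, the]
    }
}
```

Note that "The" and "the" end up as a single entry, and that the array grows from 2 to 4 elements when the third distinct word arrives.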

This subroutine is called by the main() routine of the program to process each word that it reads from the file. If we ignore the possibility of errors, an algorithm for the program is

      Get the file names from the user
      Create a TextReader for reading from the input file
      Create a PrintWriter for writing to the output file
      while there are more words in the input file:
         Read a word from the input file
         Insert the word into the words array
      For i from 0 to wordCount - 1:
         Write words[i] to the output file

Most of these steps can generate IOExceptions, and so they must be done inside try...catch statements. In this case, we'll just print an error message and terminate the program when an error occurs.

If in is the name of the TextReader that is being used to read from the input file, we can read a word from the file with the function in.getAlpha(). But testing whether there are any more words in the file is a little tricky. The function in.eof() will check whether there are any more non-whitespace characters in the file, but that's not the same as checking whether there are more words. It might be that all the remaining non-whitespace characters are non-letters. In that case, trying to read a word will generate an error, even though in.eof() is false. The fix for this is to skip all non-letter characters before testing in.eof(). The function in.peek() allows us to look ahead at the next character without reading it, to check whether it is a letter. With this in mind, the while loop in the algorithm can be written in Java as:

        while (true) {
           while ( ! in.eof() && ! Character.isLetter(in.peek()) )
              in.getAnyChar();  // Read the non-letter character.
           if ( in.eof() )  // End if there is nothing more to read.
              break;
           insertWord( in.getAlpha() ); 
        }
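
TextReader is not part of the standard Java library, so it may help to see the same idea expressed with nothing but java.io. The following is only a sketch under that assumption, not the approach used in WordList.java: the class WordScanner and the method readWords() are hypothetical names, and instead of peeking ahead the code reads one character at a time and collects each maximal run of letters.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: extracts words (runs of letters) from any Reader.
public class WordScanner {

    /** Return every maximal run of letters in the stream, in order of appearance. */
    public static List<String> readWords(Reader in) throws IOException {
        List<String> found = new ArrayList<>();
        StringBuilder current = new StringBuilder();  // Letters of the word in progress.
        int ch;
        while ((ch = in.read()) != -1) {              // read() returns -1 at end-of-stream.
            if (Character.isLetter((char) ch)) {
                current.append((char) ch);
            }
            else if (current.length() > 0) {          // A non-letter ends the current word.
                found.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0)                     // The stream ended in the middle of a word.
            found.add(current.toString());
        return found;
    }

    public static void main(String[] args) throws IOException {
        List<String> words = readWords(new StringReader("Hello, world!  It's 42 -- ok?"));
        System.out.println(words);  // prints [Hello, world, It, s, ok]
    }
}
```

Trailing non-letters, such as the "?" at the end of the sample input, are simply skipped, which is the case that makes the end-of-file test tricky in the TextReader version.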

With error-checking added, the complete main() routine is as follows. If you want to see the program as a whole, you'll find the source code in the file WordList.java.

   public static void main(String[] args) {
   
      TextReader in;    // A stream for reading from the input file.
      PrintWriter out;  // A stream for writing to the output file.
      
      String inputFileName;   // Input file name, specified by the user.
      String outputFileName;  // Output file name, specified by the user.
      
      words = new String[10];  // Start with space for 10 words.
      wordCount = 0;           // Currently, there are no words in array.
      
      /* Get the input file name from the user and try to create the
         input stream.  If there is a FileNotFoundException, print
         a message and terminate the program. */
      
      TextIO.put("Input file name?  ");
      inputFileName = TextIO.getln().trim();
      try {
         in = new TextReader(new FileReader(inputFileName));
      }
      catch (FileNotFoundException e) {
          TextIO.putln("Can't find file \"" + inputFileName + "\".");
          return;
      }
      
      /* Get the output file name from the user and try to create the
         output stream.  If there is an IOException, print a message
         and terminate the program. */

      TextIO.put("Output file name? ");
      outputFileName = TextIO.getln().trim();
      try {
         out = new PrintWriter(new FileWriter(outputFileName));
      }
      catch (IOException e) {
          TextIO.putln("Can't open file \"" + 
                                  outputFileName + "\" for output.");
          TextIO.putln(e.toString());
          return;
      }
      
      /* Read all the words from the input stream and insert them into
         the array of words.  Reading from a TextReader can result in
         an error of type TextReader.Error.  If one occurs, print an
         error message and terminate the program. */
      
      try {
         while (true) {
               // Skip past any non-letters in the input stream.  If
               //   end-of-stream has been reached, end the loop.
               //   Otherwise, read a word and insert it into the 
               //   array of words.
            while ( ! in.eof() && ! Character.isLetter(in.peek()) )
               in.getAnyChar();
            if (in.eof())
               break;
            insertWord(in.getAlpha());
         }
      }
      catch (TextReader.Error e) {
         TextIO.putln("An error occurred while reading from input file.");
         TextIO.putln(e.toString());
         return;
      }
      
      /* Write all the words from the list to the output stream. */

      for (int i = 0; i < wordCount; i++)
         out.println(words[i]);
      
      /* Finish up by checking for an error on the output stream and
         printing either a warning message or a message that the words
         have been output to the output file. */
      
      if (out.checkError() == true) {
         TextIO.putln("Some error occurred while writing output.");
         TextIO.putln("Output might be incomplete or invalid.");
      }
      else {
         TextIO.putln(wordCount + " words from \"" + inputFileName + 
                       "\" output to \"" + outputFileName + "\".");
      }
   
   } // end main()
   

Making a copy of a file is a pretty common operation, and most operating systems already have a command for doing so. However, it is still instructive to look at a Java program that does the same thing. Many file operations are similar to copying a file, except that the data from the input file is processed in some way before it is written to the output file. All such operations can be done by programs with the same general form.

Since the program should be able to copy any file, we can't assume that the data in the file is in human-readable form. So, we have to use InputStream and OutputStream to operate on the file rather than Reader and Writer. The program simply copies all the data from the InputStream to the OutputStream, one byte at a time. If source is the variable that refers to the InputStream, then the function source.read() can be used to read one byte. This function returns the value -1 when all the bytes in the input file have been read. Similarly, if copy refers to the OutputStream, then copy.write(b) writes one byte to the output file. So, the heart of the program is a simple while loop. (As usual, the I/O operations can throw exceptions, so this must be done in a try...catch statement.)

          while(true) {
             int data = source.read();
             if (data < 0)
                break;
             copy.write(data);
          }
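
The loop can be tried out without touching the file system by using in-memory streams. In the sketch below, the class name ByteCopy and the method copy() are hypothetical names chosen for the illustration; ByteArrayInputStream and ByteArrayOutputStream stand in for the file streams. The sample data includes the byte values 200 and 255 to show that read() returns each byte as an int in the range 0 to 255, so a data byte can never be confused with the end-of-stream value -1.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper that packages the byte-by-byte copy loop as a method.
public class ByteCopy {

    /** Copy all bytes from source to copy, one at a time, and return the number copied. */
    public static int copy(InputStream source, OutputStream copy) throws IOException {
        int byteCount = 0;
        while (true) {
            int data = source.read();  // A value from 0 to 255, or -1 at end-of-stream.
            if (data < 0)
                break;
            copy.write(data);          // Writes the low-order eight bits of data.
            byteCount++;
        }
        return byteCount;
    }

    public static void main(String[] args) throws IOException {
        byte[] original = { 0, 1, (byte) 200, (byte) 255 };
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = copy(new ByteArrayInputStream(original), out);
        System.out.println("Copied " + n + " bytes.");  // prints Copied 4 bytes.
    }
}
```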

The file-copy command in an operating system such as DOS or UNIX uses command line arguments to specify the names of the files. For example, the user might say "copy original.dat backup.dat" to copy an existing file, original.dat, to a file named backup.dat. Command-line arguments can also be used in Java programs. The command line arguments are stored in the array of strings, args, which is a parameter to the main() routine. The program can retrieve the command-line arguments from this array. For example, if the program is named CopyFile and if the user runs the program with the command "java CopyFile work.dat oldwork.dat", then, in the program, args[0] will be the string "work.dat" and args[1] will be the string "oldwork.dat". The value of args.length tells the program how many command-line arguments were specified by the user.
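
As a small illustration of how args is filled in, here is a sketch of a program that does nothing but report its command-line arguments. The class name ShowArgs and the method summarize() are hypothetical names used only for this example.

```java
// Hypothetical demo: reports the command-line arguments that were passed to main().
public class ShowArgs {

    /** Build a report giving the number of arguments and the value of each one. */
    public static String summarize(String[] args) {
        StringBuilder report = new StringBuilder(args.length + " argument(s)");
        for (int i = 0; i < args.length; i++)
            report.append("\n  args[").append(i).append("] = ").append(args[i]);
        return report.toString();
    }

    public static void main(String[] args) {
        System.out.println(summarize(args));
    }
}
```

Running this with the command "java ShowArgs work.dat oldwork.dat" should report two arguments, with args[0] equal to work.dat and args[1] equal to oldwork.dat. The name of the class itself is not included in the array.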

My CopyFile program gets the names of the files from the command-line arguments. It prints an error message and exits if the file names are not specified. To add a little interest, there are two ways to use the program. The command line can simply specify the two file names. In that case, if the output file already exists, the program will print an error message and end. This is to make sure that the user won't accidentally overwrite an important file. However, if the command line has three arguments, then the first argument must be "-f" while the second and third arguments are file names. The -f is a command-line option, which is meant to modify the behavior of the program. The program interprets the -f to mean that it's OK to overwrite an existing file. (The "f" stands for "force," since it forces the file to be copied in spite of what would otherwise have been considered an error.) You can see in the source code how the command line arguments are interpreted by the program:

   import java.io.*;
   
   public class CopyFile {
   
      public static void main(String[] args) {
         
         String sourceName;   // Name of the source file, 
                              //    as specified on the command line.
         String copyName;     // Name of the copy, 
                              //    as specified on the command line.
         InputStream source;  // Stream for reading from the source file.
         OutputStream copy;   // Stream for writing the copy.
         boolean force;  // This is set to true if the "-f" option
                         //    is specified on the command line.
         int byteCount;  // Number of bytes copied from the source file.
         
         /* Get file names from the command line and check for the 
            presence of the -f option.  If the command line is not one
            of the two possible legal forms, print an error message and 
            end this program. */
      
         if (args.length == 3 && args[0].equalsIgnoreCase("-f")) {
            sourceName = args[1];
            copyName = args[2];
            force = true;
         }
         else if (args.length == 2) {
            sourceName = args[0];
            copyName = args[1];
            force = false;
         }
         else {
            System.out.println(
                    "Usage:  java CopyFile <source-file> <copy-name>");
            System.out.println(
                    "    or  java CopyFile -f <source-file> <copy-name>");
            return;
         }
         
         /* Create the input stream.  If an error occurs, 
            end the program. */
         
         try {
            source = new FileInputStream(sourceName);
         }
         catch (FileNotFoundException e) {
            System.out.println("Can't find file \"" + sourceName + "\".");
            return;
         }
         
         /* If the output file already exists and the -f option was not
            specified, print an error message and end the program. */
      
         File file = new File(copyName);
         if (file.exists() && force == false) {
             System.out.println(
                  "Output file exists.  Use the -f option to replace it.");
             return;  
         }
         
         /* Create the output stream.  If an error occurs, 
            end the program. */
   
         try {
            copy = new FileOutputStream(copyName);
         }
         catch (IOException e) {
            System.out.println("Can't open output file \"" 
                                                    + copyName + "\".");
            return;
         }
         
         /* Copy one byte at a time from the input stream to the output
            stream, ending when the read() method returns -1 (which is 
            the signal that the end of the stream has been reached).  If any 
            error occurs, print an error message.  Also print a message if 
            the file has been copied successfully.  */
         
         byteCount = 0;
         
         try {
            while (true) {
               int data = source.read();
               if (data < 0)
                  break;
               copy.write(data);
               byteCount++;
            }
            source.close();
            copy.close();
            System.out.println("Successfully copied " 
                                             + byteCount + " bytes.");
         }
         catch (Exception e) {
            System.out.println("Error occurred while copying.  "
                                      + byteCount + " bytes copied.");
            System.out.println(e.toString());
         }
         
      }  // end main()
      
      
   } // end class CopyFile
   

Both of the previous programs use a command-line interface, but graphical user interface programs can also manipulate files. Programs typically have an "Open" command that reads the data from a file and displays it in a window and a "Save" command that writes the data from the window into a file. We can illustrate this in Java with a simple text editor program. The window for this program uses a TextArea component to display some text that the user can edit. It also has a menu bar, with a "File" menu that includes "Open" and "Save" commands. Menus and windows for standalone programs were discussed in Section 7.7. The program also uses file dialogs, which were introduced in Section 2.

When the user selects the Save command from the File menu, the program pops up a file dialog box where the user specifies the file. The text from the TextArea is written to the file. All this is done in the following instance method (where the variable, text, refers to the TextArea):

   private void doSave() {
          // Carry out the Save command by letting the user specify
          // an output file and writing the text from the TextArea
          // to that file.
      FileDialog fd;     // A file dialog that lets the user 
                         //            specify the file.
      String fileName;   // Name of file specified by the user.
      String directory;  // Directory that contains the file.
      fd = new FileDialog(this,"Save Text to File",FileDialog.SAVE);
      fd.show();
      fileName = fd.getFile();
      if (fileName == null) {
            // The fileName is null if the user has canceled the file 
            // dialog.  In this case, there is nothing to do, so quit.
         return;
      }
      directory = fd.getDirectory();
      try {
            // Create a PrintWriter for writing to the specified
            // file and write the text from the window to that stream.
         File file = new File(directory, fileName);
         PrintWriter out = new PrintWriter(new FileWriter(file));
         String contents = text.getText();
         out.print(contents);
         out.close();
      }
      catch (IOException e) {
            // Some error has occurred while opening or closing the file.
            // Show an error message.
         new MessageDialog(this, "Error:  " + e.toString());
      }
   }

The MessageDialog class that is used at the end of this method pops up a dialog box to display the error message to the user. In a GUI program, it wouldn't make much sense to write the error to standard output, since the user is not likely to be paying attention to standard output, even if it is visible on the screen. MessageDialog is not a standard part of Java. It is defined in the file MessageDialog.java.

(By the way, there is one problem with this method, which illustrates the difficulties of dealing with text files on different computing platforms. The lines in a TextArea are separated by line feed characters, '\n'. This is the standard for separating lines in the UNIX operating system, but both Macintosh and Windows have different standards. Macintosh uses the carriage return character, '\r', as a line separator, while Windows uses the two-character sequence "\r\n". The doSave() method writes the line feed characters from the TextArea to the output file. On Macintosh and Windows computers, the output file will not be in the proper format for a text file, assuming that the implementation of PrintWriter in your version of Java doesn't take care of the problem. The TextReader class, which is used in the method that opens files, can handle any of the three formats of text files.)
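
One possible workaround, which is not what the text editor program actually does, is to convert the line feeds to the platform's own separator before writing. The sketch below assumes that the text contains only '\n' line breaks, as is the case for text taken from a TextArea. The class LineSeparators and the method withSeparator() are hypothetical names; System.lineSeparator() is a standard method that reports the separator used on the current platform.

```java
// Hypothetical helper: converts '\n' line breaks to a chosen line separator.
public class LineSeparators {

    /**
     * Replace every line feed in text with the given separator.
     * Assumes text uses only '\n' as its line break; text that already
     * contains "\r\n" would end up with doubled carriage returns.
     */
    public static String withSeparator(String text, String separator) {
        return text.replace("\n", separator);
    }

    public static void main(String[] args) {
        String contents = "first line\nsecond line\n";
        String converted = withSeparator(contents, System.lineSeparator());
        System.out.print(converted);  // Same lines, using this platform's separator.
    }
}
```

In doSave(), the converted string could then be passed to out.print() in place of the raw contents of the TextArea.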

When the user selects the Open command, a dialog box allows the user to specify the file that is to be opened. It is assumed that the file is a text file. Since TextAreas are not meant for displaying large amounts of text, the number of lines read from the file is limited to one hundred at most. Before the file is read, any text currently in the TextArea is removed. Then lines are read from the file and appended to the TextArea one by one, with a line feed character at the end of each line. This process continues until one hundred lines have been read or until the end of the input file is reached. If any error occurs during this process, an error message is displayed to the user in a dialog box. Here is the complete method:

   private void doOpen() {
          // Carry out the Open command by letting the user specify
          // the file to be opened and reading up to 100 lines from 
          // that file.  The text from the file replaces the text
          // in the TextArea.
      FileDialog fd;     // A file dialog that lets the user 
                         //                       specify the file.
      String fileName;   // Name of file specified by the user.
      String directory;  // Directory that contains the file.
      fd = new FileDialog(this,"Load File",FileDialog.LOAD);
      fd.show();
      fileName = fd.getFile();
      if (fileName == null) {
            // The fileName is null if the user has canceled the file 
            // dialog.  In this case, there is nothing to do, so quit.
         return;
      }
      directory = fd.getDirectory();
      try {
             // Read lines from the file until end-of-file is detected,
             // or until 100 lines have been read.  The lines are appended
             // to the TextArea, with a line feed after each line.  The
             // test for end-of-file in the while loop is 
             // in.peek() != '\0' because calling in.eof() could skip
             // over and discard any blank lines in the file.  Blank lines
             // should be copied to the TextArea just like any other lines.
         File file = new File(directory, fileName);
         TextReader in = new TextReader(new FileReader(file));
         String line;
         text.setText("");
         int lineCt = 0;
         while (lineCt < 100 && in.peek() != '\0') {
            line = in.getln();
            text.appendText(line + '\n');
            lineCt++;
         }
         if (in.eof() == false) {
            text.appendText(
               "\n\n********** Text truncated to 100 lines! ***********\n");
         }
         in.close();
      }
      catch (Exception e) {
            // Some error has occurred while opening or closing the file.
            // Show an error message.
         new MessageDialog(this, "Error:  " + e.toString());
      }
   }
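
Since TextReader is not a standard class, here is a sketch of the same line-limited reading done with the standard BufferedReader instead. This is an alternative technique, not the code used in TrivialEdit.java, and the names LimitedLineReader and readLines() are hypothetical. BufferedReader.readLine() returns null at end-of-stream and returns an empty string for a blank line, so blank lines are preserved without any need for peek().

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical helper: reads at most maxLines lines of text from a Reader.
public class LimitedLineReader {

    /** Return up to maxLines lines from source, each followed by a line feed. */
    public static String readLines(Reader source, int maxLines) throws IOException {
        BufferedReader in = new BufferedReader(source);
        StringBuilder text = new StringBuilder();
        int lineCt = 0;
        String line;
        // The count is tested first, so no extra line is consumed once the limit is reached.
        while (lineCt < maxLines && (line = in.readLine()) != null) {
            text.append(line).append('\n');  // readLine() strips the original line ending.
            lineCt++;
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String sample = "first\n\nthird\nfourth\n";
        System.out.print(readLines(new StringReader(sample), 3));  // first, a blank line, third
    }
}
```

Because readLine() accepts "\n", "\r", and "\r\n" as line endings, this version also copes with text files from any of the three platforms mentioned above.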

The doSave() and doOpen() methods are the only parts of the text editor program that deal with files. If you would like to see the entire program, you will find the source code in the file TrivialEdit.java.


For a final example of files used in a complete program, you might want to look at ShapeDrawWithFiles.java. This file defines one last version of the ShapeDraw program, which you last saw in Section 7.7. This version has a "File" menu for saving and loading the patterns of shapes that are created with the program. The program also serves as an example of using ObjectInputStream and ObjectOutputStream, which were discussed at the end of Section 1. If you check, you'll see that the Shape class in this version has been declared to be Serializable so that objects of type Shape can be written to and read from object streams.
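
To give a rough idea of what object streams involve, here is a minimal sketch that writes a Serializable object to an ObjectOutputStream and reads it back with an ObjectInputStream. It is not taken from ShapeDrawWithFiles.java: the class Dot is a hypothetical stand-in for Shape, the method names save() and load() are made up for the example, and byte arrays are used in place of files so that the example is self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo of a round trip through object streams.
public class ObjectStreamDemo {

    /** A hypothetical stand-in for the Shape class; it must be Serializable. */
    public static class Dot implements Serializable {
        private static final long serialVersionUID = 1L;
        public final int x, y;
        public final String color;
        public Dot(int x, int y, String color) {
            this.x = x;
            this.y = y;
            this.color = color;
        }
    }

    /** Write obj to an object stream and return the bytes that were produced. */
    public static byte[] save(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    /** Read back one object from bytes that were produced by save(). */
    public static Object load(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Dot copy = (Dot) load(save(new Dot(3, 4, "red")));
        System.out.println(copy.x + "," + copy.y + " " + copy.color);  // prints 3,4 red
    }
}
```

To work with a file instead, the byte array streams could be replaced by a FileOutputStream and a FileInputStream.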

