Section 8.2
Programming with Arrays


ARRAYS ARE THE MOST BASIC AND THE MOST IMPORTANT type of data structure, and techniques for processing arrays are among the most important programming techniques you can learn. Two fundamental array processing techniques -- searching and sorting -- will be covered in Section 4. This section introduces some of the basic ideas of array processing in general.

In many cases, processing an array means applying the same operation to each item in the array. This is commonly done with a for loop. A loop for processing all the items in an array A has the form:

             // do any necessary initialization
             for (int i = 0; i < A.length; i++) {
                . . . // process A[i]
             }

Suppose, for example, that A is an array of type double[]. Suppose that the goal is to add up all the numbers in the array. An informal algorithm for doing this would be:

            Start with 0;
            Add A[0];   (process the first item in A)
            Add A[1];   (process the second item in A)
              .
              .
              .
            Add A[ A.length - 1 ];   (process the last item in A)

Putting the obvious repetition into a loop and giving a name to the sum, this becomes:

            double sum;  // The sum of the numbers in A.
            sum  = 0;    // Start with 0.
            for (int i = 0; i < A.length; i++)
               sum += A[i];  // add A[i] to the sum, for
                             //     i = 0, 1, ..., A.length - 1

Note that the continuation condition, "i < A.length", implies that the last value of i that is actually processed is A.length-1, which is the index of the final item in the array. It's important to use "<" here, not "<=", since "<=" would give an array out of bounds error.

Eventually, you should just about be able to write loops similar to this one in your sleep. I will give a few more simple examples. Here is a loop that will count the number of items in the array A which are less than zero:

              int count;  // For counting the items.
              count = 0;  // Start with 0 items counted.
              for (int i = 0; i < A.length; i++) {
                 if (A[i] < 0.0)   // if this item is less than zero...
                    count++;          //     ...then count it
              }
              // At this point, the value of count is the number
              // of items that have passed the test of being < 0

Replace the test "A[i] < 0.0", if you want to count the number of items in an array that satisfy some other property. Here is a variation on the same theme. Suppose you want to count the number of times that an item in the array A is equal to the item that follows it. The item that follows A[i] in the array is A[i+1], so the test in this case is "if (A[i] == A[i+1])". But there is a catch: This test cannot be applied when A[i] is the last item in the array, since then there is no such item as A[i+1]. The result of trying to apply the test in this case would be an array out of bounds error. This just means that we have to stop one item short of the final item:

            int count = 0;
            for (int i = 0; i < A.length - 1; i++) {
               if (A[i] == A[i+1])
                  count++;
            }

Another typical problem is to find the largest number in A. The strategy is to go through the array, keeping track of the largest number found so far. We'll store the largest number found so far in a variable called max. As we look through the array, whenever we find a number larger than the current value of max, we change the value of max to that larger value. After the whole array has been processed, max is the largest item in the array overall. The only question is, what should the original value of max be? One possibility is to start with max equal to A[0], and then to look through the rest of the array, starting from A[1], for larger items:

            double max = A[0];
            for (int i = 1; i < A.length; i++) {
               if (A[i] > max)
                  max = A[i];
            }
            // at this point, max is the largest item in A

(There is one subtle problem here. It's possible in Java for an array to have length zero. In that case, A[0] doesn't exist, and the reference to A[0] in the first line gives an array out of bounds error. However, zero-length arrays are normally something that you want to avoid in real problems. Anyway, what would it mean to ask for the largest item in an array that contains no items at all?)

As a final example of basic array operations, consider the problem of copying an array. To make a copy of our sample array A, it is not sufficient to say

           double[] B = A;

since this does not create a new array object. All it does is declare a new array variable and make it refer to the same object to which A refers. (So that, for example, a change to A[i] will automatically change B[i] as well.) To make a new array that is a copy of A, it is necessary to make a new array object and to copy each of the individual items from A into the new array:

           double[] B = new double[A.length]; // Make a new array object,
                                              //   the same size as A.
           for (int i = 0; i < A.length; i++)
              B[i] = A[i];   // Copy each item from A to B.

Copying values from one array to another is such a common operation that Java has a predefined subroutine to do it. The subroutine, System.arraycopy(), is a static member subroutine in the standard System class. Its declaration has the form

      public static void arraycopy(Object sourceArray, int sourceStartIndex,
              Object destArray, int destStartIndex, int count)

where sourceArray and destArray can be arrays with any base type. Values are copied from sourceArray to destArray. The count tells how many elements to copy. Values are taken from sourceArray starting at position sourceStartIndex and are stored in destArray starting at position destStartIndex. For example, to make a copy of the array, A, using this subroutine, you would say:

              double B = new double[A.length];
              System.arraycopy( A, 0, B, 0, A.length );


An array type, such as double[], is a full-fledged Java type, so it can be used in all the ways that any other Java type can be used. In particular, it can be used as the type of a formal parameter in a subroutine. It can even be the return type of a function. For example, it might be useful to have a function that makes a copy of an array of doubles:

         double[]  copy( double[] source ) {
                 // Create and return a copy of the array, source.
                 // If source is null, return null.
             if ( source == null )
                return null;
             double[]  cpy;  // A copy of the source array.
             cpy = new double[source.length];
             System.arraycopy( source, 0, cpy, 0, source.length );
             return cpy;
         }

The main() routine of a program has a parameter of type String[]. You've seen this used since all the way back in Chapter 2, but I haven't really been able to explain it until now. The parameter to the main() routine is an array of Strings. When the system calls the main() routine, the strings in this array are the command-line parameters. When using a command-line interface, the user types a command to tell the system to execute a program. The user can include extra input in this command, beyond the name of the program. This extra input becomes the command-line parameters. For example, if the name of the class that contains the main() routine is myProg, then the user can type "java myProg" to execute the program. In this case, there are no command-line parameters. But if the user types the command "java myProg one two three", then the command-line parameters are the strings "one", "two", and "three". The system puts these strings into an array of Strings and passes that array as a parameter to the main() routine. Here, for example, is a short program that simply prints out any command line parameters entered by the user:

       public class CLDemo {
          
          public static void main(String[] args) {
             System.out.println("You entered " + args.length
                                         + " command-line parameters.");
             if (args.length > 0) {
                System.out.println("They were:");
                for (int i = 0; i < args.length; i++)
                   System.out.println("   " + args[i]);
             }
          } // end main()
          
       } // end class CLDemo

Note that the parameter, args, is never null when main() is called by the system, but it might be an array of length zero.

In practice, command-line parameters are often the names of files to be processed by the program. I will give some examples of this in Chapter 10, when I discuss file processing.


So far, all my examples of array processing have used sequential access. That is, the elements of the array were processed one after the other in the sequence in which they occur in the array. But one of the big advantages of arrays is that they allow random access. That is, every element of the array is equally accessible at any given time.

As an example, let's look at a well-known problem called the birthday problem: Suppose that there are N people in a room. What's the chance that there are two people in the room who have the same birthday? (That is, they were born on the same day in the same month, but not necessarily in the same year.) Most people severely underestimate the probability. We actually look at a different version of the problem: Suppose you choose people at random and check their birthdays. How many people will you check before you find one who has the same birthday as someone you've already checked? Of course, the answer in a particular case depends on random factors, but we can simulate the experiment with a computer program and run the program several times to get an idea of how many people need to be checked on average.

To simulate the experiment, we need to keep track of each birthday that we find. There are 365 different possible birthdays. (We'll ignore leap years.) For each possible birthday, we need to know, has this birthday already been used? The answer is a boolean value, true or false. To hold this data, we can use an array of 365 boolean values:

           boolean[] used;
           used = new boolean[365];

The days of the year are numbered from 0 to 364. The value of used[i] is true if someone has been selected whose birthday is day number i. Initially, all the values in the array, used, are false. When we select someone whose birthday is day number i, we first check whether used[i] is true. If so, then this is the second person with that birthday. We are done. If used[i] is false, we set used[i] to be true to record the fact that we've encountered someone with that birthday, and we go on to the next person. Here is a subroutine that carries out the simulated experiment (Of course, in the subroutine, there are no simulated people, only simulated birthdays):

   static void birthdayProblem() {
          // Simulate choosing people at random and checking the
          // day of the year they were born on.  If the birthday
          // is the same as one that was seen previously, stop,
          // and output the number of people who were checked.

      boolean[] used;  // For recording the possible birthdays
                       //   that have been seen so far.  A value
                       //   of true in used[i] means that a person
                       //   whose birthday is the i-th day of the
                       //   year has been found.

      int count;       // The number of people who have been checked.

      used = new boolean[365];  // Initially, all entries are false.
      
      count = 0;

      while (true) {
             // Select a birthday at random, from 0 to 364.
             // If the birthday has already been used, quit.
             // Otherwise, record the birthday as used.
         int birthday;  // The selected birthday.
         birthday = (int)(Math.random()*365);
         count++;
         if ( used[birthday] )
            break;
         used[birthday] = true;
      }

      TextIO.putln("A duplicate birthday was found after " 
                                                + count + " tries.");

   } // end birthdayProblem()
   

This subroutine makes essential use of the fact that every element in a newly created array of booleans is set to be false. If we wanted to reuse the same array in a second simulation, we would have to reset all the elements in it to be false with a for loop

            for (int i = 0; i < 365; i++)
                used[i] = false;

Here is an applet that will run the simulation as many times as you like. Are you surprised at how few people have to be chosen, in general?

Sorry, but your browser
doesn't support Java.


One of the examples in Section 6.4 was an applet that shows multiple copies of a message in random positions, colors, and fonts. When the user clicks on the applet, the positions, colors, and fonts are changed to new random values. Like several other examples from that chapter, the applet had a flaw: It didn't have any way of storing the data that would be necessary to redraw itself. Chapter 7 introduced off-screen canvases as a solution to this problem, but off-screen canvases are not a good solution in every case. Arrays provide us with an alternative solution. Here's a new version of the applet. This version uses an array to store the position, font, and color of each string. When the applet is painted, this information is used to draw the strings, so it will redraw itself correctly when it is covered and then uncovered. When you click on the applet, the array is filled with new random values and the applet is repainted.

Sorry, but your browser
doesn't support Java.

In this applet, the number of copies of the message is given by a named constant, MESSAGE_COUNT. One way to store the position, color, and font of MESSAGE_COUNT strings would be to use four arrays:

         int[] x = new int[MESSAGE_COUNT];  
         int[] y = new int[MESSAGE_COUNT];
         Color[] color = new Color[MESSAGE_COUNT];
         Font[] font = new Font[MESSAGE_COUNT];

These arrays would be filled with random values. In the paint() method, the i-th copy of the string would be drawn at the point (x[i],y[i]). Its color would be given by color[i]. And it would be drawn in the font font[i]. This would be accomplished by the paint() method

         public void paint(Graphics g) {
            for (int i = 0; i < MESSAGE_COUNT; i++) {
               g.setColor( color[i] );
               g.setFont( font[i] );
               g.drawString( message, x[i], y[i] );
            }
         }

This approach is said to use parallel arrays. The data for a given copy of the message is spread out across several arrays. If you think of the arrays as laid out in parallel columns -- array x in the first column, array y in the second, array color in the third, and array font in the fourth -- then the data for the i-th string can be found along the the i-th row. There is nothing wrong with using parallel arrays in this simple example, but it does go against the object-oriented philosophy of keeping related data in one object. If we follow this rule, then we don't have to imagine the relationship among the data because all the data for one copy of the message is physically in one place. So, when I wrote the applet, I made a simple class to represent all the data that is needed for one copy of message:

          class StringData {
                 // Data for one copy of the message.
              int x,y;        // Position of the message.
              Color color;    // Color of the message.
              Font font;      // Font used for the message.
          }

To store the data for multiple copies of the message, I use an array of type StringData[]. The array is declared as an instance variable, with the name data:

          StringData[] data;

Of course, the value of data is null until an actual array is created and assigned to it. This is done in the init() method of the applet with the statement

          data = new StringData[MESSAGE_COUNT];

Just after this array is created, the value of each element in the array is null. We want to store data in objects of type StringData, but no such objects exist yet. All we have is an array of variables that are capable of referring to such objects. I decided to create the objects in the applet's init method. (It could be done in other places -- just so long as we avoid trying to use to an object that doesn't exist.) The objects are created with the for loop

           for (int i = 0; i < MESSAGE_COUNT; i++)
               data[i] = new StringData();

Now, the idea is to store data for the i-th copy of the message in the variables data[i].x, data[i].y, data[i].color, and data[i].font. (Make sure that you understand the notation here: data[i] refers to an object. That object contains instance variables. The notation data[i].x tells the computer: "Find your way to the object that is referred to by data[i]. Then go to the instance variable named x in that object." Variable names can get even more complicated than this.) Using the array, data, the paint() method for the applet becomes

        public void paint(Graphics g) {
           for (int i = 0; i < MESSAGE_COUNT; i++) {
              g.setColor( data[i].color );
              g.setFont( data[i].font );
              g.drawString( message, data[i].x, data[i].y );
           }
        }

There is still the matter of filling the array, data, with random values. If you are interested, you can look at the source code for the applet, RandomStringsWithArray.java.

The applet actually uses one other array. The font for a given copy of the message is chosen at random from a set of five possible fonts. In the original version of the applet, there were five variables of type Font to represent the fonts. The variables were named font1, font2, font3, font4, and font5. To select one of these fonts at random, a switch statement could be used:

       Font randomFont;  // One of the 5 fonts, chosen at random.
       int rand;         // A random integer in the range 0 to 4.
       
       rand = (int)(Math.random() * 5);
       switch (rand) {
          case 0:
             randomFont = font1;
             break;
          case 1:
             randomFont = font2;
             break;
          case 2:
             randomFont = font3;
             break;
          case 3:
             randomFont = font4;
             break;
          case 4:
             randomFont = font5;
             break;
       }

In the new version of the applet, the five fonts are stored in an array, which is named fonts. This array is declared as an instance variable

           Font[] fonts;

The array is created and filled with fonts in the init() method:

           fonts = new Font[5];  // Array to store five fonts.
           fonts[0] = new Font("Serif", Font.BOLD, 14);
           fonts[1] = new Font("SansSerif", Font.BOLD + Font.ITALIC, 24);
           fonts[2] = new Font("Monospaced", Font.PLAIN, 20);
           fonts[3] = new Font("Dialog", Font.PLAIN, 30);
           fonts[4] = new Font("Serif", Font.ITALIC, 36);

This makes it much easier to select one of the fonts at random. It can be done with the statements

           Font randomFont;  // One of the 5 fonts, chosen at random.
           int fontIndex;    // A random number in the range 0 to 4.
           fontIndex = (int)(Math.random() * 5);
           randomFont = fonts[ fontIndex ];

The switch statement has been replaced by a single line of code. This is a very typical application of arrays. Here is another example of the same sort of thing. Months are often stored as numbers 1, 2, 3, ..., 12. Sometimes, however, these numbers have to be translated into the names January, February, ..., December. The translation can be done with an array. The array could be declared and initialized as

        static String[] monthName = { "January", "February", "March",
                                      "April",   "May",      "June",
                                      "July",    "August",   "September",
                                      "October", "November", "December" };

If mth is a variable that holds one of the integers 1 through 12, then monthName[mth-1] is the name of the corresponding month. Simple array indexing does the translation for us!


[ Next Section | Previous Section | Chapter Index | Main Index ]