Section 7.5
Other Data Structures
TO A GREAT EXTENT, COMPUTER SCIENCE is the study of data structures and algorithms. What, after all, do computers do? They store data and manipulate it according to programmed algorithms. In object oriented programming, the association between data and algorithms is made explicit, since an object is just a collection of data bundled together with the methods, or algorithms, that operate on the data. However, the relationship between data structures and algorithms was recognized long before object-oriented programming became fashionable. The term that has long been used for a particular structure of data, along with the operations that are defined on that data structure, is abstract data type, or ADT. The term "abstract" here implies that the implementation -- the way the data is stored and the instructions that are executed to perform the operations -- is not important. What's important is the interface, the black-box view. Objects support the black-box view very well. An object presents a certain set of methods to the world. The implementation of those methods is hidden inside the object. In a sense, then, every class defines an abstract data type.
Java includes many standard classes that implement useful abstract data types. The String class, for example, implements an ADT in which the data structure is a string of characters and the operations include concatenation of strings, testing whether two strings are equal, determining the length of a string, and so on. You don't know, nor should you care, exactly how the characters are stored in memory or how the operations are implemented. In fact, the implementation might be different on different computers. As long as strings behave the same way on the two computers, it doesn't matter. Strings are defined by their abstract properties, not by their concrete implementation.
In this section, I'll briefly introduce some of the other standard Java classes that implement useful abstract data types. Some of the most important ADT classes -- those that implement input and output, files, and network communication -- will be covered in the next chapter.
The Date Class
The Date class is defined in the package java.util. (That means that if you want to use Dates, you have to import java.util.Date or java.util.* at the beginning of your program or use the full name, java.util.Date, in the program.) An object of type Date represents a particular time, measured to the millisecond, on a particular date. Given a Date object, you can determine what month the date occurs in by calling the instance method getMonth(), which returns a integer in the range 0 to 11. Similarly, there are instance methods for finding out the year, the hour, and so on. There are methods for comparing two Dates to see which one comes earlier in time, for converting a date into a printable string, and for parsing a string such as "September 9, 1998" to produce the Date object that the string represents.
One useful aspect of the Date class is that the default constructor, with no parameters, produces a date object representing the current time. So, if your program wants to know what time it is, you can use the command:
Date now = new Date();
Another constructor takes a string and produces the date object that the string represents. The string should include the name of the month, possibly abbreviated to three letters, the day of the month, and the year. It can also include a time in the form HH:MM or HH:MM:SS and even a time zone specification such as EST or GMT. This information can occur in various orders. This little applet will take strings you type in and try to convert them into Dates:
Now, of course Date objects can be useful in a program. But my main point here is that you can use Date objects effectively without knowing how they are implemented. Date is an abstract data type. (As a matter of fact, Java stores a date internally as the number of milliseconds since midnight on January 1, 1970. But knowing that doesn't help you understand the Date class or use it effectively.)
The BigInteger Class
A value of type int is a 32-bit integer, which can represent values up to a little over two billion. A value of type long is 64 bits, which seems like it should be big enough to represent almost any practical quantity. Nevertheless, Java has a class called BigInteger, defined in the package java.math, to represent integers with an arbitrary number of bits.
The usual arithmetic operators on integers are implemented as methods in the BigInteger class. For example, if bigint1 and bigint2 are objects of type BigInteger, then bigint1.multiply(bigint2) is the BigInteger that results from multiplying bigint1 by bigint2. (Of course, bigint2.multiply(bingint1) would give the same answer.) There are a few less well-known operations, too. For example, bigint1.gcd(bigint2) computes the greatest common divisor of the two integers.
A BigInteger can be constructed from a String:
BigInteger bigint = new BigInteger("1234567890987654321");Conversely, the instance method toString() produces the String representation of a BigInteger. To convert a regular integer into a BigInteger object, use the static method valueOf() from the BigInteger class:
BigInteger answer = BigInteger.valueOf(42);For example, here is a program segment that prints out the value of 1000 factorial (which is equal to 1 times 2 times 3 times ... times 999 times 1000):
BigInteger factorial = BigInteger.valueOf(1); for (int i = 2; i <= 1000; i++) { BigInteger num = BigInteger.valueOf(i); factorial = factorial.multiply(num); } System.out.println(factorial.toString());Now, unless you are a mathematician, all this might leave you unimpressed. However, it turns out that the ability to work with big integers is absolutely central to cryptography, which refers to encoding data so that it cannot be read except by someone who has the necessary "key" to decode it. Cryptography, in turn, is the basis for secure and confidential communication over the Internet. The BigInteger class exists to make it easy for Java programs to work with encrypted data.
The Vector Class
I already introduced the Vector class in an example in Section 4.3. A Vector represents a numbered sequence of objects. A Vector is similar to an array, except that an array has a fixed number of elements, while a vector can grow and shrink as elements are added to it and removed from it. A Vector can only hold objects. There is no such thing as a vector of ints in Java.
Some of the useful instance methods in the class Vector include:
- size() -- returns an int which gives the number of elements currently in the vector.
- elementAt(int index) -- Returns the element in the vector at position index. Elements are numbered starting from zero. An error occurs if index is less than zero or greater than or equal to the current size of the vector.
- addElement(Object obj) -- Add the element, obj, to the end of the vector.
- insertElementAt(Object obj, int index) -- Add the element, obj to the vector in position number index. Elements currently in the vector at position index and above are moved up to make room for the new object.
- removeElement(Object obj) -- If obj occurs in the vector, remove it. Other elements will be moved down to fill in the gap that is left when obj is removed.
- removeElementAt(int index) -- Remove the element in position index from the vector, moving other elements down to fill the gap.
- removeAllElements() -- Make the vector be empty.
Note that the elements in a vector are stored as objects of type Object. Since any class is a subclass of the Object class, objects of any type can be stored in vectors. If vec is a vector, then the value returned by vec.elementAt(index) is of type Object. In most cases, you'll have to type-cast this object back to its actual class in order to use it. In the example from Section 4.3, a vector named shapes is used to store objects belonging to a class named Shape. A for statement is used to draw all the Shapes in the vector to a graphics context named offScreenGraphics:
for (int i = 0; i < shapes.size(); i++) { Shape s = (Shape)shapes.elementAt(i); s.draw(offScreenGraphics); }The Vector class is defined in the package java.util, so a program that uses Vector should start with an import command, either "import java.util.Vector;" or "import java.util.*;".
The StringBuffer Class
Java's String class is designed to be very efficient for the most common operations on strings. However, it is not very efficient for concatenation, the operation of putting two strings together to make a longer string. The problem is that both strings must be copied into the new string. This can be a problem if you need to build up a string by concatenating a lot of smaller strings. For example, consider
String text = ""; // start with an empty string while (moreWords()) { String word = getNextWord(); text = text + word; }This can result in a potentially very long string being copied over and over. The solution is to use a StringBuffer instead of a string. Like a String, a StringBuffer holds a sequence of characters. However, with a StringBuffer, it is possible to append new characters onto the end of the sequence efficiently. The StringBuffer class is defined in the package java.lang, so it is automatically available in any program.
An initially created StringBuffer is empty. If sb is a StringBuffer and str is a string, then sb.append(str) adds the characters from str onto the end of the string buffer. There are versions of append() for chars, ints, and other data types. In each case, sb.append(x) converts x to a string and appends the string onto the end of sb. Note that append does not return a value -- it actually modifies the value of sb.
Once a string has been built inside the StringBuffer sb, sb.toString() will return the contents of the buffer as an ordinary string of type String. So once you've efficiently constructed your string in the buffer, you can change it to a value of type String that can be efficiently manipulated with all the usual String operations.
The HashTable Class
As a final example, we look at hash tables. Hash tables are similar to association lists, which were covered in Section 3. In fact, hash tables and association lists should really be considered as alternative implementations of the same abstract data type. However, hash tables are much more efficient than association lists.
Like an association list, a hash table is a collection of (key,value) pairs. No two pairs in the table can have the same key. The idea is that each value in the table is associated with some key, and that given a key, you can retrieve the associated value. For example, the keys might be names and the values might be phone numbers. Or the keys might be variables and the values might be the addresses of those variables in memory.
In Java, hash tables are implemented by the Hashtable class, which is defined in the package java.util. In a Hashtable, both the keys and the values are objects. The two main operations on hash tables are to get the value associated to a given key and to add a new association pair to the table. In the Hashtable class, these are implemented by the following instance methods:
- get(Object key) -- Return the value associated with the given key. If there is no pair in the table with the given key, then null is returned. The value returned by the get() method is of type Object, and it usually has to be type-cast to a more specific type.
- put(Object key, Object value) -- If there is already a pair in the table with the given key, then replace the value in that pair with the given value. Otherwise, add a new pair to the table containing the given key and value.
The hash table is a very sophisticated and useful data structure with an extremely simple interface. It's a nice example of the power of abstraction.
End of Chapter 7
[ Next Chapter | Previous Section | Chapter Index | Main Index ]