CS124: Java, Section 8.4

Section 8.4
Networking

As far as a program is concerned, a network is just another possible source of input data, and another place where data can be output. That does oversimplify things, because networks are still not quite as easy to work with as files are. But in Java, you can do network communication using input streams and output streams, just as you can use such streams to communicate with the user or to work with files. Opening a network connection between two computers is a bit tricky, since there are two computers involved and they have to somehow agree to open a connection. And when each computer can send data to the other, synchronizing communication can be a problem. But the fundamentals are the same as for other forms of I/O.

One of the standard Java packages is called java.net. This package includes several classes that can be used for networking. Two different styles of network I/O are supported. One of these, which is fairly high-level, is based on the World-Wide Web, and provides the sort of network communication capability that is used by a Web browser when it downloads pages for you to view. The main class for this style of networking is called URL. An object of type URL is an abstract representation of a Uniform Resource Locator, which is an address for an HTML document or other resource on the Web.

The second style of I/O is much more low-level, and is based on the idea of a socket. A socket is used by a program to establish a connection with another program on a network. Two-way communication over a network involves two sockets, one on each of the computers involved in the communication. Java uses a class called Socket to represent sockets that are used for network communication. The term "socket" presumably comes from an image of physically plugging a wire into a computer to establish a connection to a network, but it is important to understand that a socket, as the term is used here, is simply an object belonging to the class Socket. In particular, a program can have several sockets at the same time, each connecting it to another program running on some other computer on the network. All these connections use the same physical network connection.

This section gives a brief introduction to the URL and Socket classes, and shows how they relate to input and output streams and to exceptions.

The URL Class

The URL class is used to represent resources on the World-Wide Web. Every resource has an address, which identifies it uniquely and contains enough information for a Web browser to find the resource on the network and retrieve it. The address is called a "url" or "uniform resource locator." See Section 5.3 for more information.

An object belonging to the URL class represents such an address. If you have a URL object, you can use it to open a network connection to the resource at that address. The URL class, and an associated class called URLConnection, provide a large number of methods for working with such connections, but the most straightforward method -- and the only one I will talk about here -- yields an object of type InputStream that can be used to read the data contained in the resource. For example, if the resource is a standard Web page in HTML format, then the data read through the input stream is the actual HTML code that describes the page.

A url is ordinarily specified as a string, such as "http://math.hws.edu/eck/index.html". There are also relative url's. A relative url specifies the location of a resource relative to the location of another url, which is called the base or context for the relative url. For example, if the context is given by the url http://math.hws.edu/eck/, then the incomplete, relative url "index.html" would really refer to http://math.hws.edu/eck/index.html.

An object of the class URL is not simply a string, but it can be constructed from a string representation of a url. A URL object can also be constructed from another URL object, representing a context, and a string that specifies a url relative to that context. These constructors have prototypes
          public URL(String urlName) throws MalformedURLException
          
and

          public URL(URL context, String relativeName) throws MalformedURLException
Note that these constructors will throw an exception of type MalformedURLException if the specified strings don't represent legal url's. So of course it's a good idea to put your call to the constructor inside a try statement and handle the potential MalformedURLException in a catch clause.
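As a small illustration of the two constructors, here is a sketch that builds a URL from a context and a relative name, and signals a malformed url by returning null. The class name UrlParseDemo and the method resolve() are made up for this example. Recent versions of Java mark the URL constructors as deprecated, so you may see a compiler warning, but they still work as described here.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlParseDemo {

    // Resolves relativeName against contextName and returns the resulting
    // absolute url as a string, or null if a string is not a legal url.
    static String resolve(String contextName, String relativeName) {
        try {
            URL context = new URL(contextName);             // first constructor
            URL resolved = new URL(context, relativeName);  // second constructor
            return resolved.toString();
        }
        catch (MalformedURLException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("http://math.hws.edu/eck/", "index.html"));
        System.out.println(resolve("no protocol here", "index.html"));
    }
}
```

With the context http://math.hws.edu/eck/ and the relative name index.html, resolve() returns http://math.hws.edu/eck/index.html. The second call in main() passes a context string that has no protocol, so the constructor throws a MalformedURLException and resolve() returns null.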

When you write an applet, there are two methods available that provide useful URL contexts. The method getDocumentBase(), defined in the Applet class, returns an object of type URL. This URL represents the location from which the HTML page that contains the applet was downloaded. This allows the applet to go back and retrieve other files that are stored in the same location as that document. For example,
           URL address = new URL(getDocumentBase(), "data.txt");
constructs a URL that refers to a file named data.txt on the same computer and in the same directory as the web page in which the applet is running. Another method, getCodeBase(), returns a URL that gives the location of the applet itself (which is not necessarily the same as the location of the document).

Once you have a valid URL object, the method openStream() from the URL class can be used to obtain an InputStream, which can then be used to read the data from the resource that the URL points to. For example, if address is an object of type URL, you could simply say
           InputStream in = address.openStream();
to get the input stream. This method does all the work of opening a network connection, and when you read from the input stream, it does all the hard work of obtaining data over that connection. To make things easier on yourself, you could wrap the InputStream object in a DataInputStream or AsciiInputStream and do all your input through that.
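Here is a minimal sketch of the wrapping idea, using the standard classes InputStreamReader and BufferedReader so that the data can be read a line at a time. The class and method names are invented for this example, and the text is assumed to be encoded as UTF-8. So that the example can run without a network, it reads from a file: url that points to a temporary file. An http: url would be read in exactly the same way.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class OpenStreamDemo {

    // Reads all the text from the resource at the given url and returns it,
    // with each line terminated by '\n'. Returns null if an I/O error occurs.
    static String readAll(URL address) {
        try (InputStream in = address.openStream();
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(in, StandardCharsets.UTF_8))) {
            StringBuilder text = new StringBuilder();
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
            return text.toString();
        }
        catch (IOException e) {
            return null;
        }
    }

    // For illustration only: writes two lines to a temporary file and
    // returns a file: url that refers to it.
    static URL makeSampleUrl() {
        try {
            File file = File.createTempFile("sample", ".txt");
            file.deleteOnExit();
            try (FileWriter out = new FileWriter(file)) {
                out.write("first line\nsecond line\n");
            }
            return file.toURI().toURL();
        }
        catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.print(readAll(makeSampleUrl()));
    }
}
```

The try statement in readAll() lists the streams in parentheses, which tells Java to close them automatically when the block ends, whether or not an exception occurs.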

Various exceptions can be thrown as the attempt is made to open the connection and read data from it. Most of these exceptions are of type IOException, and such errors must be caught and handled. But these operations can also cause security exceptions. An object of type SecurityException is thrown when a program attempts to perform some operation that it does not have permission to perform. For example, a Web browser is typically configured to forbid an applet from making a network connection to any computer other than the computer from which the applet was downloaded. If an applet attempts to connect to some other computer, a SecurityException is thrown. A security exception can be caught and handled like any other exception.

To put this all together, here is a subroutine that could be used in an applet to read a file over the network. The contents of that file, which are assumed to be in plain text format, are stored in a StringBuffer as they are read. (A StringBuffer is similar to a String, except that it can grow in size as characters are appended to it.) At the end of the method, the contents of the StringBuffer are returned as a String. This version is somewhat simplified, and the error handling is certainly not good enough for serious use:
   String loadURL(String urlName) {
   
        // Loads the data in the url specified by urlName, relative
        // to the document base, and returns that data as a String.
        // Exception handling is used to detect and respond to errors
        // that might occur by returning an error message.

      try {
      
         URL url = new URL(getDocumentBase(), urlName);   // Create an input stream
         InputStream in = url.openStream();               //    for reading the data
                                                          //    from the url.

         StringBuffer buffer = new StringBuffer();   // Store input data here until
                                                     //     it has all been read.
                                                          
         int input;  // one item read from the input stream
         do {
            input = in.read();  // This is either -1, if all the data has been
                                // read, or else it is the ASCII code of a 
                                // character read from the input stream.
            if (input >= 0) {
                char ch = (char)input;   // Convert the ASCII code to a char.
                buffer.append(ch);       // Add the character to the buffer.
            }
         } while (input >= 0);
         
         in.close();  // close the input stream (and the network connection)
         
         return buffer.toString(); // return the data that has been read.

      }
      catch (MalformedURLException e) {  // can be thrown when URL is created
         return "ERROR!  Improper syntax given for the URL to be loaded.";
      }
      catch (SecurityException e) {  // can be thrown when the connection is created
         return "SECURITY ERROR!  " + e;
      }
      catch (IOException e) {  // can be thrown while data is being read
          return "INPUT ERROR!  " + e;
      }
      
   } // end of loadURL() method
Because it can take some time to open a network connection and read the data from it, it is reasonable to create a separate Thread object to do the work asynchronously. Here is an actual working applet that uses this technique. The applet is configured so that it will attempt to load its own source code when it runs. (If there is a problem, and if you would still like to see the source code, here is a direct link.)

You can also try to use this applet to look at the HTML source code for this very page. Just type s4.html into the input box at the bottom of the applet and then click on the Load button. However, this might generate a security exception, depending on the configuration of your browser. If so, you'll get a message to that effect in the applet. You might want to experiment with generating other errors. For example, entering bogus.html is likely to generate a FileNotFoundException, since no document of that name exists in the directory that contains this page. As another example, you can probably generate a security error by trying to connect to http://www.whitehouse.gov.

Sockets, Clients, and Servers

Communication over the Internet is based on a pair of protocols called the Internet Protocol and the Transmission Control Protocol, which are collectively referred to as TCP/IP. (In fact, there is a basic type of communication that can be done without TCP, but for this discussion, I'll stick to the full TCP/IP, which provides reliable two-way communication between networked computers.)

For two programs to communicate using TCP/IP, each program must create a socket, as discussed earlier in this section, and those sockets must be connected. Once such a connection is made, communication takes place using input streams and output streams. Each program has its own input stream and its own output stream. Data written by one program to its output stream is transmitted to the other computer. There, it enters the input stream of the program at the other end of the network connection. When that program reads data from its input stream, it is receiving the data that was transmitted to it over the network.

The hard part, then, is making a network connection in the first place. Two sockets are involved. To get things started, one program must create a socket that will wait passively until a connection request comes in from another socket. The waiting socket is said to be listening for a connection. On the other side of the connection-to-be, another program creates a socket that sends out a connection request to the listening socket. When the listening socket receives the connection request, it responds, and the connection is established. Once that is done, each program can obtain an input stream and an output stream for the connection. Communication takes place through these streams until one program or the other closes the connection.

A program that creates a listening socket is sometimes said to be a server, and the socket is called a server socket. A program that connects to a server is called a client, and the socket that it uses to make a connection is called a client socket. The idea is that the server is out there somewhere on the network, waiting for a connection request from some client. The server can be thought of as offering some kind of service, and the client gets access to that service by connecting to the server. This is called the client/server model of network communication. In many actual applications, a server program can provide connections to several clients at the same time. When a client connects to a server's listening socket, that socket does not stop listening. Instead, it continues listening for additional client connections at the same time that the first client is being serviced.

This client/server model, in which there is one server program that supports multiple clients, is a perfect application for threads. A server program has one main thread that manages the listening socket. This thread runs continuously as long as the server is in operation. Whenever the server socket receives a connection request from a client, the main thread makes a new thread to handle the communications with that particular client. This client thread will run only as long as the client stays connected. The server thread and any active client threads all run simultaneously, in parallel. Client programs, on the other hand, tend to be simpler, having just one network connection and just one thread (although there is nothing to stop a program from using several client sockets at the same time, or even a mixture of client sockets and server sockets).

The URL class that was discussed at the beginning of this section uses a client socket behind the scenes to do any necessary network communication. On the other side of that connection is a server program that accepts a connection request from the URL object, reads a request from that object for some particular file on the server computer, and responds by transmitting the contents of that file over the network back to the URL object. After transmitting the data, the server closes the connection.

To implement TCP/IP connections, the java.net package provides two classes, ServerSocket and Socket. A ServerSocket represents a listening socket that waits for connection requests from clients. A Socket represents one endpoint of an actual network connection. A Socket, then, can be a client socket that sends a connection request to a server. But a Socket can also be created by a server to handle a connection request from a client. This allows the server to create multiple sockets and handle multiple connections. (A ServerSocket does not itself participate in connections; it just listens for connection requests and creates Sockets to handle the actual connections.)

To use Sockets and ServerSockets, you need to know about internet addresses. After all, a client program has to have some way to specify which computer, among all those on the network, it wants to communicate with. Every computer on the Internet has an IP address which identifies it uniquely among all the computers on the net. Many computers can also be referred to by domain names such as math.hws.edu or www.whitehouse.gov. (See Section 1.7.) Now, a single computer might have several programs doing network communication at the same time, or one program communicating with several other computers. To allow for this possibility, a port number is added to the Internet address. A port number is simply a 16-bit integer, that is, a number in the range 0 to 65535. A server does not simply listen for connections -- it listens for connections on a particular port. A potential client must know both the Internet address of the computer on which the server is running and the port number on which the server is listening. A Web server, for example, generally listens for connections on port 80; other standard Internet services also have standard port numbers. (The standard port numbers are all less than 1024. If you create your own server programs, you should use port numbers of 1024 or greater.)

When you construct a ServerSocket object, you have to specify the port number on which the server will listen. The prototype for the constructor is
             public ServerSocket(int port) throws IOException
As soon as the ServerSocket is established, it starts listening for connection requests. The accept() method in the ServerSocket class accepts such a request, establishes a connection with the client, and returns a Socket that can be used for communication with the client. The accept() method has the form
            public Socket accept() throws IOException
When you call the accept() method, it will not return until a connection request is received (or until some error occurs). The method is said to block while waiting for the connection. While the method is blocked, the thread that called the method can't do anything else. However, other threads in the same program can proceed. (This is why a server needs a separate thread just to wait for connection requests.) The ServerSocket will continue listening for connections until it is closed, using its close() method, or until some error occurs.

Suppose that you want a server to listen on port 1728. Each time the server receives a connection request, it should create a new thread to handle the connection with the client. Suppose that you've written a method createServiceThread(Socket) that creates such a thread. Then a simple version of the run() method for the server thread would be:
              public void run() {
                 try {
                    ServerSocket server = new ServerSocket(1728);
                    while (true) {
                       Socket connection = server.accept();
                       createServiceThread(connection);
                    }
                 }
                 catch (IOException e) {
                    System.out.println("Server shut down with error: " + e);
                 }
              }
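To make this more concrete, here is a sketch of a complete little server built around the same loop. The method createServiceThread() was left unspecified above; the version here, which sends each line back to the client in upper case, is just one possibility invented for this example, as are the class name and the helper methods start() and ask(). The server is started on port 0, which asks the system for any free port, so that the example does not depend on port 1728 being available.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Locale;

class ThreadedEchoServer {

    // Opens a listening socket on the given port, starts the server
    // thread, and returns the ServerSocket so the caller can close it.
    static ServerSocket start(int port) {
        try {
            ServerSocket server = new ServerSocket(port);
            Thread serverThread = new Thread(() -> run(server));
            serverThread.setDaemon(true);   // don't keep the program alive
            serverThread.start();
            return server;
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The server's main loop: accept a connection, hand it off, repeat.
    static void run(ServerSocket server) {
        try {
            while (true) {
                Socket connection = server.accept();
                createServiceThread(connection);
            }
        }
        catch (IOException e) {
            // This is also how the loop ends when the ServerSocket is closed.
            System.out.println("Server shut down: " + e);
        }
    }

    // One possible service thread: echo each line back in upper case
    // until the client closes its end of the connection.
    static void createServiceThread(Socket connection) {
        Thread service = new Thread(() -> {
            try (Socket c = connection;
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(c.getInputStream()));
                 PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
                String line;
                while ((line = in.readLine()) != null) {
                    out.println(line.toUpperCase(Locale.ROOT));
                }
            }
            catch (IOException e) {
                System.out.println("Connection ended with error: " + e);
            }
        });
        service.setDaemon(true);
        service.start();
    }

    // Throwaway client for trying out the server: sends one line to the
    // server on this computer and returns the reply, or null on error.
    static String ask(int port, String message) {
        try (Socket s = new Socket("localhost", port);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            out.println(message);
            return in.readLine();
        }
        catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) throws IOException {
        ServerSocket server = start(0);   // 0 means "any free port"
        System.out.println(ask(server.getLocalPort(), "hello"));
        server.close();                   // makes accept() throw, ending run()
    }
}
```

Because each connection gets its own thread, a slow client does not keep the main loop from getting back to accept() and picking up the next connection request.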
On the client side, a client socket is created using a constructor in the Socket class. To connect to a server on a known computer and port, you would use the constructor
            public Socket(String computer, int port) throws IOException
This constructor will block until the connection is established or until an error occurs. (This means that even when you write a client program, you might want to use a separate thread to handle the connection, so that the program can continue to respond to user inputs while the connection is being established. Otherwise, the program will just freeze for some indefinite period of time.) Once the connection is established, you can use the methods getInputStream() and getOutputStream() to obtain streams that can be used for communication over the connection. Keeping all this in mind, here is the outline of a method for working with a client connection:
            void doClientConnection(String computerName, int listeningPort) {
                   // computerName should give the name of the computer
                   // where the server is running, such as math.hws.edu;
                   // listeningPort should be the port on which the server
                   // listens for connections, such as 1728.
               Socket connection;
               InputStream in;
               OutputStream out;
               try {
                  connection = new Socket(computerName,listeningPort);
                  in = connection.getInputStream();
                  out = connection.getOutputStream();
               }
               catch (IOException e) {
                  System.out.println("Attempt to create connection failed with error: " + e);
                  return;
               }
                .
                .  // use the streams, in and out, to communicate with server
                .
               try {
                  connection.close();
               }
               catch (IOException e) {
                  System.out.println("Error while closing connection: " + e);
               }
            }
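The outline leaves open what the client actually says to the server. As one hedged possibility, here is a version that sends a single line of text and returns the single line that the server sends back. The extra request parameter, the class name, and the stand-in server startOneShotServer() are all assumptions made for this example; the stand-in server is there only so that the client has something to talk to when you try it out.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

class ClientDemo {

    // Connects to the server, sends one request line, and returns the
    // server's one-line reply, or an error message if something goes wrong.
    static String doClientConnection(String computerName, int listeningPort,
                                     String request) {
        try (Socket connection = new Socket(computerName, listeningPort);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(connection.getInputStream()));
             PrintWriter out = new PrintWriter(connection.getOutputStream(), true)) {
            out.println(request);    // autoflush sends the line right away
            return in.readLine();    // blocks until the server answers
        }
        catch (IOException e) {
            return "Connection failed with error: " + e;
        }
    }

    // Stand-in server, for illustration only. It listens on any free port,
    // answers a single client with "You said: " plus the client's line,
    // and then shuts down. Returns the port on which it is listening.
    static int startOneShotServer() {
        try {
            ServerSocket server = new ServerSocket(0);
            int port = server.getLocalPort();
            Thread serverThread = new Thread(() -> {
                try (ServerSocket s = server;
                     Socket connection = s.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(connection.getInputStream()));
                     PrintWriter out = new PrintWriter(
                             connection.getOutputStream(), true)) {
                    out.println("You said: " + in.readLine());
                }
                catch (IOException e) {
                    System.out.println("Server error: " + e);
                }
            });
            serverThread.setDaemon(true);
            serverThread.start();
            return port;
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int port = startOneShotServer();
        System.out.println(doClientConnection("localhost", port, "hello"));
    }
}
```

Both ends have to agree on who talks first and on what marks the end of a message. Here, the client sends one line and then waits, and the server reads one line and then answers. If both sides waited to read before writing anything, the two programs would simply block forever.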
All this makes network communication sound easier than it really is. (And if you think it sounded hard, then it's even harder.) If networks were completely reliable, things would be almost as easy as I've described. The problem, though, is to write robust programs that can deal with network and human error. I won't go into detail here -- partly because I don't really know enough about serious network programming in Java myself. However, what I've covered here should give you the basic ideas of network programming, and it is enough to write some simple network applications. (Just don't try to write a replacement for Netscape.)

End of Chapter 8
