CPSC 225, Spring 2012
Lab 11: Web Server

In this week's lab, you will write a simple multi-threaded web server program. This means working with the socket API for networking, as well as working with files and streams. Adding threads will be the last step. To finish that part, you might need to wait until we have covered threads in class. However, you will have a working web server even without threads.

You should create a new lab11 project. There are no files to copy into the project. You will write the web server from scratch. However, some of the code that you need can be copied and pasted from this page.

This lab is due next week, by Saturday morning, April 21. This is our last full-scale lab; for next week's lab, you will work on your final project. The final lab period, on April 26, will either be devoted to the final project or to a short exercise that you can finish during lab.

About the HTTP Protocol

Web browsers and web servers communicate using the HTTP protocol. This is a "request/response" protocol: The browser sends a request to the server and the server sends back a response. Both the request and the response start with a "header," which consists of one or more lines of text followed by a blank line. The blank line is essential since it marks the end of the header. Lines should be terminated with a CRLF (carriage-return / line-feed, or "\r\n" in Java). The blank line consists of just a CRLF.

After the blank line that ends the headers, a response can contain data, which can be of any type (text, picture, music, etc.). The data that is sent in the response is what appears in the web browser's window; the headers are not displayed to the user of the browser.

Request and response headers can get complicated, but your server will only have to deal with a few possibilities. For the request, you only need to read the first line, which should contain three tokens and should be of the form:

        GET <path-to-file> HTTP/1.1

In fact, you really only need the first two tokens from this line. The first token is called the method. HTTP supports several different methods, but you will only implement GET, and you should send back an error message if the method is anything besides GET.

The second token in the request tells you which file you should send back to the browser. This is the essential piece of data that you need, in order to decide what to transmit to the client.

(The third token in the request must be "HTTP/1.1" or, in some very old browsers, "HTTP/1.0". A real web server should check that this token is correct, but for our purposes, it can be ignored. There will be more lines of data in the request, but again, you can ignore them.)
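As a rough sketch of reading the first two tokens of a request, the following uses a Scanner on a canned request string in place of the socket's input stream. The class name RequestLineSketch and the method readMethodAndPath are made up for this illustration; they are not part of the lab.

```java
import java.util.Scanner;

class RequestLineSketch {

    /**
     * Reads the method and the path (the first two tokens) of a request.
     * Returns null if fewer than two tokens are available.
     */
    static String[] readMethodAndPath(Scanner in) {
        if (!in.hasNext())
            return null;
        String method = in.next();
        if (!in.hasNext())
            return null;
        String path = in.next();
        return new String[] { method, path };
    }

    public static void main(String[] args) {
        // A canned request stands in for the socket's input stream here.
        Scanner in = new Scanner("GET /index.html HTTP/1.1\r\nHost: localhost\r\n\r\n");
        String[] tokens = readMethodAndPath(in);
        if (tokens == null)
            System.out.println("Bad request: fewer than two tokens");
        else if (!tokens[0].equals("GET"))
            System.out.println("Unsupported method: " + tokens[0]);
        else
            System.out.println("Client asked for " + tokens[1]);
    }
}
```

Because a Scanner splits on whitespace by default, the CRLF line endings need no special handling when only tokens are being read.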

A web server has a directory that contains the files that it can send to clients. The <path-to-file> tells how to find the requested file, starting in that directory. For example, if the directory is /classes/cs225/javanotes and if <path-to-file> is /index.html, then the actual file is /classes/cs225/javanotes/index.html. You obtain the actual file path by adding the <path-to-file> onto the server directory.

(A technicality that you can probably ignore: If a file name contains special characters such as spaces, they will be encoded in the HTTP request. To get the real file name from the <path-to-file>, you should really call URLDecoder.decode(<path-to-file>,"UTF-8"). Another note for Windows users: File paths on the web always use the forward slash "/" as a separator. Windows uses a backslash "\". If you try to run the web server on a Windows computer, you need to globally replace "/" with "\" in the <path-to-file> for the server to work.)
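One way the path lookup could be written is sketched below: the path from the request is decoded and then appended to the server directory. The class name PathSketch, the method resolve, and the file name my%20notes.html are invented for the example.

```java
import java.io.File;
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PathSketch {

    /**
     * Decodes the path from the request and appends it to the server directory.
     */
    static File resolve(String serverDirectory, String pathToFile) {
        String decoded;
        try {
            decoded = URLDecoder.decode(pathToFile, "UTF-8");  // "%20" becomes " "
        }
        catch (UnsupportedEncodingException e) {
            // Should not happen: every Java platform supports UTF-8.
            throw new IllegalStateException(e);
        }
        return new File(serverDirectory + decoded);
    }

    public static void main(String[] args) {
        File file = resolve("/classes/cs225/javanotes", "/my%20notes.html");
        System.out.println(file.getPath());
    }
}
```

Note that URLDecoder.decode throws an IllegalArgumentException if the path contains a malformed escape such as "%zz", so the caller should be prepared for exceptions other than IOException.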


Once the web server knows what file is being requested, it has to send a response. Assuming that there has been no error and that you are in fact sending a file, the response should look like this, where <mime-type> is the mime type that describes the type of data in the file, and <file-size> is the number of bytes in the file:

                 HTTP/1.1 200 OK
                 Connection: close
                 Content-Type:  <mime-type>
                 Content-Length:  <file-size>

followed by a blank line and then the contents of the file.
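To make the line endings concrete, here is a small sketch that builds the header of a successful response as a single string, with every line terminated by CRLF. The class OkHeaderSketch and the method okHeader are hypothetical names, and the values text/html and 1234 are sample data only. In the lab itself the header is sent line by line to the socket's output stream rather than built as a string.

```java
class OkHeaderSketch {

    /** Builds the header for a successful response, including the blank line. */
    static String okHeader(String mimeType, long fileSize) {
        return "HTTP/1.1 200 OK\r\n"
             + "Connection: close\r\n"
             + "Content-Type: " + mimeType + "\r\n"
             + "Content-Length: " + fileSize + "\r\n"
             + "\r\n";  // the blank line that marks the end of the header
    }

    public static void main(String[] args) {
        // Show the CR and LF characters explicitly so the line endings are visible.
        String header = okHeader("text/html", 1234);
        System.out.println(header.replace("\r", "\\r").replace("\n", "\\n\n"));
    }
}
```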

If an error occurred, the server should instead send a response that describes the error. For example, if the requested file does not exist, you can send a "Not Found" error response:

                 HTTP/1.1 404 Not Found
                 Connection: close
                 Content-Type: text/plain
                 
                 Sorry, the file that you requested
                 could not be found.

In this case, the last two lines are the content of the response, which will be displayed in the web browser window to the user. Remember that the browser only displays the part that comes after the blank line. (A real server would use Content-Type text/html and send HTML-formatted text as the response.)
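An error response like the one above could be produced by a helper along the following lines. This is only a sketch: the class ErrorResponseSketch and the method sendError are invented here, and a ByteArrayOutputStream takes the place of the socket's output stream so that the result can be printed.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class ErrorResponseSketch {

    /**
     * Writes a complete plain-text error response: status line, headers,
     * blank line, then the message that the browser will display.
     */
    static void sendError(OutputStream out, int statusCode, String reason,
                                         String message) throws IOException {
        String response = "HTTP/1.1 " + statusCode + " " + reason + "\r\n"
                        + "Connection: close\r\n"
                        + "Content-Type: text/plain\r\n"
                        + "\r\n"
                        + message + "\r\n";
        out.write(response.getBytes(StandardCharsets.US_ASCII));
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // A ByteArrayOutputStream stands in for the socket's output stream.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        sendError(out, 404, "Not Found",
                  "Sorry, the file that you requested could not be found.");
        System.out.print(new String(out.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

Passing the status code and reason as parameters lets the same helper serve for any other error that the server decides to report.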

You don't have to implement all possible error responses, but here are some possible errors that you might want to take into account:

Testing A Web Server

A web server program receives "requests" over the network for "URLs" that are available on the server computer. It sends back a "response." The response can be an error message or the file or other data that the requested URL refers to.

The web server does not care whether the request comes from a web browser program. Telnet is a simple program that allows you to communicate with any server that works with plain text data. You can use telnet to contact a web server and send it a request. For example, enter

telnet www.hws.edu 80

on the command line to connect to the web server running on host www.hws.edu, on port number 80. Once you are connected, enter:

GET /index.html HTTP/1.1
Host: www.hws.edu
Connection: close

followed by an extra blank line. The server should respond by sending you the main page of the web site.

When you have written your own web server, you can use telnet to test it. This is a good way to see exactly what response your server will send. Run the server and use a command such as

telnet localhost 8080

on the command line, assuming that your server listens on port 8080. Then type a request such as

GET /index.html HTTP/1.1

The server should send back a correct header followed by a blank line and an error message or the content of the requested file. (Your server won't need the extra lines that are needed when you are sending a request to www.hws.edu.)
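If telnet is not available, a few lines of Java can play the same role. The sketch below sends a GET request and returns everything the server sends back, header included. The class RawRequestSketch and the method fetch are hypothetical names, and the defaults localhost, 8080, and /index.html are assumptions that match the examples on this page. It reads until end-of-stream, so it relies on the server closing the connection after its response.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class RawRequestSketch {

    /** Sends a GET request and returns the raw response, header and all. */
    static String fetch(String host, int port, String path) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            String request = "GET " + path + " HTTP/1.1\r\n"
                           + "Host: " + host + "\r\n"
                           + "Connection: close\r\n"
                           + "\r\n";
            OutputStream out = socket.getOutputStream();
            out.write(request.getBytes(StandardCharsets.US_ASCII));
            out.flush();
            // Read until the server closes the connection.
            InputStream in = socket.getInputStream();
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int count;
            while ((count = in.read(buffer)) != -1)
                response.write(buffer, 0, count);
            return new String(response.toByteArray(), StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 8080;
        String path = args.length > 2 ? args[2] : "/index.html";
        try {
            System.out.print(fetch(host, port, path));
        }
        catch (IOException e) {
            System.out.println("Could not contact " + host + ":" + port + ".  Error: " + e);
        }
    }
}
```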

Your server should also work with a web browser such as Firefox, Chrome, Safari, or Internet Explorer. You just need to enter an appropriate URL in the location box of the browser. For the URL, you need to know the host where the server is running and the port number where it is listening. When you are running the browser on the same computer as the server, you can use localhost as the host name, so you would enter a URL something like this:

                localhost:8080/index.html

This is a request for the file named index.html on the top level of the server's directory. It assumes that the port number is 8080.

The server can also be contacted from a browser running on another computer. In that case, the URL can use the IP address of the computer where the server is running. For example:

                172.30.10.43:8080/index.html

You should make sure to test that both files and error messages sent by your server will appear correctly in a web browser window.

Writing your Server

Start by creating a new class for your server program. The main() routine for the class can be the standard server program like the ones that we looked at in class:

public static void main(String[] args) {
    ServerSocket server;
    try {
        server = new ServerSocket(PORT);
        System.out.println("LISTENING ON PORT NUMBER " + PORT);
    }
    catch (IOException e) {
        System.out.println("Could not start server.  Error: " + e);
        System.exit(1);
        return;  // Never reached, but lets the compiler see that server is assigned below.
    }
    try {
        while (true) {
            Socket socket = server.accept();
            System.out.println("Connection from " + socket.getRemoteSocketAddress());
            handleConnection(socket);
        }
    }
    catch (IOException e) {
        System.out.println("Some Error Occurred.  Shutting down server.");
        System.out.println("Error: " + e);
    }
}

This assumes that the port number is given by a constant named PORT. Your job is to write the handleConnection method.

It is important that the handleConnection() method catches any exception that occurs (not just IOException), so that the exception does not crash the entire server. Furthermore, it should make sure to close the socket at the end, so that the network connection is not kept open indefinitely.
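The overall shape that this suggests is a try statement with a catch clause for Exception and a finally clause that closes the socket. The following is a minimal skeleton under that reading, not a complete solution; the class name HandlerSkeleton is made up, and the request handling itself is left out.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;

class HandlerSkeleton {

    static void handleConnection(Socket socket) {
        try {
            InputStream in = socket.getInputStream();
            OutputStream out = socket.getOutputStream();
            // ... read the request from in, write the response to out ...
        }
        catch (Exception e) {
            // Catch everything, so that one bad connection cannot stop the server.
            System.out.println("Error while communicating with client: " + e);
        }
        finally {
            try {
                socket.close();  // always release the connection
            }
            catch (Exception e) {
                // Nothing useful can be done if closing fails.
            }
        }
    }

    public static void main(String[] args) {
        // An unconnected socket makes getInputStream() throw, which exercises
        // the catch and finally clauses without needing a network.
        Socket socket = new Socket();
        handleConnection(socket);
        System.out.println("Socket closed: " + socket.isClosed());
    }
}
```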

The handleConnection method must read the request from the socket's input stream, then write a response to the socket's output stream. As mentioned above, you really only need to read the first two tokens from the input stream; you can use a Scanner to do that. The output stream is a more difficult matter because some of the data that you have to send can be binary data (for example, if the response is an image file). This means that you shouldn't use a PrintWriter. Instead, you should use the output stream directly, and you should copy the following method into your code and use it for each line of text that you want to send:

/**
 * Sends one line of text to an OutputStream, in proper format for HTTP.
 * A carriage return and line feed are added to serve as end-of-line.
 * @param out  The stream where the text will be written.
 * @param text  The text that will be written, which should consist
 *    of ASCII characters only.  If text is null, no characters are
 *    transmitted, but the end-of-line is still sent.
 * @throws IOException if an error occurs while transmitting the data
 */
private static void sendAscii(OutputStream out, String text) 
                                                       throws IOException {
    if (text != null) {
        for (int i = 0; i < text.length(); i++)
            out.write(text.charAt(i));
    }
    out.write('\r');
    out.write('\n');
}

For copying the contents of a file to the output stream, you can use the following method, which we covered in class:

/**
 * Copies bytes from an input stream to an output stream until 
 * end-of-stream is detected.
 * @throws IOException if an IOException occurs during copying
 */
private static void copy(InputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[8192];
    int count = in.read(buffer);
    while (count != -1) {
        out.write(buffer,0,count);
        count = in.read(buffer);
    }
    out.flush();
}

Your interaction with the client must follow the HTTP protocol. Read the first two tokens from the request. If the first token is not GET, send an error. If the first token is GET, you have to decide which file to send. You do this by adding the second token onto the name of the server directory. You can use "/classes/cs225/javanotes" as the name of the server directory, if you want. Or, if you have your own web site in your www directory, you might use that instead. In any case, once you have the file, you have to check: first, that the file exists; second, that the file is readable; and third, that the file is not a directory. If the file exists, is readable, and is not a directory, you will need to know the length of the file for the response header. You will need the File class for all this, and you will need a FileInputStream to get the contents of the file.
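The three checks on the file can be sketched with methods of the File class, as below. The class FileCheckSketch and the method statusFor are invented for this example. The use of status code 403 (the standard HTTP "Forbidden" code) for directories and unreadable files is an assumption made here; the lab does not prescribe which code to send in those cases.

```java
import java.io.File;

class FileCheckSketch {

    /**
     * Picks a status code for a requested file: 200 if it can be sent,
     * 404 if it does not exist, 403 if it is a directory or is unreadable.
     */
    static int statusFor(File file) {
        if (!file.exists())
            return 404;
        if (file.isDirectory() || !file.canRead())
            return 403;   // assumed choice of code; see the note above
        return 200;
    }

    public static void main(String[] args) {
        File file = new File(args.length > 0 ? args[0] : ".");
        int status = statusFor(file);
        System.out.println(file + " -> " + status);
        if (status == 200)
            System.out.println("Content-Length would be " + file.length());
    }
}
```

For a file that passes all three checks, file.length() gives the number of bytes to report in the Content-Length line.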

You will also need to know the "mime type" of the file for the response header. The mime type can be determined from the file extension, the last part of the file name. You can use the following method, which will cover almost all of the files that your server is likely to encounter:

private static String getMimeTypeFromFileName(String fileName) {
     int pos = fileName.lastIndexOf('.');
     if (pos < 0)  // no file extension in name
         return "x-application/x-unknown";
     String ext = fileName.substring(pos+1).toLowerCase();
     if (ext.equals("txt")) return "text/plain";
     else if (ext.equals("html")) return "text/html";
     else if (ext.equals("htm")) return "text/html";
     else if (ext.equals("css")) return "text/css";
     else if (ext.equals("js")) return "text/javascript";
     else if (ext.equals("java")) return "text/x-java";
     else if (ext.equals("jpeg")) return "image/jpeg";
     else if (ext.equals("jpg")) return "image/jpeg";
     else if (ext.equals("png")) return "image/png";
     else if (ext.equals("gif")) return "image/gif"; 
     else if (ext.equals("ico")) return "image/x-icon";
     else if (ext.equals("class")) return "application/java-vm";
     else if (ext.equals("jar")) return "application/java-archive";
     else if (ext.equals("zip")) return "application/zip";
     else if (ext.equals("xml")) return "application/xml";
     else if (ext.equals("xhtml")) return "application/xhtml+xml";
     else return "x-application/x-unknown";
        // Note:  x-application/x-unknown  is something made up;
        // it will probably make the browser offer to save the file.
}

Adding Threads

The server that you have written is single-threaded. It can only handle one request at a time. If a second request comes in while you are working on another request, the second request will have to wait until you are finished with the first request, even if that takes a long time because you are sending a large file over a slow network. This is not acceptable for a real server. A real server should be multi-threaded, with several threads to handle connections.

One way to write a multi-threaded server is to start a new thread to handle each connection request. (Note that this solution is still not acceptable for real servers, because starting a new thread is a relatively time-consuming thing, and because you don't want to have the possibility of having too many threads running at the same time.)

To use this technique, you will need a subclass of Thread. The class needs a run() method to specify the task that the thread will perform. In this case, it should handle one connection request. We can pass the socket for that connection to the constructor of the class. Here's the class:

private static class ConnectionThread extends Thread {
    Socket socket;
    ConnectionThread(Socket socket) {
       this.socket = socket;
    }
    public void run() {
       handleConnection(socket);
    }
}

Now, in the main routine, instead of calling handleConnection directly, you will create and start a thread of type ConnectionThread. That's all there is to it! With this change, you should have a minimal but functional multi-threaded web server.
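The change in the accept loop amounts to something like the startHandler method in the sketch below. The class ThreadStartSketch and the method startHandler are hypothetical, and handleConnection is replaced by a stand-in that only closes the socket, so that the example can run without a network.

```java
import java.net.Socket;

class ThreadStartSketch {

    // Stand-in for the lab's handleConnection(); it only closes the socket.
    static void handleConnection(Socket socket) {
        try {
            socket.close();
        }
        catch (Exception e) {
        }
    }

    static class ConnectionThread extends Thread {
        Socket socket;
        ConnectionThread(Socket socket) {
            this.socket = socket;
        }
        public void run() {
            handleConnection(socket);
        }
    }

    /** What the body of the accept loop might do with each new socket. */
    static Thread startHandler(Socket socket) {
        Thread thread = new ConnectionThread(socket);
        thread.start();  // start(), not run(): run() would execute in the calling thread
        return thread;
    }

    public static void main(String[] args) throws InterruptedException {
        Socket socket = new Socket();  // unconnected; stands in for server.accept()
        Thread thread = startHandler(socket);
        thread.join();                 // wait only so the demo can report the result
        System.out.println("Handled in " + thread.getName()
                                + ", closed: " + socket.isClosed());
    }
}
```

In the real server the main routine would not call join(); it would go straight back to server.accept() so that the next connection can be accepted while this one is being handled.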

(A better way to write a multi-threaded server is to use a thread pool, a technique that we will discuss in class.)
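For the curious, here is one possible sketch of the thread pool idea, using the ExecutorService from the standard java.util.concurrent package. This may well differ from the approach taken in class. The class ThreadPoolSketch, the method submit, and the pool size of 10 are assumptions made for the illustration, and handleConnection is again a stand-in that only closes the socket.

```java
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class ThreadPoolSketch {

    static final int POOL_SIZE = 10;  // assumed value, chosen only for illustration

    // Stand-in for the lab's handleConnection(); it only closes the socket.
    static void handleConnection(Socket socket) {
        try {
            socket.close();
        }
        catch (Exception e) {
        }
    }

    /** Hands one connection to the pool instead of creating a new thread. */
    static void submit(ExecutorService pool, final Socket socket) {
        pool.execute(new Runnable() {
            public void run() {
                handleConnection(socket);
            }
        });
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(POOL_SIZE);
        Socket[] sockets = new Socket[25];
        for (int i = 0; i < sockets.length; i++) {
            sockets[i] = new Socket();  // unconnected; stands in for server.accept()
            submit(pool, sockets[i]);
        }
        pool.shutdown();                             // accept no new tasks
        pool.awaitTermination(5, TimeUnit.SECONDS);  // let queued tasks finish
        int closed = 0;
        for (Socket s : sockets)
            if (s.isClosed())
                closed++;
        System.out.println(closed + " of " + sockets.length + " connections handled");
    }
}
```

A fixed-size pool reuses the same few threads for all connections, which addresses both objections to the thread-per-connection design: no thread is created per request, and the number of threads running at once is bounded.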