CPSC 441, Fall 2002
Lab 5: Web Server, Part 2
YOU SHOULD NOW have a working, if fairly basic, web server program. How can it be improved? The next stage in the project is to add some features to the Web server. If you are not satisfied with your program, you can use mine instead as a basis for further work. You will find it in the directory /home/cs441/httpd.1.
My solution actually uses a new version of the Socket class. The new version can be found in the directory /home/cs441/StreamSockets as well as in the httpd.1 directory. In the new version, a Socket has an associated input stream and an associated output stream that can be used for sending and receiving data. This is an alternative to the send, sendBinary, receive, and receiveBinary functions. If socket is a variable of type Socket* then socket->in() is an input stream and socket->out() is an output stream. These streams can be used just like file streams with the << and >> operators. Data written to socket->out() is transmitted through the socket to the other end of the connection. Data transmitted from the other end can be read using socket->in(). For example, you can say:
    socket->out() << "HTTP/1.0 200 OK\r\n";

to transmit a line of text over the connection. It's also possible to give alternative names to the input and output streams by declaring variables of type ifstream& and ofstream& as follows:
    ifstream& in(*socket);
    ofstream& out(*socket);

With these declarations, the stream variables in and out can be used for communicating over the connection. For more information, look at the file Socket.h in the StreamSockets or httpd.1 directories. You can also look at the programs chat.cc and threaded_chat.cc in StreamSockets. These files have been modified to use streams.
You do not necessarily have to use the new version of Socket in your work. Even if you don't, you might be interested in some of the utility functions that are declared in the file http_support.h.
Non-text Files
The program that you wrote was only required to handle text files. As a first improvement, you should modify it to handle image files. What happens if you use your current program with an image file? Try it, and find out. It might depend on the browser you use, but some browsers will try to interpret the image data as text and will display the text in the browser.
Image files have the extensions "jpg" or "jpeg", "gif", or "png", and they should be sent with the Content-types image/jpeg, image/gif, and image/png, respectively. You cannot transmit an image file line-by-line since it is not made up of lines. You can use the following function for sending an image file. In fact, you could send all files, including text files, using this function:
    void sendFile(Socket *socket, ifstream &file) {
           // This function copies the contents of the file to the
           // socket.  For efficiency, it does this in chunks of
           // two kilobytes at a time.
       char buffer[2048];
       while (true) {
          file.read(buffer,2048);  // Try to read 2048 characters from file.
          streamsize chars = file.gcount();  // Actual number of chars read.
          if (chars == 0) {
                // If no chars are read, it is because there is
                // no more data in the file, so we are done.
             break;
          }
          socket->sendBinary(buffer,chars);
       }
    }

(My program uses a slightly different stream-based function named send_file().)
Mime types
In HTTP, Content-types are specified as "mime types." These were originally developed to specify the content type of non-textual email. Now, however, they are used almost universally to specify the type of data. A Web browser actually associates mime types with file extensions such as "html" or "jpg". Your program only uses a few different mime types, so it's easy enough to write a few if statements to check for the relevant file extensions. However, this becomes messy if you want to handle many more possibilities -- especially if you want to be able to add new mime types easily. Ideally, the mime types should be stored in a table. For example, you could use an array of structs of the type:
    struct MimeType {
       string extension;  // File extension for this type of data.
       string mimetype;   // The mime type, such as "image/gif".
    };

Mime types that begin with "text/" are for text files. You only need to know that, however, if you want to treat text files differently from binary files -- for example by sending text files line-by-line instead of in chunks. Again: This is not really necessary.
Modify your program so that it keeps a table of mime types and uses that table to construct "Content-type" headers based on file extensions. If you come across a file extension that is not in the table, assume that the file contains plain text data. (If you remember the standard template library, you might want to use a map for your table.)
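As one possible shape for such a table, here is a minimal sketch that uses a map from file extension to mime type. The function names fileExtension, defaultMimeTable, and mimeTypeFor are hypothetical, and the handful of entries is only a starting point.

```cpp
#include <map>
#include <string>

// Returns the part of the file name after the last '.', or an empty
// string if the name contains no '.'.
std::string fileExtension(const std::string& filename) {
    std::string::size_type dot = filename.rfind('.');
    if (dot == std::string::npos)
        return "";
    return filename.substr(dot + 1);
}

// Builds a small table mapping file extensions to mime types.
std::map<std::string, std::string> defaultMimeTable() {
    std::map<std::string, std::string> table;
    table["html"] = "text/html";
    table["txt"]  = "text/plain";
    table["jpg"]  = "image/jpeg";
    table["jpeg"] = "image/jpeg";
    table["gif"]  = "image/gif";
    table["png"]  = "image/png";
    return table;
}

// Looks up the mime type for a file name.  Extensions that are not in
// the table are treated as plain text.
std::string mimeTypeFor(const std::map<std::string, std::string>& table,
                        const std::string& filename) {
    std::map<std::string, std::string>::const_iterator it =
        table.find(fileExtension(filename));
    if (it == table.end())
        return "text/plain";
    return it->second;
}
```

Note that this lookup is case-sensitive, so a file named photo.JPG would be sent as plain text unless the extension is converted to lower case first.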
The file /etc/mime.types contains a list of mime types. Most of these have one or more associated file extensions. As you can see, there are quite a few. Once you code a file extension and its mime type into your program, your server will be able to use files with that extension.
Configuration File
As your program becomes more and more complicated, it becomes more difficult to "configure" it by modifying the program. What if you want to change the location of the documents available to the server? What if you want to add a mime type? What if you want to be able to control access to the server based on IP number? A real server should read configuration data from a file. Modify your program so that it does this. It should be possible (but optional) to specify the name of a configuration file on the command line. At a minimum, your program should read the following items from the configuration file:
- the directory that contains the server's files,
- the port number on which the server should listen, and
- mime types.
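The format of the configuration file is up to you. As a rough sketch, assume a made-up format in which each line holds a keyword followed by its values: "root" and a directory, "port" and a number, or "mime" followed by an extension and a mime type, with lines starting with "#" treated as comments. The struct ServerConfig, the function readConfig, the keywords, and the default values are all assumptions made for this example.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical container for the server's settings.  The defaults are
// arbitrary choices for this sketch.
struct ServerConfig {
    std::string documentRoot;
    int port;
    std::map<std::string, std::string> mimeTypes;  // extension -> mime type
    ServerConfig() : documentRoot("."), port(8080) {}
};

// Reads settings in the assumed "keyword value..." format.  Blank
// lines, comments, unknown keywords, and malformed values are ignored,
// so the corresponding settings keep their previous values.
void readConfig(std::istream& in, ServerConfig& config) {
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string key;
        if (!(words >> key) || key[0] == '#')
            continue;
        if (key == "root") {
            std::string dir;
            if (words >> dir)
                config.documentRoot = dir;
        } else if (key == "port") {
            int number;
            if (words >> number)
                config.port = number;
        } else if (key == "mime") {
            std::string extension, type;
            if (words >> extension >> type)
                config.mimeTypes[extension] = type;
        }
    }
}
```

In main(), the program could check whether argc is greater than 1 and, if so, open argv[1] as an ifstream and pass it to readConfig. If no file name is given, the defaults simply remain in effect.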
Instead of reading the mime types from your own file, you might want to read the system file /etc/mime.types.
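In /etc/mime.types, each entry is a line that gives a mime type followed by zero or more file extensions, separated by spaces or tabs, and "#" starts a comment. Under that assumption, a loader could be sketched as follows. The function name loadMimeTypes is hypothetical, and it reads from a generic input stream so that it can be tested on a string as well as on the real file.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Reads lines of the form "type/subtype ext1 ext2 ..." and adds one
// table entry per extension.  Comments, blank lines, and types with no
// extensions are skipped.
void loadMimeTypes(std::istream& in,
                   std::map<std::string, std::string>& table) {
    std::string line;
    while (std::getline(in, line)) {
        std::string::size_type hash = line.find('#');
        if (hash != std::string::npos)
            line.erase(hash);            // Strip the comment, if any.
        std::istringstream words(line);
        std::string type, extension;
        if (!(words >> type))
            continue;                    // Nothing left on this line.
        while (words >> extension)
            table[extension] = type;
    }
}
```

If the same extension appears on more than one line, the later entry replaces the earlier one in this sketch.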
Other features
In the previous lab, you were asked to find two features that you would like to add to your server. You should add at least one of those features. We will discuss the possible features in class. If you decide that you don't want to add either of the features that you wrote about, you can use someone else's idea instead.
If your feature involves looking at headers in the browser's request, you will certainly want to look at (or use) my program, since it already parses the browser's request to extract the headers.
David Eck, October 2002