Lessons by Jon

Forms with MacHTTP

If you stuck it out through all of the CGI lessons, then here is your reward. The following pages will hopefully provide a foolproof guide to installing forms on your site for whatever reasons you might have. If you didn't stick it out, you will see many notes below directing you to reread portions of those lessons. I didn't slave over a hot computer all day just so you could ignore what I wrote.

What is a Form?

If you don't already know what forms are, I'll give you a quick overview.

When I say "forms", I'm referring to pages which use special markup tags to create what looks like an online form for users to fill out. The Form tags are part of HTML-2 (with additional tags in HTML-3 and probably any later proposed versions as well). Using these tags, you can provide a page with text fields, check buttons, radio buttons, popup menus, lists, and many other elements that can be edited by the user. The results of the user's input are sent back to you to be processed however you wish. Was that clear? If not, try this even simpler explanation with working examples.

Actually, the term "form", which I use very loosely, should properly refer to only a portion of a page. As you saw if you looked at the source text of the Simple Explanation page, there is a beginning and end tag that designates a portion of a page as containing a form. This way a single page could conceivably contain several forms, although that is usually a poor page design. Additionally, a form could be only a small portion of a large page.


Before you begin...

All of the scripts presented in the following pages assume that you have the software recommended in my lessons on CGI applications. If you don't, you might want to review those lessons first. Since AppleEvents let you tie form information into almost any application, I will focus more on data processing than on passing the information to specific applications.

Here's a short glossary of terms I will toss around without any real justification:

Form Area
A section of an HTML document delineated by a start and end tag for a form.
Form (or Online Form)
Any page that has one or more form areas in it.
Form Tag
Any tag that can be used to present a field, button, menu, or other part of a form.
CGI
Any application that is used to process the information from a form. Since you would be silly to use anything other than a CGI application, I assume that is what is used.
Form Element
A complete form tag, including start, end, and optional information.
You
The hypothetical person reading these lessons.
We
Me. I'm just trying to spread the blame.

How Forms Work - Really Basic Stuff

I assume everyone is familiar by this point with the pattern of communication between a WWW client and MacHTTP and whatever CGI application will be processing the client request. If not, you might want to review that part of the CGI lessons. There are a few differences, though, that are unique to forms. Here is a quick overview (we'll hit the details later):
  1. The user fills out the fields and buttons in the form section of a page.
  2. The user clicks on the "Submit" button to send in the information.
  3. The client assembles the information in the form into one large block of data and sends it, along with the URL for the CGI application, to MacHTTP.
  4. MacHTTP packages the data into an AppleEvent and passes it along to the CGI application.
  5. The CGI application extracts the data, converts it to a usable format, and does whatever the webmaster wants with it - add it to a database, mail it to someone, use it to search an archive, or whatever.
  6. The CGI application builds a page to return to the client and passes either that page or a URL to it back to MacHTTP, which has been patiently waiting.
  7. MacHTTP passes either a page or the URL to a page back to the client.
  8. The client displays the appropriate page.
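
The round trip above can be sketched in Python with three plain functions standing in for the client, the server, and the CGI application. Every function name here is hypothetical, and the AppleEvent hand-off is replaced by an ordinary function call, so this shows only the shape of the exchange, not how MacHTTP does it internally.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode

def client_submit(fields):
    """Step 3: the client assembles the form into one block of data."""
    return urlencode(fields)

def cgi_application(post_args):
    """Steps 5-6: extract the data and build a page to return."""
    data = dict(parse_qsl(post_args))
    rows = "".join(
        f"<li>{escape(name)}: {escape(value)}</li>"
        for name, value in sorted(data.items())
    )
    return f"<html><body><ul>{rows}</ul></body></html>"

def server_relay(post_args):
    """Steps 4 and 7: the server only passes things along in both directions."""
    return cgi_application(post_args)

page = server_relay(client_submit({"name": "Mr. Smith", "age": "28"}))
print(page)
```

Notice that server_relay does nothing but call the CGI function and hand back the result, which is the point of the list above.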

One thing you will want to be sure to notice here is how little MacHTTP has to do with all of this. MacHTTP is merely a mediator between the client software and the CGI application. It doesn't do any real work of its own. For many of us, that describes the perfect job. What a lucky application!


How to write a page with FORM elements

HTML is still a quickly evolving entity with no approved standard even for the basic tags (although there is a well-accepted definition in HTML-1). Because of this, there are new tags and attributes for forms data elements being added all the time. Check out the following pages for a good introduction to writing an HTML document which contains form elements:

Once you have a good idea of how these form elements work, move on to the next section.


Processing the Form Information

Okay, so you've got an HTML document that contains form elements now. You can see all kinds of possibilities, from a simple form to take the credit card number of anyone silly enough to give it to you to a very advanced questionnaire on shopping habits that tricks people into revealing information they wouldn't even tell their spouses. How do we get that information back, though, so we can start charging items to their MasterCard? First, let's dig a little deeper into the topic of how information is passed from the client to the CGI application.

Consider a form which has only three fields: name, age, and weight. These are probably small fields, so the total amount of information passed is not large. However, there are three pieces of information here to be passed. Since all three pieces of data are passed in the same argument (post_args), there must be some method of combining the three pieces so they can be separated properly later. All WWW clients use the same method of doing this. The client puts a special character, in this case an ampersand (&), between each piece of data. The result looks like this: name_data&age_data&weight_data.

Of course, there is the possibility that one of these fields might contain an ampersand. Because of this, the client first converts all ampersands in each field to their hexadecimal equivalent (%26). Thus "Smith & Wesson" would become "Smith %26 Wesson".
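
Here is a minimal Python sketch of why the escaping has to happen before the fields are joined. The ampersand is character 26 (hexadecimal) in ASCII, hence %26. The helper name escape_field is made up for this example, and spaces are left alone for now so the output matches the "Smith & Wesson" case; real clients encode more characters than this.

```python
def escape_field(text):
    # Encode "%" first so a literal percent sign is never mistaken for an escape.
    return text.replace("%", "%25").replace("&", "%26")

fields = ["Smith & Wesson", "28", "157"]

# Join the escaped fields with the ampersand separator.
joined = "&".join(escape_field(field) for field in fields)
print(joined)  # Smith %26 Wesson&28&157

# Splitting on "&" is now safe, because no field contains a bare ampersand.
restored = [
    part.replace("%26", "&").replace("%25", "%")
    for part in joined.split("&")
]
print(restored)  # ['Smith & Wesson', '28', '157']
```

Without the escaping step, splitting on the ampersand would yield four pieces instead of three.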

There is also another problem. The HTTP specifications don't require the clients to pass the form data in any specific order. Data can arrive in order of arrangement on the page, in reverse order, or in some random mixture if the client wants to be a real pain-in-the-neck. Because of this, a method is needed to identify which data goes with which form field or button. The method used by all WWW clients is to concatenate the form element name and the form element data using an equals (=) sign. Thus, in the above example, if a Mr. Smith, age 28, who weighs 157 pounds were to fill out the form (honestly), the resulting information passed to the CGI would look something like the following: weight=157&age=28&name=Mr. Smith.
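
Because each piece of data carries its own name, the receiving side can ignore the order entirely. The following Python sketch (parse_pairs is a hypothetical helper, and no decoding of special characters is attempted yet) splits the data into name/value pairs and shows that two different orderings give the same result.

```python
def parse_pairs(post_args):
    """Split 'name=value&name=value' data into a dictionary keyed by field name."""
    pairs = {}
    for pair in post_args.split("&"):
        # partition splits on the first "=" only; a missing "=" gives an empty value.
        name, _, value = pair.partition("=")
        pairs[name] = value
    return pairs

first = parse_pairs("weight=157&age=28&name=Mr. Smith")
second = parse_pairs("name=Mr. Smith&age=28&weight=157")

print(first == second)  # True
print(first["name"])    # Mr. Smith
```

One simplification to be aware of: a dictionary keeps only the last value for a repeated name, so a form that sends the same name more than once would need a list of pairs instead.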

As you might have guessed, using the equals sign means that that character must also be encoded within the data before the post_args is made. In fact, there are many characters that are encoded in this way, including spaces (Unix systems abhor spaces) and most 8-bit ASCII characters. This brings up the final problem. Up until this point all of the WWW clients have agreed on how things should be done. All special characters are encoded using their hexadecimal equivalent. However, two WWW clients - specifically NCSA Mosaic and Netscape - break from this pattern when encoding spaces. These two clients use a plus (+) sign to encode all spaces in text instead of using the hexadecimal code (%20). This is a major headache for everyone trying to do forms and a waste of time. I don't see any sign that they're likely to join the flock of the enlightened any time soon, though.
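
A CGI application that wants to cope with both styles can turn pluses into spaces first and then decode the hexadecimal codes. The order matters: a real plus sign in the data arrives as %2B, so it must still be encoded when the plus-to-space step runs. This is a small Python sketch using the standard urllib.parse module; decode_value is a name invented for the example.

```python
from urllib.parse import unquote, unquote_plus

def decode_value(raw):
    # Pluses become spaces BEFORE the %XX codes are decoded,
    # so an encoded literal plus (%2B) survives as "+".
    return unquote(raw.replace("+", " "))

print(decode_value("Mr.+Smith"))    # Mr. Smith
print(decode_value("Mr.%20Smith"))  # Mr. Smith
print(decode_value("1%2B1+%3D+2"))  # 1+1 = 2

# The standard library's unquote_plus applies the same two steps.
print(unquote_plus("1%2B1+%3D+2"))  # 1+1 = 2
```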

So, to put the above in order, we have the following procedure being followed by WWW clients before passing the information in form fields or buttons to MacHTTP:

  1. Change all special characters in the form fields and buttons to their hexadecimal equivalents.
  2. For Mosaic and Netscape only: Convert all spaces to pluses (+).
  3. Concatenate each piece of data with its field or button name using an equals (=) character (i.e., name=data).
  4. Concatenate all of these pairs of data into one long post_args argument using the ampersand (&) character (i.e., name1=data1&name2=data2&name3=data3).
  5. Pass this information, along with the URL of the CGI application to process it, back to MacHTTP.
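
Steps 1 through 4 of the procedure above can be sketched from the client's side in Python. The function build_post_args and its plus_for_spaces switch are assumptions made for this illustration; quote and quote_plus come from the standard urllib.parse module and stand in for whatever a given client does internally.

```python
from urllib.parse import quote, quote_plus

def build_post_args(fields, plus_for_spaces=False):
    """Encode (name, value) pairs into one post_args string."""
    # Steps 1-2: hex-encode special characters; optionally use "+" for spaces.
    encode = quote_plus if plus_for_spaces else quote
    # Step 3: join each name to its data with "=".
    pairs = [
        f"{encode(name, safe='')}={encode(value, safe='')}"
        for name, value in fields
    ]
    # Step 4: join all of the pairs with "&".
    return "&".join(pairs)

fields = [("name", "Mr. Smith"), ("age", "28"), ("weight", "157")]

print(build_post_args(fields))
# name=Mr.%20Smith&age=28&weight=157
print(build_post_args(fields, plus_for_spaces=True))
# name=Mr.+Smith&age=28&weight=157
```

Step 5, actually sending the string to the server, is left out, since that part belongs to the client's networking code rather than to the encoding.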

Actual Tutorial Files

  1. Basic Form Processing - This lesson shows you how to extract and identify the data from each item in your form to produce readable output that can be returned, passed to a database, sent in e-mail, or whatever you want.
  2. Email Example - This example shows you how to allow people to send you e-mail by filling out a form.
More examples will be made available at my WWW pages as I get time to write them.

Jon Wiederspan
Last Edited: December 11, 1994