Lessons by Jon
Forms with MacHTTP
If you stuck it out through all of the CGI lessons, then here is your reward.
The following pages will hopefully provide a foolproof guide to installing forms
on your site for whatever reasons you might have. If you didn't stick it out,
you will see many notes below directing you to reread portions of those lessons.
I didn't slave over a hot computer all day just so you could ignore what I wrote.
What is a Form?
If you don't already know what forms are, I'll give you a quick overview.
When I say "forms", I'm referring to pages which use special markup tags
to create what looks like an online form for users to fill out.
The Form tags are part of HTML-2 (with additional tags in HTML-3 and probably
any later proposed versions as well). Using these tags, you can provide a page
with text fields, check buttons, radio buttons, popup menus, lists, and many
other elements that can be edited by the user. The results of the user's
input are sent back to you to be processed however you wish. Was that clear?
If not, try this even simpler explanation with working examples.
Actually, the term "form", which I use very loosely, should properly refer
to only a portion of a page. As you saw if you looked at the source text of
the Simple Explanation page, there is a beginning and end tag that designates a
portion of a page as containing a form. This way a single page could conceivably
contain several forms, although that is usually a poor page design. Additionally,
a form could be only a small portion of a large page.
Before you begin...
All of the scripts presented in the following pages assume that you have
the software recommended in my lessons on CGI applications. If you don't,
you might want to review those lessons first. Since AppleEvents let you tie
form information into almost any application, I will focus more on data processing
than on passing the information to specific applications.
Here's a short glossary of terms I will toss around without any real justification:
- Form Area: A section of an HTML document delineated by a start and end tag for a form.
- Form (or Online Form): Any page that has one or more form areas in it.
- Form Tag: Any tag that can be used to present a field, button, menu, or other
part of a form.
- CGI: Any application that is used to process the information from a form. Since
you would be silly to use anything other than a CGI application, I assume
that is what is used.
- Form Element: A complete form tag, including start, end, and optional information.
- You: The hypothetical person reading these lessons.
- We: Me. I'm just trying to spread the blame.
How Forms Work - Really Basic Stuff
I assume everyone is familiar by this point with the pattern of communication
between a WWW client and MacHTTP and whatever CGI application will be processing
the client request. If not, you might want to review that part of the CGI
lessons. There are a few differences, though, that are unique to forms. Here
is a quick overview (we'll hit the details later):
- The user fills out the fields and buttons in the form section of a page.
- The user clicks on the "Submit" button to send in the information.
- The client assembles the information in the form into one large block
of data and sends it, along with the URL for the CGI application, to
MacHTTP.
- MacHTTP packages the data into an AppleEvent and passes it along to
the CGI application.
- The CGI application extracts the data, converts it to a usable format,
and does whatever the webmaster wants with it - add it to a database,
mail it to someone, use it to search an archive, or whatever.
- The CGI application builds a page to return to the client and passes
either that page or a URL to it back to MacHTTP who has been patiently
waiting.
- MacHTTP passes either a page or the URL to a page back to the client.
- The client displays the appropriate page.
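The steps above can be sketched as a toy simulation. This is only a rough model in Python, not how MacHTTP is actually wired up: the real hand-off happens through an AppleEvent, and the names client_submit, server, cgi_application, and the URL /form.cgi are all made up for illustration.

```python
from urllib.parse import parse_qsl

def cgi_application(post_args):
    """Stand-in for the CGI: extract the data and build a page to return."""
    fields = dict(parse_qsl(post_args))
    rows = "".join(
        f"<li>{name}: {value}</li>" for name, value in sorted(fields.items())
    )
    # A real CGI would also escape HTML in the values before echoing them.
    return f"<html><body><ul>{rows}</ul></body></html>"

def server(url, post_args, cgi_table):
    """Stand-in for MacHTTP: find the CGI for the URL, forward the data,
    wait for the reply, and relay it unchanged."""
    handler = cgi_table[url]
    return handler(post_args)

def client_submit(url, post_args, cgi_table):
    """Stand-in for the WWW client: send the data block plus the CGI's URL."""
    return server(url, post_args, cgi_table)

# Hypothetical table mapping URLs to CGI applications.
cgi_table = {"/form.cgi": cgi_application}
page = client_submit("/form.cgi", "name=Jon&age=28", cgi_table)
print(page)
```

Notice that server does nothing but look up the handler and pass things along, which is the point of the paragraph that follows.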
One thing you will want to be sure to notice here is how little MacHTTP has
to do with all of this. MacHTTP is merely a mediator between the client software
and the CGI application. It doesn't do any real work of its own. For many of
us, that describes the perfect job. What a lucky application!
How to write a page with FORM elements
HTML is still a quickly evolving entity with no approved standard even
for the basic tags (although there is a well-accepted definition in
HTML-1). Because of this, there are new tags and attributes for
forms data elements being added all the time. Check out the
following pages for a good introduction to writing an HTML document
which contains form elements:
Once you have a good idea of how these form elements work, move on to
the next section.
Processing the Form Information
Okay, so you've got an HTML document that contains form elements
now. You can see all kinds of possibilities, from a simple form to
take the credit card number of anyone silly enough to give it to you
to a very advanced questionnaire on shopping habits that tricks
people into revealing information they wouldn't even tell their
spouses. How do we get that information back, though, so we can
start charging items to their MasterCard? First, let's dig a little deeper
into the topic of how information is passed from the client to the CGI application.
Consider a form which has only three fields: name, age, and weight. These
are probably small fields and so the total amount of information passed is not
large. However, there are three pieces of information here to be passed. Since all
three pieces of data are passed in the same argument (post_args), there
must be some method of combining the three pieces so they can be separated properly
later. All WWW clients use the same method of doing this. The client puts a special
character, in this case an ampersand (&), between each piece of data. The
result looks like this: name_data&age_data&weight_data.
Of course, there is the possibility that one of these fields might contain an
ampersand. Because of this, the client first converts all ampersands in each field
to their hexadecimal equivalent (%26). Thus "Smith & Wesson" would become "Smith %26 Wesson".
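Here is a small sketch of where that code comes from. The ampersand is character 26 (hexadecimal) in ASCII, so its escape is a percent sign followed by 26. The helper names hex_escape and escape_ampersands are hypothetical, and only the ampersand is handled; a real client escapes many more characters.

```python
def hex_escape(ch):
    """Return the %XX escape for a single ASCII character."""
    return "%{:02X}".format(ord(ch))

def escape_ampersands(text):
    """Replace every ampersand in a field with its hexadecimal escape."""
    return text.replace("&", hex_escape("&"))

print(hex_escape("&"))                      # %26
print(escape_ampersands("Smith & Wesson"))  # Smith %26 Wesson
```

For the escape to be reversible, a literal percent sign in the data has to be escaped too (as %25), which this sketch leaves out.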
There is also another problem. The HTTP specifications don't require the clients to
pass the form data in any specific order. Data can arrive in order of arrangement on the
page, in reverse order, or in some random mixture if the client wants to be a real
pain-in-the-neck. Because of this, a method is needed to identify which data goes with
which form field or button. The method used by all WWW clients is to concatenate the
form element name and the form element data using an equals (=) sign. Thus, in the
above example, if a Mr. Smith, age 28, who weighs 157 pounds were to fill out the form
(honestly), the resulting information passed to the CGI would look something like the
following: weight=157&age=28&name=Mr. Smith.
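Because every value travels with its name, the CGI can sort things out no matter what order the client picked. Here is a rough sketch in Python; parse_pairs is a hypothetical helper, and the space in the name is shown already encoded as %20, which the next paragraphs explain.

```python
def parse_pairs(post_args):
    """Split a post_args string into a name -> value dictionary.

    The values are left encoded; decoding is a separate step.
    """
    fields = {}
    for pair in post_args.split("&"):
        # partition splits on the first "=" only.
        name, _, value = pair.partition("=")
        fields[name] = value
    return fields

in_one_order = parse_pairs("weight=157&age=28&name=Mr.%20Smith")
in_another = parse_pairs("name=Mr.%20Smith&age=28&weight=157")

print(in_one_order == in_another)  # True
print(in_one_order["age"])         # 28
```

Looking fields up by name, never by position, is what keeps a CGI from breaking when a different client sends the same form.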
As you might have guessed, using the equals sign as a separator means that it must
also be encoded within the data before post_args is assembled. In fact, there
are many characters that are encoded in this way, including spaces (Unix systems abhor
spaces) and most 8-bit (non-ASCII) characters. This brings up the final problem. Up until
this point all of the WWW clients have agreed on how things should be done. All special
characters are encoded using their hexadecimal equivalent. However, two WWW clients -
specifically NCSA Mosaic and Netscape - break from this pattern when encoding spaces.
These two clients use a plus (+) sign to encode all spaces in text instead of using
the hexadecimal code (%20). This is a major headache for everyone trying to do forms
and a waste of time. I don't see any sign that they're likely to join the flock of the
enlightened any time soon, though.
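In practice this means a CGI has to accept both spellings of a space. A minimal sketch in Python, using the standard library's urllib.parse (decode_field is a hypothetical wrapper): turn pluses into spaces first, then undo the hexadecimal escapes. A plus sign that the user actually typed arrives as %2B, so it survives.

```python
from urllib.parse import unquote, unquote_plus

def decode_field(raw):
    """Decode one form value, treating "+" as an encoded space."""
    return unquote_plus(raw)

print(decode_field("Mr.%20Smith"))  # Mr. Smith
print(decode_field("Mr.+Smith"))    # Mr. Smith
print(decode_field("1%2B1%3D2"))    # 1+1=2
print(unquote("Mr.+Smith"))         # Mr.+Smith (plain unquote leaves "+" alone)
```

The order matters: if the hexadecimal escapes were decoded first, a genuine plus sign (%2B) would be turned into a space by the second step.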
So, to put the above in order, we have the following procedure being followed by
WWW clients before passing the information in form fields or buttons to MacHTTP:
- Change all special characters in the form fields and buttons to their hexadecimal
equivalents.
- For Mosaic and Netscape only: Encode each space as a plus (+) instead of %20.
- Concatenate each piece of data with its field or button name using an
equals (=) character (i.e., name=data).
- Concatenate all of these pairs of data into one long post_args argument
using the ampersand (&) character (i.e., name1=data1&name2=data2&name3=data3).
- Pass this information, along with the URL of the CGI application to process it,
back to MacHTTP.
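The encoding steps in this procedure can be sketched in a few lines of Python. This is an illustration of the client's side under my own assumptions, not code from any actual browser: build_post_args and its plus_for_spaces switch are hypothetical names, and the escaping is done by the standard library's quote and quote_plus.

```python
from urllib.parse import quote, quote_plus

def build_post_args(fields, plus_for_spaces=False):
    """Assemble a list of (name, value) pairs into one post_args string.

    plus_for_spaces=True imitates clients that send spaces as "+";
    otherwise spaces are sent as %20.
    """
    if plus_for_spaces:
        encode = quote_plus
    else:
        encode = lambda text: quote(text, safe="")
    # Encode each name and value, join each pair with "=", then join pairs with "&".
    pairs = [f"{encode(name)}={encode(value)}" for name, value in fields]
    return "&".join(pairs)

fields = [("name", "Smith & Wesson"), ("age", "28"), ("weight", "157")]
print(build_post_args(fields))
# name=Smith%20%26%20Wesson&age=28&weight=157
print(build_post_args(fields, plus_for_spaces=True))
# name=Smith+%26+Wesson&age=28&weight=157
```

Either way, the only unescaped ampersands and equals signs left in the result are the separators, which is what lets the CGI split it apart again.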
Actual Tutorial Files
- Basic Form Processing - This lesson
shows you how to extract and identify the data from each item in
your form to produce readable output that can be returned, passed
to a database, sent in e-mail, or whatever you want.
- Email Example - This example shows
you how to allow people to send you e-mail by filling out a form.
More examples will be made available at my WWW pages as I get time to write them.
Jon Wiederspan
Last Edited: December 11, 1994