Section 6.2
HTML Basics


APPLETS GENERALLY APPEAR ON PAGES in a Web browser program. Such pages are themselves written in a language called HTML (HyperText Markup Language). An HTML document describes the contents of a page. A Web browser interprets the HTML code to determine what to display on the page. The HTML code doesn't look much like the resulting page that appears in the browser. The HTML document does contain all the text that appears on the page, but that text is "marked up" with commands that determine the structure and appearance of the text and determine what will appear on the page in addition to the text.

HTML has developed rapidly in the last few years, and it has become a rather complicated language. In this section, I will cover just the basics of the language. While that leaves out all the fancy stuff, it does include just about everything I've used to make the Web pages in this on-line text.

It is possible to write an HTML page using an ordinary text editor, typing in all the mark-up commands by hand. However, there are many Web-authoring programs that make it possible to create Web pages without ever looking at the underlying code. Using these tools, you can compose a Web page in much the same way that you would write a paper with a word processor. For example, Netscape Composer, which is part of Netscape Communicator, works in this way. However, my opinion is that making high-quality Web pages still requires some work with raw HTML, and serious Web authors still need to learn the HTML language.

The mark-up commands used by HTML are called tags. An HTML tag takes the form

<tag-name  optional-modifiers>

where the tag-name is a word that specifies the command, and the optional-modifiers, if present, are used to provide additional information for the command (much like parameters in subroutines). A modifier takes the form

modifier-name = value

Usually, the value is enclosed in quotes, and it must be if it is more than one word long or if it contains certain special characters. There are a few modifiers which have no value, in which case only the name of the modifier is present. HTML is case-insensitive, which means that you can use uppercase and lowercase letters interchangeably in tag names and modifier names. (Modifier values such as file names and URLs are another matter; they can be case-sensitive.)

A simple example of a tag is <HR>, which draws a line -- also called a "horizontal rule" -- across the page. The HR tag can take several possible modifiers such as WIDTH and ALIGN. For example, the short line just after the heading of this page was produced by the HTML command:

<HR  align=center  width="33%">

The WIDTH here is specified as 33% of the available space. It could also be given as a fixed number of pixels. The value for ALIGN could be CENTER, LEFT, or RIGHT. A LEFT alignment would shove the line to the left side of the page, and a RIGHT alignment, to the right side. WIDTH and ALIGN are optional modifiers. If you leave them out, then their default values will be used. The default for WIDTH is 100%, and the default for ALIGN is CENTER.

Many tags require matching closing tags, which take the form

</tag-name>

For example, the tag <PRE> must always have a matching closing tag </PRE> later in the document. The tag applies to everything that comes between the opening tag and the closing tag. The <PRE> tag tells a Web browser to display everything between the <PRE> and the </PRE> just as it is formatted in the original HTML source code, including all the spaces and carriage returns. (But tags between <PRE> and </PRE> are still interpreted by the browser.) "PRE" stands for preformatted text. All of the sample programs in these notes are formatted using the <PRE> command.

It is important for you to understand that when you don't use PRE, the browser will completely ignore the formatting of the text in the HTML source code. The only thing it pays attention to is the tags. Five blank lines in the source code have no more effect than one blank line or even a single blank space. Outside of <PRE>, if you want to force a new line on the Web page, you can use the tag <BR>, which stands for "break". For example, I might give my address as:

         David Eck<BR>
         Department of Mathematics and Computer Science<BR>
         Hobart and William Smith Colleges<BR>
         Geneva, NY 14456<BR>

If you want extra vertical space in your web page, you can use several <BR>'s in a row.

Similarly, you need a tag to indicate how the text should be broken up into paragraphs. This is done with the <P> tag, which should be placed at the beginning of every paragraph. The <P> tag has a matching </P>, which should be placed at the end of each paragraph. The closing </P> is technically optional, but it is considered good form to use it. If you want all the lines of the paragraph to be shoved over to the right, you can use <P ALIGN=RIGHT> instead of <P>. (This is mostly useful when used with one short line, or when used with <BR> to make several short lines.) You can also use <P ALIGN=CENTER> for centered lines.

By the way, if tags like <P> and <HR> have special meanings in HTML, you might wonder how I can get them to appear here on this page. To get certain special characters to appear on the page, you have to use an entity name in the HTML source code. The entity name for < is &lt;, and the entity name for > is &gt;. Entity names begin with & and end with a semicolon. The character & is itself a special character whose entity name is &amp;. There are also entity names for nonstandard characters such as the accented e, é, which has the entity name &eacute;.
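
The substitution described above is mechanical enough to be done by a program. As a minimal sketch, here is a Java method that prepares a piece of plain text for inclusion in an HTML page by replacing the three special characters with their entity names. The class name EntityEscaper and the method name escape are made up for this illustration; they are not part of any standard library.

```java
// Hypothetical helper, for illustration only: replaces the characters
// that are special in HTML with the corresponding entity names.
final class EntityEscaper {

    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            switch (ch) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;");  break;
                case '>': out.append("&gt;");  break;
                default:  out.append(ch);      // ordinary character
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: Use &lt;P&gt; &amp; &lt;HR&gt; here
        System.out.println(escape("Use <P> & <HR> here"));
    }
}
```

Because the text is processed one character at a time, an & that is produced as part of an entity name is never escaped a second time.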

The rest of this page discusses several other basic HTML tags. This is not meant to be a complete discussion. But it is enough to produce interesting pages.


Overall Document Structure

HTML documents have a standard structure. They begin with <HTML> and end with </HTML>. Between these tags, there are two sections, the head, which is marked off by <HEAD> and </HEAD>, and the body, which -- as I'm sure you have guessed -- is surrounded by <BODY> and </BODY>. Often, the head contains only one item: a title for the document. This title might be shown, for example, in the title bar of a Web browser window. The title should not contain any HTML tags. The body contains the actual page contents that are displayed by the browser. So, an HTML document takes this form:

       <HTML>
       
       <HEAD>
       <TITLE>page-title</TITLE>
       </HEAD>
       
       <BODY>
       
       page-contents
       
       </BODY>
       
       </HTML>

Web browsers are not very picky about enforcing this structure; you can probably get away with leaving out everything but the actual page contents. But it is good form to follow this structure for your pages.

The <BODY> tag can take a number of modifiers that affect the appearance of the page when it is displayed. The modifier named BGCOLOR can be used to set the background color of the page. For example,

<BODY bgcolor=white>

will ensure that the background color for the page is white. You can add modifiers to control the color of regular text (TEXT), hypertext links (LINK), and links to pages that have already been visited (VLINK). When the user clicks and holds the mouse button on a link, the link is said to be active; you can control the color of active links with the ALINK modifier. For example, how about a page with a black background, white text, blue links, red active links, and gray visited links:

<BODY  bgcolor=black  text=white  link=blue  alink=red  vlink=gray>

There are several standard color names that you can use in this context, but if you want complete control, you'll have to learn how to specify colors using hexadecimal numbers. It is also possible to use an image for the background of the page, instead of a solid color. Look up the details if you are interested.
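
For the curious: a hexadecimal color is written as a # sign followed by six hexadecimal digits, two each for the red, green, and blue components of the color, so that each component ranges from 00 to FF (0 to 255 in decimal). As a rough sketch, the following Java method builds such a string from three integers. The class name HexColor and the method name toHex are invented for this example.

```java
// Hypothetical helper, for illustration only.
final class HexColor {

    // Builds a color string such as "#FF8000" from red, green, and blue
    // components, each of which must be in the range 0 to 255.
    static String toHex(int red, int green, int blue) {
        if (red < 0 || red > 255 || green < 0 || green > 255
                || blue < 0 || blue > 255) {
            throw new IllegalArgumentException("components must be 0 to 255");
        }
        // %02X means: two hexadecimal digits, uppercase, padded with a zero.
        return String.format("#%02X%02X%02X", red, green, blue);
    }

    public static void main(String[] args) {
        System.out.println(toHex(255, 255, 255)); // prints #FFFFFF (white)
        System.out.println(toHex(0, 0, 0));       // prints #000000 (black)
    }
}
```

Since a value like #FFFFFF contains a special character, it should be enclosed in quotes when it is used as a modifier value, as in <BODY bgcolor="#FFFFFF">.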


Headings and Font Styles

HTML has a number of tags that affect the size and style of displayed text. For a heading, which is meant to stand out on a line by itself, HTML offers the tags <H1>, <H2>, ..., <H6>. These tags are always used with matching closing tags such as </H1>. The <H1> tag is meant for the most important headings and produces the largest size text. I've found <H4> through <H6> to be too small to be useful. You can use <BR> tags in headings, if you want multi-line headings. You can also use links and images, which are described below. The heading tags can take ALIGN as a modifier, with the value LEFT, RIGHT, or CENTER. For example, the heading

A Sample Heading

was written as "<H1 align=center>A Sample Heading</H1>" in the HTML source code.

There are a number of different style tags that you can apply to text. For example, bold text can be obtained by surrounding the text with <B> and </B>. You can use <I> for italic, <U> for underlined, and <TT> for typewriter style text. Most browsers support <SUB> for subscripted text and <SUP> for superscripted text. For example, "x<SUP>2</SUP>" will give: x².

Because HTML is meant to describe the logical structure of a document, rather than its exact appearance, it has a number of tags for displaying the logical style of the text. For example, the <EM> tag is meant to emphasize the text surrounded by <EM> and </EM>, while <STRONG> is for strong emphasis. And the <CITE> style tag is meant for titles of books.

You can get even more control over the style of the text by using the <FONT>...</FONT> tag. The <FONT> tag uses modifiers such as COLOR and SIZE to control the appearance of the font. For big blue text, you would say:

<FONT color=blue size="+1">big blue text</FONT>

The value "+1" for the SIZE modifier means "a little bigger than usual." You could use "+2" for an even bigger font, "-1" for a smaller font, and so on. However, only a limited number of different sizes are available.


Lists

There are several tags for producing lists of items. The most widely used of these are <UL> and <OL>. The <OL> tag gives an "ordered list", in which the items are numbered consecutively. The item numbers are provided by the browser. The <UL> tag gives an "unordered list", in which the items are all marked with the same special symbol. In the HTML source code, each list item is indicated by placing a <LI> tag at the beginning of the item. The end of the list is marked by the appropriate closing tag, </OL> or </UL>. For example, the following source code:

       <UL>
       <LI>Isaac Asimov
       <LI>Ursula Le Guin
       <LI>Greg Bear
       <LI>C. J. Cherryh
       </UL>

produces this list:

       • Isaac Asimov
       • Ursula Le Guin
       • Greg Bear
       • C. J. Cherryh


Links

The most distinctive feature of HTML is that documents can contain links to other documents. The user can follow links from page to page and in the process visit pages from all over the Internet.

The <A> tag is used to create a link. The text between the <A> and its matching </A> appears on the page. Usually, it is underlined and in a special color. The user can follow the link by clicking on this text. The <A> tag uses the modifier HREF to say which document the link should connect to. The value for HREF must be a URL (Uniform Resource Locator). A URL is a coded set of instructions for finding a document on the Internet. For example, the URL for my own "home page" is

http://math.hws.edu/eck/

To make a link to this page, such as David's Home Page, I would use the HTML source code

<A HREF="http://math.hws.edu/eck/">David's Home Page</A>

The best place to find URLs is on existing Web pages. Most browsers display the URL for the page you are currently viewing, and they can display the URL of a link if you point to the link with the mouse.

If you are writing an HTML document and you want to make a link to another document that is in the same directory, you can use a relative URL. A relative URL consists of just the name of the file. For example, the page you are now viewing comes from a directory that also contains the other sections in this chapter. For a link to Section 1, which is in a file named s1.html, the relative URL would be just "s1.html", and the complete link would look like

<A HREF="s1.html">Section 1</A>

There are also relative URLs for linking to files that are in other directories. Using relative URLs is a good idea, since if you use them, you can move a whole collection of files without changing any of the links between them (as long as you don't change the relative locations of the files).
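
A browser turns a relative URL into a complete one by combining it with the URL of the page that contains the link. The standard Java class java.net.URI can perform the same computation, which makes it easy to experiment. The following is only a sketch: the class name RelativeUrlDemo is made up, and the page address used in it is a hypothetical one chosen for illustration.

```java
import java.net.URI;

// Illustration only: shows how relative URLs are resolved against the
// URL of the page that contains them.
final class RelativeUrlDemo {

    static String resolve(String pageUrl, String relativeUrl) {
        return URI.create(pageUrl).resolve(relativeUrl).toString();
    }

    public static void main(String[] args) {
        // Hypothetical location of the page that contains the links.
        String page = "http://www.example.com/notes/c6/s2.html";

        // A file in the same directory as the page.
        System.out.println(resolve(page, "s1.html"));

        // A file in the directory one level up; ".." means "parent directory".
        System.out.println(resolve(page, "../index.html"));
    }
}
```

Note that the results depend only on the location of the containing page. If the whole collection of files were moved to a different server or directory, the same relative URLs would resolve to the corresponding files in the new location.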

When you type a URL into a Web browser, you can omit the "http://" at the beginning of the URL. However, in an <A> tag in an HTML document, the "http://" can only be omitted if the URL is a relative URL. For a normal URL, it is required.


Images

You can add images to a Web page with the <IMG> tag. (This is a tag that has no matching closing tag.) The actual image must be stored in a separate file from the HTML document. The <IMG> tag has a required modifier, named SRC, to specify the URL of the image file. For most browsers, the image should be in one of the formats GIF (with a file name ending in ".gif") or JPEG (with a file name ending in ".jpeg" or ".jpg"). A so-called animated gif file actually contains a series of images that the browser will display as an animation. Usually, the image is stored in the same place as the HTML document, and a relative URL is used to specify the image file.

The <IMG> tag also has several optional modifiers. It's a good idea to always include the HEIGHT and WIDTH modifiers, which specify the size of the image in pixels. Some browsers, including Netscape, handle images better if they know in advance how big they are. For browsers that can't display images, you can use the ALT modifier to specify a string that will be displayed by the browser in place of the image.

The ALIGN modifier can be used to affect the placement of the image. "ALIGN=RIGHT" will shove the image to the right edge of the page, and the text on the page will flow around the image. "ALIGN=LEFT" works similarly. (Unfortunately, "ALIGN=CENTER" doesn't have the meaning you would expect. Browsers treat images as if they are just big characters. Images can occur inside paragraphs, links, and headings, for example. Alignment values of CENTER, TOP, and BOTTOM are used to specify how the image should line up with other characters in a line of text: Should the baseline of the text be at the center, the top, or the bottom of the image? Alignment values of RIGHT and LEFT were added to HTML later, but they are the most useful values.)

For example, here is HTML code that will place an image from a file named figure1.gif on the page.

      <IMG SRC="figure1.gif" ALIGN=RIGHT HEIGHT=150
                                  WIDTH=100 ALT="Figure 1">

The image is 100 pixels wide and 150 pixels high. It will appear on the right edge of the page. If a browser can't display images, it will display the string "Figure 1" instead.

There are many places on the Web where you can get graphics for use on your Web pages. For example, http://www.iconbazaar.com makes a large number of images available. You should, of course, check on the owner's copyright policy before using someone else's images on your pages.


The Applet tag and Applet Parameters

The <APPLET> tag is used to add a Java applet to a Web page. This tag must have a matching </APPLET>. A required modifier named CODE gives the name of the compiled class file that contains the applet. HEIGHT and WIDTH modifiers are required to specify the size of the applet. If you want the applet to be centered on the page, you can put the applet in a paragraph with CENTER alignment. So, an applet tag to display an applet named HelloWorldApplet centered on a Web page would look like this:

       <P ALIGN=CENTER> 
       <APPLET CODE="HelloWorldApplet.class" HEIGHT=50 WIDTH=150>
       </APPLET>
       </P> 

This assumes that the file HelloWorldApplet.class is located in the same directory as the HTML document. If this is not the case, you can use another modifier, CODEBASE, to give the URL of the directory that contains the class file. The value of CODE itself is always just a file name, not a URL.

If an applet uses a lot of .class files, it's a good idea to collect all the .class files into a single .zip or .jar file. Zip and jar files are archive files which hold a number of smaller files. Your Java development system is probably capable of creating them in some way. If your class files are in an archive, then you have to specify the name of the archive file in an ARCHIVE modifier in the <APPLET> tag. Archive files won't work on older browsers, but they should work for any browser that understands Java version 1.1 or later.

Applets can use applet parameters to customize their behavior. Applet parameters are specified by using <PARAM> tags, which can only occur between an <APPLET> tag and the closing </APPLET>. The PARAM tag has required modifiers named NAME and VALUE, and it takes the form

<PARAM  NAME="param-name"  VALUE="param-value">

The parameters are available to the applet when it runs. An applet can use the predefined method getParameter() to check for parameters specified in PARAM tags. The getParameter() method has the following interface:

String getParameter(String paramName)

The parameter paramName corresponds to the param-name in a PARAM tag. If the specified paramName actually occurs in one of the PARAM tags, then getParameter returns the associated param-value. If the specified paramName does not occur in any PARAM tag, then getParameter returns the value null. The Java documentation describes the name argument of getParameter as case-insensitive, but it is still safest to spell a parameter name the same way in the PARAM tag and in the call to getParameter, rather than using "size" in one place and "Size" in the other.

By the way, if you put anything besides PARAM tags between <APPLET> and </APPLET>, it will be ignored by any browser that supports Java. On the other hand, a browser that does not support Java will ignore the APPLET and PARAM tags. This means that if you put a message such as "Your browser doesn't support Java" between <APPLET> and </APPLET>, then that message will only appear in browsers that don't support Java.

Here is an example of an APPLET tag with PARAMs and some extra text for display in browsers that don't support Java:

      <APPLET code="ShowMessage.class" WIDTH=200 HEIGHT=50>
         <PARAM NAME="message" VALUE="Goodbye World!">
         <PARAM NAME="font" VALUE="Serif">
         <PARAM NAME="size" VALUE="36">
         <p align=center>Sorry, but your browser doesn't support Java!</p>
      </APPLET>

The applet ShowMessage would presumably read these parameters in its init() method, which might go something like this:

        String display;  // Instance variable: message to be displayed.
        String fontName; // Instance variable: font to use for display.

        public void init() {
            String value;
            value = getParameter("message"); // Get message PARAM, if any.
            if (value == null)
               display = "Hello World!";  // default value
            else
               display = value;  // Value from PARAM tag.
            value = getParameter("font");
            if (value == null)
               fontName = "SansSerif";  // default value
            else
               fontName = value;
             .
             .
             .

Dealing with the size parameter would be just a little harder, since a parameter value is always a String, and the size is supposed to be an int. This means that the String value must somehow be converted to an int. We'll worry about how to do that later.
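
As a preview of one way the conversion might be done: the standard method Integer.parseInt converts a String to an int, and it throws a NumberFormatException if the string is not a legal integer. The helper below is only a sketch. Its name, ParamUtil.toInt, and the decision to fall back on a default value are assumptions made for this example, not something the ShowMessage applet is known to do.

```java
// Hypothetical helper, for illustration only.
final class ParamUtil {

    // Converts a PARAM value to an int.  Returns defaultValue if the
    // value is missing (null) or is not a legal integer.
    static int toInt(String value, int defaultValue) {
        if (value == null)
            return defaultValue;   // No such PARAM tag.
        try {
            return Integer.parseInt(value.trim());
        }
        catch (NumberFormatException e) {
            return defaultValue;   // PARAM value was not a number.
        }
    }
}
```

With a helper like this, the init() method could obtain the size with a statement along the lines of "fontSize = ParamUtil.toInt(getParameter("size"), 20);", where fontSize would be another instance variable and 20 is an arbitrary default.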

