Labs for The Most Complex Machine

Introductory Lab: The Web, Java, and DataReps


THIS IS THE FIRST in a set of lab worksheets meant to be used with the introductory computer science textbook, The Most Complex Machine. The lab worksheets are written for use on the World-Wide Web, and they make use of software written in the form of applets. Applets are computer programs written in a new programming language called Java. All this is explained in the lab, which acts as an introduction to the Web and an orientation to the way applets will be used in the rest of the labs.

As part of the lab, you will use an applet called "DataReps" to help you learn about how different types of data can be represented using binary numbers. This material is related to Section 1.1 of the text.

(For a full list of labs and applets, see the index page.)

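This lab includes the following sections: The World-Wide Web, URL's and All That, Searching the Web, Java and Applets, Data Representations, and Exercises.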


The World-Wide Web

The Internet consists of millions of computers around the world, linked together by a network so that they -- and their users -- can communicate and interact. In the last few years, the Internet has become a common part of everyday life for many people. The Internet provides a number of useful services, including e-mail, USENET news groups, and the World-Wide Web. This section of the lab is a brief introduction to the Web.

The World-Wide Web, also known as the WWW or simply as the Web, consists of "pages" of information stored on computers all around the world. These pages are available to anyone with a connection to the Internet. They can be viewed with a Web browser such as Netscape or Internet Explorer. A page can contain text, pictures, sounds, three-D graphics, movies, applets, and even interactive features such as fill-in forms. Most important, a page can contain links to other pages. When you click on a link, the Web browser will fetch the page that it refers to and display it to you. So, it's pretty easy to use the Web: just point your mouse at a link, and click! Here, for example, are some links to pages you might want to visit:

The Web is huge, and it has information on almost anything you can think of. There are millions of computers on the Internet. Each of those computers can run a "Web site" and publish Web pages. No one controls this; no one has to authorize it (at least not yet). In fact, you can publish your own information on the Web. One of the later labs will deal with this.


URL's and All That

Every page of information on the Web is identified by a URL (Uniform Resource Locator). Other resources, such as pictures and sounds, are also identified by URL's. When you are viewing a page with a Web browser, the URL of that page is usually displayed in a box near the top of the browser's display window. If you know the URL for a page, you can go directly to that page by entering the URL in that box (and pressing return).

A typical URL is http://math.hws.edu/TMCM/java/index.html. This is the URL for a page that describes all the labs and applets that I have written for use with The Most Complex Machine. This URL has several meaningful parts:

The HyperText Transfer Protocol (HTTP) is the most common method used for communication on the Web. Another common protocol is File Transfer Protocol (FTP), an older method for transferring files from one computer to another. You might also run across some other protocols in URL's.

A domain name, such as math.hws.edu, identifies a particular computer on the Internet. Most of the computers that are used as "servers" of data on the Web have domain names that begin with "www", such as www.whitehouse.gov. You can often read some information about a computer from its domain name. The computer named math.hws.edu is in the Mathematics Department ("math") at Hobart and William Smith Colleges ("hws"), which is an educational institution ("edu"). The last part of the domain name, such as "gov" or "edu" is called the top-level domain. Top-level domains include:

These domains are usually used by computers in the United States. Computers in other countries generally use two-letter country codes for their top-level domains. For example, a domain name ending in "it" indicates a computer in Italy, and "ca" is used by Canadian computers.

Many companies, organizations, and institutions have "home pages" on the Web. If you know something about domain names, you can often guess the URL used by a given company, organization, or institution. For example, you might guess that the home page of the United States Senate is http://www.senate.gov or that the Coca-Cola corporation has a home page at http://www.cocacola.com. (When you use a URL that omits the directory and file name, you will usually get the home page, or index page, from the specified computer.)
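To make the parts of a URL concrete, here is a small Java sketch that pulls apart the example URL from this section. It is only an illustration. It relies on the standard java.net.URI class, and the class name UrlParts and the helper topLevelDomain are names made up for this example; they are not part of the lab's applets.

```java
import java.net.URI;

public class UrlParts {
    // Hypothetical helper: returns the text after the last dot
    // in a domain name, for example "edu" for "math.hws.edu".
    static String topLevelDomain(String domainName) {
        return domainName.substring(domainName.lastIndexOf('.') + 1);
    }

    public static void main(String[] args) {
        URI url = URI.create("http://math.hws.edu/TMCM/java/index.html");

        System.out.println("protocol:         " + url.getScheme());
        System.out.println("domain name:      " + url.getHost());
        System.out.println("top-level domain: " + topLevelDomain(url.getHost()));
        System.out.println("directory/file:   " + url.getPath());
    }
}
```

For the example URL, the protocol is http, the domain name is math.hws.edu, the top-level domain is edu, and the rest of the URL, /TMCM/java/index.html, names a directory and file on that computer.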


Searching the Web

Because there is so much information on the Web, finding what you want can be a problem. There are several utilities that can help you to find things on the Web. First of all, there are "hierarchical indices" that list Web sites according to category. One of the largest of these indices is Yahoo.

Another way to find things on the Web is to use a "search engine." A search engine consists of an index of millions of Web pages and a program for searching the index. (The index is made by a program that constantly downloads Web pages and adds their contents to the index. No index can include all the data on the Web because the Web grows so quickly. Also, some of the data in an index will be out of date because people change or delete their Web pages.)
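The idea of an index can be sketched in a few lines of Java. This is a toy, not a description of how Alta Vista or any other real search engine is built. The class TinyIndex, its methods, and the three sample pages are all invented for illustration, and a real index would have to cope with millions of pages that keep changing.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TinyIndex {
    // The index: maps each word to the names of the pages that contain it.
    private final Map<String, Set<String>> index = new HashMap<>();

    // Adds every word in a page's text to the index.
    void addPage(String pageName, String text) {
        for (String word : text.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                index.computeIfAbsent(word, w -> new TreeSet<>()).add(pageName);
            }
        }
    }

    // Returns the names of the pages that contain ALL the words in the query.
    Set<String> search(String query) {
        Set<String> result = null;
        for (String word : query.toLowerCase().split("\\W+")) {
            if (word.isEmpty()) {
                continue;
            }
            Set<String> pages = index.getOrDefault(word, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(pages);   // first word: start with its pages
            } else {
                result.retainAll(pages);         // later words: keep only common pages
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        TinyIndex idx = new TinyIndex();
        // Made-up sample pages, standing in for downloaded Web pages.
        idx.addPage("page1", "Jupiter is the largest planet");
        idx.addPage("page2", "Sappho was a Greek poet");
        idx.addPage("page3", "The planet Jupiter has many moons");

        System.out.println(idx.search("jupiter planet")); // prints [page1, page3]
    }
}
```

Searching the index is fast because the program looks up each word directly instead of rereading every page.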

To use a search engine, all you have to do is type some words into a box and click on a button. (You can do more advanced searches, but most search engines allow you to do simple searches in this way.) You'll get back a list of Web pages that contain the words you entered. Here, for example, is a simple interface to the Alta Vista search engine. To try it, click in the input box, type some words, and then click the Submit button:

[Alta Vista search form]


There are many search engines, including Alta Vista, WebCrawler, Lycos, Excite, and Infoseek. You will have to use at least one of these search engines in order to do some of the exercises at the end of the lab. To get the most out of a search engine, you should read its help or instructions page.


Java and Applets

The example of the search engines shows that a Web page can be more than just a passive collection of links. A Web page can also be interactive. The Alta Vista search engine uses a type of interaction known as a form (or "fill-in form"). You enter some data in the form and click on a submit button. Your Web browser sends your data to another computer, which responds to your data by sending a new page for your Web browser to display.

Web pages can also provide other types of interactivity, without involving a second computer. One of the new technologies on the Web is Java, a programming language that can be used to write applets, which are small programs that run on a Web page. Many Java applets are more decorative than useful, like this "Moiré" applet:

[Moiré applet]

(A Moiré pattern is formed by the "interference" between two similar patterns. In this applet, the basic pattern consists of lines radiating out from a common center. There are two copies of this pattern. One is fixed, while the other drifts about.)

Even this decorative applet allows some interaction. If you click-and-drag your mouse on the Moiré applet, you can control the motion of the pattern. (To "click-and-drag" means to press the mouse button and then move the mouse, while holding down the button.) If you shift-click on the applet, you can stop and restart its motion. (To "shift-click" means to hold down the shift key while you click the mouse button.) If you are using a slow computer, you might want to turn off the Moiré applet, so that it doesn't take computer processing time away from other things going on on this page.

Each of the labs for The Most Complex Machine uses an applet to help you learn something about computer science. In order to make it more convenient to use the applets and read the labs at the same time, the applets are set up to run in separate windows. The lab worksheet contains a button that you can click to launch the applet. In most of the labs, this button is close to the beginning of the lab, where it will be easy to find. (The button is itself a small applet that runs on the Web page.)

In this lab, you will use a fairly simple applet called "DataReps". You will use this applet to learn how the same binary number can be used to represent different types of data. To launch the applet, click on this button:

[Button that launches the DataReps applet]

The window that opens when you click this button will probably be marked with some kind of warning, to alert you to the fact that the window was created by an applet. For example, on my computer, there is a warning bar like this one along the bottom of windows opened by applets:

Why the warning? A Java applet is a program that you have downloaded from the Internet. Whenever you download a program, there is a danger that the program is malicious -- that it will try to damage your computer or steal information from you. A great deal of attention has been paid to making Java applets secure, that is, to making sure that they can't damage your computer or access private information stored on your computer. However, nothing can stop you from entering private information, such as a password, into an applet. The warning on applet windows is there to stop malicious applets from tricking you into entering such information. For example, without the warning, the applet might imitate a window from a program that has a legitimate need for the information.


Data Representations

You'll be using the "DataReps" applet, which you launched above, in some of the exercises at the end of the lab. For now, you should read about it and experiment with it to see what it does.

This applet lets you type in a data value. You can select the type of data you want to enter by clicking on one of the five radio buttons. Just type your data into the input box at the top of the applet, and press return. You can also click on the 8-by-4 grid of "big pixels" at the center of the applet. The applet takes the data you enter and converts it into a 32-bit binary number. (It has to do this in order to store it!) It then takes that same binary number and interprets it in six different ways. The six interpretations are: a binary number, an integer, a hexadecimal number, a real number, a string of four characters, and an eight-by-four grid of pixels. You should remember that you see the same string of thirty-two bits interpreted in different ways. (You should also remember that the same bit-patterns could also be interpreted in an endless variety of additional ways: as a bar of music, or the chemical ingredients in a bar of soap, or your tab at your favorite bar, or....)

Here is a short explanation of each of the six data displays. You should try entering various types of values in the applet to see how they are represented as binary numbers.

Binary
This is the most direct display of the 32-bit binary number, showing a zero or one to represent each individual bit. The displayed binary number shows the full 32 bits, including any leading zeros. The computer stores the zeros, even though you don't ordinarily include leading zeros when you write a number.
Base-ten Integer
A binary number can be interpreted as a normal positive integer (0, 1, 2, 3, 4,...) written in "base ten". Base ten is the usual way of writing numbers, using the digits 0 through 9. See Section 1.1 of The Most Complex Machine. With 32 bits, you can represent 2^32 different numbers. Usually, you want to use both positive and negative numbers. The scheme for representing negative numbers is a bit strange. It is explained in Subsection 2.2.3 of the text. Using 32 bits, the integers from -2147483648 to 2147483647 can be represented.
Hexadecimal
It is difficult (for humans) to read long strings of zeros and ones. Hexadecimal numbers are a kind of shorthand for writing such strings. A hexadecimal number is written using the sixteen "hexadecimal digits" 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Each hexadecimal digit stands for four bits. So 0 represents 0000, 1 represents 0001, 2 represents 0010, ..., E represents 1110, and F represents 1111. We could also say that the hexadecimal digit A stands for the base-ten number 10 (ten), B stands for 11 (eleven), C stands for 12, D for 13, E for 14, and F for 15. A hexadecimal number is really just a number written in the base sixteen, just as ordinary integers are in the base ten and binary numbers are in the base two. You should be able to translate between hexadecimal and binary by hand. (But of course, you could use the applet instead.)
Real Number
Real numbers are numbers that can contain decimal points, like 3.14159 or -234.5, or 12.0. They can also be written using "scientific notation." For example, 2.15e12 is a way of writing 2.15 times 10^12. The representation used in computers for real numbers is very complicated. And it allows some strange possibilities, such as INF and -INF, which stand for infinity and minus infinity. There are also NAN's. NAN stands for "not a number." NAN's are used to represent the results of illegal operations such as taking a square root of a negative number. Note that the integer 17 and the real number 17 have completely different representations in the computer, even though they are the same number mathematically. All-in-all, it's probably best not to worry about the internal representation of real numbers. I include this data type here for completeness, since real numbers are so important.
ASCII Text
Characters can be encoded using ASCII code, as explained in Section 1.1 of the text. Each possible character is assigned a code that is one byte (that is, eight bits) long. With 32 bits, you can represent 4 characters in ASCII code. Not every possible byte represents an ordinary, printable character. The applet shows other bytes in the form <#n>, where n is the base-ten number corresponding to the byte. For example, the byte 00000111, which is equivalent to 7 in base ten, is shown as <#7>.
Pixels
At the center of the applet, you will see an 8-by-4 grid of little squares. Each of these thirty-two squares corresponds to one bit in the binary number. You should think of these squares as being very big pixels. Each pixel can be either black or white. One bit specifies the color of one pixel -- 0 for white or 1 for black. This is how two-color graphical images can be represented by binary numbers. Again, see Section 1.1 of the text. In the applet, you can change the color of a pixel by clicking on it.
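
If you are curious how one 32-bit pattern can be shown in all six ways, the following Java sketch does something similar with a single int. It is not the source code of the "DataReps" applet. The class name SixViews and its methods are made up for this example, and several details are my assumptions rather than facts about the applet: that the first character and the top row of pixels come from the leftmost eight bits, that bytes 32 through 126 count as printable, and that '#' and '.' stand in for black and white pixels.

```java
public class SixViews {
    // All 32 bits as zeros and ones, leading zeros included.
    static String binary(int bits) {
        String s = Integer.toBinaryString(bits);
        return String.format("%32s", s).replace(' ', '0');
    }

    // Eight hexadecimal digits; each digit stands for four bits.
    static String hex(int bits) {
        return String.format("%08X", bits);
    }

    // The same 32 bits read as a real number.
    static float real(int bits) {
        return Float.intBitsToFloat(bits);
    }

    // Four characters, one per byte. Bytes outside the assumed
    // printable range 32..126 are shown in the form <#n>.
    static String ascii(int bits) {
        StringBuilder sb = new StringBuilder();
        for (int shift = 24; shift >= 0; shift -= 8) {
            int b = (bits >>> shift) & 0xFF;
            if (b >= 32 && b <= 126) {
                sb.append((char) b);
            } else {
                sb.append("<#").append(b).append('>');
            }
        }
        return sb.toString();
    }

    // Four rows of eight pixels: '#' for a 1 bit (black), '.' for a 0 bit (white).
    static String pixels(int bits) {
        String b = binary(bits).replace('1', '#').replace('0', '.');
        StringBuilder sb = new StringBuilder();
        for (int row = 0; row < 4; row++) {
            sb.append(b, row * 8, row * 8 + 8).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int bits = 0x4A617661; // the ASCII codes for 'J', 'a', 'v', 'a', one per byte

        System.out.println("Binary:      " + binary(bits));
        System.out.println("Integer:     " + bits);
        System.out.println("Hexadecimal: " + hex(bits));
        System.out.println("Real:        " + real(bits));
        System.out.println("ASCII:       " + ascii(bits));
        System.out.print(pixels(bits));

        // The integer 17 and the real number 17 are stored as different bit patterns.
        System.out.println(hex(17));                          // prints 00000011
        System.out.println(hex(Float.floatToIntBits(17.0f))); // prints 41880000
    }
}
```

Nothing in the variable bits says which interpretation is the right one. The program decides that by choosing which method to call, just as you decide it in the applet by choosing which display to read.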

Exercises

Exercise 1: For this exercise, your goal is to use a search engine such as Alta Vista or WebCrawler to find an interesting page on the World Wide Web. Pick a topic that interests you. Think up some terms related to that topic, and search for pages containing those terms. Pick out one you find interesting. Don't settle for some boring generic page like ESPN Sports or Apple Computer Inc! Write a short paragraph saying what your topic was and how you went about doing the search. Also include the URL for the page that you find.

Exercise 2: Search the Web to find the poem written by the Greek poet Sappho about her daughter Cleis. How did you go about finding the poem? Where did you find it?

Exercise 3: Starting from one of the links given above, find the radius, in kilometers, of the planet Jupiter. How did you go about finding it?

Exercise 4: Guess the URL of the home page of each of the following. Explain your reasoning. (Check the Web to see whether you are right.)

Exercise 5: Pick out one or two of the following phrases. Each phrase is a fragment of a reasonably well-known quotation. Search the Web for uses of the phrase. (Use the Alta Vista advanced search; enter the phrase in quotes into the text-input box. This will not work with the regular Alta Vista search.) Try to find the complete phrase and the original source of the phrase. Also, try to find a few interesting variations that people have used on their Web pages.

Exercise 6: In addition to working on some of the above exercises, you should spend some time "surfing" the World-Wide Web. Write a short essay describing your experiences with the Web and speculating on its possible impact and importance.

Exercise 7: Use the "DataReps" applet to find the following. In each case, indicate briefly what you did with the applet to answer the question.

Exercise 8: Enter the following base-10 integers into the "DataReps" applet: 1, 2, 4, 8, 16, 32, 64. Describe the corresponding pixel representation of these numbers. (The pixel representation is displayed in the center of the applet). What pattern do you see? Why does this pattern occur? What can you say about the binary representation of these numbers?

Exercise 9: Enter a four-letter word such as "TIME" into the "DataReps" applet. (Select "ASCII Text" as the input type, type the word into the applet's input box, and press return.) Consider the pixel representation of the word, which is displayed in the center of the applet. Play with the pixels in the third column of pixels from the left. Turn them on and off by clicking on them, and observe what happens to each letter in the word. What happens? What does this tell you about the ASCII coding of letters? (What is the meaning of the third bit in that encoding?)

Exercise 10: This final exercise is meant to be a longer essay question. You should try to show your understanding of the way data is represented in a computer, and an appreciation for the fact that meaning depends on context and convention.

It would be legal to input 1000 into the "DataReps" applet as a binary number, a base-ten integer, a hexadecimal number, a real number, or ASCII text. In each case, the input is represented differently -- as a different binary number. How is it possible that five different binary numbers can all represent "1000"? What is going on here? How can the computer keep all the different meanings straight?


This is one of a series of labs written to be used with The Most Complex Machine: A Survey of Computers and Computing, an introductory computer science textbook by David Eck. For the most part, the labs are also useful on their own, and they can be freely used and distributed for private, non-commercial purposes. However, they should not be used as a formal part of a course unless The Most Complex Machine is also adopted for use in that course.

--David Eck (eck@hws.edu), Summer 1997