CPSC 120: Principles of Computer Science
Fall 2002

Lab 1: Data Representation (and Web Pages)


IN ONE PART OF THIS FIRST LAB, you will be introduced to the first of several "Java applets" that you will be using throughout the term. Each applet is meant to help you learn about some aspect of computing or computer science. The applet for this lab is pretty simple, but it illustrates some important ideas about the representation of data in a computer.

Another part of the lab will let you start work on a personal Web page that you will also work on in several labs later in the term. The Web pages that you create will be stored on the computer named math, which runs an operating system named Linux. Linux is a free alternative to Windows. You have an account on this Linux system, but for the time being you will access this account through Windows.

However, before you start in on the actual content of the lab, there is some set-up to do...


Organizing your Windows Account

When you use one of the Windows computers in one of the HWS labs, you are actually using an entire network of computers. Your personal files (the "M" drive), for example, are actually stored on a computer in the computer center, Williams Hall. Every time you log onto one of the lab computers, these files appear under an icon labeled "M:\ Documents" at the upper left corner of the screen. For this course, you will need access to files on another computer, "math", which is owned by the Department of Mathematics and Computer Science. In this section of the lab, you will set up your Windows account to make it easy to access all the files you need.

The idea is to create two "shortcuts" in your "M:\ Documents" folder:

You should now have two new shortcuts in your "M:\ Documents" folder. Open this folder by double-clicking the icon. (You might want to close any other open windows.) Double-click the "cpsc120" shortcut icon. Inside, you'll see a folder named "lab1", along with some other stuff. Double-click the "lab1" folder icon. Inside are the files you need for this lab, including a link to the worksheet for the lab (this document), an icon for running a Java Applet named "DataReps", and another folder that you will use for making your web page. There will be a folder similar to the lab1 folder for each lab in this course.


Data Representation

Java, which was introduced in 1995, is still a fairly new programming language. Among other things, Java can be used to write "Java applets", which are small programs that can appear on Web pages. Applets are one of the ways in which a Web page can become an interactive experience for the user, rather than just a static document for reading. Applets that are run in a Web browser are restricted in some ways, for security reasons (since you don't really want a program that you happen to encounter on the Web to have full access to your computer). Applets can also be run on their own, instead of through a Web browser. When run on their own, applets are not subject to security restrictions. Because of this, most of the applets that you run in this course will not be run off the Web. However, you can find all the applets on the Web (with the security restrictions) at http://math.hws.edu/TMCM/java/.

In this lab, you will use a little applet called "DataReps". You will use this applet to learn how the same binary number can be used to represent different types of data. To run this applet, double-click the icon named "Run DataReps" in the "lab1" folder.

For this lab, the security restrictions on applets that are run off the Web are not relevant. So, you could also run the applet by clicking the following button while reading this document in a Web browser:

(Sorry, your browser doesn't do Java!)

You'll be using this "DataReps" applet in some of the exercises at the end of the lab. First, you should read about it and experiment with it to see what it does. Then you might want to look at Exercises 1 and 2, at the end of this document.

The DataReps applet lets you type in a data value. You can select the type of data you want to enter by clicking on one of the five radio buttons. Just type your data into the input box at the top of the applet, and press return. You can also click on the 8-by-4 grid of "big pixels" at the center of the applet. The applet takes the data you enter, and it converts that data into a 32-bit binary number. (It has to do this in order to store it!) It then takes that same binary number and interprets it in six different ways. The six interpretations are: a binary number, an integer, a hexadecimal number, a real number, a string of four characters, and an eight-by-four grid of pixels. You should remember that you see the same string of thirty-two bits interpreted in different ways. (You should also remember that the same bit patterns could be interpreted in an endless variety of additional ways: as a bar of music, or the chemical ingredients in a bar of soap, or your tab at your favorite bar, or....)
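The idea that one 32-bit pattern supports several readings can also be tried outside the applet. The following Python sketch is not the applet's code. It assumes big-endian byte order and the IEEE 754 single-precision format for real numbers (the text above does not say which format the applet uses), and the function name `interpretations` is made up for illustration.

```python
import struct

def interpretations(bits):
    """Return several readings of one 32-bit pattern (an int in 0..2**32-1)."""
    raw = bits.to_bytes(4, "big")                  # the same four bytes, reused below
    return {
        "binary": format(bits, "032b"),            # all 32 bits, leading zeros kept
        "integer": struct.unpack(">i", raw)[0],    # signed 32-bit integer
        "hexadecimal": format(bits, "08X"),        # eight hexadecimal digits
        "real": struct.unpack(">f", raw)[0],       # assumed IEEE 754 single precision
    }

views = interpretations(0x41880000)
for name, value in views.items():
    print(name, value)
```

Under these assumptions, the pattern with hexadecimal form 41880000 reads as the integer 1099431936 and as the real number 17.0, which shows how far apart two readings of the same bits can be.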

Here is a short explanation of each of the six data displays. You should try entering various types of values in the applet to see how they are represented as binary numbers.

Binary
This is the most direct display of the 32-bit binary number, showing a zero or one to represent each individual bit. The displayed binary number shows the full 32 bits, including any leading zeros. The computer stores the zeros, even though you don't ordinarily include leading zeros when you write a number.
Base-ten Integer
A binary number can be interpreted as an ordinary non-negative integer (0, 1, 2, 3, 4,...) written in "base ten". Base ten is the usual way of writing numbers, using the digits 0 through 9. See Section 1.1 of The Most Complex Machine. With 32 bits, you can represent 2^32 different numbers. Usually, you want to use both positive and negative numbers. The scheme for representing negative numbers is a bit strange. It is explained in Subsection 2.2.3 of the text. Using 32 bits, the integers from -2147483648 to 2147483647 can be represented.
Hexadecimal
It is difficult (for humans) to read long strings of zeros and ones. Hexadecimal numbers are a kind of shorthand for writing such strings. A hexadecimal number is written using the sixteen "hexadecimal digits" 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Each hexadecimal digit stands for four bits. So 0 represents 0000, 1 represents 0001, 2 represents 0010, ..., E represents 1110, and F represents 1111. We could also say that the hexadecimal digit A stands for the base-ten number 10 (ten), B stands for 11 (eleven), C stands for 12, D for 13, E for 14, and F for 15. A hexadecimal number is really just a number written in the base sixteen, just as ordinary integers are in the base ten and binary numbers are in the base two. You should be able to translate between hexadecimal and binary by hand. (But of course, you could use the applet instead.)
Real Number
Real numbers are numbers that can contain decimal points, like 3.14159 or -234.5, or 12.0. They can also be written using "scientific notation." For example, 2.15e12 is a way of writing 2.15 times 10^12. The representation used in computers for real numbers is very complicated. And it allows some strange possibilities, such as INF and -INF, which stand for infinity and minus infinity. There are also NAN's. NAN stands for "not a number." NAN's are used to represent the results of illegal operations such as taking the square root of a negative number. Note that the integer 17 and the real number 17 have completely different representations in the computer, even though they are the same number mathematically. All-in-all, it's probably best not to worry about the internal representation of real numbers. I include this data type here for completeness, since real numbers are so important.
ASCII Text
Characters can be encoded using ASCII code, as explained in Section 1.1 of the text. Each possible character is assigned a code that is one byte (that is, eight bits) long. With 32 bits, you can represent 4 characters in ASCII code. Not every possible byte represents an ordinary, printable character. The applet shows non-printable characters in the form <#n>, where n is the base-ten number corresponding to the byte. For example, the byte 00000111, which is equivalent to 7 in base ten, is shown as <#7>. Note that each character corresponds to one of the rows of pixels in the pixel representation of the number.
Pixels
At the center of the applet, you will see an 8-by-4 grid of little squares. Each of these thirty-two squares corresponds to one bit in the binary number. You should think of these squares as being very big pixels. Each pixel can be either black or white. One bit specifies the color of one pixel -- 0 for white or 1 for black. This is how two-color graphical images can be represented by binary numbers. Again, see Section 1.1 of the text. In the applet, you can change the color of a pixel by clicking on it.
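As a rough illustration of the last two displays, the Python sketch below shows a 32-bit value as four characters and as four rows of eight pixels. It is a guess at the layout rather than the applet's actual code: it assumes the first character occupies the top row, treats byte values 32 through 126 as printable, and draws black pixels as "#" and white pixels as ".". The function names are made up for this example.

```python
def ascii_display(bits):
    """Show four bytes as characters, using <#n> for non-printable codes."""
    parts = []
    for byte in bits.to_bytes(4, "big"):
        if 32 <= byte <= 126:              # assumed range of printable ASCII codes
            parts.append(chr(byte))
        else:
            parts.append(f"<#{byte}>")
    return "".join(parts)

def pixel_rows(bits):
    """Return four rows of eight pixels: '#' for a 1 bit, '.' for a 0 bit."""
    pattern = format(bits, "032b")
    rows = [pattern[i:i + 8] for i in range(0, 32, 8)]
    return [row.replace("1", "#").replace("0", ".") for row in rows]

# Two printable characters followed by the bytes 7 and 0.
value = int.from_bytes(b"Hi\x07\x00", "big")
print(ascii_display(value))
for row in pixel_rows(value):
    print(row)
```

With this input the text display is Hi<#7><#0>, and each of the four printed rows is the eight-bit code of one character, which is the correspondence between characters and pixel rows noted above.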

Starting your Web Page

Your account on the "math" computer contains a directory named "www". Anything that you place in the www directory is visible to the world on the Web. To start your web page, the folder named "For your Web Page" in the "lab1" directory contains two files that you should copy to your "www" directory. (Note: If you already have a Web page on the math server, you might not want to replace your current index.html file. You can give a different name to the new file. Please ask for help about how to do this!)

There are two ways to copy the files. Either:

As soon as you do this, you have a (very impersonal) page on the Web. (It is just a copy of this generic page.) You can view your own copy of the page at a Web address of the form:

http://math.hws.edu/~username

with username replaced by your own user name. (Note the "~" character that precedes the username. This "squiggle" character is important.) You can also use the address:

http://math.hws.edu/~username/index.html

When you leave out the file name at the end of the Web address, the name "index.html" is used by default. The other file, "hws.jpg," contains the picture of HWS that is loaded into the main page.

Make sure that you can view your page on the Web!


Exercises 3 and 4 ask you to edit your web page. Although there are fancy programs for editing Web pages, for now I would like you to work directly with the HTML language, which is used to define Web pages. To do this, you should edit your file with the "Notepad" text editor.

An easy way to run Notepad is to double-click the icon named "Run Notepad" in the "cpsc120" folder. You should run Notepad and open the "index.html" file in your "www" directory, using the "Open" command in Notepad's "File" menu. In the Open File dialog, you will have to switch the "Files of Type" setting from "Text Documents" to "All Files". (Be sure to open your copy of index.html, not the original version in the cpsc120 folder. You won't be allowed to change the original!) When you open the index.html file, you will see something like a program, written in HTML (HyperText Markup Language). This "program" says what is on the Web page and how it should be formatted.

Basic information about HTML can be found in Section 6.2 of my on-line Java textbook. You should read this section, except for the Java programming code at the very end. Here, though, is a summary of the most basic information:

An HTML file for a Web page contains all the text that appears on the page. It also contains commands called "tags" that specify the layout and style of the text. Tags are also used to add things like images and applets to the page and to specify links to other pages. You can recognize tags because they are enclosed between "<" and ">". For example, <HR> is a tag that tells the Web browser to draw a horizontal line across the page. Some tags occur in pairs. For example, <P> marks the beginning of a paragraph and the matching tag </P> marks the end of the paragraph. Similarly, the <H1> and </H1> tags enclose text that is to be displayed on a line by itself as a large headline. Some tags can have modifiers to give extra information. For example, the tag

<IMG SRC="hws.jpg" WIDTH=288 HEIGHT=192>

specifies that an image is to be added to the page. It has modifiers named SRC, WIDTH, and HEIGHT. The SRC modifier gives the name of a file that contains the image. The WIDTH and HEIGHT modifiers specify the size of the rectangle on the page that will be occupied by the image. (All tag and modifier names, by the way, are case-insensitive. For example, it doesn't matter whether you type WIDTH or width or Width.)

A link to another page is specified using the <A> tag. This tag uses a modifier called "HREF" to specify the Web address of the page to which you want to link. There must be a matching </A>. The text between <A> and </A> will appear on the page as the text of the link. For example, this text in an .html file:

<a href="http://math.hws.edu/eck/">My Home Page</a>

will produce the following link on the Web page: My Home Page.
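To see how a program might pick tags and modifiers out of HTML text, here is a small Python sketch that uses the standard library's html.parser module. It is only an illustration, not how a Web browser actually works, and the class name TagLister is made up. It reads the two example tags from above and records each tag name with its modifiers.

```python
from html.parser import HTMLParser

class TagLister(HTMLParser):
    """Collect each opening tag together with its modifiers (attributes)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag and modifier names in lower case,
        # so WIDTH, width, and Width all arrive as "width".
        self.tags.append((tag, dict(attrs)))

lister = TagLister()
lister.feed('<IMG SRC="hws.jpg" WIDTH=288 HEIGHT=192>')
lister.feed('<a href="http://math.hws.edu/eck/">My Home Page</a>')
lister.close()
print(lister.tags)
```

Note that only the names are folded to lower case. The values of the modifiers, such as the file name and the Web address, are kept exactly as typed.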

You might already have enough information to get started on the exercises below, but you should definitely read the material mentioned above and use the information in it to complete the lab exercises. You will do more work on your web page in future labs.


Exercises

These exercises are due next Wednesday, September 11. You should turn in your answers to Exercises 1 and 2 in typed or hand-written form. Exercises 3 and 4 should be available on the Web. (I will not grade your Web page until Saturday, September 14). If the name of your web page is not index.html, write the name of the page on the lab report that you turn in. Remember that you can work with a partner on the lab if you want, and you can turn in a combined lab report. If you do this, make sure that your lab report has both names on it, and make sure that I know which person's account contains the Web page for exercises 3 and 4.

Exercise 1:  The Data Representations Applet can be used to convert data from one format to another. If you want to convert the hexadecimal number A79B to a base-10 integer, for example, you can select "Hexadecimal Number" as the input type, then enter the string A79B into the applet's input box and press return. In the output area at the bottom of the applet, you can see that the corresponding base-10 integer is 42907. (You also see several other representations of the same string of bits.) Use the Data Representation Applet to answer each of the following questions:

a) What real number is represented by the same pattern of bits as the ASCII text Jane ? How did you find the answer?

b) How is the Base-ten Integer -1 (negative one) represented as a string of 32 bits? How did you find the answer?

c) What is the ASCII text code number for the character "%"? Give your answer in two forms: as a byte and as a Base-ten Integer. (Note that 32 bits represent four characters, not just one character. To understand what is going on in the applet, you need to know what <#0> means when it is displayed on the last line of the applet. This is explained above. Read about it!) How did you find your answer?

d) Compare the binary representation of the ASCII text FRED to the representation of fred. Also compare the representations of ABCD to abcd. The difference is most easily seen if you look at the pixel representation in the center of the applet. Based on your observation, what is the difference between the binary code number for a given upper case letter and the binary code number for the corresponding lower case letter? To put it another way, how would you modify the binary code number of an upper case letter to change it to lower case?

Exercise 2:  A base-ten number can be converted into a binary number by writing it as a sum of distinct powers of two. Each power of two in the sum corresponds to a 1 in the binary number. For example,

                293  =  256 + 32 + 4 + 1

                     =  2^8 + 2^5 + 2^2 + 2^0

                     =  100000000_2 + 100000_2 + 100_2 + 1_2

                     =  100100101_2

In the data representation applet, this relationship can also be seen in the graphical pixel representation of the binary number. What does a power of two look like in the pixel representation? Why? More generally, discuss the relationship between a sum of powers of two and its pixel representation.
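The conversion described at the start of Exercise 2 can be sketched in a few lines of Python. This is just one possible way to do it, with a made-up function name. It checks each bit position from the highest down and collects the powers of two whose bit is 1.

```python
def powers_of_two(n):
    """Return the distinct powers of two that add up to n, largest first."""
    powers = []
    exponent = n.bit_length() - 1          # position of the highest 1 bit
    while exponent >= 0:
        if n & (1 << exponent):            # is this bit a 1?
            powers.append(1 << exponent)
        exponent -= 1
    return powers

terms = powers_of_two(293)
print(terms)                               # the powers of two in the sum
print(format(sum(terms), "b"))             # the same number written in binary
```

For 293 the sketch reproduces the sum 256 + 32 + 4 + 1 from the worked example. The pixel question in the exercise is still left for you to answer with the applet.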

Exercise 3:  If you followed the above instructions, you now have a page on the World-Wide-Web. Improve that page in the following ways: (1) Personalize it by changing the headline and the text on the page; (2) Change some of the colors that are used on the page; and (3) Add at least one additional image to the page.

Notes: Some colors on web pages, such as red and white, can be specified by name. However, the most general way of specifying color is with a six-digit hexadecimal number. The number must be in quotes and preceded by a #, as in "#0000CC". For a table that shows the hexadecimal codes for many different colors, click here. As for images, the easiest place to get them is from other Web pages, as long as you make sure that the image that you use is not protected by copyright. When you see an image you want in a Web browser, right-click on it, choose "Save Picture As" (or a similar command) from the pop-up menu, and save the image in your "www" directory. When you don't know the WIDTH and HEIGHT of an image, you can leave them out. That is, an image can be included on a web page with a tag of the form <IMG src="filename"> with no WIDTH or HEIGHT modifier. For use on a Web page, the file name should end with .jpg, .jpeg, .png, or .gif. You might want to use a "background image" on your page instead of a plain background color. To see a version of the generic web page with a background image, click here. To use a background image, just add a modifier BACKGROUND="image.jpg" to the <BODY> tag of your page (replacing image.jpg with the name of the image file that you want to use). The image file should be in the same directory as the HTML file. The background image on my sample page is a texture that I downloaded from http://www.grsites.com/textures/.
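A six-digit color code ties back to the hexadecimal numbers from the first part of the lab. The notes above do not spell out what the six digits mean; in the usual Web convention, which this Python sketch assumes, the first two digits give the amount of red, the middle two the amount of green, and the last two the amount of blue, each as a number from 0 to 255. The function name parse_color is made up for illustration.

```python
def parse_color(code):
    """Split a color such as "#0000CC" into (red, green, blue), each 0..255."""
    if len(code) != 7 or not code.startswith("#"):
        raise ValueError("expected # followed by six hexadecimal digits")
    digits = code[1:]
    # Each pair of hexadecimal digits is one byte: red, then green, then blue.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

print(parse_color("#0000CC"))
```

Under this assumption "#0000CC" has no red, no green, and a blue value of 204 (hexadecimal CC), so it is a shade of blue.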

Exercise 4:  For this exercise, you will add some text and some links to your Web page. (You can remove the list of links that is already there, if you want; they are there as a model of how to put links on a page.) To make it interesting, the links should be a "mini-exhibit" on some topic such as: your home town, your favorite sports team, an important scientific discovery, your favorite movie, the history of computing, an important artist, a mostly unknown artist, or anything else you like. The links should be a variety of sites on the Web. You should have at least five links. One way to find links is to use a Web search engine. A search engine lets you type in one or more words, and it shows you a list of web pages that contain those words. The best search engine is probably Google, which can be found at http://www.google.com. In addition to the links to the pictures, your exhibit should include at least one headline and a paragraph or two to explain the exhibit.


--David Eck (eck@hws.edu), September 2002