CPSC 120: Principles of Computer Science
Spring 2002

Lab 1: Data Representation, Linux, and Web Pages


IN THE FIRST PART OF THIS FIRST LAB, you will be introduced to the first of several "Java applets" that you will be using throughout the term. Each applet is meant to help you learn about some aspect of computing or computer science. The applet for this lab is pretty simple, but it illustrates some important ideas about the representation of data in a computer. Links to the applets, as well as to a complete set of labs based on them, can be found at http://math.hws.edu/TMCM/java. Some of the labs for this course will be taken directly from that page. Others, like this one, will be new or revised for this course.

The second part of the lab will introduce you to your account on the Linux computer system of the Department of Mathematics and Computer Science. (Linux is an alternative to the more common Windows and Macintosh operating systems that you are probably already familiar with.)


Data Representation

Java, which was introduced in 1995, is still a fairly new programming language. Among other things, Java can be used to write "Java applets", which are small programs that can appear on Web pages. Applets are one of the ways in which a Web page can become an interactive experience for the user, rather than just a static document for reading.

In this lab, you will use a little applet called "DataReps". You will use this applet to learn how the same binary number can be used to represent different types of data. To launch the applet, click on this button. When you do, the applet will open as a separate window:


You'll be using this "DataReps" applet in some of the exercises at the end of the lab. First, you should read about it and experiment with it to see what it does. Then you might want to look at Exercises 1 and 2, at the end of this page.

The DataReps applet lets you type in a data value. You can select the type of data you want to enter by clicking on one of the five radio buttons. Just type your data into the input box at the top of the applet, and press return. You can also click on the 8-by-4 grid of "big pixels" at the center of the applet. The applet takes the data you enter, and it converts that data into a 32-bit binary number. (It has to do this in order to store it!) It then takes that same binary number and interprets it in six different ways. The six interpretations are: a binary number, an integer, a hexadecimal number, a real number, a string of four characters, and an eight-by-four grid of pixels. Remember that you are seeing the same string of thirty-two bits interpreted in different ways. (You should also remember that the same bit patterns could be interpreted in an endless variety of additional ways: as a bar of music, or the chemical ingredients in a bar of soap, or your tab at your favorite bar, or....)

Here is a short explanation of each of the six data displays. You should try entering various types of values in the applet to see how they are represented as binary numbers.

Binary
This is the most direct display of the 32-bit binary number, showing a zero or one to represent each individual bit. The displayed binary number shows the full 32 bits, including any leading zeros. The computer stores the zeros, even though you don't ordinarily include leading zeros when you write a number.
Base-ten Integer
A binary number can be interpreted as an ordinary non-negative integer (0, 1, 2, 3, 4,...) written in base ten. Base ten is the usual way of writing numbers, using the digits 0 through 9. See Section 1.1 of The Most Complex Machine. With 32 bits, you can represent 2^32 different numbers. Usually, you want to use both positive and negative numbers. The scheme for representing negative numbers is a bit strange. It is explained in Subsection 2.2.3 of the text. Using 32 bits, the integers from -2147483648 to 2147483647 can be represented.
Hexadecimal
It is difficult (for humans) to read long strings of zeros and ones. Hexadecimal numbers are a kind of shorthand for writing such strings. A hexadecimal number is written using the sixteen "hexadecimal digits" 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F. Each hexadecimal digit stands for four bits. So 0 represents 0000, 1 represents 0001, 2 represents 0010, ..., E represents 1110, and F represents 1111. We could also say that the hexadecimal digit A stands for the base-ten number 10 (ten), B stands for 11 (eleven), C stands for 12, D for 13, E for 14, and F for 15. A hexadecimal number is really just a number written in base sixteen, just as ordinary integers are written in base ten and binary numbers in base two. You should be able to translate between hexadecimal and binary by hand. (But of course, you could use the applet instead.)
Real Number
Real numbers are numbers that can contain decimal points, like 3.14159 or -234.5 or 12.0. They can also be written using "scientific notation." For example, 2.15e12 is a way of writing 2.15 times 10^12. The representation used in computers for real numbers is very complicated. It also allows some strange possibilities, such as INF and -INF, which stand for infinity and minus infinity. There are also NANs. NAN stands for "not a number." NANs are used to represent the results of illegal operations such as taking the square root of a negative number. Note that the integer 17 and the real number 17 have completely different representations in the computer, even though they are the same number mathematically. All in all, it's probably best not to worry about the internal representation of real numbers. I include this data type here for completeness, since real numbers are so important.
ASCII Text
Characters can be encoded using ASCII code, as explained in Section 1.1 of the text. Each possible character is assigned a code that is one byte (that is, eight bits) long. With 32 bits, you can represent four characters in ASCII code. Not every possible byte represents an ordinary, printable character. The applet shows non-printable characters in the form <#n>, where n is the base-ten number corresponding to the byte. For example, the byte 00000111, which is equivalent to 7 in base ten, is shown as <#7>. Note that each character corresponds to one of the rows of pixels in the pixel representation of the number.
Pixels
At the center of the applet, you will see an 8-by-4 grid of little squares. Each of these thirty-two squares corresponds to one bit in the binary number. You should think of these squares as being very big pixels. Each pixel can be either black or white. One bit specifies the color of one pixel -- 0 for white or 1 for black. This is how two-color graphical images can be represented by binary numbers. Again, see Section 1.1 of the text. In the applet, you can change the color of a pixel by clicking on it.
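
If you are curious, here is a minimal sketch, written in Python rather than Java, of how one 32-bit pattern can be read in all six ways. It is not the applet's actual code. It assumes that the leftmost byte comes first, that negative integers use the usual two's complement scheme, and that real numbers use the standard IEEE 754 single-precision format; the lab text does not confirm any of these details. The names interpretations and bits_from_real are made up for this illustration.

```python
import struct

def interpretations(bits):
    """Return six readings of one 32-bit pattern, given as an unsigned integer."""
    raw = bits.to_bytes(4, "big")  # four bytes, leftmost byte first (assumed)
    return {
        "binary": format(bits, "032b"),            # all 32 bits, with leading zeros
        "hexadecimal": format(bits, "08X"),        # one hex digit per four bits
        "integer": struct.unpack(">i", raw)[0],    # signed, two's complement (assumed)
        "real": struct.unpack(">f", raw)[0],       # IEEE 754 single precision (assumed)
        # One character per byte; non-printable bytes are shown as <#n>.
        "ascii": "".join(chr(b) if 32 <= b < 127 else f"<#{b}>" for b in raw),
        # One row of eight pixels per byte: '#' for a 1 bit, '.' for a 0 bit.
        "pixels": [format(b, "08b").replace("0", ".").replace("1", "#") for b in raw],
    }

def bits_from_real(x):
    """Return the 32-bit pattern that stores the real number x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

# The integer 17 and the real number 17 are stored as different bit patterns.
print(interpretations(17)["binary"])
print(format(bits_from_real(17.0), "032b"))

# The same 32 bits shown as text and as four rows of pixels.
reading = interpretations(0x4A617661)
print(reading["ascii"])
for row in reading["pixels"]:
    print(row)
```

Each entry in the result is computed from the same four bytes, which is the point of the applet: the bits do not change, only the way they are read.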

Your Account on math.hws.edu

While you are taking CPSC 120, you have an account on the computer network used by the Department of Mathematics and Computer Science. You will use this account to publish your web page on the World-Wide Web and to save some of the work that you will produce in exercises in later labs.

You can access your account using a program called X-Win32 on any of the Windows computers in the Colleges' public computer labs. When you start up this program, you will see a list of computers that are available. You want one of the computers named cslab1, cslab2, and so on. Double-click on any of the cslab computers. You will get a new log-in screen where you can sign on to the Math/CS Department's system. You will be assigned a user name and password for use on this system. It takes a few seconds after you log in for the computer desktop to appear.

The first time you log in, the computer will display a "Desktop Settings Wizard". On the first screen, set the country to "United States of America" and make sure that the character set is iso8859-1. The defaults on the remaining screens are probably OK. When you close the wizard, you will see two other dialog boxes: Kandalf's Tips and an information message about a "sound server". Close both of these. You probably want to check the box labeled "Don't show this message again" before you close the message about the sound server, since you don't want to see it every time you log in.

The Linux computer system that you are using is very powerful. If you want to learn more about it, you could start with the information at http://math.hws.edu/eck/about_linux/. Among the information available there are instructions for installing X-Win32 on your own computer. Note that you can also use the Linux system directly (instead of through X-Win32) in the Computer Science Lab in Lansing 310.

One important note: Your Linux account, in its default configuration, is a single-click environment. That is, you should just click once on an icon to open a folder or file.

(I should note that there is a lot more that you can use in your Math/CS account if you want to try it. For example, check the menu that you get by clicking the "K" icon at the bottom left of the screen.)


Your account already contains two folders, named www and homework. These folders are in what is called your "home directory". To see the contents of your home directory, click the little icon at the bottom of the screen that looks like a folder with a little house in front of it. (Alternatively, you can use the "Home Directory" command in the "K" menu, which is at the bottom-left of the screen.) The window you see is just like a directory window in Windows or MacOS. You can open a folder or file by clicking on it (once!). You can navigate to other folders using the arrows near the top of the window and by entering directory names in the "Location" box. Note that the actual name of your home directory is /home/username, with "username" replaced by your own user name.

Any files that you put in your www directory will be visible on the World-Wide Web. I would like you to start work on your own Web site by copying some files into this directory. To do this:

You now have a page on the World-Wide Web. You can view it using Netscape or Internet Explorer, for example. To access it, use a URL of the form http://math.hws.edu/~username/index.html except that you should replace username with your own user name. (And, if you renamed the index.html file, use the name that you gave it in place of "index.html".) Note that the Web page includes the image coxe.jpg.


Exercises 3 and 4 ask you to edit your web page. To do this, click on the file with the right mouse button, and select "Open With..." from the menu that pops up. A little dialog box will be displayed. Type nedit in this dialog box, and press return. An editing window will open up where you can modify the file. (Nedit is the name of one of the text-editing programs on the Linux system. I happen to like nedit, but there are other text editors, such as kwrite, that you can also try.)

The file that defines a Web page is written in HTML (HyperText Markup Language), a language for describing Web pages. Basic information about HTML can be found in Section 6.2 of my on-line Java textbook. You should read this section, except for the Java programming code at the very end. Here, though, is a summary of the most basic information:

An HTML file for a Web page contains all the text that appears on the page. It also contains commands called "tags" that specify the layout and style of the text. Tags are also used to add things like images and applets to the page and to specify links to other pages. You can recognize tags because they are enclosed between "<" and ">". For example, <HR> is a tag that tells the Web browser to draw a horizontal line across the page. Some tags occur in pairs. For example, <P> marks the beginning of a paragraph and the matching tag </P> marks the end of the paragraph. Similarly, the <H1> and </H1> tags enclose text that is to be displayed on a line by itself as a large headline. Some tags can have modifiers to give extra information. For example, the tag

<IMG SRC="coxe.jpg" WIDTH=288 HEIGHT=192>

specifies that an image is to be added to the page. It has modifiers named SRC, WIDTH, and HEIGHT. The SRC modifier gives the name of a file that contains the image. The WIDTH and HEIGHT modifiers specify the size of the rectangle on the page that will be occupied by the image. (All tag and modifier names, by the way, are case-insensitive. For example, it doesn't matter whether you type WIDTH or width or Width.)

A link to another page is specified using the <A> tag. This tag uses a modifier called "HREF" to specify the Web address of the page to which you want to link. There must be a matching </A>. The text between <A> and </A> will appear on the page as the text of the link. For example, this text in an .html file:

<a href="http://math.hws.edu/eck/">My Home Page</a>

will produce the following link on the Web page: My Home Page.

You might already have enough information to get started on the exercises below, but you should definitely read the material mentioned above and use the information in it to complete the lab exercises. You will do more work on your web page in future labs.


Exercises

These exercises are due in class next Wednesday, January 23. You should turn in your answers to Exercises 1 and 2 in typed or hand-written form. Exercises 3 and 4 should be available on the Web. If the name of your web page is not index.html, write the name of the page on the lab report that you turn in. Remember that you can work with a partner on the lab if you want, and you can turn in a combined lab report. If you do this, make sure that your lab report has both names on it, and make sure that I know which person's account contains the Web page for exercises 3 and 4.

Exercise 1:  The DataReps applet can be used to convert data from one format to another. If you want to convert the hexadecimal number A79 to a base-10 integer, for example, you can select "Hexadecimal Number" as the input type, then enter the string A79 into the applet's input box and press return. In the output area at the bottom of the applet, you can see that the corresponding base-10 integer is 2681. Use the DataReps applet to answer each of the following questions:

a) What real number is represented by the same pattern of bits as the ASCII text Fred ? How did you find the answer?

b) How is the base-10 integer -1 (negative one) represented as a string of 32 bits? How did you find the answer?

c) What is the ASCII code number for the character "*"? Give your answer in two forms: as a byte and as a base-10 number. (Note that 32 bits represent four characters, not just one character. To understand what is going on in the applet, you need to know what <#0> means when it is displayed on the last line of the applet. This is explained above.)

d) Compare the binary representation of the ASCII text FRED to the representation of fred. Also compare the representations of ABCD to abcd. The difference is most easily seen if you look at the pixel representation in the center of the applet. Based on your observation, what is the difference between the base-two ASCII code number for a given upper case letter and the base-two ASCII code number for the corresponding lower case letter? To put it another way, how would you modify the ASCII code number of an upper case letter to change it to lower case?

Exercise 2:  A base-ten number can be converted into a binary number by writing it as a sum of distinct powers of two. Each power of two in the sum corresponds to a 1 in the binary number. For example,

                293  =  256 + 32 + 4 + 1

                     =  2^8 + 2^5 + 2^2 + 2^0

                     =  100000000_2 + 100000_2 + 100_2 + 1_2

                     =  100100101_2

In the data representation applet, this relationship can also be seen in the graphical pixel representation of the binary number. What does a power of two look like in the pixel representation? Why? More generally, discuss the relationship between a sum of powers of two and its pixel representation.
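
The worked example for 293 can also be checked mechanically. The following is a minimal sketch in Python; the function name powers_of_two is made up for this illustration and is not part of the applet. It tests each bit position in turn and collects the power of two for every position that holds a 1.

```python
def powers_of_two(n):
    """Return the distinct powers of two that add up to n, largest first."""
    powers = []
    exponent = n.bit_length() - 1      # position of the leftmost 1 bit
    while exponent >= 0:
        if n & (1 << exponent):        # is the bit at this position a 1?
            powers.append(1 << exponent)
        exponent -= 1
    return powers

print(powers_of_two(293))   # [256, 32, 4, 1]
print(format(293, "b"))     # 100100101
```

This only confirms the arithmetic; the question about what the sum looks like as pixels is still yours to answer by experimenting with the applet.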

Exercise 3:  If you followed the above instructions, you now have a page on the World-Wide Web. Improve that page in the following ways: (1) Personalize it by changing the headline and the text on the page; (2) Change some of the colors that are used on the page; and (3) Add at least one additional image to the page.

Notes: Some colors on web pages, such as red and white, can be specified by name. However, the most general way of specifying color is with a six-digit hexadecimal number. The number must be in quotes and preceded by a #, as in "#0000CC". For a table that shows the hexadecimal codes for many different colors, click here. As for images, the easiest place to get them is from other Web pages, as long as you make sure that the image that you use is not protected by copyright. You can view web pages in Linux's Konqueror program. Just enter the URL into the "Location" box. When you see an image you want, right-click on it, choose "Save Image As" from the pop-up menu, and save the image in your www directory. If you can't find another image, you can use the picture of the textbook from http://math.hws.edu/eck/cs120/. You can find free icons and animated pictures at http://www.iconbazaar.com. When you don't know the WIDTH and HEIGHT of an image, you can leave them out. That is, an image can be included on a web page with a tag of the form <IMG src="filename">. For use on a Web page, the file name should end with .jpg, .jpeg, .png, or .gif.
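
A six-digit color code is another example of hexadecimal at work. By the standard Web convention (which the notes above do not spell out), the first two hexadecimal digits give the amount of red, the next two the amount of green, and the last two the amount of blue, each on a scale from 0 to 255. Here is a minimal sketch in Python of how such a code breaks down; the function name parse_color is made up for this illustration.

```python
def parse_color(code):
    """Split a color code such as "#0000CC" into (red, green, blue) amounts."""
    digits = code.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected six hexadecimal digits")
    # Each pair of hexadecimal digits is one byte, so each amount is 0 to 255.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

print(parse_color("#0000CC"))   # (0, 0, 204)
```

So "#0000CC" has no red, no green, and a fairly strong blue, which is why it shows up as a medium-dark blue on the page.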

Exercise 4:  For this exercise, you will add some links (at least four of them) to your Web page. (Links are discussed above.) The links will be a "mini-exhibit" about pictures of old computers. To find pages to link to, you should use a Web search engine. A search engine lets you type in one or more words, and it shows you a list of web pages that contain those words. The best search engine is probably Google, which can be found at http://www.google.com. To find pictures of old computers, for example, you might use "pictures of old computers" as the search string. You should include at least one picture of the ENIAC in your exhibit. The ENIAC is generally regarded as the first general-purpose electronic computer. In addition to the links to the pictures, your exhibit should include some words to explain what is going on.


--David Eck (eck@hws.edu), January 2002