| CPSC 220 | Introduction to Computer Architecture | Fall 2008 |
In this lab, you will write an assembler in Java for converting Larc assembly programs into Larc machine programs. You should review the Larc assembly language handout before starting this lab.
Create a lab08 directory within the ~/cs220 directory that you created in lab 1 for use in this lab. Change to the newly-created lab08 directory. Copy the provided files from the directory /classes/f08/cs220/labs/lab08 to your new lab08 directory. The following commands should work (note: the "-r" in the cp command allows for copying directories as well as files):
bash$ mkdir ~/cs220/lab08
bash$ cd ~/cs220/lab08/
bash$ cp -r /classes/f08/cs220/labs/lab08/* ~/cs220/lab08/
Note: the rest of this lab assumes you are in your lab08 directory.
Your directory should now contain 8 Java files: Assembler.java, Parser.java, Asm.java, Inst.java, Datum.java, Label.java, ISA.java, and Convert.java. Assembler.java contains skeleton code for the assembler. You will complete this code in this lab. The other Java files are already complete, but you will need to use them in Assembler.java. They are discussed below and, in addition, their application programming interface (API) is online at http://math.hws.edu/mcorliss/teaching/fall08/cs220/labs/lab08/api/.
Your directory should also contain files for running and debugging Larc programs: sim and db. In particular, sim is the text-based Larc simulator and db is the graphical debugger. See lab 7 and lab 6 for details on how to run these. Your directory should also contain a file asm, which is a working assembler. For example, you can use the command ./asm foo.s to assemble the file foo.s. You can use this program to see how the working assembler should function. Finally, your directory should contain a directory called tests, which has several assembly files in it. Use these assembly files to test your assembler. Note: these tests do not cover all of the assembly language features, so you may want to write some additional tests.
There is one exercise for this week's lab due next week.
In this exercise, you will write an assembler for translating Larc assembly code into Larc machine code. For example, once your assembler is working, the following command will convert the assembly program in tests/hello-world.s to a machine code file in tests/hello-world.out:
bash$ java Assembler tests/hello-world.s
bash$
Note: the base name of the machine code file (tests/hello-world) is taken from the name of the assembly file and appended with .out. You can then run the machine code program by running it in the simulator (as in the previous labs):
bash$ ./sim tests/hello-world.out
Hello, world!
bash$
You could also debug the program by swapping ./sim with ./db. The Larc assembly language handout describes the Larc assembly language in detail. Before starting this assignment you should read this handout carefully. You are responsible for implementing an assembler which follows the specifications laid out in the handout. Note: you are only responsible for the base assembly instructions and the load address instruction from the extended list. All the other extended assembly instructions are left as extra credit.
Some parts of the assembler are provided for you: a class for parsing a Larc assembly program (Parser.java); classes for representing a generic assembly item (Asm.java), instruction (Inst.java), data item (Datum.java), and label (Label.java); a class containing some useful variables and methods pertaining to the ISA (ISA.java); and a class containing some useful conversion methods (Convert.java). You will be responsible for completing Assembler.java, which converts the parsed assembly program into a machine program. It contains skeleton code to get you started. You will also be responsible for understanding the files (Parser.java, Asm.java, Inst.java, Datum.java, Label.java, ISA.java, Convert.java) that you do not have to implement. The API (application programming interface) of each of these classes is available from the course webpage at: http://math.hws.edu/mcorliss/teaching/spring08/cs220/labs/lab08/api/.
Parts of Assembler.java are already completed. In particular, Assembler.java already calls a parse method in Parser.java to obtain the parsed program. This is passed back to the assembler in a vector of assembly items (type Asm) called asmProgram. Each item is either an instruction (type Inst), a data item (type Datum), or a label (type Label). (Note: Inst, Datum, and Label extend Asm.) Your assembler must perform multiple passes over this vector of assembly items. A pass is just a loop over the asmProgram vector. Your assembler will then build a vector of binary 16-bit words (type String) called binaryProgram. Finally, this vector will be written to the machine code file. The writing to the machine code file is also already implemented. What is not implemented is converting the assembly program in the vector asmProgram to a binary program, which will be put in binaryProgram. You will need to implement this functionality for lab 8.
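Since binaryProgram holds each 16-bit word as a String, it may help to see one way an integer can be turned into such a string. The following is only a sketch: WordFormat and toBinary16 are hypothetical names that are not part of the provided files, and it assumes each word is written as 16 characters of '0' and '1'. The provided Convert class may already offer an equivalent method, so check the API before writing your own.

```java
// Hypothetical helper for illustration; not part of the provided lab files.
final class WordFormat {
    /** Returns the low 16 bits of value as a 16-character string of '0' and '1'. */
    static String toBinary16(int value) {
        StringBuilder sb = new StringBuilder(16);
        // Walk from the most significant of the 16 bits down to bit 0.
        for (int bit = 15; bit >= 0; bit--) {
            sb.append(((value >> bit) & 1) == 1 ? '1' : '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary16(5));   // 0000000000000101
        System.out.println(toBinary16(-1));  // 1111111111111111
    }
}
```

Negative values come out in two's complement because only the low 16 bits of the int are kept.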
Currently Assembler.java simply prints the assembly program back to the screen in a method called printProgram() (you will eventually comment out the call to printProgram()). This method is provided to show you how to work with the asmProgram vector. You should look at it carefully before writing new code. Each pass that you perform over the assembly program will look similar to the loop in printProgram(). You will probably need to make two to three passes over the assembly program. In the first pass, you will need to compute an address for each instruction, data item, and label, which will be used later. Note: the text section is always placed first in the asmProgram vector, regardless of the order in the assembly file, so you don't have to worry about reordering this vector. In the second pass, you will patch each instruction or data item that uses a label. The reference to the label will be replaced with an immediate value. See the Inst and Datum API for details on how to do this. Finally, in a third pass (which could be merged with the second pass), each assembly item is converted into binary words, which are placed as strings in the vector binaryProgram. Most items will correspond to one 16-bit word. The exceptions are the .asciiz and .space data directives, which may correspond to multiple 16-bit words.
When you compute addresses for each assembly item in the first pass, you will need to keep track of which label names correspond to which addresses. The class Label has a method addToMap for adding a (label name, address) pair to a map so that later you can look up the address of that particular label. To look up the address of a label, you would use getFromMap in the Label class. If the label exists, this method returns a non-negative value; otherwise, it returns a negative value. See the API for more details on how to call these methods.
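The first pass described above can be sketched as follows. This is not the lab's code: AddressPass, Item, and Kind are hypothetical stand-ins for the real Asm, Inst, Datum, and Label classes, whose API is different. The sketch also makes assumptions you should check against the handout: addresses start at 0, a label occupies no words, a string occupies one word per character plus a terminating zero, and the .space argument counts words.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; the lab's classes have their own API.
final class AddressPass {
    enum Kind { INST, LABEL, WORD, ASCIIZ, SPACE }

    static final class Item {
        final Kind kind;
        final String text; // label name or string contents; null otherwise
        final int count;   // word count for SPACE; 0 otherwise
        int address = -1;  // filled in by assignAddresses

        Item(Kind kind, String text, int count) {
            this.kind = kind;
            this.text = text;
            this.count = count;
        }
    }

    /** Number of 16-bit words an item occupies (assumption: one character per word). */
    static int sizeInWords(Item item) {
        switch (item.kind) {
            case LABEL:  return 0;                      // a label only names an address
            case ASCIIZ: return item.text.length() + 1; // characters plus terminating zero
            case SPACE:  return item.count;
            default:     return 1;                      // INST and WORD
        }
    }

    /** Pass 1: give every item an address and record the address of each label. */
    static Map<String, Integer> assignAddresses(List<Item> program) {
        Map<String, Integer> labels = new HashMap<>();
        int address = 0; // assumption: the program is laid out starting at address 0
        for (Item item : program) {
            item.address = address;
            if (item.kind == Kind.LABEL) {
                labels.put(item.text, address);
            }
            address += sizeInWords(item);
        }
        return labels;
    }

    /** A tiny program: two instructions followed by labeled data. */
    static List<Item> sample() {
        List<Item> program = new ArrayList<>();
        program.add(new Item(Kind.INST, null, 0));
        program.add(new Item(Kind.INST, null, 0));
        program.add(new Item(Kind.LABEL, "msg", 0));
        program.add(new Item(Kind.ASCIIZ, "Hi", 0));
        program.add(new Item(Kind.LABEL, "buf", 0));
        program.add(new Item(Kind.SPACE, null, 4));
        program.add(new Item(Kind.LABEL, "end", 0));
        program.add(new Item(Kind.WORD, null, 0));
        return program;
    }

    public static void main(String[] args) {
        System.out.println(assignAddresses(sample()));
    }
}
```

For the sample program, msg maps to address 2, buf to 5, and end to 9 under the assumptions above. A label gets the address of whatever item follows it because it adds nothing to the running address.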
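To make the lookup convention concrete, here is a small stand-in that behaves the way the paragraph above describes: a defined label yields a non-negative address and an undefined one yields a negative value. LabelTable, add, and lookup are hypothetical names; the real addToMap and getFromMap signatures are in the API. Rejecting a duplicate name is a choice made for this sketch and may not match what the provided class does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in that mimics the described behavior of the Label map methods.
final class LabelTable {
    private final Map<String, Integer> addresses = new HashMap<>();

    /** Records a label; returns false if the name was already defined. */
    boolean add(String name, int address) {
        if (addresses.containsKey(name)) {
            return false;
        }
        addresses.put(name, address);
        return true;
    }

    /** Returns the label's address, or -1 if the label was never defined. */
    int lookup(String name) {
        Integer address = addresses.get(name);
        return address == null ? -1 : address;
    }

    public static void main(String[] args) {
        LabelTable table = new LabelTable();
        table.add("loop", 3);
        System.out.println(table.lookup("loop"));    // 3
        System.out.println(table.lookup("missing")); // -1
    }
}
```

In the second pass, a negative result from the lookup is the signal that an instruction or data item refers to a label that does not exist.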
As illustrated by the code above, your assembler will need to handle any errors in the assembly file. The parser (in Parser.java) handles syntactic errors such as a bad register name or unknown operator. But some errors must be caught in Assembler.java. In particular, your assembler must catch the following:
A method assemblyError is provided for reporting errors. You should call it with the appropriate message (a string) if your assembler discovers an error.
In this lab, you can use any class that is part of the Sun Java library. However, you will probably only need Vector. See the Sun Java API for details on these classes. You may not use anything outside of the Sun Java library unless you write it yourself.
For some extra credit, you can add support for the extended assembly instructions (see the assembly language handout for details on the extended instructions). These will need to be translated into several base instructions before beginning the rest of the assembly process. Note: there is one extended instruction you have to implement regardless of whether you choose to do the extra credit: load address (la). For even more extra credit, you can add support for the automatic handling of labels that map to immediates that take up more than 8 bits. For example, a branch to a label that is more than 128 instructions away requires an immediate that is larger than 8 bits and so will not fit into the LIMM field. This can be remedied by replacing the branch with a set of instructions that uses a branch to provide the conditionality and a jump-and-link to perform the actual jump. But be warned: this is very difficult to implement. If you choose not to do this for extra credit, then you can simply print an error when this occurs.
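If you take the simpler route of reporting an error, the check itself is short. The sketch below assumes the 8-bit field holds a two's-complement value, so the legal range is -128 to 127; confirm this against the handout. ImmCheck and its methods are hypothetical names, and in your assembler you would call assemblyError rather than throw an exception.

```java
// Hypothetical helper for illustration; not part of the provided lab files.
final class ImmCheck {
    /** True if value fits in an 8-bit two's-complement field (-128 to 127). */
    static boolean fitsInSigned8(int value) {
        return value >= -128 && value <= 127;
    }

    /** Encodes value as 8 binary digits; throws if it does not fit. */
    static String toSigned8(int value) {
        if (!fitsInSigned8(value)) {
            throw new IllegalArgumentException("immediate out of range: " + value);
        }
        // Masking keeps the low 8 bits, which is the two's-complement encoding.
        String bits = Integer.toBinaryString(value & 0xFF);
        StringBuilder sb = new StringBuilder(8);
        for (int i = bits.length(); i < 8; i++) {
            sb.append('0'); // left-pad to exactly 8 digits
        }
        return sb.append(bits).toString();
    }

    public static void main(String[] args) {
        System.out.println(toSigned8(5));   // 00000101
        System.out.println(toSigned8(-1));  // 11111111
        System.out.println(fitsInSigned8(128)); // false
    }
}
```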
Last, but not least, you must answer the following questions, which you can put in a comment at the top of your program:
Verify that your lab08 folder contains all of the files you created or modified for this lab, then copy your entire lab08 folder to the handin directory ~mcorliss/handin/cs220/username (where username is replaced with your username). For example, if your working directory is ~/cs220/lab08/ then you could do the following:
cp -r ~/cs220/lab08 ~mcorliss/handin/cs220/username
where username is replaced with your particular username (e.g., mcorliss).