CPSC 220, Fall 2012
Lab 9b: Larc Assembler, Part 2

You have already done the first part of Lab 9, which involved writing a basic Larc assembler. That part of the lab is due today. If you are working with a partner, please let me know, and tell me where to look for your work. This part of the lab counts for 15 points, and you get full credit if your program can assemble a few of the test programs and produce machine language programs that work correctly. At this time, I am not looking for comments, and I am not worrying much about programming style.

The completed lab is due next week. [The due date has been changed to November 27, the Tuesday after Thanksgiving.] For the second part, you will add some features to your assembler. How much you have to add depends on whether you are working with a partner. The second part of the lab is worth another 25 points, and there is an opportunity for extra credit.

Added November 19: A solution to the first part of the lab is available in /classes/cs220/lab9-files. You will also find a new folder of test programs, tests-b, on which you can try your assembler. There are several test programs that have errors that should be detected by the assembler, and there are several that test some features of the advanced assembler. Part b of the lab will be due after Thanksgiving. I plan to collect it on Wednesday morning, November 28, but there will be a new lab for you to work on for Tuesday, November 27.

Note to people working with a partner: I have removed one requirement from the list of extra things you have to implement.

The Basic Assignment, for Everyone

Make a copy of Assembler.java, and call the copy AdvancedAssembler.java. Write the improved assembler for Part 2 of the lab in AdvancedAssembler.java. The improved assembler should support the following features:

For labels, you should take the easy route, which is to treat all labels as if they require more than 8 bits. The technique that you will use for big values will work for small values as well. It will just use more instructions than are really necessary. However, you should not generate extra instructions for li or ALU instructions that contain immediate values in the range -128 to 127. In such cases, you only need one instruction for the li case, and two instructions for the ALU case.
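The range check described above can be sketched as follows. The class and method names here are hypothetical and are not part of the lab files. The counts of 1 and 2 for small immediates come from the paragraph above. The counts of 5 and 6 for big immediates are an assumption: they suppose that a big value is loaded with the five-instruction generateLoadBig method given later in this handout, followed by the ALU instruction itself in the ALU case.

```java
class ImmediateSize {
    /** Returns true if value fits in a signed 8-bit immediate (-128 to 127). */
    static boolean fitsInSigned8(int value) {
        return value >= -128 && value <= 127;
    }

    /** Instruction count for li: 1 for a small immediate, otherwise 5 (assumed). */
    static int instructionsForLi(int immediate) {
        return fitsInSigned8(immediate) ? 1 : 5;
    }

    /** Instruction count for an ALU instruction with an immediate operand:
     *  2 for a small immediate, otherwise 5 + 1 = 6 (assumed). */
    static int instructionsForAluImmediate(int immediate) {
        return fitsInSigned8(immediate) ? 2 : 6;
    }

    public static void main(String[] args) {
        System.out.println(instructionsForLi(100));
        System.out.println(instructionsForLi(1000));
        System.out.println(instructionsForAluImmediate(-128));
        System.out.println(instructionsForAluImmediate(128));
    }
}
```

A helper like this could be shared by the first pass, which only needs the counts, and by the pass that generates code, so that the two passes cannot disagree.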

In this version of the assembler, one assembly language instruction can produce more than one machine language instruction. You will have to modify the first pass of your assembler to take that into account. You will have to modify the second (or third) pass to actually generate the extra machine language code.
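As a rough illustration of the first-pass change, here is a minimal sketch that computes label addresses when each source line can expand to a different number of machine language instructions. The SourceLine class and the labelAddresses method are hypothetical and do not match anything in Assembler.java. The sketch also assumes that addresses count instructions starting from 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FirstPassSketch {
    /** Hypothetical pre-parsed source line: an optional label, plus the number
     *  of machine language instructions that the line expands to. */
    static class SourceLine {
        final String label;    // null if the line has no label
        final int expansion;   // 0 for a line that contains only a label
        SourceLine(String label, int expansion) {
            this.label = label;
            this.expansion = expansion;
        }
    }

    /** Assigns each label the address of the first machine language
     *  instruction generated at or after the line where the label appears. */
    static Map<String, Integer> labelAddresses(List<SourceLine> lines) {
        Map<String, Integer> addresses = new HashMap<>();
        int address = 0;
        for (SourceLine line : lines) {
            if (line.label != null)
                addresses.put(line.label, address);
            address += line.expansion;  // advance by the expanded size, not by 1
        }
        return addresses;
    }

    public static void main(String[] args) {
        List<SourceLine> lines = new ArrayList<>();
        lines.add(new SourceLine("start", 1));
        lines.add(new SourceLine(null, 5));
        lines.add(new SourceLine("loop", 6));
        lines.add(new SourceLine(null, 1));
        lines.add(new SourceLine("end", 0));
        System.out.println(labelAddresses(lines));
    }
}
```

The only difference from a one-instruction-per-line assembler is the amount added to the address counter for each line.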

You will often need to generate machine language code for loading a (potentially) big number into a register. This can be done with five machine language instructions. It can be done with the following method, which you should copy into AdvancedAssembler.java.

    /**
     * Generates 5 machine language instructions and adds them to binary program.
     * The instructions will load the number bigNumber into register number
     * destinationRegister.  Registers 12 and 13 are used for scratch work.
     * @param bigNumber the number to be loaded into destinationRegister.  Only
     *   the low-order 16 bits of this number are used; the other 16 bits are ignored.
     * @param destinationRegister a number in the range 0 to 15 inclusive giving
     *   the number of the register into which bigNumber is to be loaded.
     *   Note that it is ok for the destination register to be register 12 or 13.
     * @throws IllegalArgumentException if destinationRegister is out of range
     */
    private static void generateLoadBig(int bigNumber, int destinationRegister) {
        if (destinationRegister < 0 || destinationRegister > 15)
            throw new IllegalArgumentException("Bad register number: " + destinationRegister);
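        // Work with the bitwise complement, because the final NOR inverts its result.
        int high = (~bigNumber >> 8) & 0xFF;  // complemented bits 15..8
        int low = (~bigNumber) & 0xFF;        // complemented bits 7..0
        int ins;
        // LUI: put low into the upper byte of register 12.
        ins = ISA.LUI << 12 | 12 << 8 | low;
        binaryProgram.add(Convert.decimalToBinary( ins, 16 ));
        // LI: load the shift amount, 8, into register 13.
        ins = ISA.LI << 12 | 13 << 8 | 8;
        binaryProgram.add(Convert.decimalToBinary( ins, 16 ));
        // SRL: shift register 12 right by 8 bits, moving low into the lower byte.
        ins = ISA.SRL << 12 | 12 << 8 | 12 << 4 | 13;
        binaryProgram.add(Convert.decimalToBinary( ins, 16 ));
        // LUI: put high into the upper byte of register 13.
        ins = ISA.LUI << 12 | 13 << 8 | high;
        binaryProgram.add(Convert.decimalToBinary( ins, 16 ));
        // NOR: destination = ~(register 13 | register 12), the low 16 bits of bigNumber.
        ins = ISA.NOR << 12 | destinationRegister << 8 | 13 << 4 | 12;
        binaryProgram.add(Convert.decimalToBinary( ins, 16 ));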
    }

In class on Monday, we will discuss how this method works, and we will discuss how to implement some of the other things that you need to do.
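If you want to convince yourself beforehand that the five instructions produce the right value, the following sketch imitates them with ordinary Java arithmetic. It is not part of the assembler, and the class name is hypothetical. It assumes that registers hold 16 bits, that LUI places its 8-bit immediate in the upper byte and zeros the lower byte, and that SRL is a logical right shift.

```java
class LoadBigSimulation {
    /** Imitates the five instructions emitted by generateLoadBig and returns
     *  the 16-bit value that would end up in the destination register. */
    static int simulate(int bigNumber) {
        int high = (~bigNumber >> 8) & 0xFF;  // same computation as generateLoadBig
        int low = (~bigNumber) & 0xFF;
        int r12, r13;
        r12 = (low << 8) & 0xFFFF;        // LUI: register 12 = low in the upper byte
        r13 = 8;                          // LI:  register 13 = 8
        r12 = (r12 >>> r13) & 0xFFFF;     // SRL: register 12 = register 12 >>> register 13
        r13 = (high << 8) & 0xFFFF;       // LUI: register 13 = high in the upper byte
        return ~(r13 | r12) & 0xFFFF;     // NOR: destination = ~(register 13 | register 12)
    }

    public static void main(String[] args) {
        // Check every 16-bit value against the expected result.
        for (int n = 0; n <= 0xFFFF; n++) {
            if (simulate(n) != n)
                throw new AssertionError("Mismatch for " + n);
        }
        System.out.println("All 16-bit values load correctly.");
    }
}
```

The key idea is that the method builds the complement of the number in two halves, and the final NOR both combines the halves and undoes the complement.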

Extras for Programming Pairs

For people working in pairs, there are a few more things that your assembler should support:

Note: The original file Inst.java has a bug that prevents the parser from recognizing the assembly language instruction not. You should get a copy of the fixed version from this link or from /classes/cs220/lab9-files.

Note: NOT A = A NOR 0,   A OR B = NOT (A NOR B),   A AND B = (NOT A) NOR (NOT B).

There are a few test programs for this part of the lab in /classes/cs220/lab9-files/tests-c.
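The NOR identities in the note above can be checked with a few lines of Java. This is only a sketch, with a hypothetical class name, and it assumes 16-bit values, so every result is masked to 16 bits.

```java
class NorIdentities {
    /** 16-bit NOR, the only primitive used below. */
    static int nor(int a, int b) {
        return ~(a | b) & 0xFFFF;
    }

    /** NOT A = A NOR 0 */
    static int not(int a) {
        return nor(a, 0);
    }

    /** A OR B = NOT (A NOR B) */
    static int or(int a, int b) {
        return not(nor(a, b));
    }

    /** A AND B = (NOT A) NOR (NOT B) */
    static int and(int a, int b) {
        return nor(not(a), not(b));
    }

    public static void main(String[] args) {
        int a = 0x0F0F, b = 0x00FF;
        System.out.println(Integer.toHexString(not(b)));
        System.out.println(Integer.toHexString(or(a, b)));
        System.out.println(Integer.toHexString(and(a, b)));
    }
}
```

Counting the calls to nor gives the number of NOR operations each identity needs: one for NOT, two for OR, and three for AND.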

For Extra Credit

The extra credit part of the assignment is to eliminate some of the unnecessary machine language instructions that are generated by the non-extra-credit version. If you do any extra credit work, be sure to document exactly what you have done in a comment at the top of the program. (Note: I have not done any of this extra credit work myself.)

It is annoying that the assembler generates extra machine language instructions for labels that could be implemented using an 8-bit LIMM and just one instruction. However, it is complicated to optimize the number of instructions for these cases. The problem is that you don't necessarily know the location associated with the label during the first assembly pass, when you are counting machine language instructions. In some cases, of course, you do know, and one improvement would be to eliminate the extra machine language instructions in those cases where you know during the first pass that you won't need them. This would, at least, provide more optimal code for backward jumps.

For more extra credit, you could be more ambitious than that. One idea is to add another pass to the assembler, after the first pass. The first pass does a "worst-case" calculation of machine language instruction counts, and calculates label addresses based on the worst case. The second pass could recalculate the label addresses, using the worst-case information to decide how many machine language instructions to generate for la and for branch instructions. Since the recalculation can only shrink memory addresses, you won't get any invalid results. (However, you might still have some cases in which you use more machine language instructions than you need.)
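One possible shape for this extra pass is sketched below. It handles only la, not branch instructions. The Item class and both methods are hypothetical. It assumes that a label can be loaded with one instruction when its address is at most 127, and with five instructions otherwise.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ShrinkPassSketch {
    /** Hypothetical pre-parsed source line. */
    static class Item {
        final String defines;  // label defined on this line, or null
        final String loads;    // label loaded by an la on this line, or null
        final int fixedSize;   // instruction count when the line is not an la
        Item(String defines, String loads, int fixedSize) {
            this.defines = defines;
            this.loads = loads;
            this.fixedSize = fixedSize;
        }
    }

    /** Size of one line. With no known addresses, every la is assumed to need 5. */
    static int size(Item item, Map<String, Integer> known) {
        if (item.loads == null)
            return item.fixedSize;
        Integer target = (known == null) ? null : known.get(item.loads);
        return (target != null && target <= 127) ? 1 : 5;
    }

    /** Computes label addresses, sizing each la from the addresses in known. */
    static Map<String, Integer> addresses(List<Item> program, Map<String, Integer> known) {
        Map<String, Integer> result = new HashMap<>();
        int address = 0;
        for (Item item : program) {
            if (item.defines != null)
                result.put(item.defines, address);
            address += size(item, known);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Item> program = new ArrayList<>();
        program.add(new Item("start", "data", 0));  // la of a nearby label
        program.add(new Item(null, "far", 0));      // la of a distant label
        program.add(new Item("data", null, 1));
        program.add(new Item(null, null, 200));     // stands in for 200 ordinary instructions
        program.add(new Item("far", null, 1));
        Map<String, Integer> worst = addresses(program, null);    // first pass
        Map<String, Integer> shrunk = addresses(program, worst);  // recalculation
        System.out.println("worst case: " + worst);
        System.out.println("after recalculation: " + shrunk);
    }
}
```

The recalculation is safe because a label whose worst-case address is at most 127 can only move to a smaller address. The code generation pass would have to make the same size decisions, using the same worst-case table, so that the generated code matches the recalculated addresses.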

You might even think of other ways to optimize the code that you generate!