CPSC 433 Compilers Spring 2009

Project Assignment 5: Building an Optimizer

Introduction

In this project you will implement an optimizer for the Bantam Java compiler. The optimizer takes as input the root of the class hiearchy tree (i.e., the ClassTreeNode for Object) and builds optimized control flow graphs (one for each subroutine) representing the program. These control flow graphs are passed to the code generator, which uses them to generate the target assembly code.

Your optimizer will perform some high-level optimization on the abstract syntax tree as it converts it to control flow graphs as well as some low level optimization on each control flow graph. Unlike in previous projects, your compiler will target the x86 architecture (Intel's architecture and the one used in most machines including our lab machines) rather than the MIPS architecture. This will allow you to obtain accurate performance results in order to evaluate the effectiveness of your optimizer.

As in the previous projects, you will be writing the optimizer in Java from scratch. You will be given some utility code, which will help you in completing the project, but much of the design of the optimizer will be your own.

This project is due at the start of the final exam, which is in just over three weeks, i.e., on Saturday, May 9th at 1:30PM. However, a portion of your solution will be due next week by the start of lab and in two weeks at the start of lab (details are specified below).

Setup

Create a pa5 directory within the ~/cs433 directory for use in this project. Change to the newly-created pa5 directory. Note: your working directory must be ~/cs433/pa5 in order to work correctly with some of the tools in the project. So make sure you don't make a mistake in creating this directory.

Once you have finished creating directory pa5 and have moved into it, then execute the following command:

bash$ make -f /classes/s09/cs433/projects/pa5/Makefile start

This command creates several files and directories you will need. In particular, it creates a Makefile for building the compiler and turning in the project; a Java file Main.java, which contains the main method of the compiler; and several directories, which contain various Java packages needed by the compiler:

The API (Application Programming Interface) for each Java class used in this project is online at http://math.hws.edu/mcorliss/teaching/spring09/cs433/projects/pa5/api/ (generated using javadoc). You should use this as a resource in completing this project.

Project Details

As mentioned above, the optimizer is passed the root node of the class hierarchy tree from the semantic analyzer. From the root node, we can of course reach any other class tree node. Each node contains meta-information about the class as well as a link to the AST node representing that particular class. Any expression AST nodes embedded within the class AST node is annotated with its type, which was done by the semantic analyzer. The optimizer can assume that the program it is passed is a legal, Bantam program. The optimizer traverses the class hierarchy tree and AST nodes, performing several optimizations along the way, converts the representation into control flow graphs (one for each subroutine in the program), performing some optimization on this representation as well, before passing this representation on to the code generator.

In this project, we will be targeting a different architecture than MIPS, namely x86 (the Intel architecture, which is also used by AMD). The reason for the switch is so that we can get performance results. In other words, because x86 is the architecture of the machines we will be using, we can actually run our compiled programs natively rather than via a simulator (e.g., SPIM). For the most part, this change should not effect you, as you are being supplied with an x86 code generator. However, the process for compiling and running Bantam Java is slightly different as outlined below in the section titled "Compiling Your Compiler".

We will be using global or intra-procedural analysis. We will not optimize across procedures. Your optimizer will transform a single subroutine at a time, using the appropriate intermediate representation for that subroutine.

Control flow graph. We will be using two intermediate representations in the optimizer. The first is the AST, which you are undoubtedly familiar with after doing the earlier projects. The second is a control flow graph. If you recall from class, a control flow graph has nodes containing basic blocks and directed edges that indicate control transfers between blocks. A basic block is just a list of instructions with no interior control flow (only the last instruction can be a control instruction, control can only transfer to the first instruction). A class for representing basic blocks is available in package cfg called BasicBlock. Note: there is no class for representing control flow graphs, instead you will use the starting basic block object for each subroutine (initialization subroutines and user-defined methods) to represent each control flow graph. See the API for more details.

The basic block will be made up of 3-Address Code instructions, which are instructions with at most 3 operands (a destination and two source operands). These are also called quadruples (or quads) since there are 4 pieces of possible information: operator, destination, source 1, and source 2. A class for representing quads is available in package cfg. See the API for more details.

High-level optimization. Your optimizer will need to perform some high-level optimization on the AST as it translates the AST into a control flow graph. In particular, your optimizer will need to do the following optimizations:

For extra credit, you can add other AST-level optimizations (see below for more details).

Low-level optimization. You will also need to perform some low-level optimization as well. In particular, you will need to perform a data flow analysis on each control flow graph in order to find redundant expressions. You will use this analysis to perform common subexpression elimination in each control flow block both within a basic block and across basic blocks (although within a single subroutine). Data flow analysis and common subexpression elimination is described in the book and will be described in class.

For extra credit you can add other analyses and transformations. For example, the reference compiler also does constant folding, constant propagation, dead code elimination, and dead assignment elimination.

Compiling Your Compiler

To compile your compiler, you can use the Makefile as in the last four projects. For example, the following will build the compiler:

bash$ make

If your compiler was successfully built, then a shell script called bantamc-opt is created (note: "bantamc-opt" not "bantamc"), which will run your compiler. You can use this compiler as in PA4, PA3, PA2 and PA1 to fully compile Bantam Java programs, and eventually generate optimized target code (after you complete PA5).

For example, the following command runs the compiler on the Bantam Java program Test.btm with optimization enabled.

bash$ bantamc-opt -opt 1 Test.btm

This will create an assembly file called out.s, which can be assembled and run (which is discussed below). The "-opt" flag is used to enable and disable optimization via the command-line as well as to toggle the level of optimization. The number after the "-opt" specifies the optimization level. 0 turns optimization off (which is the same as not specifying "-opt" at all), 1 uses the optimization program representation (a control flow graph) but does not optimizations, 2 turns on high-level optimizations at the AST level, and 3 turns on high- and low-level optimizations at both the AST level and control flow graph level. Note: this command will fail if your optimization level 1 is not implemented (you will get an exception). To enable more aggressive optimization you can increase the optimization level. For example, the following enables the most aggressive optimization:

bash$ bantamc-opt -opt 3 Test.btm

To print out the control flow graph given any non-zero level of optimization, we can use the "-so" flag, which means stop after optimization (code generation is not done). You can look at the printed control flow graph to determine if your optimizer is working correctly. (Note: "-so" does nothing when "-opt" is not used or when the optimization level is 0.) Here is an example using the "-so" flag:

bash$ bantamc-opt -opt 3 -so Test.btm

You can use the "-so" flag to see what kinds of transformations your optimizer is able to do by varying the optimization level. These will be evident by looking at the differences in the printed control flow graphs.

You can also run the reference compiler to compare with your own compiler. It is located in /classes/s09/cs433/bin (which is in your command path) and called bantamc-opt-ref. For example, the following would run the reference optimizer on Bantam Java program Test.btm with an optimization level of 3:

bash$ bantamc-opt-ref -opt 3 -so Test.btm

Be aware: with aggressive optimization enabled, the reference compiler can take a long time to compile longer programs (e.g., a minute or two).

You can even compare your compiler's output with the reference compiler's output using the Unix tool diff. For example, the following commands do this comparison:

bash$ bantamc-opt -opt 3 -so Test.btm > mine.txt
bash$ bantamc-opt-ref -opt 3 -so Test.btm > ref.txt
bash$ diff mine.txt ref.txt
bash$

Unlike in previous projects, you will need to assemble your compiled program before you can run it. We will use an existing assembler, which is part of the gcc C compiler toolset. To compile and assemble a program you would do the following:

bash$ bantamc-opt -opt 3 -o test.s Test.btm
bash$ gcc -o test test.s /classes/s09/cs433/lib/runtime.s

The first command compiles Test.btm with an optimization level of 3 and puts the assembly code into a file called test.s (specified via the "-o" flag). The second command uses gcc to assemble test.s and build a binary (or machine code file) called test (specified via the "-o" flag). Note: to assemble the compiled program we also need to include the (x86) runtime system, which is located at /classes/s09/cs433/lib/runtime.s.

To run the program, we could then do the following (we need to be in the same directory where test was built):

bash$ ./test

Because we're using x86, we can run the program natively, i.e., without a simulator like SPIM.

Testing Your Optimizer

You should make sure you thoroughly test out your optimizer One way to test it out is to supply the "-so" flag to the Bantam Java compiler, which stops compilation after optimization and prints the control flow graph. You can verify that this control flow graph looks correct. You can also compare it with the reference compiler (see the paragraph above for how to do this using the Unix command diff). You can also view the different control flow graphs as you toggle the optimization level to see how it changes.

But eventually you will want to actually compile and run some programs to make sure that the compiled program is not broken and to see how well your optimizer does in terms of performance. For example, the following compiles a program, assembles it, and then calculates the running time of the program:

bash$ bantamc-opt -opt 3 -o test.s Test.btm
bash$ gcc -o test test.s /classes/s09/cs433/lib/runtime.s
bash$ time ./test

The time command will print out the running time of test (it prints out 3 numbers -- you will want the "real" time). You can compile Test.btm with varying levels of optimization to see how the running time changes.

In the directory /classes/s09/cs433/examples/opt/ are some optimization benchmarks, which you can use to test your optimizer. They are non-interactive (even the ones that look like they should be interactive have been made non-interactive) so you will not have to worry about the speed at which you enter input. They also run for 5 seconds to a minute so that it will be clear how well your optimizer performs. You should test your optimizer on these benchmarks and make sure that it gets some improvement on all of them for optimization levels 1 and 2, and that it never significantly slows down any program. To see how they are supposed to function, which is sometimes not obvious, compile and run them using the reference compiler.

When testing performance, you should make sure you are not running any other work-intensive programs as this could perturb your results.

You will be graded in part on the effectiveness of your optimizer, so make sure you take this testing very seriously.

Questions

Answer the following non-technical questions regarding your thoughts on this assignment. These are required so make sure you do them.

Extra Credit

Some extra credit will go to the student with the most effective optimizer. For more extra credit, you can add some additional optimizations. For example, you could perform constant propagation, constant folding, or dead code elimination. The amount of extra credit will vary greatly based on the amount of work that you do and the effectiveness of your optimizations.

Checkpoints

As mentioned in the syllabus, multi-week projects will have checkpoints. Although this is a three-week project, a portion of the project will be due in one week and another portion will be due in two weeks to ensure that you are keeping up with the workload.

Checkpoint 1. For checkpoint 1, which is due in one week, you should implement optimization level 1. In other words, your optimizer should be able to convert the AST into a control flow graph, which the code generator will use to generate code.

Checkpoint 2. For checkpoint 2, which is due in two weeks, you should implement optimization level 2. In other words, your optimizer should perform some high-level optimization including dynamic dispatch removal, instanceof check removal, and divide by zero check removal.

In the final week, you will implement optimization level 3, i.e., removing redundant expressions in the control flow graph.

When you have completed either checkpoint, you should submit it, by following the procedure outlined below underneath the handin header (i.e., "make submit"). You can resubmit a checkpoint at any point before the due date, however, you cannot turn in a checkpoint late.

For this project, each checkpoint will be worth 10% of your total grade for PA5 (both checkpoints together are worth 20%).

Important: these checkpoints represent the minimum amount of work you need to do in the first two weeks in order to successfully complete the entire project. You may want to do more work each week, so that you will not have a busy third project week.

Handin

This project is due in 3 weeks and 2 days, at the start of the final exam on Saturday, May 9th at 1:30PM. Note: you will not be able to use your late days for this project so make sure you submit it on time. Note also: a portion of this project is due in one week at the start of lab on April 23rd and a second portion is due in two weeks at the start of lab on April 30th (see the description above about checkpoints). When you are ready to submit your solution, verify that your pa5 folder contains all of the files you created or modified for this project, then move to your project directory, and execute the following command:

bash$ make submit

This command will copy all of the files in your pa5 directory to your handin folder (which is at /classes/s09/cs433/handin/<username>/pa5) where <username> is replaced with your username.

Note: you can do this more than once, however, be aware that it will overwrite your previously submitted files, each time you issue this command.


Good luck and have fun!