Lexical Analysis Construction



The archive tigerc.zip contains all the supporting files you need for this assignment. For instructions on installation, see "Setting Up the Project", below.


Over the rest of the semester, you will implement a compiler for the Tiger programming language. Tiger, developed by Prof. Andrew Appel, is a traditional imperative-style programming language, with a syntax similar to that of the ML family (though it lacks advanced features such as polymorphic type inference and first-class functions).

The language is described in Prof. Appel's series of texts, Modern Compiler Implementation in Java (1st Ed.), Modern Compiler Implementation in C, and Modern Compiler Implementation in ML. A short yet fairly complete guide to the language has been written by Prof. Stephen Edwards (see "Reading", above).

In this assignment, you will complete a JFlex specification of the lexical analyzer for Tiger. To get you started, a skeleton distribution and supporting files are provided in tigerc.zip.

Setting Up the Project (Eclipse-specific instructions)

Make sure you have Eclipse and the JDK installed

This document assumes that you have already installed both the JDK (v1.6 or later) and Eclipse; install them now if you haven't already done so. You will also want to keep your work for this class in its own folder, so go ahead and make that folder now.

If you'd rather work with a different IDE than Eclipse, please let me know. There's nothing in this project that requires Eclipse. You can even run everything from the command line, though the setup tasks will be different.

Set up the compiler project in Eclipse

  1. Download tigerc.zip, and unzip it in a suitable folder. The result will be a folder, tigerc, containing a number of files and subdirectories:

      --> doc
            -- tiger.pdf                         (Prof. Edwards' language guide)
      --> lib
            -- jflex-1.6.1.jar                   (lexer generator for Java)
            -- java-cup-11b.jar                  (parser generator for Java)
      --> src
            --> tigerc
                   --> syntax
                         --> parse
                               -- Lexer.java     (an interface used by the parser)
                               -- Tiger.flex     (The JFlex specification file)
                               -- TigerSyms.java (symbol definitions for Tiger)
                   --> util
                         --  ErrorMsg.java       (basic error reporting mechanism)
      --> test
            --> testcases                        (sample Tiger source files)
                   -- comment_badNesting.tig
                   -- merge.tig
                   -- ... (dozens more)
            -- TestLexer.java                    (test driver for the lexer)
      --  build.xml
  2. In Eclipse, select File -> New -> Project, and select the "Java/Java Project" option. (Like everything concerning Eclipse, the exact menu text can vary across platforms and versions!)
  3. The choice of project name is yours. Here, we'll use "TigerC".
  4. Uncheck "use default location", click "Browse", and navigate to the root of the folder containing the initial distribution, tigerc. Click "open". If you did it correctly, the "Location" bar should give the full directory path of the project, ending in the folder tigerc.
  5. Click Next. In the new dialogue box, under the Source tab, you should see the directory structure:
      --> doc
      --> lib
      --> src
      --> test
      --  build.xml

    Click Finish, and the project should open in Eclipse.

  6. It is possible (even likely) that you'll see a build error in TestLexer.java, related to a failure to import java_cup.runtime.Symbol. If so, you'll need to configure the build path for the project:

    1. Right-click on the project name, and select Build Path -> Configure Build Path
    2. Click on the Libraries tab, then click on "Add External JARs".
    3. In the tigerc/lib directory, select java-cup-11b.jar.
    4. Repeat this for jflex-1.6.1.jar, as well.

Available Ant tasks

This project is organized through the Apache Ant software build system, which is installed with the Eclipse distribution. Though Ant is powerful, its learning curve is steep, so to keep your focus on this project, a complete build file, build.xml, is included.

Through the build file, Ant is able to handle crucial aspects of the construction of a major piece of software, including the management of class paths, integration with third-party libraries, and automation of common tasks.

To use the system, right-click on build.xml in the Package Explorer, and select Run As -> Ant Build.... You will get a dialogue box listing a number of "targets" to execute. For this project, the important ones are:

Your Job

Complete the lexical specification of Tiger, as begun in the skeleton Tiger.flex file. The keywords and operators in the language are given in the reference document, and you will need to make sure that you have implemented recognition of all of them. In addition, pay attention to the following:

Testing Your Work

The test/testcases subfolder has dozens of Tiger source code files that you can use to test your work. However, doing so from within Eclipse is cumbersome. It's a lot easier to test from the command line, instead. To do so, make sure you've navigated to the subfolder tigerc/bin. From there, invoke the test driver with the following command:

java -cp $CLASSPATH:lib/java-cup-11b.jar test.TestLexer test/testcases/test1.tig

with "test1.tig" replaced by whatever Tiger source file you're testing with.
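
The command above is written for a Unix-style shell: it expands $CLASSPATH and joins class path entries with ":", whereas Java on Windows expects ";". As a rough sketch of how the same command line could be assembled portably, the hypothetical class below (LexerCommand is not part of the tigerc distribution) builds the argument list from a base class path, a separator, and a test file name. The jar path and the test.TestLexer class name are copied from the command above; everything else is an assumption made for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the tigerc distribution.
final class LexerCommand {

    // Relative jar path, copied from the command shown in the text.
    static final String CUP_JAR = "lib/java-cup-11b.jar";

    /**
     * Builds the argument list for one run of the test driver.
     * baseClassPath plays the role of $CLASSPATH; separator is ":" on
     * Unix-like systems and ";" on Windows.
     */
    static List<String> build(String baseClassPath, String separator, String tigFile) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(baseClassPath + separator + CUP_JAR);
        cmd.add("test.TestLexer");
        cmd.add(tigFile);
        return cmd;
    }

    public static void main(String[] args) {
        // Fall back to the current directory when CLASSPATH is unset.
        String base = System.getenv("CLASSPATH");
        if (base == null || base.isEmpty()) {
            base = ".";
        }
        // Print one command line per Tiger source file named on the command line.
        for (String tigFile : args) {
            System.out.println(String.join(" ", build(base, File.pathSeparator, tigFile)));
        }
    }
}
```

The resulting list could be handed to java.lang.ProcessBuilder to run the driver over several test files in a row; here it is only printed, so that the sketch makes no assumption about where the compiled classes live.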

Sample Input/Output

The source file test1.tig:

/* an array type 
     /* and an array variable */
*/
let
	type  arrtype = array of int
	var arr1:arrtype := arrtype [10] of 0
in
	arr1
end

With my solution, this gives the stream of tokens

LET 54
ID("arrtype") 65
EQ 73
OF 81
ID("int") 84
VAR 89
ID("arr1") 93
ID("arrtype") 98
ID("arrtype") 109
INT(10) 118
OF 122
INT(0) 125
IN 127
ID("arr1") 131
END 136
EOF 140
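
The first two lines of test1.tig open one comment inside another, and the test cases include a file named comment_badNesting.tig, so nested comments deserve some care: in Tiger, comments may nest, and a comment ends only when every "/*" has been matched by a "*/". The plain-Java sketch below illustrates the depth counter that a lexer needs for this. It is not JFlex code and not the reference solution; the class and method names are hypothetical, and it deliberately ignores string literals, which a real lexer would have to handle separately.

```java
// Hypothetical illustration, not part of the tigerc skeleton: it mirrors
// the counter that a comment-handling lexer state would maintain.
final class CommentNesting {

    /**
     * Scans the text and returns the comment depth at end of input.
     * 0 means every comment was closed; a positive value means that
     * many comments were still open (an unterminated comment).
     */
    static int finalDepth(String text) {
        int depth = 0;
        int i = 0;
        while (i < text.length()) {
            if (text.startsWith("/*", i)) {
                depth++;            // entering a (possibly nested) comment
                i += 2;
            } else if (depth > 0 && text.startsWith("*/", i)) {
                depth--;            // leaving the innermost open comment
                i += 2;
            } else {
                i++;                // any other character: just move on
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        // Properly nested: both comments are closed before "let".
        System.out.println(finalDepth("/* a /* b */ */ let"));
        // Badly nested: the outer comment is never closed.
        System.out.println(finalDepth("/* a /* b */ let"));
    }
}
```

In a JFlex specification the same idea is usually expressed with a separate lexical state for comments and an integer field in the generated class; a non-zero depth at end of file is then the natural place to report an error through ErrorMsg.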

Turn In:

Your completed Tiger.flex file, both on paper and electronically.

John H. E. Lasseter