Lexical Analysis Construction



The archive hw1.zip contains all the supporting files you need for this assignment. For instructions on installation, see "Setting Up the Project", below.


Over the rest of the semester, you will implement a compiler for the Tiger programming language. Tiger, developed by Prof. Andrew Appel, is a traditional imperative-style programming language, with a syntax similar to that of the ML family (though it lacks advanced features such as polymorphic type inference and first-class functions).

The language is described in Prof. Appel's series of texts, Modern Compiler Implementation in Java, Modern Compiler Implementation in C, and Modern Compiler Implementation in ML. A copy of the Java edition is on library reserve, if you are interested in details of the language. A short yet fairly complete guide to the language has been written by Prof. Stephen Edwards (see "Reading", above).

In this assignment, you will complete the skeleton of a JFlex specification of the lexical analysis for Tiger. To get you started, a skeleton distribution and supporting files are available:

Setting Up the Project

Make sure you have Eclipse and the JDK installed

This document assumes that you already have installed both the JDK (v1.6, minimum) and Eclipse. Install both of those, if you haven't already done so. Second, you are going to want to organize your work for this class in its own folder. Go ahead and make that folder now.

Set up the compiler project in Eclipse

  1. Download hw1.zip, and unzip it in a suitable folder. The result will be a folder, tigerc, containing a number of files and subdirectories:

      --> doc
            -- tiger.pdf                         (Prof. Edwards' language guide)
      --> lib
            -- JFlex.jar                         (lexer generator for Java)
            -- java-cup-11a.jar                  (parser generator for Java)
      --> src
            --> tigerc
                   --> codegen                   (unused, for now)
                   --> intermediate              (unused)
                   --> semant                    (unused)
                   --> syntax
                         --> absyn               (unused)
                         --> parse
                               -- Lexer.java     (an interface used by the parser)
                               -- Tiger.flex     (The JFlex specification file)
                               -- TigerSyms.java (symbol definitions for Tiger)
                   --> util
                         --  ErrorMsg.java       (basic error reporting mechanism)
      --> test
            --> testcases                        (sample Tiger source files)
                   -- comment_badNesting.tig
                   -- merge.tig
                   -- ... (dozens more)
            -- TestLexer.java                    (test driver for the lexer)
      --  build.xml
  2. In Eclipse, select File -> New -> Java Project
  3. The choice of project name is yours. Here, we'll use "TigerC".
  4. Uncheck "use default location", click "Browse", and navigate to the root of the folder containing the initial distribution, tigerc. Click "open". If you did it correctly, the "Location" bar should give the full directory path of the project, ending in the folder tigerc.
  5. Click Next. In the new dialogue box, under the Source tab, you should see the directory structure:
      --> doc
      --> lib
      --> src
      --> test
      --  build.xml

Available Ant tasks

This project is organized through the Apache Ant software build system, which is installed with the Eclipse distribution. Though powerful, the learning curve for this system is extremely steep, so for the sake of your focus on this project, a complete build file, build.xml, is included.

Through the build file, Ant is able to handle crucial aspects of the construction of a major piece of software, including the management of class paths, integration with third-party libraries, and automation of common tasks.

To use the system, right-click on build.xml in the PackageExplorer, and select Run As --> Ant Build .... You will get a dialogue box listing a number of "targets" to execute. For this project, the important ones are:

Your Job

Complete the lexical specification of Tiger, as begun in the skeleton Tiger.flex file. The keywords and operators in the language are given in the reference document, and you will need to make sure that you have implemented recognition of all of them. In addition, pay attention to the following:

Sample Input/Output

The source file test1.tig:

/* an array type 
     /* and an array variable */
	type  arrtype = array of int
	var arr1:arrtype := arrtype [10] of 0
On my solution, this gives the stream of tokens

LET 54
ID("arrtype") 65
EQ 73
OF 81
ID("int") 84
VAR 89
ID("arr1") 93
ID("arrtype") 98
ID("arrtype") 109
INT(10) 118
OF 122
INT(0) 125
IN 127
ID("arr1") 131
END 136
EOF 140

Turn In:

Your completed Tiger.jflex file, both a paper copy and electronically.

John H. E. Lasseter