Due on Wednesday, May 6th

Software

Background

The Expression datatype represents the abstract syntax of a purely functional language, with integer and boolean values and explicit, monomorphic type annotations on every variable and function declaration.

datatype Type = 
  TInt  
  | TBool  
  | TFunc of Type * Type
;

datatype BOp = Add | Sub | Mul | Div | Mod
  | Eq | Leq | Lt | Geq | Gt
  | And | Or
  
datatype UOp = Min | Not 

datatype Expression =
  True | False
  | Number of int
  | Fn of string * Type * Type * Expression
                      (*  Fn (, , ,body) *) 
  | Binary of Expression * BOp * Expression
  | Unary of UOp * Expression
  | Cond of Expression * Expression * Expression
  | Call of Expression * Expression
  | Variable of string
  | Lett of string * Type * Expression * Expression
  | Lettrec of string * Type *  Expression * Expression
;

We also need an implementation of environments

datatype 'a Environment = Env of (string * 'a) list;

and of "values"

datatype Value = 
  Integer of int 
  | Boolean of bool
  | Closure of string * Expression * (Value Environment ref)
;

Our interpreter is implemented with the function interpret, which takes an Expression and produces a value. If the expression is ill-typed, a RuntimeBug exception will be thrown, but with correct typechecking ahead of time, these cases will never be encountered.

This distribution is quite no frills, in order to simplify things. In particular, there isn't even a parser for the language's concrete syntax: all you have here is the definition of the abstract syntax. Still, that's enough to test things and get a feel for what's going on, by building a few AST objects manually. The following examples are included in the distributed source file:

val a = Number(42);

(* (2 * -6) + (4*3)  *)
val b = Binary(Binary(Number(2),Mul, Number(~6)), 
               Add,
               Binary(Number(4),Mul,Number(3)));
               
(* 4 - 3 - 3 - 1 *)
val c = Binary(Binary(Binary(Number(4),Sub,Number(3)),
                      Sub,
                      Number(3)),
               Sub,
               Number(1));


val d = Binary(b,Gt,c);  (* ((2 * -6) + (4*3)) > (4 - 3 - 3 - 1) *)

(*
val eF = 
 let val rec fact = fn n => if (n = 1) then 1 else n * (fact (n-1))
 in
    fact 
 end   
*)          
      
val e =  Lettrec("fact",TFunc(TInt,TInt),
                        (Fn("n",  TInt, TInt,
                         Cond(Binary(Variable("n"), Eq, Number(1)), 
                             Number(1),
                             Binary(Variable("n"),Mul,
                                   Call(Variable("fact"),
                                         Binary(Variable("n"), Sub, Number(1))
                                        )))
                    )),
	               Variable("fact") );

(*
let val f = eF in f 5 end
*)	                    
val f = Lett("f",TFunc(TInt,TInt),e, Call(Variable("f"), Number(5)));

Running the interpreter on each of these inputs will result in the correct Value.

- interpret a;
> val it = Integer 42 : Value/8
- interpret b;
> val it = Integer 0 : Value/8
- interpret c;
> val it = Integer ~3 : Value/8
- interpret d;
> val it = Boolean true : Value/8
- interpret e;
> val it =
    Closure("n", Cond(***),***)
     : Value/8
- interpret f;
> val it = Integer 120 : Value/8

Your Job

Add to this file a function, typecheck that takes an Expression and either:

Here's a transcript of my solution on the examples a-f, above:

- typecheck a;
> val it = TInt : Type/8
- typecheck b;
> val it = TInt : Type/8
- typecheck c;
> val it = TInt : Type/8
- typecheck d;
> val it = TBool : Type/8
- typecheck e;
> val it = TFunc(TInt, TInt) : Type/8
- typecheck f;
> val it = TInt : Type/8

The operations BOp and UOp have the usual meanings. The intention is that Add, Sub, Mul, Div, Min, Mod, Leq, Lt, Geq, Gt can only be applied to integers, that Not, And, Or can only be applied to booleans, and that Eq can only be applied to two expressions of the same type (two integers or two booleans but not a boolean and an integer). A Cond(e1,e2,e3) expression is like the Java expression e1? e2:e3 : the expression e1 should have boolean type, and the expressions e2, e3 should have the same type (which is also, of course the type of the expression overall). The type of the "body" expression in Lett and Lettrec constructs must be consistent with the declared type of the local variable. The type of a function body in Fn must match the declared return type, under the assumption of the declared type for the argument. A Call must

You will find that the structure of this function is virtually identical to that of interpret, from the inductive definition on AST structure to the use of environments, and even the auxiliary functions for checking binary and unary operators. This is no accident, of course, since type checking is just interpretation over an abstract value domain. Indeed, the only significant differences should be that Types are returned instead of Values, and that a TypeError is thrown on ill-typed expressions, rather than RuntimeBug.

John H. E. Lasseter