In Definitional interpreters for higher-order programming languages (here), John Reynolds considers the definition of a simple applicative programming language by means of an interpreter written in a similar language.
An important and frequently used method of defining a programming language is to give an interpreter for the language that is written in a second, hopefully better understood language. We will call these two languages the defined and defining languages, respectively. Reynolds, 1998.
Reynolds demonstrates how, without diligence, the semantics of the interpreter's implementation language may leak into the interpretation. Leaky interpreters are not equivalent to the defining semantics, and may exhibit incorrect side effects or return wrong values. Take Haskell, a language with non-strict semantics. The GHC compiler uses lazy evaluation as an implementation of these non-strict semantics. The GHCi Haskell interpreter honours these non-strict semantics in the same way.
Say hello to a simple functional programming language called Drippy
Calculator, or Drippy for short. User defined functions are
supported. Integer is the only type. The two constructs are +
which
adds two integers, and poison()
which crashes a program when
evaluated. Here's a simple Drippy program:
int add(int a, int b) { return(a + b); } int main() { int x,y,z; x = 2; y = 3; z = add(x, y); return(z); }
An interpretation of this program prints 5. Drippy is a strictly
evaluated language and variables are immutable. The expressions in
Drippy are constants, variables and function calls. Every function is
a sequence of statements and ends with a return(..)
. A few notes on
Drippy evaluation (shared with Reynold's notes on his applicative
language):
Statements are evaluated from top to bottom. After each statement evaluation the environment is modified, which binds variables to values. The evaluation of a constant always gives the same value, regardless of its environment. The value from evaluating variables and function calls depend on their environment's context.
An applicative expression takes the form r0(r1,..,rn). The expressions r0,r1,..,rn are evaluted in the same environment to obtain values f, a1,..,an.
If f is not a function of n arguments, then an error stop occurs.
Otherwise, the function f is applied to the arguments a1,..,an, and if this application produces a result, then the result is the value of the application expression.
Let's implement a front-end for Drippy in with a BNF grammar.
entrypoints Program ; separator Function "" ; terminator Statement ";" ; separator Expr "," ; separator Argument "," ; terminator VarDec ";" ; separator nonempty Ident "," ; Prog. Program ::= [Function] Main ; Fn. Function ::= Type Ident Params Body ; FnBody. Body ::= "{" [VarDec] [Statement] Return "}" ; Mn. Main ::= "int" "main" "(" ")" Body ; ExprConstant. Expr ::= Integer ; ExprId. Expr ::= Ident ; ExprFnCall. Expr ::= FnCall ; Prms. Params ::= "(" [Argument] ")" ; Arg. Argument ::= Type Ident ; Var. VarDec ::= Type [Ident]; TypeInt. Type ::= "int" ; StmtAssign. Statement ::= Assignment ; StmtLangCall. Statement ::= LanguagePoisonFn ; AssignId. Assignment ::= Ident "=" Expr ; FnCallNArity. FnCall ::= Ident "(" [Expr] ")" ; LangFnCall. FnCall ::= LanguageFn ; LangPoisonFnCall. FnCall ::= LanguagePoisonFn ; LangAdd. LanguageFn ::= Expr "+" Expr ; LangPoison. LanguagePoisonFn ::= "poison" "(" ")" ; Rtn. Return ::= "return" "(" Expr ")" ";" ;
Let's generate a lexer and parser from this, using the BNFC software.
$ bnfc -m --haskell Drippy.cf $ make
Given Drippy's strict semantics, the following program should crash
once the third argument in the add(x, y, poison())
is evaluated in
the asssignment to z
.
int add(int a, int b, int c) { return(a + b); } int main() { int x,y,z; x = 2; y = 3; z = add(x, y, poison()); return(z); }
Here's a Haskell implementation of DrippyI, a Drippy interpreter.
module Main where import AbsDrippy import ParDrippy import ErrM import Data.Map (Map) import qualified Data.Map as Map import Data.Maybe import System.Environment (getArgs) main :: IO () main = do result <- evalDrippy =<< readFile . head =<< getArgs print result data Context = Context { localVars :: Map Ident (Maybe Integer) , funs :: Map Ident Function } evalDrippy :: String -> IO Integer evalDrippy s = do let Ok prog = pProgram (myLexer s) interpret prog initVars :: Map Ident (Maybe Integer) -> VarDec -> Map Ident (Maybe Integer) initVars lclVars (Var _varType idents) = foldl (\cxt ident -> Map.insert ident Nothing cxt) lclVars idents interpret :: Program -> IO Integer interpret (Prog fns mainF) = evalMain mainF fns evalStmt :: Context -> Statement -> Context evalStmt cxt (StmtAssign (AssignId ident expr)) = cxt { localVars = Map.insert ident (Just (evalExpr cxt expr)) (localVars cxt) } evalStmt cxt (StmtLangCall LangPoison) = error "crash: poisonous statement." evalLangCall :: Context -> LanguageFn -> Integer evalLangCall cxt (LangAdd e1 e2) = evalExpr cxt e1 + evalExpr cxt e2 evalFnCall :: Context -> FnCall -> Integer evalFnCall cxt (FnCallNArity ident exprs) = evalUserFn cxt ident exprs evalFnCall cxt (LangFnCall langFun) = evalLangCall cxt langFun evalFnCall cxt (LangPoisonFnCall LangPoison) = error "crash: poisonous function argument." evalExpr :: Context -> Expr -> Integer evalExpr _cxt (ExprConstant constant) = constant evalExpr cxt (ExprId ident) = fromJust (fromJust (Map.lookup ident (localVars cxt))) evalExpr cxt (ExprFnCall fn) = evalFnCall cxt fn evalUserFn :: Context -> Ident -> [Expr] -> Integer evalUserFn cxt ident args = let Fn _ _ (Prms params) (FnBody vrs stmts (Rtn returnStmt)) = fromJust $ Map.lookup ident (funs cxt) xs = zipWith (\(Arg _type paramId) arg -> (paramId,Just (evalExpr cxt arg))) params args paramVals = Map.union (Map.fromList xs) (localVars cxt) lclVars = foldl initVars paramVals vrs localCxt = foldl evalStmt (cxt { localVars = lclVars }) stmts in evalExpr localCxt returnStmt evalMain :: Main -> [Function] -> IO Integer evalMain (Mn (FnBody vrs stmts (Rtn returnStmt))) fns = do let vars = foldl initVars Map.empty vrs lclFuns = Map.fromList $ map (\f@(Fn _ ident _ _) -> (ident,f)) fns cxt = Context vars lclFuns cxt' = foldl evalStmt cxt stmts return (evalExpr cxt' returnStmt)
Let's interpret our poisonous program to make sure it does indeed crash...
$ ./DrippyI my_application.drippy 5
Oh dear, the non-strict semantics of Haskell have leaked into
DrippyI. The poisonous()
argument in the add(..)
call in the z
assignment has not been evaluated. The leak is in evalUserFn
, and
the poisonous expression is the last element in the args
argument. Although evalExpr
is used in zipWith
in the definition
of xs
, it is never applied. The function body of add
provides the
clue -- the third argument is never needed to evaluate the
expression in the return(a+b)
call.
int add(int a, int b, int c) { return(a + b); }
Haskell's call-by-need semantics have leaked in, the environment has
not bound the c
variable to any (normal form) value.
To recover the strict call-by-value semantics of Drippy, all
function arguments are evaluated to normal form (requirement (2)) with
deepseq
. This is achieved in the definition of paramVals
in a
revised evalUserFn
.
evalUserFn :: Context -> Ident -> [Expr] -> Integer evalUserFn cxt ident args = let Fn _ _ (Prms params) (FnBody vrs stmts (Rtn returnStmt)) = fromJust $ Map.lookup ident (funs cxt) xs = zipWith (\(Arg _type paramId) arg -> (paramId,Just (evalExpr cxt arg))) params args paramVals = map snd xs `deepseq` Map.union (Map.fromList xs) (localVars cxt) -- the fix lclVars = foldl initVars paramVals vrs localCxt = foldl evalStmt (cxt { localVars = lclVars }) stmts in if length params /= length args then error $ show ident ++ " expected " ++ show (length params) ++ " args" ++ " but " ++ show (length args) ++ " passed." else evalExpr localCxt returnStmt
This also implements requirement (3) of Drippy's evaluation semantics,
which says that the number of function arguments must be the same as
the number of expected arguments as per the user's function
definition. If they are not same, an error must be thrown. This is
checked in length params =/ length args
if condition.
Let's try evaluating our poisonous program again:
$ ./DrippyI my_application.drippy Drippy: crash: poisonous function argument.
Great, the laziness leaked is sealed. Now let's test that requirement
(3) is satisfied by instead trying to assign a value to z
with
add(x, y, poison(), 7)
...
$ ./DrippyI my_application.drippy Drippy: Ident "add" expected 3 args but 4 passed.
Hurrah, DrippyI leaks less!